Zomato Restaurant Data Analysis With Mapbox and Plotly


Zomato is an Indian restaurant search and discovery service founded in 2008 by Deepinder Goyal and Pankaj Chaddah. It currently operates in two dozen countries. It provides information and reviews of restaurants, including images of menus where the restaurant does not have its own website, and also facilitates online delivery.

About the dataset

  1. Restaurant ID: Unique ID of every restaurant across various cities of the world
  2. Restaurant Name: Name of the restaurant
  3. Country Code: Code of the country in which the restaurant is located
  4. City: City in which the restaurant is located
  5. Address: Address of the restaurant
  6. Locality: Location in the city
  7. Locality Verbose: Detailed description of the locality
  8. Longitude: Longitude coordinate of the restaurant's location
  9. Latitude: Latitude coordinate of the restaurant's location
  10. Cuisines: Cuisines offered by the restaurant
  11. Average Cost for two: Cost for two people, in the local currency 👫
  12. Currency: Currency of the country
  13. Has Table booking: yes/no
  14. Has Online delivery: yes/no
  15. Is delivering now: yes/no
  16. Switch to order menu: yes/no
  17. Price range: Range of price of food
  18. Aggregate rating: Average rating out of 5
  19. Rating color: Color assigned according to the aggregate rating
  20. Rating text: Text label assigned according to the aggregate rating
  21. Votes: Number of ratings cast by people

Importing Libraries and Dataset

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

from PIL import Image
from wordcloud import WordCloud, STOPWORDS

import plotly as py
import plotly.graph_objs as go
import plotly.figure_factory as ff
import plotly.offline as pyo
from plotly import tools

# Initialise Plotly's offline notebook mode once
pyo.init_notebook_mode(connected=True)

print('Done!')
Done!
In [2]:
# Importing the dataset
zomato = pd.read_csv('zomato.csv')

# Viewing 3 random samples from the dataframe
zomato.sample(3)
Out[2]:
Restaurant ID Restaurant Name Country Code City Address Locality Locality Verbose Longitude Latitude Cuisines ... Currency Has Table booking Has Online delivery Is delivering now Switch to order menu Price range Aggregate rating Rating color Rating text Votes
257 18037826 Kebab Xpress 1 Faridabad Shop 23-24, 2nd Floor, Crown Interiorz Mall, S... Crown Interiorz Mall, Sector 35, Faridabad Crown Interiorz Mall, Sector 35, Faridabad, Fa... 77.307448 28.469863 North Indian, Mughlai ... Indian Rupees(Rs.) No Yes No No 2 2.8 Orange Average 49
958 18082232 Chicago Pizza 1 Gurgaon FC-08B, Food Court, 3rd Floor, MGF Metropolita... MGF Metropolitan Mall, MG Road MGF Metropolitan Mall, MG Road, Gurgaon 77.080144 28.480317 Pizza ... Indian Rupees(Rs.) No No No No 2 2.3 Red Poor 29
729 18223957 Foodaucity - Bum Bum Bholey Ke Chole Bhature 1 Gurgaon Near Rapid Metro, DLF Phase 3, Gurgaon DLF Phase 3 DLF Phase 3, Gurgaon 77.093723 28.493809 Street Food ... Indian Rupees(Rs.) No No No No 1 3.3 Orange Average 14

3 rows × 21 columns

In [3]:
# Size of the data
print('Shape of the df : ', zomato.shape)
Shape of the df :  (9551, 21)

Removing duplicates if any

In [4]:
# dropping duplicates
zomato.drop_duplicates(subset = 'Restaurant ID', keep = 'first', inplace = True)

# Size of the data
zomato.shape
Out[4]:
(9551, 21)

Replacing country codes with country names

In [5]:
zomato.replace({
    'Country Code' : {
        1 : 'India',
        14 : 'Australia',
        30 : 'Brazil',
        37 : 'Canada',
        94 : 'Indonesia',
        148 : 'New Zealand',
        162 : 'Philippines',
        166 : 'Qatar',
        184 : 'Singapore',
        189 : 'South Africa',
        191 : 'Sri Lanka',
        208 : 'Turkey',
        214 : 'UAE',
        215 : 'United Kingdom',
        216 : 'United States',
    }
}, inplace = True)

# renaming col country code to country 
zomato.rename(columns = {'Country Code' : 'Country'}, inplace = True)

Now that we have converted country codes to country names, let's see the countries where Zomato operates and the number of restaurants in each.
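
A code missing from the mapping would simply stay as a number. One way to catch that, sketched here on a made-up three-row frame, is to map the codes with `Series.map` and look for rows that received no name. The names `toy` and `country_names` are illustrative and not part of the notebook, and code 999 is invented for the example.

```python
import pandas as pd

# Toy frame standing in for the real data (values are illustrative only).
toy = pd.DataFrame({'Country Code': [1, 216, 999]})

# Partial code -> name mapping; 999 is deliberately left out.
country_names = {1: 'India', 216: 'United States'}

# map() yields NaN for codes absent from the dict, which makes gaps easy to spot.
toy['Country'] = toy['Country Code'].map(country_names)
unmapped = toy.loc[toy['Country'].isna(), 'Country Code'].unique()

print(toy)
print('Unmapped codes:', unmapped)
```

An empty `unmapped` array would mean every code in the data has a name.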

Country-wise distribution of Zomato restaurants

In [6]:
# creating a df 'za' to save country and no of restaurants in that country

# country name
x = zomato.Country.value_counts().index.values

# no. of restaurants in a particular country
y = zomato.Country.value_counts().values

# creating a df
za = pd.DataFrame({
    'Country': x,
    '# of Restaurants': y,
})

za.head(3)
Out[6]:
Country # of Restaurants
0 India 8652
1 United States 434
2 United Kingdom 80
In [7]:
choropleth_map = dict (
    type = 'choropleth',
    locations = za['Country'],
    locationmode='country names',
    colorscale = 'Rainbow',
    z = za['# of Restaurants'],
    showscale = False,)

layout = go.Layout (
    title = go.layout.Title(
        text = "Zomato's presence on the planet"
    ),
)

fig = go.Figure(data = [choropleth_map], layout = layout)
py.offline.iplot(fig)
In [8]:
# Number of restaurants in India
nor_India = za[za['Country'] == 'India']['# of Restaurants'][0]

# Share of Indian restaurants in the whole dataset
percent = nor_India / (za['# of Restaurants'].sum())

print('Percentage of indian restaurants is: ', percent * 100)
Percentage of indian restaurants is:  90.58737304994241

Inference :

Zomato has the most restaurants in India (about 8.6K), followed by the USA (434), the UK (80) and the UAE (60). These numbers are not surprising, as Zomato is an India-based startup.

Since about 90% of our dataset comes from Zomato India, we will explore the data for Indian restaurants only.
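
The per-country share can also be read off in one step with `value_counts(normalize=True)`. The following is a minimal sketch on made-up data (nine "India" rows and one "United States" row), not the real dataset.

```python
import pandas as pd

# Illustrative values only; the notebook derives the real counts from zomato.csv.
countries = pd.Series(['India'] * 9 + ['United States'], name='Country')

# normalize=True returns fractions that sum to 1; multiply by 100 for percentages.
share = countries.value_counts(normalize=True).mul(100).round(1)
print(share)
```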


Preparing zomato_India dataset

In [9]:
# Restaurants from India only
zomato_India = zomato[zomato.Country == 'India']

# Viewing 3 random samples
zomato_India.sample(3)
Out[9]:
Restaurant ID Restaurant Name Country City Address Locality Locality Verbose Longitude Latitude Cuisines ... Currency Has Table booking Has Online delivery Is delivering now Switch to order menu Price range Aggregate rating Rating color Rating text Votes
1336 894 Raj Restaurant India Gurgaon 10, Green Woods Plaza, Green Wood City, Sector... Sector 45 Sector 45, Gurgaon 77.059623 28.444858 Chinese, North Indian, Mughlai ... Indian Rupees(Rs.) No Yes No No 2 2.4 Red Poor 92
2516 5451 Al Saad Foods India New Delhi Shop 13, Rajdhani D.D.A. Market, Near Badi Mas... Daryaganj Daryaganj, New Delhi 77.232163 28.643322 North Indian, Fast Food ... Indian Rupees(Rs.) No No No No 1 3.2 Orange Average 12
2616 760 Moets Stone India New Delhi 50, Moets Restaurant Complex, Defence Colony, ... Defence Colony Defence Colony, New Delhi 77.230412 28.573212 Italian ... Indian Rupees(Rs.) Yes Yes No No 4 3.8 Yellow Good 354

3 rows × 21 columns

  • Now that we have only Indian restaurants in our data, we can drop the Country and Currency columns.
  • Since we saw earlier that there are no duplicates, we can drop Restaurant ID as well.
  • We can also drop Rating color, as the same information is given in the Rating text column.
In [10]:
# Let's remove some noise from the zomato_India dataset.
# Reassigning (instead of inplace=True on a filtered slice) avoids SettingWithCopyWarning.
zomato_India = zomato_India.drop(columns = ['Restaurant ID', 'Country', 'Currency', 'Rating color'])

# shape of dataset
print('Shape :', zomato_India.shape)

# 3 random samples
zomato_India.sample(3)
Shape : (8652, 17)
Out[10]:
Restaurant Name City Address Locality Locality Verbose Longitude Latitude Cuisines Average Cost for two Has Table booking Has Online delivery Is delivering now Switch to order menu Price range Aggregate rating Rating text Votes
5562 Kabbaba New Delhi A2, Shop 9-12, JDS, DDA Community Center, Pasc... Paschim Vihar Paschim Vihar, New Delhi 77.101442 28.670041 North Indian, Chinese, Fast Food 500 No Yes No No 2 3.3 Average 31
7766 Cafe Coffee Day Noida IT Towers, Plot 24, Sector 16-A, Near Sector 1... Sector 16 Sector 16, Noida 0.000000 0.000000 Cafe 450 No No No No 1 0.0 Not rated 2
8031 Chicago Pizza Noida Glued Reloaded, Dynamic House, Next to HP Petr... Sector 41 Sector 41, Noida 77.360931 28.561454 Pizza 600 No No No No 2 0.0 Not rated 1

EDA

Now that we have our India data ready, along with the latitude and longitude of every restaurant, let's mark these restaurants on a map of India and see the spread of Zomato across the nation.

In [11]:
mapbox_access_token = 'pk.eyJ1IjoiYXZpa2FzbGl3YWwiLCJhIjoiY2p4MDhzYjI0MTg1bjQwcG05cjZqNjRtaiJ9.9MKul4M02Wp2TV3Fx-TwoQ'

data = [
    go.Scattermapbox(
        lat = zomato_India['Latitude'],
        lon = zomato_India['Longitude'],
        mode = 'markers',
        marker = go.scattermapbox.Marker(
            size = 5,
            color = '#cb202d'
        ),
        text = zomato_India['Restaurant Name'],
    )
]

layout = go.Layout(
    autosize = True,
    hovermode = 'closest',
    mapbox = go.layout.Mapbox(
        accesstoken = mapbox_access_token,
        bearing = 0,
        center = go.layout.mapbox.Center(
            lat = 26.52,
            lon = 78.37
        ),
        pitch = 1,
        zoom = 4,
        style = 'light'
    ),
    title = go.layout.Title (
        text = 'Zomato restaurants in India'
    ),
)

fig = go.Figure(data=data, layout=layout)
pyo.iplot(fig, filename='Multiple Mapbox')

Panning around the map, we can see that, except for a few states, Zomato is present in all the major cities of the country.

The national capital region (NCR) clearly has the largest number of restaurants tied up with Zomato.
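
One of the sampled rows earlier (a Noida outlet) shows latitude and longitude both equal to 0.0, which would place a marker far from India. A minimal sketch of filtering such rows before plotting is given below. It assumes that (0, 0) means "coordinates missing", and uses a made-up frame `toy` rather than the notebook's data.

```python
import pandas as pd

# Made-up rows; the second one mimics a record with missing coordinates.
toy = pd.DataFrame({
    'Restaurant Name': ['A', 'B', 'C'],
    'Latitude': [28.67, 0.0, 28.56],
    'Longitude': [77.10, 0.0, 77.36],
})

# Assumption: a (0, 0) pair marks missing coordinates, not a real location.
has_coords = (toy['Latitude'] != 0) | (toy['Longitude'] != 0)
plottable = toy[has_coords]

print(len(toy) - len(plottable), 'row(s) dropped')
```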

Top 10 Indian cities for zomato

In [12]:
# Top 10 cities by number of restaurants (the natural log of the counts is plotted)
cities10_name = zomato_India.City.value_counts()[:10].index
cities10_value_log = np.log(zomato_India.City.value_counts()[:10])

data = [go.Bar(
            x = cities10_name,
            y = cities10_value_log,
            text = cities10_name,
            marker=dict(
                color=['red', 'orange',
               'green', 'blue','violet',
               'rgba(204,204,204,1)','rgba(204,204,204,1)',
               'rgba(204,204,204,1)','rgba(204,204,204,1)',
               'rgba(204,204,204,1)']),
        )]

layout = go.Layout(
    title = 'Top 10 cities for zomato',
    xaxis = dict(
        tickfont=dict(
            size=14,
            color='rgb(107, 107, 107)'
        )
    ),
    yaxis = dict(
        title='Natural log of number of restaurants',
        tickfont=dict(
            size=14,
            color='rgb(107, 107, 107)'
        ),
        titlefont=dict(
            size=16,
            color='rgb(107, 107, 107)'
        ),
    )
)

fig = go.Figure(data = data, layout = layout)

pyo.iplot(fig, filename='color-bar')

Inference :

As seen in the map above, Zomato is present in almost all the states of India, but NCR seems to be its favourite. The bar plot confirms this: the top 5 cities for Zomato are all from NCR.

So far we have established that India is the top market for Zomato and that, within India, NCR accounts for a large share of its listings.


Segregating zomato_India data into zomato_ncr and zomato_india (not including NCR)

In [13]:
# NCR (.copy() gives an independent frame, so later edits do not touch zomato_India)
zomato_ncr = zomato_India.loc[(zomato_India.City).isin(['New Delhi','Gurgaon','Noida','Faridabad'])].copy()

# Rest of India
zomato_india = zomato_India.loc[~(zomato_India.City).isin(['New Delhi','Gurgaon','Noida','Faridabad'])].copy()

Does price have an effect on rating?

Distribution of average price for two people: NCR and non-NCR

In [14]:
# Non NCR
zomato_india['Average Cost for two'].describe()
Out[14]:
count     730.000000
mean      873.616438
std       485.830011
min         0.000000
25%       500.000000
50%       800.000000
75%      1200.000000
max      3600.000000
Name: Average Cost for two, dtype: float64
In [15]:
zero = zomato_india[zomato_india['Average Cost for two'] == 0]
len(zero)
Out[15]:
9

The minimum is Rs. 0! Here, 0 most likely represents missing data. Since only 9 values are affected, I am replacing 0 with the mean price computed over all Indian restaurants (about Rs. 623).

In [16]:
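# Replacing 0 with the mean price over all Indian restaurants (zomato_India).
# Assigning the result back avoids relying on an in-place update of a column selection.
zomato_india['Average Cost for two'] = zomato_india['Average Cost for two'].replace(
    0, zomato_India['Average Cost for two'].mean()
)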

zomato_india['Average Cost for two'].describe()
Out[16]:
count     730.000000
mean      881.301826
std       476.783479
min       100.000000
25%       500.000000
50%       800.000000
75%      1200.000000
max      3600.000000
Name: Average Cost for two, dtype: float64
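
As an alternative to the replacement above (not what the notebook does), the zeros can first be marked as missing so that they do not influence the statistic used to fill them. The sketch below works on made-up prices and fills with the median of the non-zero values. The choice of median is an assumption made for illustration.

```python
import pandas as pd

# Made-up prices; zeros stand for missing values.
cost = pd.Series([0, 500, 800, 1200, 0], name='Average Cost for two')

# mask() turns the zeros into NaN, so they are ignored when the median is computed.
cost_clean = cost.mask(cost == 0)
filled = cost_clean.fillna(cost_clean.median())

print(filled.tolist())
```

The median is less sensitive than the mean to a few very expensive restaurants, which may matter for a skewed price distribution.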
In [17]:
# NCR
zomato_ncr['Average Cost for two'].describe()
Out[17]:
count    7922.000000
mean      600.310528
std       599.587540
min        50.000000
25%       300.000000
50%       450.000000
75%       650.000000
max      8000.000000
Name: Average Cost for two, dtype: float64
In [18]:
trace0 = go.Bar(
    x = zomato_india['Average Cost for two'].value_counts().index,
    y = np.log(zomato_india['Average Cost for two'].value_counts().values),
    text = (zomato_india['Average Cost for two'].value_counts().values),
)

layout = go.Layout(
    barmode ='group',
    shapes = [
        # Line reference to the axes
        {
            'type': 'line',
            'xref': 'x',
            'yref': 'y',
            'x0': zomato_india['Average Cost for two'].mean(),
            'y0': 0,
            'x1': zomato_india['Average Cost for two'].mean(),
            'y1': 8,
            'line': {
                'color': 'red',
                'width': 3,
                'dash': 'dashdot',
            },
        },
    ],
    annotations = [
        dict(
            x = zomato_india['Average Cost for two'].mean() + 5,
            y = 7,
            xref = 'x',
            yref = 'y',
            text = 'Average Price = 881.0 Rs',
            showarrow = True,
            arrowhead = 3,
            ax = 100,
            ay = 0,
        )
    ],
    title = go.layout.Title(
        text='India (Not including NCR)'  + '<br />' + 'Hover on bars for No. of Restaurants',
    ),
    xaxis = go.layout.XAxis(
        title = go.layout.xaxis.Title(
            text = 'Average Price in Rs. for Two People',
            font = dict(
                family = 'Courier New, monospace',
                size = 18,
                color = '#7f7f7f'
            )
        )
    ),
    yaxis = go.layout.YAxis(
        title = go.layout.yaxis.Title(
            text = 'Natural log of number of restaurants',
            font = dict(
                family = 'Courier New, monospace',
                size = 18,
                color = '#7f7f7f'
            )
        )
    )
)

data = [trace0]

fig = go.Figure(data = data, layout = layout)
pyo.iplot(fig, filename='grouped-bar')

In [19]:
trace0 = go.Bar(
    x = zomato_ncr['Average Cost for two'].value_counts().index,
    y = np.log(zomato_ncr['Average Cost for two'].value_counts().values),
    text = (zomato_ncr['Average Cost for two'].value_counts().values),
)

layout = go.Layout(
    barmode ='group',
    shapes = [
        # Line reference to the axes
        {
            'type': 'line',
            'xref': 'x',
            'yref': 'y',
            'x0': zomato_ncr['Average Cost for two'].mean(),
            'y0': 0,
            'x1': zomato_ncr['Average Cost for two'].mean(),
            'y1': 8,
            'line': {
                'color': 'red',
                'width': 3,
                'dash': 'dashdot',
            },
        },
    ],
    annotations = [
        dict(
            x = zomato_ncr['Average Cost for two'].mean() + 5,
            y = 7,
            xref = 'x',
            yref = 'y',
            text = 'Average Price = 600.0 Rs',
            showarrow = True,
            arrowhead = 3,
            ax = 100,
            ay = 0,
        )
    ],
    title = go.layout.Title(
        text='India (NCR Only)' + '<br />' + 'Hover on bars for No. of Restaurants',
    ),
    xaxis = go.layout.XAxis(
        title = go.layout.xaxis.Title(
            text = 'Average Price in Rs. for Two People',
            font = dict(
                family = 'Courier New, monospace',
                size = 18,
                color = '#7f7f7f'
            )
        )
    ),
    yaxis = go.layout.YAxis(
        title = go.layout.yaxis.Title(
            text = 'Natural log of number of restaurants',
            font = dict(
                family = 'Courier New, monospace',
                size = 18,
                color = '#7f7f7f'
            )
        )
    )
)

data = [trace0]

fig = go.Figure(data = data, layout = layout)
pyo.iplot(fig, filename='grouped-bar')

Inference :

  • 75% of restaurants in NCR cost Rs. 650 or less for two, while for the rest of India the figure is Rs. 1200.
  • Average price for two is Rs. 881 outside NCR and Rs. 600 in NCR.
  • Minimum is Rs. 100 outside NCR and Rs. 50 in NCR.
  • Maximum is Rs. 8000 in NCR and Rs. 3600 outside NCR.
  • We can conclude that Zomato lists more economical options in NCR than in the rest of India.
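
The quartiles quoted above come from two separate `describe()` calls. A side-by-side table can be sketched with `groupby` and `quantile`, as below. The `Region` column and all numbers are hypothetical; the notebook keeps the two regions in separate dataframes.

```python
import pandas as pd

# Made-up prices for two hypothetical regions.
toy = pd.DataFrame({
    'Region': ['NCR'] * 4 + ['Non-NCR'] * 4,
    'Average Cost for two': [200, 400, 600, 800, 500, 1000, 1500, 2000],
})

# One row per region, one column per requested quantile.
summary = (toy.groupby('Region')['Average Cost for two']
              .quantile([0.25, 0.5, 0.75])
              .unstack())
print(summary)
```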

Price vs Rating

Here we will consider only the restaurants that have been rated and have at least 50 votes.

In [20]:
# Rated India (Not including NCR)
zomato_india_rated = zomato_india[zomato_india.Votes > 49]

# Rated India (NCR only)
zomato_ncr_rated = zomato_ncr[zomato_ncr.Votes > 49]
In [21]:
pvr_india = go.Scatter(
    y = zomato_india_rated['Aggregate rating'],
    x = zomato_india_rated['Average Cost for two'],
    mode = 'markers',
    marker = dict(
        size = 6,
        color = zomato_india_rated['Aggregate rating'], #set color equal to a variable
        colorscale = 'Viridis',
        showscale = True
    ),
)

pvr_ncr = go.Scatter(
    y = zomato_ncr_rated['Aggregate rating'],
    x = zomato_ncr_rated['Average Cost for two'],
    mode = 'markers',
    marker = dict(
        size = 6,
        color = zomato_ncr_rated['Aggregate rating'], #set color equal to a variable
        colorscale = 'Viridis',
        showscale = False
    ),
)

fig = tools.make_subplots(rows = 1, cols = 2,subplot_titles=('Non-NCR', 'NCR'))

fig['layout']['yaxis1'].update(title = 'Rating (Outside NCR) on scale of 5')
fig['layout']['yaxis2'].update(title = 'Rating (NCR) on scale of 5')

fig['layout']['xaxis1'].update(title = 'Avg Price for 2')
fig['layout']['xaxis2'].update(title = 'Avg Price for 2')

fig.append_trace(pvr_india, 1, 1)
fig.append_trace(pvr_ncr, 1, 2)

fig['layout'].update(height = 600, width = 800, title = 'Price Vs Rating')
pyo.iplot(fig, filename = 'simple-subplot-with-annotations')
This is the format of your plot grid:
[ (1,1) x1,y1 ]  [ (1,2) x2,y2 ]

Inference :

  • Outside NCR, restaurants have a rating of at least 3, whereas in NCR we see the full range of ratings.
  • There is no clear linear relationship between price and rating: every price range contains both good and bad ratings.
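
The visual impression of "no linear relationship" can be backed by a number. The sketch below, on made-up values, computes the Pearson correlation and a rank-based (Spearman-style) correlation obtained by correlating the ranks. Values near 0 would support the reading above.

```python
import pandas as pd

# Made-up price/rating pairs, for illustration only.
toy = pd.DataFrame({
    'Average Cost for two': [200, 400, 600, 800, 1000],
    'Aggregate rating': [3.9, 3.1, 4.2, 3.0, 3.8],
})

price = toy['Average Cost for two']
rating = toy['Aggregate rating']

# Pearson measures linear association.
pearson = price.corr(rating)

# Correlating the ranks gives the Spearman coefficient (monotonic association).
spearman = price.rank().corr(rating.rank())

print(round(pearson, 3), round(spearman, 3))
```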

Which restaurants in NCR offer the best value for money?

In [22]:
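# Keep well-reviewed restaurants: at least 500 votes
zomato_ncr_votes = zomato_ncr_rated[zomato_ncr_rated.Votes >= 500]
# ... with a rating above 3.9
zomato_ncr_rating = zomato_ncr_votes[zomato_ncr_votes['Aggregate rating'] > 3.9]
# ... and costing less than Rs. 700 for two
zomato_ncr_eco = zomato_ncr_rating[zomato_ncr_rating['Average Cost for two'] < 700]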
In [23]:
data = [
    go.Scattermapbox(
        lat = zomato_ncr_eco['Latitude'],
        lon = zomato_ncr_eco['Longitude'],
        mode = 'markers',
        marker = go.scattermapbox.Marker(
            size = 5,
            color = '#cb202d'
        ),
        text = zomato_ncr_eco['Restaurant Name'] + '<br />' + zomato_ncr_eco['Locality Verbose'],
    )
]

layout = go.Layout(
    autosize = True,
    hovermode = 'closest',
    mapbox = go.layout.Mapbox(
        accesstoken = mapbox_access_token,
        bearing = 0,
        center = go.layout.mapbox.Center(
            lat = 28.59,
            lon = 77.22,
        ),
        pitch = 1,
        zoom = 9,
        style = 'light'
    ),
    title = 'Best Value for money Restaurants in NCR'
)

fig = go.Figure(data=data, layout=layout)
pyo.iplot(fig, filename='Multiple Mapbox')

Some of the best value for money bakeries in NCR

In [24]:
zomato_ncr_bakery = zomato_ncr_eco.sort_values(by = 'Cuisines')[:5]
In [25]:
data = [
    go.Scattermapbox(
        lat = zomato_ncr_bakery['Latitude'],
        lon = zomato_ncr_bakery['Longitude'],
        mode = 'markers',
        marker = go.scattermapbox.Marker(
            size = 5,
            color = '#cb202d'
        ),
        text = zomato_ncr_bakery['Restaurant Name'] + '<br />' + zomato_ncr_bakery['Locality Verbose'],
    )
]

layout = go.Layout(
    autosize = True,
    hovermode = 'closest',
    mapbox = go.layout.Mapbox(
        accesstoken = mapbox_access_token,
        bearing = 0,
        center = go.layout.mapbox.Center(
            lat = 28.59,
            lon = 77.22,
        ),
        pitch = 1,
        zoom = 9,
        style = 'light'
    ),
    title = 'Best Value for money bakeries in NCR'
)

fig = go.Figure(data=data, layout=layout)
pyo.iplot(fig, filename='Multiple Mapbox')
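
The bakery selection above sorts by Cuisines and takes the first five rows, which works only if the bakery entries happen to sort first alphabetically. A more explicit filter is sketched below, on made-up rows, using `str.contains`. It is offered as an alternative, not as what the notebook does.

```python
import pandas as pd

# Made-up rows; one cuisine value is missing on purpose.
toy = pd.DataFrame({
    'Restaurant Name': ['P', 'Q', 'R', 'S'],
    'Cuisines': ['Bakery, Desserts', 'North Indian', 'Cafe, Bakery', None],
})

# na=False treats a missing cuisine as "no match" instead of propagating NaN.
is_bakery = toy['Cuisines'].str.contains('Bakery', na=False)
bakeries = toy[is_bakery]

print(bakeries)
```

This also picks up rows such as 'Cafe, Bakery', where Bakery is not the first cuisine listed.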

Fast food chains' outlets in NCR

In [26]:
fast_food_chains = ["Domino's Pizza", "Momo's King", 'Wow! India', "Dunkin' Donuts", 'Subway', "McDonald's",
                   "Pizza Hut", "Pizza Hut Delivery", "KFC", "Burger King", 'Chicago Pizza', 'Burger Point',
                   'Gopala', 'Yo! China', 'Ovenstory Pizza', 'RollsKing', 'Mad Over Donuts']

zomato_ncr_fast = zomato_ncr.loc[zomato_ncr['Restaurant Name'].isin(fast_food_chains)]

data = [
    go.Scattermapbox(
        lat = zomato_ncr_fast['Latitude'],
        lon = zomato_ncr_fast['Longitude'],
        mode = 'markers',
        marker = go.scattermapbox.Marker(
            size = 5,
            color = '#cb202d'
        ),
        text = zomato_ncr_fast['Restaurant Name'] + '<br />' + zomato_ncr_fast['Locality Verbose'],
    )
]

layout = go.Layout(
    autosize = True,
    hovermode = 'closest',
    mapbox = go.layout.Mapbox(
        accesstoken = mapbox_access_token,
        bearing = 0,
        center = go.layout.mapbox.Center(
            lat = 28.59,
            lon = 77.22,
        ),
        pitch = 1,
        zoom = 9,
    ),
    title = 'Fast Food Centers in NCR'
)

fig = go.Figure(data=data, layout=layout)
pyo.iplot(fig, filename='Multiple Mapbox')

Fast food outlets are spread widely across NCR!


What types of cuisines does Zomato offer across India?

In [27]:
# Download the image
!wget --quiet https://www.redbytes.in/wp-content/uploads/2018/09/zomato-logo-AD6823E433-seeklogo.com_.png
    
# Load the logo image as a mask array
zomato_mask = np.array(Image.open('zomato-logo-AD6823E433-seeklogo.com_.png'))

zomato_cuisines = []
for c in zomato_India['Cuisines']:
    listc = c.split(',')
    zomato_cuisines += listc
    
zomato_cuisines_text = ''
for text in zomato_cuisines:
    zomato_cuisines_text += (text + ' ')
    
stopwords = set(STOPWORDS)

# instantiate a word cloud object
zomato_wc = WordCloud(background_color='white', max_words=2000, mask=zomato_mask, stopwords=stopwords)

# generate the word cloud
zomato_wc.generate(zomato_cuisines_text)

# display the word cloud
fig = plt.figure()
fig.set_figwidth(8) # set width
fig.set_figheight(8) # set height

plt.imshow(zomato_wc, interpolation='bilinear')
plt.axis('off')
plt.show()

North Indian, Fast Food, Chinese, Mughlai, South Indian and Street Food are the most popular cuisines that Zomato offers to its customers in India.
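
A word cloud shows relative prominence but not counts. To get exact frequencies, the comma-separated cuisine strings can be split and counted, as in this sketch on three made-up rows.

```python
import pandas as pd

# Made-up cuisine strings in the same comma-separated format as the dataset.
cuisines = pd.Series(['North Indian, Chinese', 'Chinese', 'North Indian, Mughlai'])

# Split each string into a list, give every cuisine its own row,
# strip the leftover spaces, then count occurrences.
counts = (cuisines.str.split(',')
                  .explode()
                  .str.strip()
                  .value_counts())
print(counts)
```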


Table booking vs Rating

In [28]:
zomato_india['Has Table booking'].describe()
Out[28]:
count     730
unique      2
top        No
freq      665
Name: Has Table booking, dtype: object
In [29]:
zomato_ncr['Has Table booking'].describe()
Out[29]:
count     7922
unique       2
top         No
freq      6876
Name: Has Table booking, dtype: object
In [30]:
x0 = zomato_ncr_rated['Has Table booking']
y0 = zomato_ncr_rated['Aggregate rating']

trace0 = go.Box(
    y = y0,
    x = x0,
    marker = dict(
        color = '#FF851B',
    ),
    boxmean=True,
    boxpoints = 'outliers',
)

layout = go.Layout(
    title = "Table Booking Vs Rating"
)

data = [trace0]
fig = go.Figure(data = data, layout = layout)
pyo.iplot(fig)

Inference :

  • Restaurants with online table booking have slightly higher ratings in general.
  • Most restaurants do not have an online table booking facility.
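
The box plot comparison can be summarised numerically by grouping on the booking flag. The sketch below uses made-up ratings, and the name `stats` is illustrative.

```python
import pandas as pd

# Made-up ratings, for illustration only.
toy = pd.DataFrame({
    'Has Table booking': ['Yes', 'No', 'No', 'Yes', 'No'],
    'Aggregate rating': [4.1, 3.2, 3.6, 3.9, 3.4],
})

# Group size, mean and median rating for each booking option.
stats = (toy.groupby('Has Table booking')['Aggregate rating']
            .agg(['count', 'mean', 'median']))
print(stats)
```

Reporting the count next to the mean helps judge how much weight a difference between the groups deserves.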

Where can I find great fine dining in NCR?

  • Has an online table booking facility.
  • Has a rating of more than 4.
  • Rated by more than 200 people.
In [31]:
top_fine_dine = zomato_ncr.loc[(zomato_ncr['Has Table booking'] == "Yes") & (zomato_ncr['Aggregate rating'] > 4) & (zomato_ncr['Votes'] > 200)].sort_values('Aggregate rating', ascending = False)

data = [
    go.Scattermapbox(
        lat = top_fine_dine['Latitude'],
        lon = top_fine_dine['Longitude'],
        mode = 'markers',
        marker = go.scattermapbox.Marker(
            size = 5,
            color = '#cb202d'
        ),
        text = top_fine_dine['Restaurant Name'] + '<br />' + top_fine_dine['Locality Verbose'],
    )
]

layout = go.Layout(
    autosize = True,
    hovermode = 'closest',
    mapbox = go.layout.Mapbox(
        accesstoken = mapbox_access_token,
        bearing = 0,
        center = go.layout.mapbox.Center(
            lat = 28.59,
            lon = 77.22,
        ),
        pitch = 1,
        zoom = 9,
    ),
    title = 'Top Fine Dines in NCR'
)

fig = go.Figure(data=data, layout=layout)
pyo.iplot(fig, filename='Multiple Mapbox')

Online delivery vs Rating

In [32]:
zomato_india['Has Online delivery'].value_counts()
Out[32]:
No     620
Yes    110
Name: Has Online delivery, dtype: int64
In [33]:
zomato_ncr['Has Online delivery'].value_counts()
Out[33]:
No     5609
Yes    2313
Name: Has Online delivery, dtype: int64
In [34]:
x0 = zomato_india_rated['Has Online delivery']
y0 = zomato_india_rated['Aggregate rating']

trace0 = go.Box(
    y = y0,
    x = x0,
    marker = dict(
        color = '#3D9970',
    ),
    boxmean=True,
    boxpoints = 'outliers',
)

layout = go.Layout(
    title = "Online Delivery Vs Rating (India not including NCR)"
)

data = [trace0]
fig = go.Figure(data = data, layout = layout)
pyo.iplot(fig)
In [35]:
x0 = zomato_ncr_rated['Has Online delivery']
y0 = zomato_ncr_rated['Aggregate rating']

trace0 = go.Box(
    y = y0,
    x = x0,
    marker = dict(
        color = '#FF4136',
    ),
    boxmean=True,
    boxpoints = 'outliers',
)
layout = go.Layout(
    title = "Online Delivery Vs Rating (NCR only)"
)

data = [trace0]
fig = go.Figure(data = data, layout = layout)
pyo.iplot(fig)

Inference :

  • Online delivery doesn't appear to affect ratings for restaurants in NCR, but it does for restaurants outside NCR.

Do restaurants with more votes have lower ratings?

In [36]:
pvr_india = go.Scatter(
    y = zomato_india_rated['Aggregate rating'],
    x = zomato_india_rated['Votes'],
    mode = 'markers',
    marker = dict(
        size = 4,
        color = zomato_india_rated['Aggregate rating'], #set color equal to a variable
        colorscale = 'Rainbow',
        showscale = True
    ),
)

pvr_ncr = go.Scatter(
    y = zomato_ncr_rated['Aggregate rating'],
    x = zomato_ncr_rated['Votes'],
    mode = 'markers',
    marker = dict(
        size = 4,
        color = zomato_ncr_rated['Aggregate rating'], #set color equal to a variable
        colorscale = 'Rainbow',
        showscale = False
    ),
)

fig = tools.make_subplots(rows = 1, cols = 2,subplot_titles=('Non-NCR', 'NCR'))

fig['layout']['yaxis1'].update(title = 'Rating (Outside NCR) on scale of 5')
fig['layout']['yaxis2'].update(title = 'Rating (NCR) on scale of 5')

fig['layout']['xaxis1'].update(title = 'Number of Votes')
fig['layout']['xaxis2'].update(title = 'Number of Votes')

fig.append_trace(pvr_india, 1, 1)
fig.append_trace(pvr_ncr, 1, 2)

fig['layout'].update(height = 600, width = 800, title = 'Votes Vs Rating')
pyo.iplot(fig, filename = 'simple-subplot-with-annotations')
This is the format of your plot grid:
[ (1,1) x1,y1 ]  [ (1,2) x2,y2 ]

Inference :

  • Surprisingly, it is the opposite of what I expected: both inside and outside NCR, restaurants with more than 4K votes have a rating of around 4 or more.
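
One way to check a threshold claim like this is to bucket the vote counts and look at the ratings within each bucket. The sketch below uses `pd.cut` on made-up data. The bucket edges and labels are assumptions chosen to mirror the 50-vote and 4K-vote cut-offs mentioned in the text.

```python
import pandas as pd

# Made-up votes and ratings, for illustration only.
toy = pd.DataFrame({
    'Votes': [60, 150, 900, 2500, 4500, 6000],
    'Aggregate rating': [3.1, 3.6, 3.9, 4.0, 4.3, 4.5],
})

# Left-closed buckets: [50, 500), [500, 4000), [4000, inf)
bins = [50, 500, 4000, float('inf')]
labels = ['50-500', '500-4K', '4K+']
toy['Vote bucket'] = pd.cut(toy['Votes'], bins=bins, labels=labels, right=False)

# Lowest and median rating per bucket.
by_bucket = (toy.groupby('Vote bucket', observed=True)['Aggregate rating']
                .agg(['count', 'min', 'median']))
print(by_bucket)
```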

Summary :

  • About 90% of the Zomato restaurants in this dataset are in India.
  • Within India, most of them are in Delhi and nearby cities (NCR).
  • In fact, the top 5 cities for Zomato in India are all in Delhi NCR.
  • Average price for two is lower in NCR than in other parts of the nation.
    • One can even get a meal for two for Rs. 50 in NCR.
    • 75% of restaurants in NCR cost Rs. 650 or less for two; outside NCR the corresponding figure is Rs. 1200.
  • No linear relation was found between price and rating; every price range has both good and bad ratings.
  • Among the many cuisines listed on Zomato, North Indian, Fast Food, Chinese, Mughlai, South Indian and Street Food are the most popular.
  • There are many fast food outlets in NCR.
  • Not many restaurants offer online table booking.
    • Restaurants that offer it have slightly higher ratings than those that do not.
  • Outside NCR, eateries with online delivery have higher ratings than those without it.
    • In NCR, online delivery has no visible effect on rating.
  • Eateries with 4K+ votes have a rating of around 4 or more.