Analyzing and Predicting Locations of Dunkin’ Stores

Source: https://secretchicago.com/free-coffee-and-donuts-dunkin-august/

Introduction

Growing up in Texas, I rarely saw a Dunkin’ location. When I came to Philadelphia for college, however, I was surprised to see a Dunkin’ on almost every block.

At a broader level, what can Dunkin’ store locations tell us about the company’s customer base and what it views as attractive locations to serve this base? Based on this, can we identify locations where Dunkin’ could expand? By analyzing store data and how it maps to Census demographics and competitor prevalence, I hope to create a model of what Dunkin’ has deemed to be a valuable location.

In this data project, I plan to complete the following:

1. Exploratory Analysis — where are Dunkin’s currently located?

2. Predictive Analysis — what factors affect the placement of Dunkin’ stores and can a model predict current locations?

3. Competitor Analysis — how do marketing tactics differ between Dunkin’ and its major competitor, Starbucks?

Exploratory Analysis

Dunkin’ Locations

I wanted to first analyze where exactly Dunkin’s are located. I knew that they were less prevalent in the South and on the West Coast where preferences tend to lean towards Starbucks, but I was curious how sparse they actually were.

The 2017 location data came from an article on Storybench (a program by Northeastern’s School of Journalism), which in turn sourced it from Andrew Ba Tran’s GitHub.

Because the dataset included the longitude and latitude of each location, I was able to use the usmap package to plot store locations and the number of stores per state on a map.

Dunkin’ locations across the US
Number of Dunkin’ stores in each state

From this map, we can see that most Dunkin’ locations are clustered along the East Coast, especially in the Northeast and Mid-Atlantic regions. As you travel westward, franchise locations begin to dwindle, and the Northwest region has zero Dunkin’ locations.

Dunkin’ Donuts was started in Quincy, Massachusetts in 1950, so it makes sense that locations are concentrated in Massachusetts and the surrounding areas. However, even after almost 70 years, Dunkin’ has failed to deeply penetrate the West Coast market.

Next, I was curious which states have the highest number of Dunkin’ locations per capita. I gathered the estimated 2017 state population data from the Census Bureau and calculated the number of people per Dunkin’ location for each state.

People per Dunkin’ values for each state

From this map, we can see that California and Minnesota have the largest People per Dunkin’ values, probably because both states have few stores and sizable populations. For example, California is the most populous state but has only 3 Dunkin’ locations, so its People per Dunkin’ value is very high. On the opposite end, Massachusetts takes the prize with a staggering one Dunkin’ for every 6054 people!
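The People per Dunkin’ calculation described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the code used for the maps: the function name, the state labels, and all of the numbers below are placeholders rather than the actual Census or store data.

```python
def people_per_store(population_by_state, stores_by_state):
    """Return {state: people per store} for states with at least one store."""
    ratios = {}
    for state, population in population_by_state.items():
        stores = stores_by_state.get(state, 0)
        if stores > 0:  # skip states with no stores to avoid dividing by zero
            ratios[state] = population / stores
    return ratios


# Placeholder numbers for illustration only, not the Census or store data.
population = {"State A": 6_000_000, "State B": 30_000_000, "State C": 1_000_000}
stores = {"State A": 1_000, "State B": 3}

ratios = people_per_store(population, stores)

# Sort from the fewest people per store (densest coverage) to the most.
ranked = sorted(ratios.items(), key=lambda item: item[1])
for state, value in ranked:
    print(f"{state}: {value:,.0f} people per store")
```

A state with no stores at all has no defined ratio, so the sketch leaves it out rather than reporting an infinite value.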

Google Search Trends

Google search hits over time for ‘Dunkin’ and ‘Dunkin Donuts’
Comparing Google search interest for ‘Dunkin’ between states

Looking at Google search interest for both ‘Dunkin’ and ‘Dunkin Donuts’ (the rebranding was phased in beginning in early 2019), we can see that interest in the brand has grown significantly since 2004. This is probably due to a myriad of factors, but it seems plausible that, despite changes in consumer taste preferences, Dunkin’s commitment to its mission, trend awareness, and impactful marketing have contributed to its success.

Looking at the Google search interest by state, we see that the data tends to mirror the map of Dunkin’ locations above: more hits along the East Coast (especially in the Northeast), with hits dwindling as you move westward. This makes sense, since people are more likely to search for the brand where there are more stores (e.g., to look up store hours or special offerings).

Predictive Analysis

I wanted to look into what factors affect the placement of store locations. To do so, I built a logistic regression model on the 2017 Dunkin’ location data, using as predictors population, number of households, income per capita, household median income, family median income, and number of Starbucks stores within the same zip code. I was curious not only which factors were statistically significant but also whether each had a positive or negative relationship with Dunkin’ presence. Population, number of households, income per capita, household median income, and family median income by county came from 2010 United States Census data, and the Starbucks location data came from a dataset on Kaggle.

In terms of prediction, I applied the model to zip codes that did not have a Dunkin’ franchise as of 2017. Using the regression model from above, I predicted the likelihood that each zip code has a Dunkin’ location and looked at the zip codes without one that had the highest predicted probabilities. Then, I used Google Maps to check whether a Dunkin’ store now exists within the boundaries of the top-ranked zip codes. I completed this analysis for three states: New York, Massachusetts, and Texas.
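To make the fit-then-rank workflow concrete, here is a minimal Python sketch. It is not the model from this project: it fits a logistic regression by plain gradient descent (a statistics package would normally do this and also report significance), it uses only two predictors, and the zip labels and every number in `rows` are made-up placeholders.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(weights, bias, row):
    # Predicted probability that a zip code has at least one Dunkin'.
    return sigmoid(bias + sum(w * x for w, x in zip(weights, row)))


def fit_logistic(X, y, lr=0.1, epochs=5000):
    """Fit weights and bias by batch gradient descent on the mean log loss."""
    n, k = len(X), len(X[0])
    weights, bias = [0.0] * k, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * k, 0.0
        for row, label in zip(X, y):
            error = predict_proba(weights, bias, row) - label
            grad_b += error
            for j in range(k):
                grad_w[j] += error * row[j]
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return weights, bias


# Placeholder rows (not real data):
# (hypothetical zip label, [Starbucks count, population in 10,000s], DunkinPresence)
rows = [
    ("zip_01", [0, 1.2], 0), ("zip_02", [0, 2.5], 0), ("zip_03", [0, 0.8], 0),
    ("zip_04", [1, 2.0], 1), ("zip_05", [1, 3.1], 1), ("zip_06", [2, 2.2], 1),
    ("zip_07", [0, 3.0], 0), ("zip_08", [1, 1.0], 1), ("zip_09", [3, 4.0], 1),
    ("zip_10", [0, 1.5], 0), ("zip_11", [1, 1.5], 0), ("zip_12", [0, 3.5], 1),
]
X = [features for _, features, _ in rows]
y = [label for _, _, label in rows]
weights, bias = fit_logistic(X, y)

# Rank zip codes WITHOUT a Dunkin' by how much they resemble zip codes with one.
candidates = [(zip_code, predict_proba(weights, bias, features))
              for zip_code, features, label in rows if label == 0]
ranked = sorted(candidates, key=lambda item: item[1], reverse=True)
for zip_code, prob in ranked:
    print(f"{zip_code}: {prob:.2f}")
```

The zip codes at the top of `ranked` are the ones worth checking by hand on a map, which is the role Google Maps plays in the analysis.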

New York

New York had the largest number of Dunkin’ locations of any state, so I thought it would be a good state to analyze first. First, I ran a logistic regression to see which factors were statistically significant and whether their relationships were positive or negative. Surprisingly, the number of Starbucks within the same zip code was the only statistically significant factor, and it had the highest correlation with the DunkinPresence variable (which indicates whether the zip code has at least 1 Dunkin’).

Logistic Regression Output
Correlation Matrix Output

Then, I created a bar graph to see how the number of Starbucks stores relates to the presence of a Dunkin’ within the same zip code. It looks like there is a greater chance of a Dunkin’ being present if there’s only 1 Starbucks.

Bar graph showing relationship between the number of Starbucks and presence of Dunkin’ in NY
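One way to tabulate the numbers behind a chart like this is to group zip codes by their Starbucks count and compute the share in each group that has a Dunkin’. The sketch below assumes that rate-based reading of the chart; the function name and the sample values are placeholders, not the New York data.

```python
from collections import defaultdict


def presence_rate_by_count(starbucks_counts, dunkin_presence):
    """Share of zip codes with a Dunkin', grouped by number of Starbucks."""
    totals = defaultdict(int)
    with_dunkin = defaultdict(int)
    for count, present in zip(starbucks_counts, dunkin_presence):
        totals[count] += 1
        with_dunkin[count] += present  # present is 0 or 1
    return {count: with_dunkin[count] / totals[count] for count in sorted(totals)}


# Placeholder values for ten hypothetical zip codes (not real data).
starbucks_counts = [0, 0, 0, 0, 1, 1, 1, 2, 2, 3]
dunkin_presence = [0, 0, 1, 0, 1, 1, 0, 1, 0, 1]

rates = presence_rate_by_count(starbucks_counts, dunkin_presence)
for count, rate in rates.items():
    print(f"{count} Starbucks: {rate:.0%} of zip codes have a Dunkin'")
```

Plotting raw counts instead of rates would favor whichever Starbucks count is most common among zip codes, so it is worth being explicit about which of the two a chart shows.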

Generally, in regressions you don’t want to include predictors that are highly correlated with one another. Therefore, I removed the predictors ‘TotalHouseholds’, ‘HouseholdMedianIncome’, and ‘FamilyMedianIncome’ from the model. I then used this model, fit on the demographic data of 2017 Dunkin’ locations, to see which zip codes look like “Dunkin’ locations” but didn’t have one as of 2017.
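The idea of screening out highly correlated predictors can be sketched as follows. This is an illustrative approach, not the procedure used here: the 0.9 cutoff, the greedy keep-in-order rule, and all of the column values are assumptions; only the column names echo the predictors named above.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def drop_correlated(columns, threshold=0.9):
    """Keep columns in order, dropping any column whose absolute correlation
    with an already-kept column exceeds the threshold."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Placeholder values for five hypothetical zip codes (not real data).
columns = {
    "Population": [1000, 2000, 3000, 4000, 5000],
    "TotalHouseholds": [400, 810, 1190, 1600, 2020],
    "IncomePerCapita": [30, 25, 40, 28, 35],
    "HouseholdMedianIncome": [61, 50, 81, 55, 70],
    "StarbucksCount": [0, 2, 1, 0, 3],
}

kept = drop_correlated(columns)
print(kept)
```

In this toy data, households track population and household median income tracks income per capita, so one predictor from each pair is dropped.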

Prediction Results

I chose the top result (highest predicted probability), zip code 11794, to analyze further and mapped the area. The model predicted a 99% chance of this zip code having a Dunkin’. I searched for nearby Dunkin’s to see if a store had opened between 2017 and 2021, and there was a location right at the edge of the zip code. This gives some confidence that the model is at least somewhat accurate.

Map and area boundaries for zip code 11794

Massachusetts

After New York, I was curious about Massachusetts since it had the lowest People per Dunkin’ value. Running a similar regression, I found that none of the factors were statistically significant and therefore decided against creating a correlation matrix.

Logistic Regression Output

When I looked at zip codes where the model predicted a Dunkin’ would be present, I got only 3 results, each with a very low probability. This suggests that the model is not that useful for Massachusetts, likely because the state already has a high density of Dunkin’ locations.

Prediction Results

Texas

In contrast to Massachusetts, Texas had one of the higher People per Dunkin’ values and also had over 50 stores within state borders. As in New York, a similar regression showed that the number of Starbucks within the same zip code was the only statistically significant factor, and it had the highest correlation with the DunkinPresence variable (which indicates whether the zip code has at least 1 Dunkin’).

Logistic Regression Output
Correlation Matrix Output

Similar to New York, when I created a bar graph to see how the number of Starbucks relates to the presence of a Dunkin’ within the same zip code, there was a greater chance of a Dunkin’ being present if there’s only 1 Starbucks.

Bar graph showing relationship between the number of Starbucks and presence of Dunkin’ in TX

Generally, in regressions you don’t want to include predictors that are highly correlated with one another. Therefore, I again removed the predictors ‘TotalHouseholds’, ‘HouseholdMedianIncome’, and ‘FamilyMedianIncome’ from the model. I then used this model, fit on the demographic data of 2017 Dunkin’ locations, to see which zip codes look like “Dunkin’ locations” but didn’t have one as of 2017.

Prediction Results

I chose the top result (highest predicted probability), zip code 75093, to analyze further and mapped the area. The model predicted only a 54% chance of this zip code having a Dunkin’. I searched for nearby Dunkin’s to see if a store had opened between 2017 and 2021, and there was 1 Dunkin’ within the zip code and 1 more right outside the boundary. This gives some confidence that the model is at least somewhat accurate, and it suggests that Dunkin’ is growing in popularity in Texas, since there was not just 1 location but 2.

Map and area boundaries for zip code 75093

Given my hypothesis of increasing Dunkin’ popularity in Texas, I analyzed Google search interest specifically in Texas over the past 5 years. As I predicted, there has been an increase in interest, especially in 2020!

Google search interest in Texas for ‘Dunkin’ over time

Competitor Analysis

Since the number of Starbucks within the same zip code was the only statistically significant factor that affected Dunkin’ store placement in both New York and Texas, I wanted to delve deeper into the competitive landscape of Dunkin’ vs. Starbucks.

Google Search

Google search interest for ‘Dunkin’ and ‘Starbucks’ over time

Based on Google search interest, Starbucks has held a steady lead over Dunkin’ since 2004, and that gap has only widened in recent years. This could partially be due to the relative growth of the two franchises. Starbucks has a larger footprint, with over 30,000 locations worldwide, compared to Dunkin’s 11,300 locations.

Twitter

I analyzed the Twitter pages of both brands to see if there were differing marketing tactics to potentially explain how both chains could successfully coexist in similar zip codes across the US.

Dunkin’

I found that Dunkin’s top 10 words consisted mostly of marketing terms such as “email,” “assist,” “dm,” “perks,” and “information.”

(left) Wordcloud containing top 100 words from Dunkin’ Twitter page; (right) frequency of top 10 words tweeted by Dunkin’
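A top-words count of this kind can be sketched with Python’s standard library. The snippet is a simplified illustration rather than the pipeline behind the word cloud: the stop-word list is a small hand-picked assumption, and the sample strings are invented stand-ins, not real tweets from either brand.

```python
import re
from collections import Counter

# A tiny, hand-picked stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {
    "a", "an", "and", "the", "to", "us", "your", "you", "we", "for",
    "can", "so", "please", "with", "our", "of", "is", "in", "on",
}


def top_words(texts, n=10, stop_words=STOP_WORDS):
    """Return the n most frequent non-stop words across all texts."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(word for word in words if word not in stop_words)
    return counts.most_common(n)


# Invented sample strings for illustration only (not real tweets).
sample = [
    "Please DM us your email so we can assist",
    "DM us your email and we can assist with your perks",
    "Send a DM for more information",
]

result = top_words(sample, n=3)
for word, count in result:
    print(word, count)
```

The same function can be run on each brand’s tweets separately so the two top-10 lists can be compared side by side.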

An example of a Dunkin’ tweet that includes these marketing terms, posted on April 7, 2021, is shown below.

Chris D’Amico, SVP group creative director on Dunkin’ Donuts at Hill Holliday, says, “More than ever, Dunkin’ is a brand that listens to its guests, through multiple channels, at all levels of the organization. (The brand) wants to hear (fans’) stories, retell them, and genuinely interact with them.”

Starbucks

On the other hand, I found that Starbucks’ top 10 words consisted mainly of words expressing positive feelings, such as “favorite,” “cheers,” “happy,” and “love.”

(left) Wordcloud containing top 100 words from Starbucks Twitter page; (right) frequency of top 10 words tweeted by Starbucks

An example of a Starbucks tweet that includes these positive feelings, posted on April 7, 2021, is shown below.

Jonah Disend, founder and CEO of innovation agency Redscout, says, “As a marketer, Starbucks is a cultural crusader that has been extremely successful at creating a lifestyle brand. With its own lingo, menus and services, Starbucks didn’t set out to create and market a chain of coffee cafes. They focused on creating a lifestyle experience with the much-heralded ‘third space.’”

Tale of Two Different Marketing Styles

The success of Dunkin’s and Starbucks’ differing marketing styles stems from the nature of their customers. TrueLens conducted an analysis back in 2013 that compared social data from both chains and reported that Dunkin’ drinkers tended to be “social moms, sports fanatics, family travelers,” while Starbucks drinkers were “college age, early adopters, music enthusiasts.”

Although a lot has probably changed since 2013 for both chains, I think this speaks to how Dunkin’ and Starbucks can coexist within the same zip codes: they target different demographics of coffee drinkers.

A common game theory concept, the Nash equilibrium, sums this up well. In a Nash equilibrium, no player can improve their outcome by unilaterally deviating from their current strategy. Therefore, Starbucks and Dunkin’, no matter how close they are to one another, can coexist if they maintain their brand-defining marketing strategies that attract different customers.

Therefore, it makes sense that the number of Starbucks within the same zip code was the only statistically significant factor for the placement of Dunkin’ stores in New York and Texas. If there is already at least one Dunkin’ present in an area, it would make sense to also have at least one Starbucks to satisfy the needs of the coffee-drinking demographic that Dunkin’ does not cater to, and vice versa.
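To illustrate what a Nash equilibrium looks like in a setting like this, the sketch below searches a small two-player game for strategy pairs where neither player gains by switching alone. Everything in it is hypothetical: the two strategy labels (“everyday” and “lifestyle”) and the payoff numbers are invented for illustration and are not estimates for either company.

```python
def pure_nash_equilibria(payoffs):
    """Find pure-strategy Nash equilibria of a two-player game.

    payoffs[(a, b)] = (payoff to player 1, payoff to player 2)
    when player 1 plays a and player 2 plays b.
    """
    strategies_1 = sorted({a for a, _ in payoffs})
    strategies_2 = sorted({b for _, b in payoffs})
    equilibria = []
    for a in strategies_1:
        for b in strategies_2:
            p1, p2 = payoffs[(a, b)]
            # Neither player can do strictly better by deviating alone.
            best_1 = all(payoffs[(alt, b)][0] <= p1 for alt in strategies_1)
            best_2 = all(payoffs[(a, alt)][1] <= p2 for alt in strategies_2)
            if best_1 and best_2:
                equilibria.append((a, b))
    return equilibria


# Hypothetical payoffs: player 1 is Dunkin', player 2 is Starbucks.
payoffs = {
    ("everyday", "everyday"): (2, 1),
    ("everyday", "lifestyle"): (3, 3),
    ("lifestyle", "everyday"): (1, 1),
    ("lifestyle", "lifestyle"): (1, 2),
}

equilibria = pure_nash_equilibria(payoffs)
print(equilibria)  # [('everyday', 'lifestyle')]
```

With these made-up payoffs, the only equilibrium is the one where each chain keeps its own distinct positioning, which mirrors the coexistence argument in the text.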

Conclusion

Not surprisingly, we can see that most Dunkin’ locations are clustered along the East Coast, especially in the Northeast and Mid-Atlantic regions. As you travel westward, franchise locations begin to dwindle, and the Northwest region has zero Dunkin’ locations.

Out of factors such as population, number of households, income per capita, household median income, family median income, and number of Starbucks stores within the same zip code, the number of Starbucks was the only statistically significant factor in both New York and Texas.

When analyzing the relationship between Dunkin’ and Starbucks, a Twitter analysis revealed differing marketing styles that reflect their different target demographics. Viewed through the lens of a Nash equilibrium, Starbucks and Dunkin’, no matter how close they are to one another, can coexist (and benefit one another!) if they maintain their brand-defining marketing strategies that attract different customers.

Overall, the analysis presented doesn’t tell us specifically where Dunkin’ should open a location, but rather provides a model for narrowing down good locations for Dunkin’ stores, especially in regions with a weaker Dunkin’ presence (e.g., the Northwest). This data serves as a platform for setting up Dunkin’ franchising strategy and attaining growth and expansion goals across the United States.

Data Used

  • 2017 Dunkin’ Locations from Andrew Ba Tran’s GitHub
  • 2017 Starbucks Locations from Starbucks on Kaggle
  • Population and Demographic Data by Zip Code from Census Bureau (2010, 2017)
  • Twitter (user profiles: DunkinDonuts, Starbucks)
  • Google (search terms: Dunkin, Starbucks)

About the Author

Tanvi Kongara is a sophomore at the Wharton School at the University of Pennsylvania, studying Healthcare Management and Business Analytics. This data project was conducted for the course OIDD245: Analytics & The Digital Economy.