Researchers exploit a crack in the data to answer the question of just how much Yelp reviews affect consumer behavior.

Customers wait in line for a sandwich at the highly reviewed Bakesale Betty in Oakland, California. (bgreenlee/Flickr)
Over the past half-decade, Yelp has become the king of the crowd-sourced-reviews hill, with some 27 million reviews and an average of 70 million unique visitors per month. But despite its prominence, its effects have been difficult for researchers to pin down. How much do Yelp ratings affect a restaurant's business?
Anecdotal evidence -- the great tailor whose positive Yelp reviews led to the business's expansion -- is common, but sussing out a more widespread effect is hard. It's a chicken-or-the-egg problem. Did a restaurant fail because its reviews were bad or because its food was bad, garnering negative reviews? Did another succeed because of its winning food and service, or because positive reviews brought customers to its doors? How can you ever know?
In a new paper in The Economic Journal, Michael Anderson and Jeremy Magruder of Berkeley demonstrate a tool for cracking that nut wide open. They find that a half-star difference in a Yelp rating can dramatically change the rate at which a restaurant's reservations fill up on a given night. How do they know that they are seeing the effect of the rating and not the quality of the restaurant itself? They use a statistical tool known as "regression discontinuity."
Here's how it works. When you look at a Yelp page for a restaurant, you will see the average of its reviews displayed as a star rating at the top, rounded to the nearest half-star -- e.g. 3.0, 3.5, 4.0, 4.5, or 5.0. But of course those discrete ratings represent a much smoother distribution. In other words, when Yelp takes the average of its ratings, it gets numbers that are not 3.0 or 3.5, but that fall in between. Thus two restaurants can have very, very similar averages, say 3.24 and 3.26, but be rounded into different half-star categories, 3.0 and 3.5.
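To make the rounding concrete, here is a minimal sketch -- an illustration, not Yelp's actual code -- of how a continuous average snaps to the nearest half-star, landing two nearly identical restaurants in different display categories:

```python
# Illustrative only: round a continuous average rating to the nearest
# half-star, the way a displayed star rating works.

def displayed_stars(average_rating: float) -> float:
    """Round a continuous average rating to the nearest half star."""
    return round(average_rating * 2) / 2

# Two restaurants with nearly identical averages end up on opposite
# sides of a display threshold.
print(displayed_stars(3.24))  # 3.0 -- rounds down
print(displayed_stars(3.26))  # 3.5 -- rounds up
```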
This feature of the data gives the researchers the material they need to be sure that the effect they're seeing comes not from the quality of the restaurant but the bright stars of the Yelp review.
Their study of nearly 400 restaurants in San Francisco during the late summer and early fall of 2010 found a large difference in reservation availability between restaurants whose average rounded up and those whose average rounded down. For example, here is what the distribution of 7 pm availability looks like for restaurants whose averages round down to 3.5 stars versus those that round up to four:

As you can see, there is a big drop in availability for restaurants right around the point at which reviews shift from rounding down to rounding up, not a gradual decline as you would expect if consumers were responding only to the change in quality. (The researchers note that the drop actually occurs a few tenths of a point before the threshold, but they explain this simply: "Average ratings drift over time," they write. "A restaurant currently just below the threshold is thus likely to have been above [it] in the preceding months.") Overall, they see effects of varying sizes for different reservation times and different ratings thresholds. The seven o'clock time slot sees the most significant effects: "Here," they describe, "moving from 3 to 3.5 stars is associated with being 21 percentage points more likely to have sold out all 7:00 PM tables and moving from 3.5 to 4 stars makes restaurants an additional 19 percentage points more likely to have sold out all tables."
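For readers who want the logic of regression discontinuity spelled out, here is a rough sketch in Python using made-up numbers, not the paper's data. The threshold and bandwidth values are assumptions chosen for illustration: restaurants whose continuous averages sit just below and just above a rounding cutoff are treated as essentially identical in quality, so any jump in the sell-out rate at the cutoff is attributed to the displayed stars.

```python
# Illustrative regression-discontinuity comparison with simulated data.
import numpy as np

rng = np.random.default_rng(0)

threshold = 3.75   # assumed cutoff: averages at/above this display as 4.0 stars
bandwidth = 0.10   # only compare restaurants within +/- 0.10 of the cutoff

# Hypothetical continuous averages for restaurants near the cutoff.
avg_rating = rng.uniform(threshold - bandwidth, threshold + bandwidth, size=500)
rounded_up = avg_rating >= threshold

# Simulate a 20-point jump in the chance of selling out when the display rounds up.
sold_out = rng.random(500) < (0.30 + 0.20 * rounded_up)

# The simplest RD estimate: mean outcome just above minus mean outcome just below.
effect = sold_out[rounded_up].mean() - sold_out[~rounded_up].mean()
print(f"Estimated jump in sell-out probability at the threshold: {effect:.2f}")
```

Because the restaurants on either side of the cutoff are, by construction, nearly indistinguishable in underlying quality, the estimated jump isolates the effect of the displayed rating rather than the food or service.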
The researchers make some back-of-the-envelope calculations to estimate the potential effect slight changes in Yelp reviews could have on profits, which they gauge to be an additional $816 per week for a bump up the review scale. That level of specificity is necessarily a bit thin, but the fact remains: Very slight changes in Yelp reviews can result in big swings for the restaurateurs on the receiving end.
