World Cup Data Science

The current FIFA World Cup not only fascinates football fans from all over the world, but also inspires many people to build fancy data visualizations, and well, so did we. However, simply visualizing was too boring, so we decided to run a World Cup data science project, called it ETHWisdom, and made it available at http://web.sg.ethz.ch/world-cup.
Below I paste a post we wrote for http://bigdatamundial.betterdecisionsforum.com/, which, by the way, hosts a wealth of super interesting big data World Cup projects, so check it out.

The whole world is in a football frenzy. With favourites like England, Spain, Italy and Portugal already out in the group stage, while “underdogs” like Costa Rica, Chile and the USA progressed further, this World Cup seems to be the most unpredictable in recent history. But how unpredictable is the tournament really? Could it be that the collective football intuition of many people (experts and laymen alike) is better than most individual guesses? This is what our project at http://web.sg.ethz.ch/world-cup/ set out to study.

To this end, we collect data from roughly one thousand international students and researchers at ETH Zürich, the Swiss Federal Institute of Technology, who bet on the outcomes of single matches. Each individual bet is a single prediction, or guess, for the outcome of a particular match. An individual is awarded 4 points if the prediction matches the exact result, 3 points if it matches only the goal difference, 2 points if it matches only the winner, and 0 points otherwise. We provide a set of visualisations that make it easy to gain insight into the collective intuition for each game.
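To make the scoring rule concrete, here is a minimal Python sketch of how a single bet could be scored. It is an illustration only, not the code behind ETHWisdom: the function name `score_bet` and the representation of a result as a `(home_goals, away_goals)` tuple are our own assumptions. We also assume that a predicted draw with the wrong score (say 1:1 instead of 2:2) counts as a correct goal difference and earns 3 points, which the rule above does not spell out.

```python
def score_bet(predicted, actual):
    """Score one bet under the 4/3/2/0 rule.

    Both arguments are (home_goals, away_goals) tuples.
    Hypothetical helper for illustration, not the project's actual code.
    """
    if predicted == actual:
        return 4  # exact result

    pred_diff = predicted[0] - predicted[1]
    act_diff = actual[0] - actual[1]

    if pred_diff == act_diff:
        # Same goal difference but different score.
        # Assumption: this also covers draws with the wrong score.
        return 3

    if pred_diff * act_diff > 0:
        # Both differences have the same sign, so the same team wins.
        return 2

    return 0  # wrong winner, or a draw predicted/played but not both


# Example: the match ends 2:1 for the home team.
for bet in [(2, 1), (1, 0), (3, 1), (1, 1), (0, 2)]:
    print(bet, "->", score_bet(bet, (2, 1)))
```

Working with the sign of the goal difference keeps the check for the winner to a single comparison, and it avoids treating draws as a special case.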

However, there is more to this than visualisation. A social phenomenon known as the wisdom of crowds postulates that the aggregate of many individual guesses can often be more accurate than each individual guess taken alone. Importantly, this mechanism is also the driving force behind speculative markets known as prediction markets. These exchanges are used in various contexts, such as predicting stock prices, presidential elections, the box office success of films, or future sales.
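As a rough illustration of what an "aggregate guess" could look like for a football match, the following Python sketch takes a list of individual bets and picks the most frequent scoreline and the most frequent outcome. This is only one of many possible ways to aggregate, and we do not claim it is the method used on the project site. The names `outcome`, `crowd_guess` and `compare_to_individuals`, as well as the sample bets, are made up for this example.

```python
from collections import Counter


def outcome(score):
    """Map a (home_goals, away_goals) tuple to 'home', 'draw' or 'away'."""
    home, away = score
    if home > away:
        return "home"
    if home < away:
        return "away"
    return "draw"


def crowd_guess(bets):
    """Return the most frequent scoreline and the most frequent outcome.

    Assumption: the crowd's guess is the mode of the individual bets.
    On ties, Counter.most_common keeps the first value it encountered.
    """
    top_score = Counter(bets).most_common(1)[0][0]
    top_outcome = Counter(outcome(b) for b in bets).most_common(1)[0][0]
    return top_score, top_outcome


def compare_to_individuals(bets, actual):
    """Return (crowd got the outcome right?, share of individuals who did)."""
    _, top_outcome = crowd_guess(bets)
    hits = sum(outcome(b) == outcome(actual) for b in bets)
    return top_outcome == outcome(actual), hits / len(bets)


# Made-up bets for a single match, not real data.
bets = [(2, 1), (1, 0), (2, 1), (1, 1), (0, 1), (2, 0)]
print(crowd_guess(bets))
print(compare_to_individuals(bets, actual=(3, 1)))
```

Repeating the comparison over many matches would show whether the crowd's pick is right more often than a typical individual, which is the question this project asks.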

But are there events whose nature is so unpredictable that even the wisdom of crowds fails? Is football one such example? An affirmative answer would imply that there are situations in which asking the crowd, as in a prediction market, is not the optimal strategy.

We think these are exciting questions. The jury is still out, and we hope to gain more insights as the tournament progresses. It will be interesting to see whether the performance of the crowd improves over the course of the World Cup, as the audience gets to know the teams better. You are invited to join us in this little experiment and to check out our analysis here.