The best Data Science students in the world compete in the Data Science Game this month!

For the third year in a row, the Data Science Game brings together Data Science students from around the world in a prestigious competition. This year, about 340 teams representing more than 250 universities from 40 countries faced a demanding, innovative, real-life business challenge in the qualification phase. To stand out and win their ticket to the finals, students had to design and implement predictive models related to Big Data issues.


An algorithm to tell whether you will like the music stream… or not!

Who has never dreamt of lying back and listening to the right music, without any scrolling or browsing, and still getting the perfect tune? But what defines this perfect tune?

This year, the qualification challenge of the Data Science Game dealt with music recommendation, a research question still challenging music services today. Our sponsor, Deezer, is a music streaming app, also available on the web. It has more than 43 million tracks in its catalogue and is available in more than 180 countries through a free limited service and a premium offer.

For this online challenge, the 340 teams focused on Flow, Deezer’s own music recommendation radio. Flow uses collaborative filtering to provide users with the music they want to listen to at the right time. If a user dislikes the song being played despite the recommendation, they can skip it by pressing the ‘Next song’ button. This new data is then fed back into the algorithm to improve its accuracy. In this challenge, the competitors had to predict whether users were likely to “like” a given song and listen to it for more than 30 seconds, or skip it before the 30-second mark.
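The prediction target described above is a binary label. As a minimal sketch, assuming each listening event records how many seconds were played, the label could be derived as follows. The ListeningEvent type and its field names are hypothetical and are not taken from the challenge dataset, and treating a listen of exactly 30 seconds as a skip is an assumption based on the "more than 30 seconds" wording.

```python
from dataclasses import dataclass

# Threshold taken from the challenge description: a track counts as
# "listened" when it is played for more than 30 seconds.
LISTEN_THRESHOLD_SECONDS = 30.0


@dataclass
class ListeningEvent:
    """Hypothetical record of one track played to one user."""
    user_id: str
    track_id: str
    seconds_listened: float


def is_listened(event: ListeningEvent) -> int:
    """Return 1 if the track was listened to, 0 if it was skipped.

    Assumption: exactly 30 seconds counts as a skip.
    """
    return 1 if event.seconds_listened > LISTEN_THRESHOLD_SECONDS else 0


# Illustrative events only, not real data.
events = [
    ListeningEvent("user_a", "track_1", 45.0),
    ListeningEvent("user_a", "track_2", 12.5),
    ListeningEvent("user_b", "track_1", 30.0),
]
labels = [is_listened(event) for event in events]
```

A model for this task would then be trained to predict these 0/1 labels from whatever user and track features are available.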

Great success of XGBoost and ensembling methods

The majority of the top 40 teams used XGBoost together with ensembling methods. These machine learning techniques are known to be particularly effective at modelling complex phenomena in a Big Data context.
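To illustrate what ensembling means in practice, here is a minimal sketch of one common form: a weighted average of the probabilities predicted by several models. It is not a description of any team's actual solution. The function name blend_predictions, the weights, and the sample probabilities are all assumptions made for illustration, and the per-model outputs stand in for what trained models (for example gradient-boosted trees) would produce.

```python
from typing import Optional, Sequence


def blend_predictions(
    model_outputs: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> list:
    """Blend per-model probabilities with a weighted average.

    model_outputs holds one sequence of probabilities per model,
    all aligned on the same examples. Equal weights are used when
    none are given.
    """
    if not model_outputs:
        raise ValueError("at least one model output is required")
    n_examples = len(model_outputs[0])
    if any(len(output) != n_examples for output in model_outputs):
        raise ValueError("all models must score the same examples")
    if weights is None:
        weights = [1.0] * len(model_outputs)
    if len(weights) != len(model_outputs):
        raise ValueError("one weight per model is required")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")

    blended = []
    for i in range(n_examples):
        weighted_sum = sum(w * output[i] for w, output in zip(weights, model_outputs))
        blended.append(weighted_sum / total_weight)
    return blended


# Illustrative probabilities of "listened" from two hypothetical models.
model_a = [0.9, 0.2, 0.6]
model_b = [0.7, 0.4, 0.2]

blended = blend_predictions([model_a, model_b])
# Turn blended probabilities into listen/skip predictions at 0.5.
predictions = [1 if p >= 0.5 else 0 for p in blended]
```

Averaging tends to help when the individual models make different errors, which is one reason ensembles are popular in competitions of this kind.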

Thanks to these models, the top 20 teams achieved scores between 66% and 69%. The three top-scoring teams, who are heading to the finals, all come from Russian universities: Moscow State University, Higher School of Economics and Skoltech. But the competition is far from over. For the 20 finalist universities, it will take more work and energy to succeed in the final round taking place in September.

After 47 days of competition and 5,593 algorithms submitted, the 20 finalists are:

Rank University Country
1 Moscow State University Russia
2 Higher School of Economics Russia
3 Skoltech University Russia
4 IIMC India
5 Toulouse School of Economics France
6 USP Sao Paulo Brazil
7 IMT Atlantique France
8 Stevens Institute of Technology USA
9 University of Edinburgh United Kingdom
10 Federal University of Alfenas Brazil
12 Ukrainian Catholic University Ukraine
14 Universidad Nacional de Ingenieria Peru
15 ENSIMAG France
16 St Petersburg University Russia
17 Université Toulouse Paul Sabatier France
18 HSE NN Russia
21 UPMC France
23 Humboldt University Germany
27 USP Sao Carlos Brazil
33 Barcelona Graduate School of Economics Spain

On the way to the final phase in Paris

On September 28th, 29th and 30th, Paris will welcome the Data Science Game finals, an international student hackathon focused on Big Data analytics. From the 340 initial teams, twenty groups from around the world will defend their university’s reputation in the 2017 Data Science Game.

This year once again, the Data Science Game finalists can count on the support of the competition's partners, who are key contributors in the field of Data Science. Thanks to Capgemini, a global leader in consulting, technology and outsourcing services, participating students will have the opportunity to stay in an exceptional historic place: « Les Fontaines » (the Capgemini Group’s University Campus) near Paris.


John Brahim, Head of Capgemini Group’s Insights & Data Global Practice, said:

“As artificial intelligence gains adoption and digital shifts from being more than an interaction with the customer to embracing all facets of the value chain, real-time insights and analytics are becoming the defining value part of the equation. The competition at the Data Science Game will epitomize the advent of artificial intelligence as the main frontier for data science, and will help inspire the next generation of data specialists and provide them with the environment to experience first-hand the complexities of solving real business challenges. We hope that this will encourage them to go on to pursue a stimulating career in data analytics.”

The teams will also benefit from the expertise of the Data Innovation Lab, created by the AXA Group in early 2014 with the objective of creating value for its customers based on their data.

“At AXA we strongly believe in the transformational potential of data for our clients and our employees. Three years ago, we created the Data Innovation Lab around a pool of highly skilled and international data scientists and developers. Working in an industrial big data environment, the DIL provides innovative products and personalized services that go beyond the boundaries of traditional insurance. After a successful edition in 2016, we are glad to be, once again, sponsors of the 2017 Data Science Game, contributing to this international student challenge in which team spirit, creativity, and excellence in data science are key success factors. As in previous editions, we are convinced that these two days of interaction between our data scientists and the students will be not only lots of fun, but also a very enriching experience for both.”
Marcin Detyniecki, Head of Data Science and R&D, AXA Data Innovation Lab

For the second year in a row, Microsoft's expertise in Data Science and cloud infrastructure will be an asset for our competitors.

“Cortana Intelligence Suite is Microsoft’s fully managed big data and advanced analytics suite. With Cortana Intelligence, students can access a rich set of data science tools including Azure Machine Learning, Jupyter notebooks on R and Python as well as Artificial Intelligence API’s to transform data into intelligent action. With GPU Compute Infrastructure in Azure they will be able to accelerate training their deep learning models in the cloud. At the 2017 Data Science Game, it will be great to see how the next generation of data scientists will use our platform in innovative ways to develop compelling solutions.”
Christophe Shaw, Director, Commercial Software Engineering, Microsoft France

The contest is also being supported by Quantcube Technology, the predictive analytics company, Milliman, the independent actuarial firm, Valeo, the world-leading global automotive supplier, Numberly, expert in digital marketing solutions, NVIDIA, the hardware provider, and other partners: Zelros, Paris-Saclay University and ActInfo.

These two days of competition in September will provide a unique opportunity for the contestants to show their skills in the presence of data specialists. Supported throughout the weekend by advice from Data Scientists at the partner firms, the students will be in the best possible setting to learn and to enjoy the thrill of the competition.

One question remains: will two-time winner Russia be dethroned this year?