Behind the Scenes of our Data Science Hackathon

A Conversation with Foursquare’s Senior Engineering Manager Zisis Petrou


This summer, Foursquare hosted its first-ever data science hackathon on Kaggle. Participants were challenged to predict which POI updates represented the same point of interest (POI) in the physical world, using a simulated, global dataset of one million pairs of updates drawn from different sources of varying quality. The top models with the most accurate matches won a cash prize.
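To make the task concrete, here is a minimal sketch of what a pairwise POI-matching setup can look like: given two records, compute similarity features (name similarity, geographic distance) that a classifier could consume to decide whether they describe the same place. The record schema, field names, and features below are illustrative assumptions, not the actual competition format.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from math import radians, sin, cos, asin, sqrt

# Hypothetical POI record; fields are illustrative, not the competition schema.
@dataclass
class POI:
    name: str
    lat: float
    lon: float

def haversine_km(a: POI, b: POI) -> float:
    """Great-circle distance between two POIs in kilometers."""
    lat1, lon1, lat2, lon2 = map(radians, (a.lat, a.lon, b.lat, b.lon))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(h))

def match_features(a: POI, b: POI) -> dict:
    """Pairwise features a downstream classifier could score."""
    return {
        "name_sim": SequenceMatcher(None, a.name.lower(), b.name.lower()).ratio(),
        "dist_km": haversine_km(a, b),
    }

a = POI("Joe's Pizza", 40.7306, -73.9866)
b = POI("Joes Pizza", 40.7307, -73.9865)
print(match_features(a, b))  # high name similarity, tiny distance
```

In practice such features would feed a learned model rather than fixed thresholds, and real entries add the noise Zisis describes: missing fields, typos, and multilingual names.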

The competition was tough, drawing 22,285 submissions from 1,295 participants on 1,079 teams from across the world. We sat down with Zisis Petrou to learn how Foursquare's engineering team shaped this competition and identified the winners.

Zisis is a senior engineering manager on the Places Data Science team at Foursquare. His team designs and implements data science and machine learning pipelines that enrich Foursquare's places-of-interest (POI) products. Examples include predicting confidence that a POI is real, its open/closed status, and the correctness of its geographic and firmographic attributes.

How did the engineering team work together to create this competition?

A cross-functional team of engineers got together at the design stage of the competition to brainstorm and evaluate different scenarios for its objective and nature. Once the main scope was decided, a team of engineers closer to the competition objective took over the detailed design. This included precisely defining the problem the competition would pose, the dataset, the target variable, and the evaluation metric used to score participant submissions. The team went through several iterations so that the competition would offer a snapshot of the challenging problems Foursquare works on and engage the data science community, while keeping onboarding as friendly as possible.

What data sets did you consider when creating this challenge, and how did you decide on this specific problem and data set?

Foursquare owns vast and diverse sets of data around the main pillars of places of interest and visits, as well as enriched datasets around these pillars used for applications like targeting, demand forecasting, and site selection. The team considered multiple datasets and decided to build the competition around the Places dataset and a data matching problem similar to the ones we work on at Foursquare. This choice had several advantages: Places is an easy-to-comprehend dataset with minimal domain-specific terminology, which allows easy onboarding of participants; at the same time, data matching is challenging, involving a variety of attributes in the presence of noisy, incomplete, or multilingual entries; and the problem is naturally set up in a tabular format that favors popular algorithms and parallel processing of large-scale data.

Did any of the solutions surprise you in the way they approached the problem and solved it?

We were surprised by the diversity of submitted solutions, especially by solutions that achieved very similar performance but approached the problem from fundamentally different perspectives. While many solutions shared a common denominator, such as leveraging well-established language-understanding machine learning models or using multi-stage matching architectures, there were also major differences. Some solutions employed domain-specific knowledge to handcraft highly predictive features, while others used advanced embedding techniques to generate features automatically in a domain-agnostic way. It was surprising to see vastly different algorithms achieve very similar high performance.
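The multi-stage matching architectures Zisis mentions typically pair a cheap candidate-generation ("blocking") stage with a more expensive scoring stage, so that only plausible pairs are compared. The sketch below illustrates the pattern under stated assumptions: a toy record format, fixed-grid geographic blocking, and a trivial name-overlap score standing in for a learned model.

```python
from collections import defaultdict

def cell_key(lat, lon, size=0.01):
    """Map coordinates to a coarse grid cell (~1 km at this size)."""
    return (round(lat / size), round(lon / size))

def candidate_pairs(records):
    """Stage 1 (blocking): only compare records sharing a grid cell."""
    buckets = defaultdict(list)
    for i, (name, lat, lon) in enumerate(records):
        buckets[cell_key(lat, lon)].append(i)
    pairs = []
    for ids in buckets.values():
        for x in range(len(ids)):
            for y in range(x + 1, len(ids)):
                pairs.append((ids[x], ids[y]))
    return pairs

def score(a, b):
    """Stage 2 (scoring): stand-in similarity via word overlap (Jaccard)."""
    wa, wb = set(a[0].lower().split()), set(b[0].lower().split())
    return len(wa & wb) / max(len(wa | wb), 1)

records = [
    ("Blue Bottle Coffee", 37.7764, -122.4231),
    ("Blue Bottle Cafe", 37.7765, -122.4232),
    ("Golden Gate Bakery", 37.7950, -122.4070),
]
matches = [(i, j) for i, j in candidate_pairs(records)
           if score(records[i], records[j]) >= 0.5]
print(matches)  # the two "Blue Bottle" records survive both stages
```

Production systems refine both stages, for instance using overlapping or hashed blocks so near-boundary pairs are not missed, and a trained classifier in place of the fixed-threshold score.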

What criteria helped you determine the winners, and how did you come to that consensus?

From the early stages of the competition design we collaborated closely with the Kaggle data science team. We thoroughly designed the format of the competition, as well as the evaluation measure used to score submissions. We defined the evaluation measure to be comprehensible to participants, and not discouragingly hard but not easy to saturate or get close to perfect, either. Once we formulated the evaluation measure and the test sets, participant submissions were scored objectively and automatically by the Kaggle pipelines.

What did you see in the winners’ presentations that you might take inspiration from in your own work?

The winning solutions had surprising diversity and very creative elements. It was inspiring to follow the process by which each winning team approached the problem; the assumptions they made; the way they used feedback from early submissions to iterate and improve their solutions; the way they built on insights other participants shared in forum discussions; the persistence of teams that worked from the launch to the end of the competition; and the way they combined various libraries, machine learning models, and algorithms in creative and sometimes counterintuitive ways. Such out-of-the-box thinking was the element that gave some of the winning solutions their extra boost, and it is a great source of inspiration.