The comprehensive, industry-leading approach to building accurate and reliable Places data
For years, Foursquare has been at the forefront of providing accurate and reliable location data. And today, our Places product is the leading independent location-based platform, a position we sustain by creating, maintaining and refreshing a massive global dataset of over 200M POI.
FSQ Places by the numbers
Places, the backbone of our offerings at Foursquare, supports a diverse range of applications, from APIs and consumer apps to our Targeting and Attribution products. All our technologies are anchored in precise point-of-interest (POI) data. Enter the Places Engine.
The Places Engine fundamentally changes how we create, update, and validate POI data. It allows us to combine a breadth of programmatic data sources with the depth of human-verified sources to deliver a curated and accurate dataset for all of our products and services at Foursquare. Now, we are excited to share some updates with you all.
With the new Places Engine, we are taking advantage of coverage sources like web crawls, listing syndicators, and data partners, and combining them with the human power of Foursquare’s Superuser community and their high levels of app engagement to surface the most accurate and nuanced dataset available.
At Foursquare, we have always relied on a combination of programmatic data sources and user-generated content to populate our Places database. While this methodology has served us well, we have the opportunity to improve the accuracy and depth of our dataset even further by adding a human layer distinct from 3rd party data.
Places Engine is reinventing operational sustainability by connecting all the systems that make our data flow. In this pattern, data will move from app user to knowledge graph and back again, starting the virtuous cycle all over.
Innovating for the future: utilizing the human element
A human element has always played an important role in our data process at Foursquare. Now, Places Engine will add human moderation to the very core of our data process and validation.
Prior to Places Engine, humans created user-generated content and provided inputs for POI in our Places database, which weren’t viewed much differently than our 3rd party and crawl data. With this methodology, we looked for consensus across all sources, which meant sources were generally treated the same, without prioritization.
Looking at a fictitious example… ‘The Donut Pub’
Let’s imagine that two different humans verified that ‘The Donut Pub’ is located at 740 Broadway, but three programmatic sources (crawl and third-party data vendors) said it was called ‘Donut Pub’ and located at 10 Astor Pl.
In this case, Places would prioritize the three programmatic sources, rather than the two human inputs. Historically, Places would deliver the location that has greater consensus (10 Astor Place) to clients, which could mean the next time a user tries to navigate to a ‘Donut Pub’ in New York City, they may arrive at the wrong location if it is in fact located at 740 Broadway.
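The consensus rule in this example can be illustrated with a minimal Python sketch. The report list and the `consensus_value` helper are hypothetical and exist only to mirror the fictitious ‘Donut Pub’ scenario; they are not Foursquare’s actual implementation.

```python
from collections import Counter

# Hypothetical reports for the fictitious 'The Donut Pub' example:
# two human verifications and three programmatic sources.
reports = [
    ("human", "740 Broadway"),
    ("human", "740 Broadway"),
    ("programmatic", "10 Astor Pl."),
    ("programmatic", "10 Astor Pl."),
    ("programmatic", "10 Astor Pl."),
]

def consensus_value(reports):
    """Return the value reported by the most sources, treating every source equally."""
    counts = Counter(value for _source_type, value in reports)
    return counts.most_common(1)[0][0]

print(consensus_value(reports))  # 10 Astor Pl.
```

Because every source counts the same, the three programmatic reports outvote the two human verifications.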
The lesson here is that a consensus-based approach fails to fully capture the human experience, which is the ultimate ground-truth source for nuances of the world.
Places Engine will solve this conundrum by adding a distinct human layer separate from 3rd party data sources, providing Foursquare with a view of the world that is uniquely valuable, while generating higher-quality data that can’t be found anywhere on the interwebs. Places Engine enables Foursquare to leverage human-generated data more effectively, all while maintaining access to the breadth of knowledge generated by programmatic data.
What makes the Places Engine unique?
There are three main components to the Places Engine: the Reporter, the Voter, and the re-centering of infrastructure around the idea of voting. When working together, these three components align all Foursquare products under a single dataset.
Report an Edit
SUs and each 3P source will report individually
Acceptance and Rejection Thresholds are set dynamically based on the POI’s importance to FSQ
SUs and the collective 3P sources will vote
In ‘The Donut Pub’ example, a CityGuide user may see ‘Donut Pub’ at 10 Astor Pl. and decide to try it out. Upon arrival, they will struggle to find it, realizing it isn’t a store at all. In reality, ‘The Donut Pub’ is a 5-minute walk away at 740 Broadway. The user can submit edits to the name and address in the CityGuide app.
Should their proposed edit receive sufficient corroboration, Places will begin geotagging ‘The Donut Pub’ at 740 Broadway, so that the next time a user tries to navigate there, they will arrive at the right spot to get that donut.
Human moderation will set off a series of cascading benefits across Foursquare and all of its clients. With Places Engine, Foursquare will accelerate the delivery of high-value data improvements to our Places dataset, while also increasing the speed in which our dataset adjusts to real-world changes.
Places Engine will provide the advantage of first-party data associated with POIs, granting Foursquare even more data to improve its breadth of products and experiences. Foursquare can also now leverage its highly dedicated Superuser community as a comprehensive and scalable layer of human moderation. This layer of moderation will generate depth and accuracy for our data that wasn’t available in recent years. Additionally, Places Engine will better position Places to adjust based on customer feedback using a feedback API, ultimately giving Foursquare the ability to improve the accuracy and freshness of our data.
So how does the Places Engine work?
There are two central concepts introduced with the Places Engine – Reporting and Voting. These two concepts feed into our core infrastructure.
There are two types of Reporters – User Reporters and Programmatic Reporters.
- User Reporters: Use FSQ apps and propose edits directly to a POI while in-app.
- Programmatic Reporters: Represent third-party data sources like crawled sites.
Each time a source is ingested (for the first time or after a period of time), the data will be compared to the existing dataset. If there is a POI match between the source and Foursquare’s data, and there are differences between the existing value and what the new source believes to be true, the edit will enter a voting queue. If there is no match, a POI will be generated based on the known characteristics of that POI.
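The ingestion step can be sketched roughly as follows. This is an illustration under stated assumptions, not the production pipeline: the `Dataset` class and `ingest` function are hypothetical, and matching is reduced to an exact lookup on a made-up `poi_id`, whereas real POI matching would have to compare names, addresses, and coordinates.

```python
from dataclasses import dataclass, field

@dataclass
class Dataset:
    pois: dict = field(default_factory=dict)          # poi_id -> attribute dict
    voting_queue: list = field(default_factory=list)  # proposed edits awaiting votes

def ingest(dataset, poi_id, source_attrs):
    """Compare one source record with the existing dataset and route it."""
    existing = dataset.pois.get(poi_id)
    if existing is None:
        # No match: generate a new POI from the known characteristics.
        dataset.pois[poi_id] = dict(source_attrs)
        return "created"
    # Match: keep only the attributes where the source disagrees with us.
    diffs = {k: v for k, v in source_attrs.items() if existing.get(k) != v}
    if diffs:
        dataset.voting_queue.append((poi_id, diffs))
        return "queued"
    return "unchanged"
```

For example, ingesting a record for an unknown POI returns `"created"`, while a later record that disagrees on the address returns `"queued"` and leaves the published value untouched until voting completes.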
Both reporter types are assigned a “Reporting Power” that corresponds to the source’s credentials, reporting history, quality audit results, and more. A high Reporting Power makes it easier for Superusers and sources with a proven track record of suggesting accurate edits to turn their proposals into published edits quickly.
For customers, this means a more accurate and quicker-to-adjust dataset. Additionally, this means sources and Superusers submitting low-quality information will require greater corroboration from other sources and voters for an edit to be published.
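One way to picture the relationship between Reporting Power and corroboration is the small sketch below. It assumes, purely for illustration, that Reporting Power and the acceptance threshold are numbers on the same scale and that the reporter’s own power counts toward the threshold; the function name and the values are hypothetical.

```python
def corroboration_needed(reporting_power: float, acceptance_threshold: float) -> float:
    """Voting weight an edit still needs before it can be published.

    Assumption: the reporter's power counts toward the acceptance threshold,
    so 0.0 means the edit is published without further corroboration.
    """
    return max(0.0, acceptance_threshold - reporting_power)

# Hypothetical values: a trusted Superuser versus a new, unproven source.
print(corroboration_needed(reporting_power=5.0, acceptance_threshold=3.0))  # 0.0
print(corroboration_needed(reporting_power=1.0, acceptance_threshold=3.0))  # 2.0
```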
User and Source corroboration is gained through voting.
There are two types of voters contributing to Places Engine: Superusers and Programmatic Voters.
- Superusers: Leverage on-the-ground knowledge to vote on proposed edits for POIs in their city, region, or even country.
- Programmatic Voters: Use a number of complex algorithms to examine all 3rd party data associated with the POI and synthesize a single value for all possible attributes.
Similarly to Reporters, Voters are also assigned “Voting Power.” Voting Power is dynamic as it is consistently evaluated and calibrated based on factors like credentials, voting confidence, voting history, and more. This results in voters with a known advantage or history of accurate votes having a greater impact on what is accepted or rejected.
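A weighted tally is one simple way to model how Voting Power changes the outcome. The sketch below is an assumption about the general shape of such a system, not a description of Foursquare’s algorithm; the `Vote` class, the voter names, and the power values are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Vote:
    voter: str
    approve: bool
    voting_power: float

def tally(votes):
    """Net weighted support: approvals add voting power, rejections subtract it."""
    return sum(v.voting_power if v.approve else -v.voting_power for v in votes)

# Two Superusers approve an edit; the Programmatic Voter rejects it.
votes = [
    Vote("superuser_a", True, 3.0),
    Vote("superuser_b", True, 2.0),
    Vote("programmatic_voter", False, 4.0),
]
print(tally(votes))  # 1.0
```

Here the two Superusers together outweigh the Programmatic Voter, even though a single programmatic vote carries more weight than either human vote on its own.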
If a Reporter does not have a strong enough “Reporting Power” to approve their proposed edit automatically, the proposed edit will then enter a Voting Queue. In the Voting Queue, the Programmatic Voter and the Superuser Voters will vote on the edit until it is either accepted or rejected. If the edit is accepted, the record is updated, and the Reporter’s Reporting Power is positively adjusted. In addition to this, Voting Power will also be adjusted positively for those who contributed to the edit’s acceptance or negatively for any voter that didn’t support the edit.
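The queue resolution and the power adjustments that follow it might look something like this sketch. The thresholds, the fixed `step` size, and the function names are hypothetical, and the symmetric handling of rejected edits is an assumption, since only the accepted case is described above.

```python
def resolve(votes, accept_threshold, reject_threshold):
    """votes: list of (voter, approve, power) tuples.

    Returns 'accepted', 'rejected', or 'pending' if more votes are needed.
    """
    score = sum(power if approve else -power for _voter, approve, power in votes)
    if score >= accept_threshold:
        return "accepted"
    if score <= reject_threshold:
        return "rejected"
    return "pending"

def power_adjustments(reporter, votes, outcome, step=0.5):
    """Per-participant power deltas once an edit is resolved.

    Assumes the reporter is not also listed among the voters.
    """
    if outcome not in ("accepted", "rejected"):
        return {}
    accepted = outcome == "accepted"
    # The reporter is rewarded for an accepted edit (penalized otherwise: assumption).
    deltas = {reporter: step if accepted else -step}
    # Voters who sided with the final outcome gain power; the rest lose it.
    for voter, approve, _power in votes:
        deltas[voter] = step if approve == accepted else -step
    return deltas

votes = [("su_a", True, 3.0), ("su_b", True, 2.0), ("programmatic", False, 4.0)]
outcome = resolve(votes, accept_threshold=1.0, reject_threshold=-1.0)
print(outcome)  # accepted
print(power_adjustments("reporter_x", votes, outcome))
```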
In instances where there are discrepancies with a 3rd party proposal, human moderation will have the final say in validating the inputs. To do this, a human will review and approve inputs or suggested edits to our Places database. The need for human moderation is determined by the importance of the POI that the input is related to, and the level of importance is derived from the number of check-ins received at each location.
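To make the link between check-ins, importance, and thresholds concrete, here is a minimal sketch. The logarithmic scale, the `base`, `scale`, and `cutoff` parameters, and all function names are assumptions chosen for illustration; the post only states that importance is derived from check-ins.

```python
import math

def importance(check_ins: int) -> float:
    """Map a check-in count to an importance score (log scale is an assumption)."""
    return math.log10(1 + check_ins)

def acceptance_threshold(check_ins: int, base: float = 1.0, scale: float = 1.0) -> float:
    """More important POIs require more corroboration before an edit is accepted."""
    return base + scale * importance(check_ins)

def needs_human_moderation(check_ins: int, cutoff: float = 3.0) -> bool:
    """Route edits on sufficiently important POIs to a human reviewer."""
    return importance(check_ins) >= cutoff

# Hypothetical POIs: a busy venue versus a rarely visited one.
print(needs_human_moderation(50_000))  # True
print(needs_human_moderation(20))      # False
```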
Adding this human layer provides a more nuanced approach to mapping the world around us. For example, a hotel may have a different level of importance than a coffee shop, and so we want to ensure that these distinctions are accurately reflected in our database.
The new Places Engine also allows us to be more proactive in identifying and correcting inaccuracies in our dataset. Our moderators will constantly monitor and update our database to ensure it remains as accurate and up-to-date as possible.
In addition to improving the accuracy of our dataset, Places Engine also allows Foursquare to be more efficient in our data curation process. By relying on a combination of programmatic data sources and human moderation, we can now surface the most relevant information in a streamlined manner.
The new Places Engine represents a significant step forward for Foursquare and the quality of the location data industry. In combining the scope of programmatic data sources with the depth and nuances provided by human sources, we can provide the most accurate and refined dataset available.
Foursquare’s Places Engine will fundamentally change how POIs are created, validated, and delivered to customers. The increase in human moderation will enable Foursquare to provide updates at a speed that aligns to how quickly things change in the world. Clients will benefit from added confidence in their data because Places Engine penalizes or rewards each input based on the accuracy of its information, ultimately leading to a more accurate dataset to be used for important decision making, such as trade area analysis, risk assessment, and more.
As always, Foursquare’s commitment to accuracy and reliability remains top of mind in everything we do. We are excited to see the Places Engine’s impact on the world around us, and we will continue to provide the most accurate and reliable location data for our clients.