Spatial Join

A spatial join transfers attributes from one dataset to another based on the spatial relationship between their geometries. To perform a spatial join, specify the geometry column in each dataset and select the type of join operation. When the operation completes, you receive a new joined dataset.

Perform a Spatial Join

On a dataset with a geometry column, click ⋮ More Options >> Spatial Join to open the spatial join configuration panel.

The Spatial Join button.

Spatial Join Configuration

The Spatial Join panel contains three sections:

  • Target Dataset - configure the dataset you will join against.
  • Join Operation - select a supported join operation between the source and target dataset.
  • Join Dataset - select the dataset to join, its columns, and the aggregation rules that determine how values are combined when multiple rows join to the same target row. Supported rules include Count, Sum, Mean, Max, Min, Deviation, Variance, Median, and percentiles; see Aggregation Rules for the full list.

Target Dataset

In the Target Dataset section, select a dataset containing a geometry column, select the geometry column, then select/deselect columns to include in the join operation.

Join Dataset

In the Join Dataset section, select a dataset to join, select a geometry column in that dataset, then specify aggregation rules.

Join Operation

Supported spatial join operations in Foursquare Studio:

| Join Operation | Description |
| --- | --- |
| Intersects | True when geometries share any portion of space. |
| Touches | True when geometries have at least one point in common, but their interiors do not intersect. |
| Within | True when geometry A is completely inside geometry B. |
| Overlaps | True when geometries share space but are not completely contained by each other. |
| Equals | True when geometries represent the same geometry. |
| Crosses | True when geometries have some, but not all, interior points in common. |
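To make the definitions above concrete for the simplest case, a point against a polygon, here is a minimal sketch in plain Python. It is not how Foursquare Studio evaluates these predicates; the helper names (`locate`, `intersects`, `touches`, `within`) are hypothetical, the polygon is assumed to be a single ring without holes, and exact comparisons are used, so it is only reliable for simple coordinates such as the integers shown.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Polygon = List[Point]  # ring vertices in order, first vertex not repeated


def on_segment(p: Point, a: Point, b: Point) -> bool:
    """True when p lies on the closed segment a-b (exact arithmetic assumed)."""
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    if cross != 0:
        return False
    return (min(a[0], b[0]) <= p[0] <= max(a[0], b[0])
            and min(a[1], b[1]) <= p[1] <= max(a[1], b[1]))


def locate(p: Point, poly: Polygon) -> str:
    """Classify p as 'interior', 'boundary', or 'exterior' relative to poly."""
    inside = False
    n = len(poly)
    for i in range(n):
        a, b = poly[i], poly[(i + 1) % n]
        if on_segment(p, a, b):
            return "boundary"
        # Ray casting: toggle when a ray going right from p crosses edge a-b.
        if (a[1] > p[1]) != (b[1] > p[1]):
            x_cross = a[0] + (p[1] - a[1]) * (b[0] - a[0]) / (b[1] - a[1])
            if x_cross > p[0]:
                inside = not inside
    return "interior" if inside else "exterior"


def intersects(p: Point, poly: Polygon) -> bool:
    # The point shares any portion of space with the polygon.
    return locate(p, poly) != "exterior"


def touches(p: Point, poly: Polygon) -> bool:
    # The point is on the boundary only, so the interiors do not intersect.
    return locate(p, poly) == "boundary"


def within(p: Point, poly: Polygon) -> bool:
    # The point is completely inside the polygon.
    return locate(p, poly) == "interior"


square = [(0, 0), (4, 0), (4, 4), (0, 4)]
for point in [(2, 2), (4, 2), (5, 5)]:
    print(point, intersects(point, square), touches(point, square), within(point, square))
```

In this sketch a point on the polygon edge satisfies Intersects and Touches but not Within, which is why Intersects is the more inclusive choice when joining points to polygons.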

Based on the geometry types you selected in both datasets, Foursquare Studio suggests an appropriate join type. However, you may select any combination of supported operations on the geometries of the datasets provided.

Click the Join Type dropdown to select a join type. For each combination of target and join geometry types, the table below gives the number of operations marked ✔️ out of the six (Intersects, Touches, Within, Overlaps, Equals, Crosses):

| Target | Join with | Operations marked ✔️ |
| --- | --- | --- |
| Point | Point | 3 |
| Point | MultiPoint | 2 |
| Point | Line/MultiLine | 3 |
| Point | Polygon/MultiPolygon | 3 |
| MultiPoint | Point | 2 |
| MultiPoint | MultiPoint | 4 |
| MultiPoint | Line/MultiLine | 4 |
| MultiPoint | Polygon/MultiPolygon | 4 |
| Line/MultiLine | Point | 2 |
| Line/MultiLine | MultiPoint | 3 |
| Line/MultiLine | Line/MultiLine | 6 |
| Line/MultiLine | Polygon/MultiPolygon | 4 |
| Polygon/MultiPolygon | Point | 2 |
| Polygon/MultiPolygon | MultiPoint | 3 |
| Polygon/MultiPolygon | Line/MultiLine | 3 |
| Polygon/MultiPolygon | Polygon/MultiPolygon | 5 |

Aggregation Rules

When running any join operation, you must select an aggregation method for each selected column. Available methods depend on the data type of the column.

Aggregation options in Foursquare Studio.

The following aggregation methods are currently available in Studio:

| Aggregation Method | Column Type | Description |
| --- | --- | --- |
| Count | string, datetime, number | Counts the number of non-null values in the specified column. |
| Mode | string, datetime, number | Finds the most frequently occurring value(s) in the specified column. |
| Max | datetime, number | Returns the maximum value from the specified column. |
| Min | datetime, number | Returns the minimum value from the specified column. |
| Unique | string, number | Returns the count of distinct values in the specified column. |
| Sum | number | Computes the sum of all numeric values in the specified column. |
| Mean | number | Calculates the average (mean) value of all numeric values in the specified column. |
| Median | number | Returns the middle value in a sorted list of numeric values in the specified column. |
| Variance | number | Measures the variability (spread) of numeric values in the specified column from their mean. |
| Deviation | number | Calculates the standard deviation of numeric values in the specified column. |
| P05 | number | Returns the value below which 5% of the data falls in the specified column. |
| P25 | number | Returns the value below which 25% of the data falls in the specified column. |
| P50 | number | Returns the value below which 50% of the data falls (same as the median) in the specified column. |
| P75 | number | Returns the value below which 75% of the data falls in the specified column. |
| P95 | number | Returns the value below which 95% of the data falls in the specified column. |
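As a rough illustration of what these methods compute for the values joined to a single target row, the following sketch uses Python's standard `statistics` module. The `aggregate` function is hypothetical and is not Studio's implementation. Two choices in particular are assumptions, because the documentation does not say which definitions Studio uses: Variance and Deviation are computed as the sample (n - 1) statistics, and percentiles use linear interpolation.

```python
import statistics
from typing import Any, Iterable, List


def aggregate(values: Iterable[Any], method: str) -> Any:
    """Apply one aggregation method to the values joined to one target row."""
    vals: List[Any] = [v for v in values if v is not None]  # nulls are ignored
    if method.startswith("P") and method[1:].isdigit():
        # Percentile via linear interpolation (assumed definition);
        # statistics.quantiles needs at least two values.
        cuts = statistics.quantiles(vals, n=100, method="inclusive")
        return cuts[int(method[1:]) - 1]
    methods = {
        "Count": len,
        "Mode": statistics.multimode,          # list of the most frequent value(s)
        "Max": max,
        "Min": min,
        "Unique": lambda v: len(set(v)),
        "Sum": sum,
        "Mean": statistics.mean,
        "Median": statistics.median,
        "Variance": statistics.variance,       # sample variance (assumption)
        "Deviation": statistics.stdev,         # sample standard deviation (assumption)
    }
    return methods[method](vals)


fines = [10, 20, None, 30, 40, 50]  # hypothetical values joined to one polygon
for name in ["Count", "Sum", "Mean", "Median", "P25", "P75"]:
    print(name, aggregate(fines, name))
```

Swapping `statistics.variance` and `statistics.stdev` for `statistics.pvariance` and `statistics.pstdev` gives the population versions if those match your results better.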

Spatial Join Examples

Spatial join is one of the most popular geoprocessing tools in GIS. The following examples show practical uses of spatial join in Foursquare Studio:

Example 1. Aggregate the parking infractions data for 157 neighborhoods in Philadelphia.

The parking infractions data include millions of points, each representing the location of a parking violation ticket. Using a spatial join, we can aggregate the points by neighborhood based on their geographic locations.

In this example, we compute the total dollar amount fined for parking violations in each neighborhood in Philadelphia:

  1. Select the Spatial Join option from the dataset "Neighborhoods_Philadelphia.json".

  2. In the Spatial Join panel, select the Join Dataset: "Parking Violations.csv". Foursquare Studio automatically detects the geometry type "Point" and the geometry columns "lat" and "lon".

  3. Ensure the Join Operation is Intersects, which means every point that lies within or touches a polygon is joined to that polygon.

  4. Click the field selector of the fine column under the Attribute Columns(1) label, and choose Sum as the aggregation rule in the pop-up menu. This computes the sum of the fine values of all points joined with each neighborhood.

Spatially joining the parking infractions data with 157 neighborhoods in Philadelphia.

The result of the spatial join is a new dataset with a column COUNT issue_date, which gives the number of data points (tickets) in each polygon (neighborhood), and a column SUM fine, which gives the total fine amount from all data points in each neighborhood. We can then create a thematic map using the new column SUM fine to show the spatial distribution of the infractions data at the neighborhood level (see Figure 3).
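The logic of this example can be sketched in plain Python. This is only an illustration of a point-in-polygon join with Sum and Count aggregation, not Studio's implementation. The function names, the two square "neighborhoods", and the four tickets are all hypothetical stand-ins for the real datasets, and the polygons are assumed to be single rings without holes.

```python
from typing import Dict, List, Tuple

Ring = List[Tuple[float, float]]  # (lon, lat) vertices, first vertex not repeated


def intersects(lon: float, lat: float, ring: Ring) -> bool:
    """True when the point is inside the ring or on its boundary."""
    inside = False
    n = len(ring)
    for i in range(n):
        (x1, y1), (x2, y2) = ring[i], ring[(i + 1) % n]
        # Boundary check: collinear with the edge and inside its bounding box.
        cross = (x2 - x1) * (lat - y1) - (y2 - y1) * (lon - x1)
        if (cross == 0 and min(x1, x2) <= lon <= max(x1, x2)
                and min(y1, y2) <= lat <= max(y1, y2)):
            return True
        # Ray casting: toggle on each edge crossed by a ray going right.
        if (y1 > lat) != (y2 > lat):
            if x1 + (lat - y1) * (x2 - x1) / (y2 - y1) > lon:
                inside = not inside
    return inside


def join_tickets(neighborhoods: Dict[str, Ring], tickets: List[dict]) -> Dict[str, dict]:
    """Join tickets to each neighborhood and aggregate them."""
    result = {}
    for name, ring in neighborhoods.items():
        joined = [t for t in tickets if intersects(t["lon"], t["lat"], ring)]
        result[name] = {
            "COUNT issue_date": sum(1 for t in joined if t.get("issue_date") is not None),
            "SUM fine": sum(t["fine"] for t in joined),
        }
    return result


# Hypothetical data: two adjacent square neighborhoods and four tickets.
neighborhoods = {
    "A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "B": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
tickets = [
    {"lon": 1, "lat": 1, "fine": 50, "issue_date": "2023-01-01"},
    {"lon": 3, "lat": 1, "fine": 30, "issue_date": "2023-01-02"},
    {"lon": 2, "lat": 1, "fine": 20, "issue_date": "2023-01-03"},  # on the shared edge
    {"lon": 9, "lat": 9, "fine": 99, "issue_date": "2023-01-04"},  # outside both
]
print(join_tickets(neighborhoods, tickets))
```

Note that in this sketch a ticket on the shared edge is counted in both neighborhoods, because Intersects includes points that touch a polygon boundary.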



The result of spatially joining the parking infractions data with 157 neighborhoods in Philadelphia.

Example 2. Spatially join building footprints with 25 zip code areas in San Francisco.

In this example, we spatially join 25 zip code areas with 177,023 building footprints in San Francisco to compute the number of buildings and the total area they occupy in each zip code area.

  1. Select the Spatial Join option from the dataset "sfzip.json".
  2. In the Spatial Join panel, select the Join Dataset: "sf.buildings.json". Foursquare Studio automatically detects the geometry type "Polygon" and the geometry column "_geojson".
  3. Ensure the Join Operation is Intersects, which means every polygon (building) that intersects a zip code area is joined to that area.
  4. Click the field selector of the mblr column (San Francisco property key) under the Attribute Columns(2) label, and choose Count as the aggregation rule in the pop-up menu. This counts how many building footprints intersect each zip code area.
  5. Click the field selector of the area column (the area of the building in square meters) under the Attribute Columns(2) label, and choose Sum as the aggregation rule in the pop-up menu. This computes the total area of the building footprints that intersect each zip code area.

Spatially joining the building footprints with zip code areas in San Francisco.

The result of the spatial join is a new dataset with a column COUNT mblr, which gives the number of building footprints in each zip code area, and a column SUM area, which gives the total area of the building footprints in each zip code area. We can then create a thematic map using the new column SUM area to show the spatial distribution of the total area of building footprints at the zip code level (see Figure 5).
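A polygon-to-polygon join with Intersects can be sketched the same way. The code below is a simplified illustration, not Studio's implementation: it treats each polygon as a single ring without holes, uses exact comparisons, and relies on hypothetical helper names and made-up data (two square "zip code areas" and three small "buildings" with invented area values).

```python
from typing import Dict, List, Tuple

Pt = Tuple[float, float]
Ring = List[Pt]  # vertices in order, first vertex not repeated


def orient(a: Pt, b: Pt, c: Pt) -> int:
    """Sign of the turn a -> b -> c: 1, -1, or 0 when collinear."""
    v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (v > 0) - (v < 0)


def in_box(a: Pt, b: Pt, p: Pt) -> bool:
    return (min(a[0], b[0]) <= p[0] <= max(a[0], b[0])
            and min(a[1], b[1]) <= p[1] <= max(a[1], b[1]))


def segments_intersect(p1: Pt, p2: Pt, q1: Pt, q2: Pt) -> bool:
    o1, o2 = orient(p1, p2, q1), orient(p1, p2, q2)
    o3, o4 = orient(q1, q2, p1), orient(q1, q2, p2)
    if o1 != o2 and o3 != o4:
        return True
    # Collinear endpoints that lie on the other segment also count.
    return ((o1 == 0 and in_box(p1, p2, q1)) or (o2 == 0 and in_box(p1, p2, q2))
            or (o3 == 0 and in_box(q1, q2, p1)) or (o4 == 0 and in_box(q1, q2, p2)))


def contains_point(ring: Ring, p: Pt) -> bool:
    """Ray casting; boundary cases are caught by the edge test beforehand."""
    inside = False
    for i in range(len(ring)):
        a, b = ring[i], ring[(i + 1) % len(ring)]
        if (a[1] > p[1]) != (b[1] > p[1]):
            if a[0] + (p[1] - a[1]) * (b[0] - a[0]) / (b[1] - a[1]) > p[0]:
                inside = not inside
    return inside


def polygons_intersect(a: Ring, b: Ring) -> bool:
    edges_a = [(a[i], a[(i + 1) % len(a)]) for i in range(len(a))]
    edges_b = [(b[i], b[(i + 1) % len(b)]) for i in range(len(b))]
    if any(segments_intersect(p1, p2, q1, q2)
           for p1, p2 in edges_a for q1, q2 in edges_b):
        return True
    # No edges meet: one polygon may still lie entirely inside the other.
    return contains_point(a, b[0]) or contains_point(b, a[0])


def join_buildings(zips: Dict[str, Ring], buildings: List[dict]) -> Dict[str, dict]:
    out = {}
    for code, ring in zips.items():
        joined = [bld for bld in buildings if polygons_intersect(ring, bld["ring"])]
        out[code] = {
            "COUNT mblr": sum(1 for bld in joined if bld["mblr"] is not None),
            "SUM area": sum(bld["area"] for bld in joined),
        }
    return out


# Hypothetical data: two adjacent zip code squares and three buildings.
zips = {"Z1": [(0, 0), (10, 0), (10, 10), (0, 10)],
        "Z2": [(10, 0), (20, 0), (20, 10), (10, 10)]}
buildings = [
    {"mblr": "b1", "area": 100.0, "ring": [(1, 1), (3, 1), (3, 3), (1, 3)]},
    {"mblr": "b2", "area": 250.0, "ring": [(9, 4), (11, 4), (11, 6), (9, 6)]},  # straddles both
    {"mblr": "b3", "area": 80.0, "ring": [(15, 5), (17, 5), (17, 7), (15, 7)]},
]
print(join_buildings(zips, buildings))
```

In this sketch a building that straddles a zip code boundary is joined to both areas, and its full area value is added to each sum rather than only the portion inside each area.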

The result of spatially joining the building footprints with zip code areas in San Francisco.