These steps will provide a PostgreSQL database with spatial capabilities through PostGIS, ready for developing location-aware scripts. PostGIS provides hundreds of spatial functions that can be used for complex geometry analysis directly within your database.
# macOS
brew install postgis
# Ubuntu/Debian
sudo apt install postgis
psql -d
createdb census_spatial;
psql -d census_spatial
CREATE EXTENSION postgis;
CREATE EXTENSION postgis_topology;
psql -d census_spatial
CREATE TABLE county_boundaries (
geoid VARCHAR(5) PRIMARY KEY,
name VARCHAR(100),
state_fips CHAR(2),
county_fips CHAR(3),
geom geometry(MultiPolygon, 4269)
);
wget https://www2.census.gov/geo/tiger/TIGER2023/COUNTY/tl_2023_us_county.zip
unzip tl_2023_us_county.zip
shp2pgsql -s 4269 -I tl_2023_us_county.shp county_boundaries | psql -d census_spatial
using Pkg
Pkg.add(["LibPQ", "DataFrames", "GeoInterface", "CairoMakie", "GeoMakie"])
using LibPQ, DataFrames
conn = LibPQ.Connection("dbname=census_spatial")
# Query example
counties = execute(conn, """
SELECT geoid, name, ST_AsGeoJSON(geom) as geometry
FROM county_boundaries;
""") |> DataFrame
Here’s a basic example of creating a choropleth map using the data:
using Plots
using JSON
# Assuming you have a DataFrame 'data' with GEOID and values to plot
function create_choropleth(data, geom_df)
# Merge geometry with data
merged = leftjoin(geom_df, data, on=:geoid)
# Parse GeoJSON and create plot
plot()
for row in eachrow(merged)
geom = JSON.read(row.geometry)
coords = geom["coordinates"][1][1] # Assuming simple polygons
x = [p[1] for p in coords]
y = [p[2] for p in coords]
plot!(x, y, fill_z = row.value, legend = false)
end
current()
end
Add spatial indices for better query performance:
psql -u **TK**
CREATE INDEX countyboundaries_geom_idx
ON county_boundaries USING GIST (geom);
Consider adding additional indices based on your query patterns:
psql -u **TK**
CREATE INDEX countyboundaries_geoid_idx
ON county_boundaries (geoid);
Regular maintenance tasks to keep the database performing well:
psql -d tiger
-- Analyze table statistics
ANALYZE county_boundaries;
-- Vacuum to reclaim space and update statistics
VACUUM ANALYZE county_boundaries;
Useful spatial queries for analysis:
-- Find adjacent counties
SELECT b.geoid, b.name
FROM county_boundaries a
JOIN county_boundaries b
ON ST_Touches(a.geom, b.geom)
WHERE a.geoid = '06037'; -- Los Angeles County
-- Calculate county areas
SELECT geoid, name,
ST_Area(ST_Transform(geom, 3857))/1000000 as area_km2
FROM county_boundaries;
-- Simplify geometries for faster rendering
SELECT geoid, name,
ST_SimplifyPreserveTopology(geom, 0.01) as geom_simplified
FROM county_boundaries;
# On Ubuntu
sudo apt install proj-bin
Chapter 23 used the county-level shapefile and 2024 election data to find how small counties voted to produce a map showing the division among Democratic, Republican and third-party/blank votes/no shows. Not only is the PostGIS method simpler, it is almost 100 times faster than using pure Julia: 3 seconds compared to almost five minutes on a Mac Studio M3 Ultra with 96GB RAM. The difference will be much greater on less well-provisioned workstations.