Geocoding with UDFs
In this tutorial you will learn how to add geocoding to your data using Exasol UDFs.
This tutorial uses database objects in the free public demo system hosted by Exasol. To get access to the public demo system, sign up here.
Introduction
The examples in this tutorial make use of GeoPy, a popular Python client for geocoding web services. The public demo system contains a pre-built GeoPy distribution file in the following BucketFS location:
/buckets/bucketfs1/showcase/geopy-VERSION-py3-none-any.whl
In Exasol UDFs, a sys.path.extend
call makes use of corresponding Python modules. This method only works for pure
Python modules that do not contain extensions or code in other programming languages. If this does not work in your database, use the method described in Adding New Packages to Existing Script Languages.
Geocoding example
The following example shows how to determine the latitude and longitude for a given place using a UDF script.
-
Open your schema in the demo database.
-
Copy the following script into your client and execute it to create the UDF.
--/
CREATE or REPLACE PYTHON3 SET SCRIPT geopy_geocode_long_lat(placename varchar(200)) EMITS (place varchar(20000), latitude double, longitude double) AS
import glob
sys.path.extend(glob.glob('/buckets/bucketfs1/showcase/*'))
import geopy.geocoders
from geopy.geocoders import Nominatim
geopy.geocoders.options.default_user_agent = "my-application"
# IMPORTANT: "my-application" must be changed to the name of your application
geolocator = Nominatim()
def run(ctx):
while True:
try:
geo = geolocator.geocode(ctx.placename, timeout=None)
place=ctx.placename
lat=geo.latitude
lo=geo.longitude
ctx.emit(place, lat, lo)
except:
ctx.emit(place + " (Unavailable)",None,None)
if not ctx.next(): break
/ -
Test the UDF against a specific address using the
dual
table in the demo database: -
Create a new table in your schema to run the UDF against a table with location data. For example:
-
Execute the UDF against the data set.
SELECT DISTINCT * from(
SELECT
geopy_geocode_long_lat(tz.BOROUGH)
FROM
"TEST_BOROUGH" tz
ORDER BY
1 ASC
)
;Result:
* PLACE LATITUDE LONGITUDE 1 Bronx 40.85045845 -73.8404035580209 2 Brooklyn 40.6501038 -73.9495823 3 Manhattan 40.7900869 -73.9598295 4 Queens 40.6524927 -73.7914214158161 5 Staten Island 40.5834557 -74.1496048
Reverse geocoding
The following examples show how to do reverse geocoding using latitude and longitude.
-
Open your schema in the demo database.
-
Copy the following script into your client and execute it to create the UDF.
--/
CREATE OR REPLACE PYTHON3 SET SCRIPT geopy_reverse_geocode(lat DOUBLE, lo DOUBLE) EMITS (latitude double, longitude double, address VARCHAR(20000)) AS
import glob
sys.path.extend(glob.glob('/buckets/bucketfs1/showcase/*'))
import geopy.geocoders
from geopy.geocoders import Nominatim
geopy.geocoders.options.default_user_agent = "my-application"
# IMPORTANT: "my-application" must be changed to the name of your application
geolocator = Nominatim()
def run(ctx):
while True:
geo = geolocator.reverse([ctx.lat, ctx.lo])
lat=ctx.lat
lo=ctx.lo
address=geo.address
ctx.emit(lat, lo, address)
if not ctx.next(): break
/ -
Test the UDF against specific coordinates using the
dual
table in the demo database: -
Create a new table in your schema to run the UDF against a table with location data. For example:
-
Execute the UDF against the data set.
Result:
* LATITUDE LONGITUDE ADDRESS 1 40.717982000000006 -73.99393 101, Chrystie Street, Lower East Side, Manhattan Community Board 3, Manhattan, New York County, City of New York, New York, 10002, United States 2 40.705454 -74.013501 Evacuation Day Plaza, Whitehall, Manhattan, New York County, City of New York, New York, 10275, United States 3 40.760014000000005 -73.988322 Riu Plaza Hotel, 305, West 46th Street, Hell's Kitchen, Manhattan Community Board 4, Manhattan, New York County, City of New York, New York, 10036, United States 4 40.760525 -73.991272 645, 9th Avenue, Hell's Kitchen, Manhattan Community Board 4, Manhattan, New York County, City of New York, New York, 10036, United States 5 40.743637 -74.003475 130, 9th Avenue, Manhattan Community Board 4, Manhattan, New York County, City of New York, New York, 10011, United States 6 40.714915000000005 -74.009779 75, West Broadway, Tribeca, Manhattan, New York County, City of New York, New York, 10007, United States 7 40.64464 -73.782466 John F. Kennedy International Airport, JFK Access Road, Queens, Queens County, City of New York, New York, 11430, United States 8 40.733792 -73.98675 Hattan Cleaners, 141, 3rd Avenue, Gramercy, Manhattan Community Board 6, Manhattan, New York County, City of New York, New York, 10003, United States 9 40.764456 -73.972851 Harry Cipriani, 781, 5th Avenue, Manhattan Community Board 8, Manhattan, New York County, City of New York, New York, 10022, United States 10 40.758942000000005 -73.974343 34, East 52nd Street, Midtown East, Manhattan Community Board 5, Manhattan, New York County, City of New York, New York, 10022, United States