Mathare Building import
From Map Kibera
(→Find collisions between OSM and the import candidate) |
(→Find collisions between OSM and the import candidate) |
||
Line 256: | Line 256: | ||
We can find the overlapping elements with this query: | We can find the overlapping elements with this query: | ||
mathare#= select g.gid, o.osm_id, from "groundtruthed-wgs84" as g, planet_osm_polygon as o where ST_Intersects(g.the_geom, o.way) ORDER BY g.gid; | mathare#= select g.gid, o.osm_id, from "groundtruthed-wgs84" as g, planet_osm_polygon as o where ST_Intersects(g.the_geom, o.way) ORDER BY g.gid; | ||
+ | |||
+ | This script removes the columns, which are empty: | ||
+ | #!/usr/bin/python | ||
+ | |||
+ | import psycopg2 | ||
+ | |||
+ | conn = psycopg2.connect("dbname=mathare") | ||
+ | cur= conn.cursor() | ||
+ | cur2 = conn.cursor() | ||
+ | cur3 = conn.cursor() | ||
+ | cur.execute("select attname from pg_attribute, pg_type where typname='planet_osm_polygon' and typrelid=attrelid ;") | ||
+ | for record in cur: | ||
+ | column = record[0] | ||
+ | print column | ||
+ | query = cur2.mogrify("SELECT count(*) from planet_osm_polygon WHERE \"" + column + "\" IS NOT NULL" ) | ||
+ | cur2.execute(query) | ||
+ | |||
+ | for c in cur2: | ||
+ | if c[0] == 0L: | ||
+ | query = "ALTER TABLE planet_osm_polygon DROP COLUMN \"" + column + "\" CASCADE;" | ||
+ | print query | ||
+ | cur3.execute(query) | ||
+ | print repr(cur3.statusmessage) | ||
+ | conn.commit() | ||
+ | cur.close() | ||
+ | cur2.close() | ||
+ | cur3.close() | ||
+ | conn.close() | ||
+ | |||
+ | |||
+ | It turns out only one of the pre existing buildings that overlap carries information that should be kept: | ||
+ | mathare=# SELECT DISTINCT m.shp_gid, o.osm_id, o.name, o.amenity, o.religion | ||
+ | FROM planet_osm_polygon AS o, mathare_building_import AS m | ||
+ | WHERE (o.name IS NOT NULL OR o.religion IS NOT NULL or o.amenity IS NOT NULL) | ||
+ | AND ST_Intersects(m.the_geom, o.way); | ||
+ | shp_gid | osm_id | name | amenity | religion | ||
+ | ---------+----------+--------------------------------+------------------+----------- | ||
+ | 517 | 96051625 | St. John's & St. Paul's Church | place_of_worship | christian | ||
+ | (1 row) | ||
== Buildings on top of other == | == Buildings on top of other == | ||
== Generate the OSM data to be imported == | == Generate the OSM data to be imported == |
Revision as of 06:34, 17 May 2011
In this page, I (Sébastien) will describe how to import the data of the building extraction of Mathare that we've got from AAAS. See Primož's blog for the initial analysis.
Changing the SRS
The data was made available as shapefiles in WGS84 Web Mercator Auxiliary Sphere projection, which is known as Web Mercator, spherical Mercator, Google Mercator and is referenced by various ids: EPSG:900913, EPSG:3785 (obsolete), EPSG:3857, ESRI WKID 102100... In QGIS, it's known as Google Mercator, EPSG:900913, ID 3644.
Before working with the data, we need to change the Spatial Reference System (SRS) to be compatible with OSM, that is moving to non-projected WGS84.
In QGIS: menu Vector -> Data Management Tools -> Export to new projection. Then choose the output CRS to be WGS84, EPSG:4326, ID 3452.
The two files that we're going to import are GroundTruthed-wgs84.shp and BuildingsOnTop-wgs84.shp
Setup PostGIS database
PostGIS is relatively easy to install on Ubuntu. I might describe it further later. Anyway, the current environment is PostgreSQL 8.4, PostGIS 1.5.
I found this script from GeoDjango very convenient to create a postgis db template:
$ ./create_template_postgis-debian.sh $ createdb -T template_postgis mathare
Import the shapefiles to the db
Each shapefile will be imported a separate table with shp2pgsql:
shp2pgsql -s 4326 -c -g the_geom -I -S -W UTF-8 -N skip Groundtruthed-wgs84.shp | psql mathare shp2pgsql -s 4326 -c -g the_geom -I -S -W UTF-8 -N skip BuildingsOnTop-wgs84.shp | psql mathare
Import OSM data to the db
In this case, we're only interested in buildings from OSM.
Get fresh data from OSM, keep only ways tagged with 'building' and their associated ways:
osmosis \ --read-api left=36.8425655 right=36.881876 top=-1.2480512 bottom=-1.2696752999999998 outPipe.0=1 \ --way-key keyList=building inPipe.0=1 outPipe.0=2 \ --used-node inPipe.0=2 outPipe.0=3 \ --write-xml file=mathare_building.osm inPipe.0=3
Import it to the database:
osm2pgsql -c -d mathare -l -s mathare_building.osm
The following tables will be created:
- planet_osm_line
- planet_osm_nodes
- planet_osm_point
- planet_osm_polygon
- planet_osm_rels
- planet_osm_roads
- planet_osm_ways
Tagging scheme
Every geometry imported will be tagged with:
source=AAAS_satellite_extraction_for_Mathare_2011
Shapefile column | OSM tags |
---|---|
id | ignored, since not used in shapefiles |
designated | see table below |
type=* | building:type=* |
name=* | name=* |
Shapefile Designated=* | OSM tags |
---|---|
Toilet | building=yes amenity=toilet |
NGO/CBO | building=office office=ngo |
Administration | building=office office=government |
CBO | building=office office=ngo |
Self-Help Group | building=yes note=Self-Help Group fixme=Find an appriopriate tag for "Self help group" |
Medical Clinic | building=yes amenity=hospital health_facility:type=medical_clinic |
Business | building=commercial |
Petrol Station | building=yes amenity=fuel motor_vehicle=yes |
House | building=residential |
Apartment Building | building=apartments |
Hardware Shop | building=commercial shop=hardware |
Church | building=church amenity=place_of_worship religion=christian |
Bar | building=yes amenity=bar |
Bathroom | building=yes amenity=shower |
Private Toilet | building=yes amenity=toilet access=private fixme=mathare preset scheme |
Garage | building=garage |
Public Toilet | building=yes amenity=toilet access=public |
Market Place |
building=yes |
Cyber Cafe | building=yes amenity=cyber_cafe |
Keys:
- access
- amenity
- building
- building:type
- fixme
- health_facility:type
- motor_vehicle
- name
- note
- office
- religion
- shop
- source
Preparing the database for the new data
DROP TABLE IF EXISTS mathare_building_import; CREATE TABLE mathare_building_import ( shp_gid integer, osm_id integer, layer integer, "access" text, "amenity" text, "building" text, "building:type" text, "fixme" text, "health_facility:type" text, "motor_vehicle" text, "name" text, "note" text, "office" text, "religion" text, "shop" text, "source" text, the_geom geometry, CONSTRAINT enforce_dims_the_geom CHECK (st_ndims(the_geom) = 2), CONSTRAINT enforce_geotype_the_geom CHECK (geometrytype(the_geom) = 'POLYGON'::text OR the_geom IS NULL), CONSTRAINT enforce_srid_the_geom CHECK (st_srid(the_geom) = 4326) ) WITH ( OIDS=FALSE ); CREATE INDEX build_the_geom_gist ON mathare_building_import USING gist(the_geom);
Processing the data
INSERT INTO mathare_building_import ( shp_gid, "building", "building:type", "access", "amenity", "fixme", "health_facility:type", "motor_vehicle", "name", "note", "office", "religion", "shop", "source", the_geom ) SELECT g.gid, ( CASE WHEN g.designated IN ('NGO/CBO', 'Administration', 'CBO') THEN 'office' WHEN g.designated IN ('Business', 'Hardware Shop' ) THEN 'commercial' WHEN g.designated = 'House' THEN 'residential' WHEN g.designated = 'Apartment Building' THEN 'apartments' WHEN g.designated = 'Church' THEN 'church' WHEN g.designated = 'Garage' THEN 'garage' ELSE 'yes' END ) as building, g.type as building_type, ( CASE WHEN g.designated = 'Public Toilet' THEN 'public' WHEN g.designated = 'Private Toilet' THEN 'private' ELSE NULL END ) as access, ( CASE WHEN g.designated IN ('Toilet', 'Private Toilet', 'Public Toilet') THEN 'toilet' WHEN g.designated = 'Petrol Station' THEN 'fuel' WHEN g.designated = 'Medical Clinic' THEN 'hospital' WHEN g.designated = 'Church' THEN 'place_of_worship' WHEN g.designated = 'Bar' THEN 'bar' WHEN g.designated = 'Bathroom' THEN 'shower' WHEN g.designated = 'Cyber Cafe' THEN 'cyber_cafe' ELSE NULL END ) as amenity, ( CASE WHEN g.designated = 'Self-Help Group' THEN 'Find an appriopriate tag for "Self help group"' WHEN g.designated IN ('Public Toilet', 'Private Toilet', 'Toilet', 'Bathroom') THEN 'MapMathare WATSAN-preset scheme missing' ELSE NULL END ) as fixme, ( CASE WHEN g.designated = 'Medical Clinic' THEN 'medical_clinic' ELSE NULL END ) as health, ( CASE WHEN g.designated = 'Petrol Station' THEN 'yes' ELSE NULL END ) as motor_vehicle, g.name as name, ( CASE WHEN g.designated = 'Self-Help Group' THEN 'Self-Help Group' ELSE NULL END ) as note, ( CASE WHEN g.designated = 'Administration' THEN 'government' WHEN g.designated in ( 'NGO/CBO', 'CBO') THEN 'ngo' ELSE NULL END ) as office, ( CASE WHEN g.designated = 'Church' THEN 'christian' ELSE NULL END ) as religion, ( CASE WHEN g.designated = 'Hardware Shop' THEN 'hardware' ELSE NULL END) as shop, 'AAAS building extraction from satellite imagery "MM128982" for Mathare 2011' as source, g.the_geom FROM groundtruthed84 as g;
Find collisions between OSM and the import candidate
Very few buildings have already been mapped in OSM in Mathare (139 as of 2011.05.09). Yet, we should be careful whilst importing the dataset from aerial imagery extraction. While we can discard the geometry of the existing buildings, we shall preserve their attributes.
Basically, we shall check for geometries that overlaps and apply the attributes of the OSM element to the imported one.
Note: I can see two tricky corner cases, which I don't need to worry about right now since this import does not feature them. However, overlooking these issues in another import will likely result in a loss of data:
- nodes of the buildings bear attributes
- multipolygons
We can find the overlapping elements with this query:
mathare#= select g.gid, o.osm_id, from "groundtruthed-wgs84" as g, planet_osm_polygon as o where ST_Intersects(g.the_geom, o.way) ORDER BY g.gid;
This script removes the columns, which are empty:
#!/usr/bin/python import psycopg2 conn = psycopg2.connect("dbname=mathare") cur= conn.cursor() cur2 = conn.cursor() cur3 = conn.cursor() cur.execute("select attname from pg_attribute, pg_type where typname='planet_osm_polygon' and typrelid=attrelid ;") for record in cur: column = record[0] print column query = cur2.mogrify("SELECT count(*) from planet_osm_polygon WHERE \"" + column + "\" IS NOT NULL" ) cur2.execute(query) for c in cur2: if c[0] == 0L: query = "ALTER TABLE planet_osm_polygon DROP COLUMN \"" + column + "\" CASCADE;" print query cur3.execute(query) print repr(cur3.statusmessage) conn.commit() cur.close() cur2.close() cur3.close() conn.close()
It turns out only one of the pre existing buildings that overlap carries information that should be kept:
mathare=# SELECT DISTINCT m.shp_gid, o.osm_id, o.name, o.amenity, o.religion FROM planet_osm_polygon AS o, mathare_building_import AS m WHERE (o.name IS NOT NULL OR o.religion IS NOT NULL or o.amenity IS NOT NULL) AND ST_Intersects(m.the_geom, o.way); shp_gid | osm_id | name | amenity | religion ---------+----------+--------------------------------+------------------+----------- 517 | 96051625 | St. John's & St. Paul's Church | place_of_worship | christian (1 row)