A little while ago Marcus Blake from the Australian Bureau of Statistics asked the OSM community about the potential use of some ABS data. As I mentioned on the list, I think it is good that at least some government departments are making their data available under free licenses and that they engage with the community to sort out any technical details about the data.
As described by the ABS, the ASGS is essentially data describing geographical areas.
Working out which structures, if any, should be incorporated into OSM, and how, needs careful consideration, and I’ve posted some of my thoughts to the list. In the meantime, since the data does contain some landuse information, I’ve been looking into how best to use this information to aid in mapping. A blind import is not an option in my opinion, but I thought it would be handy to see the data as a base map when mapping.
I did try using ogr2osm to convert the data to the OSM xml format to load into JOSM (I even got the translateAttributes function for ogr2osm working for this dataset), but due to the nature of the data, I think a simple raster underlay works well. I tried two approaches in parallel.
- shp -> osm (using ogr2osm) -> postgres (using osm2pgsql) -> raster tiles (using mapnik).
- Using GeoServer to serve a WMS which can be loaded into JOSM.
Option 2 seemed to require less set up time. I simply used the GUI to load the shapefile and apply a style, then loaded the WMS into JOSM.
One caveat: if you want to load a WMS service from GeoServer into JOSM, I found the URL should look something like:
Slightly unrelated, but if you are using tomcat or jetty locally but only occasionally (like I do), I find it is best to use sysv-rc-conf (e.g. sudo sysv-rc-conf tomcat6 off) to stop the tomcat or jetty daemon from starting at boot, whilst still allowing you to start it (sudo service tomcat6 start) when you need it.
If anyone is interested in getting such data in JOSM and would like more details, just let me know.
I should have made a post about this a while ago, but I didn’t want a half complete post, and the scope of my project kept expanding!
Part 1: Scraping
I found two huge repositories of old digitised maps of Australia, many of which are in the public domain: the National Library of Australia, and Parish Maps from the Department of Lands NSW. Unfortunately they didn’t really have a nice documented RESTful API for the use of the images and metadata. My first step was to extract as much information as I could and convert it into an intermediate format. Most of my code and documentation for doing this is at https://github.com/andrewharvey/govscrape in those two respective folders. Unfortunately it’s not as easy as running one command from my repo to download and parse all the data. My goal was to get the data to my machine, not write a robust system that anyone could run to get a clone of the nla and pmap repositories.
Part 2: Georeferencing
It would be great if I could push out an easy to use API for the data I collected from the scrape stage, but I don’t have the resources (let me know if you are willing to help out with server resources to host these old public domain maps). Even without a nice interface to the data, I could still play around with it and see what use I could make of it. I dabbled in using these maps as a source of data for OpenStreetMap. I only got through a few of the maps before putting this on hold, as I figured it would be easier (especially for others) to do this if they were georeferenced. I tried out both http://warper.geothings.net/ and QuantumGIS, but both lagged way too much. So I rolled out my own solution, which was just a bunch of scripts that used Inkscape and a hacked libchamplain demo as the GUI. The code and documentation for this is at https://github.com/andrewharvey/georeferencing-scripts.
The georeferencing data that I have made so far (it’s a big task!) is at https://github.com/andrewharvey/georeferencing-data.
Part 3: Sharing
From the data and code from the last step, I’m able to push out these old maps in several formats. I used gdalwarp to convert the maps into Transverse Mercator (well actually I don’t really know what they are, but this seems to work). From here I can use gdal2tiles.py to push out an OSM slippy map like tile directory, I can push out a KML GroundOverlay, or you could probably use a WMS server to push it out through WMS. I really wanted to leave it open. (I finally understand the difference between OSM slippy map tilenames and the OSGeo TMS. Take note that gdal2tiles.py produces TMS format tiles, which differ from OSM style as they have the y axis going bottom to top, see http://groups.google.com/group/maptiler/browse_thread/thread/aa89fc726b8f7261/8bdc39d7829cc80c.)
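Since the only difference that matters here is the direction of the y axis, converting a tile row between the two schemes is a one-liner. Here is a minimal Python sketch, assuming the usual global tile pyramid with 2^zoom rows at each zoom level; the function name `flip_tile_y` is just something I made up for illustration, it is not part of gdal2tiles.py.

```python
def flip_tile_y(zoom: int, y: int) -> int:
    """Convert a tile row between TMS and OSM slippy map numbering.

    TMS counts rows from the bottom of the map, OSM counts them from
    the top. Assumes 2**zoom rows at the given zoom level. The same
    function works in both directions.
    """
    rows = 2 ** zoom
    if not 0 <= y < rows:
        raise ValueError(f"row {y} is outside 0..{rows - 1} at zoom {zoom}")
    return rows - 1 - y


if __name__ == "__main__":
    # Hypothetical tile, only to show the call.
    print(flip_tile_y(3, 2))
```

With this, a TMS tile stored at z/x/y.png would be served to an OSM style client as z/x/flip_tile_y(z, y).png, and the x and z parts stay the same.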
I would post a Google Earth one too, but it’s too much effort to get a free background in there for the screenshot. I’m not convinced that this display of the data is user friendly. Having control of the transparency of the overlay is a must. Maybe one day, someone will crop out all the non-map parts of the parish maps so we can get a single whole of NSW parish map slippy map.
I suppose now I need to focus on the infrastructure. It should be really easy for a user to browse the available maps and view them either as a KML or as an OpenLayers overlay. I should also plug this into the metadata I scraped and have stored in CSV-like files.
The problem I have with distribution right now is that many of the maps need warping, and that means I need to host the warped image somewhere. Some could probably be georeferenced from their source image using just translate, scale and rotate, and hence should be able to use the source image from the government server to serve the georeferenced imagery. But the workflow I’ve set up so far relies on using gdalwarp, and hence on having access to the warped image.
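To illustrate the translate, scale and rotate case: two control points are enough to pin down such a transform, so no warped copy of the image would be needed, only a handful of numbers. This is a rough Python sketch and not part of my georeferencing scripts. It assumes pixel y runs downwards and map y runs upwards, and that the map coordinates are in some projected (flat) system. The function names and the sample coordinates are made up.

```python
def fit_similarity(px1, map1, px2, map2):
    """Fit map = a * conj(pixel) + b from two control points.

    Points are (x, y) tuples treated as complex numbers. The complex
    factor a carries the rotation and uniform scale, b the translation.
    The conjugate flips the y axis, since pixel y runs downwards.
    """
    z1 = complex(*px1).conjugate()
    z2 = complex(*px2).conjugate()
    w1 = complex(*map1)
    w2 = complex(*map2)
    if z1 == z2:
        raise ValueError("the two pixel control points must differ")
    a = (w2 - w1) / (z2 - z1)
    b = w1 - a * z1
    return a, b


def pixel_to_map(a, b, px):
    """Apply the fitted transform to one pixel coordinate."""
    w = a * complex(*px).conjugate() + b
    return (w.real, w.imag)


if __name__ == "__main__":
    # Made-up control points: (pixel) -> (map easting, northing).
    a, b = fit_similarity((0, 0), (1000, 2000), (100, 0), (1100, 2000))
    print(pixel_to_map(a, b, (0, 50)))
```

A map that needs real warping (more than two control points that disagree with any single similarity transform) would still have to go through gdalwarp.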
A little while back I sent the RTA an email to try to clarify the copyright license of some of their data, such as this data feed, http://livetraffic.rta.nsw.gov.au/data/traffic-cam.json, so I could determine if I could use it. The link to the copyright license is broken. This is the response I got,
The RTA supports and encourages open access information, and is determined to grow its range of traffic resources to provide developers with access to live updates and traveller information feeds.
Request for licensing agreements are assessed against the following considerations:
- Consumer benefit
- Legal constraints
- Road safety
- Technical capacity
- Availability of data
If you would like to apply for a license agreement for the RTA’s Live Traffic content, please submit a proposal for your service and how you would like to use the RTA’s content, including the above considerations. Your proposal will then be assessed by the RTA.
What a load of garbage. If you truly “support and encourage open access information” then you would release these data feeds under a public domain like license. Saying “we will make a case by case decision once you tell us your proposal” is just going to hinder the innovative use of that data. People will create their own datasets independent of you, which of course could go either way. The people’s dataset could be either better or worse quality than yours, but either way you would save people some work if you helped out by releasing your data under a free and open license. You are a government department, not a company.
So a while back Geoscience Australia stuck this notice on their website,
Unless otherwise noted, all Geoscience Australia material on this website is licensed under the Creative Commons Attribution 3.0 Australia Licence.
Great, another source of free government data. So today, when I found a use for some data held on their site, I was very disappointed to see,
“Please note: Any organisation or individual wanting to use the Gazetteer data in a similar capacity to the Online Place Name Search or any other online application, will require an Internet user licence. (See the Licence Fees and Order Form below).”
So much for the CC-BY license. The latter notice is an “otherwise noted” which says this data is NOT CC-BY, because if you want to use it for such and such a purpose you need to get another license to allow you to do that.