Old Maps Online
I should have made a post about this a while ago, but I didn’t want a half-complete post, and the scope of my project kept expanding!
Part 1: Scraping
I found two huge repositories of old digitised maps of Australia, many of which are in the public domain: the National Library of Australia (NLA) and the Parish Maps from the Department of Lands NSW. Unfortunately, neither had a nicely documented RESTful API for using the images and metadata. My first step was to extract as much information as I could and convert it into an intermediate format. Most of my code and documentation for doing this is at https://github.com/andrewharvey/govscrape in the two respective folders. Unfortunately it’s not as easy as running one command from my repo to download and parse all the data. My goal was to get the data to my machine, not to write a robust system that anyone could run to get a clone of the nla and pmap repositories.
Part 2: Georeferencing
It would be great if I could push out an easy to use API for the data I collected from the scrape stage, but I don’t have the resources (let me know if you are willing to help out with server resources to host these old public domain maps). Even without a nice interface to the data, I could still play around with it and see what use I could make of it. I dabbled in using these maps as a source of data for OpenStreetMap. I only got through a few of the maps before I put this on hold, as I figured it would be easier (especially for others) to do this if they were georeferenced. I tried out both http://warper.geothings.net/ and QuantumGIS, but both lagged way too much. So I rolled my own solution, which was just a bunch of scripts that used Inkscape and a hacked libchamplain demo as the GUI. The code and documentation for this is at https://github.com/andrewharvey/georeferencing-scripts.
The georeferencing data that I have made so far (it’s a big task!) is at https://github.com/andrewharvey/georeferencing-data.
Part 3: Sharing
From the data and code from the last step, I’m able to push out these old maps in several formats. I used gdalwarp to convert the maps into Transverse Mercator (well, actually I don’t really know what they are, but this seems to work). From here I can use gdal2tiles.py to push out an OSM slippy map like tile directory, I can push out a KML GroundOverlay, or you could probably use a WMS server to push it out through WMS. (I finally understand the difference between OSM slippy map tile names and the OSGeo TMS: take note that gdal2tiles.py produces TMS format tiles, which differ from OSM style in that the y axis goes bottom to top; see http://groups.google.com/group/maptiler/browse_thread/thread/aa89fc726b8f7261/8bdc39d7829cc80c.) I really wanted to leave it open.
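The y axis difference between the two tile schemes is small enough to sketch in a few lines. This is only an illustration: the function name `tms_to_osm` is made up here and is not part of gdal2tiles.py or any other tool. Both schemes split zoom level z into 2^z rows, so converting a row number just means counting from the other edge.

```python
def tms_to_osm(z, x, y):
    """Convert a TMS tile address to the OSM slippy map equivalent.

    TMS counts rows from the bottom (south) edge, OSM counts them from the
    top (north) edge. Zoom and column are the same in both schemes.
    """
    rows = 2 ** z  # number of tile rows at this zoom level
    if not (0 <= x < rows and 0 <= y < rows):
        raise ValueError("tile address outside the grid for this zoom level")
    return z, x, rows - 1 - y


# The flip is its own inverse, so the same function converts OSM back to TMS.
osm_to_tms = tms_to_osm
```

For example, at zoom 2 there are four rows, so TMS row 0 (the southernmost) becomes OSM row 3.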

Overlay from public domain map, http://nla.gov.au/nla.map-rm2795. Background CC BY-SA 2.0 OpenStreetMap Contributors, http://www.openstreetmap.org/

Parish map as backdrop in JOSM. Data CC BY-SA 2.0 OpenStreetMap Contributors, http://www.openstreetmap.org/. Background public domain map PMapMN04/14015601.

Overlay from public domain map, PMapMN04/14015601. Background CC BY-SA 2.0 OpenStreetMap Contributors, http://www.openstreetmap.org/
I would post a Google Earth one too, but it’s too much effort to get a free background in there for the screenshot. I’m not convinced that this display of the data is user friendly. Having control of the transparency of the overlay is a must. Maybe one day someone will crop out all the non-map parts of the parish maps so we can get a single slippy map of the parish maps for the whole of NSW.
I suppose now I need to focus on the infrastructure. It should be really easy for a user to browse the available maps and view them either as a KML or as an OpenLayers overlay. I should also plug this into the metadata I scraped and have stored in CSV-like files.
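For the KML side, here is a minimal sketch of what a GroundOverlay with adjustable transparency could look like. The function name, its parameters and the idea of passing the corner coordinates by hand are assumptions made for illustration, not something taken from my scripts. It also assumes the image is already aligned north-up in latitude/longitude, so a plain LatLonBox is enough; maps that need real warping would not fit this.

```python
from xml.sax.saxutils import escape


def ground_overlay_kml(name, image_url, north, south, east, west, opacity=0.6):
    """Return a KML document holding one semi-transparent GroundOverlay."""
    # KML colours are written as aabbggrr in hex. White with a variable alpha
    # leaves the image colours alone and only fades the overlay.
    alpha = max(0, min(255, int(opacity * 255)))
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2">\n'
        '  <GroundOverlay>\n'
        f'    <name>{escape(name)}</name>\n'
        f'    <color>{alpha:02x}ffffff</color>\n'
        f'    <Icon><href>{escape(image_url)}</href></Icon>\n'
        '    <LatLonBox>\n'
        f'      <north>{north}</north>\n'
        f'      <south>{south}</south>\n'
        f'      <east>{east}</east>\n'
        f'      <west>{west}</west>\n'
        '    </LatLonBox>\n'
        '  </GroundOverlay>\n'
        '</kml>\n'
    )
```

Google Earth also has its own transparency slider for overlays, so the `color` element here mostly sets a sensible default.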
The problem I have with distribution right now is that many of the maps need warping, and that means I need to host the warped image somewhere. Some could probably be georeferenced from their source image using just translate, scale and rotate, and hence should be able to use the source image from the government server to serve the georeferenced imagery. But the workflow I’ve set up so far relies on gdalwarp, and hence on having access to the warped image.
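For the maps that don't need real warping, one option is to describe the georeference as six numbers in a world file that sits next to the untouched source image. The sketch below is not part of my workflow. It fits a general affine transform (translate, scale and rotate, plus shear) exactly through three control points. The names `solve3` and `world_file_from_gcps` are made up, and it assumes pixel coordinates are measured from the top-left corner of the image, with (0.5, 0.5) being the centre of the first pixel.

```python
def solve3(m, v):
    """Solve the 3x3 linear system m * x = v using Cramer's rule."""
    def det(a):
        return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
                - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
                + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

    d = det(m)
    if abs(d) < 1e-12:
        raise ValueError("control points are collinear or repeated")
    solution = []
    for i in range(3):
        mi = [row[:] for row in m]
        for r in range(3):
            mi[r][i] = v[r]  # swap column i for the right-hand side
        solution.append(det(mi) / d)
    return solution


def world_file_from_gcps(gcps):
    """Return the six world file values for three (col, row, x, y) points."""
    if len(gcps) != 3:
        raise ValueError("this sketch needs exactly three control points")
    m = [[float(col), float(row), 1.0] for col, row, _, _ in gcps]
    a, b, c = solve3(m, [x for _, _, x, _ in gcps])  # x = a*col + b*row + c
    d, e, f = solve3(m, [y for _, _, _, y in gcps])  # y = d*col + e*row + f
    # A world file stores the position of the centre of the top-left pixel,
    # so evaluate the transform at pixel coordinate (0.5, 0.5).
    c_centre = a * 0.5 + b * 0.5 + c
    f_centre = d * 0.5 + e * 0.5 + f
    # World file line order: A, D, B, E, C, F.
    return [a, d, b, e, c_centre, f_centre]
```

Writing the six values one per line into a file named after the image (for example `.pgw` for a PNG) is the usual convention. With more than three control points a least-squares fit would be the natural extension, and large residuals would flag the maps that really do need warping.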
Hello Andrew,
I am the author of GDAL2Tiles and MapTiler, which were created as part of the project “Old Maps Online”: http://help.oldmapsonline.org/. I have been working for some time on a set of tools for georeferencing scanned maps.
You can try our tools at: http://www.georeferencer.org/. It is a free online service. There is an introduction video too (unfortunately a bit raw at the moment). The tools directly support all NLA maps and we can add support for more maps easily.
We provide a comfortable online georeferencing tool, KML for the original imagery, and we also have a WMS which runs as a proxy for the original image, tiles for Google Maps and OSM, etc. There is a MapAnalyst tool too – which provides the cartometric accuracy visualization.
Most of the functionality is done already – we are now stabilizing the system, and improving the look and feel of the service and its general usability. A cool new design is on the way… We hope to officially announce the service to the general public next year.
BTW you can download the control points you create in our system – and use them with gdalwarp or other tools, if you have access to the imagery. We plan to expose an API based on OpenLayers – so other people can extend the functionality and create cool new functionality for already georeferenced maps.
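As a rough sketch of the "use them with gdalwarp" step: the usual two-stage approach is to attach the control points with `gdal_translate -gcp` and then let `gdalwarp` reproject the result. The helper below only builds the two command lines and runs nothing. Its name, the output file names, and the choice of EPSG:4326 for the control points and EPSG:3857 for the output are assumptions for illustration, and the control points are taken as plain (pixel, line, easting, northing) tuples rather than any particular download format.

```python
import shlex


def georeference_commands(src, gcps, gcp_srs="EPSG:4326", out_srs="EPSG:3857"):
    """Build gdal_translate and gdalwarp argument lists for one image.

    gcps is a list of (pixel, line, easting, northing) tuples.
    """
    vrt = src + ".gcps.vrt"          # hypothetical intermediate file name
    warped = src + ".warped.tif"     # hypothetical output file name

    # Step 1: attach the control points without touching the pixels.
    translate = ["gdal_translate", "-of", "VRT", "-a_srs", gcp_srs]
    for pixel, line, easting, northing in gcps:
        translate += ["-gcp", str(pixel), str(line), str(easting), str(northing)]
    translate += [src, vrt]

    # Step 2: warp into the target projection using those control points.
    warp = ["gdalwarp", "-t_srs", out_srs, "-r", "bilinear", vrt, warped]
    return translate, warp


if __name__ == "__main__":
    example = [(0, 0, 150.0, -33.0), (100, 0, 151.0, -33.0), (0, 100, 150.0, -34.0)]
    for cmd in georeference_commands("map.tif", example):
        print(shlex.join(cmd))
```

Each list can be handed to `subprocess.run` as is, which avoids shell quoting problems with odd file names.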
There is also a search engine specifically designed for searching large collections of georeferenced maps: http://www.mapranksearch.com/.
Feel free to contact me if you have any questions or comments to our tools.
Just to add a sample:
if you are standing at NLA on the “Zoom” view – just use the Georeferencer bookmarklet.
For example:
http://www.nla.gov.au/apps/cdview?pi=nla.map-nk6485-sd
Will bring you to:
http://www.georeferencer.org/map/3T58S5MtCIBQYZAPGlqa7X/
Regards,
Klokan
Hi Klokan,
Good to hear from you. (gdal2tiles.py is very handy! Thanks for releasing it.) I haven’t had a chance to dig too deep into the “Old Maps Online” project, but on the surface it seems like a fantastic service.
For me and the aims I mentioned in this post, it seems like the project would definitely be of help in sharing this data and making it useful. Now that I’ve already (mostly) done Part 1 (Scraping of the scanned maps) and got the workflow set up nicely for Part 2 (Georeferencing), I’ve just got Part 3 (Sharing) left. Part 3 is a big task. The interface at http://www.georeferencer.org/map/3T58S5MtCIBQYZAPGlqa7X/201010041111-JGPUNOz/ is very much like what I hoped to get set up to make the NLA or PMap maps useful. Is the interface and backend to http://www.georeferencer.org free software?
Web based referencing services (http://www.georeferencer.org/ and http://mapwarper.net/) seem great for the casual user who only wants to do a bunch of images, or would like to just donate a little bit of time and not have to do heaps of work to set everything up. But for me the latency is too much.
About this one http://www.georeferencer.org/map/3T58S5MtCIBQYZAPGlqa7X/ how many other NLA maps are georeferenced (no one wants to go through the painstaking task of manually georeferencing if they don’t have to)? Is there an index? Can I download the GCPs in more machine readable format?
Moving forward, would you be interested in hosting many more of the NLA maps (and also the other pmap ones from my post) on http://www.georeferencer.org, if I could provide, in an easy to parse format, the links to the raw sid files and the GCP referencing data for those files? I have made these available, but they aren’t too easy to find and use.
Regards,
Andrew
Hi Andrew,
to answer your questions:
> Is the interface and backend to http://www.georeferencer.org free software?
The interface is built with OpenLayers. We contributed patches to add the missing functionality to the library. They were accepted and now you can find our modifications in the official OpenLayers tree.
The backend for storing control points is built with AppEngine. The mapping services behind it are based on patched GDAL tools, MapServer and IIPImage. We have released most of the components as free software too – patches went back to the original projects.
The complete application is designed as a “cloud service”. As such it is easy to customize (logos, custom domains, etc) or embed in other web applications, but not easy to install on your own hardware – and you probably do not need to do that. If you really want to install your own copy of a georeferencing interface then you had better use MapWarper or Metacarta Rectifier. But it takes time, and you need to upgrade regularly for new features and security fixes.
We are keen to maintain one online instance of the Georeferencer – this means that newly developed functionality is exposed automatically to all projects that are using Georeferencer. It is a centralized model similar to the one GMail has.
If you choose to use Georeferencer web service – you can just implement GCP extraction and run your own mapserver with applications you want to develop. Georeferencer is going to provide you the georeferencing interface and plenty of other tools for free. You can create your own next to that.
If you create a cool service on top of georeferenced maps we are keen to make it official – available for all maps which are inside of Georeferencer, and link you from the front page of every map.
> How many other NLA maps are georeferenced?
All of the publicly available NLA maps are supported by the system. Not many are georeferenced. But anybody can help and participate in the georeferencing via the Georeferencer online interface.
> Is there an index? (of NLA maps)
No, because we do not have official permission from NLA at the moment. But you can always get the individual maps from Georeferencer if you want.
Were you speaking with people from NLA about similar applications?
To go the official way is always better.
> Can I download the GCPs in more machine readable format?
Yes. Indeed:
http://www.georeferencer.org/map/3T58S5MtCIBQYZAPGlqa7X/201010041111-JGPUNOz.json
If you say it is slow and with big latency – you probably mean the WMS service / tiles. You are free to use GCP data from Georeferencer and set up your own mapserver on your hardware infrastructure – to have the WMS locally and fast. But then you need access to the imagery (which you seem to have already anyway) and you need to host the images on your server.
Please let me know if you will go this way.
> would you be interested in hosting many more of the NLA (and also the other pmap ones from my post) on …
We are interested in hosting the GCPs. We do not host the imagery – these are always linked from the original Internet location. I have seen your data at https://github.com/andrewharvey/georeferencing-data
We have a REST API where it would be possible to post data like yours, but at the moment it is not stable enough and we are busy with another project, so there is no time to improve the API until the end of the year. But it is going to happen.
Right now you can write a simple application where you create an HTTP form and POST your data in the “gcps” attribute to the correct URL. Otherwise you can just create new control points with Georeferencer – and download them in JSON format for use in your workflow.
This way the georeference is collaborative – other people can help you and you are sharing your work on control points with community.
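To make the form POST idea concrete, here is a very rough sketch. Only the field name “gcps” comes from the description above. The URL is a placeholder, JSON as the encoding of the control points is a guess, and the function name is made up, so treat every detail as an assumption to be checked against the real service. The function only builds the request and sends nothing.

```python
import json
from urllib import parse, request


def build_gcp_post(url, gcps):
    """Build (but do not send) a form-encoded POST carrying control points.

    url  -- placeholder; the real endpoint is not given in this thread
    gcps -- list of control points; JSON encoding is an assumption here
    """
    body = parse.urlencode({"gcps": json.dumps(gcps)}).encode("ascii")
    return request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


if __name__ == "__main__":
    # Hypothetical endpoint and control point layout, for illustration only.
    req = build_gcp_post(
        "https://example.invalid/map/SOME-MAP-ID/gcps",
        [{"pixel": 0, "line": 0, "lon": 150.0, "lat": -33.0}],
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("ascii"))
```

Sending it would then be a matter of passing the request to `urllib.request.urlopen`, once the real URL and expected payload are known.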
Just to say it clearly:
With Georeferencer we are not downloading the images, we are not hosting the images on our servers, we don’t need to do that technically.
We are cooperating with libraries and allowing to open their data to GIS community officially, if there is an interest. The decisions must come from the side of people who are scanning the maps.
Have a look at some of the applications of Georeferencer tools:
http://geo.nls.uk/maps/georeferencer/
http://www.nationaalarchief.nl/georefereren/
We are keen to participate also with National Library of Australia or other libraries in Australia. Officially.
I am looking forward to see what cool application you create with our Georeferencer. I hope it is interesting tool for you…
Regards,
Klokan (Petr Pridal)
Klokan,
>All of the publicly available NLA maps are supported by the system. Not many are georeferenced. But anybody can help and participate on the georeferencing via the Georeferencer online interface.
This was my goal, and part of the reason why I put the GCP data for the ones I’ve done on GitHub, so anyone can help out by forking and then issuing a merge request. Sure, this isn’t really good, because the casual user won’t be able to do this, and shouldn’t have to. I envisioned a nice web interface where you could add GCPs and also post amendments to them (and see the revision history).
>> Is there an index? (of NLA maps)
>No, because we do not have official permission from NLA at the moment. But you can always get the individual maps from Georeferencer if you want.
>Were you speaking with people from NLA about similar applications?
>To go the official way is always better.
I meant: how can I find out which ones are done already, so that (assuming they are under a compatible license) I can merge these into my collection?
I have not spoken to people from NLA. In my opinion, scanned copies of public domain maps are public domain, which is the policy taken by Wikimedia. Not sure what you mean by “official way” though.
>> Can I download the GCPs in more machine readable format?
>Yes. Indeed:
>http://www.georeferencer.org/map/3T58S5MtCIBQYZAPGlqa7X/201010041111-JGPUNOz.json
Cool. Just a tip, it would be great if there was a link to this somewhere in the HTML page.
>If you say it is slow and with big latency – you probably mean the WMS service / tiles. You are free to use GCP data from Georeferencer and set up your own mapserver on your hardware infrastructure – to have the WMS locally and fast. But then you need access to the imagery (which you seem to have already anyway) and you need to host the images on your server.
>Please let me know if you will go this way.
Just generally, I found OpenLayers to be slow. My method of georeferencing uses a local copy of the old map and a locally cached OSM mirror.
>> would you be interested in hosting many more of the NLA (and also the other pmap ones from my post) on …
>We are interested in hosting the GCPs. We do not host the imagery – these are always linked from the original Internet location. I have seen your data at https://github.com/andrewharvey/georeferencing-data
I see. The problem I found with doing this is that gdalwarp sometimes actually warped the images, and even when it only, say, rotated one, I wasn’t sure how to get a world file out of these GCPs using gdal so that I could use the original image. I saw your tool gcps2wld or something like that, but I couldn’t get it to work.
>We have a REST API where it would be possible to post data like yours, but at the moment it is not stable enough and we are busy with another project, so there is no time to improve the API until the end of the year. But it is going to happen.
>Right now you can write a simple application where you create an HTTP form and POST your data in the “gcps” attribute to the correct URL. Otherwise you can just create new control points with Georeferencer – and download them in JSON format for use in your workflow.
>This way the georeference is collaborative – other people can help you and you are sharing your work on control points with community.
I see. I might look into this later, but I’m short on time at the moment.
>Just to say it clearly:
>With Georeferencer we are not downloading the images, we are not hosting the images on our servers, we don’t need to do that technically.
>We are cooperating with libraries and allowing to open their data to GIS community officially, if there is an interest. The decisions must come from the side of people who are scanning the maps.
>We are keen to participate also with National Library of Australia or other libraries in Australia. Officially.
I don’t represent the NLA; better to ask them.
Regards,
Andrew