Categories: Leaflet, maps, OLPC, OpenStreetMap

Offline Solution for OpenStreetMap (OSM)

I’ve recently become involved with XSCE (School Server Community Edition) on their “Internet in a Box” project to make OpenStreetMap (OSM) maps available offline. Some of their deployments in remote schools around the world do not have consistent internet access. The idea is to download and store a set of knowledge resources (Wikipedia, videos from Khan Academy, OSM maps, etc.) on a server, which then provides those resources offline to laptops connected to the internal network.

Here are the constraints that need to be considered:

  • The laptops that will be visualizing the maps are very underpowered. They are often XO laptops from the One Laptop per Child (OLPC) project.
  • The servers, while not as underpowered as the laptops, are typically quite limited as well in disk space, RAM and CPU.
  • The server handles tasks other than providing maps, so the map solution cannot use all of the available hardware.
  • Server specs are not consistent from one deployment to the next (though they all have to run the XSCE software).
  • Deployments’ needs are rarely the same: they can be in any region of the world, and each might not want the same level of map detail for the same countries.
  • The server is typically configured by a volunteer with internet access before it is deployed to a remote location. While volunteers do have IT knowledge, the setup needs to be simple enough.
  • The map does not need to be updated every week, but it should be relatively recent. If the server gets internet access once in a while, it should be able to update the maps relatively easily.

The solution chosen is shown in the architecture diagram below.

Since the server specs are limited, the map tiles need to be pre-rendered before they make it to the XSCE Internet-in-a-Box server. They cannot be rendered on the fly with the standard OSM stack, which uses a PostgreSQL database with PostGIS, because that requires too many resources and would mean provisioning a separate database for each deployment.

The pre-rendered tiles are stored in an MBTiles file, a format created by Mapbox that efficiently stores millions of tiles in a SQLite database (and therefore in a single file). It is efficient because it avoids storing duplicate tiles, which are frequent over large areas of water. This also simplifies deployment because all you have to do is move a few files around instead of potentially copying millions of PNG tiles stored directly on disk.
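Since an MBTiles file is just a regular SQLite database, you can peek inside one with the standard sqlite3 command-line client. A quick sketch of what that looks like (the file name is a placeholder; per the MBTiles spec there is a metadata table of name/value pairs and a tiles table, which deduplicating implementations often expose as a view over separate map and images tables):

sqlite3 nepal.mbtiles

-- tileset description: name, format, bounds, minzoom, maxzoom, ...
sqlite> SELECT name, value FROM metadata;

-- the tiles themselves, addressed by zoom level, column and row
sqlite> SELECT zoom_level, tile_column, tile_row, length(tile_data) FROM tiles LIMIT 5;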

To save precious disk space, there will be a global planet OSM MBTiles file (limited to zoom level 10, which is roughly city level), and each country will be available for download as a separate pre-rendered MBTiles file (for zoom levels 11 to 15). So for example, if the deployment is in Nepal, they could download the planet MBTiles file to the server to get a map of the whole world, and then download the higher-zoom file only for Nepal, allowing users to zoom down to street level. Downloading the whole world at zoom levels up to 15 would require well over 1 TB of disk space, which we cannot handle. This is why we only get the high zoom levels for the countries that a deployment needs, based on how much disk space it has to spare.

To serve the MBTiles files on a web server, there are a few options such as TileStream (Node.js) and TileStache (Python). I chose TileStache because it supports composite layers, which allows serving multiple MBTiles files at the same time. TileStream only serves one MBTiles file at a time, which would require merging multiple MBTiles files together; that is possible, but it complicates deployment and makes it harder to add or remove specific countries later on. TileStache can serve tiles over WSGI, CGI and mod_python with Apache. XSCE also happens to already run multiple Python tools and uses WSGI with another tool, so the integration was easier (click here for details on the integration).
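To give an idea of what the server side looks like, here is a minimal TileStache configuration sketch serving two MBTiles files as two separate layers (the layer names and file paths are made up for this example; the built-in mbtiles provider does the actual work):

{
  "cache": { "name": "Disk", "path": "/var/cache/tilestache" },
  "layers": {
    "planet": {
      "provider": { "name": "mbtiles", "tileset": "/library/osm/planet_z0-z10.mbtiles" }
    },
    "nepal": {
      "provider": { "name": "mbtiles", "tileset": "/library/osm/nepal_z11-z15.mbtiles" }
    }
  }
}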

Then all you need is a simple HTML page that loads Leaflet as a client-side JavaScript library and is configured to query the TileStache tile server on the local network.
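As a rough sketch of that page (the host name, port and layer name are placeholders that depend on how TileStache is mounted on the XSCE server), the client side boils down to something like:

// Minimal Leaflet setup querying a TileStache layer on the local network
var map = L.map('map').setView([27.7, 85.3], 7); // centered on Nepal, as an example

L.tileLayer('http://schoolserver.lan:8080/planet/{z}/{x}/{y}.png', {
  minZoom: 0,
  maxZoom: 15,
  attribution: '&copy; OpenStreetMap contributors'
}).addTo(map);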

This solution is entirely based on raster tiles instead of vector tiles. While vector tiles offer significant savings in disk usage, they require much more CPU on the client and newer browsers to render, which is not workable with the type of hardware that we have (XO laptops).

The big remaining questions are: where are those tiles rendered, where are they stored, and how can they be downloaded on demand by the XSCE server? That is a topic for a future blog post!

Categories: maps, OpenStreetMap

Merging multiple MBTiles together

I’ve started to use TileStream, a Node.js server, to serve pre-rendered tiles saved in a .mbtiles file. This server can then be used as a tile layer by Leaflet to build a frontend HTML page that shows a map.

As I said in a previous blog post, MBTiles is a format created by Mapbox that stores millions of tiles in a SQLite database, which is useful if you want to build an OpenStreetMap solution that stores tiles offline or serves them from your own server.

There are many alternatives to TileStream, such as TileStache (in Python). One benefit of TileStache is that it supports composite layers, which would allow you to serve multiple MBTiles files at the same time on the same map. One reason you might want to do that is if you have an MBTiles file that contains the map tiles of the whole OSM planet at zoom levels 0 to 10, and another MBTiles file that contains a specific country at zoom levels 11 to 15. By combining both, you get a seamless experience where someone can zoom up to level 10 anywhere in the world and then up to level 15 in that specific country.

Since TileStream does not support this, you can instead merge the two MBTiles files into a single file, which TileStream can then serve.

The MBUtil project provides a patch shell script that does exactly that, available here: https://github.com/mapbox/mbutil/blob/master/patch. It is as easy as executing the script with “source” and “destination” arguments for the two files to merge (the destination MBTiles file becomes the merged file). Example:

./patch.sh Nepal013.mbtiles Nepal-1415.mbtiles

This script can also be used if you want to update an existing larger MBTiles file with a newer MBTiles file (that might contain newer tiles for a specific region).

While this script merges the two sets of tiles together, it does not update the metadata of the MBTiles file. For example, if my destination file was an MBTiles file that contained the Nepal region at zoom levels 14-15 and I merged it with zoom levels 0-13, the metadata in the destination file will still say that the minzoom and maxzoom are 14 and 15. I downloaded DB Browser for SQLite (Mac/Windows/Linux), opened my merged MBTiles file, went to the metadata table, and from there it is easy to update the minzoom to 0. (This step might not be required, depending on how strict your MBTiles implementation is, but it is good practice to have your metadata match the actual data in the MBTiles file.)
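If you prefer the command line over a GUI, the same fix can be applied with the sqlite3 client (using the merged destination file from the example above):

sqlite3 Nepal-1415.mbtiles "UPDATE metadata SET value = '0' WHERE name = 'minzoom';"

# double-check that the metadata now matches the merged data
sqlite3 Nepal-1415.mbtiles "SELECT name, value FROM metadata;"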

After those steps, TileStream was able to serve the single merged MBTiles file across all zoom levels.

Categories: maps, OpenStreetMap

Running Maperitive on MacOS X

I was looking for a way to run Maperitive, a piece of software that converts OpenStreetMap (OSM) data files in .pbf and .bz2 format (which you can find on Geofabrik) into the MBTiles file format. MBTiles is a format created by Mapbox that stores millions of tiles in a SQLite database, which is useful if you want to build an OSM solution that stores tiles offline.

Maperitive is a .NET application that typically works with Mono on Linux and Mac and works natively on Windows. However, the author does not provide official support for Mac (installation instructions for Linux are here).

I first downloaded Maperitive (2.3.33 at the time of this writing).

I first tried installing Mono with Homebrew and then launching the executable maperitive.exe located inside the Maperitive directory:
mono maperitive.exe
However, I got this error:
Unhandled Exception:

System.TypeInitializationException: An exception was thrown by the type initializer for System.Drawing.KnownColors ---> System.TypeInitializationException: An exception was thrown by the type initializer for System.Drawing.GDIPlus ---> System.DllNotFoundException: libgdiplus.dylib

Luckily, I found a recent solution in this GitHub ticket.

First, I had to install Homebrew Cask:

brew install caskroom/cask/brew-cask

Then I installed the Mono MDK with brew cask:

brew cask install mono-mdk

And then opened the installer to install the Mono MDK on OS X:

open /opt/homebrew-cask/Caskroom/mono-mdk/4.0.2/MonoFramework-MDK-4.0.2.macos10.xamarin.x86.pkg

(At the time of this writing, this was Mono 4.0.2; adjust the path in the previous command accordingly.)

After that, all I had to do was type:

env PATH=/Library/Frameworks/Mono.framework/Commands:$PATH mono Maperitive.exe

And Maperitive launched!

I didn’t end up using Maperitive (it does everything in RAM and does not scale well with a big OSM file), but I thought I would share the solution here.

Categories: Crossfilter, D3, JavaScript, Leaflet, maps

Creating a data visualization tool using D3.js, Crossfilter and Leaflet

I’ve recently completed my first JavaScript data visualization project at The Sentinel Project for Genocide Prevention: a dashboard that tracks indicators of hate crimes in Iran.

You can take a look here:
http://threatwiki.thesentinelproject.org/iranvisualization

Screenshot of Threatwiki Data Visualization

Initial requirement

I had a set of data coming from a JSON REST API, and I wanted to show the main data in a table, display the geographical coordinates of the data on a map, and be able to filter by date and tags.

Technology choices

I chose Crossfilter to filter through the data, D3.js to generate and display the data, and Leaflet.js to create the map itself.

Implementation details

For the map generated with Leaflet.js, I used CloudMade (which I highly recommend) to get maps from OpenStreetMap. I added a layer on top of the map to show the different regions of the country, which came from Natural Earth shapefiles, was converted with TopoJSON and was added to the map with D3. Leaflet turned out to be a great tool that I will definitely use again; I used the D3 + Leaflet tutorial from Mike Bostock to learn how to combine the two tools. To add the datapoints at specific locations on the map, I used the marker functionality of Leaflet. I started by adding the datapoints manually on the map, generating the proper SVG tags with D3, but I switched to markers because this will allow me to later integrate with Leaflet Marker Cluster. This Leaflet plugin combines multiple closely located points on a map, which becomes useful when you have a high concentration of points in a small region that are hard to distinguish at the normal zoom level. The Let’s Make a Map tutorial, also by Mike Bostock, comes in handy when working with D3 and maps.
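For reference, the marker clustering part looks roughly like this with the Leaflet.markercluster plugin (the datapoints array and its fields are simplified placeholders, not the actual ThreatWiki data model):

// group nearby datapoints into clusters instead of drawing every marker individually
var clusterGroup = L.markerClusterGroup();

datapoints.forEach(function(d) {
  var marker = L.marker([d.lat, d.lng]); // placeholder fields: latitude/longitude
  marker.bindPopup(d.title);             // placeholder field: a short description
  clusterGroup.addLayer(marker);
});

map.addLayer(clusterGroup);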

There is a real learning curve to understanding D3.js. Reading code from examples is not always enough; I had to read about the core concepts and the introduction available on their GitHub wiki. However, once you understand the concepts, you realize that this is a very powerful tool to generate all kinds of visualizations, either graphically (by drawing SVG elements) or just by displaying a set of data as text.
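The core concept that takes the most getting used to is the data join: you bind an array of data to a selection and describe what should happen for each entering element. A tiny illustrative sketch, unrelated to the project code:

var dataset = [4, 8, 15, 16, 23, 42];

// one <p> element per datum: data() performs the join, enter() holds the new items
d3.select('body').selectAll('p')
    .data(dataset)
  .enter().append('p')
    .text(function(d) { return 'value: ' + d; });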

To be able to filter by tags, date and other metadata, I used Crossfilter, an open-source library developed by Square. To learn about Crossfilter, I used this excellent example on their website; I reused their example code to create the bar chart that is used as a timeline. There is also a great tutorial on the Wealthfront Engineering blog that really helped me understand the concepts behind the library. Crossfilter was definitely easier to grasp and understand than D3. The most recent version of the tool, 1.2.0, released just a few days ago, is definitely necessary: it introduces filterFunction, which gives you more control over how you filter your data.

For typical filtering in Crossfilter, you simply pass a value to the filter function of a chosen dimension and Crossfilter does the filtering. You need to implement a filter function when your data structure gets more complex. Let’s say a record can have one or multiple tags. If you filter just by passing the name of a tag, you will only get back the records whose value is exactly that single tag, and miss all the records that have this tag combined with other tags.
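For example, with dimensions on simple single-valued fields, plain filtering is a one-liner (the record fields below are hypothetical):

var cf = crossfilter(records); // records: array of plain objects
var byDate = cf.dimension(function(d) { return d.date; });
var byRegion = cf.dimension(function(d) { return d.region; });

byRegion.filter('Tehran');                                          // exact-match filter
byDate.filterRange([new Date(2012, 0, 1), new Date(2013, 0, 1)]);   // date range filter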

So you want to implement a filterFunction like this:

window.filter = function(tagname) {
  // byTags is a Crossfilter dimension whose value is the record's array of tags
  byTags.filterFunction(function (tags) {
    if (tags !== null && typeof(tags) !== 'undefined') {
      // keep the record as soon as one of its tags matches the requested tag name
      for (var i = 0; i < tags.length; i++) {
        if (tags[i].title === tagname) {
          return true;
        }
      }
    }
    return false;
  });
};

In that case, we go through the record’s array of tags and return true to the filter as soon as we find the matching tag, so it doesn’t matter whether the record has 1 tag or 10 tags.
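Calling it then looks something like this (the tag name is just an example, and renderAll is a placeholder for whatever redraws your table, map and timeline):

window.filter('SomeTag'); // keep only records carrying that tag
renderAll();

byTags.filterAll();       // clear the tag filter again
renderAll();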

To learn more and get involved

If you are interested in looking at the code, our project is hosted on GitHub. You can also look directly at the visualization JavaScript file. The data for this project is freely available to anyone who wants it; drop me an email if you need information on that.

The visualization for The Sentinel Project for Genocide Prevention was announced on their blog.

If you have some experience with data visualization, maps or GIS, or you are just a data nerd, send me an email; we have lots of interesting projects coming up!