Heatmap of cycling accidents in Edinburgh 2005–2017

Spatial data API with GraphQL, PHP and MySQL

Some years ago I bought an internet domain offering a PHP based server backed by a MySQL database. I kept it for years without using it because I had neither the desire nor the time to write or develop. On the other hand, since when I graduated in GIS I have always had the desire to take up and improve the final dissertation work. One of the objectives of the project was to develop a cycling route planner for Edinburgh which took into account past cycling accidents; providing the users with potentially safer cycling routes compared to those of the common cycling route planners, here you can see the presentation web page. Though it required a lot of work and gave a me an interesting insight on the distribution of cycling accidents in the city of Edinburgh, the final product was not ready to be published. I have recently decided to resume some of that work exploring the possibilities offered by APIs. I wanted to design one myself and I wanted to use an innovative approach for spatial data. I took the data I used for that research, updated it and loaded in MySQL to serve it through a GraphQL API. In the next parts of the story I am going to tell you how I did it, so you you could use this as an example to replicate or design some better solutions. This story covers the design of the back end data handling that could serve a mapping application based on it.

If you want to see the live demo of a mapping application built with this API at its core head over here.

For those who already know how to build a GraphQL API and want to have a look at the data you can jump straight to test the API with this end point:

http://www.yomapo.com/edicycle/server/accidents_api.php

click here to see an example of a request (if the link does not work just scroll to the bottom of this page).

The code of this tutorial (both server and demo) is on GitHub. This tutorial assumes you already have a server environment running a PHP interpreter and a spatially enabled database (not necessarily MySQL). In my case both are hosted but you can set up your local ones (i.e. with XAMMP on Windows). The PHP version used in this tutorial 5.36 and MySQL version is 5.6. This tutorial assumes you have some basic knowledge of the GraphQL specification, if you don’t and you want to learn more I can recommend these videos.

What we are going to build

We will build a simple GraphQL spatial data API. We will be able to query geo- referenced cycling accidents occurrences in the city of Edinburgh between 2005 and 2017. The data I used is called STATS19, is public and can be downloaded from here. The source data covers the whole United Kingdom and records a lot of information about every single accident (time, vehicles involved, weather conditions, etc.). For this example I pre-filtered the original source data, keeping information about accidents date and severity only. The source data is stored on a web server hosted database. The client will be able to filter accidents by attributes and spatial location, data will be delivered in GeoJSON and it will be readable by the most common mapping JavaScript libraries (Leaflet, OpenLayers). In this tutorial we will only cover the Read method of the canonical CRUD paradigm.

The tools we will use — GraphQL and MySQL

GraphQL could be considered an API specification, a query language (more exactly a Schema Definition Language) and a server side runtime at the same time. The first definition derives from the fact that despite being based on the HTTP protocol and being totally stateless GraphQL does not use the URIs to define access to resources, it partially uses the HTTP methods and these are not related their operations, as opposed to REST. The representation of the to-be-accessed resource and method are included in the body of the request sent from the client under the form of a graph entity (hence the name) which structure can be compared to a JSON object but remember that the GraphQL specification does not depend on any language, the GraphQL specification is called Schema. Within the schema we can find different entities; object types for describing data, queries for reading resources or mutations for creating/updating them. The GraphQL schema enforces a contract in the exchange of data between the server and the client, which means that the client sends a request that already includes the representation of the structure of the wanted resources and the server responds following that data representation .

Example of the GraphQL DB (Deutsche Bahn) API to query information about train stations — we can see the structure of the retrieved results (right) are a representation of the query schema on the left. Available at https://bahnql.herokuapp.com/graphql

The server runtime quality of GraphQL expresses itself in the processes used by the API to accept, interpret and gather the data to send as a response. The way it’s done depends on the implementation of the server side language, but generally is based on the server side definition of object types that describe how to interpret the client request and the method to access the data. These object types also have functions called resolvers that are the interface with the data storage.

MySQL supports geometry types and some spatial capabilities without additional spatial extensions, useful for data storage of non-specialised spatial web applications. From version 5.7 MySQL can parse and format GeoJSON, the version used in this tutorial does not have this feature, so we will use GraphQL object types to recreate the structure of a GeoJSON object to include in the response.

Set up MySQL and graphql-php

This library is a layer between the PHP server and the native GraphQL schema implementation, and according to description given above will serve as the server runtime and schema definition. We will define the structure of the response and include inside it the queried data we will collect from the database. You can download the library from here, unzip the folder and paste it in the directory where we will store all the files that will make the API work, in my case is . Once that is done, we will create the main PHP file holding the API back-end business logic and which name will be our API endpoint, in my example the name of the file is in a hosted server this API endpoint will eventually show up as . Differently from what happens with REST, this URL will be unique for accessing all the API resources, besides the only method used will be POST. Let’s set up our

In the first line we require a file called , this file is needed for accessing the graphql-php classes we will need later. This file does not exist yet and it has to be manually created inside the library directory we unzipped earlier, in my example is named , the file has to look like follows.

The second require is for a class that wraps all the main database operations using PDO (PHP Data Objects) a PHP database driver for MySQL, already included in the standard PHP distribution. For the outcome of this example I will only include the method of the driver, wrapped into the static function, the file will be called and it has to be placed in the root directory of our application. The file will be like this

At this point your folder structure on the server will look like follows:

api_root_folder/
graphqlPHP/
autoload.php
src/
...
accidents_api.php
DB.php
...

Shaping the API structure(I) — Object Types

The code that follows will fill the white space in file and it will define the looks of both our input, the query, and our output, the GeoJSON object. In GraphQL the concept of data type is central, being a strongly typed query language we will have to define types for every single piece of information/property which will be used by the runtime environment to process the query and the output of the API. In GraphQL we have built-in standard types such as , , but we can also define our own custom types as nested objects of standard types, becoming instances of . This is what we are going to do here, being the GeoJSON a nested object with children objects (which has children too) and . Let’s first have a look at the structure of the GeoJSON Point feature we will return from our API, declaring the type for each single non-nested entity.

//accident GeoJSON feature{                                   //object
type: "Feature", //string
geometry: { //object
type: 'Point', //string
coordinates: [ //list of floats
-3.56, //float
56.36 //float
]
}
attributes: { //object
year: 2017, //integer
severity: 'Slight' //string
}
}

What we will do next is to “translate” this structure into the GraphQL schema using graphql-php and classes. Graphql-php offers support for lists that we will use to contain our features and custom types. Other than a list of GeoJSON objects, our response will contain an additional list of type objects calledholding a summary of the count of accidents for each year considered, let’s get the code in.

To create Type we use the inline syntax of graphql-php — for more info see the docs

A particular observation is worth pointing out is the property that could be potentially misinterpreted as a property of the output object. It is not the case, this is just a property required by graphql-php, will not become the name of the property in the output GeoJSON, these are defined in the

Shaping the API structure(II) — Schema Definition

As I said earlier GraphQL is a Schema Definition Language which means that the way the client accesses the resources of the API is specifying a specification of the schema, which is a query and a representation of the response at the same time. If in the previous part we defined the building blocks of the response, in this part we will define the query, matching fields and their types with the data from our storage. Let’s see how it’s done.

The query definition differentiates itself from other instances of class because we have two more properties, and . The first are the arguments passed by the client when making a request, just like the notation used for parameters in REST APIs. Those arguments will allow the client to specify accidents by year, severity and location. Also, the type is declared and it is a list for searching multiple years and severity intensities in the same query. The geometry filter is also a list of floats (coordinates), the client will provide and we will create a bounding box from those coordinates to spatially query our data. The field describes our resolver function, which is where the DB data is retrieved and transformed in the output GeoJSON. To return data from the database I created two stored procedures and accepting the same arguments we saw earlier . I will leverage MySQL spatial capabilities converting server side the coordinates in WKT format and using the spatial function on the geometry column of my database table.

Once we returned all rows from the stored procedure above we will iterate over them and for each one we will match the value of table row name with the property defined in the object of our schema. This last part is the one which seemed to me more ‘magic’ and harder to understand about field resolving, it works like this. We have a resolver for every field in the query and we will define the type for each field before we resolve, with the property. In our case types are 1 . list of and 2 . list of. When resolving the fields, as long as the field names we want to fill with data will match those defined in the property for that same field, we only have to replicate the structure of the object we want to return. Graphql-php will know at every level of nesting that we are actually trying to fill with data that particular field for that particular . Sounds complicated to understand at first but after spending some time observing the code and trying it out it will become familiar.

Formatting functions

These functions transform the “user friendly” parameters in “database friendly” parameters. Accident severity in fact is stored on the database as coded values (1: fatal, 2: serious, 3: slight). The parameter year is stored in the database as a string containing the complete date of the accident (i.e. ) but I prefer to manipulate it front end as an integer. In addition, the array of coordinates passed from the client is parsed into a WKT type as it’s accepted by the spatial function in the database, the code is the following.

Query Execution

What we have seen so far are all graphql-php classes to define our API data types, query and fields resolving. We haven’t yet instructed the library to turn these definitions into the actual data to return to the client, merging the query with the input parameters. This process takes place within the method, let’s see it in our code.

Before executing the query we wrap the object type in the graphql-php class, this is what will make it different from the others objects types defined before, the library will look into its fields and populate them with using the provided in the function. How do graphql-php get these ? These will parsed from the client query object, the argument in the example above, and saved in the variable as a list. Finally the query is executed,, and are sent to the graphql-php engine which will work out the results and return them as a PHP array that can be converted to GeoJSON and echoed back to the client.

Query Testing

An example of a valid query against the API is as follows:

As we can see the query object includes explicitly all the fields we want to return, if we omitted one of them the query would selectively not return it. The query will return serious and slight casualties accidents falling into the bounding box having the coordinates specified. In this case query arguments are hard coded for clarity but will be parametrised when we use the query for a web application, for example firing a request following an user-driven event (we will see this later in the front-end implementation). If we wanted to test the API we could use GraphiQL, a UI-based GraphQL client that allows us to format the query and see the response side-by-side. Let’s see it for our API with the query above.

Conclusions

This is how a simple spatially enabled API is built, it was exciting and a bit challenging. The hosted environment put the constraints on the back-end stack and forced me to confront PHP which is language I didn’t know very well. From the experience acquired with this project I can say it’s great, despite the notation could seem unusual compared to other server side languages, it’s really easy to pick up and I could define it “slick” and “predictable” when it runs, I surely will study it more in the future. Same for MySQL, discovering its spatial capabilities made it even more interesting and I think it can be the right choice for many non-too- specialised spatial data web based applications. The tool I struggled a bit more with was graphql-php, I found the documentation not very suitable for PHP beginners plus the PHP notation surely doesn’t help with that. If you want to start using the tools seen in this story and you currently develop mostly in Javascript/Typescript I would recommend to rely on the GraphQL documentation for those languages first, build something simple and then move to graphql-php once your basis are solid.

The design of this API is just a component of a complete yet simple full stack application I am working on, to finally accomplish the objective of publishing my former academic work. If you liked this please show your appreciation and continue following, see you!!

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store