Publishing OpenData with AppNow


OpenData logo by @fontanon

Open Data is a philosophy of making data public and accessible on the internet: governmental data, expenditures and investments, economic indicators, resource exploitation, anonymized medical information, weather information, genomics, or space exploration, to cite a few. For example, Data.gov and Data.gov.uk are good sources of governmental data from the US and UK governments respectively.

This movement is a prerequisite for others like:

  • Open Government, to add transparency to the work of our politicians, or
  • Open Science, where scientific facts and research papers are freely shared on the net so that anyone can conduct further research, today or in the future, on top of current knowledge. This is what Newton referred to as “If I have seen further it is by standing on the shoulders of giants”, and it is one of the best ways to boost human knowledge and scientific progress.

These days, one can argue that we are surrounded by tons of data. Our capacity to process it is quickly saturated, and filtering mechanisms are needed more than ever to separate the signal from the noise.

On the other hand, not all relevant data is published. Not all of it is in digital form, and not all of it is available to others. Providing a fast and cheap publication mechanism can help valuable data to be published as open data.

What can we do as developers?

Open Data requires software, in particular two types of software:

  1. Software to publish the relevant data for others.
  2. Software to find, combine, process and visualize relevant facts about the former.

Today we dig a little into the first type of software, software for publication, leaving the second one for another blog post.

Publication. We as developers provide tools that make it easy for end users to publish online the relevant data they have access to.

A few weeks ago, in a Twitter conversation about possible AppNow use cases, Pedro Gonzalez (@pgonyan) suggested studying its potential for OpenData projects.

We loved the idea and started thinking: how fast can we enable people to publish data online with AppNow?
TL;DR: 3 minutes, 48 seconds (see the video).

Well, that was the short answer for those who do not need the details. For the REST, let’s explain it in more detail.

AppNow is a minimalist model-driven tool to derive a backend from a class model. It exposes a textual DSL plus a projectional editor as its counterpart.
Once the model is ready, the application is:

  • generated to the MEAN stack (MongoDB, Express, AngularJS and Node.js)
  • deployed to a PaaS like Heroku.

In less than two minutes, any user, even with minimal or no development skills, can go from the model spec to a running implementation in the cloud and share it with others.

We think this tool is well suited for prototyping a backend in seconds and consuming it from mobile devices, for example.

Spreadsheets: where a lot of valuable data lives

Spreadsheets are ubiquitous on enterprise and home computers. In the absence of enterprise applications, business users employ spreadsheets to track, simulate, create budgets, control assets, make predictions, or gather raw data from many sources for further processing.

Per se, spreadsheets are gems for open data. Take them to the cloud using Office 365 or Google Spreadsheets and you have a good starting point to share the data with third parties.

However, a spreadsheet is just a set of sheets, each composed of a table of cells. That is, the data format and structure inside the spreadsheet are semi-formal and semi-structured: the data can be interpreted in many ways, and we need to capture the intention of the spreadsheet creator.

We have more formal ways to store data for further processing. Developers call them databases. Regardless of whether they are relational, graph, or document based, databases help to organize data into types and tuples, and to relate each piece of data to others in a way that is more convenient for data collection, filtering, and processing, especially if data arrives in high volume and/or frequently.

Adding features to AppNow to enable OpenData publication

At this point, we brainstormed about which features to add to AppNow to enable fast OpenData publication from spreadsheets, and we settled on the following ones:

  1. Derive a class from a spreadsheet workbook (assuming it contains tabular data).
  2. Import data into the generated app, so that the data in the spreadsheet can be processed and brought online.
  3. Export data back to well-known formats like CSV or XML: open data requires data to survive technology, ANY technology. This way, no app becomes a trap for data.
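To illustrate the third feature, here is a minimal Python sketch of exporting records to CSV and XML. It is not AppNow's implementation (the generated backend runs on the MEAN stack); the function names `to_csv` and `to_xml`, the tag names, and the sample record are assumptions made up for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET


def to_csv(records, fields):
    """Serialize a list of dicts to CSV text, one column per field."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for record in records:
        # Missing (null) values are written as empty cells.
        row = {f: "" if record.get(f) is None else record[f] for f in fields}
        writer.writerow(row)
    return buffer.getvalue()


def to_xml(records, fields, root_tag, item_tag):
    """Serialize the same records to XML, one child element per field."""
    root = ET.Element(root_tag)
    for record in records:
        item = ET.SubElement(root, item_tag)
        for f in fields:
            # Null values are simply omitted from the element.
            if record.get(f) is not None:
                ET.SubElement(item, f).text = str(record[f])
    return ET.tostring(root, encoding="unicode")


# Hypothetical record, not taken from the real dataset.
sample = [{"CountryName": "Spain", "CountryCode": "ESP", "GrowthPercent": None}]
fields = ["CountryName", "CountryCode", "GrowthPercent"]
csv_text = to_csv(sample, fields)
xml_text = to_xml(sample, fields, "Incomes", "Income")
```

Both functions work from plain dictionaries, so the exported files do not depend on the application that stored the data.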

After a couple of weeks implementing it, you can see the result here: video.

A sample data file representing income growth by country in 2013 was used to publish two datasets:

  • Country list and
  • Income by country

The model derived after the import is similar to the following one:

class Country
{
    string CountryName;
    string CountryCode;
    string? Region;
    string? IncomeGroup;
    string? SpecialNotes;
}
class Income
{
    string CountryName;
    string CountryCode;
    decimal? GrowthPercent;
}

The import process inspects the data cells to derive data types, infer whether the data is nullable, and provide sensible property names.
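The kind of inference described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not AppNow's actual algorithm: cells are assumed to arrive as strings, an empty cell is treated as a null, and the helper names `infer_type`, `property_name` and `derive_class` are hypothetical.

```python
import re
from decimal import Decimal, InvalidOperation


def infer_type(cells):
    """Return a DSL-style type name for one column of raw cell strings."""
    values = [c.strip() for c in cells if c is not None and c.strip() != ""]
    # Any empty cell makes the whole column nullable.
    nullable = len(values) < len(cells)

    def all_parse(parser):
        try:
            for v in values:
                parser(v)
            return True
        except (ValueError, InvalidOperation):
            return False

    # Try the narrowest type first and fall back to string.
    if values and all_parse(int):
        base = "int"
    elif values and all_parse(Decimal):
        base = "decimal"
    else:
        base = "string"
    return base + ("?" if nullable else "")


def property_name(header):
    """Turn a column header such as 'country code' into 'CountryCode'."""
    words = re.findall(r"[A-Za-z0-9]+", header)
    return "".join(w[:1].upper() + w[1:] for w in words)


def derive_class(class_name, headers, rows):
    """Build the class text from a header row and rows of cell strings."""
    lines = ["class " + class_name, "{"]
    for i, header in enumerate(headers):
        column = [row[i] for row in rows]
        lines.append("    %s %s;" % (infer_type(column), property_name(header)))
    lines.append("}")
    return "\n".join(lines)


# Hypothetical rows, not taken from the real dataset.
headers = ["Country Name", "Country Code", "Growth percent"]
rows = [["Spain", "ESP", "1.2"], ["Chile", "CHL", ""]]
model = derive_class("Income", headers, rows)
```

With these made-up rows, the third column contains one empty cell, so it is inferred as a nullable decimal, which matches the shape of the Income class shown above.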

After the model is complete, a full backend is generated and deployed on the Heroku platform.
See the final result here: http://acme-opendata.herokuapp.com/ using the password: icinetic.

Starting from this input spreadsheet:

  1. a model was derived from the spreadsheet,
  2. a full backend was deployed on the Heroku platform,
  3. the real data was imported from the spreadsheet into the backend,
  4. the backend exposes a full REST/JSON API documented using the Swagger format: Explore the API

Using item 4, the Swagger-documented API, together with Swagger client proxy generation tools (see http://swagger.io/ for a full list of resources), it is pretty straightforward to connect a client front-end to this backend.
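Even without a generated proxy, a REST/JSON API of this kind can be consumed with a few lines of code. The Python sketch below is an illustration under assumptions: the resource path `/api/incomes` is a placeholder (the real paths are listed in the Swagger description), the JSON property names are assumed to mirror the model properties, and authentication is left out entirely.

```python
import json
import urllib.request
from decimal import Decimal


def parse_incomes(payload):
    """Convert a JSON array of Income objects into (code, growth) pairs."""
    result = []
    for item in json.loads(payload):
        growth = item.get("GrowthPercent")
        # Keep nulls as None and convert numbers to Decimal via their text form.
        value = None if growth is None else Decimal(str(growth))
        result.append((item["CountryCode"], value))
    return result


def fetch_incomes(base_url, path="/api/incomes"):
    """Download and parse the dataset; the default path is hypothetical."""
    with urllib.request.urlopen(base_url + path) as response:
        return parse_incomes(response.read().decode("utf-8"))


# Hypothetical payload, used here instead of a live request.
example = '[{"CountryCode": "ESP", "GrowthPercent": 1.5}, {"CountryCode": "CHL", "GrowthPercent": null}]'
pairs = parse_incomes(example)
```

Parsing is kept separate from the HTTP call so that the conversion logic can be checked without network access.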

The data is published using a set of standard, open technologies. It is easily reproducible and exposed as REST/JSON in a way that is very easy to work with and consume.

The data publication took exactly 3 minutes and 48 seconds (the length of the video), as anticipated in the spoiler above.
That was agile enough, wasn’t it?

So, this is the end of the story for today.
A good next step would be to build a meaningful frontend application to interpret the data. Anyone?
