Example transit data download service.


Over the last few years, governments, businesses, research organizations, and others have embraced the open data movement, to enormous benefit. To steal a phrase from FME World Tour presenter Chris Rado: it makes information available to city agencies, to the public, and to Batman.

This post will explore what open data means, who produces and consumes it, best practices for integrating it in your workflows, and recommendations for sharing your own data with the world.

It’s kind of a big deal.

Open data is free data that anyone can use for any purpose—and public interest in it is exploding. Take a look at this graph from Canada’s Open Government website.

Canada Open Government analytics

Source: Canadian Open Government Analytics

This is the number of Americans who have expressed interest in moving to Canada since—Wait, no, sorry. It’s the number of visitors to the Open Government Portal over the last year.

Why is it so popular? Open data means we have access to information about the places, businesses, and organizations we care about. It means government transparency, to a degree that would make George Orwell proud. It enables collaboration, innovation, and scientific and technological advancement (to a degree that must make Bill Nye proud).

Open data is useful for:

Where can I find open data?

Governments are the main providers of open data. Most Western governments are actually mandated to provide it. You can find portals at the city, state, and federal levels, plus via intergovernmental organizations. For example, Surrey Heath in England publishes all of its expenditures in this (FME-hosted!) data download portal. In the City of Vancouver portal, you can also find such extremely urgent data as where the nearest food truck is.

CBC: “This is data that’s ultimately been paid for by taxpayers, one way or another”

Non-governmental organizations (NGOs) and nonprofits (NPOs) have always supported the democratization of data, and are now producing open data themselves. Crowd-sourced open data and geomapping can ensure that time and money are spent where they’re needed most. For example, within 48 hours of the Nepal earthquake, a crowd-sourced relief effort had mapped thousands of miles of roads and tens of thousands of buildings. These maps made it possible to plan rescues.

A significant number of private companies now offer open data, having realized that transparency is power. Adopting open data practices has helped corporations improve net profits.

Academic institutions, including universities and scientific research organizations, are also sharing data. There are complexities around sharing scientific data, but progress is being made towards more transparent biomedical research. Already the Accelerating Medicines Partnership (AMP) has collaborated on data in three disease areas.

This FME workspace reads from the City of Vancouver portal, and then uses the ChangeDetector to check for changes.


How to use open data

It’s easy to consume data directly from a portal in FME Workbench. You can read from a URL or FTP by pasting the link into the reader. Then you can do anything you want with it, like integrate with other sources, manipulate the content and structure, turn it into your face, and/or write it out to whatever format you need it in.
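If you’d rather see the same idea outside of Workbench, here is a minimal Python sketch of reading a CSV export from a portal link and filtering it. It is only an illustration, not how the FME reader works: the function names, the column names, and the inline sample data are all made up, and you would pass your own portal URL to `fetch_text`.

```python
import csv
import io
import urllib.request


def fetch_text(url, timeout=30):
    """Download a text resource (e.g. a CSV export) from a portal URL."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset)


def parse_csv(text):
    """Turn CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


# Hypothetical sample standing in for a downloaded file, so this runs offline.
# With a real portal you would call: parse_csv(fetch_text(your_portal_url))
SAMPLE = "vendor_id,name,status\n101,Taco Cart,open\n102,Noodle Van,closed\n"

rows = parse_csv(SAMPLE)
open_vendors = [row["name"] for row in rows if row["status"] == "open"]
print(open_vendors)
```

Keeping the download and the parsing in separate functions means the same parsing step works whether the text came from a URL, an FTP link, or a local file.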

To monitor for changes, send the data through the ChangeDetector to get only the updates.
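Conceptually, change detection means comparing yesterday’s snapshot with today’s and keeping only what differs. The Python sketch below shows that idea on plain records. It is not the ChangeDetector’s implementation, and the `id` key and sample records are assumptions for illustration.

```python
def detect_changes(original, revised, key="id"):
    """Compare two snapshots of records and classify the differences.

    Records are dictionaries; `key` names the field that identifies a record.
    """
    old = {record[key]: record for record in original}
    new = {record[key]: record for record in revised}

    added = [new[k] for k in new if k not in old]
    deleted = [old[k] for k in old if k not in new]
    updated = [new[k] for k in new if k in old and new[k] != old[k]]
    return {"added": added, "deleted": deleted, "updated": updated}


# Hypothetical snapshots of a food truck dataset.
yesterday = [
    {"id": 1, "name": "Taco Cart", "status": "open"},
    {"id": 2, "name": "Noodle Van", "status": "open"},
]
today = [
    {"id": 1, "name": "Taco Cart", "status": "closed"},  # attribute changed
    {"id": 3, "name": "Waffle Wagon", "status": "open"},  # new record
]

changes = detect_changes(yesterday, today)
for kind, records in changes.items():
    print(kind, [record["id"] for record in records])
```

Unchanged records are simply dropped, which is what keeps the downstream update small.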

To turn data into your face… I don’t know. You’ll have to ask Dmitri about that one.

For total automation, you can set up FME Server to pull the data at scheduled intervals.

Tips for creating your own open data portal

Making your data open is trivial. Just make a GeoCities page and put a link to the PDF file, right? Well, creating a nice, useful open data portal takes planning.

Scientific American: “In the case of open data sites, what we want to make are tools that make data understandable to humans, but also, to the search engines that humans use to explore the web.”

Homepage for the City of Surrey’s award-winning Open Data portal, powered by FME.

For a top-notch example, check out the City of Surrey’s portal, which has won the Open Data for Democracy and the Canadian Open Data Excellence 2016 awards. They offer a vast range of data in an easy-to-navigate site, plus they let users draw a polygon on a map to pick the exact area they want to download. And yes, it is powered by FME.

If you want to make your data public, here are a few tips.

1. Update the datasets frequently

The problem with many open datasets is that they’re static and rarely updated. Make sure your data is updated regularly. You should also provide your data as a published feed (e.g. RSS) or API rather than as static downloadable files. People can then consume the endpoint directly, and any updates you make are automatically reflected in their apps.

You should also connect your portal directly to your master database rather than duplicating the data across two locations.

You can set up FME to do this for you. Create an FME workspace to synchronize your portal with your database, and use the ChangeDetector to apply just the updated fields instead of reloading entire datasets every time. You’d use FME Server scheduling to run the synchronization process automatically, and FME Server data streaming to provide the feed.

2. Offer coordinate system choices

For spatial data, offer more than one option for the coordinate system. Your end users might want Spherical Mercator (EPSG:3857) for a web mapping application, or WGS84 lat/long (EPSG:4326) for GPS navigation systems, or a precise local projection like State Plane. Give them the freedom to pick the one they want. We recommend offering both local and global projections.

In FME, you do this by creating a published parameter in Workbench, so the choice is made at runtime. To reproject the data, you would use the Reprojector transformer (which uses the CS-Map reprojection engine, though others are available, e.g. Blue Marble, Gtrans, Esri).
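To give a feel for what a reprojection involves, here is a small Python sketch that converts WGS84 lat/long (EPSG:4326) to Spherical Mercator (EPSG:3857) using the standard spherical formulas. The function name is made up for illustration, and this is not what the Reprojector does internally. Local projections like State Plane need a full projection library rather than a few lines of math.

```python
import math

EARTH_RADIUS_M = 6378137.0    # sphere radius used by EPSG:3857
MAX_LATITUDE = 85.0511287798  # Spherical Mercator is undefined near the poles


def wgs84_to_web_mercator(lon_deg, lat_deg):
    """Convert WGS84 longitude/latitude in degrees to EPSG:3857 metres."""
    if abs(lat_deg) > MAX_LATITUDE:
        raise ValueError("latitude is outside the Spherical Mercator range")
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4 + math.radians(lat_deg) / 2)
    )
    return x, y


# Approximate coordinates for downtown Vancouver, used as sample input.
x, y = wgs84_to_web_mercator(-123.1207, 49.2827)
print(f"x={x:.1f} m, y={y:.1f} m")
```

The latitude check matters because the formula grows without bound towards the poles, which is why web maps cut off at roughly 85 degrees north and south.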

3. Ensure the data is good quality

Make sure your data is of good quality before making it public. This includes validating geometry, attributes, standards compliance, format-specific issues like XML / JSON structure, and more. Consult our data quality checklist for a thorough guide to geospatial data QA.

In FME, this can be done automatically using validation transformers like the AttributeValidator, GeometryValidator, XMLValidator, Tester, and others.
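As a rough illustration of the kinds of checks involved, the Python sketch below validates a GeoJSON-style polygon feature: required attributes must be present, rings must be closed with at least four positions, and coordinates must fall within WGS84 bounds. The function name, the attribute names, and the sample features are assumptions, and the checks are far less thorough than the FME transformers.

```python
def validate_feature(feature, required_attributes):
    """Return a list of problems found in a GeoJSON-like polygon feature."""
    problems = []

    # Attribute check: every required attribute must have a non-empty value.
    properties = feature.get("properties") or {}
    for name in required_attributes:
        if properties.get(name) in (None, ""):
            problems.append(f"missing attribute: {name}")

    # Geometry checks: polygon rings must be closed and inside WGS84 bounds.
    geometry = feature.get("geometry") or {}
    if geometry.get("type") != "Polygon":
        problems.append("geometry is not a Polygon")
        return problems
    for index, ring in enumerate(geometry.get("coordinates", [])):
        if len(ring) < 4:
            problems.append(f"ring {index} has fewer than 4 positions")
        elif ring[0] != ring[-1]:
            problems.append(f"ring {index} is not closed")
        if any(not (-180 <= p[0] <= 180 and -90 <= p[1] <= 90) for p in ring):
            problems.append(f"ring {index} has a coordinate outside WGS84 bounds")
    return problems


# Hypothetical features: one clean, one with several problems.
good = {
    "properties": {"name": "Park", "zone": "P1"},
    "geometry": {
        "type": "Polygon",
        "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 0]]],
    },
}
bad = {
    "properties": {"name": ""},
    "geometry": {
        "type": "Polygon",
        "coordinates": [[[0, 0], [200, 0], [1, 1], [0, 1]]],
    },
}

print(validate_feature(good, ["name", "zone"]))
print(validate_feature(bad, ["name", "zone"]))
```

Returning a list of problems, rather than stopping at the first one, lets you report everything wrong with a feature in a single pass.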

4. Offer format choices

By definition, open data should be easy for the public to use. Offer a choice with respect to format. Here are our recommendations:

Open Data as a PDF: Looks nice, kinda useless.™


5. Choose the right delivery solution

As for delivering the data, here are a few solutions you can leverage (alphabetically, not necessarily ordered by awesomeness). We have a webinar and an ebook that explore these in detail.

Free the data!

We’re going to be seeing a lot more open data in the world due to overwhelming popularity. Plus, there are no excuses when it comes to the technical side of things. With the rise of the cloud and automation tools like FME, it’s straightforward and cheap (and fun!) to create an open data portal.

Of course, there will be demand for higher quality data, not just more of it. Open data must be easy to find, use, and collaborate on. We also expect to see open data become normalized so it’s easier to compare cities globally.

With free access to data, citizens will be able to use it for amazing things. Canadian Open Data Experience (CODE), for example, was a hackathon focused on using open data to solve problems and increase productivity. NYC BigApps 2015 was another one aimed at building tools to overcome pressing civic challenges.

Further reading / watching:


Tiana Warner

Tiana is a product marketing manager at Safe Software. Her background in computer programming and creative hobbies led her to be one of the main producers of creative content for Safe Software. Tiana spends her free time writing fantasy novels and riding her horse, Bailey.

Comments

One response to “Guide to Open Data: Using it, Sharing it, and Creating a Portal”

  1. Joecarpenter says:

    Thanks for this post. Its really very helpful in creating a open data portal. Expecting some more posts related to this.
