It’s said that most data has a spatial component (location matters) and that asking “where” questions can lead to useful insight. Sometimes there can be more than one aspect or way to frame something’s location, size, or shape – and different approaches have been taken to capture these relationships.

In particular, I’d like to highlight the different paths GIS and spatial databases have taken in response to this challenge. I’ll then outline some considerations you might face when working with data that spans systems which have taken different approaches.

Examples of Data Spatially Represented in Multiple Ways

  1. The outline of a geographic area could be stored with different levels of detail based on the scale of interest (e.g. as a polygon with many vertices, a polygon with few vertices, or as a single point).
  2. Similarly, you might have a number of alternative representations (e.g. choosing to represent some lakes as areas, some as points, but none as both).
  3. A road might be represented by both its center line and by lines representing its edges.
  4. In some cases, you might want to redundantly store the same geometry in different coordinate systems or storage representations.

(This discussion is restricted to vector data, but in the broader sense you could have vector, raster, 3D models, LiDAR, etc. for the same entity or area.)

The Evolution of GIS and Spatial Databases
The traditional GIS model is that each individual entity (called a feature) consists of a single geometry and a set of attributes. These are organized into layers by common themes (e.g. roads, administrative areas), which are often constrained to a single geometric type (e.g. all points, lines, or areas). In this model, the scenarios above might be handled by multiple layers linked together by a common id.
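To make the traditional model concrete, here is a minimal Python sketch. The `Feature` and `Layer` classes, the `road_id` attribute, and the layer names are all hypothetical and invented for illustration; they are not taken from any particular GIS product. The sketch shows one geometry per feature, layers restricted to one geometry type, and a shared id tying together two representations of the same road.

```python
from dataclasses import dataclass, field

Coordinate = tuple[float, float]


@dataclass
class Feature:
    """One entity: exactly one geometry plus a set of attributes."""
    feature_id: int
    geometry_type: str            # "point", "line", or "area"
    coordinates: list[Coordinate]
    attributes: dict[str, object] = field(default_factory=dict)


@dataclass
class Layer:
    """A themed collection of features restricted to one geometry type."""
    name: str
    geometry_type: str
    features: list[Feature] = field(default_factory=list)

    def add(self, feature: Feature) -> None:
        # Enforce the single-geometry-type constraint of the layer.
        if feature.geometry_type != self.geometry_type:
            raise ValueError(
                f"layer {self.name!r} only accepts {self.geometry_type} features"
            )
        self.features.append(feature)


# A road's center line and its edges live in two separate layers and are
# tied together by a shared (hypothetical) road_id attribute.
centerlines = Layer("road_centerlines", "line")
edges = Layer("road_edges", "line")

centerlines.add(Feature(1, "line", [(0.0, 0.0), (10.0, 0.0)], {"road_id": 42}))
edges.add(Feature(1, "line", [(0.0, 1.0), (10.0, 1.0)], {"road_id": 42}))
edges.add(Feature(2, "line", [(0.0, -1.0), (10.0, -1.0)], {"road_id": 42}))


def representations(road_id: int, *layers: Layer) -> dict[str, list[Feature]]:
    """Collect every feature sharing the given road_id, grouped by layer name."""
    result: dict[str, list[Feature]] = {}
    for layer in layers:
        result[layer.name] = [
            f for f in layer.features if f.attributes.get("road_id") == road_id
        ]
    return result


found = representations(42, centerlines, edges)
print({name: len(features) for name, features in found.items()})
```

The cost of this design is visible in `representations`: nothing in a single layer tells you that other representations exist, so the link has to be followed explicitly through the common id.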

Some of the first approaches to storing spatial data in relational databases followed the same idea, using primitive types to model layers of features, each with a single geometry and a set of attributes. This strategy remains in common use with database systems that don’t otherwise support spatial data, typically following the first option in OGC’s Simple Features for SQL specification, which stores geometry in ordinary numeric or binary columns rather than in a dedicated geometry type.
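As a rough illustration of the primitive-types approach, the sketch below uses Python’s built-in `sqlite3` module to store a polygon ring as one row per vertex in plain numeric columns. The table and column names (`lakes`, `lake_geometry`, `geom_id`, `seq`) are my own assumptions, loosely in the spirit of a normalized geometry schema; this is not the exact schema defined by the OGC specification.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Feature table: attributes plus a key into the geometry table.
    CREATE TABLE lakes (
        lake_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        geom_id INTEGER NOT NULL
    );
    -- Geometry table: one row per vertex, using only numeric columns.
    CREATE TABLE lake_geometry (
        geom_id INTEGER NOT NULL,
        seq     INTEGER NOT NULL,
        x       REAL NOT NULL,
        y       REAL NOT NULL,
        PRIMARY KEY (geom_id, seq)
    );
    """
)

# A closed ring (first vertex repeated at the end) for a made-up lake.
ring = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (0.0, 0.0)]
conn.execute("INSERT INTO lakes VALUES (1, 'Example Lake', 100)")
conn.executemany(
    "INSERT INTO lake_geometry VALUES (100, ?, ?, ?)",
    [(seq, x, y) for seq, (x, y) in enumerate(ring)],
)


def load_ring(connection, lake_id):
    """Rebuild the vertex list for one lake by joining and ordering by seq."""
    rows = connection.execute(
        """
        SELECT g.x, g.y
        FROM lakes AS l
        JOIN lake_geometry AS g ON g.geom_id = l.geom_id
        WHERE l.lake_id = ?
        ORDER BY g.seq
        """,
        (lake_id,),
    ).fetchall()
    return [(x, y) for x, y in rows]


restored = load_ring(conn, 1)
print(restored == ring)
```

Note that the database itself has no idea these rows form a polygon. Any spatial logic has to live in the application.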

Over time, many relational databases have introduced first-class spatial types. This means they treat geometry just like strings, numbers, or dates, which has made it much easier to work with and analyze spatial data in these systems. Another implication is that their tables may contain more than one spatial column – just like they might contain more than one numeric column – and that presents an alternative way to model the four scenarios discussed above.
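Here is a small sketch of what a table with more than one spatial column might look like. SQLite has no built-in geometry type, so as an assumption for this example I use WKT text columns as stand-ins for what would be native geometry columns in a spatially enabled database. The table, column names, and the `geometry_for` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE lakes (
        lake_id         INTEGER PRIMARY KEY,
        name            TEXT NOT NULL,
        outline_wkt     TEXT,   -- detailed polygon for large-scale maps
        label_point_wkt TEXT    -- single point for small-scale maps
    )
    """
)
conn.execute(
    "INSERT INTO lakes VALUES (?, ?, ?, ?)",
    (1, "Example Lake", "POLYGON ((0 0, 4 0, 4 3, 0 3, 0 0))", "POINT (2 1.5)"),
)

# Map each use case to a column name. Only these fixed names are ever
# placed into the SQL text; the lake id is passed as a bound parameter.
COLUMN_FOR_SCALE = {"detailed": "outline_wkt", "overview": "label_point_wkt"}


def geometry_for(connection, lake_id, scale):
    """Return the representation of one lake that suits the requested scale."""
    column = COLUMN_FOR_SCALE[scale]
    row = connection.execute(
        f"SELECT {column} FROM lakes WHERE lake_id = ?", (lake_id,)
    ).fetchone()
    return row[0] if row else None


print(geometry_for(conn, 1, "overview"))
print(geometry_for(conn, 1, "detailed"))
```

Both representations sit in the same row as the attributes, so no join or common id is needed to move from one to the other.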

The idea of modeling features as having more than one spatial representation isn’t unique to spatial databases. For example, the GE Smallworld GIS uses this concept, as does the GML interchange format.

Practical Implications
There are a few things to consider when integrating or converting data from/between formats or systems using these different approaches:

When modeling spatial data which may be usefully represented in different ways, consider how your chosen software best handles this case, and determine if the added value is worth the extra complexity. When using and integrating this data, it is important to keep the different representations and relationships in mind so that you make the most of your investment.
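One conversion you might run into is moving between the two models: splitting features that carry several named geometries into single-geometry layers linked by a common id, and merging them back. The sketch below shows one way this could look in Python. The dictionary layout, the `feature_id` key, and the function names are assumptions made for illustration, not the behavior of any particular tool.

```python
from collections import defaultdict


def split_by_geometry(features):
    """Turn multi-geometry features into one layer per geometry name.

    Each output record has a single geometry and carries the source
    feature's id so the layers can be linked again later.
    """
    layers = defaultdict(list)
    for feature in features:
        for geom_name, wkt in feature["geometries"].items():
            layers[geom_name].append(
                {
                    "feature_id": feature["id"],
                    "geometry": wkt,
                    "attributes": dict(feature["attributes"]),
                }
            )
    return dict(layers)


def merge_layers(layers):
    """Rebuild multi-geometry features by grouping records on feature_id."""
    merged = {}
    for layer_name, records in layers.items():
        for record in records:
            feature = merged.setdefault(
                record["feature_id"],
                {
                    "id": record["feature_id"],
                    "attributes": dict(record["attributes"]),
                    "geometries": {},
                },
            )
            feature["geometries"][layer_name] = record["geometry"]
    return list(merged.values())


roads = [
    {
        "id": 42,
        "attributes": {"name": "Main St"},
        "geometries": {
            "centerline": "LINESTRING (0 0, 10 0)",
            "left_edge": "LINESTRING (0 1, 10 1)",
        },
    },
    {
        "id": 43,
        "attributes": {"name": "Side Rd"},
        "geometries": {"centerline": "LINESTRING (0 5, 10 5)"},
    },
]

layers = split_by_geometry(roads)
print({name: len(records) for name, records in layers.items()})
print(merge_layers(layers) == roads)
```

Splitting duplicates the attributes into every layer, which is exactly the kind of redundancy worth weighing when deciding whether the extra representations are worth the complexity.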


Paul Nalos


3 Responses to “When Features and Geometry Are Not 1:1”

  1. Tiana Warner says:

    I’ve read a lot about the increasingly-popular NoSQL movement lately. Big names like Google, Facebook and Twitter have adopted it and it seems likely that it will continue to spread. Do you think such a non-relational database might be a good solution for storing spatial data? Maybe NoSQL’s “graph databases” would be able to overcome some limitations imposed by relational databases?

  2. Paul Nalos says:

    That’s a great point, Tiana. While I’m not an expert in this area, it’s easy to see there are use cases – like high-volume websites with large amounts of frequently updated data – where the NoSQL approaches are getting huge traction. I also see a lot of interest on the web in storing and querying spatial data in these emerging storage systems (e.g., with Google App Engine). Maybe it’s time for me to check out MongoDB and try their spatial querying solution. Exciting times, indeed.

  3. […] This post was mentioned on Twitter by Nathaniel V. KELSO and Nathaniel V. KELSO, Safe Software. Safe Software said: "When Features and Geometry Are Not 1:1" – new blog post from Paul Nalos on It's All About Data […]
