My Top 6 Takeaways from SQL Server Spatial 2012
SQL Server’s spatial capabilities have evolved significantly since their introduction in the 2008 release. Given the well documented beta (CTP), the March RTM, and April release, there has been a great deal of coverage of the new spatial features in SQL Server 2012. (See great posts by Alastair Aitchison and Lenni Lobel, and be sure to check out Ed Katibah’s blog. My favorite summary is a 12 minute video by Greg Low.)
I’d like to reflect on the six spatial enhancements to SQL Server 2012 that are of most interest to myself and Safe Software.
My Top 6 Takeaways from SQL Server Spatial 2012
Following the OGC model (like PostGIS), curve support comes with three new types: circular strings (sequences of circular arcs), compound curves (sequences of circular arcs and straight lines), and curve polygons (areas bounded by curved and/or straight segments).
SQL Server 2012 brings three major innovations to the table:
- Full support for existing operations: For example, you can get a curve’s length, or find the shortest line between anywhere on a curve and another geometry.
- A round-earth definition of curves: You can define curves in the geodetic “geography” type – an industry first – and again, all of the spatial operations just work. We see lots of lat/long data with curves inside. With SQL Server, you can now query and manipulate this data without stroking arcs into lines or projecting your data onto a locally-flat coordinate system.
- Buffering with curves: Buffering expands a geometry by a specified distance in every direction – and typically results in lots of round edges. SQL Server 2012 optionally represents the rounds bits as arcs, which makes the results more accurate and compact. As above, this works with both the locally-flat-earth (geometry) and globally-round-earth (geography) types.
The round-earth geography type can now represent shapes that don’t fit inside a hemisphere (very roughly: the ones that can’t be fully seen from any point in space). Taking that idea to the limit, there is also new way to specify “the whole earth”: FULLGLOBE. Bob Beauchemin has a nice pair of posts on the subject.
While there isn’t a lot of data that falls into this case, when you need it, it’s a big deal. It also means you can express concepts like “everywhere except ”, which can make for more expressive queries.
This extra power comes with extra responsibility – the sole difference between “somewhere” and “everywhere else” comes down to coordinate order – so more care is required to get this right.
Creating spatial indices in SQL Server 2008 is hard. Specifically, it’s not obvious how to choose good index parameters (and by good I mean “makes queries faster at all”). This is a big deal for two reasons: (i) For real-world data volumes, you need spatial indices to get reasonable performance, and (ii) If you’re using SQL Azure, you can’t even load spatial data without them.
So I am very excited that SQL Server 2012 lets you skip these parameters (you still need to know the bounds for projected data), and it figures everything out automatically.
One of the biggest hassles with SQL Server 2008’s round-earth geography type is that invalid data could not be loaded at all. SQL Server 2012 lets you load this data and repair it after – a huge convenience.
The new release also adds an improved validation method, IsValidDetailed(), that provides detailed answers to the question “why was my geometry considered invalid?”
As an experiment, I loaded a Shapefile containing country boundaries in lat/long into SQL Server 2008 and 2012. SQL Server 2008 rejects Fiji (because the data breaks it into two pieces along the 180 degree meridian, and multipolygons can’t share a common edge) and Antarctica (for a similar reason, but more generally due to the non-geodetic nature of the input data and issues near the poles). SQL Server 2012 loads all the data, and then provides tools to inspect and repair afterwards. Interestingly, it detects three invalid features, with Russia joining Fiji and Antarctica, perhaps due to precision issues (more on that below).
SQL Server 2012 uses more precision (48 bits instead of 27) for geometric operations, resulting in fewer slivers, less visible rounding of coordinates, and generally better results. In some cases, the improvements are dramatic: In Alastair’s blog, he gives an example of two lines; SQL Server 2008 says they don’t intersect, and SQL Server 2012 says they do. I’ve reproduced his results, and would like to add only one thing: a picture.
At Safe, we’ve spent a phenomenal amount of time battling problems like these, so this result isn’t entirely surprising. I do find it interesting how much easier it is for a person to see the answer, though (at least for this data).
An important aside: The increased precision is about processing, not storage. Coordinates are still stored as IEEE doubles, with their 53 bits of precision. Why do we care? It means the internal representation of existing geometric types (points, lines, …) hasn’t changed.
Relationship to SQL Azure
Today, SQL Server 2012 has more spatial functionality than SQL Azure – Azure is limited to operations that work on the old geometry types. Over time, that role will reverse, with new spatial features debuting in Azure, and eventually making their way to the next major release of SQL Server. This reinforces an ongoing trend – the cloud is getting more important.
Overall, I see SQL Server 2012’s spatial support as a well-thought out, significant evolution from the 2008 release, with improvements for all the major pain points. Are you using SQL Server Spatial? If so, what do you think of the new functionality?