Upon asking myself “what is the difference between a geodatabase and geospatial database?”, and then disappointingly finding no obvious answers online for something that seemed like a simple question, I decided to write this, with the noble intention of saving someone else a little bit of grief and Googling time (and we can now say there are TWO of us who have asked this question!).
While this may seem like a simple question, the answer is less simple than I would have liked (probably why no one wrote about this previously). Plus, I’ll fully admit that getting started in the spatial industry is like learning a new language. Between acronyms, synonyms, colloquialisms, and any other -ym or -ism you want to add, I still find myself questioning, “do I get this now?” (don’t worry, smart people fact-checked me, so the following writing is factually accurate).
So, join me on this short journey, where we set out to define what a geodatabase is and compare and contrast it with a geospatial database, for glory and the greater good of the internet, and those of us who are just trying our best to learn.
The Origins and History of the Geodatabase
If someone says “geodatabase” and you automatically think Esri, you’ve got half of the definition down.
Going by the components of its name, you can infer that a geodatabase probably deals with geographic or spatial data, and has some kind of database component: a structure for storing, organizing and accessing data. By Esri’s current definition, a geodatabase is “a collection of geographic datasets of various types held in a common file system folder”.
Put into my own words, the simplest definition that I could come up with is:
A geodatabase is Esri’s proprietary file system folder for storing and managing geographic datasets.
What Makes a Geodatabase Database-Like Anyways?
Before breaking things down, let’s cover what a geodatabase looks like on your computer versus in Esri’s ArcGIS. On your computer, a geodatabase will be a single folder with the extension .gdb. If you try to look at the elements in this folder in your computer’s native file browser, they aren’t really human-readable.
A .gdb folder existing on a user’s computer. This shows some of the contents of the folder once expanded.
But, when you open up this folder in ArcGIS or a compatible database system, a geodatabase will have more accessible content. You’ll find the contents are multiple spatial datasets and corresponding information on how they relate to each other and should be organized. Storing multiple items together makes it easy to access, share, and query multiple related spatial datasets.
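Since a file geodatabase is, on disk, just a folder whose name ends in .gdb, you can at least locate and size one up without any GIS software. Here is a minimal sketch using only Python’s standard library. The function names are my own, and it only inspects the folder from the outside; it does not read the datasets inside.

```python
from pathlib import Path


def find_file_geodatabases(root: Path) -> list[Path]:
    """Return every directory under `root` whose name ends in .gdb."""
    return sorted(p for p in root.rglob("*.gdb") if p.is_dir())


def summarize(gdb: Path) -> dict:
    """Count the internal files and total bytes in one .gdb folder."""
    files = [f for f in gdb.rglob("*") if f.is_file()]
    return {
        "name": gdb.name,
        "file_count": len(files),
        "total_bytes": sum(f.stat().st_size for f in files),
    }
```

For example, `[summarize(g) for g in find_file_geodatabases(Path("data"))]` would give a quick inventory of every geodatabase under a (hypothetical) `data` directory.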
Geodatabases were designed to mirror relational databases. If you’re not familiar with what a relational database is, just understand that:
- Data is organized in tables with rows and columns
- Columns have specific attribute properties (rules) that apply to all data entries
- Tables are uniquely identified and have defined relationships with one another
- You can query information in these tables with a language called SQL (Structured Query Language)
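If those four points feel abstract, here is a small sketch of them using SQLite through Python’s built-in `sqlite3` module. The table names, columns, and sample rows are invented for illustration; this is a plain relational database, not a geodatabase.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces table relationships when foreign keys are switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Tables with rows and columns; each column carries rules (type, NOT NULL).
conn.executescript("""
    CREATE TABLE postal_code (
        code TEXT PRIMARY KEY,
        city TEXT NOT NULL
    );
    CREATE TABLE address (
        id INTEGER PRIMARY KEY,
        street TEXT NOT NULL,
        postal_code TEXT NOT NULL REFERENCES postal_code(code)
    );
""")

conn.execute("INSERT INTO postal_code VALUES ('A1A 1A1', 'Exampleville')")
conn.executemany(
    "INSERT INTO address (street, postal_code) VALUES (?, ?)",
    [("1 Main St", "A1A 1A1"), ("2 Main St", "A1A 1A1")],
)

# Query across the relationship with SQL.
rows = conn.execute("""
    SELECT address.street, postal_code.city
    FROM address
    JOIN postal_code ON address.postal_code = postal_code.code
    ORDER BY address.id
""").fetchall()

# The relationship is enforced: an address with an unknown postal code is refused.
try:
    conn.execute(
        "INSERT INTO address (street, postal_code) VALUES ('9 Nowhere Rd', 'Z9Z 9Z9')"
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```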
Geodatabases do all of the above, but also:
- Have some tables that are user defined and others that are system defined. Traditionally, all tables in a relational database are user defined. This difference goes back to the ‘proprietary’ part of the definition – Esri needs to maintain a specific type of structure for some of the data for it to be read and used within their software.
- Can be exported as a single folder. This means that the data can be easily viewed in any compatible software that supports reading information from a geodatabase.
Storing Data as Tables
Let’s cover the pieces actually making up the geodatabase puzzle. Each dataset stored within a geodatabase is stored as a table. Geodatabase tables are made up of rows of data defined by columns, and store information as numbers, text, dates, BLOBs (binary large objects), or global identifiers (a unique ID to a feature or row). These tables can have relationships to each other, thus giving them database-like capabilities.
Though there are many types of datasets that are geodatabase-compatible, these are some of the typical datasets you can expect to find in a geodatabase:
- Regular Ol’ Tables
Just because it’s a geodatabase doesn’t mean everything will always have a geospatial component. Tables with additional non-spatial data are stored as tables in a geodatabase.
- Vector Datasets (as Feature Class Tables)
In many GIS applications, “features” is a synonym for points, lines, and/or polygons: the geospatial information. A feature class is a group of features, like road networks or mailbox locations, that share the same properties, such as an identical geometry type (ex. all points), shared attributes, and/or the same spatial reference (ex. one coordinate system in 2D space).
Within a geodatabase, each feature class has a corresponding feature class table, which stores the attributes for its features. In the table, each row is an individual feature (like a single address) and columns store information about the properties of the feature (like what the postal code is for that address, or when this information was last updated).
Addresses.gdb open in FME showing the qualities of the feature class PostalAddress. 1) Feature Class 2) Attributes of selected feature 3) Feature (a point) that has been selected 4) the PostalAddress table storing attributes, also showing the point selected.
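To make the “one row per feature, one column per attribute” idea concrete, here is a rough Python model of a feature class. To be clear, this is my own simplification and not Esri’s storage format: the class names, field names, and coordinates are all hypothetical. `EPSG:4326` is the usual identifier for the WGS 84 latitude/longitude coordinate system.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    geometry_type: str   # "point", "line", or "polygon"
    coordinates: tuple   # (x, y) for a point; a tuple of (x, y) pairs otherwise
    attributes: dict     # column name -> value for this row


@dataclass
class FeatureClass:
    name: str
    geometry_type: str         # every feature must share this geometry type
    spatial_reference: str     # one coordinate system for the whole class
    fields: tuple[str, ...]    # the attribute columns every row must fill in
    rows: list[Feature] = field(default_factory=list)

    def add(self, feature: Feature) -> None:
        """Append a feature as a new row, enforcing the shared properties."""
        if feature.geometry_type != self.geometry_type:
            raise ValueError(f"{self.name} only accepts {self.geometry_type} features")
        if set(feature.attributes) != set(self.fields):
            raise ValueError(f"attributes must be exactly {self.fields}")
        self.rows.append(feature)


# Hypothetical feature class modelled on the PostalAddress example above.
postal_address = FeatureClass(
    name="PostalAddress",
    geometry_type="point",
    spatial_reference="EPSG:4326",
    fields=("postal_code", "last_updated"),
)
postal_address.add(
    Feature("point", (-123.0, 49.0), {"postal_code": "A1A 1A1", "last_updated": "2023-01-01"})
)
```

Trying to add a line or polygon to `postal_address` raises a `ValueError`, which is the “identical geometry type” rule in action.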
- Raster Datasets
Geodatabases can also store more complex data like raster imagery. Rasters store data composed of pixels. These pixels communicate value or measurement, and, if georeferenced, have geographic properties that specify where those pixels are located and should be displayed.
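Georeferencing boils down to a bit of arithmetic that maps a pixel’s row and column to a location on the map. Here is a minimal sketch of that idea, assuming the simplest case: a north-up raster with square-edged pixels, an origin at the top-left corner, and no rotation. The class name and the sample numbers are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeoReference:
    origin_x: float      # map x of the raster's top-left corner
    origin_y: float      # map y of the raster's top-left corner
    pixel_width: float   # map units per pixel in the x direction
    pixel_height: float  # map units per pixel in the y direction (positive)

    def pixel_center(self, row: int, col: int) -> tuple[float, float]:
        """Map coordinates of the centre of the pixel at (row, col)."""
        x = self.origin_x + (col + 0.5) * self.pixel_width
        # Rows count downward from the top edge, so y decreases as row grows.
        y = self.origin_y - (row + 0.5) * self.pixel_height
        return (x, y)


# A tiny 2 x 3 raster of pixel values (for example, elevations in metres).
values = [
    [12, 15, 11],
    [14, 18, 13],
]
georef = GeoReference(origin_x=500000.0, origin_y=4000000.0,
                      pixel_width=10.0, pixel_height=10.0)

# Where on the map does the value at row 1, column 2 belong?
location = georef.pixel_center(1, 2)
value = values[1][2]
```

Without the `GeoReference`, `values` is just a grid of numbers; with it, every pixel has a place in the world.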
Where is a Geodatabase Stored?
Geodatabases are stored in one of two ways:
- as a folder located on a user’s computer, or
- within a relational database, such as Oracle, Microsoft SQL Server, or IBM Db2, maintained by the user or their organization.
The storage method selected depends on the kind of geodatabase being used (see table below).
Expanding the definition:
A geodatabase is Esri’s proprietary file system folder for storing and managing geographic datasets – which are held in tables about feature classes or raster datasets and stored on a user’s machine or within a relational database.
What Kinds of Geodatabases Are There?
Here’s a comparison table of the current geodatabase types that exist and some of their qualities.
|Name||Personal Geodatabase||File Geodatabase||Enterprise Geodatabase||Mobile Geodatabase|
|What is It?||One of the original geodatabase types, which relies on Microsoft Access for storage.||For users working with minimal collaboration, this is a single folder containing many datasets.||For large geodatabases that are constantly being edited, updated and accessed by multiple users in an enterprise.||Stored as a SQLite database, which makes it well suited to single-user work on mobile devices.|
|Where Does It Store Data?||In a single Microsoft Access file (.mdb).||As a single folder, containing individual files for each dataset, which a user can save wherever they want.||In one of the following databases: Oracle, Microsoft SQL Server, IBM Db2, PostgreSQL, or SAP HANA.||Within a SQLite database, as a single file.|
|Size Limits||2 GB per Access database||1 TB (unless scaled for imagery)||Dependent on the database management system (DBMS)||2 TB|
|Who Can Access It?||1 person editing, anyone can read||1 person editing, anyone can read; best for information on a local machine||Multiple editors, anyone can read; best for shared data and access||1 person editing, anyone can read; best for information on a local machine|
|Strengths||Can be easily deployed, without a lot of overhead.||Local access, but can be licensed to other users to view.||Data can be versioned, so multiple people can edit at the same time.||Built on open source SQLite, so it can be deployed flexibly on platforms such as mobile.|
|Things to Consider||32-bit Windows only. Not supported in ArcGIS Pro; largely deprecated at this point.||No versioning of data or multi-user support.||Requires fluency in the DBMS, and maintaining a deployment. Versioning can lead to merge conflicts. Only available on some DBMS.||No versioning of data or multi-user support.|
What Then, is a Geospatial Database?
A geospatial database is miles easier to define (hahaha, spatial pun. Yeah, I know I’m the only one laughing). Here’s a quick test to determine if what you are using is a spatial database:
- Is it a standard database? Y/N?
- Does it store spatial data? Y/N?
If you answered yes to both of these questions, then great, you’ve identified a geospatial database!
The definition: A geospatial database is a database capable of storing spatial data.
A geospatial database is just a standard database that has been extended to support spatial data. To do this, a database adds the ability to:
- Natively store spatial data within its existing data model
- Let users write queries within a spatial context, instead of just with attributes
- Index spatial data so that queries stay performant
The term “standard database” itself is also a huge generalization. Databases exist in different flavours – we already mentioned relational databases above, but you may have also heard of something like NoSQL or graph databases or any number of others. I won’t get into defining those here but it’s important to understand that regardless of the base flavour, the above list of things that a standard database must do to support spatial data still applies.
Expanding the definition: A geospatial database is a database capable of storing, querying, and indexing spatial data in an efficient manner.
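Storing, querying, and indexing can sound hand-wavy, so here is a toy, in-memory illustration of all three in Python. It is emphatically not how any real database is implemented: I’m using a simple uniform grid as a stand-in for the more sophisticated spatial indexes real systems use, and the class and method names are invented.

```python
from collections import defaultdict


class PointStore:
    """Toy store: keeps points, indexes them on a grid, answers box queries."""

    def __init__(self, cell_size: float = 1.0):
        self.cell_size = cell_size
        self.points = {}               # id -> (x, y, attributes)
        self.grid = defaultdict(set)   # (cell_x, cell_y) -> ids in that cell

    def _cell(self, x: float, y: float) -> tuple[int, int]:
        return (int(x // self.cell_size), int(y // self.cell_size))

    def insert(self, point_id: str, x: float, y: float, **attributes) -> None:
        """Store the point and register it in the grid index."""
        if point_id in self.points:
            # Replacing an existing point: drop its old index entry first.
            old_x, old_y, _ = self.points[point_id]
            self.grid[self._cell(old_x, old_y)].discard(point_id)
        self.points[point_id] = (x, y, attributes)
        self.grid[self._cell(x, y)].add(point_id)

    def query_bbox(self, min_x, min_y, max_x, max_y) -> list[str]:
        """Spatial query: ids of all points inside the bounding box."""
        cx0, cy0 = self._cell(min_x, min_y)
        cx1, cy1 = self._cell(max_x, max_y)
        hits = []
        # Only grid cells overlapping the box are visited, not every point.
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                for pid in self.grid.get((cx, cy), ()):
                    x, y, _ = self.points[pid]
                    if min_x <= x <= max_x and min_y <= y <= max_y:
                        hits.append(pid)
        return sorted(hits)


store = PointStore(cell_size=1.0)
store.insert("a", 0.5, 0.5, kind="mailbox")
store.insert("b", 2.5, 2.5, kind="mailbox")
store.insert("c", 10.2, -3.1, kind="hydrant")
nearby = store.query_bbox(0, 0, 3, 3)
```

The point of the index is in `query_bbox`: it asks “what is inside this area?” without scanning every stored point, which is the same job a spatial index does inside a real geospatial database.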
Is There a Difference Between a Geospatial Database and a Standard Database?
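Yes, there is, though they are the same at the core: a geospatial database is a standard database that has been extended to store, query, and index spatial data.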
Now, just to be clear, geodatabases can be stored in a geospatial database.
If you look at the chart of geodatabase types above, you’ll see that enterprise geodatabases rely on geospatial databases like Oracle, Microsoft SQL Server, IBM Db2, PostgreSQL, or SAP HANA. A geodatabase can rely on a geospatial database, but doing so means storing Esri-defined tables alongside your own. The user is not completely free to make their own choices about structure, and that is what differentiates a geodatabase from a geospatial database.
What’s the Difference Between a Geodatabase and a Geospatial Database?
|Name||Geodatabase||Geospatial Database|
|Also Known As||Geographic Database, Esri Geodatabase||Spatial Database, SDBMS (spatial database management system)|
|What is It?||A geodatabase is Esri’s proprietary file system folder for storing and managing geographic datasets – which are held in tables about feature classes or raster datasets and stored on a user’s machine or within a relational database.||A geospatial database is a database capable of storing, querying, and indexing spatial data in an efficient manner.|
|Who Owns It?||Esri||There is a market full of SDBMS options such as Oracle, MariaDB, Amazon, and Snowflake.|
|Strengths||Integrates with Esri, the leading GIS software. Organization and structure is managed to some degree.||Totally flexible deployment in terms of where it can be stored, how involved the management process is, and how much it will cost. Open source options.|
|Things to Consider||Proprietary to Esri, so not flexible for integrating with other software. Complicated for multiple users to edit at the same time unless configured as an enterprise geodatabase. Some storage limitations.||Varying degrees of spatial support, depending on the DBMS. Requires setup and maintenance to operate at its best. Bad design can lead to inefficient systems.|
Why Use One Over the Other?
If you look at the pros and cons above, it really comes down to how you answer:
- Where do you want (or need) your data stored?
- What other software does it need to integrate with? How will it be used?
- How many people are going to need read only access vs editing access?
- What type of data are you going to be working with? Does that database support the features you want to store?
At the end of the day, as long as you understand the differences, it’s a personal choice of what makes the most sense for your business or situation. Kind of like how choosing a coordinate system for your data is more about the greater goal, and there’s no guaranteed right or wrong choice.
If you’re not set on using one or the other, or know you’re inevitably going to have to make use of both, tools like FME can help you integrate various types of spatial data with geodatabases or geospatial databases. While migrating data can be a costly annoyance, it’s always an option if you feel the need to switch, and FME can help you do this with ease.
Create workflows to convert and transform your data in a drag and drop interface with FME. No coding required!
You’re even able to automate the process to ensure all your spatial data (and any other kind of data for that matter) is where it needs to be when you need it.
That was a lot, I know, I wrote it. But now that you’ve hit the end of the page, hopefully the information void of knowing the difference between geodatabases and geospatial databases has been filled, and you’re confident in the following definitions:
- A geodatabase is: Esri’s proprietary file system folder for storing and managing geographic datasets – which are held in tables about feature classes or raster datasets and stored on a user’s machine or within a relational database.
- A geospatial database is: a database capable of storing, querying, and indexing spatial data in an efficient manner.
If you want to learn more, I recommend the following resources to check out next:
- The databases and data warehouse integration solutions offered by FME
- The tutorial for Getting Started with Geodatabases
- The free training course on Esri Geodatabases and FME Desktop
- All parts of the blog series Spatial Data in the Cloud: Cloud-Native Relational Databases, NoSQL Databases, and Data Warehouses
Jenna Lyons
Jenna is a Product Owner at Safe. She originally got a BFA in Art and Design. How does one move from painting to tech? That’s a story for a longer bio, but know that her passion is creating, and you do a heck of a lot of that working in product. In her spare time, she likes cooking, crafting, and backpacking.
Hi Jenna, thanks for the blog!
I was hoping to find out whether there were any performance differences between a geodatabase and spatial database once they get quite big (tables with 10-20mil rows)?
Hi Jessica, thanks for reading!
Great question, but a tough one. Unfortunately, there’s no straightforward answer here because there are a lot of dependencies – one being the type of geodatabase you use. A file geodatabase has a limitation in terms of size, whereas an enterprise geodatabase is basically unlimited in terms of size. This may impact performance, but it will also depend on the data you’re using, where it’s being deployed, how it’s accessed, etc. In terms of performance, it’s less of a geodatabase (which is technically a type of spatial database) vs. spatial database question and more about the use case.
If you’d like to provide specific details on your use case, we’d be happy to dig deeper and run some experiments. Just leave your question on https://community.safe.com/s/questions and our experts will do their best to help you!
Hmmm. I disagree. I think you’re approaching this from the wrong angle (the database). There is no such thing as a geodatabase or a geospatial database or whatever. There is information that has a spatial component (and estimates are that 80% of all data has some form of location), and you want to store that information as data in a database. Back in 1998 Oracle already did that quite nicely as a datatype that could, and still can, be used in any table. All large RDBMSs now do that, and the better GIS applications can use that natively. Where Esri gets it wrong is that they place demands on how to define your data model.
So, no, there should not be a difference between a database, a Geodatabase and a Geospatial Database, but Esri artificially creates that difference (and it goes for tables just the same). Others do that a lot better.