Data integration means combining information from various sources into something useful. It’s about efficiently managing data and making it available to those who need it. It is both a technical and a business process, and IBM defines it as “discovery, cleansing, monitoring, transforming and delivery of data from a variety of sources.”
1. Every data type has its strengths
Every data format was designed for a reason. Each one represents information in its own way, with unique attributes, metadata, structure, and schema. Integrating data from different formats lets a single dataset draw on the specialized strengths of each.
For instance, CAD and GIS data can be integrated to enhance a CAD drawing with specialized GIS information and attributes. The Culpeper Online GIS Mapping System is one example of a CAD-GIS data integration project.
2. Take advantage of specialized applications
Similarly, every application was designed for a reason. (Though this may be hard to believe, given the existence of Shark Fingers.)
Let me modify that to say, every piece of software that works with data represents, analyzes, and transforms information in a specialized way. By integrating data into a format accepted by that application, you’re giving yourself the power to open and use your data in that software.
For example, Cambian Business Services maintains data models from 60 different sources and at least 10 applications. They needed to integrate all of that data into a PostgreSQL database, then redeploy it to the original system. Data integration enables them to freely convert between formats and open their data in its original legacy application. (Read more about the Cambian Business Services data integration project.)
3. Reduce data complexity
April Reeve describes it well: “The number of potential interfaces between applications in an organization is an exponential function of the number of applications. Thus, an organization with one thousand applications could have as many as half a million interfaces …”
Thrashing around in that web of connections will only get you tangled. Forming a data integration plan, on the other hand, is like entering that web with a machete. Data integration is about managing complexity, streamlining these connections, and making it easy to deliver data to any system. This might involve creating a data hub that’s easy to publish to and subscribe to.
For example, Shell Canada unifies complex 2D, 3D, raster, and vector information into an easily readable 3D PDF dataset. Their data integration plan gets information to users who might not have access to specialized GIS software. (Read more about Shell Canada’s 3D PDF data integration.)
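To put numbers on the interface count quoted in this section, here is a minimal sketch. It assumes the worst case, where every pair of applications needs its own point-to-point interface, which gives n(n-1)/2 connections. That is quadratic rather than strictly exponential growth, but for 1,000 applications it does land on the "half a million" figure. The function names are made up for illustration.

```python
def point_to_point_interfaces(n_apps: int) -> int:
    """Worst case: every pair of applications has its own interface."""
    return n_apps * (n_apps - 1) // 2


def hub_interfaces(n_apps: int) -> int:
    """Hub-and-spoke: each application connects once to a shared data hub."""
    return n_apps


# Compare the two layouts as the number of applications grows.
for n in (10, 100, 1000):
    print(n, point_to_point_interfaces(n), hub_interfaces(n))
```

With a hub, adding one more application adds one connection instead of one connection per existing application, which is the complexity reduction this section describes.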
4. Increase the value of data through unified systems
Bringing disparate datasets together increases the value of the information. Examples include:
- merging with and leveraging external data (e.g. from vendors);
- combining data of varying natures (structured, unstructured, spatial, tabular, web, raster, big data, etc.);
- applying spatial information to non-spatial data;
- combining databases from different repositories;
- creating a uniform schema for a group of datasets without metadata standards;
- bringing collections of data from different sources into the common GML structure for compliance with INSPIRE.
Talisman Energy, for example, integrates disparate datasets into a central GIS repository. The datasets are often not joined to GIS geometry, have a rigorous updating schedule, and may come from internal or external sources. For them, data integration is essential for efficient visualization and unified data access. (Read about Talisman Energy’s integrated GIS database.)
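As a rough illustration of the "uniform schema" idea from the list above, the sketch below maps records from two sources with different field names and types onto one shared structure. Everything here is hypothetical: the field names, the sample rows, and the `to_uniform` helper are assumptions for the example, not taken from any of the projects mentioned.

```python
# Hypothetical records from two sources that describe the same kind of asset.
vendor_rows = [{"ID": "7", "Name": "Well A", "Lat": "51.05", "Lon": "-114.07"}]
internal_rows = [{"asset_id": 12, "label": "Well B", "y": 53.54, "x": -113.49}]

# Per-source mapping from original field names to the shared schema (assumed).
FIELD_MAPS = {
    "vendor": {"ID": "id", "Name": "name", "Lat": "lat", "Lon": "lon"},
    "internal": {"asset_id": "id", "label": "name", "y": "lat", "x": "lon"},
}

# Target type for each field in the shared schema.
TYPES = {"id": int, "name": str, "lat": float, "lon": float}


def to_uniform(row: dict, source: str) -> dict:
    """Rename fields, coerce types, and record where the row came from."""
    mapping = FIELD_MAPS[source]
    out = {target: TYPES[target](row[src]) for src, target in mapping.items()}
    out["source"] = source
    return out


unified = [to_uniform(r, "vendor") for r in vendor_rows]
unified += [to_uniform(r, "internal") for r in internal_rows]
print(unified)
```

Keeping a `source` field on each unified record is a design choice that makes it easier to trace a value back to the system it came from.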
5. Make data more available
Centralizing your data makes it easy for anyone at your company (or outside of your company, depending on your goals) to retrieve, inspect, and analyze it.
Easily accessible data means easily transformed data. People will be more likely to integrate the data into their projects, share the results, and keep the data up to date. This cycle of available data is key for innovation and knowledge-sharing.
For example, Alpine Shire Council integrates a range of complex source data from varying formats, including digital elevation models, Esri Shapefiles, and more. Calculations are applied to the integrated data to yield spatial and non-spatial results, which are made accessible in real time via an iPad app. (Read about the BAL Plan app.)
6. Easy data collaboration
Teamwork! (Cue cheesy stock photo.)
With accessibility comes easier collaboration. Anyone who works with your data can spend their brain power on the actual problem, now that the data is available in the format they require. Whether collaboration involves sharing among internal teams and applications, or across organizations, integrated data is more complete because it has more contributors.
For example, the State of Indiana needed to combine specialized data from 92 counties in the form of points, parcels, streets with address ranges, and boundaries into an existing online GIS portal. Their data integration plan resulted in a non-invasive, easy way for all counties to collaborate on the data portal, despite each having a separate data management system. (Read about how Indiana harmonizes data in a central database.)
7. Understanding data means smarter business decisions
Integrated data means transparent processes within your company. By giving people the flexibility to use your data in whatever system they prefer, you’re giving them the opportunity to better understand the information. It’s much easier – and more informative – to navigate through organized repositories that contain a variety of integrated datasets.
Applying location intelligence to your dataset, for instance, makes it spatially comprehensive and offers new levels of insight around that dataset, which leads to better decision-making.
For example, the Skogskyrkogården data integration project combines databases, Esri Shapefiles, and JSON source data into one AutoCAD output that enables them to plan 100 years into the future.
Integrating with other data types can also mean conforming to interoperability standards to save on business costs. Further, open source data formats offer many advantages for businesses.
8. Data integrity
Data integration technology should cleanse and validate the information passing through. Obviously, we all want our data to be robust and high quality. An integration strategy ensures data is free of errors, inconsistencies, and duplication.
For example, the BC Transit system is made up of disconnected information on bus stops, vehicles, schedules, routes and ridership, and streets. These singular systems have made central transit information vulnerable to inaccuracy. A data integration strategy ensures BC Transit’s critical data is accurate and of high quality. (Read about BC Transit’s data integration and validation process.)
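Here is a minimal sketch of what "cleanse and validate" can look like in practice, using bus stops as the subject. The field names (`stop_id`, `lat`, `lon`) and the rules are assumptions chosen for the example; they are not BC Transit's actual schema or validation process.

```python
def clean_stops(stops):
    """Split stop records into valid rows and rejected rows with a reason."""
    seen = set()
    valid, rejected = [], []
    for stop in stops:
        stop_id = str(stop.get("stop_id", "")).strip()
        lat, lon = stop.get("lat"), stop.get("lon")
        if not stop_id:
            rejected.append((stop, "missing stop_id"))
        elif not (isinstance(lat, (int, float)) and -90 <= lat <= 90):
            rejected.append((stop, "invalid latitude"))
        elif not (isinstance(lon, (int, float)) and -180 <= lon <= 180):
            rejected.append((stop, "invalid longitude"))
        elif stop_id in seen:
            rejected.append((stop, "duplicate stop_id"))
        else:
            seen.add(stop_id)
            # Store the trimmed ID so later lookups are consistent.
            valid.append({**stop, "stop_id": stop_id})
    return valid, rejected


# Hypothetical sample data with one clean row and three problem rows.
sample = [
    {"stop_id": "100", "lat": 48.43, "lon": -123.37},
    {"stop_id": "100 ", "lat": 48.43, "lon": -123.37},  # duplicate after trimming
    {"stop_id": "101", "lat": 948.43, "lon": -123.37},  # latitude out of range
    {"stop_id": "", "lat": 48.40, "lon": -123.30},      # no identifier
]
valid, rejected = clean_stops(sample)
print(len(valid), [reason for _, reason in rejected])
```

Returning the rejected rows with a reason, rather than silently dropping them, lets the owners of each source system see and fix problems at the origin.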
9. Make your data live
An integrated data solution makes it easy to keep information up to date. One input can propagate across all integrated systems, keeping your data current. In fact, your data can even be real-time if a server or cloud solution is part of the integration strategy.
For example, The Weather Network integrates non-spatial sensor data and metadata and outputs it to spatial applications like Google Earth. From there, real-time KML updates and email alerts can be leveraged. (Read about the Pelmorex Lightning Detection Network’s real-time data integration.)
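The idea that one input can propagate across all integrated systems can be sketched as a tiny in-memory publish/subscribe hub. This is an illustration only: the `DataHub` class, the `"sensor"` topic, and the record fields are hypothetical, and a real deployment would sit on a server or cloud messaging service rather than a Python object.

```python
from collections import defaultdict


class DataHub:
    """Toy hub: systems subscribe to a topic and receive every published record."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, record):
        """Deliver the record to each subscriber; return how many were notified."""
        callbacks = self._subscribers[topic]
        for callback in callbacks:
            callback(record)
        return len(callbacks)


hub = DataHub()
map_layer = {}  # stands in for a spatial application's current state
alerts = []     # stands in for an email alert queue


def update_map(record):
    map_layer[record["sensor_id"]] = record


def maybe_alert(record):
    if record["strikes"] > 0:
        alerts.append(record["sensor_id"])


hub.subscribe("sensor", update_map)
hub.subscribe("sensor", maybe_alert)

# A single published reading updates both downstream systems.
notified = hub.publish("sensor", {"sensor_id": "S1", "strikes": 3})
print(notified, map_layer, alerts)
```

The publisher never needs to know which systems are listening, so new consumers can be added without touching the source.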
You might have heard of a data silo – a fixed repository of information. Like a farm silo, its contents are isolated from the outside world. Did you know the feed inside a farm silo is often fermented? Did you know the air and liquid emitted during fermentation are toxic? Yeah. Think about that metaphor.
Even if data doesn’t need to be accessed regularly, siloed data is still inefficient to work with, impossible to collaborate on, and runs a high risk of conflicting with external data. By actively integrating your data, you avoid the dreaded silo and gain the boundless potential outlined above.
Any data type can live in a wide variety of sources. Developing an integration strategy—both the technical and business aspects—is critical to ensuring your organization’s data reaches its maximum potential. Learn more about data integration and how FME technology can help your organization tackle data integration – whether your data is big or small, spatial or tabular, structured or unstructured, open or proprietary, or all of the above.
Tiana Warner
Tiana is a Senior Marketing Specialist at Safe Software. Her background in computer programming and creative hobbies led her to be one of the main producers of creative content for Safe Software. Tiana spends her free time writing fantasy novels, riding her horse, and exploring nature with her rescue pup, Joey.