Extracting vector features from an image (or “vectorizing a raster”) involves converting groups of raster pixels into polygons. This is a common scenario for GIS and CAD workflows, and a challenging one. Maybe you’d like to digitize a scanned map, convert a DEM to polygons based on elevation, extract text from an image, or otherwise generate polygons from a supplied image.

There are many possible ways to tackle these problems, some easier than others, some more accurate. Here are three ways you can do it by processing your raster in an FME data integration workflow.

1. Trace an image to convert it to CAD or GIS

Potrace is a free tool for turning bitmaps into vector graphics. In FME, you can leverage it via the custom transformer PotraceCaller. Converting raster to vector in this context involves three steps:

  1. Prepare the raster (e.g. identify color ranges in the input JPEG)
  2. Call Potrace
  3. Prepare the output (e.g. generalize polygons, style for AutoCAD)

In the example below, the user has an RGB image of an area and needs to extract CAD polygons of buildings. In the image, buildings are shown in a darker shade. Preparing the raster therefore involves classifying color ranges so the number of colors is reduced: the dark shade representing buildings is extracted and everything else becomes the background. Preparing the output involves smoothing the polygons generated by Potrace so we end up with nice blocky shapes instead of crazy 1000-sided chiliagons.
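To make the "prepare the raster" step concrete, here is a minimal Python sketch of the idea, not the logic inside the FME workspace. It assumes the image is already loaded as rows of (R, G, B) tuples, and the function names and the brightness threshold of 100 are illustrative choices. Dark pixels become 1 (foreground) and everything else becomes 0, and the result is written out as plain-text PBM, one of the bitmap formats Potrace reads.

```python
def luminance(rgb):
    """Approximate perceived brightness (0-255) using Rec. 601 weights."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def dark_mask(pixels, threshold=100):
    """Return a 0/1 grid: 1 where a pixel is darker than the threshold.

    The threshold is an assumption; tune it to the shade used for buildings.
    """
    return [[1 if luminance(p) < threshold else 0 for p in row] for row in pixels]


def to_pbm(mask):
    """Serialise a 0/1 grid as plain-text PBM (magic number P1, 1 = black)."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    lines = ["P1", f"{width} {height}"]
    lines += [" ".join(str(v) for v in row) for row in mask]
    return "\n".join(lines) + "\n"


# A tiny 3x2 "image": light background with a few dark building pixels.
image = [
    [(230, 230, 230), (60, 60, 60), (58, 62, 61)],
    [(228, 231, 229), (61, 59, 60), (225, 226, 224)],
]
mask = dark_mask(image)
pbm_text = to_pbm(mask)
```

Reducing the image to two values before tracing is what keeps the traced output limited to the features you care about.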

The red polygons overlaying the image have been extracted using the PotraceCaller.
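The "prepare the output" step mentions generalizing the traced polygons. As an illustration of what generalization does, here is a short sketch of the Douglas-Peucker algorithm, a widely used line-simplification technique. It is a stand-in chosen for the example: the source does not say which algorithm the workspace uses, and the tolerance value and function names are assumptions.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Douglas-Peucker simplification of an open polyline."""
    if len(points) < 3:
        return list(points)
    index, worst = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], points[0], points[-1])
        if d > worst:
            index, worst = i, d
    if worst <= tolerance:
        # Every interior vertex is close enough to the chord: drop them all.
        return [points[0], points[-1]]
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right  # the split vertex appears once


# A slightly wobbly traced wall that should become one clean corner.
wall = [(0, 0), (1, 0.05), (2, -0.04), (3, 0), (3, 1.02), (3, 2)]
simplified = simplify(wall, tolerance=0.1)
```

The sketch handles open polylines. For a closed polygon ring you would typically split the ring at two vertices, simplify each half, and rejoin them.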

Try it: Potrace Examples

Follow this step-by-step tutorial and download the FME template made by Dmitri for an example of how to convert an image to a CAD drawing via Potrace. To run it, you’ll need to download Potrace, then in FME open the PotraceCaller parameters and point it at potrace.exe.
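If you're curious what calling Potrace looks like outside FME, the sketch below builds a command line in Python. This is not how the PotraceCaller is implemented; it is a rough standalone equivalent. The executable path and file names are hypothetical, and the DXF backend is assumed to be available in your Potrace build (check `potrace --help` to confirm).

```python
import subprocess


def potrace_command(potrace_path, bitmap, output, backend="dxf", turdsize=2):
    """Build the argument list for one Potrace run.

    turdsize is Potrace's option for suppressing small speckles.
    """
    return [
        potrace_path,
        "--backend", backend,
        "--turdsize", str(turdsize),
        "--output", output,
        bitmap,
    ]


def trace(potrace_path, bitmap, output, **options):
    """Run Potrace and return the output path; raises if Potrace fails."""
    subprocess.run(potrace_command(potrace_path, bitmap, output, **options), check=True)
    return output


# Hypothetical install location and file names, for illustration only.
command = potrace_command(
    r"C:\tools\potrace\potrace.exe", "buildings.pbm", "buildings.dxf"
)
```

Calling `trace(...)` with the same arguments would actually run the tool, provided Potrace is installed at that path and the input bitmap exists.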

An example FME Workspace converting JPEG to DWG using Potrace.

2. Extract text from an image with Optical Character Recognition

Tesseract is a free tool that performs OCR, converting an image to text. In FME, you can call it via the custom transformer TesseractCaller.

OCR is useful for digitizing scanned maps and documents, making the data searchable and indexable. As above, this workflow also involves pre-processing steps to define color ranges, and preparing data for the output format. Read more about OCR in FME and download an example.
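For reference, here is roughly what a direct call to the Tesseract command line looks like from Python. Treat it as a sketch under assumptions rather than a description of the TesseractCaller: the image name is hypothetical, `tesseract` is assumed to be on your PATH, and the `--psm` spelling applies to Tesseract 4 and later.

```python
import subprocess


def tesseract_command(image, language="eng", page_seg_mode=6,
                      tesseract_path="tesseract"):
    """Build a Tesseract call that prints recognised text to stdout.

    Page segmentation mode 6 treats the image as a single uniform block of text.
    """
    return [
        tesseract_path, image, "stdout",
        "-l", language,
        "--psm", str(page_seg_mode),
    ]


def read_text(image, **options):
    """Run Tesseract and return the recognised text as a string."""
    result = subprocess.run(
        tesseract_command(image, **options),
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


# Hypothetical file name, for illustration only.
command = tesseract_command("scanned_map.png")
```

OCR quality depends heavily on the pre-processing mentioned above, so a cleaned-up, high-contrast image usually gives better results than the raw scan.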

The text shown in the PNG image on the left has been extracted using the TesseractCaller.

Learn more about converting imagery into usable data in our blog about extracting geospatial data from PDFs.

3. Extract polygons from a raster based on pixel values

This is also known as “classifying” a raster and involves generating a polygon for each contiguous area of pixels with similar values. Like method #1, this involves defining color ranges and outputting polygons based on those ranges.
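The core idea, finding each contiguous area of same-valued pixels, can be sketched in a few lines of Python. This is an illustration of the concept using a breadth-first flood fill with 4-connectivity, not FME's implementation. The grid values and function name are made up for the example, and turning each labelled region into polygon geometry is a further step not shown here.

```python
from collections import deque


def label_regions(classes):
    """Give every contiguous same-class area its own integer label.

    Returns (labels, region_class): a grid of labels starting at 1, and a
    dict mapping each label to the class value of that region.
    """
    rows = len(classes)
    cols = len(classes[0]) if rows else 0
    labels = [[0] * cols for _ in range(rows)]
    region_class = {}
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                continue  # already part of a region
            next_label += 1
            region_class[next_label] = classes[r][c]
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not labels[ny][nx]
                            and classes[ny][nx] == classes[y][x]):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
    return labels, region_class


grid = [
    ["low", "low", "high"],
    ["low", "high", "high"],
    ["low", "low", "low"],
]
labels, region_class = label_regions(grid)
```

Each distinct label corresponds to one output polygon, with the class value carried along as an attribute.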

FME supports many raster and imagery formats and offers powerful functionality for working with rasters. To classify a raster and convert groups of pixels to polygons, send it through the RasterToPolygonCoercer. This method is simpler because it doesn’t involve downloading the third-party tool Potrace, but note that it’s more processing-intensive and therefore slower than Potrace. If you have a big raster, method #1 will probably be more worth your while.

To learn more about classifying rasters, check out this tutorial on the RasterExpressionEvaluator transformer, which can be used to calculate expressions or conditions on each cell in a raster.
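As a rough picture of what "an expression on each cell" means, the Python sketch below assigns every cell of a small DEM to an elevation band. It does not use RasterExpressionEvaluator syntax; the break values of 50 and 100 and the function names are assumptions for the example.

```python
from bisect import bisect_right


def classify_cell(value, breaks):
    """Return the band index for one cell.

    With breaks [50, 100]: band 0 is below 50, band 1 is 50 up to (not
    including) 100, and band 2 is 100 and above.
    """
    return bisect_right(breaks, value)


def classify_raster(raster, breaks):
    """Apply the same classification to every cell in the raster."""
    return [[classify_cell(v, breaks) for v in row] for row in raster]


dem = [
    [12.0, 48.5, 103.2],
    [51.0, 99.9, 250.0],
]
bands = classify_raster(dem, breaks=[50, 100])
```

Collapsing continuous elevations into a handful of bands first means the polygon step has far fewer distinct values to deal with.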

This DEM has been converted to polygons using the RasterToPolygonCoercer.

Tip: For a huge raster, consider converting it to a point cloud and using FME’s superfast point cloud processing capabilities to perform your transformation of choice – e.g. group points by their component values and dissolve into polygons.
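Here is a small Python sketch of the grouping idea in this tip, assuming a plain grid of cell values and a hypothetical cell size of 1. It is a conceptual illustration, not FME's point cloud engine: each cell becomes a point at its centre, and points are then bucketed by value.

```python
from collections import defaultdict


def raster_to_points(raster, cell_size=1.0):
    """Turn each cell into an (x, y, value) point at the cell centre.

    Row 0 is treated as y = 0 here; real georeferencing is omitted.
    """
    return [
        ((c + 0.5) * cell_size, (r + 0.5) * cell_size, value)
        for r, row in enumerate(raster)
        for c, value in enumerate(row)
    ]


def group_by_value(points):
    """Bucket point coordinates by their component value."""
    groups = defaultdict(list)
    for x, y, value in points:
        groups[value].append((x, y))
    return dict(groups)


raster = [
    [1, 1],
    [2, 1],
]
points = raster_to_points(raster)
groups = group_by_value(points)
```

Each group could then be dissolved into polygons, which is the step the tip leaves to your transformation of choice.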

*

Vectorization is just one way people are incorporating images into their CAD and GIS projects. Basemaps, textures, and 3D models are a whole other area where rasters are able to speak 1000 words in terms of adding context and richness to any dataset. How are you using rasters with your CAD and GIS projects?

Tiana Warner

Tiana is a Senior Marketing Specialist at Safe Software. Her background in computer programming and creative hobbies led her to be one of the main producers of creative content for Safe Software. Tiana spends her free time writing fantasy novels, riding her horse, and exploring nature with her rescue pup, Joey.
