Anatomy of an FME Project: Parts I and II
Creating Blank Workspaces
IGDS Tag Reading
Geometry Handling Parameter
Backwards Text File Reading
You may not be aware, but the US is not the only country with an upcoming federal (i.e. national government) election. Here in Canada we have one coming up too.
Because my wife works as a campaign manager for a Canadian MP, and because I’ve had enough of them colouring paper maps with crayons (I kid you not!) I persuaded them to buy a copy of FME and have been spending my evenings creating useful ETL projects for them.
One aspect that made a particularly interesting project was election signs. The campaign team records sign locations in an Excel spreadsheet, but had no way of showing this info on a map, or providing maps to the crew who put the signs up around town.
So I threw together a small project to bring them kicking and screaming into the 21st century, and will describe it, stage-by-stage, in upcoming postings under the title “Anatomy of an FME Project”.
And by the time the series is complete we’ll find out whether the FME work I did helped them to get the MP re-elected!
Anatomy of an FME Project: Part I: Organisation
Absolutely the first thing you should do in an FME project is organize your data and workspaces into a number of folders.
So I created a project structure along the lines outlined in the FME training manual:
And followed this with a number of subdirectories:
.SourceData : for storing source dataset files
.OutputData : for storing destination dataset files
.FMEworkspaces : for storing complete workspaces
.testing : for storing incomplete and test workspaces
Anatomy of an FME Project: Part II: Geocoding
If we’re to do anything with address data, the first stage must be geocoding: turning the basic address into a spatial coordinate. FME doesn’t have that capability built in, but it does have the ability to send data off to a web service for processing. The transformer that does this is the HTTPFetcher.
A number of geocoding services are available, but I chose Yahoo! as being free, well-documented, and the service used in this example on fmepedia.
Basically the technique is to build up an HTTP request string using the Concatenator transformer. The first part is a constant value:
…the rest is a query comprised of street, city and state, where a question mark (?) signals the start of the query, spaces are replaced with + characters, and an ampersand (&) denotes the different fields of the query. For example Safe’s office would be:
The final touch is the requirement to put a Yahoo!-provided user/application ID on the end of the string.
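Outside FME, the same string-building rules can be sketched in a few lines of Python. This is only an illustration of the technique: the base URL, the application ID and the parameter names (street, city, state, appid) are placeholders I have assumed, not the real Yahoo! values. The address is the one that appears in the sample response later in this post.

```python
from urllib.parse import urlencode

# Hypothetical placeholders: substitute the real service URL and your own application ID.
BASE_URL = "http://geocoder.example.com/geocode"
APP_ID = "YOUR_APP_ID"


def build_geocode_request(street, city, state, base_url=BASE_URL, app_id=APP_ID):
    """Assemble a geocoding request URL from the address parts."""
    # Parameter names are assumptions for illustration only.
    # urlencode joins the fields with '&' and replaces spaces with '+'.
    query = urlencode({"street": street, "city": city, "state": state, "appid": app_id})
    # A question mark separates the constant part from the query.
    return base_url + "?" + query


print(build_geocode_request("7601 132 St", "Surrey", "BC"))
```

The Concatenator and StringReplacer do the same job inside the workspace: one glues the pieces together, the other swaps the spaces for + characters.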
So, in FME terms we start like this:
…of course FME2009 is better here because the updated StringReplacer lets you handle any number of source attributes.
When run, the result of these transformers is an XML string returned as an attribute:
<Result precision="address"> <Latitude>49.140960</Latitude> <Longitude>-122.856749</Longitude> <Address>7601 132 St</Address> <City>Surrey</City> <State>BC</State> <Zip>V3W</Zip> <Country>CA</Country> </Result>
So the next step is to use StringSearchers to find the appropriate Latitude and Longitude figures.
The regular expression we search for is: <Latitude>((-|)([0-9]*)[.]([0-9]*))
This extracts the Latitude (and Longitude when we specify that tag) into an attribute which is then renamed. The match itself is a test of the success of the geocode – lack of a latitude tag assumes a failed process, so such features are output immediately.
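To see what that expression captures, here is a small Python sketch that applies it to the sample response above. The function name is my own invention, and Python's re module stands in for the StringSearcher, so treat it as an approximation of what the transformer does rather than a description of its internals.

```python
import re

# The sample response shown earlier in this post.
SAMPLE = (
    '<Result precision="address"> <Latitude>49.140960</Latitude> '
    "<Longitude>-122.856749</Longitude> <Address>7601 132 St</Address> "
    "<City>Surrey</City> <State>BC</State> <Zip>V3W</Zip> "
    "<Country>CA</Country> </Result>"
)


def extract_coordinate(xml_text, tag):
    """Return the number inside <tag> as a string, or None if the tag is missing."""
    # Same expression as in the post, with the tag name substituted in.
    pattern = "<" + tag + ">((-|)([0-9]*)[.]([0-9]*))"
    match = re.search(pattern, xml_text)
    # No match means the geocode failed, mirroring the StringSearcher test.
    return match.group(1) if match else None


latitude = extract_coordinate(SAMPLE, "Latitude")
longitude = extract_coordinate(SAMPLE, "Longitude")
print(latitude, longitude)
```

The first capture group holds the whole number, including the optional minus sign, which is why the longitude comes back negative.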
Now we’re on more familiar territory. The only task remaining is to turn the latitude and longitude attributes into a true spatial feature, which is done by the 2DPointReplacer and the CoordinateSystemSetter:
Dmitri has a great page for building workspaces for web services – http://www.fmepedia.com/index.php/Building_Web_services_workspace
Creating Blank Workspaces
As if to prove you should never overlook the small stuff, at my last two training classes I was quite surprised to find a number of experienced users weren’t aware of this shortcut. So I thought I should probably pass it on in case there are others who it can help.
The scenario is simple: you have a workspace open, and wish to close it and start a new, blank workspace.
The well-known method of doing this is to use File > New on the menubar, and in the subsequent dialog choose the option to “Create a Blank Workspace” then click OK.
The lesser-known method is much simpler: just click on the [X] button to the top right of the workspace canvas.
Below: Simply click the X button to restart the canvas with a new (blank) workspace.
Note that if there are unsaved changes in the current workspace, you will be prompted whether to save them first. Also, when the canvas is showing a custom transformer, this button will merely close that transformer; but when the focus is on the “Main” tab, the button will close the entire workspace.
IGDS (DGN) Tags and Level Names
One of the mysteries of FME and DGN reading is, for new users, how to get values off tags. Tags are a DGN method of storing small bits of attribute information.
The answer is in the source settings. Use the “group-by schema” option and tag data will be read from the source data and added as attributes to the new workspace.
However, this option always uses Level Numbers to define the new feature types, and most users will now be using Level Names.
So, in build 5607+ of FME2009, there is now an option to both group-by schema, AND get level names rather than numbers.
Above: Now we have settings for both Numbers and Names:
Geometry Handling Setting
Like a French film without subtitles, the Geometry Handling setting in FME both confuses and delights in equal measure. In order to help clear up confusion, the setting values have simply been renamed from “Rich and Classic” to “Enhanced and Classic” in FME2009 build 5608+.
Below: Geometry Handling is an Advanced Workspace Setting:
Below: Double-click the Geometry Handling setting and these are the potential options.
The key behind the change was to remove the word “Rich”, which users often confused with “Rich Geometry” – see this fmepedia page for more info on that subject.
There are two items of interest to note. Firstly the parameter is now shared between Workbench and the FME Viewer; so change the default value in one and it will be applied in the other. The second item is that the default value is still Classic, and will remain so for the 2009 release: if you want enhanced processing then you’ll need to set the value manually.
So now it’s like a French film with subtitles: enhanced and classic.
Bottoms-Up: Reading text files backwards
A user recently wrote to the support team with a problem that – I’m happy to say – we managed to find two solutions for; both of which may be of use to you too.
The scenario is that a whole series of FME batch processes are run; most of which succeed, but some of which fail. How, in a whole set of very large log files, do you find the failures?
The user’s approach was to read them into FME using a Textline reader and search each line for the value “Translation was Successful”. However, this was very slow and a faster method was desired.
Solution 1: Using existing technology a better solution was this: the “Directory and File Pathname” reader was used to get the name of all the log files in a folder and create a feature for each. This was passed onto an AttributeFileReader where the entire log file was read into an attribute. A simple StringSearcher (ex Grepper) was then used to determine the presence (or not) of the “Translation was Successful” string; i.e. instead of searching one line at a time, we search the entire file at once.
Below: Solution 1 – The user confirms it is “a lot faster”
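For anyone who wants to see the idea behind Solution 1 outside of Workbench, here is a rough Python equivalent: list the log files, read each one whole, and search the text once. The function names and the *.log file pattern are my own assumptions; only the search string comes from the scenario above.

```python
from pathlib import Path

# The string FME writes to the log when a translation succeeds (from the scenario above).
SUCCESS_MARKER = "Translation was Successful"


def log_succeeded(log_text, marker=SUCCESS_MARKER):
    """True if the text of a whole log file contains the success marker."""
    return marker in log_text


def find_failed_logs(folder, pattern="*.log"):
    """Return the names of the log files in a folder that lack the success marker."""
    failed = []
    # One pass per file, like one feature per file from the directory reader.
    for path in sorted(Path(folder).glob(pattern)):
        # Read the entire file into one string, as the AttributeFileReader does.
        text = path.read_text(errors="replace")
        if not log_succeeded(text):
            failed.append(path.name)
    return failed
```

Calling find_failed_logs on the folder that holds the batch logs gives the list of translations to investigate.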
Solution 2: This involved an update to FME that I think will be useful for other users and situations. The textline reader has been given an upgrade to read files from the bottom up. This would make the above scenario easier, because we know the string will be in the final five lines. So rather than have to read the entire dataset from the top down, we can read just five features from the bottom up.
Below: Now we can read a text file upside-down:
I don’t know if the developers did this deliberately (kudos if so) but the “Start Feature” setting also starts from the bottom of the file. So to read features n-10 to n-15, set:
– Read Bottom Up = Yes
– Max Features to Read = 5
– Start Feature = 10
…of course they’ll be in the reverse order, so use a Sorter if you want them back in the correct order.
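The effect of those three settings can be mimicked in Python with a slice over the reversed list of lines. This is a sketch of the behaviour as I have described it, not of how the reader is implemented: in particular I am assuming Start Feature counts from zero at the last line, and the real reader's indexing may differ. Unlike the reader, this version also needs the whole file in memory, so it shows the logic rather than the speed-up.

```python
def read_bottom_up(lines, max_features, start_feature=0):
    """Return up to max_features lines, last line first, skipping start_feature lines."""
    # Reverse the file so the final line becomes the first feature.
    reversed_lines = lines[::-1]
    # Assumption: start_feature is a zero-based offset from the end of the file.
    return reversed_lines[start_feature:start_feature + max_features]


# A stand-in for a 20-line log file.
log_lines = ["line " + str(i) for i in range(1, 21)]

tail = read_bottom_up(log_lines, max_features=5)
window = read_bottom_up(log_lines, max_features=5, start_feature=10)

# The equivalent of a Sorter: put the features back into file order.
window_in_order = window[::-1]
print(tail, window_in_order)
```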
This Edition of the FME Evangelist…
…was written to the tune of Suddenly I See by KT Tunstall.
A rockin’ tune with heaps of big guitar sound. What more could you want?
Mark Ireland
Mark, aka iMark, is the FME Evangelist (est. 2004) and has a passion for FME Training. He likes being able to help people understand and use technology in new and interesting ways. One of his other passions is football (aka soccer). He likes both technology and soccer so much that he wrote an article about the two together! Who would’ve thought? (Answer: iMark)