Hi FME’ers,

I recently finished updating the FME Desktop training for 2016. The Performance chapter in the advanced manual is always one of the most difficult to put together, but also the most rewarding. It shows how just a few small changes can make a huge difference to a workspace.

This post is just a very quick overview of that chapter, showing the exercise we cover, what changes we make, and how much its performance improves.


Starting Workspace

Here is the workspace we start with in the training:

[Screenshot: the starting workspace (Performance2016-StartingWorkspace)]

At first glance there really isn’t much wrong with it. We’re reading some CSV data (1.7m records) representing cell phone calls, finding out which neighborhood (6 features) each CSV record falls inside, and writing out the ones with poor signal strength to a Shape dataset. In that way we hope to find out which neighborhoods have the worst signal strength.

Running the workspace in FME I get this result:

 INFORM|FME Session Duration: 11 minutes 6.5 seconds. (CPU: 306.7s user, 37.8s system)
 INFORM|END - ProcessID: 50764, peak process memory usage: 2966368 kB

Basically 11 minutes of time and 3GB of memory. But can we improve on that?

There are various things we check for in the training, and minor techniques we use, but let’s focus on four big ones.


Writer Order

I’ve already covered in a previous post why writer order is important and how it affects a translation, so I won’t go into it in depth here. Suffice it to say that when you have multiple writers, the one handling the most data should be triggered first, and I can control that using the order of writers in the Navigator window:

[Screenshot: writer order in the Navigator window (Img2.21.WritersPerformanceOrder)]

In this exercise the Writer handling the most data was not first in the list. When I moved it to the top, performance improved to this:

 INFORM|FME Session Duration: 4 minutes 1.9 seconds. (CPU: 219.7s user, 19.4s system)
 INFORM|END - ProcessID: 46312, peak process memory usage: 1776304 kB

That’s saved seven minutes and cut memory usage by about 40%!
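
If it helps to picture why this matters, here’s a rough mental model in Python. This is my simplification, not FME’s actual internals: features bound for any writer other than the one currently being written have to be held somewhere until that writer’s turn comes, so putting the busiest writer first keeps that holding area small.

    # Toy model (a simplification, not FME internals): the first writer streams
    # its features out immediately; features for any other writer are cached
    # until that writer's turn comes.
    def peak_cache(features, first_writer):
        """features is a list of (destination_writer, payload) pairs in arrival order."""
        cached = peak = 0
        for dest, _ in features:
            if dest != first_writer:   # can't be written yet, so it is held back
                cached += 1
                peak = max(peak, cached)
        return peak

    # Hypothetical mix: 1,000 features for a busy writer, 6 for a small one.
    mix = [("busy", None)] * 1000 + [("small", None)] * 6
    print(peak_cache(mix, first_writer="small"))   # 1000 features held back
    print(peak_cache(mix, first_writer="busy"))    # only 6 features held back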


Remove Attributes

You probably know that removing attributes is good for a workspace, but perhaps you don’t know quite how much.

In our workspace there aren’t that many attributes to start with, but there are 1.7m CSV records, so it can add up to a sizable amount. I even remove the excess attributes from the neighborhood features. It’s true there are only 6 features and a handful of attributes, but each feature’s attributes are being copied 1.7m times by the Clipper. That’s a lot of work.

So, by using AttributeManagers to get rid of excess attributes, and by dropping an unwanted Logger transformer, performance is now improved to this:

 INFORM|FME Session Duration: 3 minutes 36.2 seconds. (CPU: 194.4s user, 19.9s system)
 INFORM|END - ProcessID: 53072, peak process memory usage: 1349336 kB

That’s about 10% faster and nearly a quarter less memory, so it’s well worth doing.
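
AttributeManagers are the right tool for this in the course. Purely as an illustration, if you ever wanted to do the equivalent cleanup in a PythonCaller, a minimal sketch might look like the following; the keep-list is made up for this example, not the exercise’s actual schema.

    # A PythonCaller sketch: keep only the attributes needed downstream.
    # This only runs inside FME (the PythonCaller supplies self.pyoutput),
    # and the keep-list is invented for illustration, not the exercise schema.

    KEEP = {"SignalStrength", "NeighborhoodName"}

    class AttributeTrimmer(object):
        def input(self, feature):
            for name in feature.getAllAttributeNames():
                # Leave FME's internal fme_* / multi_* attributes alone.
                if name not in KEEP and not name.startswith(("fme_", "multi_")):
                    feature.removeAttribute(name)
            self.pyoutput(feature)

        def close(self):
            pass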


Clippers First

The performance bottleneck for most workspaces is a group-based transformer. These are transformers that operate on a whole group of data at once, rather than one feature at a time, and so use a lot more memory. However, most group-based transformers have parameters to reduce the size of the groups being processed. “Input is Ordered By Group” is a common one, but in our workspace the key is the Clipper Type parameter and being able to set it to Clippers First:

[Screenshot: the Clipper Type parameter set to Clippers First (Performance2016-ClippersFirst)]

Now each clippee feature can be processed as soon as it arrives, rather than being stored in memory, because FME knows it already has the full set of possible clippers. The result? This:

 INFORM|FME Session Duration: 3 minutes 41.1 seconds. (CPU: 200.7s user, 18.7s system)
 INFORM|END - ProcessID: 52820, peak process memory usage: 96220 kB

It’s no quicker – in fact it’s a few seconds slower – but it has cut peak memory from about 1.3 GB to under 100 MB.
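
A rough way to picture what changed (again a simplification, not the Clipper’s actual code): once FME knows it already holds every clipper, a clippee can be clipped and released the moment it arrives instead of being accumulated until the end of the translation.

    # Simplified picture of group-based processing (not the Clipper's real code).
    def clip(clippers, clippee):
        return clippee            # stand-in for the actual clipping operation

    # Without "Clippers First": nothing can be emitted until all input is seen,
    # so every clippee sits in memory until the end.
    def buffer_everything(features):          # features = (role, payload) pairs
        clippers, clippees = [], []
        for role, feat in features:
            (clippers if role == "clipper" else clippees).append(feat)
        return [clip(clippers, c) for c in clippees]   # peak memory ~ all clippees

    # With "Clippers First": the clippers all arrive up front, so each clippee
    # can be processed and released as soon as it shows up.
    def clippers_first(features):
        clippers = []
        for role, feat in features:
            if role == "clipper":
                clippers.append(feat)
            else:
                yield clip(clippers, feat)    # peak memory ~ just the clippers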


Transformer Order

In the workspace screenshot above you might have noticed that we’re finding the neighborhood of every cell phone record, but we’re only writing out the neighborhood attribute to the low signal dataset.

Therefore, we could improve on things by rearranging the transformers like this:

[Screenshot: the rearranged workspace (Performance2016-EndWorkspace)]

All we’ve done there is put the Tester before the Clipper, but now our performance has improved to this:

 INFORM|FME Session Duration: 1 minute 49.0 seconds. (CPU: 89.6s user, 18.7s system)
 INFORM|END - ProcessID: 53180, peak process memory usage: 95240 kB

Incredibly, that’s about half the time it took before, on top of all the other savings. It’s a great illustration of how a little thought about the logic of a workspace can reap huge benefits.
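
The same principle applies well beyond FME: run the cheap test before the expensive operation, so the expensive one only ever sees the features that matter. Here’s a generic sketch, with a made-up signal threshold and a stand-in for the spatial test:

    # Generic illustration of filtering before the expensive step.
    # find_neighborhood() stands in for the Clipper; the signal test stands in
    # for the Tester. The threshold and field names are invented.
    def find_neighborhood(record, neighborhoods):
        ...  # imagine a costly point-in-polygon test here

    def slow(records, neighborhoods):
        # Spatial test on all 1.7m records, then most results are thrown away.
        tagged = [(r, find_neighborhood(r, neighborhoods)) for r in records]
        return [(r, n) for r, n in tagged if r["signal"] < 20]

    def fast(records, neighborhoods):
        # Test first: the expensive lookup only runs on the low-signal records.
        low = [r for r in records if r["signal"] < 20]
        return [(r, find_neighborhood(r, neighborhoods)) for r in low]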


Conclusion

The reason I wanted to post this outside of the training course is that I am stunned by how much I could improve FME performance with a few minor edits. Remember, the original workspace didn’t seem that bad. You might look at it and think there wasn’t much that could be done to improve it. But four small edits later – taking about as long as it took you to read this article – we have a workspace that is about six times faster and uses some 97% less memory! Everyone should be aware of these techniques.
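
If you want to check the arithmetic, the numbers come straight from the first and last log excerpts quoted above:

    # Figures taken from the first and last log excerpts in this post.
    before_s,  after_s  = 11 * 60 + 6.5, 1 * 60 + 49.0   # session duration (s)
    before_kb, after_kb = 2966368, 95240                  # peak memory (kB)

    print(f"{before_s / after_s:.1f}x faster")            # ~6.1x
    print(f"{1 - after_kb / before_kb:.0%} less memory")  # ~97%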

Obviously your results – even for this translation – will be different with a different computer system, and the order in which I apply the fixes makes a difference in the relative improvement each provides. So I can’t guarantee that every project of yours will get the same improvement, but isn’t it worth taking the time to check?

If you want to try this exercise yourself, check the FME Training pages to either sign up for an advanced course or watch the on-demand recording (it should be available shortly).



Mark Ireland

Mark, aka iMark, is the FME Evangelist (est. 2004) and has a passion for FME Training. He likes being able to help people understand and use technology in new and interesting ways. One of his other passions is football (aka. Soccer). He likes both technology and soccer so much that he wrote an article about the two together! Who would’ve thought? (Answer: iMark)

Comments

4 Responses to “FME 2016 Use Case: Performance Tuning”

  1. Is this the Performance-Ex3-Complete.fmw? I ran it on my powerful computer and got these stats. This might be due to the SSD drive? What are the specs of the test machine you ran it on?

    INFORM|FME Session Duration: 17.5 seconds. (CPU: 12.5s user, 1.2s system)
    INFORM|END - ProcessID: 12056, peak process memory usage: 73800 kB, current process memory usage: 63068 kB

    • Mark Ireland says:

      Yes, the starting workspace is Performance-Ex1-Begin and the final result is Performance-Ex3-Complete. You might notice that in those workspaces there is a Cloner transformer that I added to bulk up the data for training. If you run a training course on fast machines you can increase the number of clones created, or reduce the number for slower machines. I ran it on a Dell system, Windows 7, Xeon processor (quad core, 3.20GHz) with 24GB of RAM, though as you will see I only ran it in 32-bit.

  2. Carl Von Stetten says:

    If you moved the CSV reader to the top, how do you know that the Neighborhood features will arrive first to the clipper transformer as clippers? Do readers operate in parallel despite how the log appears?

    • Mark Ireland says:

      No, Readers don’t operate in parallel. The first reader in the list has its features read (and passed for processing), then the next reader does. So in this example I would make sure the Neighborhood features (Clippers) were at the top of the list so that they appeared first. The CSV features (Clippees) I would read after. If the Clippee features arrived first (and I had Clippers First set) then the Clippees would be discarded (I’m not sure if they would error out or if they might be rejected).

      One thing you have to be careful about is that you don’t have any intervening transformers that might cause the order to be altered. It would need to be a group-based transformer to do that. For example, if I piped the Neighborhood features into an AreaBuilder before the Clipper, that might hold them up and let the CSV features slip through first. It might not, but I wouldn’t rely on that.

      In 2016.1 though, we’re getting new functionality that lets you change the order that specific connections are used. So that will make the Reader order a little less important.
