Custom Function Optimized to Evaluate Hundreds Times Faster

September 14, 2011

Custom Function Optimized

Two weeks ago I wrote an article about a FileMaker custom function I needed to preprocess some data imported from the web. I used this custom function in an auto-enter calculation to immediately preprocess the data while being imported. I created a server-side script that does the import every morning. But when I discovered that the import was taking over an hour every day, I saw it deserves some optimization. I used FM Bench Detective and optimized the custom function to evaluate up to several hundreds times faster…

I used FM Bench Detective to measure the exact time each depth of this recursive custom function took to evaluate, and even split the calc to three parts to find out which part was taking how much time.

The key discovery for me was that of the about 73 milliseconds every single recursion was taking, 69 milliseconds was occupied by the Substitute function that really only needed to be performed once int he topmost instance. As I mentioned in my Marvelous Optimization Formula, the more stupid is the original solution the more marvelous the optimization seems.

Just by making sure that the Substitute function is only used once, in the topmost instance, I was able to cut the evaluation time for every consecutive recursion from the 73 milliseconds to just about 5.9. Actually, I later discovered that the measuring overhed of FM Bench Detective was around 1.7 milliseconds per measurement and I had 3 measurement points inside the calc, so the actual time per recursion without the measuring code could be around 68 milliseconds before the optimization and less than 1 millisecond after the optimization.

I then further optimized the calc by replacing the PatternCount function with Position, cutting down the time per recursion to about 5.6 milliseconds including the measuring overhead.

Since the measuring overhead is so big I can only estimate that I cut the actual time per recursion from 68 to 0.8 milliseconds by eliminating the unnecessary Substitute and from 0.8 to 0.5 milliseconds by replacing PatternCount with Position.

Anyway, as a result, the custom function now evaluates about twice as fast when it dives into 1 recursion, and over 100 times faster when diving into the 100 levels deep recursion.

The recursion depth directly depends on the number of enumerated HTML entities in the source text (instances of the “&#” string), so the more enumerated entities you have in your text the more you will benefit from this optimization.

And by the way, it still can be optimized further, but I’ll leave that to your elaboration… ;-)

Download Sample fp7 File (193 KB)

Enjoy, and please let me know your optimization experiences by leaving comments either here or at the FM Bench Detective page.

{ 16 comments… read them below or add one }

Daniel Wood September 14, 2011 at 3:30 pm

Hi HOnza, very cool stuff indeed. In the article you mention you used FM Bench to measure the individual recursions of the function, how did you achieve this? Is there a way to use FMBench within a custom function definition to keep track of each recusive call? If so that is very cool – could you elaborate on how this was done?

Cheers

Reply

HOnza September 14, 2011 at 11:03 pm

Yes, it definitely is possible. I have recorded the whole optimization as a screencast using Screenflow but it’s over an hour long, so I am first going to cut it to a reasonable length, and then share it.

I am also going to show this in my session at Pause[x]London

Bruce Robertson September 14, 2011 at 3:51 pm

Excellent article. You mention a question about what optimizations might come next. And you describe your initializer method as a source of much of the speed improvement. I wonder about splitting the function into an initializer custom function that calls a second custom function, which would contain the recursive operations.

Reply

HOnza September 14, 2011 at 11:11 pm

Thanks, Bruce. I was thinking about that, but splitting the custom function would make it less portable/harder to share. The depth variable does the trick with relatively smal overhead.

Actually, much more useful than removing this overhead would be optimizing the top-level instance – the Substitute.

I forgot to mention that after doing this optimization, my import script still takes almost an hour to run. I figured out that it’s because most of the records I import don’t contain any enumerated HTML entities. They contain mostly named entities. So the function almost never dives into the recursion.

It may seem funny, but the Replacement of PatternCount with Position had actually bigger impact on my script than avoiding the unnecessary Substitute in the recursion.

Jack Rodgers September 15, 2011 at 4:41 pm

You keep at this and you’ll finish the job before you start, then what will you do? I’ve used longer processes as an excuse to get a cup of coffee, etc. and now I’ll have to just keep plugging away… :)

Reply

HOnza September 15, 2011 at 11:37 pm

Heh, when you get so much efficient you won’t need excuses for things like coffee. You will be able to argue why your chief/client should pay you for doing nothing… :-)

LaRetta September 19, 2011 at 7:51 am

Hi HOnza,

“And by the way, it still can be optimized further, but I’ll leave that to your elaboration… ”

I had just created a custom function earlier the same day before I read this. In speed tests, my function handles the HTML in 3 seconds compared to 48 seconds for the custom function in the sample file (on same test data with restart between).

It was the 251 substitutions which hurt it. I wasn’t ready to publish it because it actually will handle more than HTML and I am considering additional improvements for search/replacing/evaluating within the strings but here is a link to the file:
http://directlinesolutions.com/solutions/BookEnds.fp7

Now if you can speed mine up, that would be wonderful as well! :^)
LaRetta

Reply

HOnza September 19, 2011 at 2:21 pm

Sounds great! Unfortunately your server does not serve sample file as a binary file. If you can change your server’s configuration, please make sure that files with the .fp7 extension are served as application/octet-stream, not as text/plain.
If you cannot change your server’s configuration, please compress the file with ZIP before sharing it.
I will be glad to present your optimization in my session on Pause[x]London if you don’t mind. And if I can optimize it even further I will let you know for sure ;-)

HOnza September 19, 2011 at 10:37 pm

OK, I have now checked your example and discovered that your function is faster because it does not do the work ;-)

Yes, you’re right that the 251 nested substitutions are what needs to be optimized, but you if you optimize it by simply removing it, then you’ll break the function.

Or am I missing something in your example?

LaRetta September 19, 2011 at 3:06 pm

Try zip extension now instead of fp7. I created it to solve &#8722 within text. I didn’t think anyone else would ever need it but I kept it because it can do a lot more than apply Char() to multiple strings within text.

If you cannot grab it, let me know email me and I’ll send it or upload it to free site. I don’t care what you do with it. I am just a freak for speed and the two issues seemed to come together (the HTML post on FileMaker that I was solving) and discovering this thread looking for HTML requirements (like the HTML table which was also in the attached file). :)

Reply

HOnza September 19, 2011 at 10:26 pm
LaRetta September 20, 2011 at 4:23 am

HOnza said, “if you optimize it by simply removing it, then you’ll break the function.”

Isn’t the purpose to take mixed text such as “Smith&#8722Wesson” and turn it into “Smith-Wesson” where each HTML code converts? It doesn’t need to be hard-coded into Substitutes() when FM provides us Char() which does the conversion for us.

Reply

HOnza September 20, 2011 at 2:21 pm

Char only helps with enumerated HTML entities, but not with named HTML entities. The large Substitute is supposed to decode named entities, such as á

LaRetta September 20, 2011 at 2:42 pm

I was provided only the &# strings in the requirements presented. Thanks for filling me in!

Reply

HOnza September 20, 2011 at 2:49 pm

You’re welcome.
So, the challenge is still valid ;-)

HOnza October 19, 2011 at 2:09 pm

I have been able to optimize the Substitution part during my flight from Phoenix to London. Going to show and discuss tomorrow at Pause[x]London and I will be posting results as an update here.

Reply

Leave a Comment