JaamSim Blog: 2014

Sunday, November 30, 2014

The Fastest Simulation Software

One of my goals for 2014 was to perform the execution speed benchmarks described in my previous posts on a wide selection of mainstream simulation software packages. The software packages were compared by measuring the execution times for three basic activities:

executing a delay block
creating and destroying an entity
seizing and releasing a resource

Note that in previous posts, the third benchmark was the time to execute a "simple" block, which was taken to be one-half of the time to seize and release a resource. The new benchmark avoids this inference and presents the quantity that was actually measured.

Execution times were measured by counting the number of entities processed in 60 seconds, as timed with a stopwatch. The posted results are the averages of three runs for each benchmark. All the runs were performed on the same laptop computer with a second-generation ("Sandy Bridge") Core i5 processor running at 2.5 GHz. To make the results less machine-specific, execution times were converted to clock cycles.
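The conversion from an entity count to clock cycles can be sketched as follows. The function name and the example count are illustrative assumptions, not values taken from the benchmark runs.

```python
def cycles_per_entity(entities_processed, run_seconds=60.0, clock_hz=2.5e9):
    """Average clock cycles spent per entity during a timed run."""
    if entities_processed <= 0:
        raise ValueError("entities_processed must be positive")
    total_cycles = run_seconds * clock_hz  # cycles available during the run
    return total_cycles / entities_processed


# Hypothetical run: 100 million entities counted in 60 seconds at 2.5 GHz.
print(cycles_per_entity(100_000_000))
```

Averaging three runs, as described above, would simply mean calling this on each count and taking the mean.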

The following bar graph presents the results of the comparison.

Previous versions of this graph were labelled as "preliminary" to provide a chance for the vendors or other interested parties to improve the benchmark models for the individual software packages. Now that I have corresponded with most of the vendors and incorporated their suggestions, these results can be considered to be final for the specified version of each software package.

Revision 1: New results for Arena. Selecting the "Run > Run Control > Batch Run (No Animation)" option speeds up the benchmarks by a factor of ten. My thanks to Jon Santavy and Alexandre Ouellet for pointing this out.

Revision 2: New results for SIMUL8. Setting travel times between objects to zero resulted in a large improvement for the first and third benchmarks. The default setting for SIMUL8 is to assign a travel time in proportion to the distance between work centres, which increased the computational effort for these two benchmarks. The setting can be changed to zero travel time by selecting File > Preferences > Distance. My thanks to Sander Vermeulen for auditing and correcting the benchmark models.

Revision 3: New results for FlexSim. Execution speed was increased by approximately 10% by closing the “Model View (3D)” window during the run. The seize/release resource benchmark was also added to the result. After gaining more experience with FlexSim, it became clear that the Processor and Operator objects used for the first and third benchmarks are more complex objects than the simple seize, release, resource, and delay blocks that these two benchmarks are intended to evaluate. Since each object performs a series of actions to process an incoming entity, rather than just a single action, the results for the two benchmarks cannot be compared on a like-to-like basis with the other software packages. My thanks to Bill Nordgren for his help with the benchmarks.

Revision 4: New results for ExtendSim. Execution speeds were increased significantly by replacing the "Activity" blocks in Benchmarks 1 and 3 with "Workstation" blocks. The Workstation block is faster because it supports less functionality (pre-emption, shut-downs, and state statistics) than the Activity block, decreasing its overhead. It may be possible for ExtendSim users to increase execution speed further by creating a customized Activity block with any unneeded functionality stripped from the ModL code. My thanks to Peter Tag for his guidance on ExtendSim.

Revision 5: New results for Simio. The entity creation and destruction benchmark was revised to use tokens instead of entities. All three benchmarks are now token-based. Tokens were used for the benchmark because they provide the same capabilities as the basic entity provided by some of the other simulation packages. The corresponding times for Simio's entity-based benchmarks are many times longer than the token-based ones. My thanks to David Sturrock for preparing the new benchmark models.

Revision 6: New results for Arena. All three benchmark models were revised to use elements instead of modules to avoid unnecessary overhead. It is common practice for Arena modellers to avoid modules in large models where execution speed is an important factor. The module-based benchmark times are about 50% longer than the element-based times. My thanks to Alexandre Ouellet for preparing the new benchmark models.

Revision 7: Results for Simio 7.119. The latest release of Simio shows a significant improvement in execution speed for seizing and releasing a resource (using tokens). Processing time was reduced from 21,000 clock cycles for release 7.114 to 4,300 clock cycles for release 7.119.

I should caution readers not to put very much importance on differences in execution speeds of less than a factor of two. Ease of use and suitability for the system to be modelled will always be the primary criteria for selecting simulation software. Execution speed is only important if it is a significant time saver for the analyst (impacting ease of use) or if it is required for the model to be tractable (impacting suitability for the system to be modelled). For some systems, even very slow software will be quite fast enough.

To put the execution times in perspective, consider the number of times an activity can be performed in one minute of processing time on a 2.5 GHz laptop computer. An activity requiring 150 clock cycles can be executed one billion times in one minute. One requiring 1,500 clock cycles can be executed 100 million times in one minute. Even one requiring 150,000 clock cycles can be executed one million times in one minute. In most cases, the fastest benchmark times measured by this exercise will be important only for very large models that require hundreds of millions of blocks to be executed over the course of each simulation run.

SLX holds the crown as the fastest of the simulation packages, with times of 60, 110, and 230 clock cycles. However, it differs from the others in that it is strictly a programming language and lacks any drag & drop capabilities. JaamSim is the fastest of the simulation packages that include a drag & drop interface, with times of 240, 530, and 2,400 clock cycles. Arena, SIMUL8, AnyLogic, and Simio turn in solid results in the 1,000 - 10,000 clock cycle range. ExtendSim is significantly slower, with one benchmark time that exceeds 20,000 clock cycles. The results for FlexSim are not directly comparable to the other software packages and are provided for reference only (see the notes for Revision 3).

Sunday, September 7, 2014

Another Big Performance Increase for Release 2014-36

Faster Event Processing

Harvey Harrison has done it again -- event scheduling and processing is about 50% faster in this week's release (2014-36). On my laptop, the benchmark runs at nearly 8 million events/second compared to 5 million last week. It wasn't many months ago when I was excited about 2 million events/second. The following graph shows the full results for the benchmark.

SLX Software

A new entry on the above graph is the result for SLX from Wolverine Software. The developer, Jim Henriksen, had sent me this result several months ago along with the input file. The results for the other two benchmarks were described in a post Jim made to the LinkedIn group for the Society for Modeling & Simulation International a few weeks ago. You can read his post here.

Ideally, I should re-run the SLX benchmarks myself, but I have not yet found the time to explore SLX enough to do so. In the interest of keeping things moving, the results Jim provided are shown in all the latest graphs.

It is fair to say that SLX sets the gold standard for execution speed. This is one of its key features, and SLX achieves this status by being the only simulation software to provide a compiler for its proprietary programming language. Somewhat more effort is required to build an SLX model -- there is no drag and drop model building environment -- but the pay-off is the shortest possible execution time.

Concluding Remarks

With Harvey's latest improvements, JaamSim processes events about 6 times faster than Simio, but is still about 2.7 times slower than SLX. JaamSim's event processing code is unlikely to get any faster -- all reasonable optimizations have been made already -- so we will have to accept that SLX is faster in this area. To be completely honest, we were pleasantly surprised to have got this close to SLX.

To put the event processing speed differences in perspective, consider a model that requires one billion events to be executed. On my 2.5 GHz laptop, the event processing portion of the simulation run would total 13 minutes in Simio, 2.1 minutes in JaamSim, and 0.8 minutes in SLX. For this number of events, the difference between SLX and JaamSim amounts to only 1.3 minutes out of what is likely to be a much longer total duration for the run. Nearly 10 billion events would be required for the difference between SLX and JaamSim to reach 10 minutes. Beyond 10 billion events, SLX is likely to have a significant advantage over JaamSim. Below 1 billion events, the difference is likely to be insignificant.
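The comparison above can be reproduced with a few lines of arithmetic. This is only a sketch: the per-billion times are the rounded figures quoted in this post, and the function names are made up for illustration.

```python
# Minutes of event processing per billion events, as quoted in the post.
MINUTES_PER_BILLION = {"Simio": 13.0, "JaamSim": 2.1, "SLX": 0.8}


def processing_minutes(package, events):
    """Estimated event-processing time in minutes for a given event count."""
    return MINUTES_PER_BILLION[package] * events / 1e9


def events_for_gap(slower, faster, gap_minutes):
    """Events needed before `slower` trails `faster` by gap_minutes."""
    gap_per_billion = MINUTES_PER_BILLION[slower] - MINUTES_PER_BILLION[faster]
    if gap_per_billion <= 0:
        raise ValueError("slower package must have the larger time per billion events")
    return gap_minutes / gap_per_billion * 1e9


for name in MINUTES_PER_BILLION:
    print(f"{name}: {processing_minutes(name, 1e9):.1f} min per billion events")
print(f"10-minute gap, JaamSim vs SLX: {events_for_gap('JaamSim', 'SLX', 10.0):.2e} events")
```

With these rounded inputs, the 10-minute gap falls at several billion events, the same order of magnitude as the figure given above.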

Now that we have good performance for event processing, our next objective is to bring JaamSim's entity creation/destruction time into line with the other benchmarks. SLX achieves its excellent result for this benchmark by providing an internal entity pool, so that used entities can be recycled instead of being destroyed. Up to now, we have shied away from an entity pool for JaamSim, preferring to focus our attention on making entity creation more efficient. Moving forward, we may have to reconsider various ways to implement an entity pool.
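For readers unfamiliar with the idea, a generic entity pool might look like the sketch below. It is not how SLX implements its pool, nor a design for JaamSim; the Entity and EntityPool classes and their methods are hypothetical names chosen for illustration.

```python
class Entity:
    """Hypothetical simulation entity with a name and free-form attributes."""

    def __init__(self):
        self.name = None
        self.attributes = {}

    def reset(self):
        # Clear per-use state so a recycled entity starts clean.
        self.name = None
        self.attributes.clear()


class EntityPool:
    """Recycles released entities instead of discarding them."""

    def __init__(self):
        self._free = []    # entities waiting to be reused
        self.created = 0   # number of entities actually constructed

    def acquire(self, name):
        if self._free:
            entity = self._free.pop()   # reuse a recycled entity
        else:
            entity = Entity()           # pool is empty: construct a new one
            self.created += 1
        entity.name = name
        return entity

    def release(self, entity):
        entity.reset()
        self._free.append(entity)


pool = EntityPool()
for i in range(1000):
    e = pool.acquire(f"Ent{i}")
    pool.release(e)
print(pool.created)  # only one Entity was ever constructed
```

The trade-off is that every field of a recycled entity must be reset correctly, which is one reason to prefer making creation itself cheaper where that is possible.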

Thursday, August 28, 2014

Huge Performance Increase for Release 2014-35

Some programming magic by Harvey Harrison and Matt Chudleigh has resulted in a major increase in JaamSim's execution speed for release 2014-35. The following graph shows the latest result for Benchmark 1 compared to those in my previous posts.

Benchmark 1 - Event Scheduling and Processing


The bottom line on the graph is the result for JaamSim2014-35 with the graphics for the EntityDelay object turned off. Normally, EntityDelay shows the various entities moving along a line representing the fraction of time completed for each delay. Unfortunately, this graphic requires the data for each entity to be stored in a HashMap, adding considerable overhead unrelated to scheduling and executing events. A more realistic benchmark is obtained when this overhead is turned off.

A practical example of JaamSim's speed advantage was revealed a few weeks ago when we delivered a very complex supply chain model to a mining industry client for their internal use. The model was prepared in our TLS software -- an add-on to JaamSim. The TLS model required only 7 minutes to complete a simulation run, while the less-detailed Arena model it replaced required 35 minutes. Our client was very pleased by this unexpected boost to his productivity.

Friday, August 22, 2014

What is in a name?

Selecting a name for a new software product such as JaamSim can be a vexing process. Names are important -- a rose by any other name might smell as sweet -- but it would not sell very well if it was hard to pronounce or evoked something negative.

The Igor Naming Guide provided some excellent advice, but it was hard to put it into practice. We thought of all sorts of clever, evocative names only to find that they had been used twice over or that someone was already sitting on the .com domain name. Even the names we had rejected as being too dumb or cute had been used already.

According to Igor, an acronym is one of the worst possible ways to name something. However, we were getting desperate and the name JamSim -- "Java Animation, Modeling, and Simulation" -- seemed promising. That is, it did until a Google search showed that it had already been used for "Java MicroSimulation". The jamsim.com domain name was available, but there were too many references to JamSim for this name to be a viable option. How about "JaamSim" then?

The name JaamSim seemed pretty weak until a search on Google revealed that the "Jaam-e Jam" or Cup of Jamshid is a magical wine bowl from Persian mythology. By looking into the bowl, one could see people and events taking place in other locations:

"The whole world was said to be reflected in it, and divinations within the cup were said to reveal deep truths."

This was great! I had been interested in Persian culture ever since reading The Rubáiyát of Omar Khayyám at an impressionable age. The idea that our software might "reveal deep truths" had me hooked, so the name "JaamSim" won out in the end.

Now that you know the story of its name, perhaps JaamSim will help you to reveal your own deep truths.

Monday, July 21, 2014

Benchmarking Part 3 - Executing Model Blocks

Introduction

My first post on the topic of benchmarking discrete event simulation software identified three processes that could potentially bottleneck a typical simulation model:

Event scheduling and processing
Creating and destroying entities
Executing model blocks that do not involve simulated time

This post deals with the last item on the list - the time required to execute a model block that does not involve simulated time. This benchmark may seem a bit vague since there are many types of model blocks that do not advance simulated time. Its intent is to capture the overhead associated with moving an entity from one block to the next in a process flow type simulation model.

In a perfect world, we would benchmark a wide variety of blocks for each simulation software package. No doubt, the efficiency of each software package will vary with the type of block. Software A might be much more efficient than Software B for one block, but much less efficient for another block. To get started, we chose to benchmark the blocks that seize and release a resource. These blocks are commonly used in simulation models and are implemented in one form or another in every simulation software package. Very little computation is required to seize or release a resource - only statistics collection - so we expect that it provides an approximate measure of the overhead time to move an entity from one block to another.

Model Block Execution Benchmark

The model used to benchmark the execution of model blocks that do not advance simulated time is shown in the following figure.

In this model, two entities are created at time zero and directed to the Seize block. The first entity seizes the resource and executes a one second delay. The second entity enters the Seize block's queue to wait for the resource. On completing the one second delay, the first entity releases the resource and is returned to the Seize block. This process continues endlessly, with one entity completing the delay during each second of simulated time. Two entities were used in the model to ensure that the Seize block always had an entity to process, avoiding a potential source of inefficiency for some software packages.
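The logic of this benchmark model can be sketched as a tiny event-driven loop. This is an illustrative reconstruction only, not the model file used for any of the packages: the function name is hypothetical, and the loop stops at a simulated end time rather than after 60 seconds of real time.

```python
import heapq
from collections import deque


def run_seize_release(sim_end, n_entities=2, delay=1.0):
    """Count how many times the Release step runs up to simulated time sim_end."""
    events = []        # min-heap of (completion_time, sequence, entity)
    queue = deque()    # entities waiting in the Seize block's queue
    busy = False       # True while the single resource is held
    releases = 0
    seq = 0            # tie-breaker so heap entries always compare cleanly

    def seize(entity, now):
        nonlocal busy, seq
        if busy:
            queue.append(entity)    # resource in use: wait in the queue
        else:
            busy = True             # take the resource and start the delay
            heapq.heappush(events, (now + delay, seq, entity))
            seq += 1

    # All entities are created at time zero and sent to the Seize block.
    for entity in range(n_entities):
        seize(entity, 0.0)

    while events and events[0][0] <= sim_end:
        now, _, entity = heapq.heappop(events)
        busy = False                # Release: the delay has finished
        releases += 1
        if queue:
            seize(queue.popleft(), now)   # the waiting entity takes the resource
        seize(entity, now)                # the released entity returns to Seize
    return releases


print(run_seize_release(sim_end=10.0))
```

Because the resource is always busy, one entity completes the delay in each second of simulated time, so the release count matches the simulated duration in seconds.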

As with the previous benchmarks, the average time to seize and release a resource was measured by running the model for 60 seconds of real time (using a stopwatch) and counting the number of times the Release block was executed. The effect of computer speed was allowed for by converting the calculated time into clock cycles. All measurements were made using my laptop computer, which has a second-generation ("Sandy Bridge") Core i5 processor running at 2.5 GHz.

Performance Results

The results of the benchmark for Arena, Simio and JaamSim are shown in the following bar chart.

The time to execute a model block was calculated by taking the average execution time per entity for the benchmark, subtracting the time to execute the delay, and dividing by two. The time for the delay was taken from the first benchmark for each software package. It was necessary to divide by two since two blocks were executed for each trip through the benchmark - a seize block and a release block.
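The calculation described above amounts to the following. The function name and the example inputs are hypothetical and are not measured results.

```python
def block_time_cycles(benchmark_cycles_per_entity, delay_cycles, blocks_per_trip=2):
    """Per-block time: remove the delay's share, then split across the blocks."""
    if benchmark_cycles_per_entity < delay_cycles:
        raise ValueError("benchmark time should not be less than the delay time")
    return (benchmark_cycles_per_entity - delay_cycles) / blocks_per_trip


# Hypothetical inputs: 2,000 cycles per trip, of which 400 are the delay.
print(block_time_cycles(2000, 400))
```

The default of two blocks per trip reflects the one Seize and one Release block executed on each pass through the model.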

Only the result for JaamSim2014-20 is shown - there was no difference between the values for the three versions shown in the previous posts.

The benchmark results show that JaamSim requires very little time to process simple blocks such as Seize and Release.