Before you read further: these findings are still relevant for Planning Analytics, even though the tests were carried out in TM1 10.2.
Many promising claims about performance improvements were made for TM1 10.2, most of which relate to the new multi-threading feature, which allows TM1 to utilise as many threads as the server has.
The pre-10.2 situation was pretty uninspiring, as the TM1 server could use only one thread per request, which seems quite outdated in an era when even mobile phones have 4 cores! That said, some initial steps in the right direction, such as multi-threaded server load, had already been made.
We’ve conducted some high-level tests on the performance of IBM Cognos TM1 10.2 versus TM1 10.1, and the results are simply astounding. We’re pleased to present them below:
Testing environment
All testing was performed in a virtual environment using snapshots, ensuring that the number of cores and the TM1 version were the only differences. No changes were made to the model, so the volume of information to be processed remained exactly the same.
Test hardware:
- CPU: Intel i7-2670QM with 4 cores, 8 threads
- RAM: 32GB
The virtual machine was allocated either 1 or 4 threads, along with 24GB of RAM.
Test software:
- TM1 10.1.1
- TM1 10.2.0
For TM1 10.2, multi-threaded queries must be enabled in the TM1 configuration file (MTQ=4 in our case).
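For illustration, the relevant fragment of the configuration file might look like the sketch below. The [TM1S] section header is an assumption based on a typical tm1s.cfg; only the MTQ=4 value comes from our test setup, and all other settings are omitted.

```ini
[TM1S]
MTQ=4
```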
Test cases:
- TM1 10.1 1 core
- TM1 10.1 4 cores
- TM1 10.2 1 core
- TM1 10.2 4 cores
There were some interesting TM1Server.log lines from TM1 10.2 with the “MTQ” configuration enabled:
“Cube loads are single threaded” – the multi-threaded cube load feature is separate from MTQ. Also, unlike multi-threaded cube load, MTQ on its own does not increase RAM consumption. Refer to our previous blog post, which explains the basics of multi-threading in TM1, including Parallel Interaction.
“Server threading mode is thread per connection” – oddly enough, this line appeared even though MTQ was enabled.
Test results: (all times are in mm:ss format)
TM1 Version/Number of threads | Server start up | Cube View Load | Rule Recalculation | Flat file load | Reprocessing feeders |
TM1 10.1 / 1 | 18:13 | 1:00 | 17:40 | 6:33 | 18:54 |
TM1 10.1 / 4 | 18:01 | 0:58 | 17:35 | 6:31 | 18:47 |
TM1 10.2 / 1 | 18:05 | 1:12 | 18:55 | 4:32 | 17:26 |
TM1 10.2 / 4 | 15:50 | 0:23 | 18:10 | 4:37 | 16:47 |
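To make the table easier to compare, here is a minimal Python sketch that converts the mm:ss timings to seconds and computes speedup ratios between any two test cases. The timings are copied from the table above; the names `RESULTS`, `to_seconds` and `speedups` are our own, chosen for this illustration.

```python
# Timings copied from the results table (mm:ss strings).
TASKS = [
    "Server start up",
    "Cube View Load",
    "Rule Recalculation",
    "Flat file load",
    "Reprocessing feeders",
]

RESULTS = {
    ("10.1", 1): ["18:13", "1:00", "17:40", "6:33", "18:54"],
    ("10.1", 4): ["18:01", "0:58", "17:35", "6:31", "18:47"],
    ("10.2", 1): ["18:05", "1:12", "18:55", "4:32", "17:26"],
    ("10.2", 4): ["15:50", "0:23", "18:10", "4:37", "16:47"],
}


def to_seconds(mmss: str) -> int:
    """Convert a 'mm:ss' string to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def speedups(baseline: tuple, candidate: tuple) -> dict:
    """Return baseline_time / candidate_time per task (above 1.0 means faster)."""
    return {
        task: round(to_seconds(base) / to_seconds(cand), 2)
        for task, base, cand in zip(TASKS, RESULTS[baseline], RESULTS[candidate])
    }


if __name__ == "__main__":
    # Effect of going from 1 to 4 threads on TM1 10.2.
    for task, ratio in speedups(("10.2", 1), ("10.2", 4)).items():
        print(f"{task}: {ratio:.2f}x")
```

On our numbers, the 10.2 Cube View Load comes out at roughly 3.1x faster with four threads than with one, while the other tasks stay close to 1x.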
The small general improvement from moving from 1 thread to 4 comes from the fact that TM1 still uses one thread, while the other three remain available to the operating system and other background processes.
Most tasks in TM1 with MTQ enabled still use one thread, with the exception of server start up and Cube View Load. While the former used four threads only at the very end of the process, the latter used four threads for the whole process, which explains the massive difference.
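One rough way to sanity-check this observation is to invert Amdahl's law, S = 1 / ((1 - p) + p / n), and estimate the fraction p of each task that ran in parallel on n = 4 threads. This is only a back-of-the-envelope sketch: it assumes the 1-thread and 4-thread runs on TM1 10.2 are otherwise identical, and the function name `parallel_fraction` is ours. The timings are the 10.2 figures from the results table, converted to seconds.

```python
def parallel_fraction(speedup: float, threads: int) -> float:
    """Estimate the parallel fraction p from Amdahl's law.

    Solving S = 1 / ((1 - p) + p / n) for p gives
    p = (1 - 1 / S) / (1 - 1 / n).
    """
    if threads < 2:
        raise ValueError("need at least 2 threads to estimate a parallel fraction")
    return (1 - 1 / speedup) / (1 - 1 / threads)


# TM1 10.2 timings in seconds: 1 thread versus 4 threads.
cube_view = parallel_fraction(72 / 23, 4)      # 1:12 versus 0:23
startup = parallel_fraction(1085 / 950, 4)     # 18:05 versus 15:50

print(f"Cube View Load parallel fraction:  {cube_view:.2f}")
print(f"Server start up parallel fraction: {startup:.2f}")
```

The estimate comes out at about 0.91 for Cube View Load and about 0.17 for server start up, which is in line with four threads being busy for the whole view load but only at the very end of start up.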
Also, for a reason yet to be explained, loading the CSV file took about 30% less time in 10.2 than in 10.1 (roughly 4:30 versus 6:30), which is definitely a good change.
To sum up, this change is massive: TM1 is now able to utilise all 8, 16, 32 or more cores!