TM1 Servers with TBs of Memory

User avatar
mce
Community Contributor
Posts: 352
Joined: Tue Jul 20, 2010 5:01 pm
OLAP Product: Cognos TM1
Version: Planning Analytics Local 2.0.x
Excel Version: 2013 2016
Location: Istanbul, Turkey

TM1 Servers with TBs of Memory

Post by mce »

Hello,

I would like to hear about experiences with large TM1 servers with close to, or more than, 1 TB of memory.
Is there anyone who has worked with TM1 servers with 1 TB of memory or more?
What are the major challenges, and how do you overcome them?
Thanks in advance for all replies.

Regards,
tomok
MVP
Posts: 2832
Joined: Tue Feb 16, 2010 2:39 pm
OLAP Product: TM1, Palo
Version: Beginning of time thru 10.2
Excel Version: 2003-2007-2010-2013
Location: Atlanta, GA

Re: TM1 Servers with TBs of Memory

Post by tomok »

mce wrote: Mon Nov 30, 2020 10:00 am Hello,

I would like to hear about experiences with large TM1 servers with close to, or more than, 1 TB of memory.
Is there anyone who has worked with TM1 servers with 1 TB of memory or more?
What are the major challenges, and how do you overcome them?
Thanks in advance for all replies.

Regards,
Why do you think there would be challenges? It's not like the server is going to freak out or lose track of memory pointers just because it has 1 TB versus 1 GB of RAM. The only thing I would be concerned about is having enough CPU cores to handle the calculation demands if you have an extremely large cube whose consolidated elements span literally billions of intersections.
Tom O'Kelley - Manager Finance Systems
American Tower
http://www.onlinecourtreservations.com/
User avatar
gtonkin
MVP
Posts: 1199
Joined: Thu May 06, 2010 3:03 pm
OLAP Product: TM1
Version: Latest and greatest
Excel Version: Office 365 64-bit
Location: JHB, South Africa

Re: TM1 Servers with TBs of Memory

Post by gtonkin »

Looks like the ideal thread for Lotsaram to respond to ;)
User avatar
macsir
MVP
Posts: 782
Joined: Wed May 30, 2012 6:50 am
OLAP Product: TM1
Version: PAL 2.0.9
Excel Version: Office 365

Re: TM1 Servers with TBs of Memory

Post by macsir »

RAM=Storage
CPU=Speed
Lotsaram alone is just not enough; I'd prefer Tonsaram. :D
In TM1, the answer is always yes, though sometimes with a but....
http://tm1sir.blogspot.com.au/
User avatar
mce
Community Contributor
Posts: 352
Joined: Tue Jul 20, 2010 5:01 pm
OLAP Product: Cognos TM1
Version: Planning Analytics Local 2.0.x
Excel Version: 2013 2016
Location: Istanbul, Turkey

Re: TM1 Servers with TBs of Memory

Post by mce »

tomok wrote: Mon Nov 30, 2020 12:39 pm Why do you think there would be challenges?
Because sometimes size does matter a lot :)
User avatar
jim wood
Site Admin
Posts: 3951
Joined: Wed May 14, 2008 1:51 pm
OLAP Product: TM1
Version: PA 2.0.7
Excel Version: Office 365
Location: 37 East 18th Street New York

Re: TM1 Servers with TBs of Memory

Post by jim wood »

mce wrote: Tue Dec 01, 2020 1:09 pm
tomok wrote: Mon Nov 30, 2020 12:39 pm Why do you think there would be challenges?
Because sometimes size does matter a lot :)
Large = volume, which is not always interesting. Before the days of 64-bit there were some very small but very complex models around. They were very interesting, as there was no overfeeding. A lot of the time, large also = lazy.
Struggling through the quagmire of life to reach the other side of who knows where.
Shop at Amazon
Jimbo PC Builds on YouTube
OS: Mac OS 11 PA Version: 2.0.7
David Usherwood
Site Admin
Posts: 1454
Joined: Wed May 28, 2008 9:09 am

Re: TM1 Servers with TBs of Memory

Post by David Usherwood »

This thread triggered some memories, so I went looking back through our records. I found a response from James Fleming at Applix dated 15 Jul 2005 saying that (at the time) 64-bit Windows supported 16 TB. I also found a more recent post,
https://www.compuram.de/blog/en/how-muc ... g-systems/, with various stats; it looks like Windows Server 2019 can address 24 TB.
User avatar
mce
Community Contributor
Posts: 352
Joined: Tue Jul 20, 2010 5:01 pm
OLAP Product: Cognos TM1
Version: Planning Analytics Local 2.0.x
Excel Version: 2013 2016
Location: Istanbul, Turkey

Re: TM1 Servers with TBs of Memory

Post by mce »

Here are my observations:

Mostly, large memory is the result of large data volumes, and large data volumes require more processing time to achieve proper performance. That is only possible via multithreading or parallel processing, in both querying and data processing, and that is where you start hitting barriers and inefficiencies. Here are some of the bottlenecks; the relevant tm1s.cfg parameters are sketched after the list.

1) MTCubeLoad=T does not work for all cubes at all times. When it does not work for large cubes, server start times become far too long.
2) When executing RunTI in multiple sessions concurrently to process data in multiple threads, some connections to the TM1 server fail in some of the parallel threads (CAM authentication). RunProcess cannot always replace RunTI.
3) Parallel execution of TI processes keeps causing unnecessary contention. For example, the DIMIX function causes contention on some dimensions, but not others, when executed concurrently by multiple TI processes.
4) MTQ limits the number of cores used by a single query, but if two queries execute at the same time they battle for the cores, CPU utilization easily hits 100%, and any third or fourth query is blocked. Hence two long queries can drag down performance for the whole system.
5) Multithreaded loading of cubes into memory using MaximumCubeLoadThreads=N creates a lot of garbage memory. The larger N is, the more garbage you get.
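
For reference, here is a minimal tm1s.cfg sketch of the parameters named above; the values are purely illustrative, not recommendations:

Code: Select all

[TM1S]
# Cores available to a single multi-threaded query
MTQ=8
# Multi-threaded cube load at server startup
MTCubeLoad=T
# Persist calculated feeders to disk to shorten startup
PersistentFeeders=T
# Older parallel cube load parameter - see the garbage caveat in point 5
MaximumCubeLoadThreads=4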
lotsaram
MVP
Posts: 3652
Joined: Fri Mar 13, 2009 11:14 am
OLAP Product: TableManager1
Version: PA 2.0.x
Excel Version: Office 365
Location: Switzerland

Re: TM1 Servers with TBs of Memory

Post by lotsaram »

gtonkin wrote: Mon Nov 30, 2020 3:50 pm Looks like the ideal thread for Lotsaram to respond ;)
A little bit late to the party on this one. I think mce has actually hit on most of the key points already and done a good job of summarizing.

I have a few customers with systems in the several-hundred-GB to just-over-1-TB territory. Typically it isn't the whole system that is massive, but a handful of really large cubes of 250 GB+ and maybe some large dimensions of 1M+ elements that lead to a system being "big". The rest of the objects are completely ordinary and unremarkable.

For any cube that is gigantic (over 200 GB is my point of reference), performance is key. Avoid N: rule calculations at all costs and pre-calculate during the load everything that can be (e.g. do those currency calculations really need to be rule-calculated?). Data loads can always be parallelized to use the CPU and optimize throughput. Making queries as fast as possible for the users hitting the cube is the most important thing to keep in mind. "Old fashioned" BI strategies of summary reporting cubes with drill-through to detail when needed can actually be a good way to achieve the best performance for users (and if you are pre-calculating on load anyway, why not?)
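
As a hedged illustration of the pre-calculate-on-load point, here is a TI data tab sketch, with all cube, measure and variable names invented, that writes the converted value at load time so no N: rule or feeder is needed for it:

Code: Select all

# Data tab sketch (names invented). vMonth, vCurrency, vCustomer,
# vProduct and vAmount are assumed source variables.
nRate = CellGetN('FX Rates', vMonth, vCurrency, 'Avg Rate');
CellPutN(vAmount, 'Sales', vMonth, vCustomer, vProduct, 'Local Amount');
CellPutN(vAmount * nRate, 'Sales', vMonth, vCustomer, vProduct, 'Reporting Amount');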

For any system this big there will be tons of records to load. Loading millions of records per day is going to take some time no matter how efficiently you code it. So think about how to reduce it ...
- drip-feed transactions from the source system so only new records are loaded
- parallelize loads
- use separate data load instances or staging cubes that are not accessible to users to absorb the load time, and keep the window of writing to user-facing cubes as small as possible
- avoid locks and contention at all costs (e.g. splitting dimension and fact updates is not enough; make sure all fact updates use temp objects and can't lock each other)
- optimize cube dimension order for the smallest commit, based on the load parallelization strategy
- as commit is single-threaded, design to avoid commit queues forming (e.g. stagger loads by some seconds, or parcel the load queries so they aren't all the same size; you don't want all your parallelized load threads finishing at the same time and forming a commit queue - see the sketch below)
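
A minimal sketch of the fan-out idea, assuming PA's RunProcess and with process and parameter names invented; monthly slices naturally differ in record count, which helps keep the commits from all landing at once:

Code: Select all

# Prolog tab of a hypothetical dispatcher process (all names invented).
# One child load per month; because the slices differ in size, the
# parallel threads tend not to reach their commit phase together.
i = 1;
WHILE(i <= 12);
  sMonth = 'M' | NumberToString(i);
  RunProcess('Load.Transactions', 'pMonth', sMonth);
  i = i + 1;
END;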

Use MTQ and PersistentFeeders. Experiment to see what the best MTQ value is, but generally the more the better (unless you have a beast of a server with 128 CPUs, in which case more can actually get slower due to the overhead of parcelling out and reassembling views).

MaximumCubeLoadThreads is now redundant. It doesn't play well with MTCubeLoad; if you are using MTQ and MTCubeLoad, switch MaximumCubeLoadThreads off.
Please place all requests for help in a public thread. I will not answer PMs requesting assistance.
User avatar
gtonkin
MVP
Posts: 1199
Joined: Thu May 06, 2010 3:03 pm
OLAP Product: TM1
Version: Latest and greatest
Excel Version: Office 365 64-bit
Location: JHB, South Africa

Re: TM1 Servers with TBs of Memory

Post by gtonkin »

Some great feedback, Lotsaram!

For my 5 cents, I cannot agree more on pre-calculating N: level elements. On a current project I need to derive price, standard cost, logistic fee % etc. for Actuals. It would be crazy to have rules derive these and then still need to feed them: the data keeps growing daily, and with it the memory required to calculate and feed, and read performance would suffer as well. I cannot get away from C: level rules, however, as these are basically ratios that cannot be aggregated. My currency translations I can do on load where required, and they then simply aggregate.
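
To make the C:-only point concrete, here is a one-line rule sketch with invented measure names: the ratio is calculated at consolidated level only, so it needs no feeders, while the loaded N: data aggregates naturally:

Code: Select all

# Cube rule sketch (measure names invented): C: level only, nothing to feed.
['Margin %'] = C: ['Gross Profit'] \ ['Net Revenue'];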

Working with large dimensions (>500,000 elements) brings many challenges. Ever opened one of these in the subset editor with the properties window enabled? Working with these dimensions gets exponentially slower, and unwinding roll-ups can take hours and hours if you are not careful or your methodology is flawed.
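
For the unwinding case, a TI loop is usually far quicker and safer than the subset editor. A minimal sketch, assuming the dimension and roll-up names arrive as parameters (pDim and pRollup are hypothetical):

Code: Select all

# Prolog tab sketch: remove all component links directly under pRollup.
# Iterating from the last component down avoids index shifts as links go.
nKids = ELCOMPN(pDim, pRollup);
WHILE(nKids >= 1);
  sChild = ELCOMP(pDim, pRollup, nKids);
  DimensionElementComponentDelete(pDim, pRollup, sChild);
  nKids = nKids - 1;
END;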

Something also worth mentioning on systems where you are loading via ODBC is that garbage memory can build up significantly too. One system of around 100 GB often accumulates an additional 30 GB of garbage, and the instance needs to be restarted periodically to clear it.

Lotsaram also covered this in his post, but it is worth mentioning again: dimension order, if poorly chosen, can significantly impact not only performance but also memory usage. I have had largish cubes of roughly 25 GB cut down by over 50%. Rationalising helps too, e.g. replacing multiple period dimensions with a single continuous time dimension. I made this change on a 60 GB model that took about an hour to start; it shrank to about 5 GB and started in a couple of minutes. Too many calculations were being done to derive MTD, YTD etc. that became natural aggregations in a continuous time dimension.

Similarly, poorly written feeders can also be a drain on memory and performance.

I know this thread was not about performance, but I think these are things developers need to consider and test, otherwise they may well need TBs of RAM to accommodate a poorly developed and implemented solution.

P.S. On the virtual side, every IT guy tells me that allocating too much memory (that will sit free/available) to the VM can have negative performance impacts. I still cannot fathom this myself, but it may be worth delving into this myth if you are virtualised and looking to go big.
User avatar
mce
Community Contributor
Posts: 352
Joined: Tue Jul 20, 2010 5:01 pm
OLAP Product: Cognos TM1
Version: Planning Analytics Local 2.0.x
Excel Version: 2013 2016
Location: Istanbul, Turkey

Re: TM1 Servers with TBs of Memory

Post by mce »

Thanks for all the valuable comments.
gtonkin wrote: Sat Dec 05, 2020 2:10 pm Working with large dimensions (>500,000 elements) brings many challenges. Ever opened one of these in the subset editor with the properties window enabled? Working with these dimensions gets exponentially slower, and unwinding roll-ups can take hours and hours if you are not careful or your methodology is flawed.
I discovered a method for keeping customer-level data for many millions of customers (more than 20 million) in TM1 cubes without a customer dimension of multi-million elements. I will write up the details in a separate article. This approach allowed me to hold a very large amount of customer-level data in TM1 cubes, requiring very large TM1 instances serving billions of populated cells. It solved many challenges of that scale, but the ones below are still pending a solution, which obviously has to come from IBM.

1) MTCubeLoad=T does not work for all cubes at all times. When it does not work for large cubes, server start times become far too long.
2) When executing RunTI in multiple sessions concurrently to process data in multiple threads, some connections to the TM1 server fail in some of the parallel threads (CAM authentication). RunProcess sometimes helps with this, but cannot always replace RunTI.
4) MTQ limits the number of cores used by a single query, but if two queries execute at the same time they battle for the cores, CPU utilization easily hits 100%, and any third or fourth query is blocked. Hence two long queries can drag down performance for the whole system.
lotsaram
MVP
Posts: 3652
Joined: Fri Mar 13, 2009 11:14 am
OLAP Product: TableManager1
Version: PA 2.0.x
Excel Version: Office 365
Location: Switzerland

Re: TM1 Servers with TBs of Memory

Post by lotsaram »

mce wrote: Mon Dec 07, 2020 9:03 am 2) When executing RunTI in multiple sessions concurrently to process data in multiple threads, some connections to the TM1 server fail in some of the parallel threads (CAM authentication). RunProcess sometimes helps with this, but cannot always replace RunTI.
Actually, this is a really key point that I forgot to mention!
More of a threading issue than a "big memory" one. But when you have big memory, big data, millions and millions of records to process, and load windows to hit, you may be loading on many, many threads at the same time. At one customer, where we were using tm1runti to load on somewhere between 60 and 80 concurrent threads, we had a rare but intermittently recurring random issue where a thread would "go missing" and fail to commit. (As in: in a session monitor you see the thread authenticate and start work, but at some point it disappears and never commits or logs out. And you see the same thing in tm1server.log: a thread starts but then disappears from the log and never logs out.) This seemed to happen only when loading on 40+ threads. The eventual consensus was that this is an OS issue, where Windows decides to terminate a thread due to some resource allocation decision, and not directly a fault of TM1. But it caused big headaches in the project and a lot of extra work investigating the issue and designing logging and monitoring to validate that all load threads finished.
After changing from tm1runti to RushTI the issue was eventually eliminated. Not sure if the same issue would happen in a Linux environment (I guess probably not).
Please place all requests for help in a public thread. I will not answer PMs requesting assistance.
User avatar
mce
Community Contributor
Posts: 352
Joined: Tue Jul 20, 2010 5:01 pm
OLAP Product: Cognos TM1
Version: Planning Analytics Local 2.0.x
Excel Version: 2013 2016
Location: Istanbul, Turkey

Re: TM1 Servers with TBs of Memory

Post by mce »

lotsaram wrote: Mon Dec 07, 2020 11:22 am
mce wrote: Mon Dec 07, 2020 9:03 am 2) When executing RunTI in multiple sessions concurrently to process data in multiple threads, some connections to the TM1 server fail in some of the parallel threads (CAM authentication). RunProcess sometimes helps with this, but cannot always replace RunTI.
Actually, this is a really key point that I forgot to mention!
More of a threading issue than a "big memory" one. But when you have big memory, big data, millions and millions of records to process, and load windows to hit, you may be loading on many, many threads at the same time. At one customer, where we were using tm1runti to load on somewhere between 60 and 80 concurrent threads, we had a rare but intermittently recurring random issue where a thread would "go missing" and fail to commit. (As in: in a session monitor you see the thread authenticate and start work, but at some point it disappears and never commits or logs out. And you see the same thing in tm1server.log: a thread starts but then disappears from the log and never logs out.) This seemed to happen only when loading on 40+ threads. The eventual consensus was that this is an OS issue, where Windows decides to terminate a thread due to some resource allocation decision, and not directly a fault of TM1. But it caused big headaches in the project and a lot of extra work investigating the issue and designing logging and monitoring to validate that all load threads finished.
After changing from tm1runti to RushTI the issue was eventually eliminated. Not sure if the same issue would happen in a Linux environment (I guess probably not).
We certainly load large data in multiple threads and have observed similar issues, where RunTI fails to trigger some of the processes when we execute many threads (hundreds) concurrently. We could not work out where the bottleneck was, even after opening cases with IBM. Later we added a 1-second wait after triggering every 5 or 10 processes via RunTI, which eliminated or reduced the issue. Where we could not avoid it that way, we switched to RunProcess where necessary, accepting its disadvantages. (Remember that RunProcess requires the child process to run with user access, it leaves threads open in TM1Top, and there is no wait option after triggering it.)

Moreover, due to these triggering issues, we started updating a status cube: prior to triggering a process we flag that it has been triggered, and at the end of the triggered process we flag that it has actually executed, so we can track whether triggered processes really ran. Using this cube we can keep track of unsuccessful triggerings; they do still sometimes occur, but at least business users are now aware when it happens, so they can re-execute when the problem appears.
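
In TI terms the pattern looks roughly like this; the status cube, its dimensions and the process names are all invented for the sketch, and the run-ID element is assumed to exist already:

Code: Select all

# Parent process, just before fan-out: flag the child as triggered.
CellPutS('Triggered', 'zSYS Process Status', 'Load.Transactions', pRunID, 'Status');
RunProcess('Load.Transactions', 'pRunID', pRunID);

# Epilog of the child process: flag that it actually ran to completion.
CellPutS('Completed', 'zSYS Process Status', 'Load.Transactions', pRunID, 'Status');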
User avatar
gtonkin
MVP
Posts: 1199
Joined: Thu May 06, 2010 3:03 pm
OLAP Product: TM1
Version: Latest and greatest
Excel Version: Office 365 64-bit
Location: JHB, South Africa

Re: TM1 Servers with TBs of Memory

Post by gtonkin »

Saw this bulletin today:
PH32847: AGENT CANNOT SEND ALERT IF TOTAL SYSTEM MEMORY IS MORE THAN 1TB - JAVA.LANG.NUMBERFORMATEXCEPTION BECAUSE OF THOUSAND SEPARATOR
Administration Agent cannot send mail alert if total system memory is more than 1 Terabyte ...

Something to be aware of if you are looking at adding memory that would exceed 1 TB on PAL.