Strange Cognos TM1 9.5 activity

craigparris1
Posts: 12
Joined: Mon Feb 27, 2012 12:05 am
OLAP Product: TM1
Version: 9.1 SP2
Excel Version: 2007

Post by craigparris1 »

Hi guys,

I'm wondering if anyone has experienced a similar issue to what I'm seeing, or can give me a pointer on what the cause might be. We've recently built a couple of new Cognos Express 9.5 servers (both with Windows 2008 Server R2 as the OS), to migrate a TM1 9.1 installation. One server was built first as a stand-alone server, then a second one was subsequently built in our virtual environment (in a data centre). Originally, I was seeing the issue on the VM server, so thought this might have something to do with it (the virtualisation), but then I experienced the issue on the stand-alone server also, so it appears to be particular to the 9.5 version (or so I'm guessing). The issue presented itself slightly differently on the two servers, which is why it's confusing me even more.

For the migration of the TM1 data, I just picked up the old data directory from the TM1 9.1 server and copied it over to the new server, changed the relevant settings in the .ini and .cfg files, then configured a new "tm1sd.exe" service instance on the new server(s). So, there are no changes to the cubes or rules or anything - they're exactly as they were under 9.1. (The primary reason we want to move to 9.5 is that our current TM1 9.1 environment is a little unstable and running on quite old hardware.)

Now, the issue is this - if I open a particular cube (that has a default view) on one of the servers (using "Architect"), it opens just fine (within a couple of seconds). If I then open a different cube (which doesn't have a default view), the cube viewer opens; then I select a view, that view opens OK (again, within a couple of seconds). Then if I switch to a different view, suddenly the CPU usage for the "tm1sd.exe" process shoots up to 99% and stays there for like 10 minutes. The cube viewer shows a little window that says "Building cube view..." (with a "Stop Building View" button). Once the CPU usage drops back to zero for "tm1sd.exe" then the data loads up in the cube viewer. Now, this happens for one particular view on the virtualised servers, but the other views on other cubes are OK. I saw the problem first on the virtualised server, but couldn't replicate it on the stand-alone server, so I thought perhaps it was an issue with the virtualised server. However, I then did some testing with the Xcelerator client on other machines, and noticed that from one machine in particular - when connecting to the virtualised server - it caused the high CPU usage for one of the "good" views .... none of the other clients caused the issue, only one in particular.

So, then I tried connecting to the stand-alone server with some different clients, and managed to get the issue to occur - again, with one particular client attaching, while the others loaded the same view up just fine in a matter of a few seconds.

Anyone seen this issue before? Anyone know how to resolve it? If not, does anyone have some pointers on what might be causing it?

Thanks,
Craig :D
lotsaram
MVP
Posts: 3706
Joined: Fri Mar 13, 2009 11:14 am
OLAP Product: TableManager1
Version: PA 2.0.x
Excel Version: Office 365
Location: Switzerland

Re: Strange Cognos TM1 9.5 activity

Post by lotsaram »

Hi Craig - what's your level of experience with TM1 exactly?

I ask because what you're describing could be perfectly normal behavior depending on the model and the views in question. If a view is relatively small in terms of rows x columns and contains only leaf level data or data requiring minimal consolidation and calculation then the time to render the view will be more or less instantaneous. However a view of equivalent size rows x columns could conceivably take several minutes to calculate (or more) if the view is highly consolidated and contains rule calculations which the server has to calculate (thus maxing out a core while doing this operation). Provided no data in the cube changes (or other cubes upon which the cube has a dependency) and sufficient VMM cache has been specified for the cube then subsequent refresh of the same view would be fast as the server has already performed the calculations.

Depending on content, different views take different amounts of time to render. Depending on whether calculations have already been performed or not, the SAME view can take a radically different time to render.

To me it just sounds like you are describing this scenario.
craigparris1
Posts: 12
Joined: Mon Feb 27, 2012 12:05 am
OLAP Product: TM1
Version: 9.1 SP2
Excel Version: 2007

Re: Strange Cognos TM1 9.5 activity

Post by craigparris1 »

Hi there Lotsaram,

Thanks for your comments. My experience with TM1 is not extensive; however, as I mentioned in my post, the high CPU is occurring on some views only in SOME circumstances. The same views on different servers do load quickly for some, but with high CPU usage for others. If it was the same thing happening with the same view across ALL servers, then I'd figure it was a problem with the cube/view itself. But because this is occurring strangely across different servers, it's got me a bit stumped.

I'm curious, though, about your comment "...sufficient VMM cache has been specified for the cube...". Are you saying there's somewhere that the cache/memory used per cube can be configured? If so, how is this done?

Thanks,
Craig
Duncan P
MVP
Posts: 600
Joined: Wed Aug 17, 2011 1:19 pm
OLAP Product: TM1
Version: 9.5.2 10.1 10.2
Excel Version: 2003 2007
Location: York, UK

Re: Strange Cognos TM1 9.5 activity

Post by Duncan P »

craigparris1 wrote:I'm curious, though, about your comment "...sufficient VMM cache has been specified for the cube...". Are you saying there's somewhere that the cache/memory used per cube can be configured? If so, how is this done?
Basically yes. These are values in the }CubeProperties control cube which you can set by hand and which control whether the data behind a view is cached for further use after the view is no longer needed. You might also consider the TI function ViewConstruct which can be used to pre-populate the view caches and calculation caches after doing data import.
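
As a rough illustration of how those two properties interact, here is a toy Python model. It is not TM1 code and not how the server is actually implemented. It assumes the commonly documented behaviour: a view is only kept if building it took longer than the VMT threshold, and kept views must fit within the VMM budget, with the oldest purged first. It ignores cache invalidation when data changes. The class name, sizes, and timings are all made up for the example.

```python
from collections import OrderedDict


class ToyViewCache:
    """Toy model of per-cube view caching driven by VMT and VMM.

    Hypothetical illustration only; units and numbers are arbitrary.
    """

    def __init__(self, vmt_seconds, vmm_budget):
        self.vmt_seconds = vmt_seconds   # minimum build time worth caching
        self.vmm_budget = vmm_budget     # total size allowed for cached views
        self._views = OrderedDict()      # view name -> size, oldest first

    def used(self):
        return sum(self._views.values())

    def is_cached(self, view):
        return view in self._views

    def open_view(self, view, build_seconds, size):
        """Return the time the user waits when opening the view."""
        if view in self._views:
            return 0.0                   # served from cache, no rebuild
        if build_seconds > self.vmt_seconds and size <= self.vmm_budget:
            # Purge oldest views until the new one fits in the budget.
            while self.used() + size > self.vmm_budget:
                self._views.popitem(last=False)
            self._views[view] = size
        return build_seconds


# A budget too small for the slow view means it is rebuilt on every open.
small = ToyViewCache(vmt_seconds=5, vmm_budget=100)
first_small = small.open_view("slow_view", build_seconds=600, size=500)
second_small = small.open_view("slow_view", build_seconds=600, size=500)

# With enough budget, only the first open pays the build cost.
large = ToyViewCache(vmt_seconds=5, vmm_budget=1000)
first_large = large.open_view("slow_view", build_seconds=600, size=500)
second_large = large.open_view("slow_view", build_seconds=600, size=500)

print(first_small, second_small, first_large, second_large)
```

In this model a view that never fits the budget behaves like one that is "rebuilt each time", which is one thing worth ruling out before looking elsewhere.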

If you search the web for "VMM VMT site:ibm.com" the first few entries will give you plenty of info.
craigparris1
Posts: 12
Joined: Mon Feb 27, 2012 12:05 am
OLAP Product: TM1
Version: 9.1 SP2
Excel Version: 2007

Re: Strange Cognos TM1 9.5 activity

Post by craigparris1 »

Hi there Duncan,

I'll take a look at some of the web links.

I had put a ViewConstruct call in for the particular view that seems to be causing the issue most frequently, but even after re-running the related process it still takes a very long time for the view to open. I will take a look at those cache settings, though.

Cheers,
Craig
craigparris1
Posts: 12
Joined: Mon Feb 27, 2012 12:05 am
OLAP Product: TM1
Version: 9.1 SP2
Excel Version: 2007

Re: Strange Cognos TM1 9.5 activity

Post by craigparris1 »

Interestingly, this post, which came up in the search results (http://www.ibm.com/developerworks/forum ... 7&tstart=0) appears to be explaining the same problem that I'm experiencing - although, as I've noted, it doesn't happen on a different server with the same version of software and identical cube ...
craigparris1
Posts: 12
Joined: Mon Feb 27, 2012 12:05 am
OLAP Product: TM1
Version: 9.1 SP2
Excel Version: 2007

Re: Strange Cognos TM1 9.5 activity

Post by craigparris1 »

Hi guys,

Tried some tweaks of the VMM and VMT settings - it hasn't made any difference to the particular views, on particular servers, that still take a long time to build each time they're accessed and have tm1sd.exe cranking away on the CPU.

Is there debug/logging that I can access or enable that would give some indication of why these particular views are being rebuilt each time they're requested?

Cheers,
Craig
tomok
MVP
Posts: 2836
Joined: Tue Feb 16, 2010 2:39 pm
OLAP Product: TM1, Palo
Version: Beginning of time thru 10.2
Excel Version: 2003-2007-2010-2013
Location: Atlanta, GA

Re: Strange Cognos TM1 9.5 activity

Post by tomok »

Is it possible you have different paging file settings on the two servers?
Tom O'Kelley - Manager Finance Systems
American Tower
http://www.onlinecourtreservations.com/
craigparris1
Posts: 12
Joined: Mon Feb 27, 2012 12:05 am
OLAP Product: TM1
Version: 9.1 SP2
Excel Version: 2007

Re: Strange Cognos TM1 9.5 activity

Post by craigparris1 »

Hi there Tomok,

Well, it's difficult to necessarily compare the servers explicitly as they do have different amounts of RAM, different CPU quantity, etc. Plus, one is virtualised, one is not. The virtual server, though, does have the recommended size swap file configured.

Cheers,
Craig
tomok
MVP
Posts: 2836
Joined: Tue Feb 16, 2010 2:39 pm
OLAP Product: TM1, Palo
Version: Beginning of time thru 10.2
Excel Version: 2003-2007-2010-2013
Location: Atlanta, GA

Re: Strange Cognos TM1 9.5 activity

Post by tomok »

craigparris1 wrote:Well, it's difficult to necessarily compare the servers explicitly as they do have different amounts of RAM, different CPU quantity, etc. Plus, one is virtualised, one is not. The virtual server, though, does have the recommended size swap file configured.
OMG! Don't you think this is highly relevant information????? Why did it take so long to pry this out? This changes everything. Which server is slow? The virtual one? Are the resources dedicated to TM1? How much RAM is in each of the servers? How many CPUs? What are the virtual memory settings? All of these things could make a difference long before you even get started on settings inside TM1.
Tom O'Kelley - Manager Finance Systems
American Tower
http://www.onlinecourtreservations.com/
craigparris1
Posts: 12
Joined: Mon Feb 27, 2012 12:05 am
OLAP Product: TM1
Version: 9.1 SP2
Excel Version: 2007

Re: Strange Cognos TM1 9.5 activity

Post by craigparris1 »

Wow, take a chill pill, dude!

In the opening post I already noted that one server was virtualised, one is not. While I did mention in that last post that they have different RAM, CPUs, etc., it's not that significant a difference. The physical machine has 4GB RAM, the virtual machine (currently) has 5GB RAM (I've tried allocating more or less to it, but it appears to make no difference). The physical machine does have 2 x quad-core CPUs, while the virtual machine has 2 virtual CPUs allocated to it (i.e. 2 of the cores out of the underlying physical CPUs on the VM host). I've noticed, though, that when both servers have TM1 running at "100%" CPU, the one with 2 x quad-cores actually maxes out at 13% across all CPUs, while the virtual one maxes out at 50%. If I allocate only one virtual CPU to the virtual server, then it peaks at 99-100%. So it appears that TM1 is essentially single-threading all its work anyway. I could probably throw 32 CPUs at it, and would likely only see the equivalent of 1/32 of total CPUs being used.
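
Those percentages line up with a single busy thread. A quick check in Python, assuming the overall figure shown is simply busy cores divided by total logical cores (a simplification, and the function name is just for this example):

```python
def overall_cpu_percent(busy_threads, logical_cores):
    """Overall CPU % when `busy_threads` threads each saturate one core."""
    busy = min(busy_threads, logical_cores)
    return 100.0 * busy / logical_cores


physical = overall_cpu_percent(1, 8)         # 2 x quad-core
virtual_2 = overall_cpu_percent(1, 2)        # 2 virtual CPUs
virtual_1 = overall_cpu_percent(1, 1)        # 1 virtual CPU
hypothetical_32 = overall_cpu_percent(1, 32)  # the 32-CPU thought experiment

print(physical, virtual_2, virtual_1, hypothetical_32)
# 12.5 50.0 100.0 3.125
```

The 12.5% for eight cores matches the roughly 13% observed on the physical machine.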

As I also mentioned in my posts, the problem is happening for DIFFERENT views on DIFFERENT servers - so I can't isolate one of the servers and say "This is the one that's running slow". Essentially, I can replicate the 100% CPU issue on the virtual server with one particular view (but the same view loads very quickly on the physical server), while I can replicate the issue with a different view on the physical server (but the same view loads very quickly on the virtual server). I don't believe this is a server resourcing issue - it really appears to be a software issue, but I'm just at a loss at how to troubleshoot/investigate it further.

Cheers,
Craig
rmackenzie
MVP
Posts: 733
Joined: Wed May 14, 2008 11:06 pm

Re: Strange Cognos TM1 9.5 activity

Post by rmackenzie »

craigparris1 wrote:The physical machine has 4GB RAM, the virtual machine (currently) has 5GB RAM (I've tried allocating more or less to it, but it appears to make no difference).
I wonder how the VM controller software is allocating RAM to your VM instance? Some configurations allow dynamic allocation, the same as for hard disk space, where instances are spoon-fed what they need. In these circumstances, I can see this playing havoc with TM1 performance. Can you talk to the people who administrate the virtualisation about this? Also, what is running on the VM apart from TM1?

I think you are running x64 TM1, otherwise you would be capped at 3GB (maximum), and you mention both environments have a minimum of 4GB. 4GB on an x64 environment simply isn't enough nowadays and you should look into getting some more, especially if you have some cubes that are big - then you have precious little RAM to play with. Check the memory consumption of the exe running the service after it has started up and see what memory you have left over.
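
One way to do that check from a script rather than Task Manager is sketched below in Python. It assumes the service process is named tm1sd.exe (as in the posts above) and parses the CSV output of the Windows tasklist command. The helper names and the sample line are made up for illustration, and the formatting of the "Mem Usage" column may vary with locale.

```python
import csv
import io
import subprocess


def parse_tasklist_csv(text):
    """Parse `tasklist /FO CSV /NH` output into (image, pid, mem_kb) tuples."""
    rows = []
    for fields in csv.reader(io.StringIO(text)):
        if len(fields) < 5:
            continue  # e.g. the "INFO: No tasks are running..." message
        image, pid, mem = fields[0], fields[1], fields[4]
        digits = "".join(ch for ch in mem if ch.isdigit())
        if not pid.isdigit() or not digits:
            continue
        rows.append((image, int(pid), int(digits)))
    return rows


def query_process_memory(image_name="tm1sd.exe"):
    """Run tasklist (Windows only) and return parsed rows for one image name."""
    out = subprocess.run(
        ["tasklist", "/FI", f"IMAGENAME eq {image_name}", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_tasklist_csv(out)


# Made-up sample line in the shape tasklist prints, for a dry run.
sample = '"tm1sd.exe","4321","Services","0","2,345,678 K"\n'
parsed = parse_tasklist_csv(sample)
print(parsed)
```

On the server itself, calling query_process_memory() after the service has finished loading would give the figure to compare against installed RAM.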
craigparris1 wrote:As I also mentioned in my posts, the problem is happening for DIFFERENT views on DIFFERENT servers
It very much sounds like you need to trim down the way you are testing things so you are looking at one variable at a time. Try to avoid a scatter-gun approach of trying out different views, on different machines, logged in as different users. Just stick to using the admin account and one cube where you are noticing the performance issue, and then try that in the different environments, starting after a fresh reboot so you are sure the caches are fully dropped.
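
To make that concrete, here is a small Python sketch of logging each timed test as one row and then checking which single factor the slow runs have in common. The server, client, and view names and the timings are placeholders, not measurements from this thread, and the 60-second cutoff is an arbitrary assumption.

```python
from collections import defaultdict

# Each run: (server, client, view, seconds to open). Placeholder data.
runs = [
    ("physical", "client_a", "view_1", 3),
    ("physical", "client_b", "view_1", 610),
    ("virtual", "client_a", "view_1", 2),
    ("virtual", "client_b", "view_1", 595),
    ("virtual", "client_a", "view_2", 4),
]

FACTORS = ("server", "client", "view")
SLOW_SECONDS = 60  # arbitrary cutoff for "slow"


def slow_rate_by_factor(runs, threshold=SLOW_SECONDS):
    """For each factor value, return the fraction of runs that were slow."""
    counts = {f: defaultdict(lambda: [0, 0]) for f in FACTORS}
    for *values, seconds in runs:
        for factor, value in zip(FACTORS, values):
            tally = counts[factor][value]
            tally[0] += seconds > threshold   # slow runs
            tally[1] += 1                     # all runs
    return {
        f: {v: slow / total for v, (slow, total) in vals.items()}
        for f, vals in counts.items()
    }


rates = slow_rate_by_factor(runs)
for factor in FACTORS:
    print(factor, dict(rates[factor]))
```

In this placeholder data only the client separates slow runs from fast ones, which is the kind of pattern worth confirming before changing any server settings.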
Robin Mackenzie