Performance issues running 2.0.5

Nov 18, 2010 at 4:43 PM

Hi Everyone - 

Just wondering if anyone else has experienced what I am seeing here. I just upgraded from 1.4 so this is my first experience with 2.x.

I am merging 12 log files and analyzing them with the Exchange 2007 template. I am going on 50 hours of chunking through the data, and my system has been all but unusable during that time. PAL is taking all of my 4 GB of RAM and more (Resource Monitor currently shows 6.1 GB private with a 3.8 GB working set) while CPU usage is negligible. Because RAM is so overcommitted, my system is badly taxed; for example, it took 45 minutes just to switch windows yesterday. My disk I/O seems reasonable, but I did see some improvement when I disabled my file-level AV scanner.

Any suggestions or help would be appreciated! Don't want to go back to 1.4.x but if I have to...

On a side note, I had another 14 files I needed to analyze, so I moved them to a server and I see similar results there: PAL takes 4.8 GB of RAM and the machine is sluggish. That machine has 8 GB of RAM, so it's not as unusable. Is 4-5 GB of RAM usage the norm no matter how much you have installed?


FYI, my system specs - 

Windows 7 x64, dual dual-core 3.2 GHz, 4 GB RAM

Nov 20, 2010 at 5:35 AM

I haven't tried this yet, but we should be able to edit the pal.ps1 file and replace all the get-content lines with something like get-content -readcount 1000 for better performance. Based on the article here, !5A8D2641E0963A97!756.entry, this may double or triple the performance wherever get-content is used.

It also appears that multiple functions use get-content to read in the $CsvFilePath file; they might be combined so that the file only has to be read once instead of three times.

I know that relog.exe also takes some time, and to my knowledge there isn't much that can be done about that other than choosing a smaller time frame from the performance counter logs or increasing the interval between the samples read.

If I'm reading it correctly, the function GetTimeZoneFromCsvFile does a get-content on $CsvFilePath and yet only gets information from the first line. It would be better to read just that single line with something like get-content -totalcount 1 $CsvFilePath, which would cut out a lot of reading of that file and processing time.

I hope this helps. ClintH -- if I get a chance next week, I'll try this out, as I have some good logs I've run before and should be able to get a good before-and-after comparison with some of these changes. If it's positive, I'll let you know.
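
For anyone who wants to try the two Get-Content ideas in isolation, here is a minimal sketch. It is not taken from pal.ps1; the temp file and its contents are made up purely for illustration, and only the $CsvFilePath name comes from the script. The spelled-out parameter names are -ReadCount (batch size) and -TotalCount (number of lines to read from the start).

```powershell
# Build a small stand-in for the CSV that relog.exe produces (illustrative data only).
$CsvFilePath = Join-Path ([System.IO.Path]::GetTempPath()) 'pal_getcontent_sketch.csv'
$header = '"(PDH-CSV 4.0) (Pacific Standard Time)(480)","\\SERVER\Memory\Available MBytes"'
$rows   = 1..4999 | ForEach-Object { '"sample {0}","{0}"' -f $_ }
Set-Content -Path $CsvFilePath -Value (@($header) + $rows)

# Header only: -TotalCount 1 stops after the first line instead of reading the whole file.
$firstLine = Get-Content -Path $CsvFilePath -TotalCount 1

# Whole file in batches: -ReadCount 1000 passes arrays of up to 1000 lines down
# the pipeline, so the script block runs once per batch rather than once per line.
$lineCount = 0
Get-Content -Path $CsvFilePath -ReadCount 1000 | ForEach-Object {
    $lineCount += @($_).Count   # @() guards against a batch that holds a single line
}

Write-Host "First line: $firstLine"
Write-Host "Lines read: $lineCount"

Remove-Item -Path $CsvFilePath
```

Whether this gives the two-to-three-times speedup mentioned above will depend on the file and on what the script does with each line, so it is worth timing both variants with Measure-Command on a real log.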

Nov 30, 2010 at 6:59 PM
Edited Dec 2, 2010 at 4:53 PM

Sorry for the late reply...

The end result was that after 7 days, it was still chunking away. I ended up canceling it and moving all of the data to a server we had that had 8gb of ram/quad dual core. Took about 2 days there.

Behavior was similar - RAM usage was at about 4.8 GB consistently, but obviously with that much more headroom, it allowed me to still use the system. Again, no noticeable CPU or disk activity.

A word on what I was actually doing -

This was an Exchange 2007 server that we had about 6 days of data for. One thing I found out was that they had run perfwiz with the Exchange settings (thus recording every counter imaginable) but they did it with the time to issue set at 1 hour. So... I was chunking three days of data, 200 MB each - 28 files totaling about 5.6 GB of data with a RIDICULOUS number of samples (1 hour equals a sample every 5 seconds). So in hindsight - not so smart....
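
To put a rough number on "ridiculous", here is a back-of-the-envelope sketch. It takes the six days and the 5-second interval from the post at face value; the thinning factor of 12 is just an assumed example, not something anyone in the thread has tested.

```powershell
# Figures taken from the post above; adjust them to match your own capture.
$daysCaptured    = 6
$intervalSeconds = 5

# Samples recorded for every single counter over the capture window.
$samplesPerCounter = ($daysCaptured * 24 * 3600) / $intervalSeconds

# Assumed example: keeping every 12th sample turns 5 seconds into 1 minute.
$keepEveryNth   = 12
$thinnedSamples = $samplesPerCounter / $keepEveryNth

Write-Host "Samples per counter at $intervalSeconds s: $samplesPerCounter"
Write-Host "Samples per counter after thinning:  $thinnedSamples"
```

That is per counter, so a capture that records every counter multiplies the total accordingly.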

Is there any way to do this with better management on the part of PAL? Does it just take 4-5gb of RAM period? It took about the same amount on the server I moved it to. I wish I knew more about how to do this or I would help... Really love the tool in general, but if this is the way of the world in 2.x land, I may have to go back into 1.x land. I will try with less data and bigger sample gaps and see what happens...
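
One way to try "less data and bigger sample gaps" without recapturing is to thin the existing logs with relog.exe before handing them to PAL. The sketch below is only a suggestion: the file paths are hypothetical, and the factor of 12 is an assumption. The -t switch tells relog to write only every nth record; relog also has -b and -e switches for a begin and end time if you want a smaller window.

```powershell
# Hypothetical paths; substitute your own counter log and output location.
$inputLog  = 'C:\PerfLogs\exchange.blg'
$outputLog = 'C:\PerfLogs\exchange_thinned.blg'

# -f BIN keeps the binary log format, -t 12 keeps every 12th record,
# -o names the output file.
$relogArgs = @($inputLog, '-f', 'BIN', '-t', '12', '-o', $outputLog)

if (Test-Path $inputLog) {
    & relog.exe @relogArgs
} else {
    # No such log on this machine, so just show the command that would run.
    Write-Host "Would run: relog.exe $($relogArgs -join ' ')"
}
```

The thinned file can then be analyzed in place of the original.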