PAL runs forever when analyzing all samples

May 27, 2008 at 2:46 PM
Edited May 27, 2008 at 2:47 PM

Clint,

I have witnessed this behavior several times now, and after letting PAL run overnight (and more), I decided to report the problem. Let me describe it.

I’m using PAL to detect potential bottlenecks in CPU and other counters. I ask my customers to collect counters for a day or so, and then I have PAL analyze ALL samples because I want to see whether there are CPU spikes. The problem is that PAL takes a very long time: last night, on a dual-core machine, it did not finish after an entire night. It was analyzing \% Processor Time (see below) for transient processes, meaning short-lived ones such as Notepad being opened and closed or, in this case, regedit.

 

AVG(TO_REAL([\\OCS00202\Process(REGEDIT)\% Processor Time])) AS avg, MAX(TO_REAL([\\OCS00202\Process(REGEDIT)\% Processor Time])) AS max FROM C:\Users\e169607\AppData\Local\Temp\{497B3B96-640D-4360-8C43-4917C2A3F284}\_FilteredPerfmonLog.csv GROUP BY Interval

 

This issue is reproducible.

 

I first thought this was caused only by the time needed to process all samples, but I am starting to think something is wrong, because simply telling PAL to analyze every 2 samples would do it.

 

Would you mind checking whether something is wrong with selecting ALL for the analysis interval? If you want, I can provide a BLG to reproduce the problem.

 

Thanks

Coordinator
Jun 27, 2008 at 8:07 AM
Can you send me the latest %temp%\{GUID}\PAL.log file? That will help me track it down. One thing that has helped with these issues is increasing Log Parser's row size. Here is the key to update:

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Log Parser]
"CSVInMaxRowSize"=dword:00040000

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Log Parser\Config]

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Log Parser\Config\Defaults]

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Log Parser\Config\Defaults\Input]

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Log Parser\Config\Defaults\Input\CSV]

 
Analyzing the log with the "ALL" option forces PAL to analyze every data point, which is very resource intensive, as you have already noticed. PAL will not miss CPU spikes at a coarser interval, because when it breaks the log into time slices it takes the max and min values of each slice into account.
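
To illustrate why slicing does not hide spikes, here is a minimal Python sketch. It is not PAL's actual code (PAL is a VBScript that drives Log Parser); `slice_stats` and `samples_per_slice` are hypothetical names, and the samples are assumed to be plain floats. Each slice keeps its min, avg, and max, so a single-sample spike still shows up in that slice's max.

```python
from statistics import mean


def slice_stats(samples, samples_per_slice):
    """Group consecutive samples into slices and summarize each slice."""
    if samples_per_slice < 1:
        raise ValueError("samples_per_slice must be at least 1")
    stats = []
    for start in range(0, len(samples), samples_per_slice):
        chunk = samples[start:start + samples_per_slice]
        # The max of the slice preserves any spike inside it,
        # even though the average smooths it out.
        stats.append({"min": min(chunk), "avg": mean(chunk), "max": max(chunk)})
    return stats


# Hypothetical % Processor Time samples with one spike.
cpu = [2.0, 3.0, 98.0, 4.0, 1.0, 2.0, 3.0, 2.0]
for summary in slice_stats(cpu, 4):
    print(summary)
```

With four samples per slice, the first slice's average is much lower than the spike, but its max still reports it.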

If it still doesn't work, then please send me the BLG and I'll try it out.


Jul 9, 2008 at 11:54 AM
Hi Clint,

"CSVInMaxRowSize" was already at dword:00040000. I will send you the BLG, which will help you reproduce the problem.

Thanks Patrice
Coordinator
Aug 30, 2008 at 4:51 PM
Try increasing CSVInMaxRowSize. In the latest release, I set it to dword:00080000, but in some cases that still isn't enough. Also, in the latest release, PAL automatically sets the registry key to the appropriate size before executing Log Parser. For this to work, you need to run PAL.vbs with admin rights.
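
For anyone setting the value by hand, note that the digits after `dword:` in a .reg file are hexadecimal. The Python sketch below only builds the text of a .reg fragment for a chosen value; it does not touch the registry. The key path and value name are taken from the earlier post in this thread, and `reg_dword` and `build_reg_fragment` are hypothetical helper names.

```python
# Key path as posted earlier in this thread.
LOG_PARSER_KEY = r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Log Parser"


def reg_dword(value):
    """Format an integer the way .reg files write a DWORD (8 hex digits)."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("a DWORD must fit in 32 unsigned bits")
    return "dword:{:08x}".format(value)


def build_reg_fragment(row_size):
    """Return .reg file text that sets CSVInMaxRowSize to row_size."""
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        "[{}]".format(LOG_PARSER_KEY),
        '"CSVInMaxRowSize"={}'.format(reg_dword(row_size)),
    ]
    return "\n".join(lines) + "\n"


# 524288 decimal is 0x80000, i.e. dword:00080000.
print(build_reg_fragment(524288))
```

Importing the resulting text with regedit would still need admin rights, as noted above.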
Sep 8, 2008 at 2:08 PM
Hi Clint,
I did as you said, but I still hit this error with version 1.3.3.1.
I ran PAL.vbs on Windows Server 2008 x64 Enterprise Edition, and my .blg file is larger than 70 MB. I have disabled UAC.

So, how can I fix it?

thanks.
Johnny.
Sep 9, 2008 at 7:48 PM
Hi,
I am having a problem when running PAL 1.3.3.1. When I try to analyze a 70 MB performance log file, PAL never finishes. It goes on forever, no error is displayed, and neither XML nor HTML reports are generated. It remains at the following message: "Done determining if counters are in the Perfmon log. Organizing data structures. Please wait..." I have checked "CSVInMaxRowSize" and it is set to dword:00080000 (Decimal). The same happened on Windows XP SP2 and Windows 2003 SP1.

Does anyone have any idea how to deal with it?

Thanks.
Adrian
Coordinator
Sep 10, 2008 at 8:54 AM
Can you put the log file on a download location for me? I'll take a look at it and figure it out. Please email me at clinth@microsoft.com directly on this and I will update my findings here. Thank you.
Sep 11, 2008 at 9:44 AM
Clint, Adrian,

I got the same problem with 1.3.3.1. Clint, I will email you the BLG.

Thanks

Patrice

Coordinator
Sep 12, 2008 at 2:03 AM
I looked at Adrian's perfmon log and at the logs others have sent me with similar issues. The problem is the large number of counters in the logs. In Adrian's case, there were over 8000 counters in the perfmon log. When a computer runs a large number of processes, a counter instance is created for each one, which makes the counter list extremely long.

The function GetCountersNeededForAnalysis() in PAL.vbs is responsible for building the counter list that Relog.exe filters on, and it needs to remove any duplicate paths. When there are a large number of counters, this function takes hours or even days to run. I need to rewrite the function, so I'll post this as another discussion in the hope that someone can help me rewrite it for efficiency.
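
As a starting point for that rewrite, here is a minimal Python sketch of one common way to remove duplicates in a single pass. It is not the code in PAL.vbs, and how GetCountersNeededForAnalysis() currently compares paths is not shown in this thread. `unique_counter_paths` is a hypothetical name, and treating counter paths as case-insensitive is an assumption. The idea is to remember each path already seen in a hash-based set, so each lookup is constant time on average instead of a scan of the whole list.

```python
def unique_counter_paths(paths):
    """Return paths with duplicates removed, keeping first-seen order."""
    seen = set()   # normalized paths already emitted
    unique = []
    for path in paths:
        cleaned = path.strip()
        key = cleaned.lower()  # assumption: paths compare case-insensitively
        if key and key not in seen:
            seen.add(key)
            unique.append(cleaned)
    return unique


# Hypothetical counter paths with a duplicate that differs only in case.
counters = [
    r"\Process(REGEDIT)\% Processor Time",
    r"\Process(notepad)\% Processor Time",
    r"\process(regedit)\% processor time",
]
print(unique_counter_paths(counters))
```

In VBScript, a Scripting.Dictionary keyed on the counter path could play the role of the set.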

Thank you,