PAL Analysis Threshold Templates for analyzing custom processes

Mar 13, 2009 at 12:17 AM
I'm using PAL for load testing our company's product. It's a great tool, but I have to create a customized threshold analysis template to analyze the collected perflogs for the system services and processes we care about. I find that painful to configure from scratch, even when starting from one of the existing threshold files. So I created some barebones templates: search and replace the parameter values to get the correct XML, then insert that XML into a copy of the standard System Overview threshold file to create a customized version. If System Overview has extra counters you aren't concerned about, you can delete those XML nodes.
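The insert-and-delete step described above could be scripted roughly as follows. This is only a sketch: the `PAL` root element, the two base analyses, and the `MyService` name are assumptions made up for illustration, not taken from a real threshold file.

```python
import xml.etree.ElementTree as ET

# Stand-in for a copy of the System Overview threshold file. The root element
# name and the analyses listed here are assumptions for illustration only.
BASE = """<PAL>
  <ANALYSIS NAME="Available Memory" ENABLED="True" CATEGORY="Memory" />
  <ANALYSIS NAME="Disk Free Space" ENABLED="True" CATEGORY="Disk" />
</PAL>"""

# Stand-in for template output after the placeholders have been replaced.
CUSTOM = '<ANALYSIS NAME="Handle Leak Detection for MyService" ENABLED="True" CATEGORY="Process" />'


def customize(base_xml, fragments, unwanted_names):
    """Drop unwanted ANALYSIS nodes, then append the custom ones."""
    root = ET.fromstring(base_xml)
    # Iterate over a copy of the list because nodes are removed while looping.
    for node in list(root.findall("ANALYSIS")):
        if node.get("NAME") in unwanted_names:
            root.remove(node)
    for fragment in fragments:
        root.append(ET.fromstring(fragment))
    return root


root = customize(BASE, [CUSTOM], {"Disk Free Space"})
names = [node.get("NAME") for node in root.findall("ANALYSIS")]
print(names)
```

One thing to keep in mind: ElementTree does not preserve CDATA sections when it serializes, it writes the same text with escaped characters instead. That is equivalent as far as an XML parser is concerned, but the file will look different from the hand-edited originals.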

My templates focus on the basics like memory leak detection (which also includes private bytes / memory analysis), handle leak detection, and CPU utilization. My other template is for monitoring .NET related services and processes and covers ".NET CLR Memory # Bytes in all Heaps" for the process of interest. You simply search and replace the parameters, which follow this pattern: "pParamName". The ones used below are pProcess, pMMax, pMRate, pHMax, pHRate, pCAvg, pCMax, pNHMax and pNHRate. The memory parameters are given in MB; the threshold code appends "000000" to them to get bytes.
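As a rough illustration of the search and replace, here is a small Python sketch run against two lines lifted from the memory leak template. The process name and the limits (`MyService`, 500 MB, 100 MB per hour) are made-up example values.

```python
# Two lines lifted from the memory leak template below (raw string so the
# backslashes in the counter path are kept as-is).
EXCERPT = r'''<COUNTER NAME="\Process(pProcess)\Private Bytes" DATATYPE="Integer">
If AvgPrivateBytes > pMMax000000 AND TrendPrivateBytes > pMRate000000 Then'''

# Hypothetical values: process name, size limit in MB, growth limit in MB/hour.
VALUES = {"pProcess": "MyService", "pMMax": "500", "pMRate": "100"}


def fill(template, values):
    """Replace each placeholder with its value using plain substring replacement.

    Plain replacement is used on purpose: the placeholders are followed
    directly by other characters ("pMMax000000", "pMRateMB's"), so a
    whole-word match would miss them.
    """
    for name, value in values.items():
        template = template.replace(name, value)
    return template


filled = fill(EXCERPT, VALUES)
print(filled)
```

With these values the threshold line ends up comparing against 500000000 and 100000000 bytes, which is why the template glues six zeros onto the MB figures.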

But of course, you may want to customize these templates further for your specific case, for example with a different analysis algorithm (in VBScript) or different counter analyses than the ones I use.

Hope others may find this useful. :)


<!--// process thresholds template //-->
  <ANALYSIS NAME="Memory Leak Detection for pProcess" ENABLED="True" ANALYZECOUNTER="\Process(pProcess)\Private Bytes" CATEGORY="Process">
    <COUNTER NAME="\Process(pProcess)\Private Bytes" MINVARNAME="MinPrivateBytes" AVGVARNAME="AvgPrivateBytes" MAXVARNAME="MaxPrivateBytes" TRENDVARNAME="TrendPrivateBytes" DATATYPE="Integer">
    </COUNTER>
    <THRESHOLD NAME="Memory: an increasing trend of pMRateMB's per hour detected for pProcess" CONDITION="Warning" COLOR="Yellow" PRIORITY="50">
      <DESCRIPTION><![CDATA[If the private bytes size is greater than <B>pMMaxMB</B> and the process is increasing at a rate greater than <B>pMRateMB's</B> per hour, then an aggressive memory leak is suspected.
        ]]></DESCRIPTION>
      <CODE><![CDATA[' If the private bytes is over pMMaxMBs and if the trend shows a pMRateMB increase at each hour.
If IsNull(AvgPrivateBytes) = False AND IsNull(TrendPrivateBytes) = False Then                  
  If AvgPrivateBytes > pMMax000000 AND TrendPrivateBytes > pMRate000000 Then
    IsTrendThresholdBroken = True
  End If  
End If]]></CODE>
    </THRESHOLD>
    <CHART CHARTTYPE="Line" CATEGORIES="AUTO" MAXCATEGORYLABELS="0" LEGEND="ON" VALUES="AUTO" GROUPSIZE="640x480" OTSFORMAT="MM/dd hh:mm" CHARTTITLE="\Process(pProcess)\Private Bytes" DATASOURCE="\Process(pProcess)\Private Bytes" DATATYPE="Integer">
    </CHART>
    <DESCRIPTION><![CDATA[This analysis determines if the process is consuming a large chunk of the system's memory and if the process is increasing in memory consumption over time. A process consuming large portions of memory is okay as long as the process returns the memory back to the system. Look for increasing trends in the chart. An increasing trend over a long period of time could indicate a memory leak. Private Bytes is the current size, in bytes, of memory that this process has allocated that cannot be shared with other processes. This analysis checks for an increasing trend of pMRateMB's per hour.<BR>
<BR>
Also, keep in mind that a newly started process will initially appear to be leaking memory when it is simply showing normal startup behavior. A memory leak is when a process continues to consume memory without releasing it over a long period of time.<BR>
<BR>
Use this analysis in correlation with the Available Memory analysis, and total process (i.e. Process(*)) private bytes allocation analysis. If you suspect a memory leak condition, then install and use the Debug Diag tool. For more information on the Debug Diag Tool, see the references section.<BR>
<BR>
<B>References:</B><BR>
<BR>
Debug Diagnostic Tool v1.1 http://www.microsoft.com/downloads/details.aspx?FamilyID=28bd5941-c458-46f1-b24d-f60151d875a3&displaylang=en]]></DESCRIPTION>
  </ANALYSIS>
  <ANALYSIS NAME="Handle Leak Detection for pProcess" ENABLED="True" ANALYZECOUNTER="\Process(pProcess)\Handle Count" CATEGORY="Process">
    <COUNTER NAME="\Process(pProcess)\Handle Count" MINVARNAME="MinHandleCount" AVGVARNAME="AvgHandleCount" MAXVARNAME="MaxHandleCount" TRENDVARNAME="TrendHandleCount" DATATYPE="Integer">
    </COUNTER>
    <THRESHOLD NAME="Handle Leak Suspected - more than pHMax handles and a trend of more than pHRate handles per hour" CONDITION="Warning" COLOR="Yellow" PRIORITY="50">
      <DESCRIPTION><![CDATA[Checks for any process with a handle count greater than <B>pHMax</B> and if the trend is greater than <B>pHRate</B> handles per hour. If so, check the chart to determine if the counter is on an increasing trend potentially indicating a handle leak.]]></DESCRIPTION>
      <CODE><![CDATA[If IsNull(AvgHandleCount) = False AND IsNull(TrendHandleCount) = False Then                  
  If AvgHandleCount > pHMax AND TrendHandleCount > pHRate Then
    IsTrendThresholdBroken = True
  End If          
End If]]></CODE>
    </THRESHOLD>
    <CHART CHARTTYPE="Line" CATEGORIES="AUTO" MAXCATEGORYLABELS="0" LEGEND="ON" VALUES="AUTO" GROUPSIZE="640x480" OTSFORMAT="MM/dd hh:mm" CHARTTITLE="\Process(pProcess)\Handle Count" DATASOURCE="\Process(pProcess)\Handle Count" DATATYPE="Integer">
    </CHART>
    <DESCRIPTION><![CDATA[This analysis checks the process to determine how many handles it has open and determines if a handle leak is suspected. A process with a large number of handles and/or an aggressive upward trend could indicate a handle leak, which typically results in a memory leak. Handle Count is the total number of handles currently open by this process. This number is equal to the sum of the handles currently open by each thread in this process.]]></DESCRIPTION>
  </ANALYSIS>
  <ANALYSIS NAME="pProcess Processor Utilization" ENABLED="True" ANALYZECOUNTER="\Process(pProcess)\% Processor Time" CATEGORY="Process">
    <COUNTER NAME="\Process(pProcess)\% Processor Time" MINVARNAME="MinProcess_PercentProcessorTime" AVGVARNAME="AvgProcess_PercentProcessorTime" MAXVARNAME="MaxProcess_PercentProcessorTime" TRENDVARNAME="TrendProcess_PercentProcessorTime" DATATYPE="integer" />
    <CHART CHARTTITLE="\Process(pProcess)\% Processor Time" OTSFORMAT="MM/dd hh:mm" GROUPSIZE="640x480" CATEGORIES="AUTO" DATATYPE="integer" LEGEND="ON" MAXCATEGORYLABELS="0" CHARTTYPE="Line" VALUES="AUTO" DATASOURCE="\Process(pProcess)\% Processor Time" />
    <THRESHOLD NAME="Significant Processor Use Suspected for pProcess - more than pCMax% CPU utilization" CONDITION="Warning" COLOR="Yellow" PRIORITY="50">
      <DESCRIPTION><![CDATA[This analysis checks if the process is consuming a significant amount of processor time, as specified by the average threshold of pCAvg% and the maximum threshold of pCMax% CPU utilization.]]></DESCRIPTION>
      <CODE><![CDATA[
If AvgProcess_PercentProcessorTime > pCAvg Then
    IsTrendThresholdBroken = True
    IsAvgThresholdBroken = True
End If

If AvgProcess_PercentProcessorTime > pCMax Or MaxProcess_PercentProcessorTime > pCMax Then
    IsMaxThresholdBroken = True
End If]]></CODE>
    </THRESHOLD>
    <DESCRIPTION><![CDATA[% Processor Time is the percentage of elapsed time that all of the process's threads used the processor to execute instructions. An instruction is the basic unit of execution in a computer, a thread is the object that executes instructions, and a process is the object created when a program is run. Code executed to handle some hardware interrupts and trap conditions is included in this count.]]></DESCRIPTION>
  </ANALYSIS>

<!--// .NET process thresholds template, for use in addition to template above //-->
  <ANALYSIS NAME=".NET CLR Memory # Bytes in all Heaps for pProcess" ENABLED="True" ANALYZECOUNTER="\.NET CLR Memory(pProcess)\# Bytes in all Heaps" CATEGORY="Process">
    <COUNTER NAME="\.NET CLR Memory(pProcess)\# Bytes in all Heaps" MINVARNAME="MinNETCLRMemory_#BytesinallHeaps" AVGVARNAME="AvgNETCLRMemory_#BytesinallHeaps" MAXVARNAME="MaxNETCLRMemory_#BytesinallHeaps" TRENDVARNAME="TrendNETCLRMemory_#BytesinallHeaps" DATATYPE="integer" />
    <CHART CHARTTITLE="\.NET CLR Memory(pProcess)\# Bytes in all Heaps" OTSFORMAT="MM/dd hh:mm" GROUPSIZE="640x480" CATEGORIES="AUTO" DATATYPE="integer" LEGEND="ON" MAXCATEGORYLABELS="0" CHARTTYPE="Line" VALUES="AUTO" DATASOURCE="\.NET CLR Memory(pProcess)\# Bytes in all Heaps" />
    <THRESHOLD NAME="Significant .NET Heap usage for pProcess, possible memory leak." CONDITION="Warning" COLOR="#FFFF00" PRIORITY="50">
      <DESCRIPTION><![CDATA[This analysis checks if the process's .NET heap has exceeded pNHMax MBytes and if the trend shows a rate increase of pNHRate MBytes per hour, in which case a memory leak may be occurring, or at least the process is consuming a significant amount of memory resources on the .NET side.]]></DESCRIPTION>
      <CODE><![CDATA[' If the heap usage is over pNHMaxMBs and if the trend shows a pNHRateMB increase at each hour.
If IsNull(AvgNETCLRMemory_#BytesinallHeaps) = False AND IsNull(TrendNETCLRMemory_#BytesinallHeaps) = False Then                  
  If AvgNETCLRMemory_#BytesinallHeaps > pNHMax000000 AND TrendNETCLRMemory_#BytesinallHeaps > pNHRate000000 Then
    IsTrendThresholdBroken = True
  End If  
End If]]></CODE>
    </THRESHOLD>
    <DESCRIPTION><![CDATA[This counter is the sum of four other counters: Gen 0 Heap Size, Gen 1 Heap Size, Gen 2 Heap Size and the Large Object Heap Size. This counter indicates the current memory allocated in bytes on the GC Heaps. <BR>
<BR>
This analysis checks if the process's .NET heap has exceeded pNHMax MBytes and if the trend shows a rate increase of pNHRate MBytes per hour, in which case a memory leak may be occurring, or at least the process is consuming a significant amount of memory resources on the .NET side.<BR>
<BR>
Also, keep in mind that a newly started process will initially appear to be leaking memory when it is simply showing normal startup behavior. A memory leak is when a process continues to consume memory without releasing it over a long period of time.<BR>
<BR>
Use this analysis in correlation with the Available Memory analysis, and total process (i.e. Process(*)) private bytes allocation analysis. If you suspect a memory leak condition, then install and use the Debug Diag tool. For more information on the Debug Diag Tool, see the references section.<BR>
<BR>
<B>References:</B><BR>
<BR>
Debug Diagnostic Tool v1.1 http://www.microsoft.com/downloads/details.aspx?FamilyID=28bd5941-c458-46f1-b24d-f60151d875a3&displaylang=en]]></DESCRIPTION>
  </ANALYSIS>

Coordinator
Mar 17, 2009 at 4:06 AM
Thank you, mangar00.

I hope the pound symbol (#) doesn't prevent the VBScript from executing properly.

Have you tried using the PAL Threshold editor? You simply select the threshold file and click edit.

If you still need help, then let me know.
Mar 17, 2009 at 5:53 AM
Edited Mar 17, 2009 at 5:54 AM
I don't think the # symbol is a problem, as I've been analyzing the .NET # bytes in all heaps counter data with PAL. So either it's working fine or the analysis is flawed, though I don't think it is.

Yes, I've tried the PAL Threshold editor. It works nicely for a novice user, but as with all GUIs, it is a pain to use for configuring thresholds in batch, such as adding a batch of custom software services/processes to monitor and selecting the appropriate counters for each one (which are generally the same in my case). Hence I created the templates posted here to help with that instead of going through the GUI only.

Or does someone have a useful tip to "batch" edit threshold configurations via the PAL GUI like the scenario I mentioned above?
Coordinator
Mar 17, 2009 at 7:11 AM
I agree about the GUI editing. I like to do mass XML editing as well. ;-) I even wrote a script a while back to ease the pain of doing mass updates to the threshold files. I placed the script up on my new MSN SkyDrive at:
http://xmtgwg.bay.livefilestore.com/y1pQ66zE0M9N-ppEr8fq-p8IHmaHNXmeUjOVbRJ0bzZDJkdn6_z843YGu9roVvU-Y4K25DSYuiDJOIRjzb68epyaA/EditXML.txt

Please keep in mind that I never intended to share this script, so it's a mess. ;-)

Also, in PAL v2.0, I am currently working on a threshold file inheritance model. I just finished the function that merges multiple threshold files together through the INHERITANCE tag. The idea is that most of the threshold files will inherit from the SystemOverview threshold file, so if I make a change to SystemOverview, it is automatically picked up by all of the threshold files that inherit from it, making threshold file management much easier. I'm not sure yet if I want to be backward compatible with v1.0 threshold files due to the extra work involved, but if I can get the inheritance working right, then it will be relatively easy to recreate them all.
Apr 20, 2009 at 11:13 AM
mangar00, clinth,

I am very interested in a more complete threshold file with .NET CLR ... counters.
Is something like that available at the moment? It would really be nice if I could use it.
Apr 20, 2009 at 7:28 PM
bsharp, unfortunately, I don't think so. You could wait for PAL v2.0. In the meantime, the best alternatives are to try Clint's personal tool that he posted in this thread OR to customize the .NET CLR template I posted here for your needs. It's just a bunch of copy & paste XML editing to customize the template for other counters. Once you have that, if you need to monitor different counter instances (say different processes), you can use regular expression search & replace on the template placeholder parameters and merge all of the replaced content into one giant XML threshold file.
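The regular expression search & replace and merge mentioned above might look something like this in Python. It is a sketch only: the two service names and their limits are hypothetical, the template is a shortened copy of the handle leak analysis from this thread, and the `ROOT` wrapper is a throwaway element used just to check that the merged text is well-formed.

```python
import re
import xml.etree.ElementTree as ET

# Shortened copy of the handle leak analysis from this thread.
TEMPLATE = r'''  <ANALYSIS NAME="Handle Leak Detection for pProcess" ENABLED="True" ANALYZECOUNTER="\Process(pProcess)\Handle Count" CATEGORY="Process">
    <THRESHOLD NAME="Handle Leak Suspected - more than pHMax handles and a trend of more than pHRate handles per hour" CONDITION="Warning" COLOR="Yellow" PRIORITY="50">
      <CODE><![CDATA[If AvgHandleCount > pHMax AND TrendHandleCount > pHRate Then
  IsTrendThresholdBroken = True
End If]]></CODE>
    </THRESHOLD>
  </ANALYSIS>
'''

# Hypothetical services and limits, one dict of placeholder values per process.
SERVICES = [
    {"pProcess": "OrderService", "pHMax": "2000", "pHRate": "100"},
    {"pProcess": "BillingService", "pHMax": "5000", "pHRate": "250"},
]


def fill(template, values):
    """Replace all placeholders in a single regex pass.

    A single pass means text that has already been substituted is never
    scanned again, so a value that happens to look like a placeholder
    cannot be replaced a second time.
    """
    keys = sorted(values, key=len, reverse=True)  # longest name first
    pattern = re.compile("|".join(re.escape(key) for key in keys))
    return pattern.sub(lambda match: values[match.group(0)], template)


# One filled-in copy of the template per service, concatenated together.
merged = "".join(fill(TEMPLATE, service) for service in SERVICES)

# Wrap in a throwaway root element purely to confirm the result parses.
root = ET.fromstring("<ROOT>\n" + merged + "</ROOT>")
analysis_names = [node.get("NAME") for node in root.findall("ANALYSIS")]
print(analysis_names)
```

The `merged` string is what would be pasted into the copy of the threshold file; the parse step is just a cheap sanity check before handing the file to PAL.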
Apr 21, 2009 at 11:45 AM
Clint, could you please repost your 'personal tool'. The link mentioned above does not work anymore.

Thanks,
bsharp
Coordinator
Apr 27, 2009 at 10:47 PM
Sorry, they must have changed all of the links. Here is a fresh one:
http://cid-e6360c54b48a891b.skydrive.live.com/self.aspx/.Public/VBScripts/EditXML.txt