Using LogParser with FusionReactor: Getting Started and Script Repository
(Created and edited by Charlie Arehart, with some additional content provided by Bill Shelton, marked near the bottom.) (This page used to live as a Google Group page, but Google removed that feature in 2010, so it was moved here.)
FusionReactor creates amazing logs of what's going on within your CF (or other J2EE server) environment, and they can be very helpful in understanding and resolving problems as well as in helping determine appropriate server resource settings and doing capacity planning.
Of course, when logs have lots of data, it can be a challenge pulling out meaningful information. The good news is that there's a great free tool which can help with that, Microsoft LogParser, and this page helps explain how to take maximum advantage of it, pointing out specific features enabled in FusionReactor that help make it easy to use with the FusionReactor logs (though it can be used with any logs).
You'll Need More Background to Benefit from this Page
This discussion was first introduced in a FusionReactor mailing list thread in early March 2008, and it would be worth your time to go read that thread first for additional context: http://groups.google.com/group/fusionreactor/browse_thread/thread/f04a0a1a11f32d9f. This page does not repeat the fundamental discussions introduced there. As a brief reminder, there are several useful logs in FR, which detail variously each request, each query, and all crash protection events, plus one that tracks overall activity and resource utilization every 5 seconds. These can be very powerful, but analyzing them can be challenging. The thread discusses the free Log Parser (http://www.microsoft.com/downloads/details.aspx?FamilyID=890cd06b-abf8-4c25-91b2-f8d975cf8c07&displaylang=en), a command-line tool from Microsoft, which enables you to process FusionReactor (and other) log files using SQL syntax.
You can find out much more about using LogParser generally from an article Charlie wrote, Monitoring Your CF Environment with the Free Log Parser Toolkit. It introduces Log Parser and discusses using it from a CF perspective. You'll want to read that too to get the most from the tool, and it points to additional resources for learning still more about it. There is also substantial online help provided with the tool, and of course you can search, such as with Google, to find many others talking about it. To save you that trouble, Charlie has put together a growing repository of LogParser resources.
About the Scripts on this Page
This page is focused on sample LogParser scripts to use with FR specifically. Significant for FR users is that in FR 3, Intergral now provides a set of "header" files which are perfect for use with LP. There's one for each of the log files, and each header file identifies the column names for its corresponding log file, which makes using LogParser very easy. (You can also find more about each log file in the corresponding page of the FR help.) The various LP SQL statements have been offered by different FR users: first by Charlie Arehart, who continues to update the page frequently, and so far also by Bill Shelton. Except for the scripts from Bill, which are identified clearly as his, the rest of the work here is from Charlie, and he accepts responsibility for any mistakes or needed corrections. Feel free to point them out to him on the list or by direct email as offered at carehart.org/contact.cfm. Again, these are just several samples to get you started: there are many variations possible. For instance, many of these are set to provide results as a tabular report. Note that it's possible as well to create charts and graphs, or store the results in a database, and much more. It's also possible to read multiple log files (of the same type) at once, and to store the SELECT statements in a separate file to be reused within the tool, passing in arguments for variation. Several of these features are demonstrated in examples further down the page.
Finally, note that since LogParser is a command-line tool, it's important that you be careful to enter the example commands below on a single line. While an attempt has been made to make these work as a single line, and a copy/paste should carry that through to your local machine as such, if it fails, just copy the command into an editor on your end and ensure it's a single line. Note as well that you can choose instead to put the SQL portion of each command into a "sql file" and point to it in your LogParser command (as discussed in the LogParser documentation); in that case you can freely have the SQL appear on multiple lines to make it easier to read.
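For instance, here's a minimal sketch of that sql-file approach (the slow-requests.sql file name and its %src% parameter are just illustrative, reusing the first request-log example below). The sql file might contain:
select top 10 *
from %src%
where exectime > 10000 and reqstatus='COMPLETED'
order by exectime desc
and you would then run it as:
logparser file:slow-requests.sql?src=request-0.log -i:tsv -iheaderfile:request-headers.txt -iseparator:space -nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1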
One other caution is that there are slight differences between the log files created in FusionReactor 3.0.1 versus 3.0, and significant differences in the logs created in the older FR 2. See the bottom of this page for more info.
Please add new contributions to the bottom of the page.
Contributions from Charlie Arehart
* List top 10 instances where there were 10 or more requests running at once, per the resource-0 log (which tracks the most recent resource log entries); a sketch for this one follows the next example
* List top 10 requests whose execution time exceeded 10 seconds, per the request-0 log (which tracks the most recent request log entries)
logparser "select top 10 * from request-0.log where
exectime > 10000 and reqstatus='COMPLETED' order by exectime desc" -i:tsv -iheaderfile:request-headers.txt -iseparator:space
-nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1
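For the first item above (instances of 10 or more requests running at once), here's a hedged sketch: the reqsrunning column name is an assumption, so check resource-headers.txt for the actual name of the column that tracks currently running requests:
logparser "select top 10 date, time, reqsrunning from resource-0.log where reqsrunning >= 10 order by reqsrunning desc" -i:tsv -iheaderfile:resource-headers.txt -iseparator:space -nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1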
* Total Number of Requests, per day, with min/avg/max execution times:
logparser "select date, count(*) as [Count], min(exectime) as [Min Ms], avg(exectime) as [Avg Ms], max(exectime)
as [Max MS] from request-*.log where reqstatus='COMPLETED'
group by date order by date desc"
-i:tsv -iheaderfile:request-headers.txt -iseparator:space -nsep:1 -fixedsep:on
-itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1
* Average Queries Per Request, per day
logparser "select date, div(count(*), sum(jdbcquerycount))
as [Avg Queries Per Request] from
request-*.log where reqstatus='COMPLETED' group by
date order by date desc" -i:tsv -iheaderfile:request-headers.txt -iseparator:space
-nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1
* Hourly report of average pages per minute and min/avg/max request execution times, per the resource logs
logparser "select QUANTIZE(to_timestamp(date,to_timestamp(time,'hh:mm:ss.ll')),
3600) AS hour, avg(reqsfinished)
as [Avg Pgs/Min], min(avgreqexectime) as [Min Ms], avg(avgreqexectime) as [Avg Ms], max(avgreqexectime)
as [Max Ms] from resource-*.log group by
hour order by hour asc" -i:tsv -iheaderfile:resource-headers.txt -iseparator:space -nsep:1 -fixedsep:on
-itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1 -ralign:on
--
* List top 10 queries whose total time exceeded 10 seconds, per the jdbc-0 log (which tracks the most recent jdbc log entries)
logparser "select top 10 * from jdbc-0.log where totaltime > 10000 and logtype='METRIC'
order by totaltime desc"
-i:tsv -iheaderfile:jdbc-headers.txt -iseparator:space -nsep:1 -fixedsep:on
-itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1
(Note that you must enable JDBC logging in FusionReactor, as it is not enabled by default. If you get the error 'Error: WHERE clause: Semantic Error: content field "10000" doesn't match type of left-operand "totaltime"', that's the problem. Also, when querying the JDBC log, note that there is a logtype column whose value is 'METRIC' for entries that track the end of a query's processing. It's important to add that to the WHERE clause; otherwise you may find entries for other logtype values, whose records are added when the notifyAfter and/or remindAfter arguments are added to the JDBC URL as FR configuration options.)
---
* Hourly Report of Min/Max Memory Used, and Requests Completed
Sometimes, when analyzing memory problems, it's best to focus not on when memory used is getting "high", but rather on when the low-water mark of memory used is getting high. In other words, since the JVM may let memory get "high" before kicking in to do a garbage collection, the question is how much memory is reclaimed (how far down does used memory fall) when it is GCed. This report shows both the low-water and high-water marks of heap memory used (not free), per hour, also noting the number of requests and queries that completed in that hour (the latter two columns may help identify whether there's some connection between request/query activity and memory use).
logparser "select
QUANTIZE(to_timestamp(date,to_timestamp(time,'hh:mm:ss.ll')),3600) AS hour,
min(mempct) as [MinMemPctUsed],
max(mempct) as [MaxMemPctUsed],
sum(reqsfinished) as [CF Requests Completed], sum(jdbcreqsfinished) as [Queries Completed] from
resource-0.log group by hour order by hour asc"
-i:tsv -iheaderfile:resource-headers.txt -iseparator:space -nsep:1 -fixedsep:on
-itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1 -ralign:on
Note that the report above (as with many) may be most useful if you could plot the data on a graph. In order to create graphs and charts with LogParser, you need the Microsoft Office Chart Web Components installed on the machine where Log Parser is running--which need not be the server where the logs being analyzed are created. Following is one example of how that chart might be created:
logparser "select
QUANTIZE(to_timestamp(date,to_timestamp(time,'hh:mm:ss.ll')),3600) AS hour,
min(mempct) as [MinMemPctUsed],
max(mempct) as [MaxMemPctUsed],
div(sum(reqsfinished),10) as [CF Requests Completed,
tens], div(sum(jdbcreqsfinished),1000) as [Queries
Completed, thousands] into mem.gif from resource-0.log group by hour order by
hour asc" -i:tsv
-iheaderfile:resource-headers.txt -iseparator:space
-nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -o:chart -charttype:line -groupsize:1280x1024 -view:on
-maxcategorylabels:40
Note that you name the file to hold the chart in an INTO clause, added before the FROM clause, and some arguments used in normal reports must be removed, like the rtp and ralign arguments, while the charttype and groupsize arguments control the type and size, respectively. See the LogParser help for more on available charttype values. Note as well that the count of requests or queries will typically be several orders of magnitude larger than the pct of memory used/free, so the counts have been divided into tens/thousands, respectively.
--
(Note: the rest of the examples here also refer to all the rotations of a given FR log file type (ending in -0 through -5, by default), not just the most recent (-0).)
* List URLs which have been executed more than 100 times, in order by the most hits, per day (and accounting for different request methods), across all request logs (not just the most recent)
logparser "select date, count(*) as hits, requrl,
reqmethod from request-*.log where reqstatus='COMPLETED' group by date, requrl, reqmethod having
hits > 100 order by date, hits desc" -i:tsv -iheaderfile:request-headers.txt -iseparator:space
-nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1
(Note that for tracking the number of request executions, it's critical to recognize that FR's request log writes an entry for both the start and the end of a request, so for the query above, you want to look only at completed requests: and reqstatus='COMPLETED'.)
--
* List URLs whose average exectime is > 10 seconds, in order by slowest avg time, per day (and accounting for different request methods), across all request logs, with cols right-aligned. This also shows how to support a column alias name that has a space within it (use brackets)
logparser "select date, requrl, reqmethod, avg(exectime) as [Avg Time (ms)] from request-*.log where reqstatus='COMPLETED' group by date, requrl, reqmethod having [Avg Time (ms)] > 10000 order by
date, [Avg
Time (ms)]
desc" -i:tsv
-iheaderfile:request-headers.txt -iseparator:space
-nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1 -ralign:on
--
* List URLs with any requests where exectime > 10 secs, showing avg/min/max time, in order by number of executions per day (and accounting for different request methods), across all request logs, with cols right-aligned
logparser "select date, count(*) as hits, requrl,
reqmethod, avg(exectime), min(exectime), max(exectime) from request-*.log where exectime > 10000 group by date, requrl, reqmethod order by
date, hits desc" -i:tsv
-iheaderfile:request-headers.txt -iseparator:space
-nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1 -ralign:on
* List above for yesterday only
logparser "select date, count(*) as hits, requrl,
reqmethod, avg(exectime), min(exectime), max(exectime) from request-*.log where exectime > 10000 and
date=to_Date(SUB(to_localtime(SYSTEM_TIMESTAMP()),TIMESTAMP('2','d'))) group
by date, requrl, reqmethod
order by date, hits desc" -i:tsv -iheaderfile:request-headers.txt -iseparator:space
-nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1 -ralign:on
(Note: to look at just today, change the value '2' in the TIMESTAMP() function to '1'. For the day before yesterday, make it '3', and so on. The to_localtime function must be used because the system_timestamp function will by default return the current date/time in UTC form, but the FR log dates are instead in local time.)
* List above for past 8 hours only
logparser "select date, count(*) as hits, requrl, reqmethod, avg(exectime), min(exectime), max(exectime) from request-*.log where exectime > 10000 and date=to_Date(SUB(to_localtime(SYSTEM_TIMESTAMP()),TIMESTAMP('1','d'))) and substr(time,0,8) > to_string(SUB(to_localtime(SYSTEM_TIMESTAMP()),TIMESTAMP('8','h')),'hh:mm:ss') group by date, requrl, reqmethod order by date, hits desc" -i:tsv -iheaderfile:request-headers.txt -iseparator:space -nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1 -ralign:on
* List requests that exceed 2 minutes during the past 8 hours
logparser "select date,time,requrl,reqmethod,exectime
from request-*.log where date =
to_Date(SUB(to_localtime(SYSTEM_TIMESTAMP()),TIMESTAMP('1','d'))) and substr(time,0,8) >
to_string(SUB(to_localtime(SYSTEM_TIMESTAMP()),TIMESTAMP('8','h')),'hh:mm:ss')
and exectime > 100000 order by date, time desc" -i:tsv
-iheaderfile:request-headers.txt -iseparator:space
-nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1 -ralign:on
---
* List URLs with any requests where bytes > 80k, showing avg/min/max bytes, in order by number of executions per day (and accounting for different request methods), across all request logs, with cols right-aligned
logparser "select date, count(*) as hits, requrl,
reqmethod, avg(bytes),
min(bytes), max(bytes) from request-*.log where bytes >
80000 group by date, requrl, reqmethod order by date, hits desc"
-i:tsv -iheaderfile:request-headers.txt -iseparator:space -nsep:1 -fixedsep:on
-itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1 -ralign:on
* List URLs with any requests where jdbctotaltime (sum of all queries in a request) > 2 secs, showing avg/min/max jdbctotaltime, in order by number of executions per day (and accounting for different request methods), across all request logs, with cols right-aligned
logparser "select date, count(*) as hits, requrl,
reqmethod, avg(jdbctotaltime), min(jdbctotaltime),
max(jdbctotaltime) from request-*.log where
jdbctotaltime > 2000 group by date, requrl, reqmethod order by
date, hits desc" -i:tsv
-iheaderfile:request-headers.txt -iseparator:space
-nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1 -ralign:on
* List count of requests finished per hour, in order by date/time, from the most recent resource log, with cols right-aligned
logparser "select QUANTIZE(to_timestamp(date,to_timestamp(time,'hh:mm:ss.ll')),
3600) AS hour, sum(reqsfinished) from resource-0.log
group by hour order by hour desc"
-i:tsv -iheaderfile:resource-headers.txt -iseparator:space -nsep:1 -fixedsep:on
-itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1 -ralign:on
To change the previous report to every 15 minutes, change 3600 to 900. To report every 5 minutes, change it to 300. To get the report for a particular date, add "where date='yyyy-mm-dd'" before the "group by", as in:
logparser "select QUANTIZE(to_timestamp(date,to_timestamp(time,'hh:mm:ss.ll')),
900) AS quarterhour, sum(reqsfinished)
from resource-0.log where date='2009-11-27' group by quarterhour order by quarterhour
desc" -i:tsv
-iheaderfile:resource-headers.txt -iseparator:space
-nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1 -ralign:on
--
* Show a breakdown per day of IP addresses with more than 1000 hits per day, in descending order by date/count, from the current request log, with cols right-aligned
logparser "select date, clientip, count(*)
as hits from request-0.log where reqstatus='COMPLETED'
group by date, clientip having hits > 1000 order
by date, hits desc" -i:tsv
-iheaderfile:request-headers.txt -iseparator:space
-nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1 -ralign:on
* Show the breakdown above per hour, with more than 50 hits per hour, in descending order by date/hour/count, from the current request log, with cols right-aligned
logparser "select date, QUANTIZE(to_timestamp(date,to_timestamp(time,'hh:mm:ss.ll')),
3600) AS hour, clientip, count(*) as hits from
request-0.log where reqstatus='COMPLETED' group
by date, hour, clientip having hits > 50 order
by date, hour, hits desc" -i:tsv -iheaderfile:request-headers.txt -iseparator:space
-nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1 -ralign:on
* List requests (url and client ip address) completed during a given minute in a day, in order by date/time, from all request logs, with cols right-aligned
logparser "select date, time, clientip,
requrl from request-*.log where reqstatus='COMPLETED' and to_string(to_timestamp(time,'hh:mm:ss.ll'),'hh:mm') = '16:35' and date='2009-11-27'" -i:tsv -iheaderfile:request-headers.txt -iseparator:space
-nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1 -ralign:on
* Group requests by IP address, completed during a given minute in a day, in order by count of requests, from all request logs, with cols right-aligned
logparser "select clientip, count(*)
as ipct from request-*.log where reqstatus='COMPLETED' and to_string(to_timestamp(time,'hh:mm:ss.ll'),'hh:mm') = '16:35' and date='2009-11-27' group by clientip order by ipct" -i:tsv -iheaderfile:request-headers.txt -iseparator:space
-nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1 -ralign:on
* Count the number of completed requests from a given IP address in a given request log, with cols right-aligned
logparser "select count(*) as hits from request-0.log where
clientip ='124.160.32.226' and reqstatus
= 'COMPLETED'" -i:tsv
-iheaderfile:request-headers.txt -iseparator:space
-nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1
* Count the number of completed requests per hour from a given IP address across all request logs, with cols right-aligned
logparser "select date, QUANTIZE(to_timestamp(date,to_timestamp(time,'hh:mm:ss.ll')), 3600) AS hour, count(*) as hits from
request-*.log where clientip ='124.160.32.226'
and reqstatus = 'COMPLETED' group by date,
hour order by date, hour" -i:tsv
-iheaderfile:request-headers.txt -iseparator:space
-nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1
Contributions from Bill Shelton:
* * *
logparser "select
top 1000 date, time, requrl, querystring,
div(exectime,1000) as et, div(jdbctotaltime,1000) as jdbcet,
statuscode into mylog.csv from
request-0.log order by et desc" -i:TSV -iSeparator:space -iHeaderFile:request-headers.txt
-o:csv
* * *
Here's a command I used for some postmortem work this week. It gets 65000 rows from the crashprotection-n.log and puts them into a csv file that I can open and play around with in Excel. Note that Excel allows a max of 65000 (plus change) rows. Note also that you might need to scroll right to see the full LogParser command.
logparser file:cp_summary_daterange_csv.sql?src=crashprotection-0.log+dest=crash_test_dummy.csv+start=2008-03-17+end=2008-03-18 -i:TSV -iSeparator:space -iHeaderFile:crashprotection-headers.txt -o:csv
Note the file command-line option and the HTTP URL-like syntax for parameters. The %start% param is the start of a date range and the %end% param is the end of the date range.
The cp_summary_daterange_csv.sql sql file looks like this:
select top 65000
date,time,mempct,clientip,exectime,cptype,action,requrl,querystring
into %dest%
from %src%
where date between '%start%' and '%end%'
order by date,time,mempct,exectime desc
Note that the %src% param can use a network location; I just happened to have copied the logs locally for analysis. Likewise, the %dest% param can point somewhere else, too.
* * *
Uploading FusionReactor log file data to a SQL Server database
LogParser "SELECT * INTO RequestLog FROM \\myserver\FusionReactor\instance\myinstance.name\log\request-*.log" -i:TSV -iSeparator:space -iheaderfile:request-headers.txt -iCheckPoint:request-log-checkpoint.lpc -o:SQL -server:MySqlServer -database:MyFusionReactorDb -createTable:Off -headerRow:off -transactionRowCount:1000
The INTO RequestLog indicates the table name. The credentials used are those of the account under which LogParser is running (i.e., the logged-in user). In this example, LogParser will not attempt to create the table, but you can set the -createTable option to ON and LogParser will attempt to create it. The -iCheckPoint:file-name option instructs LogParser not to re-scan all the log files on subsequent reads. This is very handy!
* * *
Ant Task Wrapper for LogParser (FR + Ant + LogParser = Frantic)
This is a hack I put together last year (2008), but it's what I use at work when they run frantically into my office and want to know why something is coming to a halt. It generates canned reports and allows you to do some ad hoc queries. If you know your way around Ant, this should seem familiar, and you can easily build in some lower-level automation.
- Documentation: http://cloud.github.com/downloads/virtix/frantic/frantic-doc.html
- Goodies: http://wiki.github.com/virtix/frantic/goodies
- Source Code: http://github.com/virtix/frantic/tree/master
- Binary: http://cloud.github.com/downloads/virtix/frantic/xant.jar
* * *
Notes:
If you'd like to offer your own LP commands, feel free to edit this document (assuming you're a member of the FusionReactor Google Group), adding new items at the bottom of the info above. If you're a member and find you can't edit this file, you may want to confirm that you are logged into groups.google.com using the same address with which you receive the FR group emails.
If you're using FR 2, there is now also a page for using Log Parser with FR 2: http://groups.google.com/group/fusionreactor/web/log-parser-files-for-fr-2 .
If you're using FR 3.0.1, the -headers.txt files have been updated to reflect the slight changes (primarily, the addition of a srvstartts column, and a change in the value of the version column from 3 to 4). Just be aware that you may still have some data in your logs from when you were running 3.0, which could cause errors with the examples above. If you get such errors, look carefully at your logs to see if you have entries with the version column set to 3. If so, you may want to move them into their own log. To do so, stop CF (or your J2EE server), copy the log file in question and give it a new name, then open that file to remove the new entries, and open the original FR log file to remove the older entries. This way you can still analyze the old log files, though you'll need to find or create headers.txt files to correspond to that old format (or maybe someone still running 3.0 can post that version's header files here).
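As a quick, hedged sketch for spotting leftover 3.0 entries, you could count entries per version value (note: since the old-format rows have one fewer column, the parse may misalign for them, so treat the output as a rough indicator only):
logparser "select version, count(*) as entries from request-0.log group by version" -i:tsv -iheaderfile:request-headers.txt -iseparator:space -nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1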