Using LogParser with FusionReactor: Getting Started and Script Repository
(Created and edited by Charlie Arehart, with some additional content provided by Bill Shelton, marked near the bottom.) (This page used to live as a Google Group page, but Google removed that feature in 2010, so it was moved here.)
FusionReactor creates amazing logs of what's going on within your CF (or other J2EE server) environment, and they can be very helpful in understanding and resolving problems as well as in helping determine appropriate server resource settings and doing capacity planning.
Of course, when logs have lots of data, it can be a challenge pulling out meaningful information. The good news is that there's a great free tool which can help with that, Microsoft LogParser, and this page helps explain how to take maximum advantage of it, pointing out specific features enabled in FusionReactor that help make it easy to use with the FusionReactor logs (though it can be used with any logs).
You'll Need More Background to Benefit from this Page
This discussion was first introduced in a FusionReactor mailing list thread in early March 2008, and it would be worth your time to go read that thread first for additional context: http://groups.google.com/group/fusionreactor/browse_thread/thread/f04a0a1a11f32d9f. This page does not repeat the fundamental discussions introduced there. As a brief reminder, there are several useful logs in FR, which detail variously each request, each query, and all crash protection events, plus one that tracks overall activity and resource utilization every 5 seconds. These can be very powerful, but analyzing them can be challenging. The thread discusses the free Log Parser (http://www.microsoft.com/downloads/details.aspx?FamilyID=890cd06b-abf8-4c25-91b2-f8d975cf8c07&displaylang=en), a command-line tool from Microsoft, which enables you to process FusionReactor (and other) log files using SQL syntax.
You can find out much more about using LogParser generally from an article Charlie wrote, Monitoring Your CF Environment with the Free Log Parser Toolkit. It introduces Log Parser and discusses using it from a CF perspective. You'll want to read that too to get the most from the tool, and it points to additional resources for learning still more about it. There is also substantial online help provided with the tool, and of course you can search, such as with Google, to find many others talking about it. To save you that trouble, Charlie has put together a growing repository of LogParser resources.
About the Scripts on this Page
This page is focused on sample LogParser scripts to use with FR specifically. Significant for FR users is that in FR 3, Intergral now provides a set of "header" files which are perfect for use with LP. There's one for each of the log files, and each header file identifies the column names for its corresponding log file, which makes using LogParser very easy. (You can also find more about each log file in the corresponding page of the FR help.) The various LP SQL statements have been offered by different FR users: first by Charlie Arehart, who continues to update the page frequently, and so far also by Bill Shelton. Except for the scripts from Bill, which are identified clearly as his, the rest of the work here is from Charlie, and he accepts responsibility for any mistakes or needed corrections. Feel free to point them out to him on the list or by direct email as offered at carehart.org/contact.cfm. Again, these are just several samples to get you started: there are many variations possible. For instance, many of these are set to provide results as a tabular report. Note that it's possible as well to create charts and graphs, or store the results in a database, and much more. It's also possible to read multiple log files (of the same type) at once, and to store the SELECT statements in a separate file to be reused within the tool, passing in arguments for variation. Several of these features are demonstrated in examples further down the page.
Finally, note that since LogParser is a command-line tool, it's important that you be careful to enter the example commands below on a single line. While an attempt has been made to make these work as a single line, and a copy/paste should carry that through to your local machine as such, if it fails, just copy the command into an editor on your end and ensure it's a single line. Note as well that you can choose instead to put the SQL portion of each command into a "sql file" and point to it in your LogParser command (as discussed in the LogParser documentation); in that case you can freely have the SQL appear on multiple lines to make it easier to read.
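For instance, here's a minimal sketch of that sql-file approach (the slow-requests.sql file name and its %src% parameter are just illustrative, reusing the first request-log example below). The sql file might contain:
select top 10 *
from %src%
where exectime > 10000 and reqstatus='COMPLETED'
order by exectime desc
and you would then run it as:
logparser file:slow-requests.sql?src=request-0.log -i:tsv -iheaderfile:request-headers.txt -iseparator:space -nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1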
One other caution is that there are slight differences between the log files created in FusionReactor 3.0.1 versus 3.0, and significant differences in the logs created in the older FR 2. See the bottom of this page for more info.
Please add new contributions to the bottom of the page.
Contributions from Charlie Arehart
* List top 10 instances where there were 10 or more requests running at once, per the resource-0 log (which tracks the most recent resource log entries); a sketch for this one follows the next example
* List top 10 requests whose execution time exceeded 10 seconds, per the request-0 log (which tracks the most recent request log entries)
logparser "select top 10 * from request-0.log where
exectime > 10000 and reqstatus='COMPLETED' order by exectime desc" -i:tsv -iheaderfile:request-headers.txt -iseparator:space
-nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1
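For the first item above (instances of 10 or more requests running at once), here's a hedged sketch: the reqsrunning column name is an assumption, so check resource-headers.txt for the actual name of the column that tracks currently running requests:
logparser "select top 10 date, time, reqsrunning from resource-0.log where reqsrunning >= 10 order by reqsrunning desc" -i:tsv -iheaderfile:resource-headers.txt -iseparator:space -nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1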
* Total Number of Requests, per day, with min/avg/max execution times:
logparser "select date, count(*) as [Count], min(exectime) as [Min Ms], avg(exectime) as [Avg Ms], max(exectime)
as [Max MS] from request-*.log where reqstatus='COMPLETED'
group by date order by date desc"
-i:tsv -iheaderfile:request-headers.txt -iseparator:space -nsep:1 -fixedsep:on
-itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1
* Average Queries Per Request, per day
logparser "select date, div(count(*), sum(jdbcquerycount))
as [Avg Queries Per Request] from
request-*.log where reqstatus='COMPLETED' group by
date order by date desc" -i:tsv -iheaderfile:request-headers.txt -iseparator:space
-nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1
* Hourly report of average pages per minute and min/avg/max request execution times, per the resource logs
logparser "select QUANTIZE(to_timestamp(date,to_timestamp(time,'hh:mm:ss.ll')),
3600) AS hour, avg(reqsfinished)
as [Avg Pgs/Min], min(avgreqexectime) as [Min Ms], avg(avgreqexectime) as [Avg Ms], max(avgreqexectime)
as [Max Ms] from resource-*.log group by
hour order by hour asc" -i:tsv -iheaderfile:resource-headers.txt -iseparator:space -nsep:1 -fixedsep:on
-itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1 -ralign:on
--
* List top 10 queries whose total time exceeded 10 seconds, per the jdbc-0 log (which tracks the most recent jdbc log entries)
logparser "select top 10 * from jdbc-0.log where totaltime > 10000 and logtype='METRIC'
order by totaltime desc"
-i:tsv -iheaderfile:jdbc-headers.txt -iseparator:space -nsep:1 -fixedsep:on
-itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1
(Note that you must enable JDBC logging in FusionReactor, as it is not enabled by default. If you get the error 'Error: WHERE clause: Semantic Error: content field "10000" doesn't match type of left-operand "totaltime"', that's the problem. Also, when querying the JDBC log, note that there is a logtype column whose value is 'METRIC' for entries that track the end of a query's processing. It's important to add that to the WHERE clause; otherwise you may find entries for other logtype values, whose records are added when the notifyAfter and/or remindAfter arguments are added to the JDBC URL as FR configuration options.)
---
* Hourly Report of Min/Max Memory Used, and Requests Completed
Sometimes, when analyzing memory problems, it's best to focus not on when memory used is getting "high", but rather on when the low-water mark of memory used is getting high. In other words, since the JVM may let memory get "high" before kicking in to do a garbage collection, the question is how much memory is reclaimed (how far down does used memory fall) when it is GCed. This report shows both the low-water and high-water marks of heap memory used (not free), per hour, also noting the number of requests and queries that completed in that hour (the latter two columns may help identify whether there's some connection between request/query activity and memory use).
logparser "select
QUANTIZE(to_timestamp(date,to_timestamp(time,'hh:mm:ss.ll')),3600) AS hour,
min(mempct) as [MinMemPctUsed],
max(mempct) as [MaxMemPctUsed],
sum(reqsfinished) as [CF Requests Completed], sum(jdbcreqsfinished) as [Queries Completed] from
resource-0.log group by hour order by hour asc"
-i:tsv -iheaderfile:resource-headers.txt -iseparator:space -nsep:1 -fixedsep:on
-itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1 -ralign:on
Note that the report above (as with many) may be most useful if you could plot the data on a graph. In order to create graphs and charts with LogParser, you need the Microsoft Office Chart Web Components installed on the machine where Log Parser is running--which need not be the server where the logs being analyzed are created. Following is one example of how that chart might be created:
logparser "select
QUANTIZE(to_timestamp(date,to_timestamp(time,'hh:mm:ss.ll')),3600) AS hour,
min(mempct) as [MinMemPctUsed],
max(mempct) as [MaxMemPctUsed],
div(sum(reqsfinished),10) as [CF Requests Completed,
tens], div(sum(jdbcreqsfinished),1000) as [Queries
Completed, thousands] into mem.gif from resource-0.log group by hour order by
hour asc" -i:tsv
-iheaderfile:resource-headers.txt -iseparator:space
-nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -o:chart -charttype:line -groupsize:1280x1024 -view:on
-maxcategorylabels:40
Note that you name the file to hold the chart in an INTO clause, added before the FROM clause, and some arguments used in normal reports must be removed, like the rtp and ralign arguments, while the charttype and groupsize arguments control the type and size, respectively. See the LogParser help for more on available charttype values. Note as well that the count of requests or queries will typically be several orders of magnitude larger than the pct of memory used/free, so the counts have been divided into tens/thousands, respectively.
--
(Note: the rest of the examples here also refer to all the rotations of a given FR log file type (ending in -0 through -5, by default), not just the most recent (-0).)
* List URLs which have been executed more than 100 times, in order by the most hits, per day (and accounting for different request methods), across all request logs (not just the most recent)
logparser "select date, count(*) as hits, requrl,
reqmethod from request-*.log where reqstatus='COMPLETED' group by date, requrl, reqmethod having
hits > 100 order by date, hits desc" -i:tsv -iheaderfile:request-headers.txt -iseparator:space
-nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1
(Note that for tracking the number of request executions, it's critical to recognize that FR's request log writes an entry for both the start and the end of a request, so for the query above, you want to look only at completed requests: and reqstatus='COMPLETED'.)
--
* List URLs whose average exectime is > 10 seconds, in order by slowest avg time, per day (and accounting for different request methods), across all request logs, with cols right-aligned. This also shows how to support a column alias name that has a space within it (use brackets)
logparser "select date, requrl, reqmethod, avg(exectime) as [Avg Time (ms)] from request-*.log where reqstatus='COMPLETED' group by date, requrl, reqmethod having [Avg Time (ms)] > 10000 order by
date, [Avg
Time (ms)]
desc" -i:tsv
-iheaderfile:request-headers.txt -iseparator:space
-nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1 -ralign:on
--
* List URLs with any requests where exectime > 10 secs, showing avg/min/max time, in order by number of executions per day (and accounting for different request methods), across all request logs, with cols right-aligned
logparser "select date, count(*) as hits, requrl,
reqmethod, avg(exectime), min(exectime), max(exectime) from request-*.log where exectime > 10000 group by date, requrl, reqmethod order by
date, hits desc" -i:tsv
-iheaderfile:request-headers.txt -iseparator:space
-nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1 -ralign:on
* List above for yesterday only
logparser "select date, count(*) as hits, requrl,
reqmethod, avg(exectime), min(exectime), max(exectime) from request-*.log where exectime > 10000 and
date=to_Date(SUB(to_localtime(SYSTEM_TIMESTAMP()),TIMESTAMP('2','d'))) group
by date, requrl, reqmethod
order by date, hits desc" -i:tsv -iheaderfile:request-headers.txt -iseparator:space
-nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1 -ralign:on
(Note: to look at just today, change the value '2' in the TIMESTAMP() function to '1'. For the day before yesterday, make it '3', and so on. The to_localtime function must be used because the system_timestamp function will by default return the current date/time in UTC form, but the FR log dates are instead in local time.)
* List above for past 8 hours only
logparser "select date, count(*) as hits, requrl, reqmethod, avg(exectime), min(exectime), max(exectime) from request-*.log where exectime > 10000 and date=to_Date(SUB(to_localtime(SYSTEM_TIMESTAMP()),TIMESTAMP('1','d'))) and substr(time,0,8) > to_string(SUB(to_localtime(SYSTEM_TIMESTAMP()),TIMESTAMP('8','h')),'hh:mm:ss') group by date, requrl, reqmethod order by date, hits desc" -i:tsv -iheaderfile:request-headers.txt -iseparator:space -nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1 -ralign:on
* List requests that exceed 2 minutes during the past 8 hours
logparser "select date,time,requrl,reqmethod,exectime
from request-*.log where date =
to_Date(SUB(to_localtime(SYSTEM_TIMESTAMP()),TIMESTAMP('1','d'))) and substr(time,0,8) >
to_string(SUB(to_localtime(SYSTEM_TIMESTAMP()),TIMESTAMP('8','h')),'hh:mm:ss')
and exectime > 100000 order by date, time desc" -i:tsv
-iheaderfile:request-headers.txt -iseparator:space
-nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1 -ralign:on
---
* List URLs with any requests where bytes > 80k, showing avg/min/max bytes, in order by number of executions per day (and accounting for different request methods), across all request logs, with cols right-aligned
logparser "select date, count(*) as hits, requrl,
reqmethod, avg(bytes),
min(bytes), max(bytes) from request-*.log where bytes >
80000 group by date, requrl, reqmethod order by date, hits desc"
-i:tsv -iheaderfile:request-headers.txt -iseparator:space -nsep:1 -fixedsep:on
-itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1 -ralign:on
* List URLs with any requests where jdbctotaltime (sum of all queries in a request) > 2 secs, showing avg/min/max jdbctotaltime, in order by number of executions per day (and accounting for different request methods), across all request logs, with cols right-aligned
logparser "select date, count(*) as hits, requrl,
reqmethod, avg(jdbctotaltime), min(jdbctotaltime),
max(jdbctotaltime) from request-*.log where
jdbctotaltime > 2000 group by date, requrl, reqmethod order by
date, hits desc" -i:tsv
-iheaderfile:request-headers.txt -iseparator:space
-nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1 -ralign:on
* List count of requests finished per hour, in order by date/time, from the most recent resource log, with cols right-aligned
logparser "select QUANTIZE(to_timestamp(date,to_timestamp(time,'hh:mm:ss.ll')),
3600) AS hour, sum(reqsfinished) from resource-0.log
group by hour order by hour desc"
-i:tsv -iheaderfile:resource-headers.txt -iseparator:space -nsep:1 -fixedsep:on
-itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1 -ralign:on
To change the previous report to every 15 minutes, change 3600 to 900. To report every 5 minutes, change it to 300. To get the report for a particular date, add "where date='yyyy-mm-dd'" before the "group by", as in:
logparser "select QUANTIZE(to_timestamp(date,to_timestamp(time,'hh:mm:ss.ll')),
900) AS quarterhour, sum(reqsfinished)
from resource-0.log where date='2009-11-27' group by quarterhour order by quarterhour
desc" -i:tsv
-iheaderfile:resource-headers.txt -iseparator:space
-nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1 -ralign:on
--
* Show a breakdown per day of IP addresses with more than 1000 hits per day, in descending order by date/count, from the current request log, with cols right-aligned
logparser "select date, clientip, count(*)
as hits from request-0.log where reqstatus='COMPLETED'
group by date, clientip having hits > 1000 order
by date, hits desc" -i:tsv
-iheaderfile:request-headers.txt -iseparator:space
-nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1 -ralign:on
* Show the breakdown above per hour, with more than 50 hits per hour, in descending order by date/hour/count, from the current request log, with cols right-aligned
logparser "select date, QUANTIZE(to_timestamp(date,to_timestamp(time,'hh:mm:ss.ll')),
3600) AS hour, clientip, count(*) as hits from
request-0.log where reqstatus='COMPLETED' group
by date, hour, clientip having hits > 50 order
by date, hour, hits desc" -i:tsv -iheaderfile:request-headers.txt -iseparator:space
-nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1 -ralign:on
* List requests (url and client ip address) completed during a given minute in a day, in order by date/time, from all request logs, with cols right-aligned
logparser "select date, time, clientip,
requrl from request-*.log where reqstatus='COMPLETED' and to_string(to_timestamp(time,'hh:mm:ss.ll'),'hh:mm') = '16:35' and date='2009-11-27'" -i:tsv -iheaderfile:request-headers.txt -iseparator:space
-nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1 -ralign:on
* Group requests by IP address, completed during a given minute in a day, in order by count of requests, from all request logs, with cols right-aligned
logparser "select clientip, count(*)
as ipct from request-*.log where reqstatus='COMPLETED' and to_string(to_timestamp(time,'hh:mm:ss.ll'),'hh:mm') = '16:35' and date='2009-11-27' group by clientip order by ipct" -i:tsv -iheaderfile:request-headers.txt -iseparator:space
-nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1 -ralign:on
* Count the number of completed requests from a given IP address in a given request log, with cols right-aligned
logparser "select count(*) as hits from request-0.log where
clientip ='124.160.32.226' and reqstatus
= 'COMPLETED'" -i:tsv
-iheaderfile:request-headers.txt -iseparator:space
-nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1
* Count the number of completed requests per hour from a given IP address across all request logs, with cols right-aligned
logparser "select date, QUANTIZE(to_timestamp(date,to_timestamp(time,'hh:mm:ss.ll')), 3600) AS hour, count(*) as hits from
request-*.log where clientip ='124.160.32.226'
and reqstatus = 'COMPLETED' group by date,
hour order by date, hour" -i:tsv
-iheaderfile:request-headers.txt -iseparator:space
-nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1
Contributions from Bill Shelton:
* * *
logparser "select
top 1000 date, time, requrl, querystring,
div(exectime,1000) as et, div(jdbctotaltime,1000) as jdbcet,
statuscode into mylog.csv from
request-0.log order by et desc" -i:TSV -iSeparator:space -iHeaderFile:request-headers.txt
-o:csv
* * *
Here's a command I used for some postmortem work this week. It gets 65000 rows from the crashprotection-n.log and puts them into a csv file that I can open and play around with in Excel. Note that Excel allows a max of 65000 (plus change) rows. Note also that you might need to scroll right to see the full LogParser command.
logparser file:cp_summary_daterange_csv.sql?src=crashprotection-0.log+dest=crash_test_dummy.csv+start=2008-03-17+end=2008-03-18 -i:TSV -iSeparator:space -iHeaderFile:crashprotection-headers.txt -o:csv
Note the file command-line option and the HTTP URL-like syntax for parameters. The %start% param is the start of a date range and the %end% param is the end of the date range.
The cp_summary_daterange_csv.sql sql file looks like this:
select top 65000
date,time,mempct,clientip,exectime,cptype,action,requrl,querystring
into %dest%
from %src%
where date between '%start%' and '%end%'
order by date,time,mempct,exectime desc
Note that the %src% param can use a network location; I just happened to have copied the logs locally for analysis. Likewise, the %dest% param can point somewhere else, too.
* * *
Uploading FusionReactor log file data to a SQL Server database
LogParser "SELECT * INTO RequestLog FROM \\myserver\FusionReactor\instance\myinstance.name\log\request-*.log" -i:TSV -iSeparator:space -iheaderfile:request-headers.txt -iCheckPoint:request-log-checkpoint.lpc -o:SQL -server:MySqlServer -database:MyFusionReactorDb -createTable:Off -headerRow:off -transactionRowCount:1000
The INTO RequestLog indicates the table name. The credentials used are those of the account under which LogParser is running (i.e., the logged-in user). In this example, LogParser will not attempt to create the table, but you can set the -createTable option to ON and LogParser will attempt to create it. The -iCheckPoint:file-name option instructs LogParser not to re-scan all the log files on subsequent reads. This is very handy!
* * *
Ant Task Wrapper for LogParser (FR + Ant + LogParser = Frantic)
This is a hack I put together last year (2008), but it's what I use at work when they run frantically into my office and want to know why something is coming to a halt. It generates canned reports and allows you to do some ad hoc queries. If you know your way around Ant, this should seem familiar, and you can easily build in some lower-level automation.
- Documentation: http://cloud.github.com/downloads/virtix/frantic/frantic-doc.html
- Goodies: http://wiki.github.com/virtix/frantic/goodies
- Source Code: http://github.com/virtix/frantic/tree/master
- Binary: http://cloud.github.com/downloads/virtix/frantic/xant.jar
* * *
Notes:
If you'd like to offer your own LP commands, feel free to edit this document (assuming you're a member of the FusionReactor Google Group), adding new items at the bottom of the info above. If you're a member and find you can't edit this file, you may want to confirm that you are logged into groups.google.com using the same address with which you receive the FR group emails.
If you're using FR 2, there is now also a page for using Log Parser with FR 2: http://groups.google.com/group/fusionreactor/web/log-parser-files-for-fr-2 .
If you're using FR 3.0.1, the -headers.txt files have been updated to reflect the slight changes (primarily, the addition of a srvstartts column, and a change in the value of the version column from 3 to 4). Just be aware that you may still have some data in your logs from when you were running 3.0, which could cause errors with the examples above. If you get such errors, look carefully at your logs to see if you have entries with the version column set to 3. If so, you may want to move them into their own log. To do so, stop CF (or your J2EE server), copy the log file in question and give it a new name, then open that file to remove the new entries, and open the original FR log file to remove the older entries. This way you can still analyze the old log files, though you'll need to find or create headers.txt files to correspond to that old format (or maybe someone still running 3.0 can post that version's header files here).
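As a quick, hedged sketch for spotting leftover 3.0 entries, you could count entries per version value (note: since the old-format rows have one fewer column, the parse may misalign for them, so treat the output as a rough indicator only):
logparser "select version, count(*) as entries from request-0.log group by version" -i:tsv -iheaderfile:request-headers.txt -iseparator:space -nsep:1 -fixedsep:on -itsformat:"yyyy-MM-dd" -headerrow:off -rtp:-1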