I've been using LogParser whenever I need to really dig into Web Server Logs since before 2005. It's an amazing tool. I love it. Jeff Atwood loves it, and you should to. It may not being something you use every day but when you need it, it's there and it's awesome. It's kind of like a really focused sed or awk. A low-level tool with a high-powered focus.
Log Parser hasn't changed that I know of since 2005. I've been working with some folks to try to get it to escape the big house, but we'll see how far we get. Until then, it works fabulously and unchanged after all these years. It's great because while my primary use of LogParser is with IIS Log files, it'll query anything you can plug into it like the File System, Event Logs, the Registry or just a CSV file. The diagram from their docs is at right.
I did a blog post 6 years ago before FeedBurner where I analyzed traffic to my RSS feed from Newsgator. NewsGator was an RSS reader that would include statistics and information in its User-Agent HTTP Header. I was reminded of this post when I was talking to the NuGet team about how they are releasing new versions of NuGet every month or so but it's not clear how many people are upgrading. It'd also be interesting to find out what other ways folks are hitting the NuGet feed and what they are using to do it. I volunteered, so David Ebbo sent me a day's log file to "figure out."
Log Parser is wonderful because it effectively lets you run SQL queries against text files. Here's a few choice examples from Atwood's post a few years back:
Top 10 Slowest Items
SELECT TOP 10 cs-uri-stem AS Url, MIN(time-taken) as [Min],
AVG(time-taken) AS [Avg], max(time-taken) AS [Max],
count(time-taken) AS Hits
FROM ex*.log
WHERE time-taken < 120000
GROUP BY Url
ORDER BY [Avg] DESC
HTTP Errors Per Hour
SELECT date, QUANTIZE(time, 3600) AS Hour,
sc-status AS Status, COUNT(*) AS Errors
FROM ex*.log
WHERE (sc-status >= 400)
GROUP BY date, hour, sc-status
HAVING (Errors > 25)
ORDER BY Errors DESC
Given queries like these, I figured that LogParser would be perfect for me to explore the NuGet web service logs. (Of course, I realize that the service itself could be instrumented, but this is more flexible, and I plan to make these queries run on a schedule and show up on http://stats.nuget.org.)
There are a number of ways to access a NuGet packaging server. You can use the Add Package Dialog, the Command Line, the PowerShell Console within Visual Studio, or the NuGet Package Explorer. There's also some testing data and some "no user agent" stuff in there also. I filtered that out by just charting "NuGet" clients.
I started doing the initial work from the command line, but it was slow going. I was having trouble visualizing what I wanted and what was being returned. Here is one of my first command lines. It was pretty hairy and hard to build this at the command line.
C:\u_ex110831>LogParser.exe -i:IISW3C "SELECT DISTINCT cs(User-Agent) AS Client,
count(1) AS NumberOfHits
FROM u_ex110831.log
WHERE Client
LIKE 'NuGet%'
GROUP BY Client
ORDER by count(1) DESC"
Client NumberOfHits
------------------------------------------------------------------------------------------- ------------
NuGet+Add+Package+Dialog/1.4.20701.9038+(Microsoft+Windows+NT+6.1.7601+Service+Pack+1) 38840
NuGet+Command+Line/1.5.20830.9001+(Microsoft+Windows+NT+6.1.7601+Service+Pack+1) 15591
NuGet+Add+Package+Dialog/1.4.20701.9038+(Microsoft+Windows+NT+6.1.7600.0) 13360
NuGet+Command+Line/1.4.20615.182+(Microsoft+Windows+NT+6.1.7600.0) 8562
NuGet+Add+Package+Dialog/1.4.20607.9007+(Microsoft+Windows+NT+6.1.7601+Service+Pack+1) 5531
NuGet+Package+Manager+Console/1.4.20701.9038+(Microsoft+Windows+NT+6.1.7601+Service+Pack+1) 5497
NuGet+Command+Line/1.4.20615.182+(Microsoft+Windows+NT+6.1.7601+Service+Pack+1) 3699
NuGet+Package+Manager+Console/1.4.20701.9038+(Microsoft+Windows+NT+6.1.7600.0) 3654
NuGet+Add+Package+Dialog/1.4.20701.9038+(Microsoft+Windows+NT+5.1.2600+Service+Pack+3) 3558
NuGet+Command+Line/1.4.20615.182+(Microsoft+Windows+NT+5.2.3790+Service+Pack+2) 2539
Press a key...
There were 40 unique User Agents in this file and they include the client and its version as well as the operating system. I wanted first to chop it up to find out what Types of NuGet clients were being used. I have broken the lines up to make it clearer in this snippet.
C:\u_ex110831>LogParser.exe
-i:IISW3C "SELECT DISTINCT SUBSTR(cs(User-Agent),0, index_of(cs(User-Agent),'/')) AS Client,
count(1) AS NumberOfHits FROM u_ex110831.log
WHERE Client LIKE 'NuGet%'
GROUP BY Client
ORDER by count(1) DESC"
Client NumberOfHits
----------------------------- ------------
NuGet+Add+Package+Dialog 74761
NuGet+Command+Line 32284
NuGet+Package+Manager+Console 12637
NuGet+Package+Explorer 943
NuGet+Visual+Studio+Extension 49
Statistics:
-----------
Elements processed: 208235
Elements output: 5
Execution time: 0.79 seconds
Pretty amazing, though. A sub-second query over almost a quarter million line long log file with useful results and no database. Reminds me of working on Unix 20 years ago.
After some experimenting and installing the Office Web Components 2003 (discontinued) and was outputting a chart with this MONSTER command line:
C:\u_ex110831>LogParser.exe -i:IISW3C -o:CHART -chartType:PieExploded
-categories:Off -values:On -view:on
-chartTitle:"NuGet Clients by User Agent"
"SELECT DISTINCT SUBSTR(cs(User-Agent),0,index_of(cs(User-Agent),'/')) AS Client,
count(1) AS NumberOfHits
INTO foo.png
FROM u_ex110831.log
WHERE Client
LIKE 'NuGet%'
GROUP BY Client
ORDER by count(1) DESC"
Which yields me this profoundly 2003-looking chart, but still allows me to cheer a tiny victory inside. I will be able to get this (or a prettier one) to run as on a schedule (AT or Chron job) and serve it to the interwebs. I t'll probably be better to output a CSV or XML file, then process that with the web server and create a proper interactive chart. Regardless, tiny cheer.
Still, I'm thinking I'm too old for this crap. Where's my GUI? What's a brother got to do to drag a DataGrid around here? A little Binging with DuckDuckGo (yes, I'm trying DDG this month) and I find - wait for it - LogParser Lizard GUI.
LogParser Lizard GUI
What's this? Oh YES. It's intellisense and tooltips, baby!
I can't say how much faster this tool made me once I had figured out LogParser. It's funny how you have to suffer at the command line before you can really appreciate a good GUI. At this point I called Jon Galloway for some pair-SQLing and we pounded out a few more queries.
NuGet by Version
I filtered out NuGet Package Explorer because it has its own version scheme. However, I'm not sure about this query, as I wanted to get the Major.Minor versions. I noticed that by coincidence the third value z (of x.y.z) always started with .2 so I cheated with the SUB() below because I couldn't figure out how to just filter out the x.y values. Any thoughts are appreciated.
SELECT DISTINCT SUBSTR( cs(User-Agent),
ADD(index_of(cs(User-Agent),'/'),1),
SUB(index_of(cs(User-Agent),'.2'),STRLEN(cs(User-Agent))))
AS Client, count(1) AS NumberOfHits
FROM u_ex110831.log
WHERE cs(User-Agent) NOT LIKE '%Explorer%' AND cs(User-Agent) LIKE '%NuGet%'
GROUP BY Client
ORDER by count(1) DESC
Client Hits
------ ----- 1.4 98097
1.5 18985
1.3 2524
1.6 69
So then I did the whole version:
SELECT SUBSTR( cs(User-Agent),
ADD(index_of(cs(User-Agent),'/'),1),
SUB(index_of(cs(User-Agent),'+('),STRLEN(cs(User-Agent))))
AS Client, count(1) AS NumberOfHits
FROM u_ex110831.log
WHERE cs(User-Agent) NOT LIKE '%Explorer%' AND cs(User-Agent) LIKE '%NuGet%'
GROUP BY Client, cs(User-Agent)
ORDER by count(1) DESC
Client Hits
--------------- ------
1.4.20701.9038 38840
1.5.20830.9001 15591
1.4.20701.9038 13360
1.4.20615.182 8562
1.4.20607.9007 5531
1.4.20701.9037 5497
1.4.20615.182 3699
1.4.20701.9038 3654
I was extremely impressed with how quickly (about an hour) was able to get really substantive, interesting and targeted data out of these log files. The next step will be to get all the logs and run the command line tool create month over month line charts. The goal will be to figure out how many folks are successfully upgrading their NuGet installations as well as how they are using it. Are they using the right-click menu or are they using the console?
If you've got an application that makes HTTP calls to a service that you own, whether your application is a phone or a custom client, while you can certainly instrument your code on the server side to collect stats, there's a LOT of information in your IIS logs. You can use LogParser Lizard GUI to develop your queries and then schedule runs of the command line tool to generate reports that will really help you improve your product. This technique isn't as sophisticated as custom 3rd party analytics package but you can certainly get a surprising amount of information in a short amount of time with LogParser.
Related Links
- Charting with LogParser
- Parsing my IIS Log Files with LogParser 2.2 to learn more about Blogs stats from NewsGator and NewsGatorOnline
- Parsing CSVs and Poor Man's Web Log Analysis with PowerShell
- FREE Download - Log Parser Lizard
- Download LogParser 2.2
- Atwood on LogParser
- Download: Office Web Components 2003
- Forensic Log Parsing with Microsoft's LogParser
- Book: The LogParser Book (awesome)
© 2011 Scott Hanselman. All rights reserved.