weblog_parse - extract specified fields from a web log file


Reads a web server log file in either "Common Logfile Format" or "Combined Logfile Format", parses it, and writes out only the user-specified fields, separated by tabs for easier handling. Examples:

Show filenames and byte counts:
  weblog_parse file bytes
Show just dates, for making a histogram via timegraph:
  weblog_parse date
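
To illustrate the kind of parsing involved, here is a minimal sketch in C that splits one log line into fields. It is not the weblog_parse source: the struct log_entry type and the parse_log_line() and print_file_bytes() names are hypothetical, the date is assumed to be everything between the square brackets, and a request with no protocol part (such as "GET /") is rejected for simplicity.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type. The member names mirror the field names used
   on the weblog_parse command line; the layout itself is an assumption. */
struct log_entry {
    char host[256];
    char date[64];
    char file[1024];
    char referer[1024];   /* empty for Common Logfile Format lines */
    int  status;
    long bytes;
};

/* Parse one Common or Combined Logfile Format line.
   Returns 1 on success, 0 if the line does not fit the expected shape. */
int parse_log_line(const char *line, struct log_entry *e)
{
    char method[16];
    char bytes_str[32];
    int consumed = 0;

    assert(line != NULL && e != NULL);
    e->referer[0] = '\0';

    /* host ident authuser [date] "method file protocol" status bytes */
    if (sscanf(line,
               "%255s %*s %*s [%63[^]]] \"%15s %1023s %*[^\"]\" %d %31s%n",
               e->host, e->date, method, e->file,
               &e->status, bytes_str, &consumed) < 6)
        return 0;

    /* The byte count is logged as "-" when nothing was sent. */
    e->bytes = (strcmp(bytes_str, "-") == 0) ? 0 : strtol(bytes_str, NULL, 10);

    /* Combined format appends "referer" "user-agent"; the referer is optional. */
    sscanf(line + consumed, " \"%1023[^\"]\"", e->referer);
    return 1;
}

/* Write the file and bytes fields separated by a tab. */
void print_file_bytes(const struct log_entry *e, FILE *out)
{
    fprintf(out, "%s\t%ld\n", e->file, e->bytes);
}
```

A caller would read lines with fgets(), pass each one to parse_log_line(), and print the requested fields for the lines that parse.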

This is intended as a utility for writing web-log statistics generators. It's written in C and is very fast.

In addition to extracting the specified fields, the program can do some simple database-like conditional matching. If a field name on the command line is followed by an equals sign and a string, then only lines where that field matches the string are shown. The string can contain wildcards. If you use a '^' instead of the '=', then only lines where the field does not match the string are shown. Examples:

Show files fetched by a particular host:
  weblog_parse file host=anvil.acme.com
Show files and referers fetched during the noon hour:
  weblog_parse file referer 'date=.*:12:..:..'
Show all unsuccessful fetches and where they came from:
  weblog_parse status^200 file host referer
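
The argument syntax above can be sketched in C as follows. This is only an illustration of how an argument such as host=anvil.acme.com or status^200 might be split; struct field_spec, enum match_op, and parse_field_spec() are hypothetical names, not taken from the weblog_parse source.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical representation of one command-line argument. */
enum match_op { OP_NONE, OP_MATCH, OP_EXCLUDE };

struct field_spec {
    char name[32];          /* field name, e.g. "host" */
    enum match_op op;       /* OP_NONE when the field is only printed */
    const char *pattern;    /* points into the argument; NULL for OP_NONE */
};

/* Split "name", "name=pattern" or "name^pattern".
   Returns 1 on success, 0 if the field name is empty or too long. */
int parse_field_spec(const char *arg, struct field_spec *spec)
{
    size_t len;

    assert(arg != NULL && spec != NULL);

    /* The field name runs up to the first '=' or '^', if there is one. */
    len = strcspn(arg, "=^");
    if (len == 0 || len >= sizeof spec->name)
        return 0;
    memcpy(spec->name, arg, len);
    spec->name[len] = '\0';

    switch (arg[len]) {
    case '=':
        spec->op = OP_MATCH;
        spec->pattern = arg + len + 1;
        break;
    case '^':
        spec->op = OP_EXCLUDE;
        spec->pattern = arg + len + 1;
        break;
    default:                /* end of string: no condition given */
        spec->op = OP_NONE;
        spec->pattern = NULL;
        break;
    }
    return 1;
}
```

Because only the first '=' or '^' ends the field name, a pattern may itself contain either character.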

Note that if the string you're matching on doesn't have any wildcard characters in it, the program uses plain old strcmp(), so it remains very fast.
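
The strcmp() shortcut can be sketched in C like this. Several things here are assumptions rather than facts about weblog_parse: the wildcard syntax is taken to be POSIX extended regular expressions (suggested by the '.*:12:..:..' example), the set of characters treated as wildcards is a guess based on that syntax, and a pattern is required to match the whole field. The names field_matches() and line_selected() are hypothetical.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>
#include <string.h>

/* Characters that send a pattern down the slow path. Assumed from POSIX
   extended regular expression syntax, not from the weblog_parse source. */
static const char WILDCARD_CHARS[] = ".*+?[](){}|\\^$";

/* Returns 1 if value matches pattern, 0 if it does not,
   and -1 if the pattern is too long or fails to compile. */
int field_matches(const char *value, const char *pattern)
{
    char anchored[1024];
    regex_t re;
    int rc;

    assert(value != NULL && pattern != NULL);

    /* Fast path: no wildcard characters, so a plain comparison suffices. */
    if (strpbrk(pattern, WILDCARD_CHARS) == NULL)
        return strcmp(value, pattern) == 0;

    /* Slow path: anchor the pattern so it must cover the whole field,
       which keeps the behavior consistent with the strcmp() path. */
    if (snprintf(anchored, sizeof anchored, "^(%s)$", pattern)
            >= (int)sizeof anchored)
        return -1;
    if (regcomp(&re, anchored, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, value, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}

/* negate is 0 for "field=pattern" and 1 for "field^pattern".
   An invalid pattern selects nothing. */
int line_selected(const char *value, const char *pattern, int negate)
{
    int m = field_matches(value, pattern);
    if (m < 0)
        return 0;
    return negate ? !m : m;
}
```

This sketch compiles the expression on every call to stay short. A program that processes many lines would compile each pattern once at startup and reuse the compiled form.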

