AWStats Logfile Analyzer
I wanted to report on basic activity on my web server so I found and installed AWstats, which I really like.
Importing Apache logs into SQL
Later I wanted to see more details so I started looking for a way to convert Log file entries to database tables. I found the following on goolge and listed them in the order I think will work best on linux.
- Importing Apache (httpd) logs into MySQL
- LogToMysql - piped logging from Apache to MySQL
- Query Apache logfiles via SQL (windows)
- Parsing Apache access_log files using php, mySQL LOAD, importing to mySQL
- Convert apache http combined logs into sql (and import it into a mysql database eventually)
My log files from apache with awstats are in the combined format as follows:
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
Custom Log Formats
The format argument to the LogFormat and CustomLog directives is a string. This string is used to log each request to the log file. It can contain literal characters copied into the log files and the C-style control characters "\n" and "\t" to represent new-lines and tabs. Literal quotes and back-slashes should be escaped with back-slashes.
The characteristics of the request itself are logged by placing "%" directives in the format string, which are replaced in the log entry by the values as follows:
%...a: Remote IP-address %...A: Local IP-address %...B: Bytes sent, excluding HTTP headers. %...b: Bytes sent, excluding HTTP headers. In CLF format i.e. a '-' rather than a 0 when no bytes are sent. %...c: Connection status when response was completed. 'X' = connection aborted before the response completed. '+' = connection may be kept alive after the response is sent. '-' = connection will be closed after the response is sent. %...{FOOBAR}e: The contents of the environment variable FOOBAR %...f: Filename %...h: Remote host %...H The request protocol %...{Foobar}i: The contents of Foobar: header line(s) in the request sent to the server. %...l: Remote logname (from identd, if supplied) %...m The request method %...{Foobar}n: The contents of note "Foobar" from another module. %...{Foobar}o: The contents of Foobar: header line(s) in the reply. %...p: The canonical Port of the server serving the request %...P: The process ID of the child that serviced the request. %...q The query string (prepended with a ? if a query string exists, otherwise an empty string) %...r: First line of request %...s: Status. For requests that got internally redirected, this is the status of the *original* request --- %...>s for the last. %...t: Time, in common log format time format (standard english format) %...{format}t: The time, in the form given by format, which should be in strftime(3) format. (potentially localized) %...T: The time taken to serve the request, in seconds. %...u: Remote user (from auth; may be bogus if return status (%s) is 401) %...U: The URL path requested, not including any query string. %...v: The canonical ServerName of the server serving the request. %...V: The server name according to the UseCanonicalName setting.