2.2. lars.apache - Reading Apache Logs

This module provides a wrapper for Apache log files, typically in common or combined format (but technically any Apache format which can be unambiguously parsed with regexes).

The ApacheSource class is the major element that this module exports. It wraps a file-like object containing a common, combined, or otherwise Apache-formatted log file and yields rows from it as tuples.

2.2.1. Classes

class lars.apache.ApacheSource(source, log_format=COMMON)

Wraps a stream containing an Apache-formatted log file.

This wrapper converts a stream containing an Apache log file into an iterable which yields tuples. Each tuple has fieldnames derived from the following mapping of Apache format strings (which occur in the optional log_format parameter):

Format String    Field Name
%a               remote_ip
%A               local_ip
%B               size
%b               size
%{Foobar}C       cookie_Foobar (1)
%D               time_taken_ms
%{FOOBAR}e       env_FOOBAR (1)
%f               filename
%h               remote_host
%H               protocol
%{Foobar}i       req_Foobar (1)
%k               keepalive
%l               ident
%m               method
%{Foobar}n       note_Foobar (1)
%{Foobar}o       resp_Foobar (1)
%p               port
%{canonical}p    port
%{local}p        local_port
%{remote}p       remote_port
%P               pid
%{pid}P          pid
%{tid}P          tid
%{hextid}P       hextid
%q               url_query
%r               request
%R               handler
%s               status
%t               time
%{format}t       time
%T               time_taken
%u               remote_user
%U               url_stem
%v               server_name
%V               canonical_name
%X               connection_status
%I               bytes_received
%O               bytes_sent

Notes:

  1. Any characters in the field-name which are invalid in a Python identifier are converted to underscore, e.g. %{foo-bar}C becomes "cookie_foo_bar".
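
The mapping and note (1) can be illustrated with a minimal sketch. This is not the module's implementation: the helper field_names, the two dictionaries, and the DIRECTIVE pattern are hypothetical names introduced here, and only a small subset of the table is covered.

```python
import re

# Hypothetical subset of the table above, keyed by directive letter.
SIMPLE_FIELDS = {
    'h': 'remote_host', 'l': 'ident', 'u': 'remote_user', 't': 'time',
    'r': 'request', 's': 'status', 'b': 'size',
}
# Prefixes for the parameterised directives marked (1) in the table.
PREFIXED_FIELDS = {
    'C': 'cookie', 'e': 'env', 'i': 'req', 'n': 'note', 'o': 'resp',
}

# Matches directives such as %h, %>s, or %{User-agent}i.
DIRECTIVE = re.compile(r'%(?:\{([^}]*)\})?[<>]?([A-Za-z])')


def field_names(log_format):
    """Return the field names implied by a format string (subset only)."""
    names = []
    for param, letter in DIRECTIVE.findall(log_format):
        if param and letter in PREFIXED_FIELDS:
            # Characters invalid in a Python identifier become underscores.
            safe = re.sub(r'[^A-Za-z0-9_]', '_', param)
            names.append(PREFIXED_FIELDS[letter] + '_' + safe)
        else:
            # Raises KeyError for directives outside this small subset.
            names.append(SIMPLE_FIELDS[letter])
    return names
```

Under these assumptions, field_names('%{foo-bar}C') gives ['cookie_foo_bar'], matching the example in note (1).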

Warning

The wrapper will only operate on log_format specifications that can be unambiguously parsed with a regular expression. In particular, this means that if a field can contain whitespace, it must be surrounded by characters that it cannot legitimately contain (or cannot contain unescaped versions of). Typically double-quotes are used, as Apache (from version 2.0.46) escapes double-quotes within %r, %i, and %o. See Apache’s Custom Log Formats documentation for full details.
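
To show why the delimiters matter, here is a rough sketch of a regular expression for the common log format. It is an illustration only, not the expression the wrapper builds: Row, CLF_PATTERN, and parse_clf are hypothetical names, and every field is left as a plain string.

```python
import re
from collections import namedtuple

# Hypothetical row type using the field names from the table above.
Row = namedtuple(
    'Row', 'remote_host ident remote_user time request status size')

CLF_PATTERN = re.compile(
    r'(\S+) (\S+) (\S+) '        # %h %l %u: cannot contain whitespace
    r'\[([^\]]+)\] '             # %t: delimited by square brackets
    r'"((?:[^"\\]|\\.)*)" '      # "%r": quoted, may hold escaped quotes
    r'(\d{3}) (\d+|-)$'          # %>s %b
)


def parse_clf(line):
    """Return a Row for a common-format line, or None if it does not match."""
    match = CLF_PATTERN.match(line.rstrip('\n'))
    return Row(*match.groups()) if match else None
```

The request field contains spaces, so it can only be separated from its neighbours because it is wrapped in double-quotes and any embedded quote is escaped. Without the quotes, the boundary between the request and the status code would be ambiguous.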

Parameters:
  • source – A file-like object containing the source stream
  • log_format (str) – Defaults to COMMON but can be set to any valid Apache LogFormat string

source

The file-like object that the source reads rows from

count

Returns the number of rows successfully read from the source

log_format

The Apache LogFormat string that the class will use to decode rows

2.2.2. Data

lars.apache.COMMON

This string contains the Apache LogFormat string for the common log format (sometimes called the CLF). This is the default format for the ApacheSource class.

lars.apache.COMMON_VHOST

This string contains the Apache LogFormat string for the common log format with an additional virtual-host specification at the beginning of the string. This is a typical configuration used by several distributions of Apache which are configured with virtual hosts by default.

lars.apache.COMBINED

This string contains the Apache LogFormat string for the NCSA combined/extended log format. This is a popular variant that many server administrators use as it combines the COMMON format with REFERER and USER_AGENT formats.

lars.apache.REFERER

This string contains the (rudimentary) referer log format which is typically used in conjunction with the COMMON format.

lars.apache.USER_AGENT

This string contains the (rudimentary) user-agent log format which is typically used in conjunction with the COMMON format.
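
For orientation, the corresponding format strings as given in Apache's own mod_log_config documentation are reproduced below. The APACHE_ prefixed names are hypothetical, and it is an assumption (not verified against the module) that the constants above hold exactly these values.

```python
# Format strings as documented by Apache's mod_log_config.
# Assumption: lars.apache's constants match these; check the module itself.
APACHE_COMMON = '%h %l %u %t "%r" %>s %b'
APACHE_COMMON_VHOST = '%v %h %l %u %t "%r" %>s %b'
APACHE_COMBINED = '%h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-agent}i"'
APACHE_REFERER = '%{Referer}i -> %U'
APACHE_USER_AGENT = '%{User-agent}i'

# The combined format is the common format plus two quoted request headers.
assert APACHE_COMBINED == APACHE_COMMON + ' "%{Referer}i" "%{User-agent}i"'
# The virtual-host variant only prepends the server name.
assert APACHE_COMMON_VHOST == '%v ' + APACHE_COMMON
```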

2.2.3. Exceptions

exception lars.apache.ApacheError(message, line_number=None, line=None)

Base class for ApacheSource errors.

Exceptions of this class take the optional arguments line_number and line for specifying the index and content of the line that caused the error respectively. If specified, the __str__() method is overridden to include the line number in the error message.

Parameters:
  • message (str) – The error message
  • line_number (int) – The 1-based index of the line that caused the error
  • line (str) – The content of the line that caused the error
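
The behaviour described above can be sketched as follows. LineError is a hypothetical stand-in, not the module's class, and the "Line N: message" wording is an assumption; the real message format may differ.

```python
class LineError(Exception):
    """Hypothetical stand-in mirroring the constructor described above."""

    def __init__(self, message, line_number=None, line=None):
        super().__init__(message)
        self.line_number = line_number  # 1-based index of the offending line
        self.line = line                # raw content of the offending line

    def __str__(self):
        text = super().__str__()
        if self.line_number is not None:
            # Assumed wording; only the inclusion of the number is documented.
            return 'Line %d: %s' % (self.line_number, text)
        return text
```

With this sketch, str(LineError('bad row', 3, 'garbage')) includes the line number, while str(LineError('bad row')) is just the message.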

exception lars.apache.ApacheWarning

Raised when an error is encountered in parsing a log row.
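
Assuming such warnings are issued through Python's standard warnings machinery (the text above does not say so explicitly), a caller can record them instead of letting them print to stderr. The sketch below uses a hypothetical ParseWarning class and a toy read_rows generator in place of the real source.

```python
import warnings


class ParseWarning(Warning):
    """Hypothetical stand-in for a warning issued on an unparseable row."""


def read_rows(lines):
    """Yield split rows, warning about (and skipping) blank lines."""
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            warnings.warn(ParseWarning('Line %d: empty row' % number))
            continue
        yield line.split()


def collect(lines):
    """Return the good rows plus the text of every warning issued."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter('always')  # do not suppress repeated warnings
        rows = list(read_rows(lines))
    return rows, [str(w.message) for w in caught]
```

Replacing the 'always' filter with 'error' would instead turn the first bad row into a raised exception, which may suit strict batch jobs better.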

2.2.4. Examples

Typical usage of this class is as follows:

import io
from lars import apache, csv

with io.open('/var/log/apache2/access.log', 'rb') as infile:
    with io.open('access.csv', 'wb') as outfile:
        with apache.ApacheSource(infile) as source:
            with csv.CSVTarget(outfile) as target:
                for row in source:
                    target.write(row)
