3.1. lars.apache - Reading Apache Logs

This module provides a wrapper for Apache log files, typically in common or combined format (but technically any Apache format which can be unambiguously parsed with regexes).

The ApacheSource class is the main export of this module. It wraps a file-like object containing a common, combined, or otherwise Apache-formatted log file and yields rows from it as tuples.

3.1.1. Classes

class lars.apache.ApacheSource(source, log_format=COMMON)

Wraps a stream containing an Apache formatted log file.

This wrapper converts a stream containing an Apache log file into an iterable which yields tuples. Each tuple has fieldnames derived from the following mapping of Apache format strings (which occur in the optional log_format parameter):

Format String    Field Name
%a               remote_ip
%A               local_ip
%B               size
%b               size
%{Foobar}C       cookie_Foobar (1)
%D               time_taken_ms
%{FOOBAR}e       env_FOOBAR (1)
%f               filename
%h               remote_host
%H               protocol
%{Foobar}i       req_Foobar (1)
%k               keepalive
%l               ident
%m               method
%{Foobar}n       note_Foobar (1)
%{Foobar}o       resp_Foobar (1)
%p               port
%{canonical}p    port
%{local}p        local_port
%{remote}p       remote_port
%P               pid
%{pid}P          pid
%{tid}P          tid
%{hextid}P       hextid
%q               url_query
%r               request
%R               handler
%s               status
%t               time
%{format}t       time
%T               time_taken
%u               remote_user
%U               url_stem
%v               server_name
%V               canonical_name
%X               connection_status
%I               bytes_received
%O               bytes_sent


  1. Any characters in the field-name which are invalid in a Python identifier are converted to underscore, e.g. %{foo-bar}C becomes "cookie_foo_bar".
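The renaming rule in note 1 can be illustrated with a short sketch. This is not the library's own code: the helper name field_name is hypothetical, the prefixes are copied from the table above, and treating every non-word character as invalid is an assumption about how "invalid in a Python identifier" is decided.

```python
import re

# Prefixes for the parameterised directives, copied from the table above.
PREFIXES = {'C': 'cookie', 'e': 'env', 'i': 'req', 'n': 'note', 'o': 'resp'}


def field_name(directive):
    """Derive a field name from a directive such as '%{foo-bar}C'.

    Hypothetical helper for illustration only.
    """
    match = re.fullmatch(r'%\{([^}]+)\}([Ceino])', directive)
    if match is None:
        raise ValueError('not a parameterised directive: %r' % directive)
    name, kind = match.groups()
    # Assumption: any non-word character is replaced by an underscore.
    safe = re.sub(r'\W', '_', name)
    return '%s_%s' % (PREFIXES[kind], safe)


print(field_name('%{foo-bar}C'))
```

The same rule would turn %{User-agent}i into "req_User_agent".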


The wrapper will only operate on log_format specifications that can be unambiguously parsed with a regular expression. In particular, this means that if a field can contain whitespace, it must be surrounded by characters that it cannot legitimately contain (or can only contain in escaped form). Typically double-quotes are used, as Apache (from version 2.0.46) escapes double-quotes within %r, %i, and %o. See Apache’s Custom Log Formats documentation for full details.
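To see why delimiters matter, here is a minimal hand-written sketch of a regular expression for the format string %h %l %u %t "%r" %>s %b. It is an illustration under stated assumptions, not the pattern that ApacheSource builds internally; the names LINE_RE and parse_line are hypothetical. The time field is delimited by square brackets and the request by double-quotes, so both may contain spaces without making the match ambiguous.

```python
import re

# Hand-written pattern for the format '%h %l %u %t "%r" %>s %b'.
# Illustrative only: not the expression lars generates.
LINE_RE = re.compile(
    r'(?P<remote_host>\S+) '
    r'(?P<ident>\S+) '
    r'(?P<remote_user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    # Quoted field: anything except a bare quote, or a backslash escape.
    r'"(?P<request>(?:[^"\\]|\\.)*)" '
    r'(?P<status>\d{3}) '
    r'(?P<size>\d+|-)'
)


def parse_line(line):
    """Return a dict of raw string fields, or None if the line does not match."""
    match = LINE_RE.fullmatch(line)
    return match.groupdict() if match else None


fields = parse_line(
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
    '"GET /apache_pb.gif HTTP/1.0" 200 2326'
)
print(fields['request'])
```

Without the surrounding quotes, the spaces inside the request could not be told apart from the spaces that separate fields.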

  • source – A file-like object containing the source stream
  • log_format (str) – Defaults to COMMON but can be set to any valid Apache LogFormat string

source

The file-like object that the source reads rows from


count

Returns the number of rows successfully read from the source


log_format

The Apache LogFormat string that the class will use to decode rows


close()

Close the source; attempting to read further rows is not permitted after this method is called.

3.1.2. Data


lars.apache.COMMON

This string contains the Apache LogFormat string for the common log format (sometimes called the CLF). This is the default format for the ApacheSource class.


lars.apache.COMMON_VHOST

This string contains the Apache LogFormat string for the common log format with an additional virtual-host specification at the beginning of the string. This is a typical configuration used by several distributions of Apache which are configured with virtual hosts by default.


lars.apache.COMBINED

This string contains the Apache LogFormat string for the NCSA combined/extended log format. This is a popular variant that many server administrators use as it combines the COMMON format with the REFERER and USER_AGENT formats.


lars.apache.REFERER

This string contains the (rudimentary) referer log format which is typically used in conjunction with the COMMON format.


lars.apache.USER_AGENT

This string contains the (rudimentary) user-agent log format which is typically used in conjunction with the COMMON format.
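For orientation, the sketch below writes out the corresponding format strings as they appear in Apache's mod_log_config documentation and shows how the combined format extends the common one. The APACHE_* names are local to this example, and it is an assumption (not verified here) that the constants in this module hold equivalent values.

```python
# Format strings as given in Apache's mod_log_config documentation.
# Assumption: the lars.apache constants hold equivalent values; check the
# module itself before relying on that. The names below are local to this
# example and are not part of lars.
APACHE_COMMON = '%h %l %u %t "%r" %>s %b'
APACHE_COMMON_VHOST = '%v ' + APACHE_COMMON
APACHE_REFERER = '%{Referer}i -> %U'
APACHE_USER_AGENT = '%{User-agent}i'

# The combined format is the common format followed by the quoted
# Referer and User-agent request headers.
APACHE_COMBINED = APACHE_COMMON + ' "%{Referer}i" "%{User-agent}i"'

print(APACHE_COMBINED)
```

Note that the header fields are wrapped in double-quotes in the combined format, which keeps it parseable even though user-agent strings routinely contain spaces.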

3.1.3. Exceptions

exception lars.apache.ApacheError(message, line_number=None, line=None)

Base class for ApacheSource errors.

Exceptions of this class take the optional arguments line_number and line for specifying the index and content of the line that caused the error respectively. If specified, the __str__() method is overridden to include the line number in the error message.

  • message (str) – The error message
  • line_number (int) – The 1-based index of the line that caused the error
  • line (str) – The content of the line that caused the error
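
The behaviour described for ApacheError can be sketched with a stand-in class. This is not the library's source: the class name LogParseError is hypothetical, and the exact wording of the formatted message is an assumption; only the constructor arguments and the idea of including the line number in str() come from the description above.

```python
class LogParseError(Exception):
    """Hypothetical stand-in illustrating the documented behaviour."""

    def __init__(self, message, line_number=None, line=None):
        super().__init__(message)
        self.message = message
        self.line_number = line_number  # 1-based index of the offending line
        self.line = line                # raw content of the offending line

    def __str__(self):
        # Only mention the line number when one was supplied.
        # The "(line N)" wording is an assumption, not lars' actual format.
        if self.line_number is None:
            return self.message
        return '%s (line %d)' % (self.message, self.line_number)


try:
    raise LogParseError('unable to parse row', line_number=3, line='garbage')
except LogParseError as exc:
    print(exc)
```

Keeping the raw line on the exception lets a caller log or skip the offending row without re-reading the file.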

exception lars.apache.ApacheWarning

Raised when an error is encountered in parsing a log row.

3.1.4. Examples

A typical usage of this class is as follows:

import io
from lars import apache, csv

with io.open('/var/log/apache2/access.log', 'rb') as infile:
    with io.open('access.csv', 'wb') as outfile:
        with apache.ApacheSource(infile) as source:
            with csv.CSVTarget(outfile) as target:
                for row in source: