Set Up AWStats for Nginx on Ubuntu 20.04

For something as simple as it is, AWStats is surprisingly complex to get up and running. Here’s a little guide to save you some searching.

Update Sept 2023: I still use this method and it still works. I’ve updated the guide so that the commands work on both Debian 12 and Ubuntu 23.04. Unfortunately the spambots love this page, so I have disabled comments; if you have a question, email me.

If you’d like some basic data about your site’s visitors, but don’t want to let spyware vendors track them around the web, AWStats makes a good solution. It parses your server log files and tells you who came by and what they did. There’s no spying, no third-party code bloat. AWStats just analyzes your visitors’ footprints.

Here’s how I’ve managed to get AWStats installed and running on Ubuntu 18.04, Ubuntu 20.04, Debian 10, and Debian 11.

AWStats with GeoIP

The first step is to install the AWStats package from the Ubuntu or Debian repositories:

sudo apt install awstats

This will install the various tools and scripts AWStats needs. Because I like to have some geodata in my stats, I also installed the tools necessary to use the AWStats geoip plugin. Here’s what worked for me.

First we need build-essential and libgeoip:

sudo apt install libgeoip-dev build-essential

Next you need to fire up the cpan shell:

cpan

If this is your first time in cpan you’ll need to run two commands to get everything set up. If you’ve already got cpan set up, you can skip to the next step:

make install
install Bundle::CPAN

Once cpan is set up, install GeoIP:

install Geo::IP
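You can confirm the module actually installed before moving on; back in your normal shell (not cpan), this one-liner should print a version number:

perl -MGeo::IP -e 'print Geo::IP->VERSION, "\n"'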

That should take care of the GeoIP stuff. You can double-check that the database files exist by looking in the directory /usr/share/GeoIP/ and verifying that there’s a file named GeoIP.dat.
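If that file is missing, the geoip-database package should provide it on Debian and Ubuntu (the exact package name on your release is an assumption on my part, so adjust if yours differs):

sudo apt install geoip-database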

Now, on to the log file setup.

Optional Custom Nginx Log Format

This part isn’t strictly necessary. To get AWStats working, the next step is to create our config files and build the stats, but first I like to overcomplicate things with a custom log format for Nginx. If you don’t customize your Nginx log format you can skip this section, but make a note of where Nginx is putting your logs; you’ll need that in the next step.

Open up /etc/nginx/nginx.conf and add these lines inside the http block:

log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                '$status $body_bytes_sent "$http_referer" '
                '"$http_user_agent" "$http_x_forwarded_for"';    

Now we need to edit our individual nginx config file to use this log format. If you follow standard nginx practice, your config file should be in /etc/nginx/sites-enabled/. For example this site is served by the file /etc/nginx/sites-enabled/luxagraf.net.conf. Wherever that file may be in your setup, open it and add an access_log line like this somewhere in the server block:

server {
    # ... all your other config ...
    access_log  /var/log/nginx/yourdomain.com.access.log main;
    # ... all your other config ...
}
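Once both files are saved, make sure Nginx is happy with the changes before moving on; test the config and reload:

sudo nginx -t && sudo systemctl reload nginx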

Configure AWStats for Nginx

AWStats is ancient; it hails from a very different era of the internet. One legacy of the olden days is that AWStats is very strict about configuration files. You need one config file per domain you’re tracking, and that file has to be named in the following way: awstats.domain.tld.conf. Those config files must be placed inside the /etc/awstats/ directory.

If you go take a look at the /etc/awstats directory you’ll see two files in there: awstats.conf and awstats.conf.local. The first is a main conf file that serves as a fallback if your own config file doesn’t specify a particular setting. The second is an empty file that’s meant to be used to share common config settings, which really doesn’t make much sense to me.

I took a tip from this tutorial and dumped the contents of awstats.conf into awstats.conf.local. That way my actual site config file is very short. If you want to do that, then all you have to put in your site config file are a few lines.
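Copying the stock file over the empty one is all that dumping takes (this is just how I did it; adjust the paths if your setup differs):

sudo cp /etc/awstats/awstats.conf /etc/awstats/awstats.conf.local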

Using the naming scheme mentioned above, my config file resides at /etc/awstats/awstats.luxagraf.net.conf and it looks like this (drop your actual domain in place of “yourdomain.com”):

# Path to your nginx log file
LogFile="/var/log/nginx/yourdomain.com.access.log"

# Domain of your vhost
SiteDomain="yourdomain.com"

# Directory where to store the awstats data
DirData="/var/lib/awstats/"

# Other domains/subdomain you want included from your logs, for example the www subdomain
HostAliases="www.yourdomain.com"

# If you customized your log format above add this line:

LogFormat = "%host - %host_r %time1 %methodurl %code %bytesd %refererquot %uaquot %otherquot"

# If you did not, uncomment and use this line:
# LogFormat = 1

Save that file and open the fallback file awstats.conf.local. Now set a few things:

# if your site doesn't get a lot of traffic you can leave this at 1
# but it can make things slow
DNSLookup = 0

# find the geoip plugin line and uncomment it:
LoadPlugin="geoip GEOIP_STANDARD /usr/share/GeoIP/GeoIP.dat"

Then delete the LogFile, SiteDomain, DirData, and HostAliases settings in your awstats.conf.local file. We’ve got those covered in our site-specific config file. Also delete the Include directive at the bottom so the file doesn’t end up including itself in a circular loop.
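A quick grep is an easy way to confirm the Include is gone; it should print nothing (assuming the stock file spelled it as an Include directive, which is what I’ve seen on Debian and Ubuntu):

grep -i '^include' /etc/awstats/awstats.conf.local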

Okay, that’s it for configuring things, let’s generate some data to look at.

Building Stats and Rotating Log Files

Now that we have our log files, and we’ve told AWStats where they are, what format they’re in and where to put its analysis, it’s time to actually run AWStats and get the raw data analyzed. To do that we use this command:

sudo /usr/lib/cgi-bin/awstats.pl -config=yourdomain.com -update

Alternately, if you have a bunch of config files you’d like to update all at once, you can use this wrapper script conveniently located in a completely different directory:

/usr/share/doc/awstats/examples/awstats_updateall.pl now -awstatsprog=/usr/lib/cgi-bin/awstats.pl

You’re going to need to run that command regularly to update the AWStats data. One way to do that is with a crontab entry (shown below just for reference), but there are better ways. Instead of cron we can hook into logrotate, which rotates Nginx’s log files periodically anyway and conveniently includes a prerotate directive that we can use to execute some code. Technically logrotate runs via /etc/cron.daily under the hood, so we haven’t really escaped cron, but it’s not a crontab we need to keep track of anyway.
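If you did want the crontab route, the entry would look something like this (the every-six-hours schedule is just an example, not a recommendation):

0 */6 * * * /usr/share/doc/awstats/examples/awstats_updateall.pl now -awstatsprog=/usr/lib/cgi-bin/awstats.pl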

For the logrotate approach, open up the file /etc/logrotate.d/nginx and replace its contents with this:

    /var/log/nginx/*.log {
        daily
        missingok
        rotate 30
        compress
        delaycompress
        notifempty
        create 0640 www-data adm
        sharedscripts
        prerotate
            /usr/share/doc/awstats/examples/awstats_updateall.pl now -awstatsprog=/usr/lib/cgi-bin/awstats.pl
            if [ -d /etc/logrotate.d/httpd-prerotate ]; then \
                run-parts /etc/logrotate.d/httpd-prerotate; \
            fi \
        endscript
        postrotate
            invoke-rc.d nginx rotate >/dev/null 2>&1
        endscript
    }

The main things we’ve changed here are the frequency, moving from weekly to daily rotation in line 2, keeping 30 days’ worth of logs in line 4, and then calling AWStats in line 11.

One thing to bear in mind is that if you re-install Nginx for some reason this file will be overwritten.

Now force a run to make sure you don’t have any typos or other problems (swap -f for -d if you’d rather do a dry run that doesn’t actually rotate anything):

sudo logrotate -f /etc/logrotate.d/nginx

Serving Up AWStats

Now that all the pieces are in place, we need to put our stats on the web. I used a subdomain, awstats.luxagraf.net. Assuming you’re using something similar, here’s an nginx config file to get you started:

server {
    server_name awstats.luxagraf.net;

    root    /var/www/awstats.luxagraf.net;
    error_log /var/log/nginx/awstats.luxagraf.net.error.log;
    access_log off;
    log_not_found off;

    location ^~ /awstats-icon {
        alias /usr/share/awstats/icon/;
    }

    location ~ ^/cgi-bin/.*\.(cgi|pl|py|rb) {
        auth_basic            "Admin";
        auth_basic_user_file  /etc/awstats/awstats.htpasswd;

        gzip off;
        include         fastcgi_params;
        fastcgi_pass unix:/var/run/php/php7.2-fpm.sock; # change this line if necessary
        fastcgi_index   cgi-bin.php;
        fastcgi_param   SCRIPT_FILENAME    /etc/nginx/cgi-bin.php;
        fastcgi_param   SCRIPT_NAME        /cgi-bin/cgi-bin.php;
        fastcgi_param   X_SCRIPT_FILENAME  /usr/lib$fastcgi_script_name;
        fastcgi_param   X_SCRIPT_NAME      $fastcgi_script_name;
        fastcgi_param   REMOTE_USER        $remote_user;
    }

}
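Save that alongside your other server blocks and enable it. With the standard sites-available/sites-enabled layout that looks something like this (the file name is just an example):

sudo ln -s /etc/nginx/sites-available/awstats.yourdomain.com.conf /etc/nginx/sites-enabled/
sudo nginx -t && sudo systemctl reload nginx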

This config is pretty basic: it passes requests for icons to the AWStats icon dir and sends the rest of our requests to php-fpm. The only tricky part is that AWStats needs to call a Perl file, but we’re calling a PHP file, namely /etc/nginx/cgi-bin.php. How’s that work?

Well, in a nutshell, this script takes all our server variables, hands them to the Perl script as its environment, runs it, and then reads the response from the script’s stdout, passing it on to Nginx. Pretty clever, so clever in fact that I did not write it. Here’s the file I use, taken straight from the Arch Wiki:

<?php
$descriptorspec = array(
   0 => array("pipe", "r"),  // stdin is a pipe that the child will read from
   1 => array("pipe", "w"),  // stdout is a pipe that the child will write to
   2 => array("pipe", "w")   // stderr is a pipe that the child will write to
);
$newenv = $_SERVER;
$newenv["SCRIPT_FILENAME"] = $_SERVER["X_SCRIPT_FILENAME"];
$newenv["SCRIPT_NAME"] = $_SERVER["X_SCRIPT_NAME"];
if (is_executable($_SERVER["X_SCRIPT_FILENAME"])) {
   $process = proc_open($_SERVER["X_SCRIPT_FILENAME"], $descriptorspec, $pipes, NULL, $newenv);
   if (is_resource($process)) {
       fclose($pipes[0]);
       $head = fgets($pipes[1]);
       while (strcmp($head, "\n")) {
           header($head);
           $head = fgets($pipes[1]);
       }
       fpassthru($pipes[1]);
       fclose($pipes[1]);
       fclose($pipes[2]);
       $return_value = proc_close($process);
   } else {
       header("Status: 500 Internal Server Error");
       echo("Internal Server Error");
   }
} else {
   header("Status: 404 Page Not Found");
   echo("Page Not Found");
}
?> 

Save that mess of PHP as /etc/nginx/cgi-bin.php and then install php-fpm if you haven’t already:

sudo apt install php-fpm
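The fastcgi_pass line in the config above needs to point at whatever socket your PHP version actually created, so check what’s in the socket directory and adjust the config if needed:

ls /var/run/php/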

Next we need to create the password file referenced in our Nginx config. We can create a .htpasswd file with this little shell command, just make sure to put an actual username in place of username:

printf "username:`openssl passwd -apr1`\n" >> awstats.htpasswd

Enter your password when prompted and your password file will be created in the expected format for basic auth files.

Then move that file to the proper directory:

sudo mv awstats.htpasswd /etc/awstats/

Now we have an Nginx config, a script to pass AWStats from PHP to Perl and some basic password protection for our stats site. The last, totally optional, step is to serve it all over HTTPS instead of HTTP. Since we have a password protecting it anyway, this is arguably unnecessary. I do it more out of habit than any real desire for security. I mean, I did write an article criticizing the push to make everything HTTPS. But habit.

I have a separate guide on how to set up Certbot for Nginx on Ubuntu that you can follow. Once that’s installed you can just invoke Certbot with:

sudo certbot --nginx

Select the domain name you’re serving your stats at (for me that’s awstats.luxagraf.net), then select 2 to automatically redirect all traffic to HTTPS and certbot will append some lines to your Nginx config file.

Now restart Nginx:

sudo systemctl restart nginx

Visit your new site in the browser at this URL (changing yourdomain.com to the domain you’ve been using): https://awstats.yourdomain.com/cgi-bin/awstats.pl?config=yourdomain.com. If all went well you should see AWStats with a few stats in it. If all did not go well, feel free to email me whatever your error message is and I’ll see if I can help.
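If you’d rather test from a terminal, curl with basic auth hits the same URL (swap in your own username and domain):

curl -u username 'https://awstats.yourdomain.com/cgi-bin/awstats.pl?config=yourdomain.com'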

Motivations

And now the why. The “why the hell don’t I just use —insert popular spyware here—” part.

My needs are simple. I don’t have ads. I don’t have to prove to anyone how much traffic I get. And I don’t really care how you got here. I don’t care where you go after here. I hardly ever look at my stats.

When I do look all I want to see is how many people stop by in a given month and if there’s any one article that’s getting a lot of visitors. I also enjoy seeing which countries visitors are coming from, though I recognize that VPNs make this information suspect.

Since I don’t track you I certainly don’t want third-party spyware tracking you, so that means any hosted service is out. Now there are some self-hosted, open source spyware packages that I’ve used, Matomo being the best. It is nice, but I don’t need or use most of what it offers. I also really dislike running MySQL, and unfortunately Matomo requires MySQL, as does Open Web Analytics.

By process of elimination (no MySQL) and given my very paltry requirements, the logical choice is a simple log analyzer. I went with AWStats because I’d used it in the past. Way in the past. But you know what, AWStats ain’t broke. It doesn’t spy, it uses hardly any server resources, and it tells you 95 percent of what any spyware tool will tell you (provided you actually read the documentation).

In the end, AWStats is good enough without being too much. But for something as simple as it is, AWStats is surprisingly complex to get up and running, which is what inspired this guide.

Shoulders stood upon: