diff --git a/README.md b/README.md
index 0fe43a0..7d52b51 100644
--- a/README.md
+++ b/README.md
@@ -42,14 +42,6 @@ sudo cp _regina.compdef.zsh /usr/share/zsh/site-functions/_regina
 sudo chmod +x /usr/share/zsh/site-functions/_regina
 ```
-# Changelog
-## 2.0
-- Refactored databse code
-- New database format:
-    - Removed filegroups table
-    - Put referrer, browser and platform in own table to reduze size of the database
--
-
 ## 1.0
 - Initial release
diff --git a/pyproject.toml b/pyproject.toml
index 72f0255..d93eeea 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -3,7 +3,7 @@ requires = ["setuptools"]
 [project]
 name = "regina"
-version = "2.0.0"
+version = "1.1.0"
 description = "Get website analytics from nginx logs and visualize them"
 requires-python = ">=3.10"
 readme = "README.md"
@@ -28,7 +28,7 @@ documentation = "https://quintern.xyz/en/software/regina.html"
 [tool.setuptools.packages.find]
-where = ["regina"]
+where = ["."]
 [project.scripts]
 regina = "regina.main:main"
diff --git a/regina.1.md b/regina.1.md
index 74c7e1e..2e88d86 100644
--- a/regina.1.md
+++ b/regina.1.md
@@ -1,4 +1,4 @@
-% NICOLE(1) nicole 2.0
+% REGINA(1) regina 1.1
 % Matthias Quintern
 % April 2022
@@ -32,8 +32,14 @@ Regina supports several data visualization configurations and can generate an ad
 **--update-geoip** geoip-db
 : Recreate the geoip part of the database from the geoip-db csv.
 The csv must have this form: lower, upper, country-code, country-name, region, city
-# INSTALLATION AND UPDATING
-## pip
+# GETTING STARTED
+
+## Dependencies
+- **nginx**: You need an nginx webserver that outputs the access log in the `combined` format, which is the default
+- **Python 3.10**
+- **Python/matplotlib**
+
+## Installation
 You can install regina with python-pip:
 ```shell
 git clone https://github.com/MatthiasQuintern/regina.git
@@ -44,13 +50,111 @@ You can also install it system-wide using `sudo python3 -m pip install .`
 If you also want to install the man-page and the zsh completion script:
 ```shell
-sudo cp regina.1.man /usr/share/man/man1/regina.1
-sudo gzip /usr/share/man/man1/regina.1
-sudo cp _regina.compdef.zsh /usr/share/zsh/site-functions/_regina
-sudo chmod +x /usr/share/zsh/site-functions/_regina
+    sudo cp regina.1.man /usr/share/man/man1/regina.1
+    sudo gzip /usr/share/man/man1/regina.1
+    sudo cp regina/package-data/_regina.compdef.zsh /usr/share/zsh/site-functions/_regina
+    sudo chmod +x /usr/share/zsh/site-functions/_regina
 ```
+## Configuration
+The following instructions assume you have an nginx webserver configured for a website like this, with `/www` as root (`/`):
+```
+    /www
+    |-- resources
+    |   |-- image.jpg
+    |-- index.html
+```
+By default, nginx will generate logs in the `combined` format with the name `access.log` in `/var/log/nginx/` and rotate them daily.
+
+Copy the default configuration and template from the git directory to a directory of your choice, in this case `~/.config/regina`.
+If you did not clone the git repo, the files should be in `/usr/local/lib/python3.11/site-packages/regina/package-data/`.
+```shell
+    mkdir ~/.config/regina
+    cp regina/package-data/default.cfg ~/.config/regina/regina.cfg
+    cp regina/package-data/template.html ~/.config/regina/template.html
+```
+Now edit the configuration to fit your needs.
+For our example:
+```
+    [regina]
+    server_name = my_server.com
+    access_log = /var/log/nginx/access.log.1
+    ...
+    [html-generation]
+    html_out_path = /www/analytics/analytics.html
+    img_location = /img
+
+    [plot-generation]
+    img_out_dir = /www/analytics/img
+```
+Most defaults should be fine. The default configuration should also be documented well enough for you to know what to do.
+It is strongly recommended to only use absolute paths.
+
+Now collect the data from the nginx log specified as `access_log` in the configuration into the database specified at the `database` location (or `~/.local/share/regina/my-server.com.db` if left blank):
+```
+    regina --config ~/.config/regina/regina.cfg --collect
+```
+
+To visualize the data, run:
+```
+    regina --config ~/.config/regina/regina.cfg --visualize
+```
+This will generate plots and statistics, replace all variables in `template_html` and output the result to `html_out_path`.
+If `html_out_path` is in your webroot, you should now be able to access the generated site.
+In our example, `/www` will look like this:
+```
+    /www
+    |-- analytics
+    |   |-- analytics.html
+    |   |-- img
+    |       |-- ranking_referer_total.svg
+    |       |-- ranking_referer_last_x_days.svg
+    |       ...
+    |-- resources
+    |   |-- image.jpg
+    |-- index.html
+
+```
+
+### Automation
+You will probably run `regina` once per day, after `nginx` has filled the daily access log. The easiest way to do that is using a *cronjob*.
+Run `crontab -e` and enter:
+`10 0 * * * /usr/bin/regina --config /home/myuser/.config/regina/regina.cfg --collect --visualize`
+This assumes you installed `regina` system-wide.
+Now the `regina` command will be run every day, ten minutes after midnight.
+After each day, `nginx` rotates the logs, so `access.log` becomes `access.log.1`.
+Since `regina` is run after the log rotation, you will probably want to run it on `access.log.1`.
+
+#### Logfile permissions
+By default, `nginx` logs are `-rw-r----- root root`, so you cannot access them as a regular user.
+You could either run regina as root, which I **strongly do not recommend**, or make a root-cronjob that changes ownership of the log after midnight.
+Run `sudo crontab -e` and enter:
+`9 0 * * * chown your-username /var/log/nginx/access.log.1`
+This will make you the owner of the log 9 minutes after midnight, just before `regina` needs read access.
+
+
+## GeoIP
+`regina` can show you which country or city a visitor is from, but you will need an *ip2location* database.
+You can acquire such a database for free at [ip2location.com](https://lite.ip2location.com/) (and probably some other sites as well!).
+After creating an account you can download several different databases in different formats.
+For `regina`, download the `IP-COUNTRY-REGION-CITY` for IPv4 as *csv*.
+By default, `regina` only tells you which country a user is from.
+To see the individual cities for countries, append the two-letter country code to the `get_cities_for_contries` option in the `data-collection` section in the config file.
+After that, load the GeoIP-data into your database:
+```
+    regina --config regina.cfg --update-geoip path-to-csv
+```
+Depending on how many countries you specified, this might take a long time. You can delete the `csv` afterwards.
+
 # CHANGELOG
+## 1.1
+- Improved database format:
+    - put referrer, browser and platform in own table to reduce size of the database
+    - route groups now part of visualization, not data collection
+- Data visualization now uses more SQL for improved performance
+- Refactored codebase
+- Bug fixes
+- Changed setup.py to pyproject.toml
 ## 1.0
 - Initial release
diff --git a/regina/utility/globals.py b/regina/utility/globals.py
index 45f1ccf..18bd941 100644
--- a/regina/utility/globals.py
+++ b/regina/utility/globals.py
@@ -1,8 +1,8 @@
 """global variables for regina"""
 import os
-
 import re
+import importlib.metadata
 if __name__ == "__main__": # make relative imports work as described here: https://peps.python.org/pep-0366/#proposed-change
     if __package__ is None:
@@ -14,9 +14,7 @@ if __name__ == "__main__": # make relative imports work as described here: http
 from regina.utility.config import CFG_Entry, CFG_File, ReginaSettings, Path, comment
-version = "2.0"
-
-
+version = importlib.metadata.version("regina")
 # these oses and browser can be detected:
 # lower element takes precedence
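
Note on the last hunk: `importlib.metadata.version` raises `importlib.metadata.PackageNotFoundError` when the named distribution is not installed, which may happen when `globals.py` is run straight from a source checkout. The following is only a minimal sketch of a guarded lookup, not part of this change; the helper name `get_version` and the `fallback` value are hypothetical and chosen for illustration.

```python
import importlib.metadata


def get_version(package: str = "regina", fallback: str = "unknown") -> str:
    """Return the installed version of `package`, or `fallback` if it is not installed.

    `get_version` and `fallback` are hypothetical names for this sketch.
    """
    try:
        # Reads the version from the installed distribution's metadata
        return importlib.metadata.version(package)
    except importlib.metadata.PackageNotFoundError:
        # No distribution metadata found, e.g. in an uninstalled source checkout
        return fallback


if __name__ == "__main__":
    print(get_version())
```

Whether a fallback string or a hard failure is preferable depends on how `version` is used elsewhere in the code base.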