Deployment
Source: docs/deployment.md
How to install hzmetrics.py on a new HUBzero hub.
Prerequisites
- A working HUBzero hub with two MariaDB schemas: <hub> (live CMS)
  and <hub>_metrics (analytics). If the metrics DB doesn't exist yet,
  see "First-time install" below.
- Python 3.10+ on the host. 3.11 strongly preferred. On Rocky 8 the
  system python3 is 3.6, which fails hzmetrics.py's minimum check;
  install python3.11 alongside (the pipeline self-relaunches into the
  highest python3.N >= 3.10 it finds on PATH).
- A user with read access to the hub DB and full ownership of the
  metrics DB. On a stock HUBzero hub this is the apache user; the
  cron entry is owned by apache.
Production hosts get the deps as system packages (Rocky 8 names — adapt for other distros):
dnf install python3.11 python3.11-PyMySQL unbound # unbound optional
pip3.11 install --user aiodns # not packaged on Rocky 8
Versions required:
- aiodns >= 3.x (the c-ares-based async resolver used by resolve-dns)
- pymysql (any current 1.x)
- Python >= 3.10 is enforced by hzmetrics.py itself (see _MIN_PYTHON
  at the top of the file). On Rocky 8 the system /usr/bin/python3 is
  3.6, so a separate python3.11 install alongside is the supported
  configuration.
Test the Python version is discoverable: hzmetrics.py self-relaunches
into the highest-numbered python3.N (≥ 3.10) it can find on PATH,
so just having python3.11 installed alongside the system python3
is enough.
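The discovery rule can be sketched as follows. This is a minimal illustration, not the code in hzmetrics.py: the function name find_best_python and the constant MIN_PYTHON are hypothetical, and the real script may apply extra checks before relaunching.

```python
import os
import re

MIN_PYTHON = (3, 10)  # documented minimum; this constant name is illustrative


def find_best_python(path_dirs, minimum=MIN_PYTHON):
    """Return the path of the highest python3.N >= minimum, or None."""
    pattern = re.compile(r"^python3\.(\d+)$")
    best = None  # (minor_version, full_path) of the best candidate so far
    for directory in path_dirs:
        try:
            names = os.listdir(directory)
        except OSError:
            continue  # skip missing or unreadable PATH entries
        for name in names:
            match = pattern.match(name)
            if not match:
                continue  # ignores python3, python3.12-config, etc.
            minor = int(match.group(1))
            full = os.path.join(directory, name)
            if (3, minor) < minimum or not os.access(full, os.X_OK):
                continue
            if best is None or minor > best[0]:
                best = (minor, full)
    return best[1] if best else None
```

Calling find_best_python(os.environ["PATH"].split(os.pathsep)) on a Rocky 8 host with python3.11 installed should return the python3.11 path and ignore the 3.6 system interpreter.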
Dev installs (not for production)
For a development machine where you'll run tests or hack on the
code, pyproject.toml declares the same deps and wires a
hzmetrics console-script entry point:
pip install --user --break-system-packages -e . # PEP 668-friendly
This is not the production install path. Production hosts still
get hzmetrics.py dropped on PATH by the Makefile + cron pulled
in via the cron.d / spool drop-in, as described under "Install via
Makefile" below. pyproject.toml is purely a dev / CI convenience so
pip install resolves the dep set instead of relying on system
packages being present.
File layout (post-install)
Everything lives under /opt/hubzero/metrics/, a self-contained tree
owned by the apache user. Only the operator-facing log file lives
outside, at the conventional /var/log/ location:
/opt/hubzero/metrics/bin/hzmetrics.py             the pipeline
/opt/hubzero/metrics/bin/hzmetrics-postrotate.sh  logrotate hook
/opt/hubzero/metrics/conf/access.cfg              DB credentials (mode 600, apache)
/opt/hubzero/metrics/conf/hzmetrics.conf          optional runtime overrides
/opt/hubzero/metrics/conf/cron.apache             crontab template (apache crontab)
/opt/hubzero/metrics/conf/hzmetrics.conf.sample   reference config
/opt/hubzero/metrics/state/hzmetrics.pid          PID lock (created at runtime)
/var/log/hubzero/metrics/manage.log               pipeline log (apache-writable)
Tree is owned apache:apache, mode 0750 — only the apache user (and
root) can see inside. No files in /etc/, no /var/run/ tmpfiles
dance, no /var/spool/cron/ writing. Orchestrator state lives in
the pipeline_state DB table; the only on-disk state is the lock
file.
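This document does not say how the PID lock is taken. One common approach is an advisory flock on the lock file, which the kernel releases if the process dies. The sketch below shows it as an assumption about the general technique, not a description of hzmetrics.py; acquire_pid_lock is a hypothetical name.

```python
import fcntl
import os


def acquire_pid_lock(path):
    """Return an open file holding an exclusive lock, or None if already held."""
    handle = open(path, "a+")  # creates the file if needed, never clobbers it
    try:
        # LOCK_NB makes a second runner fail fast instead of queueing up.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    # Only the lock holder rewrites the file with its own PID.
    handle.seek(0)
    handle.truncate()
    handle.write(f"{os.getpid()}\n")
    handle.flush()
    return handle  # keep this object alive for the duration of the run
```

With a five-minute cron cadence, a tick that starts while a long run is still active would get None back and could exit quietly.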
The legacy reference scripts under tests/legacy/ are not
installed on a production host. They live in this repo only as the
A/B-test parity reference.
Install via Makefile
From a checkout of this repo:
# One-time, on a fresh host (the only root step):
sudo make install-bootstrap
# Now and forever after, no root needed:
sudo -u apache make install
sudo -u apache crontab /opt/hubzero/metrics/conf/cron.apache
sudo make uninstall # removes /opt/hubzero/metrics tree
make help # list targets (lint, test, test-ab, …)
install-bootstrap creates /opt/hubzero/metrics/{bin,conf,state}
and /var/log/hubzero/metrics/ owned apache:apache mode 0750 —
the only root-required step in the install, and only once per host.
Subsequent installs and upgrades are pure sudo -u apache make install
identity-switches; no actual root needed.
What install copies (all owned apache:apache):
- hzmetrics.py → /opt/hubzero/metrics/bin/hzmetrics.py (mode 755)
- conf/hzmetrics-logrotate-postrotate.sh → /opt/hubzero/metrics/bin/hzmetrics-postrotate.sh (mode 755)
- conf/hubzero-metrics.cron.apache → /opt/hubzero/metrics/conf/cron.apache (mode 644)
- conf/hzmetrics.conf.sample → /opt/hubzero/metrics/conf/hzmetrics.conf.sample (mode 644)
install deliberately does NOT drop access.cfg: it is an
operator-supplied secret. After make install, the Makefile prints the
remaining manual steps: the access.cfg copy and the crontab
registration. setup-db and migrate --apply are no longer separate
manual steps; they are folded into hzmetrics.py init or the
auto-bootstrap on cron's first tick (see the First-time install
section below).
Cron-style default is spool — the install registers the apache
user's crontab directly via crontab(1), no /etc/cron.d/
drop-in. Override with CRON_STYLE=dropin make install if your
hub's policy needs the global /etc/cron.d/ flavor.
Overrides: HZMETRICS_HOME (install root, default /opt/hubzero/metrics),
LOG_DIR (default /var/log/hubzero/metrics), INSTALL_OWNER /
INSTALL_GROUP (default apache), and DESTDIR for staged installs
(make install DESTDIR=/tmp/stage for packaging dry-runs).
The cron entry is a single line, every five minutes, in the apache user crontab:
*/5 * * * * /opt/hubzero/metrics/bin/hzmetrics.py tick
Register it via:
sudo -u apache crontab /opt/hubzero/metrics/conf/cron.apache
cronie auto-detects the user-crontab change; no daemon restart needed.
access.cfg
The mandatory config file. Bare $var = 'value'; syntax (no
<?php). Read by hzmetrics.py:
$hub_dir = '/var/www/<hub>';
$hub_db = '<hub>';
$metrics_db = '<hub>_metrics';
$db_host = 'localhost';
$db_user = '<hub>';
$db_pass = '<secret>';
$db_prefix = 'jos_';
Drop it in place after the bootstrap step has created the conf dir:
sudo install -o apache -g apache -m 600 <your-cfg> \
/opt/hubzero/metrics/conf/access.cfg
The harness uses HZMETRICS_ACCESS_CFG=<path> to point at a test
config — useful for one-off catch-up against a copy of production.
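For readers who want to inspect an access.cfg from their own tooling, the bare $var = 'value'; syntax can be read with a few lines of Python. This is a sketch under stated assumptions: parse_access_cfg is a hypothetical helper, not the parser inside hzmetrics.py, and it does not handle escaped quotes inside values.

```python
import re

# One assignment per line: $name = 'value';  (single or double quotes)
_ASSIGNMENT = re.compile(r"""^\s*\$(\w+)\s*=\s*(['"])(.*?)\2\s*;\s*$""")


def parse_access_cfg(text):
    """Return a dict of the $var = 'value'; assignments found in text."""
    settings = {}
    for line in text.splitlines():
        match = _ASSIGNMENT.match(line)
        if match:
            settings[match.group(1)] = match.group(3)
    return settings  # lines that are not assignments are ignored
```

For example, parse_access_cfg("$db_prefix = 'jos_';") returns {"db_prefix": "jos_"}.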
hzmetrics.conf (optional)
Runtime tuning. See conf/hzmetrics.conf.sample
for the documented form. The [dns] settings that matter:
[dns]
nameserver = system # or 127.0.0.1 with local unbound
concurrency = 100 # raise to 500 with unbound in front
timeout = 2.0
Precedence (lowest to highest): built-in defaults → this file →
HZMETRICS_DNS_* env vars → CLI flags.
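The precedence chain can be illustrated with a small merge function. This is a sketch, not the loader in hzmetrics.py. It makes several assumptions: the env var names follow the pattern HZMETRICS_DNS_<KEY> (only the HZMETRICS_DNS_* prefix is documented), the built-in timeout equals the sample value of 2.0, and resolve_dns_settings is a hypothetical name.

```python
import configparser

# Built-in defaults; timeout is assumed from the sample file.
DEFAULTS = {"nameserver": "system", "concurrency": "100", "timeout": "2.0"}


def resolve_dns_settings(conf_text, environ, cli_overrides):
    """Merge settings lowest to highest: defaults, file, env vars, CLI flags."""
    settings = dict(DEFAULTS)

    # Layer 2: the [dns] section of hzmetrics.conf, if present.
    parser = configparser.ConfigParser(inline_comment_prefixes=("#",))
    parser.read_string(conf_text)
    if parser.has_section("dns"):
        settings.update(parser["dns"])

    # Layer 3: environment variables (assumed naming scheme).
    for key in DEFAULTS:
        value = environ.get(f"HZMETRICS_DNS_{key.upper()}")
        if value is not None:
            settings[key] = value

    # Layer 4: CLI flags the operator actually passed.
    settings.update({k: str(v) for k, v in cli_overrides.items() if v is not None})
    return settings
```

Each later layer only overrides the keys it sets, so a file that changes nameserver alone leaves concurrency and timeout at their defaults.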
If you don't deploy this file, the built-in defaults are fine —
concurrency=100 against the system resolver is benchmarked clean
against Purdue's resolvers and produces ~4 ms/IP cold.
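To show what the concurrency and timeout settings control, here is a sketch of bounded-concurrency resolution with asyncio. It is not the resolve-dns implementation: resolve_all is a hypothetical name, and lookup stands in for whatever coroutine performs a single query (the real pipeline uses aiodns).

```python
import asyncio


async def resolve_all(ips, lookup, concurrency=100, timeout=2.0):
    """Resolve every IP with at most `concurrency` lookups in flight."""
    semaphore = asyncio.Semaphore(concurrency)

    async def resolve_one(ip):
        async with semaphore:  # blocks while `concurrency` lookups are active
            try:
                return ip, await asyncio.wait_for(lookup(ip), timeout)
            except asyncio.TimeoutError:
                return ip, None  # a slow resolver yields no answer, not a crash

    pairs = await asyncio.gather(*(resolve_one(ip) for ip in ips))
    return dict(pairs)
```

Raising concurrency increases the number of simultaneous queries the resolver must absorb, which is why the higher setting is paired with a local unbound instance.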
First-time install
After install-bootstrap has seated /opt/hubzero/metrics and you've
dropped a populated conf/access.cfg in place, the script can finish
its own setup in one call:
sudo -u apache python3 /opt/hubzero/metrics/bin/hzmetrics.py init
init is idempotent and does:
- Asserts site = <hubname> is set in /etc/hubzero.conf (refuses
  otherwise: the site name prefixes every staged-log filename and a
  few DB conventions, so a silent "hub" fallback would collide on
  every multi-hub host).
- mkdir -p for every directory the pipeline writes to:
  HZMETRICS_HOME/{bin,conf,state},
  /var/log/hubzero/{daily,imported,metrics}, and
  /var/log/{httpd,apache2}/{daily,imported}.
- CREATE DATABASE IF NOT EXISTS <hub>_metrics, then runs the baseline
  DDL and applies every pending migration.
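The first two steps are idempotent by construction, which the following sketch illustrates. It is an assumption-laden outline, not the init code: require_site_name and ensure_dirs are hypothetical names, the site line is assumed to look like "site = <hubname>" on a line of its own, and the 0750 mode is borrowed from the install tree rather than confirmed for every directory.

```python
import os
import re


def require_site_name(conf_text):
    """Return the site name from hubzero.conf text, refusing if it is absent."""
    for line in conf_text.splitlines():
        match = re.match(r"^\s*site\s*=\s*(\S+)\s*$", line)
        if match:
            return match.group(1)
    # No silent fallback: a default name would collide on multi-hub hosts.
    raise SystemExit("site = <hubname> is not set in hubzero.conf")


def ensure_dirs(paths):
    """Create each directory if missing; safe to call repeatedly."""
    for path in paths:
        os.makedirs(path, mode=0o750, exist_ok=True)
```

Running ensure_dirs twice over the same list leaves the tree unchanged the second time, which is the property that lets init and the cron-time bootstrap share one code path.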
The same machinery runs automatically on the first cron tick when
the process is owned by apache / www-data (see
_self_bootstrap in architecture.md) — so
operators who'd rather skip init and let cron handle it can do so,
provided access.cfg is in place before cron fires.
The CMS-side tables that metrics uses
(jos_resource_stats_tools_topvals, jos_session_geo, etc.) are
created by the hub's own CMS migrations and shouldn't need anything
from hzmetrics.py. If they're missing, see the exclude_list
schema work and check the hub's migration state.
The two underlying commands init composes (setup-db and migrate --apply) are still individually invocable for the rare case where you
want to run one without the other.
Verifying the install
doctor is the diagnostic entry point. It walks the four phases
self-bootstrap touches and reports each:
# Full health check — reports every issue, fixes nothing:
sudo -u apache python3 /opt/hubzero/metrics/bin/hzmetrics.py doctor
# Same, but attempt to repair the fixable ones (mkdir, CREATE
# DATABASE, run pending migrations) — same code paths self-bootstrap
# uses on cron startup:
sudo -u apache python3 /opt/hubzero/metrics/bin/hzmetrics.py doctor --fix
Things doctor cannot fix from its own privileges (missing
/etc/hubzero.conf site line, root-owned parent dirs, MySQL down,
bad access.cfg credentials) are reported clearly so the operator
knows the next step.
The traditional smoke-tests still work after init:
# What does the pipeline see?
sudo -u apache python3 /opt/hubzero/metrics/bin/hzmetrics.py status
# Does DNS work?
sudo -u apache python3 /opt/hubzero/metrics/bin/hzmetrics.py resolve-dns \
metrics web --dry-run 2025-07
# Does the daily run actually do anything?
sudo -u apache python3 /opt/hubzero/metrics/bin/hzmetrics.py run --force
Watch /var/log/hubzero/metrics/manage.log while the run is in
progress. Each pipeline phase prints [<phase>] … start and end
markers.
Logrotate
The pipeline writes to a single log file in
/var/log/hubzero/metrics/. Add a logrotate stanza that invokes the
postrotate hook so the pipeline picks up the new file without
restarting (it reopens on the next tick):
/var/log/hubzero/metrics/manage.log {
    daily
    rotate 14
    compress
    missingok
    notifempty
    create 640 apache apache
    postrotate
        /opt/hubzero/metrics/bin/hzmetrics-postrotate.sh
    endscript
}
Catch-up after a stalled host
cmd_run is a three-mode state machine (see
architecture.md → Catchup orchestration):
when it detects a backlog it flips itself into catchup mode, drains
one month per tick using the per-month decision matrix, then enters
rebuild mode to refresh long-window summary cells. All autonomous.
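The mode transitions described above can be sketched as a toy state machine. This is only an illustration: the mode name "normal", the state dict layout, and the tick function are assumptions, and the per-month decision matrix and the real pipeline_state table are not modeled.

```python
def tick(state):
    """Advance the orchestrator by one tick and describe what happened."""
    # A backlog seen in normal mode flips the orchestrator into catchup.
    if state["mode"] == "normal" and state["backlog"]:
        state["mode"] = "catchup"

    if state["mode"] == "catchup":
        month = state["backlog"].pop(0)  # drain exactly one month per tick
        if not state["backlog"]:
            state["mode"] = "rebuild"    # backlog empty: refresh summaries next
        return f"drained {month}"

    if state["mode"] == "rebuild":
        state["mode"] = "normal"         # long-window cells refreshed, resume
        return "rebuilt long-window summaries"

    return "daily run"
```

Starting from {"mode": "normal", "backlog": ["2024-01", "2024-02"]}, four ticks drain the two months, run the rebuild, and then return to the ordinary daily run.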
# Where is the orchestrator? status shows mode + cursors:
sudo -u apache python3 /opt/hubzero/metrics/bin/hzmetrics.py status
# Drive ticks manually if `tick` cadence is too slow:
sudo -u apache python3 /opt/hubzero/metrics/bin/hzmetrics.py run
# Resummarize a range without touching state.mode
# (useful for a one-shot rebuild after a data fix):
sudo -u apache python3 /opt/hubzero/metrics/bin/hzmetrics.py rebuild-summaries \
--since 2022-01 --through 2024-12
See operations.md for the runbook on common catch-up scenarios.
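When planning a rebuild-summaries run, it helps to know how many months a range covers. The sketch below enumerates a YYYY-MM range, assuming both --since and --through are inclusive (this document does not state that explicitly); months_between is a hypothetical helper, not part of hzmetrics.py.

```python
def months_between(since, through):
    """Yield 'YYYY-MM' strings from since to through, both ends included."""
    year, month = map(int, since.split("-"))
    end = tuple(map(int, through.split("-")))
    while (year, month) <= end:
        yield f"{year:04d}-{month:02d}"
        month += 1
        if month > 12:  # roll over into January of the next year
            year, month = year + 1, 1
```

Under the inclusive assumption, the range 2022-01 through 2024-12 used above covers 36 months.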
Coexistence with the legacy pipeline
If you're migrating a hub that's still running the legacy
hubzero-metrics package, disable its cron entries before enabling
the new one — they write to the same tables and concurrent runs
will deadlock on the summary tables. The legacy crontab looks like:
*/15 * * * * /opt/hubzero/bin/metrics/xlogfix_whoisonline.php
10 0 * * * /opt/hubzero/bin/metrics/import/__fetch_apache_and_auth_log.sh
15 0 * * * /opt/hubzero/bin/metrics/import/__import_apache_and_auth_log.sh
30 0 * * * /opt/hubzero/bin/metrics/import/__archive_apache_and_auth_log.sh
40 0 * * * /opt/hubzero/bin/metrics/__process_tool_metrics.sh
50 0 * * * /opt/hubzero/bin/metrics/__process_usage_metrics.sh
50 1 1 * * /opt/hubzero/bin/metrics/__process_usage_metrics_summary.sh
Comment all seven out before deploying the hzmetrics.py tick entry.
The data shape is unchanged, so there's no migration step beyond
running migrate --apply to pick up the indexed dnload column.
The first summarize-month after deployment will rewrite all six
period cells for the target month — no in-place data conversion.