Spotting stuck meters and tool theft with hours-delta anomalies
After every hours log we z-score the delta against a 30-day rolling baseline. Most flags catch stuck meters; some catch unauthorized use. Here's the math + the false-positive guardrails.
We shipped hours-delta anomaly detection in DirtFleet a couple weeks ago. It catches three things: stuck meters, unauthorized use after hours, and the boring case where somebody fat-fingers a reading by 1000. The math is dead simple — a z-score against a rolling baseline — but the guardrails took longer than the algorithm.
The math
After every HoursLog insert we run checkHoursAnomaly. It pulls the last 30 days of deltas for that asset, computes mean and standard deviation, and z-scores the new entry. If |z| ≥ 2.5 AND we have at least 6 prior samples, we raise a YELLOW flag with the actual numbers in the note.
2.5σ is the dial. Tighter (2σ) flags more, including normal variance. Looser (3σ) misses real signal. We landed at 2.5 after a couple weeks of pilot data — flag rate was around 2–4% of logs, which the manager queue can absorb.
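For concreteness, here's a minimal sketch of that check in TypeScript. The names (zScoreCheck, ZScoreResult) and the return shape are made up for illustration; only the constants come from the description above.

```ts
// Illustrative sketch only; constants match the post, names are invented.
const Z_THRESHOLD = 2.5;      // |z| at or above this raises a flag
const MIN_PRIOR_SAMPLES = 6;  // skip assets with too little history

interface ZScoreResult {
  flag: boolean;
  z?: number;
  mean?: number;
  std?: number;
}

function zScoreCheck(priorDeltas: number[], newDelta: number): ZScoreResult {
  if (priorDeltas.length < MIN_PRIOR_SAMPLES) {
    return { flag: false }; // not enough baseline yet, see the next section
  }
  const mean = priorDeltas.reduce((a, b) => a + b, 0) / priorDeltas.length;
  const variance =
    priorDeltas.reduce((a, b) => a + (b - mean) ** 2, 0) / priorDeltas.length;
  const std = Math.sqrt(variance);
  if (std === 0) {
    return { flag: false }; // handled by the constant-prior special case below
  }
  const z = (newDelta - mean) / std;
  return { flag: Math.abs(z) >= Z_THRESHOLD, z, mean, std };
}
```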
Why a sample minimum matters
Brand-new assets have no baseline. If you flag the first 5 readings against a non-existent distribution, every reading looks “anomalous.” We require ≥6 prior logs in the window before any z-score check fires. The trade-off is that a truly stuck meter on a 3-day-old asset gets missed for the first week. Acceptable — a stuck meter on a day-3 asset is rare.
The constant-prior special case
If the meter has read exactly 1500.0 for the last 30 days, every delta in the window is zero, the standard deviation is zero, the z-score is undefined, and a new reading of 1500.5 looks infinitely anomalous. We special-case this: when σ = 0, only flag if the new delta is > max(1, mean × 5). Catches the “someone reset the meter to baseline every day” pattern without raising on a normal half-hour log.
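A sketch of that branch, keeping the same assumptions as the snippet above; the mean here is the mean of the prior deltas, which is zero when the meter never moved:

```ts
// Sigma-zero branch: the baseline has no variance, so z is undefined.
// Only flag when the new delta is clearly out of line with the flat history.
function constantPriorCheck(mean: number, newDelta: number): boolean {
  return newDelta > Math.max(1, mean * 5);
}

// A 0.5 h log against an all-zero baseline stays quiet: 0.5 > max(1, 0) is false.
// A 40 h jump gets flagged: 40 > max(1, 0) is true.
```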
The stale-meter detector
Separate from the z-score path: if the previous log was >14 days ago, raise a YELLOW “Meter gap” flag with the gap length. Catches the case where a unit went OUT_OF_SERVICE, came back, and the operator never logged the in-between hours. Different problem, same UI.
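Roughly, assuming each log row carries a timestamp (the function and field names here are illustrative):

```ts
const MAX_GAP_DAYS = 14;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns a flag note when the gap between logs exceeds 14 days, else null.
function meterGapNote(prevLoggedAt: Date, newLoggedAt: Date): string | null {
  const gapDays = (newLoggedAt.getTime() - prevLoggedAt.getTime()) / MS_PER_DAY;
  return gapDays > MAX_GAP_DAYS
    ? `Meter gap: ${Math.round(gapDays)} days since the previous hours log`
    : null;
}
```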
Dedupe
A flurry of bad logs would otherwise spam the queue with identical anomaly flags. We dedupe by source string + 24h window: if there's already an open YELLOW flag with our marker on the same asset in the last 24 hours, we don't raise a second.
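A sketch of the dedupe predicate, written over an in-memory list of recent flags rather than whatever query the real code runs; the Flag shape and the ANOMALY_SOURCE marker string are assumptions:

```ts
const ANOMALY_SOURCE = 'hours-delta-anomaly'; // hypothetical marker string
const DEDUPE_WINDOW_MS = 24 * 60 * 60 * 1000;

interface Flag {
  assetId: string;
  severity: 'YELLOW' | 'RED';
  source: string;
  status: 'OPEN' | 'RESOLVED';
  createdAt: Date;
}

// True when an equivalent open YELLOW flag already exists in the last 24 h.
function isDuplicateFlag(recentFlags: Flag[], assetId: string, now: Date): boolean {
  return recentFlags.some(
    (f) =>
      f.assetId === assetId &&
      f.status === 'OPEN' &&
      f.severity === 'YELLOW' &&
      f.source === ANOMALY_SOURCE &&
      now.getTime() - f.createdAt.getTime() < DEDUPE_WINDOW_MS,
  );
}
```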
What false positives look like
- Backdated logs — a driver posts a log for three days ago. The z-score against today's baseline can misfire. We accept the false positive; the resolution is one tap.
- Seasonal ramps — first warm spring day, everything wakes up at once and posts 6× the winter rate. The 30-day window includes the slow weeks, so the new deltas z-score high. Manager sees a flurry, hits Acknowledge on each, baseline catches up in a week.
What it's actually caught
In pilot: a torque-sensor stuck-at fault on an excavator (a delta of 0 for two weeks while the unit was clearly running), unauthorized weekend use of a service truck (3 logs in a row that didn't map to any approved job), and one fat-fingered 9999.5 reading on a CAT 320 that would have triggered an immediate AUTO_PM if it had landed.
The code
lib/anomaly-detection.ts, called from createHoursLog right after the row insert. Non-blocking — a failure logs and continues, so a Postgres hiccup never breaks an hours log.
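The call-site pattern, sketched from that description. createHoursLog and checkHoursAnomaly are the names used above; the insert helper and the signatures are stand-ins:

```ts
// Sketch of the non-blocking call site inside createHoursLog (names assumed).
// The anomaly check is best-effort: failures are logged and swallowed so a
// database hiccup never fails the hours log itself.
async function createHoursLog(assetId: string, hours: number): Promise<void> {
  const log = await insertHoursLogRow(assetId, hours); // the real row insert

  try {
    await checkHoursAnomaly(log.id); // runs the z-score / gap / dedupe logic
  } catch (err) {
    console.error('checkHoursAnomaly failed, continuing', err);
  }
}

// Stand-ins so the sketch type-checks; the real functions live in the app.
declare function insertHoursLogRow(
  assetId: string,
  hours: number,
): Promise<{ id: string }>;
declare function checkHoursAnomaly(logId: string): Promise<void>;
```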