
Local AI for Anomaly Detection: Privacy-First Market Surveillance

QuantForge Team · April 3, 2026 · 7 min read

When something unusual happens in the market, you need to know two things immediately: what happened and what to do about it. The first question is statistical: is this price move, volume spike, or correlation break significantly outside normal parameters? The second question requires judgment: is this a buying opportunity, a warning sign, or noise?

Our anomaly detection system splits these two questions between two different approaches. Statistical detection using Z-scores and moving averages identifies when something unusual has occurred. Local AI running Ollama with Llama 3.1 at 8 billion parameters classifies what the anomaly means and recommends an appropriate response. The statistical detection is fast and precise. The AI classification adds context that pure statistics cannot provide.

Why Local AI

The decision to run anomaly classification locally rather than through a cloud API was driven by three requirements: latency, cost, and privacy.

Latency matters because anomalies are time-sensitive. When a price Z-score exceeds 3.0 standard deviations, we need classification within a second or two, not the several seconds (plus network variability) that a round trip to a cloud API can take. Ollama running on the same machine produces responses in hundreds of milliseconds. For anomaly detection, where the window between detection and response can determine whether you avoid a loss or walk into one, this speed difference is material.

Cost matters because anomaly detection is high frequency relative to other AI tasks. Our statistical detectors evaluate every candle across every symbol. During volatile markets, they might flag multiple anomalies per hour across 45 bots. At cloud API pricing, this could cost several dollars per day during market stress, precisely when the system is most active. Local inference costs zero per call after the initial model download.

Privacy matters because anomaly detection data includes your portfolio composition, position sizes, and risk exposure. Sending this to a cloud API means a third party has a real-time view of your trading activity. For a self-hosted trading platform, this undermines the entire premise of running your own infrastructure. Local inference keeps all data on your machine.

The Statistical Detection Layer

Before AI gets involved, three statistical detectors screen for unusual market behavior. Each uses simple, fast calculations that run on every candle.

The price anomaly detector calculates a rolling Z-score over 50 candles. If the current price's deviation from the rolling mean exceeds 3.0 standard deviations, it is flagged as anomalous. Severity scales with the Z-score: 3.0 to 3.5 is low, 3.5 to 4.0 is medium, 4.0 to 5.0 is high, and above 5.0 is critical. A Z-score of 5.0 represents a move that should occur approximately once every 3.5 million observations under normal distribution assumptions. In crypto, fat tails make this more frequent, but it is still rare enough to warrant attention.
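A minimal sketch of what such a detector might look like, using the 50-candle window, 3.0 threshold, and severity bands described above. The function and variable names are illustrative, not taken from the actual codebase:

```python
from collections import deque
from statistics import mean, stdev

WINDOW = 50       # rolling window length, per the post
THRESHOLD = 3.0   # minimum |Z| to flag an anomaly

def severity(z: float) -> str:
    """Map an absolute Z-score to the severity bands from the post."""
    z = abs(z)
    if z >= 5.0:
        return "critical"
    if z >= 4.0:
        return "high"
    if z >= 3.5:
        return "medium"
    return "low"

def check_price(prices: deque, current: float):
    """Return (severity, z) if the current close is anomalous, else None."""
    if len(prices) < WINDOW:
        return None  # not enough history for a stable mean/stdev
    mu = mean(prices)
    sigma = stdev(prices)
    if sigma == 0:
        return None  # flat series: Z-score undefined
    z = (current - mu) / sigma
    if abs(z) < THRESHOLD:
        return None
    return severity(z), z
```

In practice the rolling window would be maintained incrementally per symbol rather than recomputed from scratch on every candle.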

The volume anomaly detector compares current volume to a 20-candle rolling average. If current volume exceeds 5 times the average, it is flagged. Severity scales similarly: 5 to 7 times is low, 7 to 10 is medium, 10 to 20 is high, and above 20 times is critical. Volume explosions often precede or accompany significant price moves and provide early warning of regime changes.
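The volume check follows the same shape. A sketch with the 20-candle average and the multiplier bands from the post (again, names are illustrative):

```python
from statistics import mean

VOL_WINDOW = 20      # rolling average length
VOL_THRESHOLD = 5.0  # minimum multiple of average volume to flag

def volume_severity(mult: float) -> str:
    """Map a volume multiplier to the severity bands from the post."""
    if mult >= 20:
        return "critical"
    if mult >= 10:
        return "high"
    if mult >= 7:
        return "medium"
    return "low"

def check_volume(volumes: list, current: float):
    """Return (severity, multiplier) if current volume is anomalous, else None."""
    if len(volumes) < VOL_WINDOW:
        return None
    avg = mean(volumes[-VOL_WINDOW:])
    if avg <= 0:
        return None  # avoid dividing by a zero or negative average
    mult = current / avg
    if mult < VOL_THRESHOLD:
        return None
    return volume_severity(mult), mult
```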

The correlation break detector monitors the rolling correlation between each symbol and BTC over 30 candles. If the recent correlation drops by more than 0.5 points from the rolling average, it is flagged. A sudden decorrelation from BTC often signals an asset-specific event (a hack, a listing, a regulatory action) that the broader market has not yet priced in.

These three detectors are deterministic, fast, and rarely fire on normal market data (a 3-sigma threshold will still trip occasionally by chance, which is one reason a classification layer sits behind them). They serve as the first filter, and only anomalies that pass this statistical screen are sent to AI for classification.

The Classification Layer

When a statistical detector flags an anomaly, the local AI model receives a structured description of what was detected: the symbol, the anomaly type (price, volume, or correlation), the severity, the specific metric values (Z-score, volume multiplier, correlation drop), and recent price context.

The model classifies the anomaly into one of six categories. Potential manipulation: coordinated activity suggesting wash trading or spoofing. Whale activity: large single-entity moves detected through volume patterns. News-driven: reaction to a specific event (often identifiable by the timing and magnitude). Liquidation cascade: forced selling creating a waterfall pattern. Technical breakout: a genuine trend initiation or reversal. Unclassified: insufficient evidence for a specific classification.
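Conceptually, the handoff looks something like the sketch below, using Ollama's local `/api/generate` endpoint with `"format": "json"` to constrain the output. The prompt wording, field names, and category identifiers are illustrative assumptions, not the platform's actual schema:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

# Illustrative category identifiers matching the six classes in the post.
CATEGORIES = ["potential_manipulation", "whale_activity", "news_driven",
              "liquidation_cascade", "technical_breakout", "unclassified"]

def build_prompt(anomaly: dict) -> str:
    """Serialize the detector output into a structured classification prompt."""
    return (
        "You are classifying a market anomaly. Respond with JSON containing "
        f"'category' (one of {CATEGORIES}) and 'recommended_action'.\n"
        f"Anomaly: {json.dumps(anomaly)}"
    )

def classify(anomaly: dict) -> dict:
    """Send the anomaly to the local Llama 3.1 8B model via Ollama."""
    body = json.dumps({
        "model": "llama3.1:8b",
        "prompt": build_prompt(anomaly),
        "format": "json",   # ask Ollama to emit valid JSON only
        "stream": False,    # return one complete response object
    }).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req, timeout=10) as resp:
        # Ollama wraps the model output in a "response" field.
        return json.loads(json.loads(resp.read())["response"])
```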

Each classification comes with a recommended action. Potential manipulation and critical-severity anomalies recommend pausing trading on the affected symbol. High-severity whale activity recommends reducing exposure. News-driven anomalies recommend monitoring closely (the initial move may be an overreaction). Technical breakouts recommend evaluating the opportunity. Low-severity anomalies recommend logging and continuing.

Rule-Based Fallback

If Ollama is unavailable (the service is not running, or the local GPU is occupied), the system falls back to rule-based classification using severity alone. Critical anomalies trigger a pause recommendation. High severity triggers an exposure reduction. Medium severity triggers close monitoring. Low severity is logged and ignored.
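The severity-to-action mapping above reduces to a small lookup. A minimal sketch, with action names chosen for illustration:

```python
# Conservative severity-only fallback, used when Ollama is unreachable.
FALLBACK_ACTIONS = {
    "critical": "pause_trading",
    "high": "reduce_exposure",
    "medium": "monitor_closely",
    "low": "log_only",
}

def fallback_classify(severity: str) -> dict:
    """Classification without AI: no category, action driven by severity alone."""
    return {
        "category": "unclassified",
        "recommended_action": FALLBACK_ACTIONS.get(severity, "log_only"),
    }
```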

This fallback is less nuanced than the AI classification. It cannot distinguish between a whale accumulating (potentially bullish) and a liquidation cascade (bearish). But it maintains the core safety function: severe anomalies produce conservative responses regardless of whether AI is available to classify them.

The fallback is important for system reliability. Our anomaly detection needs to work at 3 AM when no one is watching, during network outages when cloud APIs are unreachable, and during system updates when the Ollama service might be temporarily down. The statistical detectors and rule-based fallback provide a floor of protection that AI enhances but never replaces.

How Anomalies Flow Through the System

When an anomaly is detected and classified, it is published on the event bus as an ANOMALY_DETECTED event. Multiple subscribers react to this event.

The bot manager checks whether any running bots trade the affected symbol. If the recommended action is to pause trading, those bots receive a temporary hold on new entries. Existing positions are not closed (the anomaly might be a buying opportunity for existing positions), but no new risk is added.
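The fan-out pattern can be sketched with a toy in-process bus; the platform's actual event bus and event schema are not shown in the post, so everything here is illustrative:

```python
from collections import defaultdict

class EventBus:
    """Minimal publish/subscribe bus: handlers keyed by event type."""
    def __init__(self):
        self._subs = defaultdict(list)

    def subscribe(self, event_type: str, handler):
        self._subs[event_type].append(handler)

    def publish(self, event_type: str, payload: dict):
        for handler in self._subs[event_type]:
            handler(payload)

paused_symbols = set()

def bot_manager_handler(event: dict):
    """Put a temporary hold on new entries for the affected symbol.
    Existing positions are left open, matching the behavior described above."""
    if event["recommended_action"] == "pause_trading":
        paused_symbols.add(event["symbol"])

bus = EventBus()
bus.subscribe("ANOMALY_DETECTED", bot_manager_handler)
bus.publish("ANOMALY_DETECTED",
            {"symbol": "SOL/USDT", "severity": "critical",
             "recommended_action": "pause_trading"})
```

In the real system the Telegram notifier and risk event persister would subscribe to the same event type alongside the bot manager.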

The Telegram notification system alerts the operator with the anomaly details, classification, and recommended action. This allows human judgment to override the automated response if the classification seems incorrect.

The risk event persister logs the anomaly to the database for later analysis. Over time, the anomaly log reveals patterns: which symbols experience the most anomalies, which types of anomalies precede profitable or unprofitable conditions, and how well the AI classification correlates with actual outcomes.

Local Model Selection

We chose Llama 3.1 at 8 billion parameters for anomaly classification because it balances capability with resource requirements. The model runs comfortably on machines with 8 GB or more of available memory, which is reasonable for a trading server that also runs 45 bots, a database, and a web dashboard.

Larger models (70 billion parameters) would provide more nuanced classifications but require dedicated GPU hardware. Smaller models (1 to 3 billion parameters) would be faster but sacrifice classification quality. The 8 billion parameter model produces classifications that are correct approximately 80 to 85 percent of the time based on our manual review of historical anomalies, which is sufficient given that the rule-based fallback handles the remaining cases conservatively.

The model is downloaded once and runs indefinitely without updates. This is a deliberate choice for stability. Trading infrastructure should not change behavior unexpectedly. A model update that changes classification tendencies could alter the system's risk response in ways that are difficult to predict or backtest. When we want to test a new model, we run it in parallel with the existing model and compare classifications before switching.

Privacy as a Feature

For traders who use cloud-based platforms, every trade, every position, and every signal is visible to the platform operator and potentially to the AI service provider. This creates multiple risks: data breaches exposing trading strategies, providers front-running detected patterns, and regulatory exposure from third-party data handling.

Running anomaly detection locally eliminates all of these risks. Your market analysis, portfolio composition, and risk exposure never leave your machine. The local AI model has no network connection and cannot exfiltrate data even if compromised. This is the privacy promise of self-hosted trading: your data is your data, and no third party needs to see it to provide the service.

Cloud AI (Claude for sentiment and enrichment) does receive limited data about individual signals and market conditions. But the bulk of the surveillance, the continuous anomaly detection across all symbols and all bots, runs entirely on local hardware. The cloud AI handles infrequent, high-value tasks. The local AI handles frequent, privacy-sensitive tasks. The boundary is drawn at the point where the privacy cost of cloud inference outweighs the quality benefit.