From Paper to Live: The Deployment Checklist Nobody Publishes

Most trading content ends at the backtest. You get the Sharpe ratio, the equity curve, maybe a paragraph about going live. Then silence. The gap between a validated backtest and a running live system is enormous, and nobody publishes the details of what goes in between.

We currently run 45 paper trading bots on the Binance mainnet API with real market prices and simulated execution. The total paper allocation is 45,000 dollars across six strategy types. Before any of these bots started trading in paper mode, they passed a deployment checklist that covers execution modeling, risk management, decay detection, safety mechanisms, and crash recovery. This is that checklist.

Step 1: Execution Model Verification

Paper trading must be indistinguishable from live trading at the application layer. If your paper trader uses instant fills at the exact candle close price, you are lying to yourself. Real execution has slippage, fees, and latency.

Our paper trader applies slippage between 2 and 10 basis points per fill, drawn from a uniform random distribution within that range. Buys receive a higher fill price than the signal price. Sells receive a lower fill price. This models the cost of crossing the spread and the market impact of placing an order in a real order book. The range is chosen to span varying liquidity conditions: around 2 basis points is typical of high-volume periods, and up to 10 of thin markets.
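A minimal sketch of that slippage rule in Python, for illustration only. The function name apply_slippage, the side strings, and the constants are hypothetical names chosen for this example, not the identifiers used in our code base.

```python
import random

MIN_SLIPPAGE_BPS = 2.0   # lower bound quoted in the text
MAX_SLIPPAGE_BPS = 10.0  # upper bound quoted in the text


def apply_slippage(signal_price: float, side: str, rng: random.Random) -> float:
    """Return a simulated fill price that is always worse than the signal price."""
    if side not in ("buy", "sell"):
        raise ValueError(f"unknown side: {side!r}")
    # Uniform draw in basis points, converted to a fraction of the price.
    slippage = rng.uniform(MIN_SLIPPAGE_BPS, MAX_SLIPPAGE_BPS) / 10_000
    if side == "buy":
        return signal_price * (1 + slippage)  # buys fill higher
    return signal_price * (1 - slippage)      # sells fill lower


rng = random.Random(42)  # seeded so the example is reproducible
buy_fill = apply_slippage(100.0, "buy", rng)
sell_fill = apply_slippage(100.0, "sell", rng)
```

Passing the random generator in explicitly keeps the model deterministic under test while still randomizing fills in normal operation.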

Fees are modeled at Binance spot rates: 0.02 percent for taker (market) orders and 0.01 percent for maker (limit) orders. Since our strategies use market orders for entry and exit timing, we conservatively model taker fees on both sides. The fee is calculated on the notional value of each fill and deducted from the virtual balance.

Latency is simulated at 1 to 5 milliseconds per order execution. This matters less for 15-minute strategies but becomes relevant for higher-frequency execution. The latency is tracked in each order record for monitoring.

The paper trader tracks a virtual balance that starts at the allocated capital per bot (currently 1,000 dollars per bot) and adjusts with every fill: capital decreases by the buy notional plus fees, increases by the sell notional minus fees. This virtual balance is stored in the database identically to how a live balance would be tracked.
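The balance arithmetic can be sketched as follows. This is an illustrative example, not our production class: VirtualAccount, apply_fill, and the field names are assumptions, and the fee constant simply restates the taker rate quoted above.

```python
from dataclasses import dataclass

TAKER_FEE_RATE = 0.0002  # 0.02 percent, the taker rate quoted in the text


@dataclass
class VirtualAccount:
    cash: float
    position_qty: float = 0.0
    fees_paid: float = 0.0

    def apply_fill(self, side: str, qty: float, fill_price: float) -> float:
        """Adjust the virtual balance for one fill and return the fee charged."""
        notional = qty * fill_price
        fee = notional * TAKER_FEE_RATE
        if side == "buy":
            self.cash -= notional + fee   # pay the notional plus the fee
            self.position_qty += qty
        elif side == "sell":
            self.cash += notional - fee   # receive the notional minus the fee
            self.position_qty -= qty
        else:
            raise ValueError(f"unknown side: {side!r}")
        self.fees_paid += fee
        return fee


account = VirtualAccount(cash=1_000.0)
account.apply_fill("buy", 2.0, 100.0)   # 200 notional, 0.04 fee
account.apply_fill("sell", 2.0, 110.0)  # 220 notional, 0.044 fee
```

In this round trip the 20 dollar gross gain shrinks by 0.084 dollars of fees, which is the kind of drag that instant, fee-free fills would hide.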

Step 2: Per-Bot Risk Gates

Every order passes through five sequential risk checks before execution. The first check that fails blocks the trade.

The first check is stop-loss enforcement. Every long or short entry must have a suggested stop-loss price greater than zero. If the strategy produces a signal without a stop-loss, the order is rejected. This prevents any trade from running with unbounded downside.

The second check is maximum position size. No single position can exceed 25 percent of the bot's allocated capital. If a strategy's sizing algorithm requests a position larger than this, the order is either reduced to the limit or rejected entirely.
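As a rough sketch, the size cap reduces to a clamp with an optional reject. The function name and the reject_oversized flag are hypothetical; which of the two behaviors applies is a configuration choice the text leaves open.

```python
from typing import Optional

MAX_POSITION_FRACTION = 0.25  # 25 percent of the bot's allocated capital


def cap_position_notional(
    requested_notional: float,
    allocated_capital: float,
    reject_oversized: bool = False,
) -> Optional[float]:
    """Return the notional allowed for this order, or None if it is rejected."""
    limit = allocated_capital * MAX_POSITION_FRACTION
    if requested_notional <= limit:
        return requested_notional
    # Oversized request: either reject outright or shrink to the limit.
    return None if reject_oversized else limit
```

With 1,000 dollars allocated, a 400 dollar request is either cut to 250 dollars or rejected.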

The third check is the drawdown circuit breaker. Each bot has a maximum drawdown threshold of 20 percent. The system monitors equity snapshots taken hourly, comparing current equity to peak equity. If the bot's drawdown from peak exceeds 20 percent, all new entries are blocked. The bot can still close existing positions but cannot open new ones until equity recovers.
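The drawdown calculation behind this check is short enough to show. The sketch below assumes the hourly snapshots arrive as a plain list of equity values in chronological order; the function names are illustrative.

```python
from typing import Sequence

MAX_DRAWDOWN = 0.20  # per-bot threshold from the text


def drawdown_from_peak(equity_snapshots: Sequence[float]) -> float:
    """Fractional decline of the latest snapshot from the highest snapshot."""
    if not equity_snapshots:
        raise ValueError("at least one equity snapshot is required")
    peak = max(equity_snapshots)
    current = equity_snapshots[-1]
    return (peak - current) / peak


def entries_allowed(equity_snapshots: Sequence[float]) -> bool:
    """New entries are blocked once drawdown exceeds the threshold."""
    return drawdown_from_peak(equity_snapshots) <= MAX_DRAWDOWN
```

Because the peak is taken over the whole history, the gate stays closed until equity climbs back to within 20 percent of that peak.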

The fourth check is the daily loss limit. If a bot's realized losses for the current UTC day exceed 5 percent of its allocated capital, all trading is suspended until the next day. This prevents a catastrophic sequence of losses from depleting capital in a single session.
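One way to express the daily limit is sketched below. It assumes closed trades are available as pairs of a timezone-aware close time and a realized PnL in dollars, and it counts only losing trades toward the limit rather than netting them against the day's winners. Both points are assumptions made for this example.

```python
from datetime import datetime, timezone
from typing import Iterable, Tuple

DAILY_LOSS_LIMIT = 0.05  # 5 percent of allocated capital


def trading_suspended(
    closed_trades: Iterable[Tuple[datetime, float]],
    allocated_capital: float,
    now: datetime,
) -> bool:
    """True when today's realized losses (UTC day) exceed the daily limit."""
    today = now.astimezone(timezone.utc).date()
    realized_losses = sum(
        -pnl
        for closed_at, pnl in closed_trades
        if pnl < 0 and closed_at.astimezone(timezone.utc).date() == today
    )
    return realized_losses > allocated_capital * DAILY_LOSS_LIMIT
```

Keying on the UTC date means the suspension lifts automatically at 00:00 UTC, because yesterday's losses no longer match the current date.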

The fifth check is the consecutive loss cooldown. Once a bot reaches three consecutive losing trades, its position size is halved, and it is halved again for each further loss: 50 percent of normal after three losses, 25 percent after four, and 12.5 percent after five. This progressive reduction prevents a broken strategy from continuing to bleed at full size. The multiplier resets to 100 percent after a winning trade.
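The cooldown schedule is a simple power of one half. The sketch below reads the rule as "halve at the third consecutive loss and at every loss after it", which is the reading that yields 12.5 percent after five losses; the function name is illustrative.

```python
COOLDOWN_START = 3  # consecutive losses at which size reduction begins


def size_multiplier(consecutive_losses: int) -> float:
    """Fraction of normal position size to use after a losing streak."""
    if consecutive_losses < COOLDOWN_START:
        return 1.0
    # One halving at the third loss, plus one more for each loss beyond it.
    halvings = consecutive_losses - COOLDOWN_START + 1
    return 0.5 ** halvings


schedule = {n: size_multiplier(n) for n in range(6)}
# {0: 1.0, 1: 1.0, 2: 1.0, 3: 0.5, 4: 0.25, 5: 0.125}
```

A winning trade resets the streak counter to zero, which brings the multiplier back to 1.0.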

Step 3: Portfolio-Level Risk Constraints

Individual bot risk is necessary but not sufficient. If all 45 bots pile into correlated positions simultaneously, the portfolio can still blow up even though each individual bot is within its limits.

Three portfolio-level constraints prevent this. Total exposure is capped at 50 percent of aggregate capital across all bots. With 45,000 dollars allocated, no more than 22,500 dollars can be in open positions at any time. This means that even if every bot simultaneously signals an entry, the portfolio manager will reject entries that would push total exposure above the cap.

Single asset concentration is capped at 25 percent of aggregate capital. No single symbol (for example, SOL/USDT) can have more than 11,250 dollars of open exposure across all bots that trade it. This prevents over-concentration when multiple strategies converge on the same asset.
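Both caps can be checked in one pass before an entry is accepted. In this sketch, open exposure is assumed to be a dictionary from symbol to open notional in dollars, summed across all bots; the function and parameter names are hypothetical.

```python
from typing import Mapping

MAX_TOTAL_EXPOSURE = 0.50  # share of aggregate capital in open positions
MAX_ASSET_EXPOSURE = 0.25  # share of aggregate capital in any one symbol


def entry_permitted(
    open_exposure: Mapping[str, float],
    symbol: str,
    notional: float,
    aggregate_capital: float,
) -> bool:
    """True if adding this entry keeps both portfolio caps intact."""
    total_after = sum(open_exposure.values()) + notional
    if total_after > aggregate_capital * MAX_TOTAL_EXPOSURE:
        return False
    symbol_after = open_exposure.get(symbol, 0.0) + notional
    return symbol_after <= aggregate_capital * MAX_ASSET_EXPOSURE


book = {"SOL/USDT": 11_000.0, "BTC/USDT": 10_000.0}
small_sol = entry_permitted(book, "SOL/USDT", 200.0, 45_000.0)  # 11,200 on SOL
large_sol = entry_permitted(book, "SOL/USDT", 300.0, 45_000.0)  # 11,300 on SOL
```

With 45,000 dollars of aggregate capital, the first SOL/USDT entry stays under the 11,250 dollar asset cap and is accepted, while the second is rejected even though total exposure would still be below 22,500 dollars.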

The portfolio drawdown halt triggers at 15 percent aggregate drawdown from peak portfolio equity. If the entire portfolio declines 15 percent from its highest recorded value, all trading stops across all bots. This is the emergency brake. It requires manual intervention to reset, ensuring a human reviews the situation before trading resumes.

Step 4: Strategy Decay Detection

A strategy that passes validation today might stop working in three months as market conditions shift. We cannot rely on periodic manual reviews. The decay detector runs continuously and automatically pauses bots whose strategies degrade.

The mechanism is straightforward. Every bot has its recent trades evaluated over a rolling 30-day window. If the bot has at least 10 closed trades in that window (the minimum sample we accept for a usable estimate), the system calculates a rolling Sharpe ratio from the per-trade PnL percentages: the mean divided by the standard deviation, annualized by multiplying by the square root of 365 for crypto's year-round markets.

If the rolling Sharpe drops below 0.5, the bot is automatically paused. A Sharpe of 0.5 means the strategy is barely outperforming zero after risk adjustment. At that level, the remaining edge is small enough that estimation noise and execution costs can easily erase it, so we stop trading rather than wait for confirmation. The bot remains paused until conditions improve or the operator manually reviews and adjusts the configuration.

The 10-trade minimum reduces false positives. A bot that has only made 3 trades in 30 days does not have enough data for a meaningful Sharpe calculation. The threshold limits automatic pauses to bots with enough trades to suggest real decay, not just bad luck on a small sample.
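The decay rule can be sketched with the standard library alone. The example assumes the caller has already filtered trades to the 30-day window and passes per-trade PnL percentages; the function names are illustrative, and the sample standard deviation is one reasonable choice that the text does not specify.

```python
import math
import statistics
from typing import Optional, Sequence

MIN_TRADES = 10
SHARPE_FLOOR = 0.5
ANNUALIZATION = math.sqrt(365)  # scaling factor described in the text


def rolling_sharpe(trade_returns_pct: Sequence[float]) -> Optional[float]:
    """Annualized Sharpe of per-trade returns, or None if it cannot be estimated."""
    if len(trade_returns_pct) < MIN_TRADES:
        return None  # too few trades in the window
    stdev = statistics.stdev(trade_returns_pct)
    if stdev == 0:
        return None  # identical returns: the ratio is undefined
    return statistics.fmean(trade_returns_pct) / stdev * ANNUALIZATION


def should_pause(trade_returns_pct: Sequence[float]) -> bool:
    """Pause only when a Sharpe can be computed and it is below the floor."""
    sharpe = rolling_sharpe(trade_returns_pct)
    return sharpe is not None and sharpe < SHARPE_FLOOR
```

Returning None instead of zero for small samples keeps "not enough data" distinct from "bad performance", so a quiet bot is never paused by this rule.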

Step 5: Dead Man's Switch

Automated trading systems need a safety mechanism for when the operator is unavailable. Exchange outages, personal emergencies, network failures. If nobody is watching and something goes wrong, the system should stop itself.

Our dead man's switch requires an operator check-in via Telegram every 24 hours. The state machine has four levels. For the first 20 hours, the status is normal and all systems operate without restriction. At 83 percent of the 24-hour window (approximately 20 hours), the system logs a warning and notifies the operator. At 96 percent (approximately 23 hours), the system escalates to critical and sends an urgent alert. At 24 hours with no check-in, the switch triggers: all bots are forced to stop, and the system enters a latched state.

The latched state is important. A simple check-in does not un-latch the system. The operator must send an explicit reset command, confirming they have reviewed the situation and want trading to resume. This prevents the scenario where an automated check-in script masks the fact that the operator is actually unavailable.
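The escalation levels and the latch can be sketched as a small state machine. Times are plain hour counts here to keep the example short, and the class and method names are hypothetical; the Telegram integration is omitted entirely.

```python
from dataclasses import dataclass

WINDOW_HOURS = 24.0
WARNING_FRACTION = 0.83   # roughly 20 hours
CRITICAL_FRACTION = 0.96  # roughly 23 hours


@dataclass
class DeadMansSwitch:
    last_check_in: float  # hours on any monotonic clock
    latched: bool = False

    def status(self, now: float) -> str:
        """Return normal, warning, critical, or triggered for the current time."""
        elapsed = (now - self.last_check_in) / WINDOW_HOURS
        if self.latched or elapsed >= 1.0:
            self.latched = True  # once triggered, stay triggered
            return "triggered"
        if elapsed >= CRITICAL_FRACTION:
            return "critical"
        if elapsed >= WARNING_FRACTION:
            return "warning"
        return "normal"

    def check_in(self, now: float) -> None:
        """Refresh the timer. Deliberately does not clear the latch."""
        self.last_check_in = now

    def reset(self, now: float) -> None:
        """Explicit operator reset: clear the latch and restart the window."""
        self.latched = False
        self.last_check_in = now
```

Keeping check_in and reset as separate methods is what makes the latch meaningful: a scripted check-in can keep the timer fresh, but only a deliberate reset resumes trading after a trigger.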

Step 6: Crash Recovery

Processes crash. Power fails. The operating system updates and restarts unexpectedly. When the trading system comes back online, it needs to reconcile its state with the exchange.

Our state recovery runs automatically on startup. It queries the database for all bots that were in running state when the process stopped. For each bot, it finds any pending or partially filled orders. Then it checks each order against the exchange API to determine what happened while the system was down.

If an order was filled on the exchange during the downtime, the system syncs the fill data (quantity, average fill price, fees) and updates the local position accordingly. If an order was cancelled or expired on the exchange, the system marks it as cancelled locally. If an order is still pending on the exchange, it is left in place and the bot resumes monitoring it.
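The three reconciliation cases, plus the unresolvable one, can be sketched as below. The exchange lookup is passed in as a callable so the example stays self-contained; LocalOrder, the status strings, and the dictionary keys are assumptions for illustration and do not mirror any exchange's actual response format.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Optional


@dataclass
class LocalOrder:
    order_id: str
    status: str  # "pending" or "partially_filled" before the crash
    filled_qty: float = 0.0
    avg_fill_price: float = 0.0
    fees: float = 0.0


def reconcile(
    orders: Iterable[LocalOrder],
    fetch_remote: Callable[[str], Optional[dict]],
) -> Dict[str, int]:
    """Sync local orders with exchange state and return summary counts."""
    counts = {"synced_fills": 0, "cancelled": 0, "still_open": 0, "orphaned": 0}
    for order in orders:
        remote = fetch_remote(order.order_id)
        state = remote["status"] if remote else None
        if state == "filled":
            # Filled during downtime: copy the fill details locally.
            order.filled_qty = remote["filled_qty"]
            order.avg_fill_price = remote["avg_fill_price"]
            order.fees = remote["fees"]
            order.status = "filled"
            counts["synced_fills"] += 1
        elif state in ("cancelled", "expired"):
            order.status = "cancelled"
            counts["cancelled"] += 1
        elif state in ("pending", "partially_filled"):
            counts["still_open"] += 1  # left in place, bot resumes monitoring
        else:
            order.status = "error"     # unknown to the exchange or unrecognized
            counts["orphaned"] += 1
    return counts


exchange = {
    "a1": {"status": "filled", "filled_qty": 0.5, "avg_fill_price": 101.2, "fees": 0.01},
    "a2": {"status": "expired"},
    "a3": {"status": "pending"},
}
orders = [
    LocalOrder("a1", "pending"),
    LocalOrder("a2", "pending"),
    LocalOrder("a3", "partially_filled"),
    LocalOrder("a4", "pending"),  # not known to the exchange
]
summary = reconcile(orders, exchange.get)
```

Injecting the lookup function also makes the recovery path testable without network access, which matters for code that only runs after a crash.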

After reconciliation, each bot is marked as recovered and ready to resume normal operation. If any order is in an inconsistent state that cannot be automatically resolved, the bot is marked as errored, requiring manual intervention. The system publishes a recovery event with counts of recovered bots, synced fills, and orphaned orders.

Step 7: Monitoring and Observation Period

After all of the above is in place, we still do not deploy to live immediately. Paper trading runs for a minimum observation period where we monitor several things.

Trade frequency should match backtest expectations. If a 15-minute strategy backtested at 80 trades per year, we expect to see roughly that rate in paper trading. Significant deviations indicate the live market conditions differ from the backtest period.

Execution quality is tracked by comparing the signal price to the actual fill price. The slippage should be within the modeled range. If real slippage exceeds the paper model, the backtest assumptions are too optimistic.

Risk gate activations are monitored. If the drawdown circuit breaker fires frequently, the strategy may be more volatile in current conditions than the backtest suggested. If the consecutive loss cooldown activates repeatedly, the strategy may be in a regime where its edge does not apply.

Equity curves are compared to Monte Carlo projections. If the actual paper trading drawdown exceeds the 95th percentile Monte Carlo drawdown, that is a red flag. The strategy may be encountering conditions outside its historical trade distribution.
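One common way to produce such a projection is to bootstrap the historical trade returns. The sketch below assumes that approach, which is not necessarily the method behind our own projections: it resamples trades with replacement, records the maximum drawdown of each simulated path, and reads off the 95th percentile. All names and default values are illustrative.

```python
import random
from typing import Sequence


def max_drawdown(returns: Sequence[float]) -> float:
    """Largest peak-to-trough decline of a compounded equity path."""
    equity = peak = 1.0
    worst = 0.0
    for r in returns:
        equity *= 1 + r
        peak = max(peak, equity)
        worst = max(worst, (peak - equity) / peak)
    return worst


def drawdown_percentile(
    trade_returns: Sequence[float],
    n_paths: int = 2000,
    percentile: float = 0.95,
    seed: int = 0,
) -> float:
    """Percentile of max drawdown across bootstrap resamples of the trades."""
    rng = random.Random(seed)
    n = len(trade_returns)
    draws = sorted(
        max_drawdown(rng.choices(trade_returns, k=n)) for _ in range(n_paths)
    )
    index = min(n_paths - 1, int(percentile * n_paths))
    return draws[index]


def exceeds_projection(live_drawdown: float, trade_returns: Sequence[float]) -> bool:
    """Red flag: the observed drawdown is worse than the simulated percentile."""
    return live_drawdown > drawdown_percentile(trade_returns)
```

Resampling trades independently ignores any clustering of losses, so a projection built this way tends to understate drawdowns for strategies whose losing trades arrive in streaks.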

Current Deployment

Our current paper trading deployment runs 45 bots across six strategy types. Mean reversion on Bollinger Bands accounts for 13 bots across two parameter groups. Momentum RSI plus MACD runs 5 bots on 15-minute timeframes and 6 bots on 4-hour timeframes. Leverage composite runs 3 bots on derivatives data. Correlation regime runs 6 bots on 4-hour cross-asset macro data. NUPL cycle filter runs 7 bots on on-chain analytics. Stablecoin supply momentum runs 5 bots.

Each bot is allocated 1,000 dollars. The aggregate allocation is 45,000 dollars. All run on the Binance mainnet API with real price data. The only difference from live trading is that orders are simulated rather than submitted to the exchange matching engine. Every other component (risk checks, event publishing, database writes, equity tracking, alerting) operates identically to live mode.

This is the system that sits between a validated backtest and real capital. Every component exists because we either lost money (in paper) or discovered a gap that would have cost real money. The checklist is not theoretical. It is the operational reality of running 45 bots simultaneously on a single machine, with enough safety mechanisms that we can sleep while they trade.
