ccxt in Practice: Fetching Candles in 20 Lines

Fetching historical candle data from a crypto exchange is the first thing every trading system needs to do. The ccxt library (CryptoCurrency eXchange Trading Library) provides a unified API across 100+ exchanges, so you write the code once and it works with Binance, Bybit, OKX, or any other supported exchange. Our platform uses ccxt as the sole interface to Binance for all market data and order execution.

Here is how we do it in practice, and what we learned after fetching 20.8 million candles across 25 symbols and 11 timeframes.

The Minimal Fetch

At its simplest, fetching candles from Binance with ccxt takes well under 20 lines of Python, imports included.

import ccxt
import pandas as pd

exchange = ccxt.binance({"enableRateLimit": True})
ohlcv = exchange.fetch_ohlcv("SOL/USDT", "15m", limit=500)
df = pd.DataFrame(ohlcv, columns=[
    "timestamp", "open", "high", "low", "close", "volume"
])
df["timestamp"] = pd.to_datetime(df["timestamp"], unit="ms")

That gets you the most recent 500 fifteen-minute candles for SOL/USDT. The enableRateLimit: True setting tells ccxt to throttle requests automatically to respect Binance's rate limits. Without throttling, a burst of rapid requests draws HTTP 429 responses, and continuing to send requests after that escalates to a temporary IP ban.

The response format is standardized across all exchanges: a list of lists where each inner list is [timestamp_ms, open, high, low, close, volume]. Converting to a pandas DataFrame is immediate.

Pagination with the since Parameter

The limit parameter has a per-request cap that depends on the exchange and market; on Binance it is 1,000 candles for spot and 1,500 for futures. To fetch years of historical data, you paginate using the since parameter, which is a Unix timestamp in milliseconds.

since = exchange.parse8601("2021-01-01T00:00:00Z")
all_candles = []
while True:
    batch = exchange.fetch_ohlcv(
        "SOL/USDT", "15m", since=since, limit=1000
    )
    if not batch:
        break
    all_candles.extend(batch)
    since = batch[-1][0] + 1  # next ms after last candle

The key detail is since = batch[-1][0] + 1. You advance the cursor to one millisecond after the last candle's timestamp to avoid duplicating the boundary candle. Without that + 1, the last candle of each batch comes back as the first candle of the next batch, and the loop never terminates, because the final candle keeps returning as a one-element batch.

Our DataFetcher module wraps this pagination loop with several production-grade additions that the minimal version lacks.
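
As an illustration of the kind of hardening such a wrapper might add, here is a minimal sketch of a pagination helper. It is not the DataFetcher itself: the function name fetch_all_ohlcv, the until parameter, and the stalled-cursor check are assumptions made for this example. It takes any callable with the same signature as exchange.fetch_ohlcv, so it can be exercised without network access.

```python
from typing import Callable, List, Optional

Candle = List[float]  # [timestamp_ms, open, high, low, close, volume]


def fetch_all_ohlcv(
    fetch: Callable[..., List[Candle]],
    symbol: str,
    timeframe: str,
    since: int,
    limit: int = 1000,
    until: Optional[int] = None,
) -> List[Candle]:
    """Page through history with `since`, stopping on an empty or stalled batch."""
    candles: List[Candle] = []
    while until is None or since < until:
        batch = fetch(symbol, timeframe, since=since, limit=limit)
        if not batch:
            break  # the exchange has no more data to return
        # Keep only candles at or after the cursor, in case the exchange
        # hands back the boundary candle again.
        fresh = [c for c in batch if c[0] >= since]
        if until is not None:
            fresh = [c for c in fresh if c[0] < until]
        candles.extend(fresh)
        next_since = batch[-1][0] + 1  # one ms past the last candle
        if next_since <= since:
            break  # cursor did not advance; avoid an infinite loop
        since = next_since
    return candles
```

With a real exchange object, the call would look like fetch_all_ohlcv(exchange.fetch_ohlcv, "SOL/USDT", "15m", exchange.parse8601("2021-01-01T00:00:00Z")).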

Rate Limiting in Practice

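Binance allows 1,200 weight per minute for REST API calls. Each fetch_ohlcv call costs 1-5 weight depending on the limit parameter. At 5 weight for a 1,000-candle request, that budget covers about 240 requests per minute (1,200 / 5). Binance has revised these numbers over time, so confirm the current values in the exchange's documentation before relying on them.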

With enableRateLimit: True, ccxt inserts automatic delays between requests. But when you are fetching data for 25 symbols across 11 timeframes, the naive approach of serial fetching takes hours. We use a smarter approach: fetch symbols in sequence but timeframes in parallel batches, with explicit rate limit tracking.

Our DataFetcher maintains a rolling window of request timestamps and calculates the remaining weight budget before each batch. If the budget is low, it sleeps for the remainder of the minute window. This is more aggressive than ccxt's built-in rate limiter (which is conservative by default) while still staying within Binance's limits.
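
The rolling-window idea can be sketched as follows. This is a simplified illustration, not the DataFetcher's code: the class name WeightBudget, its parameters, and the injectable clock and sleep functions are all assumptions for the example. One deliberate difference from the description above is that this sketch sleeps only until the oldest recorded request leaves the window, rather than for the remainder of the minute.

```python
import time
from collections import deque
from typing import Callable, Deque, Tuple


class WeightBudget:
    """Rolling-window tracker for request weight (illustrative sketch)."""

    def __init__(
        self,
        max_weight: int = 1200,
        window_s: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.max_weight = max_weight
        self.window_s = window_s
        self._clock = clock
        self._sleep = sleep
        self._spent: Deque[Tuple[float, int]] = deque()  # (timestamp, weight)

    def _expire(self) -> None:
        # Drop entries that have aged out of the window.
        cutoff = self._clock() - self.window_s
        while self._spent and self._spent[0][0] <= cutoff:
            self._spent.popleft()

    def remaining(self) -> int:
        self._expire()
        return self.max_weight - sum(w for _, w in self._spent)

    def acquire(self, weight: int) -> None:
        """Block until `weight` fits in the window, then record it."""
        if weight > self.max_weight:
            raise ValueError("weight exceeds the whole budget")
        while self.remaining() < weight:
            # Sleep until the oldest recorded request leaves the window.
            oldest_ts = self._spent[0][0]
            self._sleep(max(0.0, oldest_ts + self.window_s - self._clock()))
        self._spent.append((self._clock(), weight))
```

A caller would invoke budget.acquire(5) immediately before each 1,000-candle request, assuming that request costs 5 weight.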

In practice, a full historical backfill for a new symbol (all 11 timeframes from 2021 to present) takes about 15-20 minutes. Adding just the latest candles for all 25 symbols (incremental sync) takes under 2 minutes.
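
A rough way to size a backfill is to count the paginated requests each timeframe needs. The sketch below is a back-of-the-envelope estimate only; the names TIMEFRAME_SECONDS and requests_needed are hypothetical, and it assumes every request returns a full page of limit candles with no gaps in the history.

```python
import math
from datetime import datetime, timezone

# Candle length in seconds for each timeframe (illustrative lookup table).
TIMEFRAME_SECONDS = {
    "1m": 60, "5m": 300, "15m": 900, "30m": 1800,
    "1h": 3600, "2h": 7200, "4h": 14400, "6h": 21600,
    "8h": 28800, "12h": 43200, "1d": 86400,
}


def requests_needed(start: datetime, end: datetime,
                    timeframe: str, limit: int = 1000) -> int:
    """Estimate how many paginated requests cover [start, end)."""
    span_s = (end - start).total_seconds()
    candles = math.ceil(span_s / TIMEFRAME_SECONDS[timeframe])
    return math.ceil(candles / limit)


start = datetime(2021, 1, 1, tzinfo=timezone.utc)
end = datetime(2022, 1, 1, tzinfo=timezone.utc)  # one 365-day year
per_timeframe = {tf: requests_needed(start, end, tf) for tf in TIMEFRAME_SECONDS}
```

For that one-year span, the 1-minute timeframe alone accounts for 526 of the 709 estimated requests, which is why it dominates backfill time.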

Retry Logic for Exchange Flakiness

Exchanges are not perfectly reliable. Binance returns transient errors roughly once per thousand requests in our experience: timeouts during high-volatility periods, 502 errors during maintenance windows, and occasional rate limit rejections even when you think you are within bounds.

Our DataFetcher retries transient errors with exponential backoff. The first retry waits 1 second, the second waits 2, then 4, up to a maximum of 5 retries. Non-transient errors (invalid symbol, authentication failure) are raised immediately without retry.

The retry wrapper also handles a ccxt-specific quirk: some exchanges return an empty list instead of raising an error when you request data beyond the available history. Our fetcher detects this (empty response when previous batch had data and we have not reached the current time) and terminates the pagination loop gracefully.
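
The backoff schedule described above (1, 2, 4, 8, 16 seconds across at most 5 retries) can be sketched like this. The helper name retry_with_backoff and its parameters are assumptions for illustration, not the DataFetcher's interface. The caller decides which exception types count as transient; with ccxt, passing (ccxt.NetworkError,) would be a reasonable starting point, since timeouts and rate-limit errors derive from it while bad-symbol and authentication errors do not.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    transient: Tuple[Type[BaseException], ...],
    max_retries: int = 5,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `call` on transient errors, doubling the wait each time."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except transient:
            if attempt == max_retries:
                raise  # retries exhausted; surface the last error
            sleep(base_delay_s * 2 ** attempt)  # 1, 2, 4, 8, 16 seconds
    raise AssertionError("unreachable")
```

Any exception type not listed in transient propagates immediately on the first attempt, which matches the behaviour described for invalid symbols and authentication failures.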

Storing to DataFrame and Database

Raw ccxt output is a list of lists. We convert to a pandas DataFrame immediately after fetching, then persist to SQLite with deduplication.

The storage layer uses INSERT OR IGNORE with a unique constraint on (symbol, timeframe, timestamp). This means we can re-fetch overlapping time ranges without creating duplicate rows. It also means incremental updates are simple: fetch the last 500 candles (which overlap with existing data) and insert. Only genuinely new candles create rows.
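
A minimal sketch of that deduplicating insert, using Python's built-in sqlite3 module. The table name candles, the column layout, and the helper store_candles are assumptions for this example; only the INSERT OR IGNORE statement and the unique constraint on (symbol, timeframe, timestamp) come from the description above.

```python
import sqlite3
from typing import Iterable, Sequence

SCHEMA = """
CREATE TABLE IF NOT EXISTS candles (
    symbol    TEXT    NOT NULL,
    timeframe TEXT    NOT NULL,
    timestamp INTEGER NOT NULL,  -- candle open time, ms since epoch
    open REAL, high REAL, low REAL, close REAL, volume REAL,
    UNIQUE (symbol, timeframe, timestamp)
)
"""


def store_candles(
    conn: sqlite3.Connection,
    symbol: str,
    timeframe: str,
    candles: Iterable[Sequence[float]],
) -> int:
    """Insert candles, skipping rows that already exist; return rows added."""
    before = conn.total_changes
    conn.executemany(
        "INSERT OR IGNORE INTO candles VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        [(symbol, timeframe, *c) for c in candles],
    )
    conn.commit()
    return conn.total_changes - before


conn = sqlite3.connect(":memory:")  # a file path would be used in practice
conn.execute(SCHEMA)
```

Re-inserting an overlapping batch then adds only the candles whose (symbol, timeframe, timestamp) key is not already present.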

Our candle database currently holds 20.8 million rows across 25 symbols and 11 timeframes (1m, 5m, 15m, 30m, 1h, 2h, 4h, 6h, 8h, 12h, 1d). The 1-minute timeframe dominates storage since it produces 525,600 candles per symbol per year. The database sits at approximately 5 GB, all in SQLite with WAL mode.

The ccxt.pro Difference for Live Data

For live trading and paper trading, polling fetch_ohlcv every 15 minutes is fine. But for real-time price updates (mark-to-market on open positions), we use ccxt.pro, the WebSocket extension of ccxt.

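import asyncio
import ccxt.pro as ccxtpro

async def main():
    exchange = ccxtpro.binance()
    while True:  # runs until interrupted; call await exchange.close() on shutdown
        ohlcv = await exchange.watch_ohlcv("SOL/USDT", "1m")

asyncio.run(main())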

The watch_ohlcv method opens a WebSocket connection on the first call, and each subsequent await returns as soon as a new candle update arrives. No polling, no wasted API calls, sub-second latency. Our data streaming module (data/stream.py) uses this for the 45 live bots that need current prices between their scheduled tick intervals.

The practical difference: if each of the 45 bots polled fetch_ohlcv once per second, that would be 2,700 requests per minute, well over a 1,200-weight budget even at 1 weight per call. A WebSocket connection costs zero API weight and delivers data the instant the exchange has it.

Symbol Format Gotchas

ccxt normalizes symbol names to a standard format: BTC/USDT for spot, BTC/USDT:USDT for perpetual futures. Binance's native API uses BTCUSDT. This normalization is one of ccxt's best features since you write code once and it works across exchanges that all use different naming conventions.

But there are edge cases. PEPE and SHIB trade as 1000PEPE and 1000SHIB on Binance futures (each contract unit represents 1,000 tokens, which keeps the quoted price from being a tiny decimal). The ccxt symbol is 1000PEPE/USDT:USDT for the perpetual. If you try to fetch funding rates for PEPE/USDT, you get nothing.

We maintain a _PERP_OVERRIDES dictionary that maps base symbols to their perpetual contract format. It is a small detail, but one that caused us hours of debugging when our derivatives strategies showed zero trades on PEPE and SHIB.
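
A sketch of how such a lookup might work. Only the dictionary name _PERP_OVERRIDES and the two entries come from the text above; the shape of the mapping and the helper to_perp_symbol are assumptions for illustration.

```python
# Base assets whose perpetual contract uses a different base name.
_PERP_OVERRIDES = {
    "PEPE": "1000PEPE",
    "SHIB": "1000SHIB",
}


def to_perp_symbol(base: str, quote: str = "USDT") -> str:
    """Build the ccxt perpetual symbol, applying any override for the base."""
    contract_base = _PERP_OVERRIDES.get(base, base)
    return f"{contract_base}/{quote}:{quote}"
```

Bases without an override pass through unchanged, so to_perp_symbol("BTC") gives BTC/USDT:USDT while to_perp_symbol("PEPE") gives 1000PEPE/USDT:USDT.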

Exchange Initialization

Our production exchange configuration passes API keys for authenticated endpoints (order placement, balance queries) and sets testnet mode via configuration.

The key parameters beyond enableRateLimit:

options.defaultType: "future" routes all calls to the futures API (perpetual contracts). Without this, ccxt defaults to spot.
options.recvWindow handles Binance-specific quirks around timestamp synchronization. Binance rejects a signed request whose timestamp is more than 1 second ahead of its server time or older than recvWindow milliseconds, so a drifting server clock makes authenticated requests fail.
We set timeout: 30000 (30 seconds) because the default 10-second timeout is too aggressive during high-volatility periods when Binance's API slows down.
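
Put together, a configuration along these lines might look like the sketch below. The helper build_binance_config is hypothetical, and the recvWindow value of 5000 is a placeholder rather than the production setting. The adjustForTimeDifference option is not mentioned above; it is included here as an assumption because ccxt's Binance class supports it for compensating clock drift.

```python
from typing import Any, Dict


def build_binance_config(
    api_key: str,
    secret: str,
    recv_window_ms: int = 5000,  # placeholder value, tune for your setup
) -> Dict[str, Any]:
    """Assemble a ccxt config dict for Binance futures (illustrative)."""
    return {
        "apiKey": api_key,
        "secret": secret,
        "enableRateLimit": True,
        "timeout": 30000,  # milliseconds
        "options": {
            "defaultType": "future",          # route calls to the futures API
            "recvWindow": recv_window_ms,     # tolerance for request age
            "adjustForTimeDifference": True,  # assumption: offset local clock drift
        },
    }
```

The resulting dict would be passed to ccxt.binance(...). Where the exchange supports a test environment, ccxt switches to it with exchange.set_sandbox_mode(True) rather than through a config key.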

What ccxt Does Not Do

ccxt is a data access layer, not a trading framework. It fetches candles and places orders. It does not compute indicators, manage positions, track PnL, or handle risk. Those are all separate concerns in our architecture.

Our DataFetcher module (data/fetcher.py) wraps ccxt with retry logic, rate limiting, and pagination. The DataFetcherAdapter (data/adapter.py) bridges it to the BotInstance protocol. The storage layer (data/storage.py) handles persistence and deduplication. Each layer has a single responsibility, and ccxt sits at the bottom as the exchange communication primitive.

This separation matters when things go wrong. If Binance changes their API behavior (which happens), we update one wrapper module. The 40 strategies and 45 bots above it never know the difference.

Getting Started

If you are building your first trading data pipeline, start with the 20-line example at the top. Fetch some candles, convert to a DataFrame, and plot them. Then add pagination to get historical data. Then add a storage layer so you are not re-fetching the same data every run.

ccxt's documentation covers exchange-specific details for each of the 100+ supported exchanges. The unified API means most of your code is exchange-agnostic, but the edge cases (symbol naming, rate limits, endpoint quirks) are where you will spend debugging time.

After four years of running ccxt in production across millions of API calls, our assessment is that it is the right abstraction for Python crypto trading. The alternatives are either too low-level (raw REST calls) or too high-level (full frameworks that bundle strategy logic with data access). ccxt sits in the sweet spot: unified exchange access with no opinions about what you do with the data.
