How Options Platforms Estimate Intraday Open Interest

The Core Problem: OI Is a Once-Per-Day Number

The Options Clearing Corporation (OCC) clears listed options trades and publishes official Open Interest once per day, typically overnight, reflecting the prior session's activity. During the trading day, the OPRA feed (the consolidated options tape) broadcasts quotes and trade prints with size, price, exchange, and condition codes, but no Open/Close flags. You cannot tell from a printed trade alone whether it represents a new position being initiated or an existing one being closed.

This creates a genuine information gap. Any platform claiming to show live intraday OI is not reading a real-time feed from the OCC — that feed does not exist. What they are showing is an estimate, derived from a model that infers open/close intent from trade characteristics. The quality of that estimate depends entirely on the sophistication of the model and the richness of the data feeding it.

Understanding how these models work — and where they fail — is essential context for evaluating any GEX platform that claims real-time OI accuracy. It also explains why GEX Metrix uses verified end-of-day OI as its primary input: it is the only number in this chain that is not an estimate.

Why this matters for GEX specifically:

GEX is calculated as: OI × Gamma × Contract Size × Spot² × 0.01

If the OI input is estimated rather than official, every GEX value inherits that estimation error — multiplied by gamma, which can be large near ATM strikes. A 10% OI estimation error on an ATM position translates directly to a 10% error in the GEX dollar figure at that strike. On a position like the JPM Collar (40,000+ contracts), that error runs into the hundreds of millions of dollars of phantom or missing hedging flow.
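To make the error propagation concrete, here is a minimal Python sketch of the formula above. The function name `gex_dollars`, the 100 contract multiplier, and every input value are illustrative assumptions, not figures taken from any platform or from a real position.

```python
def gex_dollars(oi: float, gamma: float, spot: float, contract_size: int = 100) -> float:
    """GEX per the formula above: OI x Gamma x Contract Size x Spot^2 x 0.01."""
    return oi * gamma * contract_size * spot ** 2 * 0.01


# Hypothetical inputs, chosen only for illustration.
official_oi = 40_000   # contracts at one strike
gamma = 0.0008         # per-contract gamma near ATM (assumed)
spot = 6_000.0         # index level (assumed)

true_gex = gex_dollars(official_oi, gamma, spot)
estimated_gex = gex_dollars(official_oi * 1.10, gamma, spot)  # OI overestimated by 10%

relative_error = (estimated_gex - true_gex) / true_gex
print(f"true GEX:       ${true_gex:,.0f}")
print(f"estimated GEX:  ${estimated_gex:,.0f}")
print(f"relative error: {relative_error:.1%}")
```

Because the formula is linear in OI, the relative error in GEX equals the relative error in OI no matter what gamma or spot is. Gamma and spot only determine how many dollars that percentage represents.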

The transparency question: When a platform shows "live GEX," ask: is the OI input official end-of-day data, or is it an intraday estimate? If it is an estimate, what model produces it and what is the documented error rate? Most platforms do not disclose this.

How OI Actually Changes — The Four Trade Types

Every options trade involves a buyer and a seller. The change in OI depends on both sides' intent — and you can only observe the net trade, not the intent behind it.

Buyer intent	Seller intent	OI change
Buy to Open (BTO)	Sell to Open (STO)	+Volume
Buy to Close (BTC)	Sell to Close (STC)	−Volume
Buy to Open (BTO)	Sell to Close (STC)	0
Buy to Close (BTC)	Sell to Open (STO)	0

OI only changes when both sides are new (opening) or both are closing existing positions. Mixed trades — one side opening, one side closing — are OI-neutral and the most common case in liquid markets.
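The table can be written as a small lookup, shown here as a sketch. The function name `oi_change` and the boolean encoding of intent are assumptions made for illustration. In live data neither flag is observable, which is the whole problem.

```python
def oi_change(buyer_opening: bool, seller_opening: bool, volume: int) -> int:
    """Change in open interest for one trade, given both sides' (unobservable) intent."""
    if buyer_opening and seller_opening:
        return volume        # BTO vs STO: new contracts are created
    if not buyer_opening and not seller_opening:
        return -volume       # BTC vs STC: existing contracts are extinguished
    return 0                 # mixed: a position changes hands, OI is unchanged


for buyer, seller in [(True, True), (False, False), (True, False), (False, True)]:
    print(buyer, seller, oi_change(buyer, seller, 500))
```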

The fundamental ambiguity: given a single trade of 500 contracts, the actual ΔOI is −500, 0, or +500, and nothing on the tape says which. In practice, the estimator assigns a probability to each case and works with the expected value:

Estimated ΔOI per trade =
Volume × (P(Open) − P(Close))

Where P(Open) + P(Close) + P(Mixed) = 1
P(Open) → +Volume contribution
P(Close) → −Volume contribution
P(Mixed) → 0 contribution

Running intraday OI estimate:
OI(t) = OI_yesterday
      + Σ [Volume_i × (P(Open)_i − P(Close)_i)]
        for all trades i up to time t
                            

The entire modeling challenge reduces to: how accurately can you estimate P(Open) and P(Close) for each individual trade using only observable market data?
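The running estimate above is just an accumulator over expected values. The sketch below assumes the probabilities have already been produced by some classifier. The `ClassifiedTrade` type and the sample numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClassifiedTrade:
    volume: int
    p_open: float    # probability both sides are opening
    p_close: float   # probability both sides are closing
    # P(Mixed) is implied: 1 - p_open - p_close, and contributes 0


def expected_delta_oi(trade: ClassifiedTrade) -> float:
    """Volume x (P(Open) - P(Close)), as in the formula above."""
    return trade.volume * (trade.p_open - trade.p_close)


def running_oi(oi_yesterday: float, trades: List[ClassifiedTrade]) -> List[float]:
    """OI(t) after each trade, starting from the prior day's official OI."""
    estimate = float(oi_yesterday)
    path = []
    for trade in trades:
        estimate += expected_delta_oi(trade)
        path.append(estimate)
    return path


trades = [
    ClassifiedTrade(volume=500, p_open=0.6, p_close=0.1),
    ClassifiedTrade(volume=200, p_open=0.1, p_close=0.7),
    ClassifiedTrade(volume=1000, p_open=0.3, p_close=0.3),
]
print(running_oi(10_000, trades))
```

Note that the third trade moves the estimate by zero even though it is the largest print: equal opening and closing probabilities cancel, so the estimator is only as informative as the gap between the two.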

Trade Classification: The Three Main Approaches

The model's core task is classifying each trade as Opening, Closing, or Mixed. Three approaches exist, roughly in order of sophistication:

A — Lee-Ready / Tick Rule

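Classifies the aggressor side of each print. The Lee-Ready procedure compares the trade price with the prevailing quote midpoint (above mid = buyer aggressor, below mid = seller aggressor) and falls back to the tick rule for trades at the midpoint: uptick = buyer aggressor, downtick = seller aggressor. The aggressor side is then cross-referenced with whether the bid/ask side suggests opening or closing intent. Fast and simple, but noisy for options: SPX options frequently trade at mid or cross the spread in complex multi-leg structures, making the classification unreliable on a large fraction of prints.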
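A minimal sketch of aggressor classification using a quote-midpoint comparison with a tick-rule fallback, which is how the Lee-Ready procedure is usually described. The function name and the "unknown" label are assumptions for illustration. The sketch also ignores quote-timestamp alignment, which matters a great deal in practice.

```python
from typing import Optional


def classify_aggressor(price: float, bid: float, ask: float,
                       prev_price: Optional[float] = None) -> str:
    """Return 'buy', 'sell', or 'unknown' for a single print."""
    mid = (bid + ask) / 2
    if price > mid:
        return "buy"     # traded above the midpoint: buyer likely crossed the spread
    if price < mid:
        return "sell"    # traded below the midpoint: seller likely crossed the spread
    # Exactly at the midpoint: fall back to the tick rule.
    if prev_price is not None:
        if price > prev_price:
            return "buy"     # uptick
        if price < prev_price:
            return "sell"    # downtick
    return "unknown"


print(classify_aggressor(5.50, bid=5.00, ask=5.50))                   # at the ask
print(classify_aggressor(5.25, bid=5.00, ask=5.50, prev_price=5.00))  # mid, uptick
print(classify_aggressor(5.25, bid=5.00, ask=5.50))                   # mid, no history
```

The "unknown" branch is where SPX prints tend to pile up, which is why this approach is described as noisy for options.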

B — Signed Volume + Greeks Heuristics

Uses moneyness, DTE, time of day, and IV behavior to assign opening/closing probabilities. Key signals: a large OTM call bought with immediate IV spike → likely opening. A near-expiry deep ITM option sold → likely closing or rolling. OPRA strategy codes identify multi-leg trades (iron condors, spreads, rolls) which can be decoded separately. More accurate than Lee-Ready, but requires IV surface data updated tick-by-tick.

C — Delta-Hedge Flow Reversal

Tracks the dealer's estimated delta hedge in the underlying (SPX/ES futures). If a large options trade is immediately followed by a correlated ES futures trade in the opposite direction within a ±500ms window — that's a dealer delta-hedge, confirming a customer opening trade. Highly accurate for large institutional trades, but requires co-located futures tick data and sub-second timestamp alignment.
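The window match can be sketched as follows. It assumes futures prints arrive as `(timestamp_ms, signed_qty)` tuples sorted by time, with positive quantity meaning buyer-initiated, and that the dealer's delta from the option trade has already been estimated. These conventions and both function names are assumptions, not a description of any vendor's method.

```python
from bisect import bisect_left, bisect_right
from typing import List, Tuple


def net_futures_flow(ts_ms: int, futures: List[Tuple[int, int]],
                     window_ms: int = 500) -> int:
    """Sum signed futures quantity printed within +/- window_ms of ts_ms."""
    times = [t for t, _ in futures]
    lo = bisect_left(times, ts_ms - window_ms)
    hi = bisect_right(times, ts_ms + window_ms)
    return sum(qty for _, qty in futures[lo:hi])


def looks_like_dealer_hedge(dealer_delta: float, ts_ms: int,
                            futures: List[Tuple[int, int]],
                            window_ms: int = 500) -> bool:
    """True if net futures flow in the window offsets the dealer's new option delta."""
    flow = net_futures_flow(ts_ms, futures, window_ms)
    if flow == 0 or dealer_delta == 0:
        return False
    return (flow > 0) != (dealer_delta > 0)   # opposite signs


# Hypothetical data: dealer sold calls (delta -25) at t=1200 ms.
futures_tape = [(1000, 5), (1400, 20), (2100, -3)]
print(looks_like_dealer_hedge(-25.0, 1200, futures_tape))   # futures bought nearby
print(looks_like_dealer_hedge(-25.0, 3000, futures_tape))   # nothing in the window
```

A sign match alone is weak evidence on a busy tape. A production version would presumably also compare the hedge size with the option's delta-equivalent quantity.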

In practice, all three approaches are combined. A production-grade intraday OI estimator runs all three classifiers in parallel and weights their outputs by trade type: Lee-Ready for small retail trades, heuristics for mid-size flow, and delta-hedge reversal for large block trades where the futures confirmation signal is most reliable.
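One way to picture the weighting is a size-dependent blend of the three classifiers' P(Open) outputs. The size thresholds and weights below are made-up placeholders. A real system would fit them against historical data rather than hard-code them.

```python
from typing import Optional, Tuple


def size_weights(size: int) -> Tuple[float, float, float]:
    """(tick_rule, heuristics, hedge_reversal) weights by trade size. Placeholder values."""
    if size < 10:
        return (0.7, 0.3, 0.0)    # small retail prints: lean on the tick rule
    if size < 500:
        return (0.2, 0.7, 0.1)    # mid-size flow: lean on heuristics
    return (0.1, 0.3, 0.6)        # blocks: lean on the futures hedge signal


def blend_p_open(size: int, p_tick: float, p_heuristic: float,
                 p_hedge: Optional[float] = None) -> float:
    """Weighted average of the classifiers' P(Open); renormalizes if no hedge signal exists."""
    w_tick, w_heur, w_hedge = size_weights(size)
    if p_hedge is None:
        w_hedge, p_hedge = 0.0, 0.0
    total = w_tick + w_heur + w_hedge
    return (w_tick * p_tick + w_heur * p_heuristic + w_hedge * p_hedge) / total


print(blend_p_open(5, p_tick=0.4, p_heuristic=0.6, p_hedge=0.9))
print(blend_p_open(1000, p_tick=0.5, p_heuristic=0.5, p_hedge=1.0))
```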

The Data Stack — What You Actually Need to Build This

Standard OHLCV price data is not sufficient. A viable intraday OI estimator requires several expensive, high-bandwidth data sources:

Essential

Data	Why needed
OPRA full feed	Every print: trade size, price, exchange, condition codes, strategy flags
L2 options quote feed	Bid/ask at exact millisecond of trade → classify aggressor side
ES/SPX futures tick data	Detect delta-hedging flows confirming customer opening trades
Prior day OI by strike/expiry	The baseline anchor — all intraday estimates are deltas on top of this
IV surface tick data	Detect IV spikes/drops confirming opening vs closing intent
Historical OCC clearing data	Actual opening/closing volume by strike per day — required to train and validate the ML model

Very Useful

Data	Why useful
OPRA condition codes	Flag spread trades, multi-leg, and — on some venues — opening/closing designations reported by brokers
Block trade tape	Large institutional prints dominate OI changes; modeling them separately improves accuracy
ETF options flow (SPY, QQQ)	Proxy for sentiment that may lead SPX OI positioning

Cost reality: The OPRA full feed alone runs $10,000–$30,000/month for a professional subscription. Historical OCC clearing data with open/close breakdowns is sold separately and costs thousands of dollars per year. The compute infrastructure to process tick-level options data for all SPX strikes in real time requires dedicated co-location or high-performance cloud infrastructure. This is not a weekend project.

The Machine Learning Architecture — From Tick Feed to OI Estimate

Two modeling approaches:

Approach A — Daily-Level Regression (Simpler)

Aggregate intraday trade features into daily buckets (% of volume at ask, net IV change during high-volume periods, morning vs afternoon volume ratio). Train XGBoost or Random Forest to predict end-of-day ΔOI. Easier to build and validate. Does not give tick-by-tick OI — gives a daily projection that updates as the session progresses. Better suited for GEX calculations that refresh every 15 minutes rather than every second.
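The feature-aggregation step of Approach A might look like the sketch below. The `OptionPrint` type and feature names are hypothetical, and the morning/afternoon ratio is expressed as a morning share of total volume so that a quiet afternoon cannot cause a division by zero. The resulting dictionary is what would be handed to the regressor, which is not shown here.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class OptionPrint:
    minutes_since_open: float
    volume: int
    at_ask: bool   # True if the print executed at the ask


def daily_features(prints: List[OptionPrint],
                   session_minutes: float = 390.0) -> Dict[str, float]:
    """Aggregate the session's prints so far into model features."""
    total = sum(p.volume for p in prints)
    if total == 0:
        return {"pct_at_ask": 0.0, "morning_share": 0.0}
    at_ask = sum(p.volume for p in prints if p.at_ask)
    half = session_minutes / 2
    morning = sum(p.volume for p in prints if p.minutes_since_open < half)
    return {"pct_at_ask": at_ask / total, "morning_share": morning / total}


session = [
    OptionPrint(10, 100, True),
    OptionPrint(200, 300, False),
    OptionPrint(380, 100, True),
]
print(daily_features(session))
```

Recomputing these features on each refresh gives the "daily projection that updates as the session progresses" behaviour without any per-trade classification.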

Approach B — Trade-Level Classifier (Complex)

Train a classifier on individual trades using historical data where actual open/close flags are known (e.g., CBOE LiveVol historical tick data). Deploy in real time: for every intraday print, output P(Open) and P(Close). Sum into a running OI estimate updated per trade. Requires labeled training data, significant infrastructure, and produces estimates with meaningful error bars that most platforms do not publish.

The full pipeline:

[OPRA tick feed]
      ↓
[Trade Classifier]
      ◄ L2 quote at trade time
      ◄ ES futures ±500ms window
      ◄ IV surface tick
      ↓
[Opening / Closing / Mixed label]
      ↓
[ΔOI Accumulator per strike/expiry]
      ↓
[Kalman Filter / Bayesian smoother]
      ◄ EOD OI as calibration anchor
      ↓
[Intraday OI Surface Estimate]

The Kalman filter layer is critical — classification errors accumulate over thousands of trades. Each day when official EOD OI is published, the error vs the estimate is computed and used to recalibrate the classifier's priors for the next session.
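For a single strike, the smoothing step can be illustrated with a one-dimensional Kalman filter: each trade shifts the estimate and widens its uncertainty, and the official EOD figure pulls it back. This is a sketch under simplifying assumptions (scalar state, hand-picked variances, class and method names invented here). It returns the end-of-day innovation but does not show how that error would be fed back into the classifier's priors.

```python
class ScalarOIKalman:
    """1-D Kalman filter over the OI of one strike/expiry."""

    def __init__(self, oi0: float, var0: float = 0.0):
        self.x = float(oi0)   # current OI estimate
        self.p = float(var0)  # variance of that estimate

    def predict(self, delta_oi: float, process_var: float) -> None:
        """Apply a trade's expected delta-OI; uncertainty grows with each trade."""
        self.x += delta_oi
        self.p += process_var

    def update(self, official_oi: float, meas_var: float = 1.0) -> float:
        """Blend in an official OI observation. Returns the innovation (official - estimate)."""
        innovation = official_oi - self.x
        gain = self.p / (self.p + meas_var)   # meas_var must be > 0
        self.x += gain * innovation
        self.p *= (1.0 - gain)
        return innovation


kf = ScalarOIKalman(oi0=10_000)
kf.predict(delta_oi=250, process_var=400.0)
kf.predict(delta_oi=-120, process_var=400.0)
error = kf.update(official_oi=10_100, meas_var=1.0)
print(f"end-of-day error: {error:+.0f} contracts, corrected OI: {kf.x:.1f}")
```

With a small measurement variance the update snaps almost fully to the official number, which matches its role as the calibration anchor. The innovation is the quantity a daily recalibration step would track over time.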

Key Pitfalls — Why Even Well-Funded Teams Get This Wrong

Multi-leg trades and rolls: A roll (for a long position: sell the near expiry, buy the far expiry) leaves total OI unchanged, moving it from one expiry to the other, but appears as two separate trades on OPRA. Strategy codes help but are not always populated correctly, especially for complex structures. Classifying both legs of a roll as opening trades fails to remove the OI from the near expiry and so counts an OI addition that never occurred.
Ex-pit / FLEX options: FLEX options (customized terms negotiated off-exchange) do not always appear on OPRA promptly and can cause large, unexplained OI surprises at the daily close. Any model that ignores FLEX will systematically underestimate OI changes on days when large structured products settle.
0DTE options need separate treatment: Same-day expiry OI goes to zero at 4 PM regardless — every open position either expires worthless or is exercised. The afternoon volume on 0DTE is overwhelmingly closing, which a naively trained model may not properly account for given the time-of-day skew.
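Early exercise (American-style options only): SPX options are European-style and cannot be exercised before expiration, so this pitfall does not apply to SPX itself. It does apply to American-style products such as SPY and single-stock options, where deep ITM puts may be exercised early once the interest earned on the strike proceeds outweighs the remaining extrinsic value. Early exercise reduces OI in ways that never appear on the OPRA tape and have no observable trade signature until the OCC reports the change overnight.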
Market maker internalization: Some MM trades are matched internally and either never hit the tape or arrive with a timestamp delay. These delayed prints can cause retroactive OI estimate corrections that are invisible to users seeing only the live feed.
Error compounding: Classification errors on individual trades are small in isolation but accumulate across thousands of prints per strike per session. Without the Kalman smoothing and daily recalibration step, drift from the true OI can reach 10–20% by mid-session on active days — a large enough error to materially mislocate GEX structural levels.
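As a back-of-envelope illustration of the error-compounding item, the sketch below separates a systematic classifier bias from random per-trade noise. It assumes equal-sized trades and independent errors, which real flow does not satisfy, and every number is invented for the example.

```python
import math
from typing import Tuple


def drift_estimate(n_trades: int, avg_size: float,
                   bias: float, noise_sd: float) -> Tuple[float, float]:
    """Return (systematic drift, standard deviation of random drift) in contracts.

    bias:     mean error in P(Open) - P(Close) per contract
    noise_sd: standard deviation of that error per trade, per contract
    """
    systematic = n_trades * avg_size * bias                 # grows linearly with trade count
    random_sd = math.sqrt(n_trades) * avg_size * noise_sd   # grows with the square root
    return systematic, random_sd


systematic, random_sd = drift_estimate(n_trades=10_000, avg_size=10, bias=0.02, noise_sd=0.5)
print(f"systematic drift: {systematic:.0f} contracts, random drift sd: {random_sd:.0f} contracts")
```

Under these assumptions a small, consistent bias ends up dominating much larger per-trade noise as the trade count grows, which is one reason a daily anchor to official OI is useful.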

A Simpler Starting Point — When You Don't Have the Full Stack

If you do not have access to OPRA or full tick data, a reasonable proxy model can still improve on using yesterday's OI unchanged:

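Simple proxy model:

# Base rate: roughly 45-55% of SPX volume is opening on a typical day,
# but this varies significantly. Skew sizes below are illustrative placeholders.
BASE_OPEN_RATE = 0.50

def otm_skew(moneyness):  # moneyness > 0 means OTM (assumed convention)
    return 0.05 if moneyness > 0 else -0.05  # OTM options are more likely opening

def dte_skew(days_to_expiry):  # very short DTE skews toward closing/rolling
    return -0.10 if days_to_expiry <= 2 else (0.05 if days_to_expiry >= 7 else 0.0)

def tod_skew(hours_since_open):  # first and last hour skew toward closing
    return -0.05 if hours_since_open < 1.0 or hours_since_open > 5.5 else 0.0

def estimated_oi_change(volume, moneyness, days_to_expiry, hours_since_open):
    open_ratio = BASE_OPEN_RATE
    open_ratio += otm_skew(moneyness)
    open_ratio += dte_skew(days_to_expiry)
    open_ratio += tod_skew(hours_since_open)
    open_ratio = min(1.0, max(0.0, open_ratio))  # keep it a valid fraction
    # +volume if all opening, -volume if all closing
    return volume * (2 * open_ratio - 1)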

This simplified approach captures the three dominant skews without requiring OPRA access:

Moneyness skew

OTM options are disproportionately bought to open (speculative and tail-hedge positioning). Deep ITM options are more likely closing transactions — holders rolling out or taking delivery.

DTE skew

Very short DTE (1–2 days) options skew heavily toward closing and rolling. Longer DTE options (7+ days) skew toward new opening positions as institutions build forward hedges.

Time-of-day skew

The first and last 30–60 minutes of the session show elevated closing/rolling activity as traders flatten before major events (open) or expiration (close). Mid-session volume is more neutral in opening/closing mix.

Why End-of-Day OI Remains the Most Honest GEX Input

Given everything described above — the data requirements, model complexity, compounding errors, and undisclosed accuracy rates — there is a strong case for treating official end-of-day OI as the most reliable GEX input available, even though it reflects the prior session's close.

The key insight from the OI vs GEX article is that institutional positions — the ones that generate large, structurally meaningful GEX — change slowly. A JPM Collar position of 40,000 contracts does not roll intraday. The overnight OI for that position is accurate and stable. The structural GEX level it creates at the 7,000 strike is real and persists across the full session.

What intraday OI estimation adds is at the margins: smaller position changes, active 0DTE flow, and aggressive retail positioning. These are real effects, but they operate at a much smaller notional scale than the institutional positioning captured in overnight OI. For structural GEX analysis — identifying the Call Wall, Put Wall, and Zero Gamma Level — the overnight number is a solid foundation that intraday estimation noise rarely materially improves.

When intraday OI estimation does matter

There are specific sessions where real-time OI tracking would meaningfully improve GEX accuracy:

• High-catalyst days — Fed decisions, CPI prints — where extraordinary intraday volume materially shifts OI
• 0DTE expiration days — where OI goes to zero by close regardless of intraday structure
• Post-OPEX Monday — large OI has just expired, new OI is building rapidly from near zero

The honest position:

No platform currently has a validated, publicly documented intraday OI model with a published error rate for SPX. If a platform claims "live OI" without disclosing their methodology, treat the claim as unverified. The correct question is not "is their OI live?" but "what is their model's accuracy, and how does that accuracy vary by moneyness, DTE, and session type?"

Frequently Asked Questions

Does any exchange or broker provide real-time OI data directly?

Not for SPX in the way that real-time price feeds work. Some brokers show an "estimated OI" figure on their platforms, typically derived from exactly the kind of model described in this article — volume-based with open/close heuristics. CBOE publishes intraday volume data but not intraday OI updates. The OCC's official OI report is always end-of-day. A small number of institutional data vendors (e.g., CBOE LiveVol, OptionMetrics) provide historical tick data with estimated open/close flags, which can be used for model training but not as a real-time live feed.

How does this affect GEX calculations specifically?

GEX multiplies OI by gamma. Near ATM options — where gamma is highest and GEX values are largest — are also the options most frequently traded intraday, meaning their OI changes most actively during the session. This is the highest-stakes part of the OI estimation problem: the strikes where estimation errors are most amplified by gamma are exactly the strikes with the highest intraday activity. An overestimate of ATM OI by 15% produces a 15% overstatement of the GEX level that traders use as the Call Wall or Zero Gamma anchor. This is a material error for intraday structural analysis.

Is the intraday OI estimation problem being actively researched?

Yes — it is a known open problem in quantitative options research. Academic work on trade classification (Lee-Ready, Ellis-Michaely-O'Hara, and subsequent refinements) has been applied to equity markets for decades, but options markets are structurally different: multi-leg trades, market maker dominance, and the absence of a central limit order book for many strikes make direct transfer of equity-market methods unreliable. Several sell-side quant teams and specialized data vendors have proprietary models, but none publish validation statistics that would allow independent assessment of accuracy.

Continue Learning

For the conceptual case for why end-of-day OI is a valid and often superior GEX input for most trading applications, see the companion article on Open Interest vs Gamma Exposure.
