The Information Gap: A Systematic Approach to Analyst Drift


Design Document: The Alpha Engine Quantitative Framework

Lead Researcher: Jack Gang | Date: May 2026


1.0 Introduction


Retail investors are often late to analyst research for structural reasons: Wall Street consensus data is hard to access, and many published targets lag the macro regime that produced them.

The Alpha Engine is a systematic framework built to clean that data and trade the lag between real market conditions and institutional target updates. In a 5-year unbiased backtest, the strategy returned 32% annualized* versus 14.8% for SPY, with a Sharpe ratio of 1.88.

Fig 1: Systematic backtest performance comparison

Fig 2: Pre-tax backtest statistics

*Fairness note on taxes: this strategy trades more often than buy-and-hold SPY, so pre-tax numbers are not enough. After applying California high-income capital gains assumptions, the strategy still returned 17.5% versus 9.9% for SPY.

Fig 3: After-tax backtest statistics


2.0 Data Integrity & Bias Mitigation


Strong backtests often fail in live trading when historical leakage is ignored. The Alpha Engine framework addresses these biases before any signal is generated.


2.1 Eliminating Survivorship & Lookahead Bias


The backtest does not trade today's S&P 500 list retroactively. It uses a snapshot method:

  • Build yearly universes from 2021 to 2026

  • Include US stocks above $2.5B market cap at each historical point

  • Trade only what was available at that time


That means the model sees both future winners and real losers. It can hold names like SMCI, MSTR, and VRT, but it also has to deal with names like WE, PTON, and CHGG.
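The snapshot method above can be sketched as a point-in-time filter. This is a minimal illustration, not the framework's actual code: the `CapSnapshot` record shape, the `build_universe` helper, and the sample tickers are hypothetical, and only the $2.5B floor comes from the text.

```python
from dataclasses import dataclass
from datetime import date

MIN_MARKET_CAP = 2.5e9  # $2.5B floor from the text


@dataclass(frozen=True)
class CapSnapshot:
    """One market-cap observation for a ticker (hypothetical record shape)."""
    ticker: str
    as_of: date
    market_cap: float


def build_universe(snapshots, as_of):
    """Tickers whose latest market cap on or before `as_of` is above the floor."""
    latest = {}
    for snap in snapshots:
        if snap.as_of > as_of:
            continue  # never read data dated after the decision date
        prev = latest.get(snap.ticker)
        if prev is None or snap.as_of > prev.as_of:
            latest[snap.ticker] = snap
    return {t for t, s in latest.items() if s.market_cap > MIN_MARKET_CAP}


snapshots = [
    CapSnapshot("AAA", date(2021, 1, 4), 3.0e9),
    CapSnapshot("AAA", date(2023, 1, 3), 0.4e9),  # later collapses below the floor
    CapSnapshot("BBB", date(2021, 1, 4), 1.0e9),
    CapSnapshot("BBB", date(2023, 1, 3), 9.0e9),  # later grows into the universe
]

universe_2021 = build_universe(snapshots, date(2021, 1, 4))
universe_2023 = build_universe(snapshots, date(2023, 1, 3))
```

In this toy data the 2021 universe holds the eventual loser and excludes the eventual winner, which is the property a survivorship-free backtest needs.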


2.2 Signal Preprocessing


Before scoring, the ingestion pipeline applies four filters:

  1. Deduplication: Keep the latest rating per analyst to reduce echo effects from repeated edits by the same analyst.

  2. Split Normalization: Adjust targets for stock splits so signals are aligned with actual traded prices.

  3. Imputation: For incomplete records (an analyst entry missing its target price or its rating), fill the missing field using explicit target percentages and a sentiment mapping dictionary.

  4. Recency: Drop ratings older than 90 days to match the earnings-cycle cadence.
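The four filters above could be chained roughly as follows. This is a sketch under stated assumptions: the `Rating` record, the `SENTIMENT_UPSIDE` values, the per-(analyst, ticker) deduplication key, and the split tuple format are all hypothetical, and only target imputation is shown. The 90-day window is the one value taken from the text.

```python
from dataclasses import dataclass, replace
from datetime import date, timedelta
from typing import Optional

MAX_AGE_DAYS = 90  # recency window from the text

# Hypothetical sentiment map: rating label -> implied upside used to impute a target.
SENTIMENT_UPSIDE = {"buy": 0.15, "hold": 0.0, "sell": -0.15}


@dataclass(frozen=True)
class Rating:
    analyst: str
    ticker: str
    rated_on: date
    rating: Optional[str] = None        # e.g. "buy"; may be missing
    target: Optional[float] = None      # price target in dollars; may be missing
    target_pct: Optional[float] = None  # explicit upside, e.g. 0.20 for +20%


def dedupe(ratings):
    """Keep only the latest rating per (analyst, ticker) pair."""
    latest = {}
    for r in ratings:
        key = (r.analyst, r.ticker)
        if key not in latest or r.rated_on > latest[key].rated_on:
            latest[key] = r
    return list(latest.values())


def split_adjust(r, splits, today):
    """Divide a target by the ratio of each split effective after it was issued."""
    if r.target is None:
        return r
    factor = 1.0
    for ticker, effective, ratio in splits:
        if ticker == r.ticker and r.rated_on < effective <= today:
            factor *= ratio
    return replace(r, target=r.target / factor)


def impute(r, price):
    """Fill a missing target from an explicit percentage, else the sentiment map."""
    if r.target is not None:
        return r
    if r.target_pct is not None:
        return replace(r, target=price * (1 + r.target_pct))
    if r.rating in SENTIMENT_UPSIDE:
        return replace(r, target=price * (1 + SENTIMENT_UPSIDE[r.rating]))
    return r  # nothing to impute from; left incomplete


def preprocess(ratings, splits, prices, today):
    """Apply deduplication, split normalization, imputation, then the recency cut."""
    cutoff = today - timedelta(days=MAX_AGE_DAYS)
    out = []
    for r in dedupe(ratings):
        r = split_adjust(r, splits, today)
        r = impute(r, prices[r.ticker])
        if r.rated_on < cutoff or r.target is None:
            continue  # stale, or still unusable after imputation
        out.append(r)
    return sorted(out, key=lambda r: (r.ticker, r.analyst))


today = date(2026, 5, 1)
raw = [
    Rating("ann", "XYZ", date(2026, 3, 1), "buy", 400.0),
    Rating("ann", "XYZ", date(2026, 4, 10), "buy", 440.0),     # supersedes the row above
    Rating("bob", "XYZ", date(2026, 4, 25), "hold"),           # target from sentiment map
    Rating("cat", "XYZ", date(2026, 4, 22), target_pct=0.20),  # target from percentage
    Rating("dan", "XYZ", date(2026, 1, 5), "buy", 500.0),      # older than 90 days
]
splits = [("XYZ", date(2026, 4, 20), 4.0)]  # hypothetical 4-for-1 split
clean = preprocess(raw, splits, prices={"XYZ": 100.0}, today=today)
```

Here `clean` keeps one row each for ann, bob, and cat. Ann's pre-split target is rescaled to the post-split price basis, and dan's stale rating is dropped.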


3.0 Signal Generation: The “G-Score”


The model ranks stocks with the “G-Score”, which measures analyst conviction, recency, and dispersion over a rolling 90-day window.


3.1 Algorithm Mechanics


The engine applies exponential time decay to every incoming analyst rating with a 45-day half-life, so a rating's weight halves for each 45 days of age and stale ratings are aggressively discounted.
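A 45-day half-life corresponds to a weight of 0.5 raised to the power age/45. The sketch below shows that arithmetic. The function name is illustrative, and whether the engine normalizes the weights afterwards is not stated in this document.

```python
HALF_LIFE_DAYS = 45.0  # half-life from the text


def decay_weight(age_days, half_life=HALF_LIFE_DAYS):
    """Weight of a rating that is `age_days` old; halves every `half_life` days."""
    return 0.5 ** (age_days / half_life)


weights = [decay_weight(age) for age in (0, 45, 90)]  # [1.0, 0.5, 0.25]
```

A rating at the edge of the 90-day window therefore carries a quarter of the weight of one issued today.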

Next, to account for the high dispersion in certain stocks, especially in complex sectors, the engine calculates a Risk-Adjusted Valuation. It penalizes the time-weighted mean target by the Standard Error of the Mean, scaled by a Critical T-Value that varies with the number of analysts covering the asset.
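One plausible reading of this penalty is a lower confidence bound: weighted mean minus t times SEM. The sketch below follows that reading and is not a confirmed formula. The one-sided 95% confidence level, the weighted-variance definition, and the Kish effective sample size are all assumptions, and the t lookup is a coarse table that rounds degrees of freedom down.

```python
import math

# One-sided 95% critical t-values by degrees of freedom (standard table values).
# The confidence level itself is an assumption; the text does not specify it.
T_CRIT_95 = {1: 6.314, 2: 2.920, 3: 2.353, 4: 2.132, 5: 2.015, 6: 1.943,
             7: 1.895, 8: 1.860, 9: 1.833, 10: 1.812, 15: 1.753, 20: 1.725,
             30: 1.697}


def t_critical(n_analysts):
    """Look up t for df = n - 1, rounding df down to the nearest table row."""
    df = n_analysts - 1
    return T_CRIT_95[max(k for k in T_CRIT_95 if k <= df)]


def risk_adjusted_valuation(targets, weights):
    """Weighted mean target minus t * SEM (assumed form of the penalty)."""
    n = len(targets)
    if n < 2:
        raise ValueError("need at least two analysts to estimate dispersion")
    w_sum = sum(weights)
    mean = sum(w * t for w, t in zip(weights, targets)) / w_sum
    var = sum(w * (t - mean) ** 2 for w, t in zip(weights, targets)) / w_sum
    n_eff = w_sum ** 2 / sum(w * w for w in weights)  # Kish effective sample size
    sem = math.sqrt(var / n_eff)
    return mean - t_critical(n) * sem


# Three equally weighted targets with a mean of 110.
rav = risk_adjusted_valuation([110.0, 100.0, 120.0], [1.0, 1.0, 1.0])
```

With only three analysts the critical value is large, so the valuation is marked down well below the mean. The same spread across ten analysts would be penalized far less.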


The G-Score is then calculated as the implied upside from the current stock price to the Risk-Adjusted Valuation.
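Read as a formula, implied upside is the Risk-Adjusted Valuation divided by the current price, minus one. A minimal sketch, with hypothetical tickers and values:

```python
def g_score(risk_adjusted_valuation, price):
    """Implied upside from the current price to the risk-adjusted valuation."""
    if price <= 0:
        raise ValueError("price must be positive")
    return risk_adjusted_valuation / price - 1.0


valuations = {"AAA": 96.0, "BBB": 54.0}  # hypothetical risk-adjusted valuations
prices = {"AAA": 80.0, "BBB": 60.0}      # hypothetical current prices
scores = {t: g_score(valuations[t], prices[t]) for t in valuations}
ranked = sorted(scores, key=scores.get, reverse=True)
```

A positive score means the price sits below even the penalized analyst valuation. A negative score means the upside disappears once dispersion is accounted for.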


3.2 Note on High-Complexity Edge (Biotech & Pharma)


Many professionals are concentrated in employer stock or in familiar sectors such as tech. The Alpha Engine diversifies away from that concentration because it often finds its widest analyst dispersion in biotech and mid-cap pharma.

Because these sectors are driven by binary clinical trial outcomes, valuation is complex and analyst disagreement (dispersion) can be massive, creating the widest "Information Gaps." Rather than requiring a PhD in biology to trade these assets, the model systematically measures the consensus of the analysts who do specialize in that domain. The Alpha Engine explicitly looks for these fat-tail events and applies strict execution rules, holding outsized winners while mechanically rotating out of losers (see Section 4.2).


4.0 Execution & Cash Management


Generating a signal can be trivial. The industry is crowded with newsletters and professional models that tell you which stocks to buy. The Alpha Engine's Model Signal Framework differs in its rigid execution layer, shaped by nearly a decade of hands-on trading experience, which separates it from retail stock-picking.


4.1 The "Dip Refill" Mechanism


A static 100% allocation misses how real portfolios are funded and managed. The backtest uses two capital rules to mimic real-world investing:

  1. Base DCA contribution: $1,000/month

  2. Target line trigger: if equity drops below a computed threshold, the engine deploys extra capital through a dip refill


Across 5 years, this required about 20% extra capital (about $12,000) beyond the $60,000 DCA base. The same mechanism is applied to the benchmarks for parity; they needed about 8% extra capital to meet their own target line. This mechanism is a main driver of the strategy's success: it strips away human emotion and algorithmically forces capital deployment during the deepest drawdowns. In other words, it systematically “buys the dip.”
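The two capital rules can be sketched as a monthly loop. The document does not give the formula for the target line, so the version below is purely illustrative. It assumes the line is a fixed fraction of base capital paid in and that each refill is capped; `TARGET_FRACTION` and `MAX_REFILL` are hypothetical. Only the $1,000 monthly contribution comes from the text.

```python
BASE_DCA = 1_000.0      # monthly contribution from the text
TARGET_FRACTION = 0.90  # hypothetical: target line as a share of base capital paid in
MAX_REFILL = 2_000.0    # hypothetical cap on extra capital per month


def simulate(monthly_returns):
    """Return (final_equity, base_paid_in, extra_paid_in) for monthly returns."""
    equity = base = extra = 0.0
    for r in monthly_returns:
        equity *= 1.0 + r        # mark existing holdings to market
        equity += BASE_DCA       # scheduled contribution
        base += BASE_DCA
        target_line = TARGET_FRACTION * base
        if equity < target_line:  # dip refill trigger
            refill = min(target_line - equity, MAX_REFILL)
            equity += refill
            extra += refill
    return equity, base, extra


# Hypothetical path: a sharp two-month drawdown followed by a recovery.
equity, base, extra = simulate([0.02, -0.30, -0.10, 0.05, 0.08])
extra_share = extra / base  # extra capital as a fraction of the DCA base
```

Running the same function over a benchmark's return series gives the parity comparison described above, since both portfolios then face the same funding rule.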

Fig 4: Systematic invested capital comparison


4.2 Systematic Buys and Exits


The strategy also applies explicit risk controls, evaluated daily:

  • Universe: US stocks above $2.5B market cap; if a name drops below that threshold, it becomes sell-only.

  • Buy Allocation: New capital is split equally across the top 3 names by G-Score.

  • The Percentile Exit: Positions below the 65th percentile G-Score are sold to minimize the opportunity cost of not holding the highest G-Score tickers. 

  • “Fade” the winners: Daily sells are capped at 25% of portfolio equity to reduce flip risk and slippage.

  • The Frozen Stop-Loss: If institutional coverage falls below a threshold, apply a 7% trailing stop from the high-water mark.
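The buy and exit rules above can be sketched as three small functions. The numeric parameters come from the list, but several details are assumptions: the percentile is computed as the share of the universe scoring strictly lower, capped sells are filled weakest-first, and the coverage check that arms the stop is omitted because its threshold is not given.

```python
TOP_N = 3               # buy the top 3 names by G-Score
EXIT_PERCENTILE = 0.65  # sell positions below the 65th percentile
DAILY_SELL_CAP = 0.25   # sells capped at 25% of portfolio equity per day
TRAILING_STOP = 0.07    # 7% below the high-water mark


def allocate_buys(cash, scores):
    """Split new capital equally across the TOP_N tickers by G-Score."""
    top = sorted(scores, key=scores.get, reverse=True)[:TOP_N]
    return {t: cash / len(top) for t in top}


def percentile_rank(ticker, scores):
    """Share of the universe whose score is strictly below this ticker's score."""
    below = sum(1 for s in scores.values() if s < scores[ticker])
    return below / len(scores)


def plan_sells(positions, scores):
    """Queue positions under the exit percentile, weakest first, up to the cap."""
    budget = DAILY_SELL_CAP * sum(positions.values())
    flagged = [t for t in positions if percentile_rank(t, scores) < EXIT_PERCENTILE]
    flagged.sort(key=scores.get)
    sells = {}
    for t in flagged:
        if budget <= 0:
            break  # remaining exits roll to the next trading day
        amount = min(positions[t], budget)
        sells[t] = amount
        budget -= amount
    return sells


def stop_triggered(price, high_water_mark):
    """True once price has fallen 7% or more from the high-water mark."""
    return price <= high_water_mark * (1.0 - TRAILING_STOP)


scores = {"A": 0.40, "B": 0.30, "C": 0.25, "D": 0.10, "E": -0.05}  # hypothetical
buys = allocate_buys(3_000.0, scores)
sells = plan_sells({"A": 5_000.0, "D": 2_000.0, "E": 3_000.0}, scores)
```

In this example both D and E fall below the exit percentile, but the 25% cap lets only part of E be sold that day, which is how the cap spreads exits over several sessions.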


This produced a practical trade profile for active professionals: the system traded on 426 days over 5 years, about one in three trading days.

Fig 5: 5-year backtest individual stock performance


5.0 Out-of-Sample Validation & The "Coiled Spring"


A model is only as strong as its out-of-sample performance, so it was tested on two periods:

  1. The COVID Stress Test (Mar - Dec 2020): During a historically volatile bear market, the portfolio absorbed the hit and triggered dip refills on the way down. It subsequently posted a 97% annualized return (XIRR), more than 2x SPY over the same period.

  2. The “SaaSpocalypse” (H1 2026): Current market mechanics resemble the COVID example above, though at a smaller scale. As AI adoption punishes traditional SaaS models, the strategy has rotated further into the software sector while maintaining diversified holdings in biotech and gaming. This has yielded a temporary -8% drawdown (compared to +0.6% for QQQ) during the first 2 months of trading. As in 2020, however, the engine is designed to accumulate highly discounted, high-conviction assets.
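XIRR, the return measure quoted for the COVID test, is the annualized rate that sets the net present value of irregularly dated cashflows to zero. It suits this strategy because contributions and dip refills arrive on different dates. The sketch below solves for it by bisection on an actual/365 basis, the convention used by spreadsheet XIRR functions. The cashflows are hypothetical and do not reproduce the reported figures.

```python
from datetime import date


def xnpv(rate, cashflows):
    """Net present value of dated cashflows, discounted on an actual/365 basis."""
    t0 = cashflows[0][0]
    return sum(cf / (1.0 + rate) ** ((d - t0).days / 365.0) for d, cf in cashflows)


def xirr(cashflows, lo=-0.99, hi=10.0, tol=1e-9):
    """Annualized money-weighted return, found by bisection on xnpv."""
    f_lo, f_hi = xnpv(lo, cashflows), xnpv(hi, cashflows)
    if f_lo * f_hi > 0:
        raise ValueError("no sign change in the search bracket")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        f_mid = xnpv(mid, cashflows)
        if f_lo * f_mid <= 0:
            hi = mid              # root lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid  # root lies in [mid, hi]
    return (lo + hi) / 2.0


# Deposits are negative, the ending portfolio value is positive.
one_year = [(date(2020, 3, 2), -1_000.0), (date(2021, 3, 2), 1_500.0)]
rate_one_year = xirr(one_year)  # 365 days apart, so the rate is about 0.50

with_refill = [
    (date(2020, 3, 2), -1_000.0),   # base contribution
    (date(2020, 4, 1), -1_500.0),   # contribution plus a hypothetical dip refill
    (date(2020, 12, 31), 3_200.0),  # hypothetical ending value
]
rate_with_refill = xirr(with_refill)
```

Because XIRR annualizes, a strong gain over a nine-month window translates into a larger annual figure than the raw period return.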