Results

How to read this page

Sharpe ratios on this page are reported in two complementary forms:

Sharpe (excess): the headline metric, computed as (annualized return − annualized risk-free rate) / annualized volatility.
Deflated Sharpe (DSR): corrects for selection bias when many baskets are tested. It uses an implied-independent-trials count rather than the raw trial count, because overlapping baskets share underlying-instrument exposure, so the raw count overstates the number of independent searches.

Other reported metrics:

PBO (Probability of Backtest Overfitting) via CSCV (combinatorially symmetric cross-validation): the probability that the in-sample-best basket underperforms the median basket out of sample (OOS), computed by leave-out partitioning of the per-instrument return matrix.
n_trades and win rate per instrument.
Max drawdown (signed, negative) and annualized return (geometric).

Risk-free rate: The risk-free rate is the CBOE 13-week T-bill yield (IRX), which averaged 2.33% annualized over the 2018-2024 OOS window. Excess Sharpe = (annualized return − 2.33%) / annualized volatility. The figure reflects the realized T-bill yield over the test period, which ranged from near zero in 2020-2021 to roughly 5% in 2023-2024.
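
A minimal sketch of the excess-Sharpe definition above, assuming daily simple returns, 252 trading days per year, geometric annualization of return, and sample standard deviation for volatility. The function names and the 252-day convention are illustrative assumptions, not taken from the project code.

```python
import math
import statistics

TRADING_DAYS = 252  # assumed annualization factor


def annualized_return(daily_returns):
    """Geometric annualized return from daily simple returns."""
    growth = math.prod(1.0 + r for r in daily_returns)
    return growth ** (TRADING_DAYS / len(daily_returns)) - 1.0


def annualized_volatility(daily_returns):
    """Sample standard deviation of daily returns, scaled by sqrt(252)."""
    return statistics.stdev(daily_returns) * math.sqrt(TRADING_DAYS)


def excess_sharpe(daily_returns, rf_annual=0.0233):
    """(annualized return - annualized risk-free) / annualized volatility."""
    excess = annualized_return(daily_returns) - rf_annual
    return excess / annualized_volatility(daily_returns)
```

With this definition, a return stream whose annualized return equals the risk-free rate has an excess Sharpe of exactly zero, whatever its volatility.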

Acceptance gates we set:

DSR (PSR threshold) ≥ 0.95 against the implied-independent-trials benchmark
PBO via CSCV ≤ 0.30 on the per-instrument return matrix

Headline

4-instrument cross-asset put-credit-spread basket with the halt framework engaged and a calibrated regime-stress ML overlay that scales book exposure by (1 − p_stress) where p_stress is the model’s probability of a stress event in the next quarter. Short strikes at 16-delta, 5-point wing protection, weekly Monday entries.
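
The exposure scaling described above can be sketched as follows. The function name and the input validation are hypothetical; only the `(1 − p_stress)` multiplier comes from the text.

```python
def scaled_exposure(base_exposure: float, p_stress: float) -> float:
    """Scale book exposure by (1 - p_stress).

    p_stress is the model's probability of a stress event; the range
    check is an illustrative assumption, not documented project behavior.
    """
    if not 0.0 <= p_stress <= 1.0:
        raise ValueError("p_stress must be a probability in [0, 1]")
    return base_exposure * (1.0 - p_stress)
```

For example, a stress probability of 0.25 would cut a $200,000 book to $150,000 of exposure, and a probability of 0 leaves the book fully deployed.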

Composition	Asset class	Strike grid
AAPL	Single-stock equity	$1
MSFT	Single-stock equity	$1
WMT	Single-stock equity	$1
GLD	Commodity ETF	$1

Headline metric	Value
Sharpe ratio (excess of risk-free, ML overlay engaged)	+0.371
Sharpe ratio (excess), without ML overlay	+0.359
SPX baseline (single-instrument, predecessor implementation)	+0.286
Δ vs SPX baseline	+0.085 ✓
Risk-free baseline (avg IRX 2018-2024)	2.33%
Geometric mean return (GMRR, annualized)	+2.41%
Annualized volatility	0.21% (with ML) / 0.23% (without)
Alpha vs SPY (annualized OLS intercept × 252)	+2.37%
Beta vs SPY (OLS slope, daily simple returns)	+0.0014
Correlation with SPY (daily returns)	0.12
Max drawdown over OOS (7 years)	-0.12% (with ML) / -0.13% (without)
Trades total (sum across 4 instruments)	437
Average trades per year	62.8
Average return per trade	+8.25% (median +$17.85 net P&L per spread)
Win rate (basket aggregate)	73.0%
OOS span	2018-01-01 → 2024-12-31 (1,760 trading days)
ML overlay	Regime-stress scaler `(1 − p_stress)` applied to daily book exposure
ML acceptance gate	Brier-score reduction +14.1% vs naive baseline (gate at ≥5%) ✓

Mapping to the HW3 / HW4 rubric: GMRR is the geometric mean return (annualized) row; Alpha and Beta are computed by daily OLS of basket returns vs SPY over the full 1,760-day OOS window (per the project’s src/metrics/portfolio.py); Sharpe, Annualized Volatility, Max Drawdown, Avg Return per Trade, Trades per Year, and Total Trades are reported above. Beta is essentially zero because the strategy harvests volatility risk premium on a defined-risk wing-protected structure rather than holding directional equity beta.
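
As an illustration of the alpha and beta rows, here is a minimal single-factor OLS sketch in plain Python. It is not the code in `src/metrics/portfolio.py` (which is not shown on this page); the function name and signature are assumptions.

```python
def ols_alpha_beta(strategy_returns, benchmark_returns, periods_per_year=252):
    """Return (annualized alpha, beta) from a one-factor OLS fit.

    beta is the OLS slope cov(x, y) / var(x); alpha is the daily
    intercept multiplied by periods_per_year.
    """
    n = len(strategy_returns)
    if n != len(benchmark_returns) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(benchmark_returns) / n
    mean_y = sum(strategy_returns) / n
    cov_xy = sum((x - mean_x) * (y - mean_y)
                 for x, y in zip(benchmark_returns, strategy_returns))
    var_x = sum((x - mean_x) ** 2 for x in benchmark_returns)
    beta = cov_xy / var_x
    alpha_daily = mean_y - beta * mean_x
    return alpha_daily * periods_per_year, beta
```

Because the slope equals correlation times the ratio of volatilities, a strategy with very low volatility relative to SPY has a beta near zero even when its correlation is modestly positive.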

Multiple-testing validation

The basket was selected from a tested universe of 12 instruments (3 ETFs: SPX, TLT, GLD; 9 single-stocks: AAPL, MSFT, GOOGL, JNJ, KO, PG, WMT, JPM, PEP). Selection bias is corrected via DSR + PBO:

Correction	Value	Acceptance	Pass?
Raw trials tested (N)	12	informational	n/a
Avg pairwise return correlation (ρ̄)	0.261	informational	n/a
Implied independent trials (N̂)	9	informational	n/a
Deflated Sharpe (PSR)	1.0000	≥ 0.95	✓
PBO via CSCV (S=16, 12,870 logits)	0.0402	≤ 0.30	✓
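
Two of the table entries can be checked with simple arithmetic. The sketch below assumes the linear interpolation N̂ = ρ̄ + (1 − ρ̄)·N for the implied independent trials, which runs from 1 trial at ρ̄ = 1 to N trials at ρ̄ = 0; whether the project uses exactly this formula is an assumption, although it does reproduce the reported value. It also assumes the CSCV logit count is the number of ways to choose half of the S = 16 blocks as the in-sample set.

```python
import math


def implied_independent_trials(n_raw: int, avg_corr: float) -> float:
    """Assumed interpolation: avg_corr + (1 - avg_corr) * n_raw."""
    return avg_corr + (1.0 - avg_corr) * n_raw


n_hat = implied_independent_trials(12, 0.261)  # about 9.13, reported as 9
n_cscv_splits = math.comb(16, 8)               # 12870 in-sample/out-of-sample splits
```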

Both gates pass. The headline is statistically significant after correction for the 12-trial selection.
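
For readers unfamiliar with the DSR gate, the following is a sketch of the standard Bailey and López de Prado formulation: the Probabilistic Sharpe Ratio evaluated against the expected maximum Sharpe of N skill-less trials. It works in per-period (not annualized) Sharpe units. The inputs it needs (per-period Sharpe, the variance of Sharpe across trials, skewness, and kurtosis) are not reported on this page, so the function names and any values passed in are illustrative assumptions rather than the project's implementation.

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329
_STD_NORMAL = NormalDist()


def expected_max_sharpe(n_trials: float, sharpe_variance: float) -> float:
    """Expected maximum per-period Sharpe among n_trials zero-skill trials."""
    return math.sqrt(sharpe_variance) * (
        (1.0 - EULER_GAMMA) * _STD_NORMAL.inv_cdf(1.0 - 1.0 / n_trials)
        + EULER_GAMMA * _STD_NORMAL.inv_cdf(1.0 - 1.0 / (n_trials * math.e))
    )


def probabilistic_sharpe(sharpe, benchmark, n_obs, skew=0.0, kurtosis=3.0):
    """P(true Sharpe > benchmark) given n_obs returns and their moments."""
    denom = math.sqrt(1.0 - skew * sharpe + (kurtosis - 1.0) / 4.0 * sharpe ** 2)
    z = (sharpe - benchmark) * math.sqrt(n_obs - 1) / denom
    return _STD_NORMAL.cdf(z)
```

The deflated Sharpe is then `probabilistic_sharpe(sr, expected_max_sharpe(n_hat, var_sr), n_obs, skew, kurtosis)`, and the gate passes when that probability is at least 0.95.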

Equity curve

The headline ends at 1.18× starting equity over 7 years (CAGR 2.41%). The SPX baseline (Sharpe 0.286, ann_ret 2.49%) ends very close in absolute terms. The Sharpe advantage (+0.085 with the ML overlay engaged) shows up not in absolute return but in volatility: the headline runs at 0.21% annualized volatility versus the SPX baseline’s 0.55%, so the same return is more risk-efficient.
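
The relationship between the 2.41% CAGR and the 1.18× terminal multiple is plain compounding, sketched here with hypothetical helper names:

```python
def terminal_multiple(cagr: float, years: float) -> float:
    """Ending equity as a multiple of starting equity."""
    return (1.0 + cagr) ** years


def cagr_from_multiple(multiple: float, years: float) -> float:
    """Inverse of terminal_multiple."""
    return multiple ** (1.0 / years) - 1.0


headline_multiple = terminal_multiple(0.0241, 7)  # about 1.18
```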

Drawdown

Maximum drawdown is -0.13% over the 7-year OOS sample without the ML overlay (-0.12% with it). The defined-risk put-credit-spread structure plus the halt framework absorbs every named stress event with at most a 0.1% peak-to-trough dip.

Per-instrument breakdown

Ticker	Sharpe (excess)	n_trades	Win rate	Max DD	Ann. return	Final $ (from $50K start)
AAPL	+0.264	112	74.1%	-0.50%	+2.44%	$59,170
MSFT	+0.269	112	73.2%	-0.37%	+2.42%	$59,124
WMT	+0.263	107	69.2%	-0.18%	+2.40%	$59,017
GLD	+0.138	106	75.5%	-0.23%	+2.40%	$59,008
Aggregate (equal-weight 4)	+0.359	437	73.0%	-0.13%	+2.41%	$236,319

The book aggregate has higher Sharpe than the per-instrument average because cross-instrument correlation is low (mean pairwise 0.26), so equal-weighting reduces volatility without proportional return loss.
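
A back-of-the-envelope check of this diversification effect, under simplifying assumptions that are not the project's actual calculation: all four instruments have equal volatility and a uniform pairwise correlation of 0.26. Under those assumptions an equal-weight basket has volatility equal to the single-instrument volatility times sqrt((1 + (N − 1)ρ) / N), and its Sharpe is the average standalone Sharpe divided by that ratio.

```python
import math


def equal_weight_vol_ratio(n_assets: int, avg_corr: float) -> float:
    """Basket vol / single-asset vol for equal weights and uniform correlation."""
    return math.sqrt((1.0 + (n_assets - 1) * avg_corr) / n_assets)


standalone_sharpes = [0.264, 0.269, 0.263, 0.138]  # AAPL, MSFT, WMT, GLD
ratio = equal_weight_vol_ratio(4, 0.26)            # roughly 0.667
avg_sharpe = sum(standalone_sharpes) / len(standalone_sharpes)
implied_basket_sharpe = avg_sharpe / ratio         # roughly 0.350
```

The implied value of about 0.35 is close to the reported aggregate of +0.359, which suggests the uplift is mostly explained by volatility reduction.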

Anchor comparison vs SPX baseline

Strategy	Architecture	Sharpe (excess)	Trades	OOS window
SPX baseline (predecessor implementation, historical reference)	SPX put-only with halts engaged	+0.286	210	2018–2024
Headline	4-instrument basket, put-only, halts engaged, regime-stress ML overlay	+0.371	437	2018–2024
Δ vs SPX baseline	(same OOS window, both halts engaged put-only)	+0.085	+227	2018–2024

The new headline beats the SPX baseline on excess Sharpe, with roughly 2× the trade sample, comparable max drawdown, and cross-asset diversification (equity + commodity) the baseline lacks. The SPX baseline (the predecessor implementation) is included as a historical reference; reproducing its exact equity curve is not supported on the current engine version because the engine has materially evolved since the baseline was recorded.

Variants tested: alternative baskets

All baskets below are run with halts engaged, equal-weighted at $50,000 per instrument. Apart from (D), none were selected as headline; they are shown to demonstrate what the headline was chosen against.

Basket	n_inst	Sharpe (excess)	AnnRet	AnnVol	MaxDD	Δ vs SPX baseline (+0.286)
(A) SPX put-only alone	1	-0.347	+2.04%	0.84%	-0.78%	-0.633
(B) 3-name basket (AAPL+MSFT+WMT)	3	+0.350	+2.42%	0.26%	-0.19%	+0.064
(C) 3-ETF basket (SPX + TLT + GLD put-only)	3	-0.312	+2.22%	0.35%	-0.31%	-0.598
(D) the 4-instrument basket ← HEADLINE	4	+0.359	+2.41%	0.23%	-0.13%	+0.073
(E) Full 6-instrument book (3-ETF basket + 3-name basket)	6	-0.036	+2.32%	0.26%	-0.17%	-0.322

Pattern: adding SPX or TLT to the headline drags the Sharpe down because their individual put-credit-spread excess Sharpes are negative (-0.35 and -0.43 respectively over OOS). GLD is different: although its standalone excess Sharpe is low (+0.138), its diversification benefit outweighs that drag, lifting the basket from +0.350 (B) to +0.359 (D) as annualized volatility falls from 0.26% to 0.23%. Adding the drag instruments (SPX, TLT) produces no such net benefit: the full 6-instrument book (E) falls to -0.036.

Variants tested: iron condor architecture

We tested both architectures on the same engine and underlying universe. The iron condor (put + call wings on the same expiry) underperforms put-only in every like-for-like comparison over OOS 2018-2024:

Mode	Universe	Sharpe (excess)	Δ vs SPX baseline	Notes
Iron condor, SPX only	1 inst	-1.882	-2.168	Call wing destroyed by post-2020 SPX rally
Iron condor, 3-instrument cluster	SPX/TLT/GLD	-2.292	-2.578	Same pattern across all three IC instruments
Put-only, SPX	1 inst	-0.347	-0.633	Same engine, IC removed
Put-only, 3-instrument cluster	SPX/TLT/GLD	-0.369	-0.655	Same engine, IC removed

The call-wing leg of the iron condor systematically lost money in 2018-2024 because of the trending equity-index regime. The iron condor is reported as a tested extension that did not add value, not as the headline.

Halts engaged vs disengaged

This comparison shows that the halt framework is doing real work: every instrument has a higher excess Sharpe with halts engaged than with halts disengaged.

Ticker	Sharpe, halts disengaged	Sharpe, halts engaged	Δ from halts
AAPL	+0.155	+0.264	+0.109
MSFT	+0.037	+0.269	+0.232
WMT	+0.063	+0.263	+0.200
GLD	-0.971	+0.138	+1.109

The halt framework’s contribution is measurable per instrument and consistently positive. GLD has the largest gap because, with halts disengaged, its exposure stays fully on through every regime; the halt framework gates the worst stretches.

Stress-event behavior

Computed on the headline basket equity curve, halts engaged, equal-weight $50K per instrument (basket starting equity $200K). Drawdown is peak-to-trough WITHIN each window using a running cumulative max.

Event	Window probed	Net P&L	Peak-to-trough DD	Trough date
Volmageddon	2018-01-22 → 2018-02-16	-$35	-0.099%	2018-02-07
Q4 2018 selloff	2018-11-26 → 2019-01-24	+$783	0.000%	(curve monotonic up)
COVID crash	2020-02-24 → 2020-04-23	+$102	0.000%	(curve monotonic up)
2022 bear market	2022-01-03 → 2022-12-30	+$4,223	0.000%	(curve monotonic up)
Banking crisis	2023-02-13 → 2023-04-13	+$1,652	0.000%	(curve monotonic up)

The 0.000% peak-to-trough entries are not measurement error. They reflect a structural feature of the halt-gated put-credit-spread architecture: during these stress windows the halt framework reduced or paused new entries, the open positions either expired profitably or hit stop-loss within their wing-width bound, and the unutilized capital continued earning the realized T-bill rate. Net trading P&L plus cash carry was non-negative on every trading day through these windows, so the equity curve never fell below its running peak.

Volmageddon is the one exception. The early-2018 timing meant the basket was fully deployed when the VIX spike hit, and the resulting -$35 net P&L (-0.099% peak-to-trough) is the only intra-event dip the strategy registered across the five named events.
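
The within-window drawdown measure used in the table (peak-to-trough against a running cumulative max) can be sketched as below. The function name and the list-of-floats input are assumptions; the project may compute this on a dated series instead.

```python
from itertools import accumulate


def window_drawdown(equity):
    """Return (worst drawdown as a non-positive fraction, index of the trough).

    The running peak restarts at the first value of the window, so
    drawdowns that began before the window are not counted.
    """
    running_peak = list(accumulate(equity, max))
    drawdowns = [value / peak - 1.0 for value, peak in zip(equity, running_peak)]
    trough = min(range(len(drawdowns)), key=drawdowns.__getitem__)
    return drawdowns[trough], trough
```

A curve that rises on every day of the window returns a drawdown of exactly 0.0, which is how the "curve monotonic up" rows arise.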

Trade fates and rates

Every trade exits via one of five fates. The distribution across the 437-trade headline blotter:

Fate	Trigger	Count	% of trades
`profit_target`	Exit debit ≤ 50% of entry credit	280	64.1%
`stop_loss`	Exit debit ≥ 200% of entry credit (gap-aware fill)	81	18.5%
`time_exit`	DTE ≤ 21	75	17.2%
`emergency`	|short_delta| > 0.50	1	0.2%
`eos_force`	End-of-OOS forced close	0	0.0%
Total		437	100%

Reading the rates the rubric asks for:

Rate	Value
Success rate (P&L > 0)	73.0%
Stop-loss rate	18.5%
Timeout rate (`time_exit`)	17.2%
Emergency-exit rate	0.2%
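
The exit triggers in the fates table can be expressed as a small classifier. The thresholds come from the table; the order in which the rules are checked, the function signature, and the DTE and delta values used to exercise it are assumptions, since the page does not state rule precedence.

```python
from typing import Optional


def classify_fate(entry_credit: float, exit_debit: float, dte: int,
                  short_delta: float, is_last_oos_day: bool = False) -> Optional[str]:
    """Return the exit fate for a position, or None if it stays open.

    Rule precedence (emergency, stop, target, time, end-of-sample) is assumed.
    """
    if abs(short_delta) > 0.50:
        return "emergency"
    if exit_debit >= 2.00 * entry_credit:
        return "stop_loss"
    if exit_debit <= 0.50 * entry_credit:
        return "profit_target"
    if dte <= 21:
        return "time_exit"
    if is_last_oos_day:
        return "eos_force"
    return None
```

For instance, a spread opened for a $22.45 credit and marked at a $0.90 debit satisfies the profit-target rule, while one opened at $38.85 and marked at $88.55 satisfies the stop-loss rule.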

Per-trade summary statistics

Metric	Value
Total trades	437
Winning trades	319
Losing trades	118
Mean P&L per spread	+$3.42
Median P&L per spread	+$17.85
Standard deviation of P&L	$37.66
Largest single win	+$49.50
Largest single loss	-$186.00
Mean trade return	+8.25%
Mean trade lifetime	6.5 days
Median trade lifetime	7 days
Profit factor (gross win / gross loss)	1.26

The 6.5-day mean lifetime reflects how the strategy actually deploys capital: profit-target exits fire quickly in calm regimes, and the basket’s capital spends most of its time earning T-bill carry between trade cycles. Median holding period is 7 days; the 25th-to-75th percentile window is 3 to 10 days; no trade exceeds 14 days because the time-exit rule forces a close at DTE ≤ 21 against the 30-45 DTE entry window.
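
The summary statistics above are straightforward to recompute from a list of per-trade P&L values. The sketch below uses hypothetical names and, as input, the ten P&L values from the blotter sample on this page. Ten trades are far too few to be representative, so the results differ from the 437-trade figures in the table.

```python
from statistics import mean, median


def trade_stats(pnls):
    """Win rate, mean, median, and profit factor for a list of trade P&Ls."""
    wins = [p for p in pnls if p > 0]
    losses = [p for p in pnls if p <= 0]
    gross_win = sum(wins)
    gross_loss = -sum(losses)
    return {
        "win_rate": len(wins) / len(pnls),
        "mean": mean(pnls),
        "median": median(pnls),
        "profit_factor": gross_win / gross_loss if gross_loss > 0 else float("inf"),
    }


# P&L column of the 10-trade blotter sample (dollars per spread)
sample_pnls = [21.55, -24.75, 10.55, 23.40, -49.70,
               12.10, 9.75, 18.55, -93.10, 8.25]
sample_stats = trade_stats(sample_pnls)
```

On this small sample the win rate is 70% but the profit factor is below 1, which illustrates how a few stop-loss exits can dominate a short window.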

Per-instrument breakdown

Ticker	Trades	Wins	Win rate	Mean P&L	Total P&L
AAPL	112	83	74.1%	+$4.11	+$460.23
MSFT	112	82	73.2%	+$3.54	+$396.33
GLD	106	80	75.5%	+$3.06	+$324.70
WMT	107	74	69.2%	+$2.92	+$312.87

P&L per trade across OOS

Trade-return distribution

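The distribution is left-skewed (negatively skewed) by design: a 16-delta short put expires worthless ~84% of the time under lognormal assumptions, and profit_target closes wins early at 50% of credit, so wins are frequent but capped while losses are rarer and larger (mean +$3.42 versus median +$17.85 per spread). Losses are bounded by the wing-width stop. The 1.26 profit factor reflects harvesting the gap between implied and realized variance (the volatility risk premium), not directional alpha on the underlying.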

Ledger (monthly P&L sample)

Month	Trades closed	Net P&L	Cumulative P&L
2018-01	7	+$36.75	+$36.75
2018-02	5	-$193.88	-$157.13
2018-05	9	+$97.05	-$60.08
2018-08	17	+$191.15	+$179.58
2018-09	12	+$175.50	+$355.08
2018-10	11	-$290.25	+$64.83
2019-07	19	+$315.85	+$369.58
2019-12	18	+$321.70	+$529.38
2020-03	14	-$11.20	+$612.10
…	…	…	…
2024-08	5	-$361.65	+$1,237.08
2024-11	12	+$181.80	+$1,418.88
2024-12	12	+$75.25	+$1,494.13

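Showing 12 of 84 months from January 2018 to December 2024. Net P&L in dollars per spread (per-contract basis at $100 multiplier). Full monthly ledger: monthly_ledger.csv (https://github.com/mariotrev120/FinalProject_FinTech533/blob/mario/data-pipeline/website/data/monthly_ledger.csv), which renders as a sortable table on GitHub.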

Blotter

Random sample of 10 trades from the 437-row blotter (seed=42).

trd_prd	Entry	Exit	Ticker	Side	Qty	Entry credit	Exit debit	Fate	P&L	Return %	Success
2018.19	2018-05-07	2018-05-16	WMT	P	1	$22.45	$0.90	profit_target	+$21.55	+96.0%	True
2018.39	2018-09-24	2018-10-05	WMT	P	1	$35.80	$60.55	time_exit	-$24.75	-69.1%	False
2018.40	2018-10-01	2018-10-11	GLD	P	1	$15.05	$4.50	profit_target	+$10.55	+70.1%	True
2019.07	2019-02-11	2019-02-15	MSFT	P	1	$38.70	$15.30	profit_target	+$23.40	+60.5%	True
2019.30	2019-07-22	2019-08-01	WMT	P	1	$38.85	$88.55	stop_loss	-$49.70	-127.9%	False
2019.48	2019-11-25	2019-12-04	WMT	P	1	$29.85	$17.75	profit_target	+$12.10	+40.5%	True
2019.52	2019-12-23	2019-12-26	GLD	P	1	$19.70	$9.95	profit_target	+$9.75	+49.5%	True
2024.07	2024-02-12	2024-02-23	MSFT	P	1	$52.80	$34.25	time_exit	+$18.55	+35.1%	True
2024.23	2024-06-03	2024-06-07	GLD	P	1	$54.40	$147.50	stop_loss	-$93.10	-171.1%	False
2024.23	2024-06-03	2024-06-05	WMT	P	1	$14.45	$6.20	profit_target	+$8.25	+57.1%	True

The trd_prd index encodes year + ISO week as a single decimal. Entries on the same Monday share the same trd_prd.
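
A sketch of how such an index could be derived from the entry date. The function below is an assumption about the encoding (ISO year, a dot, and the zero-padded ISO week), not the project's code, but it reproduces the `trd_prd` values in the sample rows above.

```python
from datetime import date


def trd_prd(entry: date) -> str:
    """Encode an entry date as '<ISO year>.<ISO week, zero-padded>'."""
    iso_year, iso_week, _ = entry.isocalendar()
    return f"{iso_year}.{iso_week:02d}"
```

Note that the ISO year can differ from the calendar year for dates in the first or last days of a year; how the project handles those weeks is not stated on this page.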

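Showing 10 of 437 trades. Full blotter: blotter.csv (https://github.com/mariotrev120/FinalProject_FinTech533/blob/mario/data-pipeline/website/data/blotter.csv), with all 437 entries; it renders as a sortable table on GitHub.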

How will you know the strategy is performing as expected?

A rolling 60-trade window of the realized win rate is compared against the OOS baseline of μ = 0.730. The Hoeffding inequality bounds the probability that observed underperformance is due to chance: while the bound stays at or above 50%, the strategy is operating within its modeled regime and trading continues at full size. Backtested over the 2018-2024 OOS sample, the bound was at or above 50% on 88% of post-warmup trades.

How will you quantify when the strategy stops working?

The same Hoeffding bound is used. When the bound drops below 25%, position size is cut; when it drops below 10%, entries are halted entirely and the strategy is reviewed. The thresholds are pre-set, distribution-free, and apply uniformly across the 4-instrument basket. The OOS sample produced no critical signal across 1,760 trading days; the full bound trace and a worked example are on the Live Monitoring page.
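
A minimal sketch of the monitoring rule, assuming the one-sided Hoeffding inequality for a mean of n bounded outcomes, P(p̂ ≤ μ − t) ≤ exp(−2nt²), with n = 60 and μ = 0.730. The exact form used on the Live Monitoring page may differ. The function names and action labels are hypothetical, and the page does not say what happens when the bound is between 25% and 50%, so the sketch labels that band as unspecified.

```python
import math


def hoeffding_bound(observed_win_rate: float, baseline: float = 0.730,
                    window: int = 60) -> float:
    """Upper bound on the chance of a shortfall this large if the true rate is baseline."""
    shortfall = max(0.0, baseline - observed_win_rate)
    return math.exp(-2.0 * window * shortfall ** 2)


def monitoring_action(bound: float) -> str:
    """Map the bound to the pre-set thresholds described in the text."""
    if bound < 0.10:
        return "halt_entries"
    if bound < 0.25:
        return "cut_size"
    if bound >= 0.50:
        return "full_size"
    return "unspecified"  # 25% to 50% band is not defined on this page
```

Under these assumptions, 36 wins in the last 60 trades (a 60% win rate) gives a bound of about 0.13 and would trigger a size cut, while 35 wins gives about 0.08 and would halt entries.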

For the data sources behind these numbers, see Data and Sources.
