Backtest Engine — MacroFXModel

Click "Load all from R2", then press "Run Backtest".

M5 + M30 required · M1 optional (TP/SL exit precision)

Summary Statistics

Trades / Wins / Losses

Raw trade counts. A healthy backtest needs at least 200 trades for statistical significance. Below 100 trades, any win rate figure is mostly noise.

Win Rate

Percentage of trades that closed at TP (positive R). At a 2.2R target you need about 31% (1 ÷ 3.2) to break even before costs, and roughly 40% or more to be profitable after costs. Win rate alone is meaningless; it must be interpreted alongside the R-multiple.
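
The break-even figure follows from setting expected R to zero. A minimal Python sketch, assuming every trade ends at either the full target or a full 1R stop (EOD exits are ignored); the function name and the `cost_r` parameter are illustrative and not part of the engine.

```python
def breakeven_win_rate(target_r: float, cost_r: float = 0.0) -> float:
    """Win rate at which expected R per trade is zero.

    Assumes each trade closes at +target_r (TP) or -1R (SL), and that
    cost_r (a hypothetical per-trade cost, in R) is paid on every trade.
    """
    if target_r <= 0:
        raise ValueError("target_r must be positive")
    # p * (target_r - cost_r) - (1 - p) * (1 + cost_r) = 0
    # => p = (1 + cost_r) / (1 + target_r)
    return (1.0 + cost_r) / (1.0 + target_r)


if __name__ == "__main__":
    print(f"{breakeven_win_rate(2.2):.4f}")        # no costs
    print(f"{breakeven_win_rate(2.2, 0.05):.4f}")  # 0.05R cost per trade
```

With no costs the result is 1 ÷ 3.2; any per-trade cost pushes the required win rate higher.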

Profit Factor

Total gross profit ÷ total gross loss. >1.5 is good, >2.0 is excellent. Values below 1.0 mean the strategy loses money overall. Unlike win rate, PF captures both the frequency and magnitude of wins/losses.
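
As a rough illustration of the definition above, profit factor can be computed directly from a list of per-trade R values. This is a sketch only; the function name and the handling of the no-loss case are assumptions, not the engine's confirmed behaviour.

```python
def profit_factor(r_values: list[float]) -> float:
    """Gross profit divided by gross loss, both measured in R."""
    gross_profit = sum(r for r in r_values if r > 0)
    gross_loss = -sum(r for r in r_values if r < 0)
    if gross_loss == 0:
        # No losing trades: the ratio is undefined (or unbounded).
        return float("inf") if gross_profit > 0 else float("nan")
    return gross_profit / gross_loss


if __name__ == "__main__":
    # Two winners at +2.2R, three losers at -1R: a 40% win rate.
    print(round(profit_factor([2.2, -1.0, -1.0, 2.2, -1.0]), 3))
```

The sample has a 40% win rate yet a profit factor well above 1, which is why the two numbers should be read together.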

Mean R

Average R per trade (net of costs). Positive mean R is the core requirement. Multiply by expected trades per year to estimate annual R return. For example, a 0.15R mean × 250 trades = 37.5R/year.

Sharpe Ratio

Mean R ÷ standard deviation of R, annualised. Measures return per unit of risk. >1 is acceptable, >2 is strong, >3 is exceptional. High Sharpe means consistent returns without wild swings.
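
The page does not say how the ratio is annualised. One common convention, assumed here, is to scale the per-trade ratio by the square root of the number of trades per year, which treats trades as independent. The function name and the `trades_per_year` parameter are illustrative.

```python
import math
import statistics


def annualised_sharpe(r_values: list[float], trades_per_year: float) -> float:
    """Per-trade mean R / sample std dev of R, scaled by sqrt(trades per year)."""
    if len(r_values) < 2:
        raise ValueError("need at least two trades")
    mean_r = statistics.fmean(r_values)
    sd_r = statistics.stdev(r_values)  # sample standard deviation (n - 1)
    if sd_r == 0:
        return float("nan")  # undefined when every trade has the same R
    return mean_r / sd_r * math.sqrt(trades_per_year)


if __name__ == "__main__":
    sample = [1.0, -1.0, 1.0, -1.0, 2.0]
    print(round(annualised_sharpe(sample, trades_per_year=250), 2))
```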

Calmar Ratio

Annualised return ÷ maximum drawdown. >1 means you earn back your max drawdown every year. Higher is better. Low Calmar with high Sharpe suggests a slow grind; high Calmar suggests efficient use of risk.

Max Drawdown

Largest peak-to-trough decline in the equity curve (as a % of peak equity at 1% risk per trade). This is the number that determines whether you can psychologically and financially survive the strategy's worst losing run. <15% is manageable; >30% is very hard to trade live.
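
A minimal sketch of the peak-to-trough calculation, assuming equity compounds at a fixed fraction of current equity per trade (1% by default, matching the description above). The function and parameter names are illustrative.

```python
def max_drawdown(r_values: list[float], risk_per_trade: float = 0.01,
                 start_equity: float = 1.0) -> float:
    """Largest decline from a running peak, as a fraction of that peak."""
    equity = peak = start_equity
    worst = 0.0
    for r in r_values:
        equity *= 1.0 + risk_per_trade * r       # compound the trade result
        peak = max(peak, equity)                 # track the running high
        worst = max(worst, (peak - equity) / peak)
    return worst


if __name__ == "__main__":
    # One winner, three straight losers, one winner.
    print(f"{max_drawdown([2.2, -1.0, -1.0, -1.0, 2.2]):.4%}")
```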

Kelly %

The theoretically optimal fraction of account to risk per trade to maximise geometric growth. In practice use half Kelly or less. If Kelly is <1% or negative, the strategy's edge is marginal or negative.
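
The page does not state which Kelly formula the engine uses. For a strategy with two outcomes (win a fixed multiple of risk, or lose the full stop), the textbook form is f* = W − (1 − W) / R, where W is the win rate and R is the payoff ratio. A sketch under that assumption, with illustrative names:

```python
def kelly_fraction(win_rate: float, payoff_ratio: float) -> float:
    """Textbook Kelly fraction for a win/lose bet.

    win_rate      probability of a winning trade (0..1)
    payoff_ratio  average win divided by average loss (e.g. 2.2)
    """
    if payoff_ratio <= 0:
        raise ValueError("payoff_ratio must be positive")
    return win_rate - (1.0 - win_rate) / payoff_ratio


if __name__ == "__main__":
    full = kelly_fraction(0.40, 2.2)
    print(f"full Kelly: {full:.3f}, half Kelly: {full / 2:.3f}")
```

At a 40% win rate and a 2.2R payoff this gives roughly 12.7% full Kelly, so half Kelly is still several times the 1% risk used elsewhere on this page. At the break-even win rate the fraction falls to zero.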

CAGR

Compound Annual Growth Rate at 1% risk per trade on a £1 starting account. A normalised measure of annual return that accounts for compounding. Use this to compare strategies across different test lengths.
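
A sketch of the calculation, assuming fixed-fractional compounding at 1% risk per trade and a test length supplied in years. The function and parameter names are illustrative.

```python
def cagr(r_values: list[float], years: float, risk_per_trade: float = 0.01,
         start_equity: float = 1.0) -> float:
    """Compound annual growth rate of the compounded equity curve."""
    if years <= 0:
        raise ValueError("years must be positive")
    equity = start_equity
    for r in r_values:
        equity *= 1.0 + risk_per_trade * r
    # Annualise total growth by taking the 1/years root.
    return (equity / start_equity) ** (1.0 / years) - 1.0


if __name__ == "__main__":
    trades = [2.2] * 40 + [-1.0] * 60   # 100 trades at a 40% win rate
    print(f"{cagr(trades, years=0.5):.2%}")
```

Because the result is annualised, a six-month test and a five-year test can be compared on the same scale.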

AI Analysis

Equity Curve (cumulative R, 1% risk per trade)

Monte Carlo — 1,000 path resamples (P5 / P25 / P50 / P75 / P95)

Monte Carlo simulation randomly reorders the actual trade outcomes 1,000 times to show the range of possible equity paths the strategy could produce. It answers: "if the trades happened in a different sequence, how different would the outcome be?"

Bands (P5 to P95)

P50: the median path. Half of all simulations did better, half worse.
P75/P95: lucky sequences. These show the upside if you hit a good run of trades.
P25/P5: unlucky sequences. Only 5% of paths finish below P5, so this is the level to plan for.

What to look for

If P5 final equity is still positive, the strategy survives its worst realistic sequence.
A wide band between P5 and P95 means the outcome is highly sequence-dependent; consider reducing position size.
A narrow band means the edge is consistent and sequence risk is low.

Max drawdown distribution

The table below the chart shows the simulated max drawdown percentiles. The P95 max drawdown is your "plan for this" number — size your account so you can absorb it without blowing up.
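
The resampling idea can be sketched as follows. The page does not say whether paths are drawn with or without replacement. This sketch draws with replacement (a bootstrap), because a pure reshuffle of the same trades ends at the same final equity on every path and only the drawdowns differ. All names, the seed, and the nearest-rank percentile rule are assumptions.

```python
import random


def percentile(sorted_values: list[float], pct: int) -> float:
    """Nearest-rank percentile of an already sorted list."""
    rank = max(1, (pct * len(sorted_values) + 99) // 100)  # integer ceiling
    return sorted_values[rank - 1]


def monte_carlo(r_values: list[float], n_paths: int = 1000,
                risk_per_trade: float = 0.01, seed: int = 0):
    """Return (final-equity percentiles, max-drawdown percentiles)."""
    rng = random.Random(seed)
    finals, drawdowns = [], []
    for _ in range(n_paths):
        path = rng.choices(r_values, k=len(r_values))  # sample with replacement
        equity = peak = 1.0
        worst = 0.0
        for r in path:
            equity *= 1.0 + risk_per_trade * r
            peak = max(peak, equity)
            worst = max(worst, (peak - equity) / peak)
        finals.append(equity)
        drawdowns.append(worst)
    finals.sort()
    drawdowns.sort()
    levels = (5, 25, 50, 75, 95)
    return ({p: percentile(finals, p) for p in levels},
            {p: percentile(drawdowns, p) for p in levels})


if __name__ == "__main__":
    trades = [2.2] * 40 + [-1.0] * 60
    equity_pcts, dd_pcts = monte_carlo(trades)
    print("P5 final equity:", round(equity_pcts[5], 3))
    print("P95 max drawdown:", round(dd_pcts[95], 3))
```

For sizing, the P95 entry of the drawdown table is the one that corresponds to the "plan for this" number described above.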

Bayesian Feature Accuracy

For each feature, this table tracks every trade where that feature fired (voted non-null) and whether the trade was a winner. The result is the empirical win rate conditional on that feature having cast a vote.

Fires

Number of trades where this feature cast a vote (long or short — not abstained). A feature with very few fires may not have enough data to be reliable.

Win Rate

% of fired trades that were winners. >55% suggests the feature genuinely improves signal quality. <45% suggests it is adding noise — consider disabling it. Close to 50% means it has no predictive edge.

How to use this

Disable features with win rates consistently below 45% across multiple runs.
Increase the weight of features with win rates above 60% — they deserve more influence on the conviction score.
If a feature has fewer than 50 fires, its win rate is too noisy to trust; load more data or use a longer date range first.
Re-run after changing weights and compare: if overall win rate and PF improve, the weight change was beneficial.

Note: win rate here counts the trade outcome (TP vs SL), not whether the feature itself was "right" about direction. A feature that fires correctly on a trade with poor entry timing may still show a low win rate.
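
The tally behind this table can be sketched as below, counting a fire as any non-null vote and a win as a positive trade result, as the Fires and Win Rate definitions describe. The trade record shape (`votes`, `r`), the feature names, and the `min_fires` flag are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def feature_accuracy(trades: list[dict], min_fires: int = 50) -> dict:
    """Fires and win rate per feature.

    Each trade is a dict with:
      "votes": {feature_name: +1, -1, or None (abstained)}
      "r":     trade result in R
    """
    fires = defaultdict(int)
    wins = defaultdict(int)
    for trade in trades:
        for feature, vote in trade["votes"].items():
            if vote is None:
                continue  # abstained: not a fire
            fires[feature] += 1
            if trade["r"] > 0:
                wins[feature] += 1
    return {
        feature: {
            "fires": n,
            "win_rate": wins[feature] / n,
            "reliable": n >= min_fires,  # below the threshold the rate is too noisy
        }
        for feature, n in fires.items()
    }


if __name__ == "__main__":
    demo = [
        {"votes": {"trend": 1, "news": None}, "r": 2.2},
        {"votes": {"trend": 1, "news": -1}, "r": -1.0},
    ]
    print(feature_accuracy(demo))
```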

SD Level Breakdown

Aggregates every trade across the full run by the Fibonacci SD level where entry occurred. Use this to find which levels have the highest win rate and best R expectancy — and which to filter out.

SD Level: Fibonacci projection from the Asia session range. SD0 = range low, SD1 = range high, SD±0.5 = midpoints inside, SD2+ = outside the range (extension zones).
Win Rate: % of trades at that level that hit TP. Higher is better. Levels consistently below 40% are candidates for disabling via the Fib filter.
Total R / Avg R: Total R is the net R earned at this level across all trades; Avg R is that total divided by the number of trades at the level. A high win rate with negative total R means winners are smaller than losers, so check the R:R setting.
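
The grouping described above can be sketched as follows. The `(sd_level, result, r)` tuple shape is a hypothetical record format, and a win is counted only when the result is `"TP"`, matching the Win Rate definition.

```python
from collections import defaultdict


def sd_level_breakdown(trades) -> dict:
    """Aggregate trades by SD level.

    trades: iterable of (sd_level, result, r) tuples, where result is
    "TP", "SL", or "EOD" and r is the trade result in R.
    """
    buckets = defaultdict(list)
    for level, result, r in trades:
        buckets[level].append((result, r))

    table = {}
    for level, rows in buckets.items():
        total_r = sum(r for _, r in rows)
        table[level] = {
            "trades": len(rows),
            "win_rate": sum(1 for result, _ in rows if result == "TP") / len(rows),
            "total_r": total_r,
            "avg_r": total_r / len(rows),
        }
    return table


if __name__ == "__main__":
    demo = [("SD1", "TP", 2.2), ("SD1", "SL", -1.0),
            ("SD2", "SL", -1.0), ("SD2", "EOD", 0.3)]
    for level, row in sd_level_breakdown(demo).items():
        print(level, row)
```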

Trade Log (last 200 trades)

Shows the most recent 200 trades in chronological order. Click any column header to sort. Click ▶ on a row to expand its feature vote breakdown.

Columns

Date: London date the trade was entered.
Dir: trade direction, ↑ Long (buying) or ↓ Short (selling).
Entry / Exit: exact price at entry (confluence level) and at close (TP, SL, or EOD price).
SD Level: the Fibonacci SD level where entry occurred (e.g. SD1.5 = the 150% projection, beyond the range high). ↔ shows yesterday's fib if different from today's.
Result: TP = hit take profit · SL = hit stop loss · EOD = closed at market at 21:00.
R: profit/loss in R-multiples. +2.2R means you made 2.2× your risk. −1R means you lost your full stop.

Feature vote pills (expanded row)

Each pill shows one feature's vote for that trade: green = agreed with direction taken, red = disagreed, faded = abstained. The number at the right of each pill (+1, −1, 0) is the weighted contribution to the conviction score.

Use this to understand why a trade was taken and which features were wrong on losing trades — this is the key to refining feature weights.

⚡ tag

Trades tagged ⚡flip were opened by the Flip on SL feature — the original trade hit SL and was immediately reversed. Compare flip trade win rates to initial trade win rates to see if flipping adds value.

Profitability by Year & Month (click a year to expand)