Avoiding Overfitting
=====================

Overfitting is one of the most common reasons why strategies that look great in backtesting fail in live trading.
Understanding it will save you from making expensive mistakes.

----

.. contents:: On this page
   :local:
   :depth: 1

----

What is overfitting?
---------------------

Overfitting means your strategy's parameters have been tuned so precisely to historical data
that they capture *noise* rather than real market patterns.

Think of it this way: if you flip a coin 10 times and get 7 heads,
you wouldn't conclude the coin is biased, because 10 flips is too little data.
The same applies to backtesting: a Sharpe of 3.0 from only 15 trades says almost nothing about future performance.
The parameters may simply have happened to fit those 15 specific dates.
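
The coin example can be checked with a few lines of arithmetic. The sketch below computes how often a perfectly fair coin produces 7 or more heads in 10 flips; the function name is illustrative only.

.. code-block:: python

   from math import comb

   def prob_at_least(k, n, p=0.5):
       """Probability of k or more successes in n independent trials."""
       return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

   # 176 of the 1024 equally likely outcomes contain 7 or more heads.
   print(f"{prob_at_least(7, 10):.3f}")  # prints 0.172

A fair coin shows this "biased" result roughly one time in six, so 10 flips cannot distinguish luck from a real effect.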

**The symptom:** Excellent backtested performance that evaporates completely in live trading.

----

In-sample vs. out-of-sample
-----------------------------

These two concepts are the foundation of honest strategy evaluation:

.. list-table::
   :header-rows: 1
   :widths: 30 70

   * - Term
     - Meaning
   * - **In-sample (IS)**
     - The date range used to *develop and optimise* the strategy. Results here are biased upward.
   * - **Out-of-sample (OOS)**
     - A date range *never seen* during development. Results here are honest.

**The golden rule:** Never use out-of-sample data to make any parameter decisions.
The moment you look at OOS results and adjust your strategy, that data becomes in-sample.

**Recommended split:**

.. code-block:: text

   Full data:     2000 ─────────────────────────────── 2024
   In-sample:     2000 ──────────────── 2018
   Out-of-sample:                       2018 ─────── 2024

Hold back the most recent 20–30% of your data as OOS.
Recent data is most relevant to future performance.
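
A chronological split like the one above could be sketched as follows. The function name and the 25% default are assumptions for illustration, and the input is assumed to be sorted from oldest to newest.

.. code-block:: python

   from datetime import date

   def split_in_out_of_sample(dates, oos_fraction=0.25):
       """Split sorted dates so that the most recent part becomes out-of-sample."""
       if not 0 < oos_fraction < 1:
           raise ValueError("oos_fraction must be between 0 and 1")
       cut = int(len(dates) * (1 - oos_fraction))
       return dates[:cut], dates[cut:]

   # One marker per year from 2000 to 2023 (24 years of data).
   years = [date(y, 1, 1) for y in range(2000, 2024)]
   in_sample, out_of_sample = split_in_out_of_sample(years)
   print(in_sample[-1].year, out_of_sample[0].year)  # prints 2017 2018

The split is by position in time, never random: shuffling dates would leak future information into the in-sample set.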

----

Warning signs of overfitting
------------------------------

Watch for these red flags in your backtest results:

**1. Too few trades**

A strategy needs enough trades for statistics to be meaningful.
As a minimum guideline:

.. list-table::
   :header-rows: 1
   :widths: 25 75

   * - # Trades
     - Reliability of metrics
   * - < 20
     - Not reliable at all. Results are mostly noise.
   * - 20 – 50
     - Weak. Use with caution.
   * - 50 – 100
     - Acceptable.
   * - > 100
     - Good statistical basis.
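
One way to see why the trade count matters: the uncertainty of an average per-trade return shrinks only with the square root of the number of trades. The sketch below assumes independent trades and an illustrative per-trade standard deviation of 5%; both are assumptions, not properties of any particular strategy.

.. code-block:: python

   from math import sqrt

   def standard_error(per_trade_stdev, n_trades):
       """Standard error of the mean per-trade return for independent trades."""
       if n_trades < 1:
           raise ValueError("need at least one trade")
       return per_trade_stdev / sqrt(n_trades)

   for n in (15, 50, 100, 400):
       print(n, f"{standard_error(0.05, n):.4f}")

Going from 100 to 400 trades only halves the uncertainty, which is why the thresholds in the table are minimums rather than targets.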

**2. Sharpe Ratio too high**

A Sharpe above 2.5 on a diversified equity strategy is highly suspicious.
Real-world systematic strategies run by professionals typically achieve 0.5–1.5.
If your backtest shows Sharpe 4+, check your logic carefully for look-ahead bias.
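
For reference, a common textbook form of the annualised Sharpe Ratio is sketched below. It assumes daily returns, a zero risk-free rate, and 252 trading days per year; the way your backtesting tool computes the figure may differ.

.. code-block:: python

   from math import sqrt
   from statistics import mean, stdev

   TRADING_DAYS_PER_YEAR = 252  # common convention, an assumption here

   def annualised_sharpe(daily_returns, periods_per_year=TRADING_DAYS_PER_YEAR):
       """Mean over sample standard deviation of returns, scaled to one year."""
       if len(daily_returns) < 2:
           raise ValueError("need at least two returns")
       spread = stdev(daily_returns)
       if spread == 0:
           raise ValueError("returns have zero variance")
       return mean(daily_returns) / spread * sqrt(periods_per_year)

   example_returns = [0.004, -0.002, 0.003, -0.001, 0.002]  # made-up values
   print(round(annualised_sharpe(example_returns), 2))

Five made-up returns are enough to produce a value far above 2.5, which illustrates how little a high Sharpe means on a tiny sample.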

**3. Many parameters, few trades**

If your strategy has 8 optimisable parameters and only 40 trades,
the optimiser has one free parameter for every five trades, which is enough freedom to fit noise rather than a real pattern.
Rule of thumb: you need at least **10 trades per free parameter**.

.. list-table::
   :header-rows: 1
   :widths: 30 70

   * - Parameters
     - Minimum trades needed
   * - 2
     - 20
   * - 4
     - 40
   * - 8
     - 80
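
The rule of thumb in the table is a simple multiplication. A minimal sketch, with hypothetical function names:

.. code-block:: python

   def min_trades_required(n_parameters, trades_per_parameter=10):
       """Minimum trade count under the 10-trades-per-parameter rule of thumb."""
       return n_parameters * trades_per_parameter

   def has_enough_trades(n_trades, n_parameters, trades_per_parameter=10):
       """True if the backtest meets the rule of thumb."""
       return n_trades >= min_trades_required(n_parameters, trades_per_parameter)

   print(min_trades_required(8))    # prints 80
   print(has_enough_trades(40, 8))  # prints False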

**4. Equity curve only rises in one specific period**

If the backtest covers 20 years but 90% of the profit came in a single 2-year window,
the strategy may have just accidentally captured one bull market run — not a repeatable edge.
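
Profit concentration can be measured directly. The sketch below takes one profit figure per year (made-up numbers) and reports the largest share of total profit earned in any 2-year window; the function name is an assumption for illustration.

.. code-block:: python

   def max_window_profit_share(yearly_profit, window=2):
       """Largest fraction of total profit earned in any run of `window` years."""
       if len(yearly_profit) < window:
           raise ValueError("not enough years for the chosen window")
       total = sum(yearly_profit)
       if total <= 0:
           raise ValueError("total profit must be positive")
       best = max(
           sum(yearly_profit[i:i + window])
           for i in range(len(yearly_profit) - window + 1)
       )
       return best / total

   # 20 years: small gains everywhere except two exceptional years.
   profits = [5] * 8 + [450, 450] + [6] * 10
   print(max_window_profit_share(profits))  # prints 0.9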

**5. Isolated best parameter**

In an optimisation run, if the best parameter value is surrounded by much worse results,
it is probably an overfit spike. In this example, a single parameter (Period) is swept from 12 to 16:

.. code-block:: text

   Period 12:  Sharpe 0.8
   Period 13:  Sharpe 0.9
   Period 14:  Sharpe 2.6   ← overfit spike
   Period 15:  Sharpe 1.0
   Period 16:  Sharpe 0.9

A robust parameter sits inside a *plateau* of good results, not a spike.
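
One rough way to tell a spike from a plateau is to compare the best result with its immediate neighbours. The sketch below uses the figures from the example above; the function name is hypothetical, and any cut-off you apply to the ratio is a judgment call.

.. code-block:: python

   def spike_ratio(sharpe_by_value):
       """Return the best parameter value and its Sharpe relative to its neighbours."""
       values = sorted(sharpe_by_value)
       if len(values) < 2:
           raise ValueError("need at least two parameter values")
       best = max(values, key=lambda v: sharpe_by_value[v])
       i = values.index(best)
       neighbours = [
           sharpe_by_value[values[j]]
           for j in (i - 1, i + 1)
           if 0 <= j < len(values)
       ]
       neighbour_mean = sum(neighbours) / len(neighbours)
       if neighbour_mean <= 0:
           raise ValueError("neighbouring Sharpe ratios must be positive")
       return best, sharpe_by_value[best] / neighbour_mean

   sweep = {12: 0.8, 13: 0.9, 14: 2.6, 15: 1.0, 16: 0.9}
   best, ratio = spike_ratio(sweep)
   print(best, round(ratio, 2))  # prints 14 2.74

On a plateau the ratio stays close to 1; here the best value is nearly three times better than the values next to it.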

----

Look-ahead bias
----------------

Look-ahead bias is a separate but closely related problem: your strategy inadvertently uses data that would not have been available when the decision was made.
Like overfitting, it produces unrealistically good backtest results.

**Common causes in block strategies:**

- Using ``Offset = 0`` when you should use ``Offset = 1``.
  If you check a condition and enter a trade on the *same* bar, you're using closing price information
  that wasn't available at the moment the trade would have been placed.
- Calculating indicators on the full dataset before slicing by date.

.. tip::
   Use ``Execution = sod`` (start of day) for entries to simulate executing at the *next* day's open,
   which is more realistic than executing at the bar's close.
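
The effect of a one-bar offset can be illustrated outside the block editor. The sketch below is plain Python with made-up prices; ``delay_signals`` is a hypothetical helper, not part of any block strategy API.

.. code-block:: python

   def delay_signals(signals, offset=1):
       """Shift signals later by `offset` bars, padding the start with False."""
       if offset < 0:
           raise ValueError("a negative offset would read future bars")
       return ([False] * offset + list(signals))[: len(signals)]

   closes = [100, 102, 101, 105, 107]  # made-up closing prices

   # Signal known only once bar t has closed: did the close rise versus bar t-1?
   raw = [False] + [closes[i] > closes[i - 1] for i in range(1, len(closes))]

   # Acting on raw[t] during bar t uses a close that is not yet known.
   # Delaying by one bar makes the entry on bar t depend only on bar t-1.
   entries = delay_signals(raw, offset=1)
   print(raw)      # prints [False, True, False, True, True]
   print(entries)  # prints [False, False, True, False, True]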

----

Degrees of freedom
-------------------

Each parameter you add to your strategy "uses up" a degree of freedom:
another dimension along which the optimiser can accidentally fit the data.

A simple strategy with 2 parameters and 200 trades is far more robust than
a complex strategy with 10 parameters and the same 200 trades.

**Principles:**

1. Start simple — get the core idea working with 2–3 parameters.
2. Only add complexity if there is a logical reason for it, not just because it improves backtest results.
3. Every indicator you add should have an *economic explanation* for why it should predict returns.

----

The out-of-sample test
------------------------

The only honest final test is a single run on data you have never touched:

1. Develop and optimise on in-sample data (e.g. 2000–2018).
2. **Lock** your parameters — do not change them again.
3. Run exactly **one** backtest on out-of-sample data (2018–2024).
4. Report both results side-by-side.

A healthy strategy shows:

.. list-table::
   :header-rows: 1
   :widths: 35 30 30

   * - Metric
     - In-sample
     - Out-of-sample
   * - Sharpe Ratio
     - 1.40
     - 1.15  ✓ (within ~30%)
   * - Max Drawdown
     - -22%
     - -28%  ✓ (similar scale)
   * - Total Return
     - 185%
     - 67%   ✓ (shorter period)

An overfit strategy shows:

.. list-table::
   :header-rows: 1
   :widths: 35 30 30

   * - Metric
     - In-sample
     - Out-of-sample
   * - Sharpe Ratio
     - 2.80
     - 0.15  ✗ (collapsed)
   * - Max Drawdown
     - -8%
     - -55%  ✗ (much worse)
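
The comparison in the two tables can be reduced to a single number. The sketch below computes the fractional drop in Sharpe Ratio from in-sample to out-of-sample; the 30% tolerance mirrors the annotation in the healthy example and is a guideline, not a hard rule.

.. code-block:: python

   def sharpe_degradation(is_sharpe, oos_sharpe):
       """Fractional drop in Sharpe Ratio from in-sample to out-of-sample."""
       if is_sharpe <= 0:
           raise ValueError("in-sample Sharpe must be positive")
       return (is_sharpe - oos_sharpe) / is_sharpe

   TOLERANCE = 0.30  # assumed guideline taken from the tables above

   for label, is_sr, oos_sr in [("healthy", 1.40, 1.15), ("overfit", 2.80, 0.15)]:
       drop = sharpe_degradation(is_sr, oos_sr)
       verdict = "ok" if drop <= TOLERANCE else "collapsed"
       print(f"{label}: {drop:.0%} drop, {verdict}")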

----

Practical checklist
--------------------

Before trusting any strategy result, run through this checklist:

.. list-table::
   :header-rows: 1
   :widths: 10 90

   * - ✓
     - Check
   * -
     - The strategy has at least **50 completed trades** in the test period.
   * -
     - The strategy was **not** optimised on the final 20% of the date range.
   * -
     - The Sharpe Ratio is **below 2.5** (or you have a clear explanation if it's higher).
   * -
     - The number of parameters is **≤ trades / 10**.
   * -
     - The best parameter is **not an isolated spike** in the optimisation table.
   * -
     - The strategy's logic has a **clear economic rationale** (not just "these numbers worked").
   * -
     - Performance was tested in both **bull and bear** market periods.
   * -
     - **Commissions and slippage** are included in the results.