Purged and Embargo



This implementation is based on Chapter 7 of the book Advances in Financial Machine Learning.

Cross-validation is used to reduce the probability of overfitting, and the book recommends it as the main tool of research. This implementation adds two innovations to the classical K-Fold cross-validation implemented in sklearn.

  1. The first one is a process called purging, which removes from the training set those samples that are built with information that overlaps samples in the testing set. More details on this in section 7.4.1, page 105.

purging image

Image showing the process of purging. Figure taken from page 107 of the book.
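
The idea of purging can be illustrated with a minimal sketch. It is not the library's implementation: the function name `purge_train_indices` and the representation of each sample as a `(start, end)` information interval are assumptions made for illustration. A training sample is dropped whenever its interval overlaps the span covered by the test samples.

```python
def purge_train_indices(intervals, test_indices):
    """Return training indices whose interval does not overlap the test span.

    intervals: list of (start, end) pairs, one per sample, with start <= end.
    test_indices: indices of the samples in the test fold (hypothetical input).
    """
    test_set = set(test_indices)
    # Span of information used by the test samples.
    test_start = min(intervals[i][0] for i in test_indices)
    test_end = max(intervals[i][1] for i in test_indices)

    kept = []
    for i, (start, end) in enumerate(intervals):
        if i in test_set:
            continue  # test samples are never part of the training set
        overlaps = start <= test_end and end >= test_start
        if not overlaps:
            kept.append(i)
    return kept


# Ten samples; the label of sample i is built from information in [i, i + 2].
intervals = [(i, i + 2) for i in range(10)]
test_indices = [4, 5]
train_indices = purge_train_indices(intervals, test_indices)
print(train_indices)  # [0, 1, 8, 9]
```

In this toy example, samples 2, 3, 6 and 7 are purged because their intervals share information with the test span [4, 7], even though they are not in the test fold themselves.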

  2. The second innovation is a process called embargo, which removes from the training set a number of observations that immediately follow the test set. This further prevents leakage where the purging process is not enough. More details on this in section 7.4.2, page 107.

embargo image

Image showing the process of embargo. Figure taken from page 108 of the book.
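
A minimal sketch of an embargo, under stated assumptions: samples are ordered in time and identified by integer position, the test fold is contiguous, and the embargo length is a fraction of the total number of samples. The function name `embargo_train_indices` and the parameter `pct_embargo` are hypothetical names chosen for this example.

```python
def embargo_train_indices(train_indices, test_indices, n_samples, pct_embargo):
    """Drop training samples that fall right after the end of the test fold.

    pct_embargo: embargo length as a fraction of n_samples (assumed convention).
    """
    embargo_size = int(n_samples * pct_embargo)
    last_test = max(test_indices)
    embargo_end = last_test + embargo_size
    # Keep everything except the positions in (last_test, embargo_end].
    return [i for i in train_indices if not (last_test < i <= embargo_end)]


n_samples = 100
test_indices = list(range(40, 60))
test_set = set(test_indices)
train_indices = [i for i in range(n_samples) if i not in test_set]

kept = embargo_train_indices(train_indices, test_indices, n_samples, 0.05)
print(len(train_indices), len(kept))  # 80 75
```

Here the five observations at positions 60 to 64 are excluded from training. Observations before the test fold are left untouched, since the embargo only applies to samples that follow it.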


Implementation

Code implementation demo

Research Notebook

Notebook demo

Presentation Slides

../_images/lecture_4.png

Note

These slides combine several lectures, so you will need to scroll to find the relevant sections:

  • pg 14-18: CV in Finance

  • pg 30-34: Hyper-parameter Tuning with CV

  • pg 122-126: Cross-Validation