Barriers to Entry


As most of you know, getting through the first three chapters of “Advances in Financial Machine Learning” by Marcos López de Prado is challenging because they rely on high-frequency tick data to build the new financial data structures. Such data is hard to source, so we purchased the full history of S&P 500 E-mini futures tick data from TickData LLC.

We are not affiliated with TickData in any way, but we recommend their service. The full history cost us about $750 (in 2019) and is worth every penny. They have done a great job of cleaning the data and providing it in a user-friendly format.

Sample Data

TickData offers about 20 days' worth of raw tick data, which can be downloaded from their website.

For those of you interested in working with two years of sample tick, volume, and dollar bars, these are provided in the sample data.

You should be able to work on a few implementations of the code with this set.
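For readers who have not met dollar bars yet, here is a minimal sketch of the idea: ticks are accumulated until the traded dollar value reaches a threshold, and then one bar is emitted. The `Bar` class, the `dollar_bars` function, and the `(price, size)` tick layout are illustrative assumptions, not the format of the sample data or of the TickData files. The sketch also ignores the futures contract multiplier.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Bar:
    open: float
    high: float
    low: float
    close: float
    volume: float
    dollar_value: float
    n_ticks: int


def dollar_bars(ticks: Iterable[Tuple[float, float]], threshold: float) -> List[Bar]:
    """Group (price, size) ticks into bars of at least `threshold` dollars traded."""
    bars: List[Bar] = []
    prices: List[float] = []
    volume = 0.0
    dollars = 0.0
    for price, size in ticks:
        prices.append(price)
        volume += size
        dollars += price * size
        if dollars >= threshold:
            # Close the bar on the tick that crosses the threshold.
            bars.append(
                Bar(prices[0], max(prices), min(prices), prices[-1],
                    volume, dollars, len(prices))
            )
            prices, volume, dollars = [], 0.0, 0.0
    # Ticks left over below the threshold are dropped in this sketch.
    return bars


# Hypothetical ticks, for illustration only.
ticks = [(100.0, 5), (101.0, 5), (99.0, 10), (100.0, 2)]
bars = dollar_bars(ticks, threshold=1000.0)
for bar in bars:
    print(bar)
```

Volume bars follow the same pattern with `volume >= threshold` as the closing condition, and tick bars with `len(prices) >= threshold`.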

Additional Sources

Searching for free tick data can be a challenging task. The following three sources may help:

  1. Dukascopy. Offers free historical tick data for some futures, though you do have to register.

  2. Most crypto exchanges offer tick data, but not historical data (see the Binance API), so you would have to run a script for a few days to collect it.

  3. Blog Post: How and why I got 75Gb of free foreign exchange “Tick” data.
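The "run a script for a few days" approach from the second source could look roughly like the sketch below. It polls a recent-trades function, skips trades it has already seen, and appends the rest to a CSV file. Everything here is an assumption for illustration: `collect_trades` is a hypothetical helper, `fetch_recent` stands in for whatever call your exchange client provides, and the `id`, `time`, `price`, and `qty` field names should be checked against the exchange's documentation. The demo uses a fake fetcher so that it runs without network access.

```python
import csv
import os
import tempfile
import time
from typing import Any, Callable, Dict, List

Trade = Dict[str, Any]


def collect_trades(fetch_recent: Callable[[], List[Trade]], path: str,
                   polls: int, pause_seconds: float = 1.0) -> int:
    """Poll `fetch_recent` and append unseen trades to a CSV file at `path`."""
    last_id = -1
    written = 0
    with open(path, "a", newline="") as handle:
        writer = csv.writer(handle)
        for _ in range(polls):
            for trade in sorted(fetch_recent(), key=lambda t: t["id"]):
                if trade["id"] > last_id:  # skip trades seen in an earlier poll
                    writer.writerow([trade["id"], trade["time"],
                                     trade["price"], trade["qty"]])
                    last_id = trade["id"]
                    written += 1
            handle.flush()
            time.sleep(pause_seconds)
    return written


# Fake, overlapping batches standing in for a real exchange response.
batches = iter([
    [{"id": 1, "time": 1000, "price": "10.0", "qty": "1"},
     {"id": 2, "time": 1001, "price": "10.1", "qty": "2"}],
    [{"id": 2, "time": 1001, "price": "10.1", "qty": "2"},
     {"id": 3, "time": 1002, "price": "10.2", "qty": "1"}],
])


def fake_fetch() -> List[Trade]:
    return next(batches)


path = os.path.join(tempfile.mkdtemp(), "trades.csv")
written = collect_trades(fake_fetch, path, polls=2, pause_seconds=0.0)
with open(path, newline="") as handle:
    rows = list(csv.reader(handle))
```

If more trades occur between two polls than one response returns, this sketch silently leaves a gap, so the polling interval needs to be short relative to the trading activity.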