Note

The following implementation and documentation closely follow the work of Dr. Gautier Marti: CorrGAN: Sampling Realistic Financial Correlation Matrices using Generative Adversarial Networks.

Warning

To use this module, you must additionally install TensorFlow. For more details, visit our MlFinLab installation guide.

CorrGAN




Dr. Gautier Marti proposed a novel approach for sampling realistic financial correlation matrices. He used a generative adversarial network (a GAN, named CorrGAN) to recover most of the known “stylized facts” about empirical correlation matrices based on asset returns.

It was trained on approximately 10,000 empirical correlation matrices estimated on S&P 500 returns sorted by a permutation induced by a hierarchical clustering algorithm.
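As an illustration of what "sorted by a permutation induced by a hierarchical clustering algorithm" can mean in practice, the sketch below reorders a correlation matrix using SciPy. It is not the CorrGAN preprocessing code: the distance `1 - corr`, the `average` linkage method, and the helper name `sort_by_hierarchical_clustering` are assumptions made for this example.

```python
import numpy as np
from scipy.cluster.hierarchy import leaves_list, linkage
from scipy.spatial.distance import squareform


def sort_by_hierarchical_clustering(corr, method="average"):
    """Reorder a correlation matrix by the leaf order of a dendrogram.

    The distance (1 - corr) and the default linkage method are
    assumptions for this sketch, not taken from the CorrGAN source.
    """
    corr = np.asarray(corr, dtype=float)
    dist = 1.0 - corr                  # high correlation -> small distance
    np.fill_diagonal(dist, 0.0)        # guard against rounding on the diagonal
    condensed = squareform(dist, checks=False)
    perm = leaves_list(linkage(condensed, method=method))
    # Apply the same permutation to rows and columns.
    return corr[np.ix_(perm, perm)], perm


# Toy example: two groups of assets whose members are interleaved.
labels = np.array([0, 1, 0, 1, 0, 1])
demo_corr = np.where(labels[:, None] == labels[None, :], 0.8, 0.1)
np.fill_diagonal(demo_corr, 1.0)

sorted_corr, perm = sort_by_hierarchical_clustering(demo_corr)
print(perm)          # assets of the same group end up next to each other
print(sorted_corr)   # block structure becomes visible along the diagonal
```

After sorting, strongly correlated assets sit in contiguous blocks, which gives the matrices a more regular, image-like layout.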

Empirical vs Generated Correlation Matrices

(Left) Empirical correlation matrix estimated on stock returns; (Right) GAN-generated correlation matrix. (Courtesy of marti.ai)

Gautier Marti found that previous methods for generating realistic correlation matrices were lacking. Other authors have likewise noted that “there is no algorithm available for the generation of reasonably random financial correlation matrices with the Perron-Frobenius property. […] we expect the task of finding such correlation matrices to be highly complex.”

The Perron-Frobenius property is one of many “stylized facts” that financial correlation matrices exhibit, and it is difficult to reproduce with previous methods.

For these reasons, being able to generate any number of realistic correlation matrices is a significant step forward.

CorrGAN generates correlation matrices that have many “stylized facts” seen in empirical correlation matrices. The stylized facts CorrGAN recovered are:

  1. Distribution of pairwise correlations is significantly shifted to the positive.

  2. Eigenvalues follow the Marchenko-Pastur distribution, except for a very large first eigenvalue (the market).

  3. Eigenvalues follow the Marchenko-Pastur distribution, except for a couple of other large eigenvalues (industries).

  4. Perron-Frobenius property (first eigenvector has positive entries).

  5. Hierarchical structure of correlations.

  6. Scale-free property of the corresponding Minimum Spanning Tree (MST).
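Several of the stylized facts above can be checked numerically on any correlation matrix, empirical or generated. The following is a minimal sketch using NumPy only. The helper name `stylized_fact_summary`, the `n_obs` argument (the number of return observations behind the estimate), and the toy one-factor matrix are assumptions for illustration. It covers facts 1 to 4 and does not test the hierarchical structure or the MST property.

```python
import numpy as np


def stylized_fact_summary(corr, n_obs):
    """Summarise a few stylized facts of an (n, n) correlation matrix.

    n_obs is the number of observations used to estimate the matrix;
    it is needed for the Marchenko-Pastur upper edge.
    """
    corr = np.asarray(corr, dtype=float)
    n = corr.shape[0]

    # Fact 1: pairwise (off-diagonal) correlations are shifted to the positive.
    off_diag = corr[np.triu_indices(n, k=1)]

    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(corr)

    # Facts 2-3: eigenvalues above the Marchenko-Pastur upper edge,
    # (1 + sqrt(n / n_obs))**2 for unit-variance data, carry structure.
    mp_upper = (1.0 + np.sqrt(n / n_obs)) ** 2
    n_above = int((eigvals > mp_upper).sum())

    # Fact 4: Perron-Frobenius property. The eigenvector sign is arbitrary,
    # so accept entries that are all positive or all negative.
    first_vec = eigvecs[:, -1]
    perron_frobenius = bool(np.all(first_vec > 0) or np.all(first_vec < 0))

    return {
        "mean_pairwise_corr": float(off_diag.mean()),
        "largest_eigenvalue": float(eigvals[-1]),
        "mp_upper_edge": float(mp_upper),
        "n_eigenvalues_above_mp": n_above,
        "perron_frobenius": perron_frobenius,
    }


# Toy one-factor ("market") matrix: every pair of assets has correlation 0.3.
demo_corr = np.full((50, 50), 0.3)
np.fill_diagonal(demo_corr, 1.0)

summary = stylized_fact_summary(demo_corr, n_obs=250)
print(summary)
```

On the toy matrix, a single eigenvalue lies above the Marchenko-Pastur edge and plays the role of the market mode. A realistic matrix would typically show a few more, corresponding to industries.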


Implementation

Code implementation demo

Example

Note

Because the trained CorrGAN models are too large (> 100 MB) to be included in the mlfinlab package, we provide them as a separate downloadable package here. Extract the corrgan_models folder and copy it into the mlfinlab folder.

Note

The higher the dimension, the longer it takes CorrGAN to generate a sample. For more information, refer to the research notebook.

Code example demo

Research Notebook

The following research notebook can be used to better understand the sampled correlation matrices.

Notebook demo

Research Article



Presentation Slides