Investable and Interpretable Machine Learning for Equities (2022)
Yimou Li, Zachary Simon, David Turkington
The Journal of Financial Data Science Winter 2022, 4(1), URL
It has been a while since my last post on the applications of machine learning in asset management. Regular readers of this blog know that this is one of my favorite topics and I recently found new interesting material. This week’s AGNOSTIC Paper is the first of two studies and examines an important issue with machine learning models in great detail: interpretability…
Everything that follows is only my summary of the original paper. So unless indicated otherwise, all tables and charts belong to the authors of the paper and I am just quoting them. The authors deserve full credit for creating this material, so please always cite the original source.
Setup and Idea
For a deeper introduction to machine learning in investing, I recommend reading my post on Big Data & Machine Learning in Asset Management before continuing with this one. For those who don’t want to do that, here is a very brief summary to get everyone on the same page. Machine learning is a subfield of artificial intelligence and refers to a collection of advanced algorithms and models. Most importantly, machine learning models typically overcome limitations of simpler statistical models and are specifically designed to predict out-of-sample. Their ability to combine a lot of information into one forecast is very interesting for investing because ultimately, that is exactly our job as investors: collecting information, processing it into a forecast, and trading on it.
As a consequence, both the academic literature and the investment industry left the “traditional” world of linear econometrics some years ago and moved on to more sophisticated machine learning models. This is of course not really surprising, as we would expect investment processes to continue to evolve and to apply the latest methodology to remain competitive. The literature and most practitioners regard machine learning as the logical next step of quantitative/systematic investing but caution that its abilities tend to be overestimated.
Specifically, financial markets come with three important problems for machine learning. First, by the standards of other areas like image recognition, we simply don’t have much data in the low-frequency space. Second, competition among investors naturally leads to low signal-to-noise ratios. Third and finally, the rules in markets are constantly changing and what worked in the past could stop working at some point in the future.
Another important issue for asset management is the “black-box” problem. Using more sophisticated non-linear models inevitably comes with more complexity, and it is often hard to explain the machine’s forecasts. This is a problem for asset managers because they have a fiduciary duty. Even if it might be the honest answer, it probably won’t help to tell your clients “Our computer decided to buy Apple with 5% of your money but we really have no idea why.” In contrast to some other disciplines, asset managers must therefore find a balance between applying more sophisticated models and preserving enough interpretability to educate their clients about what is going on with their money. The sales pitch for this is the transformation of “black-boxes” into interpretable “glass-boxes”… This week’s paper focuses almost exclusively on this issue and presents a methodology to explain the forecasts of common machine learning models.
Data and Methodology
To be honest, the empirical analysis in the paper is (in my opinion) more of an illustrative example to show the authors’ methodology for interpreting the machine learning models. They use four machine learning models and simple linear regression as a benchmark to predict stock returns in the S&P 500 universe between December 1992 and September 2020. By our common standards, this is both a narrow and short sample for US stocks. However, the authors use this liquid universe on purpose to ensure that their strategies are investable in the real world. But as I mentioned, view this as an illustration rather than a comprehensive backtest…
The authors feed their models with 16 variables that mostly come from the factor investing literature (past prices, fundamentals, risk measures, etc.). The data comes from Refinitiv and State Street, so data quality shouldn’t be an issue. To fit their models, the authors follow the best practices of machine learning and use the period from 1992 to 2014 as training data. The out-of-sample testing data ranges from January 2015 to September 2020. This is again quite short, but as I mentioned, please regard it as an illustration rather than a thorough test.
The heart of the paper is the model fingerprint methodology to understand what the machine is doing. According to the authors, this methodology isolates the linear, non-linear, and interaction effects of each input variable. The fingerprints provide information about the importance of individual features and therefore help to interpret the model. I will not go into the technical details of this methodology, but the authors explain that fingerprints show “[…] the average extent to which changes in the predictor influence the prediction globally.” They also explain that fingerprints are computationally more efficient than Shapley values, the current standard methodology to interpret machine learning models. Next week’s authors, however, criticize fingerprints as too simple compared to Shapley values. In practice, it is probably best to look at both of them to get a broader perspective.
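The paper’s exact fingerprint formulas are not reproduced here, but the core idea can be sketched: for each input, compute a partial-dependence curve (the average prediction as that input varies while the others stay at their observed values), fit a straight line to it, and call the line’s deviation from the average prediction the linear effect and the curve’s deviation from the line the non-linear effect. Below is a minimal toy sketch of that decomposition; the data, model, and function names are my own illustration, not the authors’ code.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression

# Toy data: feature 0 acts linearly, feature 1 non-linearly, feature 2 is noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = X[:, 0] + X[:, 1] ** 2 + rng.normal(scale=0.1, size=500)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

def partial_dependence(model, X, j, grid_size=20):
    """Average prediction as feature j sweeps a grid, others held at data values."""
    grid = np.linspace(X[:, j].min(), X[:, j].max(), grid_size)
    pd_vals = np.empty(grid_size)
    for g, v in enumerate(grid):
        Xg = X.copy()
        Xg[:, j] = v
        pd_vals[g] = model.predict(Xg).mean()
    return grid, pd_vals

def fingerprint(model, X, j):
    """Split feature j's partial dependence into linear and non-linear effects."""
    grid, pd_vals = partial_dependence(model, X, j)
    lin = LinearRegression().fit(grid.reshape(-1, 1), pd_vals)
    lin_pred = lin.predict(grid.reshape(-1, 1))
    linear_effect = np.mean(np.abs(lin_pred - pd_vals.mean()))
    nonlinear_effect = np.mean(np.abs(pd_vals - lin_pred))
    return linear_effect, nonlinear_effect

for j in range(3):
    lin_e, nonlin_e = fingerprint(model, X, j)
    print(f"feature {j}: linear={lin_e:.3f}  non-linear={nonlin_e:.3f}")
```

On this toy data, the linearly acting feature shows a dominant linear effect and the squared feature a dominant non-linear effect, which is exactly the kind of attribution the fingerprint charts in the paper display (the paper additionally isolates pairwise interaction effects, which this sketch omits).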
To sum up, the authors present fingerprints as one particular method to interpret return forecasts of machine learning models and illustrate it with machine learning based stock selection in the S&P 500 universe.
Important Results and Takeaways
Machine learning models outperform simpler methods
The authors use linear regression (OLS) and four machine learning models (LASSO, Random Forest, Boosted Trees, Neural Network) to predict the next month’s return for each stock in the S&P 500 index. For their hypothetical trading strategy, they rank stocks according to this forecast and go long (short) an equal weighted portfolio of the 20% of stocks with the highest (lowest) forecasted returns. Given the S&P 500 universe, the strategy holds about 100 stocks on both the long and the short side at each point in time. The strategy rebalances monthly and the following table summarizes the annualized performance.
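The ranking-and-bucketing step described above can be sketched in a few lines. The data below is made up for illustration; only the construction rule (equal-weight long the top 20% of forecasts, short the bottom 20%) follows the paper.

```python
import numpy as np
import pandas as pd

# Toy universe of 500 stocks with hypothetical forecasts and realized returns.
rng = np.random.default_rng(42)
names = [f"stock_{i}" for i in range(500)]
forecasts = pd.Series(rng.normal(size=500), index=names)
next_returns = pd.Series(rng.normal(scale=0.05, size=500), index=names)

def quintile_long_short(forecasts, next_returns):
    """Equal-weight long the top forecast quintile, short the bottom quintile."""
    ranks = forecasts.rank(pct=True)
    long_leg = next_returns[ranks > 0.8].mean()    # ~100 stocks in an S&P 500 universe
    short_leg = next_returns[ranks <= 0.2].mean()  # ~100 stocks
    return long_leg - short_leg

print(f"monthly long-short return: {quintile_long_short(forecasts, next_returns):.4%}")
```

In the paper, this is repeated every month with fresh forecasts from each model; with random forecasts as above, the spread is of course just noise.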
Given their superior attributes, I think it is no surprise that the more sophisticated machine learning models performed better than simpler methods. Random Forest, Boosted Trees, and Neural Networks generated higher returns and better risk-adjusted performance than OLS and the model-free “Equal Factor Weights” benchmark. While impressive, it is worth mentioning that those returns come with more turnover. The authors report turnover as an annual multiple, which translates directly into trading costs. For example, if a round-trip transaction costs 20 basis points, the Neural Network strategy generates trading costs of 4.8 x 0.20% = 0.96% per year.
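As a quick sanity check on that cost arithmetic (the 20 basis points round-trip cost is the hypothetical figure from the example; the 4.8 turnover multiple is the paper’s Neural Network figure):

```python
round_trip_cost = 0.0020          # assumed 20 bps per round trip (hypothetical)
annual_turnover_multiple = 4.8    # Neural Network strategy, from the paper

# Annual trading cost = turnover multiple times per-round-trip cost.
annual_cost = annual_turnover_multiple * round_trip_cost
print(f"annual trading cost: {annual_cost:.2%}")  # prints "annual trading cost: 0.96%"
```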
Different models learn different investment approaches
Now that we have seen machine learning models outperform simpler methods, let’s go to the heart of the paper and examine how they did it. Panel A of the following chart shows model fingerprints for total returns over the next month. The dark blue bars are linear effects, the light blue bars are non-linear effects, and the orange bars are interactions. An easy way to interpret the chart is “The larger the bars, the more important the input for the prediction.” From the authors’ (and my personal) perspective, the most interesting observation is that different models “learn” different approaches. For example, Random Forests seem to focus on Volatility and Beta, whereas Boosted Trees put more emphasis on Value and Momentum. This is quite similar to human portfolio managers; we all follow somewhat different approaches…
Panel B of the chart further shows that the importance of variables also depends on the time frame. When the models are fitted to predict the return over the next 12 months, the picture changes quite considerably. For example, the importance of Value and Momentum tends to increase and there seems to be a bit more agreement among the models.
Conclusions and Further Ideas
I already mentioned it several times, but I think it is important to repeat that the data analysis of this week’s paper is nothing special and we shouldn’t put too much emphasis on specific numbers. Despite that, it is very interesting to see what machine learning models are doing under the hood. And given the fiduciary duty of asset managers, this is a very important aspect of the adoption of machine learning in the investment world.
In my opinion, it is great that the authors illustrate a potential solution for this trade-off. Additionally, I also find it interesting that machine learning models seem to focus on the same overall patterns as factor investors. We cannot directly conclude this from this week’s paper because the input variables themselves already come from the factor investing literature, but other researchers came to the same conclusion. Overall, this underlines the current consensus that machine learning is the logical evolution of quantitative investing and helps to implement the underlying ideas of factor investing even better.
- AgPa #56: The Equity Risk Premium of Small Businesses
- AgPa #55: Backtests in the Age of Machine Learning
- AgPa #54: Transitory Inflation
- AgPa #53: Investing in Interesting Times
This content is for educational and informational purposes only and no substitute for professional or financial advice. The use of any information on this website is solely at your own risk and I do not take responsibility or liability for any damages that may occur. The views expressed on this website are solely my own and do not necessarily reflect the views of any organisation I am associated with. Income- or benefit-generating links are marked with a star (*). Please also read the Disclaimer.