AgPa #65: Machine-Learned Manager Selection (1/4)

Machine-Learning the Skill of Mutual Fund Managers (2022)
Ron Kaniel, Zihan Lin, Markus Pelger, Stijn Van Nieuwerburgh
NBER Working Paper 29723

To conclude the posts on manager selection, at least for the moment, I will dive into one of the most recent research frontiers in this area. Since the application of machine learning to equities has been studied intensively for more than three years now, it is not surprising that researchers have started to apply such algorithms to other asset classes. Equity mutual funds are a natural candidate, and this is exactly where this and the next three weeks’ AGNOSTIC Papers come in.

  • Week 1: US Mutual Funds – Alphas
  • Week 2: US Mutual Funds – Long Only
  • Week 3: US Mutual Funds – Total Returns
  • Week 4: Hedge Funds

Everything that follows is only my summary of the original paper. So unless indicated otherwise, all tables and charts belong to the authors of the paper and I am just quoting them. The authors deserve full credit for creating this material, so please always cite the original source.

Setup and Idea

There is not much to say about the setup and idea. Although financial markets come with certain challenges for the successful application of machine learning, investors and fund managers have no choice but to adapt to those more advanced models. For more details about machine learning and its applications in the investment industry, I recommend reading my post from December 2021. While the possible use-cases are wide-ranging, the “holy grail” of machine learning for portfolio managers obviously remains to train a model that somehow reliably predicts future returns. The problem: compared to common machine learning applications like image-recognition, this is a very difficult task. Picking an outperforming stock is much more difficult than recognizing a cat and the evidence suggests that even most humans fail to do it successfully. Nonetheless, the literature still suggests that machine learning techniques are better suited to it than “traditional” econometrics or simple rankings of stocks.

Given those promising results within equities, it is not surprising that people start to apply the methods in other asset classes. Investors want to make money and researchers want to publish novel papers; that is just how it works. In this week’s paper, the authors apply a neural network, one particular machine learning algorithm, to a sample of US mutual funds and find surprisingly robust patterns.

Data and Methodology

The authors use the CRSP Mutual Fund Database to obtain a sample of 3,275 US mutual funds for the period between January 1980 and January 2019. They also source holdings for those funds from the Thomson Financial Mutual Fund Holdings database. Both are professional vendors, so data quality shouldn’t be an issue. Most importantly, those databases control for survivorship-bias which is critically important for mutual funds.

To evaluate the performance of funds and to train their machine learning model, the authors estimate the funds’ alpha against the Carhart 4-factor-model (1997) which includes the overall stock market and the well-known factors value, size, and momentum. The authors calculate those 4-factor-alphas from simple time-series regressions over a rolling 36-month-window for each fund and month.
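To make the rolling regression more concrete, here is a minimal sketch in Python. It is only an illustration on synthetic data, not the authors’ code: the function name `rolling_alpha`, the simulated factor returns, and the choice to read the alpha off the regression intercept are assumptions, and the paper’s exact alpha definition may differ in detail.

```python
import numpy as np

def rolling_alpha(excess_ret, factors, window=36):
    """Monthly alpha (regression intercept) from rolling OLS windows.

    excess_ret: shape (T,), fund return minus the risk-free rate
    factors:    shape (T, 4), market, size, value and momentum factor returns
    Returns an array of shape (T,) that is NaN until a full window exists.
    """
    T = len(excess_ret)
    alphas = np.full(T, np.nan)
    for t in range(window - 1, T):
        y = excess_ret[t - window + 1 : t + 1]
        X = factors[t - window + 1 : t + 1]
        X = np.column_stack([np.ones(window), X])  # intercept column = alpha
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        alphas[t] = coef[0]
    return alphas

# Synthetic illustration (made-up numbers, not the paper's data)
rng = np.random.default_rng(0)
T = 120
factors = rng.normal(0.0, 0.03, size=(T, 4))
true_alpha = 0.002                        # hypothetical 20 bps per month
betas = np.array([1.0, 0.2, -0.1, 0.05])  # hypothetical factor loadings
excess_ret = true_alpha + factors @ betas + rng.normal(0.0, 0.001, size=T)

alphas = rolling_alpha(excess_ret, factors)
print(alphas[35:40])  # first estimates once 36 months are available
```

In a real application, this loop would run once per fund, and the factor returns would come from a data library rather than a random number generator.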

Table 1 of Kaniel et al. (2022).

To feed their model with information, the authors use a set of 60 well-known characteristics for which the literature found significant relations with future returns. Specifically, they use 46 variables that capture the characteristics of the underlying stock holdings of a fund (see 1-46 in the table above), and 13 variables that are related to the fund itself (see 47-59). They also include two variants of a time-series variable that captures investor sentiment and the macroeconomic regime. Overall, this is a pretty comprehensive set of features and many of them actually capture what we know from the major factors.[1]

When it comes to the machine learning algorithm, the authors follow the results of the landmark paper by Gu et al. (2020) and use an artificial neural network, as this appears to be the best model for predicting stock returns.[2] I am not an expert in machine learning. However, I believe it would have been more robust to also use at least some of the other popular models, for example a random forest. At least to the best of my knowledge, an ensemble of multiple models is considered best practice among practitioners. In addition to that, I think we simply don’t know how transferable machine learning models are between asset classes. To their credit, the authors provide some results from gradient boosted trees in the Internet Appendix and find that they are very similar to those of the neural network.

Before going into the results, one further important note. As I mentioned in the introduction, the application of machine learning to mutual funds is a very recent topic and thus at a quite early stage. This week’s paper is therefore not yet published or peer-reviewed. Having said that, the authors enjoy good reputations and the paper has been discussed at top institutions like Oxford or Stanford. The results are therefore probably not final, but I think they still give a reliable indication of what is possible for mutual fund investors.

Important Results and Takeaways

Machine learning helps to identify outperforming funds

The heart of the paper is the model to predict future fund performance. For that purpose, the authors predict the next month’s alpha of each fund and sort the funds into decile-portfolios. For each of those fund-of-fund portfolios, they calculate two weighting-schemes. The first is simple equal-weighting, the second weights funds according to the magnitude of the predicted 4-factor-alpha. The following charts summarize the cumulative performance over the last 40 years.

Figure 5 of Kaniel et al. (2022).

The results are pretty clear. The top-decile of mutual funds earned a cumulative 4-factor-alpha of 72% for prediction-weights, and 48% for equal-weights. In contrast, the worst-decile generated negative alphas of -119% and -93%, respectively. So there are two takeaways. First, machine learning models appear to be indeed helpful to identify alpha-generating funds. Second, avoiding the losers seems to be more profitable than picking the winners.[3]
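The decile sort and the two weighting-schemes can be sketched in a few lines of Python. This is a simplified illustration on simulated numbers, not the authors’ implementation. In particular, the function name `decile_portfolios` is hypothetical, and weighting by the absolute predicted alpha within each decile is an assumption about how prediction-weights could be built.

```python
import numpy as np

def decile_portfolios(pred_alpha, realized_alpha, n_bins=10):
    """Sort funds into bins by predicted alpha and average realized alpha.

    Returns two arrays of length n_bins (bin 0 = lowest predictions):
    the equal-weighted and the prediction-weighted realized alpha per bin.
    """
    order = np.argsort(pred_alpha)        # fund indices, ascending by prediction
    bins = np.array_split(order, n_bins)  # roughly equal-sized groups
    equal_w = np.empty(n_bins)
    pred_w = np.empty(n_bins)
    for i, idx in enumerate(bins):
        equal_w[i] = realized_alpha[idx].mean()
        # Assumption: weights proportional to |prediction| within the bin
        w = np.abs(pred_alpha[idx])
        w = w / w.sum()
        pred_w[i] = np.sum(w * realized_alpha[idx])
    return equal_w, pred_w

# Simulated cross-section for one month (made-up numbers)
rng = np.random.default_rng(1)
n_funds = 1000
pred = rng.normal(0.0, 0.002, size=n_funds)            # predicted alphas
realized = pred + rng.normal(0.0, 0.01, size=n_funds)  # noisy realizations

eq, pw = decile_portfolios(pred, realized)
print("top minus bottom, equal-weighted:     ", eq[-1] - eq[0])
print("top minus bottom, prediction-weighted:", pw[-1] - pw[0])
```

Repeating this sort every month and accumulating the resulting alphas per decile gives cumulative series of the kind shown in the charts.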

The authors further note that the results remain robust for net-returns, i.e. after deducting the expense ratio of the fund. They show that funds across the decile-portfolios don’t have materially different expense ratios and repeat the analysis with net returns. The results are generally similar and even more pronounced for the worst-decile of funds. For the prediction-weighted portfolios, the cumulative 4-factor-alpha of the best funds remains positive at 37% while the negative alpha of the worst funds decreases to a staggering -170% (compared to 72% and -119% before fees). Apart from the predictability, those results once again show the strong impact of fees. Even the best-performing mutual funds capture roughly half of their gross-alpha as fees (72% before fees versus 37% after).

For even more robustness and to test their model in a more long-term oriented setting, the authors next examine a fund-selection-strategy with holding periods of more than one month. The following charts present several statistics. Most importantly, Panels (a) and (d) show the average monthly 4-factor-alpha and the associated t-statistics. The results are again pretty strong. While the alpha obviously decreases over longer holding-periods, it remains significantly positive at about 10bps even 36 months after the prediction of the model. A monthly alpha of 10bps translates into about 1.2% per year, which is still a very decent outperformance. Note, however, that this analysis is based on hypothetical long-short portfolios of the best and worst-decile of mutual funds. Shorting mutual funds is hardly possible in practice, so these results are rather theoretical. Nonetheless, the general result that the prediction is quite long-lived and doesn’t necessarily require fast trading or high turnover remains important and valid.

Figure 9 of Kaniel et al. (2022).
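As a quick check of the annualization, the snippet below converts a monthly alpha of 10bps into a yearly figure, once by simple multiplication and once with compounding. The variable names are only illustrative.

```python
monthly_alpha = 0.0010  # 10 bps per month, as quoted in the text

simple_annual = 12 * monthly_alpha                 # simple scaling
compounded_annual = (1 + monthly_alpha) ** 12 - 1  # with monthly compounding

print(f"simple:     {simple_annual:.4%}")
print(f"compounded: {compounded_annual:.4%}")
```

Both versions come out at roughly 1.2% per year. At this magnitude, compounding adds less than one basis point.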

There are a number of further analyses in the paper which are beyond my summary here. Most of them, however, are just variants of the overall pattern of return-predictability among mutual funds. But before we continue, one further comment that I find quite interesting. As we discussed above, the best fund-of-fund decile-portfolio generates significant 4-factor-alpha. But what does this mean? Well, it means that the machine learning model apparently selects funds according to patterns that are beyond well-known factors like value, momentum, or size. Given that some investors nowadays also apply the major factors to ETFs or mutual funds, I think it is very interesting that the machine-learning prediction goes beyond that.

Less is more – not all information is necessary

In good scientific fashion, the authors next examine what information actually drives their results. For that purpose, they split their features into four groups and repeat their analyses for each of them. The charts below show the results.

Figure 7 of Kaniel et al. (2022).

Although it is not a formal test, the comparison between Panels (a)/(b) and (c)/(d) clearly suggests that fund-characteristics are more important than the characteristics of the underlying stock-holdings. When I first read it, this was honestly somewhat surprising to me. If you have a fund that owns Apple and Microsoft, the characteristics of those stocks should matter for your fund-return, right? Yes and no. The authors explain that much of the information about stock-holdings is already captured by the fact that the model is trained on alphas. By calculating 4-factor-alphas for each fund, you already control for value, size, momentum, and the overall stock market. So you don’t necessarily need those holding-characteristics again to predict the alpha of the fund. In fact, this process makes the whole manager-selection much easier, but more on that below.

Figure 11 of Kaniel et al. (2022).

In some further analyses, the authors also use some state-of-the-art approaches from the machine learning literature to identify the most important variables. I will not go into the technical details, but the chart above provides a good overview. The most relevant variables are the sentiment indicator and the fund characteristics. The authors also argue that especially the interaction effects between the two are important. Needless to say, such interaction effects among a large group of variables are the reason why machine learning models are superior to simpler methods when it comes to the best possible out-of-sample forecast.

Alpha is easier to predict than total returns

I already touched on the issue above, but an important point for the whole paper is the observation-unit. The authors specifically focus on risk-adjusted abnormal returns and train their model on 4-factor-alpha. While this is the theoretically most correct way to evaluate fund-performance, it is not always realistic and critically depends on the sophistication of the investor. For example, you can have positive alpha but still lose money, which is hardly what investors are aiming for.

Having said that, predicting alpha comes with a significant advantage. As we all (hopefully) agree, stock returns are pretty volatile and no one can say where the market will be 1 year from now. This makes it inherently difficult to train a model that gives you a prediction like “This fund will make 1% next month”. It is much easier to remove the most volatile return component and predict how the fund will do relative to the market or, in this case, the 4-factor-model. Of course, the problem remains that you don’t know what the market and the factors will do in the future. If we have a crash like in early 2020 and you lose 15% while the market or the factor model loses 20%, you consider that a success even though you are poorer than before. That’s the curse and blessing of relative performance.
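The crash example can be written out in a few lines. Note that this is a deliberate simplification: it treats relative performance as the plain difference between fund and benchmark return, whereas a proper 4-factor-alpha subtracts the beta-weighted factor returns. The starting wealth of 100 is a hypothetical number.

```python
fund_return = -0.15       # fund loses 15% (example from the text)
benchmark_return = -0.20  # market or factor model loses 20%

relative = fund_return - benchmark_return  # outperformance vs. benchmark
wealth_after = 100 * (1 + fund_return)     # value of 100 invested in the fund

print(f"relative performance: {relative:+.1%}")
print(f"wealth after crash:   {wealth_after:.1f}")
```

The relative performance is positive by five percentage points, but the 100 invested has still shrunk to 85.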

Figure 14 of Kaniel et al. (2022).

Despite those problems, the authors present their analyses for the total returns of the funds in the chart above. The magnitude of returns suggests that the model is still useful, but it is not as pronounced as for the 4-factor-alphas. Also note that when looking at total returns of funds, the stock-characteristics of the holdings become important again. This follows the same logic that I mentioned earlier. You implicitly control for holding-characteristics when calculating the alpha of a fund. So if you don’t do that and focus on the total return, you should consider them in your prediction.

Conclusions and Further Ideas

As I mentioned in the introduction, the paper is not yet published and we therefore need to be somewhat cautious with the results. Yet, I think the overall direction is clear. Given the successes of machine learning methods in other asset classes, I think it is just logical that greedy fund investors and publication-seeking researchers apply the tools in other areas like mutual funds. All irony aside, this is of course great for us as research-consumers because we learn more about the methodology and can use the insights to improve our investment process.

Coming back to my issue of manager selection, I think the results of this week’s paper clearly suggest that some form of machine learning is something fund selectors will need to integrate into their strategies. Given that this is not the only working paper that is currently discussed in this field, I decided to make a little series and will cover three other papers on Machine-Learned Manager Selection over the next weeks. At the time of writing this, I have not read all of them so I cannot tell you what the overall conclusion is going to be. My hope, of course, is that a variety of papers will give us a more robust impression about what we can expect from machine learning for fund and manager selection.



1 Feature is the machine learning jargon for independent variable.
2 Some practitioners disagree with this result. I cannot judge who is right because I don’t have the resources to replicate the results myself.
3 Unfortunately, investors cannot really capitalize on the negative alpha because it is hardly possible to short mutual funds.