**Extreme Stock Market Performers, Part IV: Can Observable Characteristics Forecast Outcomes (2020)**

*Hendrik Bessembinder*

SSRN Working Paper, URL

This is the fifth of seven AGNOSTIC Papers on the extreme concentration in stock markets. As promised last week, this one finally examines how to identify the few big winners *ex-ante* (at least it tries). So it might actually be the most interesting and relevant post for real-world investing. As we will see, *future* big winners have some distinct fundamental characteristics that are indeed observable *today*. That said, the picture remains noisy and it’s very difficult to find them systematically. But before getting too depressed right at the beginning, let’s look at the paper.

By now you should know what is coming. I recommend reading the posts of this series chronologically. But as always, feel free to use the following list to do whatever feels good for you.

- Week 1: Concentration in the US Stock Market between 1926 and 2019
- Week 2: Concentration in Global Stock Markets between 1990 and 2019
- Week 3: Dominance of the Tech-Industry?
- Week 4: Characteristics of Big Winners?
- **Week 5: Identifying Big Winners Upfront?**
- Week 6: Even Big Winners had Bad Drawdowns
- Week 7: The Same Pattern for US Mutual Funds

Everything that follows is only my summary of the original paper. So unless indicated otherwise, all tables and charts belong to the authors of the paper and I am just quoting them. The authors deserve full credit for creating this material, so please always cite the original source.

## Data and Methodology

The data and methodology are mostly identical to the last two weeks. The author uses stock market data from CRSP to construct a sample that includes 26,285 US companies between 1950 and 2019. So it is very similar to the US sample of the original paper. For fundamental data, he uses Compustat. Both services are state of the art for finance research, so data quality shouldn’t be an issue. The two main concepts remain total returns in excess of cash^{1} and Shareholder Wealth Creation (SWC).^{2}

The author again calculates returns and SWC for non-overlapping decades. But since he now focuses on predictive analyses, he can only use the six decades between 1960 and 2019. The first decade has to be excluded because there is no preceding data to predict returns and SWC. The following chart (hopefully) clarifies this structure.

For each of the six decades, the author identifies the best and worst 200 companies^{3} in terms of total returns and SWC (Top 200 and Bottom 200, respectively). Finally, he determines the average fundamentals for each company over the respective *prior* decade. As illustrated above, this generates a dataset of future returns and past fundamentals for predictive analyses.
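To make this structure concrete, here is a minimal Python sketch with purely synthetic data. It is only my illustration of the setup described above, not the author's code: the function names (`label_extremes`, `build_rows`), the single characteristic `x`, and the 1,000 random firms per decade are all assumptions.

```python
import random

def label_extremes(outcomes, k):
    """Return (top, bottom) sets of firm ids with the k highest / lowest outcomes."""
    ranked = sorted(outcomes, key=outcomes.get)  # firm ids, ascending by outcome
    return set(ranked[-k:]), set(ranked[:k])

def build_rows(outcomes_by_decade, fundamentals_by_decade, decades, k):
    """Pair each firm's labels in one decade with its fundamentals from the prior decade."""
    rows = []
    for prev, curr in zip(decades, decades[1:]):
        top, bottom = label_extremes(outcomes_by_decade[curr], k)
        for firm in outcomes_by_decade[curr]:
            if firm not in fundamentals_by_decade[prev]:
                continue  # no prior-decade data, so nothing to predict with
            rows.append({
                "firm": firm,
                "decade": curr,
                "top": int(firm in top),        # indicator: 1 if in the Top k
                "bottom": int(firm in bottom),  # indicator: 1 if in the Bottom k
                "x": fundamentals_by_decade[prev][firm],
            })
    return rows

# Synthetic example: seven decades of data yield six predictable decades.
random.seed(0)
decades = [1950, 1960, 1970, 1980, 1990, 2000, 2010]
firms = [f"firm_{i}" for i in range(1000)]  # hypothetical firms
outcomes = {d: {f: random.gauss(0, 1) for f in firms} for d in decades}
fundamentals = {d: {f: random.gauss(0, 1) for f in firms} for d in decades}

rows = build_rows(outcomes, fundamentals, decades, k=200)
n_top = sum(r["top"] for r in rows)
n_bottom = sum(r["bottom"] for r in rows)
print(n_top, n_bottom)  # 200 firms x 6 decades in each list
```

The 1950s only serve as the "prior decade" for the 1960s, which is why the first decade drops out of the list of predicted decades.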

With respect to observable characteristics, the author uses the same 22 fundamental- and market-variables as in the previous paper. Most of them are fairly self-explanatory, so I will not repeat them at this point and you will find them in the tables below. Just one reminder: the fundamental variables are somewhat arbitrary but they cover many aspects of the underlying business (for example Asset Growth or Income-to-Asset Ratios). Overall, the selection is fairly robust and the author hasn’t cherry-picked a few variables that worked well in the past.

Despite its problems with outlier-affected data, standard linear regression is again the main methodology of the paper. Specifically, the author regresses indicator variables for the Top and Bottom 200 on the fundamental characteristics of the prior decade.^{4} As I mentioned last week, I don’t think linear regression is the right methodology for this problem. But since the paper is structured that way, I will present the results anyway. However, I encourage everybody to focus more on the overall pattern and discount the results appropriately.
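For readers who want to see what "regressing an indicator on a characteristic" looks like mechanically, here is a small sketch in plain Python. It uses a single explanatory variable and classical (homoskedastic) standard errors, both of which are simplifying assumptions on my side. The paper uses many variables and I don't know its exact standard-error treatment. The function `ols_with_t` and the simulated data are hypothetical.

```python
import math
import random

def ols_with_t(x, y):
    """Simple OLS of y on x with an intercept; returns (slope, t_statistic)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    sigma2 = sum(e * e for e in residuals) / (n - 2)  # residual variance
    se_slope = math.sqrt(sigma2 / sxx)  # classical standard error of the slope
    return slope, slope / se_slope

# Hypothetical data: a higher x slightly raises the chance of being a "Top 200" firm.
random.seed(1)
x = [random.gauss(0, 1) for _ in range(5000)]
y = [1 if random.random() < 0.05 + 0.02 * max(min(xi, 2), -2) else 0 for xi in x]

slope, t_stat = ols_with_t(x, y)
print(round(slope, 4), round(t_stat, 2))
```

Because the dependent variable is either 0 or 1, the slope can be read as the change in the probability of landing in the Top 200 per unit of `x`. This is also one source of the methodological concerns: nothing prevents the fitted "probabilities" from falling below 0 or above 1.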

## Important Results and Takeaways

### Future top-performers tend to be younger, produce higher drawdowns, and spend more on R&D

Let’s start with the firms that generated the highest- and lowest decade-returns in the sample. The following table summarizes the regression results for the Top- and Bottom 200. As I mentioned before, there are some methodological problems, so I will do my best and focus on the few variables with the highest statistical significance.

Based on those results, future top-performers tend to be younger companies (*t = -3.01*), produce higher drawdowns (*t = 5.46*), and spend relatively more on R&D (*t = 3.78*). For all remaining variables, the t-statistics do not exceed the author’s conservative threshold of +/- 3.^{5} However, some other variables come fairly close and may also be relevant for forecasting top-performers. For example, asset growth net of issues (*t = 2.26*) and the standard deviation of stock returns (*t = -2.17*).

On the other end, there are slightly more significant effects for the Bottom 200. Those firms tend to be less profitable (*t = -5.24*), spend less on R&D (*t = -4.59*), have more volatile returns (*t = 5.74*), and are more indebted (*t = 3.94*). They also tend to be younger (*t = -2.93*), so company age is unfortunately (but also unsurprisingly) not a distinct characteristic of top-performers.

In my opinion, most of those statistical patterns are quite intuitive and certainly help to avoid stupidities. The few statistically significant effects suggest that *future* top-performers are fairly difficult stocks *today*. They tend to have short histories, spend a lot on (uncertain) R&D, and suffer from higher drawdowns. These don’t sound like the easiest companies to invest in. But obviously, if you catch one of the few companies that make it through this stage, you get rewarded with outstanding returns. I think the regressions are therefore quite consistent with the overall results of this paper series.

One final remark: I suspect that the results are substantially driven by the few very successful tech companies. Firms like Amazon, Google, or Facebook are fairly young (at least compared to titans like Exxon), spend insane amounts on R&D, and crashed several times on their way to the top. However, note that those are just my personal thoughts. I haven’t seen the data and therefore cannot prove anything.

### Future wealth-creators tend to be older, more levered, and pay higher dividends

Having looked at the top-performers in terms of decade-returns, let’s now move to the top wealth-creators. It is basically the same table except that the Top- and Bottom 200 are now ranked by SWC instead of returns. Since SWC is an absolute concept, the coefficients for market capitalization are heavily significant for both groups (*t > 20*). Given that size matters for absolute wealth-creation, this is not really surprising.

Future wealth-creators tend to be older (*t = 12.99*), pay higher dividends (*t = 7.03*), had smaller drawdowns (*t = -4.71*), have more debt (*t = 4.29*), and grow assets faster than sales (*t = -3.30*). Those patterns are quite different from those for top-performers. But given that SWC and total returns are different concepts, this is not necessarily a bad sign.

At the other end, future wealth-destroyers tend to be older (*t = 4.23*), grow their fixed assets (*t = 7.48*), trade at higher market-to-book ratios (*t = 7.53*), are also profitable (*t = 3.01*), and grow their assets faster than sales (*t = -6.04*). Interestingly, there are also signs of a long-term reversal. Firms that were among the top-performers in the *prior* decade tend to be wealth-destroyers in the *next* one (*t = 13.64*). However, this effect is limited to SWC and doesn’t exist for future winners in terms of returns.

### Identifying big winners remains challenging

Overall, there are clearly some statistical patterns to identify top-performers and wealth-creators *ex-ante*. But I think no real-world investor denies that identifying big winners remains a very challenging endeavor.^{6} The R^{2} values of the regression models above are 0.008 for the top-performers and 0.116 for the top wealth-creators, respectively.^{7} The higher number for SWC is mostly due to the strong impact of market capitalization. Nonetheless, both R^{2} values are fairly low and hardly enough for reliable forecasts.
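For reference, the R^{2} quoted here follows the standard textbook definition. The notation is mine, not the paper's:

$$
R^2 = 1 - \frac{\sum_i \left(y_i - \hat{y}_i\right)^2}{\sum_i \left(y_i - \bar{y}\right)^2}
$$

Here $y_i$ is the Top 200 indicator of firm-decade $i$, $\hat{y}_i$ is the value fitted by the regression, and $\bar{y}$ is the sample mean of the indicator. A value of 0.008 therefore means that the residual variation is still 99.2% of the total variation in the indicator.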

Of course, this doesn’t mean that fundamentals are irrelevant. The F-statistics^{8} for all regressions suggest that the explanatory variables are jointly significantly related to the Top and Bottom 200 indicators. Fundamentals are therefore an important but unfortunately only small piece of the puzzle.
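A tiny R^{2} and a significant F-statistic are not a contradiction once the sample is large. The following sketch uses the textbook relation between the two for an OLS regression with an intercept. The R^{2} of 0.008 is from the paper; the sample sizes and the assumption of exactly 22 regressors are mine and only serve as illustration.

```python
def f_statistic(r2, n_obs, n_regressors):
    """Overall F-statistic of an OLS regression with an intercept, computed from its R^2."""
    df_model = n_regressors
    df_resid = n_obs - n_regressors - 1
    return (r2 / df_model) / ((1.0 - r2) / df_resid)

# R^2 = 0.008 is taken from the post; n_obs and 22 regressors are assumptions.
for n_obs in (1_000, 10_000, 50_000):
    print(n_obs, round(f_statistic(0.008, n_obs, 22), 2))
```

Under these assumptions, the same R^{2} produces an F-statistic below 1 with 1,000 observations but above 10 with 50,000. Joint significance mostly tells us that the relation is unlikely to be pure noise, not that it is strong enough for forecasting.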

Finally, since it seems very hard to pick a few winners, one might be tempted to weed out the worst losers instead. But unfortunately, the pattern isn’t much different for the Bottom 200. Avoiding the bad losers therefore appears just as difficult as picking the big winners.^{9}

## Conclusions and Further Ideas

Forecasting the performance of a company over the next 10 years is very difficult, if not impossible. Therefore, I again suggest treating all specific results with some skepticism. However, I believe the few statistically significant patterns draw a quite reasonable picture that is certainly useful.

All else equal, future top-performers tend to be difficult stocks today. They have limited histories, invest a lot in (often uncertain) R&D, and had some bad drawdowns before their great decade. Similarly, the worst future performers have undesirable fundamental characteristics like more debt, more volatile returns, and less profitability.^{10} Despite my criticism of the regression methodology, I think those general patterns are reasonable.

Future wealth-creators exhibit somewhat different characteristics, but overall they also tend to be of higher fundamental quality. In addition to that, there is some evidence for a long-term reversal: past top-performers tend to be future wealth-destroyers. Such a pattern is consistent with many other studies and has been documented before.^{11}

All those results are small pieces that (may) help to identify the next big winners. But it remains difficult, and although fundamentals are important, they are far from everything. You most likely need some other skills and insights to identify the few big winners. In addition to that, identifying them is not even enough. You also need the guts to invest in them for the long term. As we have seen, this is not necessarily easy because future winners are fairly difficult stocks today. Next week, I will dig a little deeper into this issue and we will see that their ride to the top is anything but smooth.


*This content is for educational and informational purposes only and no substitute for professional or financial advice. The use of any information on this website is solely on your own risk and I do not take responsibility or liability for any damages that may occur. The views expressed on this website are solely my own and do not necessarily reflect the views of any organisation I am associated with. Income- or benefit-generating links are marked with a star (*). All content that is not my intellectual property is marked as such. If you own the intellectual property displayed on this website and do not agree with my use of it, please send me an e-mail and I will remedy the situation immediately.* *Please also read the Disclaimer.*

## Endnotes

| # | Note |
|---|---|
| 1 | Total return means that all dividends are reinvested. For brevity, I will just refer to “returns” or “decade-returns” in the following. |
| 2 | SWC measures the total wealth that a company creates over its lifetime in excess of risk-free treasuries. Most importantly, it accounts for the fact that dividends cannot be reinvested in aggregate. Therefore, it takes the perspective of a hypothetical investor who owns the entire company. |
| 3 | Each top- and bottom-list now consists of 1,200 companies. 200 firms for six decades: 200 x 6 = 1,200. |
| 4 | Such indicator or dummy variables are 1 if the company is among the Top/Bottom 200 and 0 otherwise. |
| 5 | I appreciate the author’s conservative position on his results, especially because of the weaknesses of the simple linear regression. |
| 6 | If it were easy, we would all do it, right? |
| 7 | Generally, the R^{2} ranges from 0 to 1 and indicates the fraction of variance explained by the model. For example, a value of 0.2 means that fundamental characteristics explain 20% of the variation in SWC. |
| 8 | F-statistics are used to test the joint statistical significance of all explanatory variables in a regression model. |
| 9 | Once again, if it were easy, we would all do it, right? |
| 10 | Debt is of course not automatically bad. I would like to add *too much* here, but the regressions unfortunately don’t tell us specific debt levels that are problematic. |
| 11 | Most prominently, De Bondt and Thaler (1985). |