AgPa #40: Size Effect – Fact and Fiction

Fact, Fiction, and the Size Effect (2018)
Ron Alquist, Ronen Israel, Tobias Moskowitz
The Journal of Portfolio Management Fall 2018, 45 (1) 34-61, URL/AQR

After examining several Facts and Fictions around factor investing in general, momentum, value, and low-risk, this week’s AGNOSTIC Paper tackles the final anomaly.[1] The size effect has received a lot of attention in both academia and the investment industry, probably because it is one of the oldest documented anomalies.[2] In this final paper of their Fact and Fiction series, the authors examine some myths around it.

Everything that follows is only my summary of the original paper. So unless indicated otherwise, all tables and charts belong to the authors of the paper and I am just quoting them. The authors deserve full credit for creating this material, so please always cite the original source.

Setup and Idea

Of all factors, size is probably the simplest one. The size effect is the phenomenon that stocks of companies with smaller market capitalization (small stocks) have historically outperformed those of companies with larger market caps (large stocks). As usual, the size premium is the return on a portfolio that goes long small stocks and shorts the large ones. A very simplified example of such a strategy would be a portfolio that is long the Russell 2000 (small caps) and short the Russell 1000 (large and mid caps). Although this has been a terrible bet over the last decade, size still receives quite a bit of attention in academia and practice. So let’s see if this is justified (spoiler: it is not)…
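To make the long-short idea concrete, here is a minimal Python sketch of a one-period size premium. It is only an illustration: the function name `size_premium`, the median breakpoint, the equal weighting, and all numbers are my assumptions. The published SMB factor from Kenneth French’s website is constructed in a more involved way.

```python
import numpy as np

def size_premium(market_caps, returns, quantile=0.5):
    """One-period return of a long-small / short-large portfolio.

    market_caps: market capitalizations at the start of the period
    returns:     stock returns over the period
    quantile:    breakpoint between small and large (median is an assumption)
    """
    market_caps = np.asarray(market_caps, dtype=float)
    returns = np.asarray(returns, dtype=float)
    cutoff = np.quantile(market_caps, quantile)
    small = market_caps <= cutoff
    large = ~small
    # Equal-weighted legs for simplicity (hypothetical choice).
    return returns[small].mean() - returns[large].mean()

# Made-up market caps (in $M) and returns, purely for illustration.
caps = [50, 120, 300, 900, 5_000, 40_000]
rets = [0.04, 0.01, 0.03, 0.02, 0.01, 0.00]
premium = size_premium(caps, rets)
print(f"Long-short size return: {premium:.4f}")
```

A positive number means the small leg beat the large leg in that period. The size premium is the average of such returns over time.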

Data and Methodology

There is not much to say about specific data or methodology. The paper is an overview of the size effect and some misconceptions around it. That said, the authors use a lot of data and references to back up their arguments. To be as transparent as possible, they use the well-known and publicly available size factor from Kenneth French’s website for many of their analyses.

Important Results and Takeaways

Fiction: Size is the strongest documented factor

The authors get straight to the point. Despite being one of the oldest and most studied anomalies, empirical evidence for the size effect is actually quite weak. In particular, it is much weaker than for momentum, value, and low-risk. But let’s look at it step by step.

One of the first studies of the size effect is Banz (1981), who finds that “[…] smaller firms have had higher risk adjusted returns, on average, than larger firms.” in a sample of US stock returns between 1936 and 1975. In the first step, the authors examine the returns of Kenneth French’s Small Minus Big (SMB) factor over the same period. The results are disappointing. The hypothetical long-short portfolio returned only 1.9% per year at an annualized volatility of almost 10%. This yields a fairly low Sharpe ratio of 0.19. In addition to that, the returns are not statistically significant. The t-statistic stands at only 1.21, which is way below common thresholds.[3] No matter how you look at it, the evidence for a size premium is weak, both economically and statistically.
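The link between these numbers can be checked with a few lines of Python. The sketch below is my own illustration, not the authors’ code. It assumes monthly returns of a long-short factor (so no risk-free rate is subtracted) and uses simulated data as a stand-in for the real SMB series. The useful identity is that the t-statistic of the mean roughly equals the annualized Sharpe ratio times the square root of the number of years.

```python
import numpy as np

def annualized_stats(monthly_returns):
    """Annualized return, volatility, Sharpe ratio, and t-stat of the mean."""
    r = np.asarray(monthly_returns, dtype=float)
    n = r.size
    mean_m = r.mean()
    vol_m = r.std(ddof=1)
    ann_return = 12 * mean_m
    ann_vol = np.sqrt(12) * vol_m
    sharpe = ann_return / ann_vol
    # t-statistic for the hypothesis that the mean return is zero.
    t_stat = mean_m / (vol_m / np.sqrt(n))
    return ann_return, ann_vol, sharpe, t_stat

# Simulated stand-in for 40 years of monthly factor returns (NOT real data).
rng = np.random.default_rng(0)
simulated = rng.normal(0.0016, 0.029, 480)
ann_return, ann_vol, sharpe, t_stat = annualized_stats(simulated)

# Back-of-the-envelope check with the figures quoted above:
# Sharpe ratio 0.19 over the 40 years from 1936 to 1975.
approx_t = 0.19 * np.sqrt(40)
print(f"Approximate t-statistic: {approx_t:.2f}")
```

With the rounded Sharpe ratio of 0.19, the approximation lands at about 1.2, in line with the reported t-statistic of 1.21.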

But it gets even worse. In the next step, the authors examine the risk-adjusted return of the size factor during the same sample period. Specifically, they calculate the annual alpha with respect to the market return from Kenneth French’s website. The results show a slightly negative alpha that is statistically indistinguishable from zero. What does this mean? Well, it means that even though small caps performed slightly better than large caps in the past, this was not a genius alpha-producing strategy. Since there are no positive alphas, the modest historical size premium simply comes from higher betas of small stocks. So you can (in theory) use small caps to run a portfolio with more risk and higher expected returns, but you shouldn’t expect alpha just from investing in smaller companies.
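For readers who want to see what such an alpha calculation looks like, here is a minimal sketch of a market-model regression with plain NumPy. The function name `capm_alpha` and the simulated data are my assumptions. The simulation is built so that the factor has a market beta of 0.3 and a true alpha of zero, which mimics the situation described above: a positive average return that is explained by market exposure alone.

```python
import numpy as np

def capm_alpha(factor_returns, market_excess_returns):
    """OLS regression of factor returns on market excess returns.

    Returns the intercept (alpha), the slope (beta), and the
    t-statistic of alpha under classical OLS assumptions.
    """
    y = np.asarray(factor_returns, dtype=float)
    x = np.asarray(market_excess_returns, dtype=float)
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    alpha, beta = coef
    resid = y - X @ coef
    dof = len(y) - X.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    t_alpha = alpha / np.sqrt(cov[0, 0])
    return alpha, beta, t_alpha

# Simulated monthly data (NOT real returns): beta of 0.3, true alpha of zero.
rng = np.random.default_rng(42)
market = rng.normal(0.006, 0.05, 480)
factor = 0.3 * market + rng.normal(0.0, 0.02, 480)

alpha, beta, t_alpha = capm_alpha(factor, market)
print(f"monthly alpha={alpha:.4f}, beta={beta:.2f}, t(alpha)={t_alpha:.2f}")
```

Multiplying the monthly alpha by 12 gives an approximate annual figure. Published studies often use more robust standard errors than the classical ones in this sketch.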

Obviously, the question arises why Banz could publish his paper in 1981 although we are attacking it as weak evidence today. Did the scientific method not work back then? Quite the opposite. The authors explain that most of the early size studies were, without knowing it at the time, based on erroneous data. For example, CRSP, the go-to database for US stock returns, suffered from a substantial delisting bias until 1997. The database therefore systematically overestimated the returns of delisted stocks. Quite intuitively, this led to an inflated size premium as delistings are much more common among smaller companies. If you run the analysis with today’s improved database, you therefore get different results for the same sample period than the researchers back in 1981. Again, this doesn’t mean the old papers are bad. It just means we improved the data quality and know more about markets than 40 years ago. If anything, this is a good thing and shows that science works exactly as it should.

Finally, the authors compare the size premium to other factors over the longest possible sample for each factor. Across all measures (return, volatility, alpha, Sharpe ratio, statistical significance), size is considerably weaker than the other ones. So no matter how you look at it, with today’s understanding of markets and data, there is only weak evidence for a size premium. But this is just the beginning. Many of the following facts and fictions present further evidence against the existence of a size premium…

Fact: The size effect weakened since its discovery

To strengthen their arguments further, the authors next look at the out-of-sample returns of the size factor after its original discovery in the 1980s. Size had a fabulous decade right after the original sample period (1976-1986) and achieved a Sharpe ratio of almost 1. However, during the subsequent >30-year period until 2017, the results are substantially weaker and sometimes even negative. With the benefit of hindsight, we also know that small caps continued to underperform large caps until recently.

There are plausible theories for the fading size premium, like the increasing activity of small cap funds and ETFs, but this doesn’t make the factor more attractive. In fact, this is weak out-of-sample evidence on top of already weak in-sample evidence. This doesn’t speak for a robust empirical pattern and calls the size premium further into question.

Fiction: The size effect is robust across different measures

The next robustness test addresses measurement. As already mentioned in the previous posts, different measures for the same phenomenon should lead to similar results. You don’t want to believe in an effect that only works if measured by one particular variable. That would almost scream data-mining… Furthermore, if there is no theory or other plausible reason to favor one measure over others, a composite of different ones is usually the best choice to ensure robust results.

What does this look like for size? Virtually all studies (and probably most investment products) simply rank companies by market capitalization. If the size premium is truly robust, other measures of company size should lead to a similar return premium. The authors test this idea and compare alphas of size portfolios based on market capitalization (the standard), total assets, book equity, sales, physical capital (PPE), and employees. All are plausible measures of company size, but all except market capitalization and PPE generated negative and insignificant alphas (for some measures up to -9.2% per year). This shows that the size premium is not robust across different measures. You know what is coming: this is further evidence against the existence of a size premium.

Fact: Most of the size premium comes from Januaries

For the next fact, the authors cite some of the earliest research on the size effect and explain that most of the historical size premium comes from Januaries. In fact, this seasonality is so pervasive that it became known as the January effect in its own right.

In a very simple test, the authors compare the performance of two hypothetical timing strategies. The first one invests in the SMB factor during Januaries and moves to cash for the rest of the year. The second one holds cash in January and invests in SMB for the rest of the year. The results are astonishing. Basically the entire SMB factor premium comes from Januaries. For all other months, the size premium is at best flat, but historically even slightly negative. No matter what you think about seasonalities, the results suggest that if there is any size premium, it exclusively comes from Januaries.
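The two timing strategies are easy to express in code. The following sketch is my own illustration with made-up numbers, not the authors’ data. It assumes that cash earns nothing in the months the strategy is out of the factor, which is a natural simplification for a self-financing long-short return, and it simply adds up monthly returns instead of compounding them.

```python
import numpy as np

def january_split(monthly_returns, months):
    """Average annual return of 'January only' vs. 'all other months'.

    monthly_returns: long-short factor returns, one per month
    months:          calendar month (1-12) of each return
    Cash is assumed to earn zero in the months a strategy is not invested.
    """
    r = np.asarray(monthly_returns, dtype=float)
    m = np.asarray(months, dtype=int)
    years = r.size / 12
    in_january = m == 1
    january_only = r[in_january].sum() / years
    rest_of_year = r[~in_january].sum() / years
    return january_only, rest_of_year

# Hypothetical factor: +3% every January, -0.1% in every other month.
months = np.tile(np.arange(1, 13), 2)
returns = np.where(months == 1, 0.03, -0.001)
january_only, rest_of_year = january_split(returns, months)
print(f"January only: {january_only:.3f}, rest of year: {rest_of_year:.3f}")
```

In this stylized example, the January-only strategy earns 3% per year while the rest-of-year strategy loses 1.1%, which is the qualitative pattern described above.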

Fiction: Size also works in international equity markets

Next, the authors turn to the mother of all out-of-sample tests. If you discover an effect in US data, an evident way to stress-test your findings is to go international. I mean, would you take a medical treatment that works (without plausible reasons) in the US but not in Europe? I certainly wouldn’t. So why invest in a strategy that produces materially different results in other markets?

The authors follow the methodology of Kenneth French and calculate an SMB premium within 24 equity markets between 1984 and 2017. The results are once again not very promising. With respect to statistical significance, none of the 24 size premiums reaches the threshold for the commonly used 5% significance level. In fact, they are not even close. In 12 of the 24 countries, the size premiums were even negative over the sample period. A 50-50 distribution of insignificant positive and negative size premiums is anything but robust evidence and strongly questions the existence of the size premium.

Fact: Size does not work within other asset classes

If international markets were the mother of all out-of-sample tests, this one goes even one step further. If size is a systematic driver of return differences (no matter if the effect comes from a risk-based or behavioral mechanism), it should also explain returns of assets other than stocks. Before we come to the results, the authors already note that size is not so intuitive for other assets. You can easily calculate the price momentum of a commodity, but what is the size of oil? This is not so easy, and I know few people who trade commodities and try to come up with a size measure (but I know quite a few who look at momentum measures)…

To still run some tests, the authors suggest the following methodology for different asset classes. For equity indices, they calculate size as the total market capitalization of the respective country and go long (short) an equal-weighted portfolio of the smallest (largest) ones. For corporate bonds, they follow the same methodology as for stocks and use the market capitalization of the underlying firm. Finally, they also use the GDP of 24 countries to calculate a size premium for currencies.

Arguably, some parts of the methodology sound a bit strange. I mean, who really trades on the aggregate market capitalization of equity indices? However, in my opinion, this strangeness already shows that there is no appealing story underneath the size premium. The results for the other asset classes support that. The authors find no significant size premium in any of the three asset classes. So this is another out-of-sample test that the size premium did not survive.

Fact: Most of the size effect comes from micro cap stocks

Unsurprisingly, the size factor is long the smallest companies of the equity universe. By construction, it therefore also includes extremely small micro caps that are difficult and costly to trade. A common critique of the size effect is thus that once you control for the small and illiquid micro caps, there is not much left. The authors support this argument and show that the historical size premium disappears after removing the smallest 5% of firms from the portfolio. According to their calculations, this corresponds to excluding companies with a market cap below $18M. This is very small and definitely not an unrealistic filter for real-world investors. Further adding to the concerns about the evidence for size, the results suggest that even if there is a premium, it is hardly implementable in practice.

Fact: Size is difficult to implement in real-world portfolios

Closely related to the previous fact about micro caps are of course trading costs. A factor that only works “on paper” may be appealing academically, but for real-world investors it should also remain profitable after implementation costs (and management fees). To test the issue for size, the authors report different measures for liquidity along market-cap deciles of stocks.

As expected, smaller stocks are more expensive to trade and transaction costs decrease with market caps. The magnitude is in fact quite striking. For some measures, the smallest companies are six times more expensive to trade than their largest counterparts. Unsurprisingly, transaction costs therefore heavily impact the real-world returns of the size factor. Using their most optimistic estimates, the authors show that you can invest about $5B in a size strategy before transaction costs eat up the entire (historical) premium. Once again, this result shows that even if there is a size premium, it is very difficult to exploit it at institutional scale.

Fiction: The size effect is more than just a liquidity effect

In this fiction, the authors again take on the issue of transaction costs and test if there is more to the size effect than just a liquidity premium. Using their previous transaction cost measures, they construct liquidity factors and relate them to the returns of the size factors. In line with the idea of a liquidity premium, they find that transaction costs and liquidity measures basically explain the entire size premium.

This suggests that the historical size premium was mostly a compensation for holding some very illiquid micro caps. This is fine. All else equal, an illiquid asset should compensate for that with a return premium. However, the fact that the size premium disappears after controlling for liquidity once again suggests that there wasn’t a dedicated premium in the first place. Instead, size / market capitalization just seems to be a very good proxy for liquidity.

Fiction: There are economic theories for the size effect

Related to the previous fiction, the authors next take on the question of other theories behind the size premium. As shown before, there is hardly anything left unexplained after controlling for liquidity. But the literature (and practitioners) still came up with other ideas…

However, a common element of such theories remains the idea that size is a proxy for something else, for example liquidity, earnings growth, or distress risk. As I mentioned before, there is nothing generally wrong with that. But if I know what size proxies for, why not use it directly? If you want to own illiquid stocks to earn a premium, you can just look at liquidity measures. Most of such theories are thus not really convincing because they eventually come back to other, well-known effects like an illiquidity premium.

Behavioral explanations and limits to arbitrage are also not very appealing.[4] It is generally true that mispricing is probably more likely among smaller stocks. However, behavioral finance also suggests that overpricing is more likely than underpricing. The same holds for limits to arbitrage. If you find an undervalued small cap, you can just buy it. But if you find an overvalued one, you cannot always short it. All else equal, this may lead to elevated prices among small caps. If anything, higher prices suggest lower expected returns, which is exactly the opposite of the size premium. According to the authors, some of the big ideas in behavioral finance thus also don’t support the size premium.

Fiction: Size works because other factors are stronger among small cap stocks

Although it might be confusing at first, this is a very important one. By now, there is hopefully enough evidence that the size effect by itself is not very convincing. However, it is true that many factors (and other investing strategies) tend to work better among small caps. There are several plausible reasons for this, for example less competition because small caps offer too little capacity for large institutions, higher volatility, or a generally larger opportunity set. But this doesn’t change the weak evidence for a standalone size effect.

Fact: There are reasons to overweight small caps even without the size effect

Building on the previous fiction, it is important to understand the following difference. Holding small companies just because they are small doesn’t make too much sense (at least according to this article). Holding small companies because they look attractive on other metrics, however, is fine and in line with the empirical evidence. Having said that, many papers that find larger factor premiums among small caps don’t consider trading costs. Depending on the strategy, the advantage may narrow or even disappear after costs. But in general, you can overweight small caps without believing in the size effect. As long as you combine it with another strategy, this is not inconsistent.

Fact and Fiction: The size effect is stronger when controlling for other factors

The fact that many factors work better among small caps already suggests interactions between size and other factors. In this section, the authors analyze this issue in more detail. It is named Fact and Fiction because the truth of the statement critically depends on which other factors you control for.

Specifically, the authors go back to their different measures of size and calculate annual alphas with respect to the CAPM (the base model) and several other factor models. They find a very interesting pattern… Controlling for value and momentum doesn’t change much. The size premium remains statistically insignificant and weak.

However, as soon as you add some type of quality factor, the picture changes dramatically. Controlling for the profitability and investment factors of the Fama-French five-factor model already yields a much more robust size premium. If you replace the Fama-French factors with the Quality Minus Junk factor, the results are even more extreme. Controlling for the overall market, value, momentum, and quality yields statistically significant size alphas with t-statistics of more than 4 across all size measures. This is now very robust, and the other factors seem to resurrect the size premium.
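The mechanics behind this result can be illustrated with a small simulation. The sketch below is entirely hypothetical: the helper `factor_alpha` and all parameters are my assumptions, and the data are simulated rather than taken from the paper. The simulated size factor has a negative loading on a quality factor that itself earns a positive premium. A market-only regression then understates the size alpha, while adding quality as a control recovers it.

```python
import numpy as np

def factor_alpha(returns, factors):
    """OLS intercept (alpha) and slopes of returns on one or more factors."""
    y = np.asarray(returns, dtype=float)
    F = np.asarray(factors, dtype=float)
    if F.ndim == 1:
        F = F[:, None]
    X = np.column_stack([np.ones(len(y)), F])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[0], coef[1:]

# Simulated monthly data (NOT real returns).
rng = np.random.default_rng(1)
n = 600
market = rng.normal(0.005, 0.04, n)
quality = rng.normal(0.004, 0.02, n)  # quality earns a positive premium
# Size loads negatively on quality: small firms tend to be "junk".
size = 0.003 + 0.2 * market - 0.5 * quality + rng.normal(0.0, 0.01, n)

alpha_capm, _ = factor_alpha(size, market)
alpha_full, betas_full = factor_alpha(size, np.column_stack([market, quality]))
print(f"alpha vs. market only:        {alpha_capm:.4f}")
print(f"alpha vs. market and quality: {alpha_full:.4f}")
print(f"quality loading:              {betas_full[1]:.2f}")
```

The market-only alpha absorbs the drag from the negative quality exposure. Once quality enters the regression, that drag is attributed to the quality factor and the remaining alpha is higher.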

After drifting into the jargon of multi-factor models, back to the normal world. What do those results mean? Well, quality simply distorts the relation between company size and expected returns. From the quality premium we know that companies with better fundamentals tend to outperform crappier companies on average (quality is better than junk). The problem of size: many small companies are crappy whereas large companies are often fundamentally better.[5]

Following the size effect and simply going long small stocks and short large ones is therefore an implicit bet against the quality premium. Given that there is strong evidence for a quality premium, this is a bad strategy. The quality premium therefore resurrects the size effect in the sense that small cap quality tends to outperform large cap quality on average.

Note, however, that this is not evidence for a standalone size effect, but just another example of how other factors work better among smaller companies. Just buying small companies because they are small is a bad idea as you are going to be long junk and short quality.

Fact: Size receives a lot of attention despite weak evidence

The authors already mentioned it in the introduction, but it is true that despite its weak evidence, size has received a lot of attention in the literature and among the investing public. Using data from Google Scholar, they show that there are more papers and citations on the size factor than on other, more robust, factors like momentum or low-risk. The authors explain this pattern by some type of path dependency. Size was the first anomaly that challenged the CAPM back in the 1980s. As a consequence, many researchers hopped on the issue and produced papers. Similarly, the investment industry tried to capitalize on the findings early on and launched legions of small cap funds. In addition to that, but that is my personal opinion, the idea of size is also simply easier to understand (and sell) than a multi-factor portfolio.

Conclusions and Further Ideas

Given that this post got longer than originally expected, let me again summarize a few important points. Although most of them hopefully came through in the respective Facts and Fictions, I think it is still important to mention them explicitly.

First, the evidence for a standalone size effect is weak and many of the early results come from erroneous data. However, most of this criticism addresses the risk-adjusted size premium. If there ever was a premium on smaller companies, most of it simply comes from higher betas. This is nice if you want to take more risk and don’t have access to leverage, but it is not alpha.

Second, just because the evidence for a standalone size premium is weak doesn’t make the factor useless altogether. We have seen that there are meaningful interactions with other factors (especially quality) and the factor is also useful to evaluate what active portfolio managers are doing.

In the end, it is quite fascinating that the factor which received the most attention is actually one of the weakest out there. Ironically, that seems to be the way the (investment) world sometimes works…




Endnotes

1 Note that my articles differ from the original publishing order. I found it more sensible to start with a general overview and then go into each factor separately. But feel free to read the posts in whatever order you like.
2 Just to clarify: I use size, size effect, and size factor interchangeably throughout this article. The idea is always the same.
3 1.96 is the common threshold for a 5% significance level under normality. However, to guard against data-mining, some finance-researchers even suggest t-statistics >3 to ensure robust findings.
4 Limits to arbitrage are factors that prevent investors from exploiting mispricing. For example, if you cannot short an overvalued stock, this is a limit to arbitrage (for example GameStop in early 2021).
5 Quite intuitive because a crappy firm is unlikely to grow large in the first place…