After questioning the added value of mutual fund managers, this week’s AGNOSTIC Paper examines a probably even more controversial topic: Environmental, Social, and Governance a.k.a. ESG. Broadly speaking, ESG refers to the idea that investors should consider those three dimensions in their decisions and thereby contribute to a more sustainable economy. For example, a large sovereign wealth fund might forgo a profitable investment in an oil company because it emits a lot of CO2.
In general, this idea is not bad because capital markets are a powerful tool to efficiently allocate resources in an economy. The implementation, however, is not so easy. To guide any decisions, there must be a shared understanding of what ESG actually is. But as we will see in this week’s paper, there isn’t much agreement about that right now.
Everything that follows is only my summary of the original paper. So unless indicated otherwise, all tables and charts belong to the authors of the paper and I am just quoting them. The authors deserve full credit for creating this material, so please always cite the original source.
Setup and Idea
The most common way to implement ESG in the investment industry is through ratings. Similar to credit ratings, data providers analyze the ESG dimensions of companies and (try to) quantify them in a rating. (In many cases, the large established data providers like S&P Global, MSCI, or Bloomberg also offer such ESG ratings to their clients.) Investors then use those ratings to manage their portfolios with respect to ESG and its dimensions. For example, you could engineer a portfolio with a better carbon footprint than the benchmark index. So except that it’s not necessarily a financial variable, the mechanics shouldn’t be new for portfolio managers.
Together with the growing awareness of climate change, the industry around ESG has experienced enormous growth over the last years. There are now a lot of sources for ESG data, but the market remains dominated by the few large financial data providers. The way investors use those services is very diverse. Some (and probably most) rely on ESG ratings to simply exclude the worst companies from their portfolios. (For example, most mutual funds no longer invest in companies with “bad” products like gambling or tobacco.) Others target specific ESG scores on the portfolio level. Still others deliberately invest in “bad” companies but engage with boards to make positive changes. And finally, some only invest in companies that are actively contributing to sustainability. (This is often called “Impact Investing” and is the highest end of “goodness” in the investment industry.) Depending on the approach, the role of ESG ranges from just being an additional restriction to being the primary objective.
So much for the current state of the industry. But despite all advances, there are two fundamental problems with ESG. (There are a lot more problems and absurdities with ESG. For example, some asset managers claim you can score well on ESG and consistently have better returns. But this is an issue for another post.) First, because of its high growth and still early stage, the industry is quite messy. There are now so many data providers and consultants that it’s really hard to keep track. Each of them interprets ESG differently and a common standard is lacking.
Second, the lack of a common standard is unfortunately a fundamental feature of ESG. Consider again the comparison to credit ratings. The objective here is straightforward: credit ratings should tell us something about the probability of default. Each rating agency of course has its own methodology. But in the end, it’s a simple issue: companies must make enough money to pay their interest and repay the principal. This is not so simple for ESG. Of course, we may agree that CO2 emissions are bad and companies should reduce them. But what is more important? CO2 emissions or fair treatment of employees and suppliers? Or maybe good governance with a “fair” compensation for the CEO? But who decides what a “fair” compensation is?
There are no right or wrong answers to those questions. It all depends on the values and convictions of the individual person. So in the end, everyone must decide for herself about the “right” ESG definition. This is again very different for credit ratings. A default is a precisely defined event that leaves little room for interpretation – either you meet your obligations or not. By construction, we don’t (and may never) have this clarity for ESG. This week’s paper documents the extent of this issue with data on ESG ratings for S&P 500 companies.
Data and Methodology
The authors construct a sample of ESG ratings for S&P 500 companies that ranges from 2010 to 2017. (The final coverage is slightly smaller. Depending on the ESG data provider, there are between 438 and 468 companies in the sample; see table below.) Although this sample appears quite small by the usual standards, remember that ESG is a rather new phenomenon. So I would say that 2010 to 2017 is actually quite comprehensive. The ESG ratings are from Asset4 (acquired by Refinitiv), Sustainalytics, Inrate, Bloomberg, FTSE, KLD, and MSCI IVA. These are of course not all ESG data providers out there, but according to the authors, their sample includes the most important ones. Finally, the authors match the ESG data with stock returns and fundamental data from CRSP and Compustat. (Both are state-of-the-art services for data on US companies.) The following table summarizes the sample and the methodology of the seven ESG data providers.
To compare the different rating scales, the authors rank companies by each of the seven ratings and use the percentile ranks as adjusted ESG scores. So keep in mind that all of the following results are based on ranks and not the original ratings.
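The rank transformation described above can be sketched in a few lines of Python. This is only a minimal illustration and not the authors’ code: the function name `percentile_ranks`, the tickers, the raw scores, and the choice to give tied firms their average rank are all my own assumptions.

```python
def percentile_ranks(ratings):
    """Map raw ratings to percentile ranks in (0, 1].

    Assumption: tied firms share the average of their ranks.
    """
    values = list(ratings.values())
    n = len(values)
    ranks = {}
    for firm, value in ratings.items():
        below = sum(1 for v in values if v < value)
        equal = sum(1 for v in values if v == value)
        average_rank = below + (equal + 1) / 2  # 1-based rank, averaged over ties
        ranks[firm] = average_rank / n
    return ranks


# Hypothetical raw scores from two providers that use different scales
provider_a = {"AAA": 80, "BBB": 55, "CCC": 55, "DDD": 20}       # 0 to 100
provider_b = {"AAA": 6.5, "BBB": 3.0, "CCC": 4.0, "DDD": 1.0}   # 0 to 10

ranks_a = percentile_ranks(provider_a)
ranks_b = percentile_ranks(provider_b)
print(ranks_a)
print(ranks_b)
```

After the transformation, both providers live on the same scale, so their scores can be compared directly even though the original ratings cannot.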
Important Results and Takeaways
ESG ratings disagree: the average correlation is just 0.45
The table below presents descriptive statistics and pairwise correlations for the seven ESG data providers. Panel A shows the results for the aggregate ESG score and panels B-D show the same data for each dimension separately (E, S, and G). Strikingly, the average correlation of ESG scores among the seven data providers is just 0.447. Compared to the 0.99 correlation of credit ratings, this number is quite low and suggests fairly different methodologies among the data providers. (The authors quote the 0.99 correlation for credit ratings from the paper I will discuss in the next post.)
There are also some notable differences beneath the surface. First, for some pairs of data providers, the correlation is actually quite high. For example, it amounts to 0.75 for Bloomberg and Asset4. Second, there are considerable differences between ESG dimensions. At 0.455, the average correlation for Environmental scores is the largest of all and also slightly higher than for the aggregate score. At the other end, the average correlation reaches only 0.155 for Governance. The Social dimension, at 0.33, lies in between.
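To make the “average correlation” more concrete, here is a small Python sketch of how such a number can be computed: take every pair of providers, compute the Pearson correlation of their rank-based scores, and average over all pairs. The provider names and scores are made up, and whether the authors aggregate in exactly this way is an assumption on my part.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation of two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_pairwise_correlation(scores):
    """Average the correlation over all unordered pairs of providers."""
    pairs = list(combinations(sorted(scores), 2))
    return sum(pearson(scores[a], scores[b]) for a, b in pairs) / len(pairs)


# Hypothetical percentile ranks of five firms from three providers
scores = {
    "provider_1": [0.2, 0.4, 0.6, 0.8, 1.0],
    "provider_2": [0.4, 0.2, 0.6, 1.0, 0.8],
    "provider_3": [1.0, 0.6, 0.2, 0.8, 0.4],
}

avg_corr = average_pairwise_correlation(scores)
print(round(avg_corr, 3))
```

With seven providers, as in the paper, there are 21 such pairs. A single highly correlated pair can therefore sit alongside a low overall average.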
In my (and the authors’) opinion, these results are not surprising. Environmental issues are relatively easy to quantify and there is a broad consensus that less pollution, emissions, and waste is better than more. In contrast, it is more difficult to agree on and to quantify social and governance issues. The data providers therefore use quite different methodologies to rate those two dimensions. This translates into lower correlations.
To get into even more detail, the authors also look at the average correlations of ESG scores within industries. (They use the Fama-French 12 Industry classification, a well-known standard in the literature.) The following chart summarizes the results. There is clearly some heterogeneity among industries. For example, the correlation of total ESG scores is considerably lower for Consumer Durables and Telecommunications. But the overall pattern remains unchanged: correlations of ESG scores are not very high.
To sum up the main result once again: the average correlation among ESG ratings is relatively low and suggests considerable disagreement among data providers. This problem is least severe for the Environmental dimension and worst for Governance. Again, these results are not surprising. To a large extent, ESG is a very subjective concept and depends on individual views, values and convictions. The modest correlations just reflect this fundamental (and hardly solvable) problem.
There is less ESG disagreement for smaller, more profitable firms with credit ratings
While it is not their primary focus, the authors present some interesting statistics on how firm characteristics relate to the extent of ESG disagreement. They measure ESG disagreement by the standard deviation of the seven ESG scores for each company at each point in time. The higher this number, the more disagreement among the data providers. The authors then regress this variable on a variety of firm characteristics.
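The disagreement measure is easy to reproduce. The sketch below computes it for two hypothetical firms. Note that I use the sample standard deviation from Python’s `statistics` module; whether the authors use the sample or the population version is an assumption, and the firm scores are invented for illustration.

```python
from statistics import stdev


def esg_disagreement(ranks_by_provider):
    """Standard deviation of one firm's percentile ranks across providers.

    Assumption: sample standard deviation (n - 1 in the denominator).
    """
    return stdev(ranks_by_provider.values())


# Hypothetical percentile ranks of two firms at one point in time
consensus_firm = {"p1": 0.50, "p2": 0.55, "p3": 0.45, "p4": 0.50}
contested_firm = {"p1": 0.10, "p2": 0.90, "p3": 0.40, "p4": 0.70}

low = esg_disagreement(consensus_firm)
high = esg_disagreement(contested_firm)
print(round(low, 3), round(high, 3))
```

Both firms have a similar average score, but the second one looks very different depending on which provider you ask. This is exactly the dispersion that the disagreement variable is meant to capture.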
According to the results, ESG disagreement is significantly related to the following three fundamentals. First, ESG disagreement tends to be lower for more profitable companies (higher gross profitability). Second, it tends to be higher for companies without a credit rating. And finally, it also tends to be higher for larger companies (higher market capitalization).
The authors don’t provide specific explanations but suggest that the results are reasonable and in line with the literature. More profitable firms may have the resources to be more transparent about their ESG engagement. Larger companies, on the other hand, are more complicated to analyze and certainly covered by more ESG analysts (which automatically leads to more opinions and some disagreement).
Investors demanded a risk premium for ESG uncertainty
In the final part of the paper, the authors examine the relation of ESG disagreement and stock returns. To do that, they first run predictive regressions of monthly returns on their ESG disagreement proxy and also control for other known return-predictors. They find a significantly positive relation of ESG disagreement and stock returns with a quite remarkable magnitude. Moving from the first quartile of ESG disagreement to the third (one interquartile range) tends to increase equity returns by 92 basis points on average. So the extent of ESG disagreement appears to be related to firms’ cost of capital.
More practically, the authors also examine the effect with a backtest. They sort companies by ESG disagreement into quintiles and construct equal-weighted portfolios that are rebalanced each January (based on the lagged ESG scores of December). To rule out that returns are driven by industry exposures, they also adjust ESG disagreement by de-meaning the variable within each industry. The following table presents average returns, Sharpe ratios, and alphas versus common factor models for the strategy. Also note that those statistics are after estimated trading costs and thus quite practical.
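The mechanics of such a sort can be sketched as follows. This is a single-period toy version under my own assumptions and not the authors’ backtest: the firm names, industries, disagreement values, and returns are invented, there are no trading costs, and the helper names (`demean_within_industry`, `assign_quintiles`, `long_short_return`) are hypothetical.

```python
from collections import defaultdict


def demean_within_industry(disagreement, industry):
    """Subtract the industry average so that sorts are not industry bets."""
    groups = defaultdict(list)
    for firm, value in disagreement.items():
        groups[industry[firm]].append(value)
    means = {ind: sum(vals) / len(vals) for ind, vals in groups.items()}
    return {firm: value - means[industry[firm]] for firm, value in disagreement.items()}


def assign_quintiles(signal):
    """Quintile 1 holds the lowest signal values, quintile 5 the highest."""
    ordered = sorted(signal, key=signal.get)
    n = len(ordered)
    return {firm: i * 5 // n + 1 for i, firm in enumerate(ordered)}


def long_short_return(signal, next_returns):
    """Equal-weighted return of quintile 5 minus quintile 1."""
    quintile = assign_quintiles(signal)
    high = [next_returns[f] for f, q in quintile.items() if q == 5]
    low = [next_returns[f] for f, q in quintile.items() if q == 1]
    return sum(high) / len(high) - sum(low) / len(low)


# Hypothetical data: ten firms in two industries
firms = [f"f{i}" for i in range(10)]
industry = {f: ("tech" if i < 5 else "energy") for i, f in enumerate(firms)}
raw = [0.10, 0.12, 0.14, 0.16, 0.18, 0.20, 0.22, 0.24, 0.26, 0.28]
disagreement = dict(zip(firms, raw))
rets = [0.01, 0.00, 0.01, 0.00, 0.02, 0.00, 0.01, 0.00, 0.01, 0.03]
next_returns = dict(zip(firms, rets))

adjusted = demean_within_industry(disagreement, industry)
top_raw = {f for f, q in assign_quintiles(disagreement).items() if q == 5}
top_adjusted = {f for f, q in assign_quintiles(adjusted).items() if q == 5}
spread = long_short_return(adjusted, next_returns)
print(top_raw, top_adjusted, round(spread, 4))
```

In this toy data, the raw sort puts only energy firms into the top quintile, while the industry-adjusted sort picks the most contested firm from each industry. That is the point of the de-meaning step: the long-short return should reflect disagreement and not an accidental industry tilt.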
The performance statistics support the previous result. Companies with higher ESG disagreement earned a statistically significant return premium over those with less disagreement. The average return of the corresponding long-short portfolio was about 0.18% per month after transaction costs (roughly 2.16% per year). Alphas versus common factor models are also significant and of similar magnitude. The effect somewhat diminishes for each individual ESG dimension but the overall pattern remains intact.
What is driving this return premium? Well, one explanation is simply that investors don’t like (ESG-) uncertainty and demand a premium for bearing it. So in that sense, ESG disagreement would be just another source of risk that investors are pricing in.
Conclusions and Further Ideas
In my opinion, the results are very helpful to better navigate through the confusing world of ESG and sustainable investing. As I mentioned in the introduction, the general idea to use financial markets to transition towards a sustainable economy is a good one. However, the authors show impressively that the practical implementation is not so easy. To a large extent, ESG is simply too subjective and hard to agree upon. Investors, data providers and the general public should therefore focus on the few ESG issues that are easy to measure and where there is some consensus. (In a recent Special Report, The Economist comes to a very similar conclusion and even suggests limiting all ESG efforts to just carbon emissions. Full disclosure: this special report gave me the idea of writing this post and I don’t think this idea is too bad.) The authors don’t mention this specifically, but in my opinion, it’s an evident implication of the results in their paper.
Of course, this paper is just a (very good) first step in the right direction. The sample is limited to S&P 500 companies and the period from 2010 to 2017. This is a long period for a new phenomenon like ESG. But at the same time, the sample doesn’t cover the last four and a half years which were arguably even more important for ESG. So before making overall statements, I would like to see similar results for international companies and different time periods. For example, European companies might be ahead with ESG disclosures because of stricter regulation in the EU. (Spoiler: next week’s paper will provide such an out-of-sample test. So I am actually quite confident that the results are not just statistical artifacts.)
All limitations aside, the results offer some very important insights. First, the ESG profile of a company is related to its characteristics and its cost of capital. As a consequence, shareholders and boards should care about ESG. Second, not only the ESG rating matters but also (and probably even more) the disagreement among different data providers. Investors should therefore (if budget permits) always try to incorporate multiple ESG ratings and also look at their dispersion. They may even have the chance to earn some return premium by focusing on stocks where ESG disagreement is highest.
Since I became somewhat interested in the topic, I will also devote next week’s post to the disagreement of ESG ratings. So it’s again a little series and the next paper will go into more detail about how and why ESG ratings are so different.