Friday, May 31, 2013

Dynamic Asset Allocation for Practitioners Part 1: The Many Faces of Momentum

A (Very) Short History of Dynamic Asset Allocation

The field of tactical or dynamic asset allocation has grown dramatically since Mebane Faber published what is perhaps the first broadly accessible paper on the topic in 2007, 'A Quantitative Approach to Tactical Asset Allocation'. Faber's original paper utilized a simple 10 month moving average as a signal to move into or out of a basket of 5 major global asset classes. Over the period 1970 through the paper's 2009 update, this technique generated better returns than any of the individual assets in the sample universe - U.S. and EAFE stocks, U.S. real estate, Treasuries and commodities - and with substantially lower risk than the equal weight basket or a 60/40 stock/Treasury portfolio.


In 2009 Faber published a follow-up paper called 'Relative Strength Strategies for Investing' which introduced the concept of price momentum as a way to distinguish between strong and weak assets in the portfolio. That paper applied an intuitive method of capturing asset class momentum that involved averaging each asset's rate of change (ROC) across five lookback horizons, specifically 1, 3, 6, 9 and 12 months. By averaging across lookback horizons, this approach captures momentum at multiple periodicities, and also identifies acceleration by implicitly weighting near-term price moves more heavily than price moves at longer horizons.
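As a minimal sketch of the averaged ROC idea described above, assuming a list of month-end prices ordered oldest to newest (the function names and the sample series are hypothetical, not taken from Faber's paper):

```python
# Illustrative sketch only: helper names and data are assumptions.
LOOKBACKS_MONTHS = (1, 3, 6, 9, 12)

def rate_of_change(prices, months):
    """Simple return over the last `months` monthly observations."""
    return prices[-1] / prices[-1 - months] - 1.0

def average_roc(prices, lookbacks=LOOKBACKS_MONTHS):
    """Average the rate of change across all lookback horizons."""
    if len(prices) <= max(lookbacks):
        raise ValueError("need at least max(lookbacks) + 1 month-end prices")
    return sum(rate_of_change(prices, m) for m in lookbacks) / len(lookbacks)

# Made-up series: 13 month-end prices rising by 1 each month.
prices = [100.0 + i for i in range(13)]
score = average_roc(prices)
```

Because the most recent month's price move sits inside all five windows while a move twelve months ago sits inside only one, the average leans toward recent behaviour without any explicit weighting scheme.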


In May of 2012 we published a whitepaper entitled, "Adaptive Asset Allocation: A Primer", a quantitative systematic methodology integrating the simple ROC based momentum concepts introduced in Faber's 'Relative Strength' paper with techniques derived from the portfolio optimization literature. Specifically, the paper explained how applying a minimum variance optimization overlay to a portfolio of high momentum assets serves to stabilize and strengthen both absolute and risk-adjusted portfolio performance.


Article Series


We are going to range far and wide in our exploration of global dynamic asset allocation. This article, the first in our series, will explore a variety of methods to rank assets based on price momentum. The second article will introduce several approaches to rank assets based on risk-adjusted momentum measures. The third article will introduce a framework for thinking about portfolio optimization, including several heuristic and formal optimization methods.


Our fourth article will discuss ways of combining the best facets of momentum with the best techniques for portfolio optimization to offer a coherent framework for global dynamic asset allocation. The objective here will be robustness and logical coherence rather than utilizing optimization for best in-sample simulation performance. 


Lastly, we are considering introducing some ensemble concepts and adaptive frameworks as a cherry on top, but we aren't sure how far we want to go yet, so we'll just get started and see where it takes us.


The following illustrates the proposed framework for this article series:




Methodology


This first article will explore a variety of methods for identifying trend strength for asset allocation, with the goal of comparing and contrasting the various methods under different assumptions of portfolio concentration and asset universe specifications.


We take the position that portfolio concentration is a source of potential data mining bias because the results for dynamic asset allocation approaches can vary widely depending on the number of top assets that are held in the portfolio at each rebalance. Some approaches do better with more concentration, and others with less. We will test with concentrations of top 2, 3, 4 and 5 assets and average the results.


The asset universe can serve as a source of potential 'curve fitting' as well, as it is easy and compelling to want to remove assets from the universe that drag down returns in simulation, or add assets with strong results over the backtest horizon.


To avoid this trap, we run our simulations on a diversified universe of 10 global asset classes, as well as ten other asset universes where we drop one of the ten original assets. This helps to control for the chance that strong performance is simply the result of one dominant asset class over the period.


The ten asset classes we will use for all testing are:

  • Commodities (DB Liquid Commodities Index)
  • Gold
  • U.S. Stocks (Fama French top 30% by market capitalization)
  • European Stocks (Stoxx 350 Index)
  • Japanese Stocks (MSCI Japan)
  • Emerging Market Stocks (MSCI EM)
  • U.S. REITs (Dow Jones U.S. Real Estate Index)
  • International REITs (Dow Jones Int'l Real Estate Index)
  • Intermediate Treasuries (Barclays 7-10 Year Treasury Index)
  • Long Treasuries (Barclays 20+ Year Treasury Index)

Importantly, this article is NOT about parameter optimization; for all tests we use the exact same lookback parameter lengths (where applicable) to avoid the distraction of searching for in-sample parameter optima, which will almost certainly NOT prove to be true optima out of sample.

It is important to decide how we will evaluate the relative efficacy of the various approaches before we start testing. For each strategy we will show the average statistics for all simulations with 2, 3, 4, and 5 holdings, and across all 11 asset universes. Recall that we are testing the full 10 asset class universe, as well as 10 other 9 asset class universes where one of the original assets is removed. So the statistics for each strategy will actually represent an average (median) of 44 simulations (4 portfolio concentrations x 11 universes). We will then present modified histograms to illustrate the range of outcomes for each strategy. This represents a rare test of robustness across methodologies.
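The 44-simulation test grid can be sketched as follows. This is only an illustration of the bookkeeping, with shortened asset labels and hypothetical function names; the backtest itself is not shown.

```python
from itertools import product

# Shortened labels for the ten asset classes listed above.
ASSETS = [
    "Commodities", "Gold", "US Stocks", "European Stocks", "Japanese Stocks",
    "EM Stocks", "US REITs", "Intl REITs", "Intermediate Treasuries",
    "Long Treasuries",
]
CONCENTRATIONS = (2, 3, 4, 5)  # number of top-ranked assets held

def build_universes(assets):
    """Return the full universe plus every leave-one-out universe."""
    universes = [tuple(assets)]
    for dropped in assets:
        universes.append(tuple(a for a in assets if a != dropped))
    return universes

universes = build_universes(ASSETS)
# Every (universe, concentration) pair is one simulation.
test_grid = list(product(universes, CONCENTRATIONS))
```

Each indicator would then be run once per entry in `test_grid`, and the reported statistic is the median across those runs.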


Toward the bottom of this article, we demonstrate how combining all of the indicators into a naive ensemble delivers better performance than any of them individually.


Momentum Metrics

For the purpose of this article we used 8 indicators for measuring trend strength. All of the metrics rank assets at monthly rebalance periods based on an average of values observed over the lookback windows described above, which were chosen to be consistent with Mebane Faber's original momentum paper.


The intuition behind testing a variety of momentum techniques is that different measures capture trend strength in slightly different ways, so combining them through a simple or more advanced ensemble process may stabilize the overall momentum estimate.


The following list describes the mechanics of each method of momentum calculation, where t is the current date, n is the lookback parameter, and N is the number of assets in the testing universe. Note that each indicator is calculated at each of the 5 lookbacks, and then the indicators are averaged across lookbacks to generate the final measure.

  • Total return - this is the most common measure of momentum, where assets are ranked on their historical total returns.


  • SMA Differential - this technique uses the differential between a shorter term and longer term moving average as the momentum measure. We used our standard parameters to define the length of the longer SMAs. The length of corresponding short SMAs was simply 1/10th of the length of the longer SMA. So for example, for the 120 day parameter, we measured the differential between the 120 day SMA and the 12 day SMA.


  • Price to SMA Differential - similar to SMA Differential, except that this technique uses the differential between the current price and the n-day SMA rather than using a shorter moving average. 


  • SMA Instantaneous Slope - For this metric we derived the instantaneous slope of each moving average. Essentially this measures the rate of change of each SMA using the difference between yesterday's SMA and today's SMA.


  • Price Percent Rank - This metric captures the location of the current price relative to the security's range over each lookback period. The lowest price over the period would have a rank of 1, while the highest would have a rank of 100. The median price over the period would have a rank of 50.

  • Z-Score - Analogous to the Price Percent Rank, the z-score captures how far the current price deviates from the average price over the period, measured in standard deviations.

  • Z-Distribution - This method transforms the z-score to a percentile value on the cumulative normal distribution. Under this framework the trend strength measure will accelerate in magnitude as the price strays further away from the mean. The function to perform this translation is complicated, but it can be easily generated in Excel using the Norm.S.Dist(z, TRUE) function.


  • T-Distribution - The normal distribution is valid when the sample size is large enough so that the sample is likely to be representative of the population. Under conditions where the sample size is small and the parameters that describe the distribution are unknown, a more appropriate choice is the Student's t-distribution. The t-distribution transforms a t-score into a percentile given the number of degrees of freedom. The degrees of freedom are equal to (n - 1).
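The mechanics in the list above can be sketched for a single lookback n as follows. This is a hedged illustration, not our production code: the function names are hypothetical, prices are assumed to be a daily series ordered oldest to newest, and the differentials are expressed as percentages (an assumption made here so that assets with different price levels are comparable). The t-distribution variant is left out because the Python standard library has no t-distribution CDF; a library such as SciPy (`scipy.stats.t.cdf`) could supply one.

```python
from statistics import NormalDist, mean, stdev

def sma(prices, n):
    """Simple moving average of the last n prices."""
    return mean(prices[-n:])

def total_return(prices, n):
    """Return over the last n periods."""
    return prices[-1] / prices[-1 - n] - 1.0

def sma_differential(prices, n):
    """Short SMA (1/10th of n) relative to the n-period SMA."""
    short = max(1, n // 10)
    return sma(prices, short) / sma(prices, n) - 1.0

def price_to_sma_differential(prices, n):
    """Current price relative to the n-period SMA."""
    return prices[-1] / sma(prices, n) - 1.0

def sma_slope(prices, n):
    """Change from yesterday's n-period SMA to today's."""
    yesterday = mean(prices[-n - 1:-1])
    return sma(prices, n) / yesterday - 1.0

def price_percent_rank(prices, n):
    """Share of prices in the window at or below the current price (0-100)."""
    window = prices[-n:]
    at_or_below = sum(1 for p in window if p <= window[-1])
    return 100.0 * at_or_below / len(window)

def z_score(prices, n):
    """Standard deviations between the current price and the window mean."""
    window = prices[-n:]
    return (window[-1] - mean(window)) / stdev(window)

def z_distribution(prices, n):
    """Percentile of the z-score on the cumulative standard normal."""
    return NormalDist().cdf(z_score(prices, n))

# Made-up steadily rising series, evaluated at a 20-day lookback.
prices = [100.0 + i for i in range(30)]
n = 20
```

In the full procedure each of these would be evaluated at all five lookbacks, standardized across assets, and then averaged.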


It is worth noting that the SMA-differential approaches described above are related to typical moving average crossover systems applied in trend following. The critical difference in our proposed framework is that, unlike trend following approaches, which measure the 'state' of a trend, our momentum indicators measure the 'strength' of a trend. Crossover systems are either long or short (triple crossovers can be neutral too), which means they are 'binary' variables, whereas the momentum indicators provide continuous variables that allow us to compare the relative strength of the trends.

Importantly, because we measure trend strength across five lookback time horizons, the cross sectional measure of price momentum needs to be standardized. It is silly to average an annualized 20 day ROC with a 250 day ROC because the 20 day ROC will deliver much more extreme values, on average, than the 250 day ROC, and will dominate the momentum measure commensurately.


There are a number of ways to standardize the momentum measure across lookbacks, but the method we used was to calculate each asset's proportion of total absolute cross sectional momentum across all assets over each lookback horizon, holding the sign constant.
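A minimal sketch of this standardization for one lookback horizon, assuming raw momentum values are held in a dictionary keyed by asset (the function name and the sample values are hypothetical): each asset's score is its raw momentum divided by the sum of absolute raw momentum across all assets, so the sign is preserved and the absolute scores sum to one.

```python
def standardize_cross_section(raw_scores):
    """Map {asset: raw momentum} to signed shares of total absolute momentum."""
    total_abs = sum(abs(v) for v in raw_scores.values())
    if total_abs == 0:
        # No momentum anywhere in the cross section: every share is zero.
        return {asset: 0.0 for asset in raw_scores}
    return {asset: v / total_abs for asset, v in raw_scores.items()}

# Made-up raw momentum values for a single lookback.
raw_20d = {"Gold": 0.30, "US Stocks": 0.10, "Long Treasuries": -0.10}
std_20d = standardize_cross_section(raw_20d)
```

Here the total absolute momentum is 0.50, so Gold receives a score of 0.6, U.S. stocks 0.2, and long Treasuries -0.2. Because every lookback is rescaled the same way, a 20 day window can no longer dominate a 250 day window simply by producing larger raw numbers.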




Again, standardized momentum scores are then averaged across all lookbacks to determine the final momentum score for each asset.
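The final step can be sketched as follows, assuming the standardized scores are already available per lookback. The asset labels, function names and numbers are made up for illustration; holdings are equally weighted, as in all tests in this article.

```python
def final_scores(standardized_by_lookback):
    """Average each asset's standardized score across all lookbacks."""
    lookback_scores = list(standardized_by_lookback.values())
    assets = lookback_scores[0].keys()
    count = len(lookback_scores)
    return {a: sum(s[a] for s in lookback_scores) / count for a in assets}

def top_k_equal_weight(scores, k):
    """Hold the k highest-scoring assets in equal weight."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return {asset: 1.0 / k for asset in ranked[:k]}

# Made-up standardized scores at two lookbacks for three assets.
standardized = {
    20: {"A": 0.6, "B": 0.2, "C": -0.2},
    250: {"A": 0.2, "B": 0.5, "C": 0.3},
}
scores = final_scores(standardized)
weights = top_k_equal_weight(scores, k=2)
```

In this toy case asset A averages 0.40, B averages 0.35 and C averages 0.05, so a two-holding portfolio would own A and B at 50% each.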

Results


Table 1 displays the salient statistics for tests of each of the momentum methods (indicators) described above. Each cell describes the median performance across all 44 combinations (holding 2 - 5 positions, 11 universe combinations) that we tested for each methodology.


Table 1. Median performance summary


Data sources: Bloomberg

The instantaneous slope method seems to deliver the best median performance statistics all around, with the highest overall returns, the highest Sharpe, and the lowest median Maximum Drawdown of all methods. But the median is just one point on the distribution; let's see what the range of outcomes looks like for each system.


Performance Distribution


Charts 1 through 9 below show all 44 of the equity lines (4 concentrations x 11 universe combinations) that were used to calculate the median performance measures in Table 1 for each momentum indicator. The first highlighted chart shows all 352 equity lines derived from all 44 portfolio combinations across all 8 indicators.


Charts 1 - 9: Equity lines for indicators across 44 universe/concentration combinations


Source: Bloomberg


A few observations stand out from these charts. First, they all look a little different, with waves of surges and drawdowns occurring at slightly different times across momentum measures, though all of them show a drawdown in 2008. Second, notice that in some charts the equity lines all cluster together in a narrow range - z-score and instantaneous slope stand out in this respect - while others exhibit quite a wide range of outcomes - price to SMA differential and SMA differential for example.


Charts 10 through 43 below quantify the distribution of return, Sharpe, and Maximum Drawdown outcomes across all 44 versions of each momentum system using a cumulative histogram. In our opinion, the most realistic way to evaluate the performance of a system is on the basis of performance statistics near the bottom of the distribution. The worst outcome could be an outlier, but traditional tests of statistical significance focus on 5th percentile outcomes, so that is where we focus our attention. In each series of charts, we have highlighted the approach with the best results at the 5th percentile.
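To make the percentile reading concrete, here is a small sketch of how a 5th percentile and median might be pulled from the 44 outcomes of one indicator. The CAGR values are invented, and the linear-interpolation convention is an assumption; other percentile conventions give slightly different numbers.

```python
def percentile(values, pct):
    """Percentile with linear interpolation between ordered observations."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    rank = (len(ordered) - 1) * pct / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction

# Made-up CAGRs for 44 simulations, from 8.0% to 12.3% in 0.1% steps.
cagrs = [0.08 + 0.001 * i for i in range(44)]
p5 = percentile(cagrs, 5)
p50 = percentile(cagrs, 50)
```

For return and Sharpe the low end of the distribution is the unfavourable tail. Whether the 5th or 95th percentile is the unfavourable tail for drawdown depends on whether drawdowns are stored as negative or positive numbers.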


Charts 10 - 43: Distribution of performance metrics across 44 universe/concentration combinations.



Range of CAGR by Indicator






Source: Bloomberg


Range of Sharpe(0%) by Indicator

Source: Bloomberg


Range of Max Drawdowns by Indicator

Source: Bloomberg

(You will note that the numbers in the 50% column of the charts above are the same as the numbers in Table 1 summary, as the median is simply the 50th percentile value).

Charts 24 through 26 show the average performance of each methodology with portfolio concentrations of 2 holdings through 5 holdings across all 11 asset universes tested. It is interesting to see that, while more concentrated portfolios tend to deliver higher returns, the highest Sharpe ratios are derived from portfolios with 3 or 4 holdings, and these more diversified portfolios tend toward much lower drawdowns as well.


Chart 24. Average indicator returns across 11 asset universe combinations with different portfolio concentration


Source: Bloomberg

Chart 25. Average indicator return/risk ratios across 11 asset universe combinations with different portfolio concentration

Source: Bloomberg

Chart 26. Average indicator Max Drawdowns across 11 asset universe combinations with different portfolio concentration

Source: Bloomberg

Indicator Diversification


We know from charts 1 - 9 that the different momentum indicators, universe and concentration combinations all deliver slightly different results, where equity rises and falls at slightly different rates. But how different is the performance across indicators really?


Matrix 1. shows the correlations between daily returns for all indicator combinations. The daily returns for all universe and concentration combinations were averaged to generate the final return series for each indicator.
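The calculation behind the matrix can be sketched as follows, with invented daily returns and only two indicators and two runs each for brevity (the function names are hypothetical). Runs are first averaged day by day into one series per indicator, then Pearson correlations are computed pairwise.

```python
from math import sqrt

def average_series(series_list):
    """Average several equally long return series observation by observation."""
    return [sum(vals) / len(vals) for vals in zip(*series_list)]

def pearson(x, y):
    """Pearson correlation between two equally long series."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def correlation_matrix(returns_by_indicator):
    """Nested dict of pairwise correlations between indicator return series."""
    names = list(returns_by_indicator)
    return {
        a: {b: pearson(returns_by_indicator[a], returns_by_indicator[b])
            for b in names}
        for a in names
    }

# Made-up daily returns: two runs per indicator, four days each.
roc_runs = [[0.010, -0.020, 0.015, 0.000], [0.012, -0.018, 0.010, 0.002]]
slope_runs = [[0.008, -0.015, 0.020, -0.001], [0.010, -0.020, 0.012, 0.001]]
returns = {
    "ROC": average_series(roc_runs),
    "Slope": average_series(slope_runs),
}
matrix = correlation_matrix(returns)
```

In the actual test each indicator's series would be the average of 44 runs rather than two.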


Matrix 1. Pairwise correlations between indicators (average of 44 combinations for each indicator)

Source: Bloomberg



The correlations range from about 0.8 between the t-distribution method and the ROC and instantaneous slope methods, to 0.99 between instantaneous slope and ROC. The average pairwise correlation is 0.936. With correlations so high, to what extent can we take advantage of the different methods to create a diversified system composed of all 352 combinations? Chart 27 and Table 2. give us the answer.


Chart 27. Aggregate Index of all 352 indicator/universe/concentration combinations, equal weight.

Source: Bloomberg


Table 2. Summary statistics for Aggregate Index



Source: Bloomberg

Despite the high average correlations between the different momentum systems, the aggregate equity line provides a material boost to all risk adjusted statistics. While the returns are about 1% below the returns derived from the best individual indicators, the Sharpe(0%) ratio is better than all of them. Further, the average volatility of the 8 individual systems is 12.42%, while the volatility of the aggregate system is the lowest of all at just 11.1%. Lastly, the aggregate system exhibits the lowest drawdowns and the highest percentage of positive rolling 12-month periods.


Obviously it is impractical to run 352 models in parallel, even if they are all closely related. Moreover, this approach is far from the best method to aggregate all of the information from the different indicators; we will touch on different methods of aggregation in our fourth instalment of this series. However, there is clearly value in finding ways to blend various momentum factors to create a more stable allocation model.


Conclusions and Next Steps


In this article we have explored a variety of methods to measure the price momentum of a universe of asset classes for the purpose of creating global dynamic asset allocation models. Given our objective to avoid as much optimization as possible, we tested each momentum method using 4 different levels of portfolio concentration, and 11 slightly modified asset class universes. This approach provided 44 distinct tests for each indicator, which allowed us to investigate the stability of each indicator across parameters. 


Among the individual momentum indicators we tested, the instantaneous slope method delivered the best performance in terms of median returns, Sharpe(0%), drawdowns and percent positive 12-month periods. 


We investigated the impact of different levels of portfolio concentration on performance and discovered, perhaps not surprisingly, that more concentrated portfolios deliver stronger returns, while some diversification does improve risk-adjusted outcomes.


We examined the correlations between the indicator systems and determined that they are all closely related, with average pairwise correlations of about 0.93. However, even with such high average correlations, aggregating the systems into one system composed of all 352 indicator/concentration/universe combinations delivered the most stable results of all.


Article 2 in our series will perform a similar analysis of several risk-adjusted momentum measures, such as Sharpe ratio, Omega ratio, and Sortino ratio. As in this article, Article 2 will hold all portfolio positions in equal weight, but Article 3 will introduce methods to optimize the weights of portfolio holdings to further improve absolute and risk-adjusted returns - quite significantly.


We are just scratching the surface of what is possible with tactical alpha. Chart 27 and Table 3. offer a glimpse of what's to come. Stay tuned.


[Update: Charts and Performance are updated through end of May]

Chart 27. Mystery system
Source: Bloomberg 

Table 3. Mystery system stats

Wednesday, May 29, 2013

Triumph of the Ostriches

Obviously the title of this post is a riff on the classic book, Triumph of the Optimists by Dimson, Marsh and Staunton (2002). The book describes 100 years of investment returns across a range of markets, making a number of important conclusions. Unsurprisingly, the financial marketing complex narrowly focused its attention on the strong returns the authors observed in equity markets over the post WWII period. 

However, more balanced readers would have noted the authors' belief that the historical indexes analyzed in the book overstate long-term performance because they are contaminated by survivorship bias. The authors are clearly of the opinion that long-term stock returns are seriously overestimated, due to a focus on periods that with hindsight are known to have been successful, while markets that were unsuccessful were not included. In other words, the analysis assumes that an investor would have known which markets would succeed and which would fail in advance.

Many of the most important countries at various stages of the past dozen decades were not included in the study because of a lack of consistent data. For example, Russia, China, Latin America, Eastern Europe and Southeast Asia were all largely ignored. Several of these countries experienced terminal collapses of equity markets and/or bond markets and/or currencies, which explain some of the discontinuities in the data.

Triumph of the Ostriches refers to the current situation in global markets where 'Ostrich' investment managers have been killing it. Recall that (according to the untrue but pervasive myth) an ostrich sticks its head in the sand at any sign of danger; presumably the ostrich perceives that if it doesn't see the danger, then the danger doesn't exist. In the same way, Ostrich managers ignore any and all signs of market risk in the hope that these risks won't materialize on their watch.

Throwing caution to the wind has been very profitable - so far. But if history is any guide, there are many reasons for investors to consider taking a much more cautious stance.

Here are the facts: according to every valuation metric that matters (i.e. with statistical significance through history), stocks are quite expensive. Further, when stocks have been this expensive, returns in future periods have been very low.

How low? The following table summarizes the statistical midpoint of future returns over the next 5, 10, 15, and 20 years based on an ensemble of valuation metrics including the Q ratio, cyclically adjusted PE, aggregate corporate market capitalization to GNP, and long-term price residuals. Those wishing to explore the mechanics behind this analysis are invited to read the full report here. Those looking for a second, third, fourth or fifth opinion from other well known firms will find them here.

Table 1. Statistical Return Forecasts for U.S. Stocks Over Relevant Investment Horizons
Source: Shiller (2013), DShort.com (2013), Chris Turner (2013), World Exchange Forum (2013), Federal Reserve (2013), Butler|Philbrick|Gordillo & Associates (2013)

If you read the report, you will note that the valuation metrics cited most often by analysts and commentators to suggest the market is cheap - price to current or forward earnings for example - have almost no statistical significance. It doesn't matter at all whether these valuations suggest that markets are fairly priced; they carry no information about likely future stock market returns.


Critically, the table above has no bearing whatsoever on what will happen to markets over the next year or two, or perhaps longer. A. Gary Shilling said in 1993, "Markets can remain irrational longer than you can remain solvent", and they certainly did. In 1994 the U.S. stock market traded to valuation levels never before witnessed over the prior century, but price multiples almost doubled again from the lofty 1994 levels before the U.S. market peaked in early 2000.

Of course, despite the most aggressive global monetary experiment in modern history in 2000 - 2003 and 2008 - present, markets since the 2000 peak have delivered very poor returns, in the range of 3% per year through the end of April 2013. What will happen to stock market returns as interest rates eventually normalize? History suggests investors will be in for a rough ride.

We have yet to see any evidence-based argument for why the valuation-based analysis presented above is not relevant. What do we mean by 'evidence-based'? Show us numbers to support an alternative hypothesis, and then show us how those numbers have served to forecast returns in other periods with statistical significance.

Those Ostriches who do attempt to defend their perpetually bullish stance often cling to arguments based on the Equity Risk Premium (ERP), but this methodology has a serious flaw in the current environment which invalidates it. Specifically, the ERP is measured as a spread to risk-free rates, but risk-free rates have been held at artificially low levels by central banks - this is their express goal, and they have committed well over $150 billion per month to this objective. How can we calculate a meaningful spread where one end of the spread is corrupt?

Less analytical market prophets loudly proclaim that the current environment has no analog in modern financial history, so comparisons with other periods are not useful in making judgments about expectations. To these Ostriches we humbly ask, "If we can't use historical context to frame the current market environment, what exactly are we supposed to use?"

Other memes relate to the idea of a 'permanently high plateau' (incidentally, the great 20th century economist Irving Fisher coined that phrase in 1929, just three days before the crash that preceded the Great Depression). Purveyors of this delusion cite the current 'pollyanna' environment for global corporations as validation for stratospheric equity valuations. "Corporations have record high cash positions", they crow, "get ready for the great buyback and merger wave that's coming!" "Profit margins are high, corporate taxes are near all-time lows, wage pressures are non-existent - corporations have never had it better! Oh and financing is effectively free!"

Unfortunately the wailing equity zealots do not factor in Stein's Law, which states, "If something cannot go on forever, it will stop." In a period of record fiscal duress, what is the probability that corporations will continue to receive favourable tax status? According to GMO's analysis, corporate profit margins are one of the most mean-reverting series in finance, so why would we value markets under the assumption that they will stay high forever? Further, how valuable is the cash on corporate balance sheets if there is an equally large debt balance on the other side of the ledger (there is)?

The Ostriches aren't concerned with valuation metrics or Stein's Law, and let's face it, they've been right to stick their heads in the sand - at least so far. The problem is that in markets we won't know who is right until the bottom of the final cyclical bear in this ongoing secular bear market. Only then will we see just how far from fundamentals the authorities have managed to push prices, and only then will we see whether it really is different this time.

Until then, investors can choose facts or faith. The facts say that investors are unlikely to be compensated at current valuations for the risks of owning stocks over the next few years. The church of equities says, 'don't worry about it'. So far the Ostriches have it, but all meaningful evidence suggests that over the next few years the Ostriches are going to feel like turkeys - at Thanksgiving.

Tuesday, May 14, 2013

The Whole is Greater than the Sum of the Parts

One of the most mind-blowing implications of portfolio theory is that a well conceived portfolio has the potential to be much better, in terms of risk adjusted performance, than what we might expect from the sum of the individual portfolio holdings.

Not incidentally, the name of our blog - GestaltU - relates directly to this concept. Contrary to the dominant framework of reductionism, which holds that the most effective way to understand something is to understand its parts, Gestalt theory asserts that many things cannot be understood by understanding the components, because the 'whole' is greater than the sum of the parts.

This is more obvious in some fields than others. For example, can a person intuit the qualities of water from an understanding of the properties of hydrogen and oxygen (without a deep understanding of quantum mechanics)? Can you effectively comprehend the experience of carrot cake from an understanding of the ingredients?

The famous World Wildlife Fund logo is an example of a Gestalt because the brain identifies that the conglomeration of irregular black shapes in the image is actually a panda bear. It is not the shapes themselves, but the orientation of the shapes and how they fit together that communicates the salient information contained in the image.


Most investors pay much more attention to the process of identifying the individual characteristics of the assets they want to own than they commit to the process of identifying how well the assets might fit together in a portfolio. But what if the individual characteristics of the assets are less important than the way they work together?

We've been meaning to get a post up on this topic for a while, but a recent paper published by Cass Business School and sponsored by institutional consultant Aon Hewitt provided the ammunition we've been looking for. Their paper, which is in two parts, is called 'An evaluation of equity indices'. Part 1 examines 'Heuristic and optimized weighting schemes' and Part 2 explores 'Fundamental Weighting Schemes'. This framework works beautifully to illustrate the relative importance of portfolio optimization versus fundamental stock selection because it compares the realized excess risk-adjusted performance of pure risk-based optimization methods to methods based on traditional fundamental security selection.

Note that the authors used a universe of the top 1000 stocks by market cap in each year from 1968 - 2012. 

In Part 1., the researchers describe a variety of ways to dynamically generate optimal stock portfolios where there is no effort to emphasize individual return characteristics at the security level. Rather, portfolios were assembled purely on the basis of how constituent stocks were expected to contribute to the overall risk of the portfolio, based on observations of the variance/covariance matrix over a trailing 60 month window. The authors then compared the performance of these optimized portfolios to the ubiquitous market capitalization weighted index, and the more competitive equal weight portfolio. 
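To show how a portfolio built purely from risk estimates can be less risky than any of its parts, here is a two-asset toy version of a minimum variance calculation. It is a sketch under stated assumptions, not the Cass authors' procedure: their optimizations span roughly 1000 stocks and a full covariance matrix, whereas this uses the closed-form two-asset solution with invented volatilities and correlation, and places no constraints on the weights.

```python
from math import sqrt

def min_variance_weights(vol_a, vol_b, corr):
    """Closed-form minimum variance weights for two assets (unconstrained)."""
    cov = corr * vol_a * vol_b
    denom = vol_a ** 2 + vol_b ** 2 - 2.0 * cov
    if denom == 0:
        raise ValueError("assets are perfectly correlated with equal volatility")
    w_a = (vol_b ** 2 - cov) / denom
    return w_a, 1.0 - w_a

def portfolio_vol(w_a, w_b, vol_a, vol_b, corr):
    """Volatility of a two-asset portfolio."""
    variance = (
        (w_a * vol_a) ** 2
        + (w_b * vol_b) ** 2
        + 2.0 * w_a * w_b * corr * vol_a * vol_b
    )
    return sqrt(variance)

# Made-up inputs: 20% and 30% volatility, correlation of 0.2.
vol_a, vol_b, corr = 0.20, 0.30, 0.2
w_a, w_b = min_variance_weights(vol_a, vol_b, corr)
vol_p = portfolio_vol(w_a, w_b, vol_a, vol_b, corr)
```

With these inputs roughly 74% goes to the lower volatility asset, and the portfolio volatility comes out at about 18%, below the 20% of the less volatile holding. No view on expected returns is needed to get that result.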

It is beyond the scope of this article to describe the characteristics of the different optimizations applied by the authors (we highly recommend that you download the paper and read about the various optimizations), but Table 1. summarizes the results.

Table 1. Portfolio optimization results

Source: Cass Business School, 2013

Part 2. explores a variety of fundamental based methods of creating portfolios that emphasize the stocks with attractive fundamental characteristics. The authors create portfolios where stocks are weighted by qualities like dividends, cashflows, book values, and sales to see how these fundamentally weighted portfolios compare to the traditional market cap and equal weight indices. 

Table 2. Fundamental weighting results
Source: Cass Business School, 2013

It is interesting to compare the two methods of portfolio formation. Note that the best portfolio optimization method (in terms of Sharpe ratio), MVP (minimum variance), delivered 10.8% returns with volatility of 11.2% and a maximum drawdown over the full period of -32.5%. On the other hand, the best fundamental weighting method, Sales-weighted (its Sharpe ratio was tied with Dividend-weighted, so we used the Sortino ratio to break the tie), delivered returns of 11.4% with a volatility of 16.2% and a maximum drawdown of -52.6%. Note that the Sharpe ratio of the MVP strategy was 0.5 compared with 0.42 for the Sales-weighted strategy, and the Sortino ratios were 0.59 and 0.53 respectively.

How is it that optimization alone can deliver better risk adjusted performance without any fundamental information about the relative prospects for portfolio constituents? Part of the answer is that optimization tends to indirectly tilt portfolios toward factors that are well known for adding excess returns over time. 

The following table quantifies the annualized difference between the return to the factor exposure of the alternative index relative to the market-cap index. You can see that the optimized portfolio derives meaningful alpha from a small-cap bias relative to the market-cap index. This is unsurprising. What is more surprising is that the optimizations tend to tilt portfolios toward the Fama French Value factor, and away from the momentum factor.

Table 3. The Returns to Factor Exposures
Source: Cass Business School, 2013

Clearly there is an opportunity to combine fundamental stock-picking factors with robust portfolio optimization to deliver better results than either method alone - another Gestalt!

The following table is taken from an S&P Capital IQ presentation published in December 2010. The authors imposed factor tilts on a minimum variance portfolio derived from constituents of the S&P 1500, with the results in Table 4. Note the improved Return/Risk ratios from a combination of FF Value and Earnings Quality tilt portfolios with minimum variance optimization.

Table 4. Minimum Variance with Factor Tilts
Source: Capital IQ, 2010

Investors should take note of the opportunity for better risk-adjusted returns by considering more holistic methods of stock-picking rather than concentrating so much time and effort on identifying individual stocks with prospective characteristics. The whole really can be better than the sum of the parts.