27 Aug 2008
If you are sitting for a financial exam like the CFA or FRM, it's great practice to compute the correlation coefficient by hand. In the spreadsheet below, I calculate the correlation between two hedge fund strategies: high-yield and equity hedge. To keep it simple, I use only the monthly returns since the start of 2006, courtesy of Hedge Fund Research Inc. Here is the scatterplot:
It is good practice to look at the scatterplot first, because not all relationships are linear. Correlation measures linear association, and sometimes we mistakenly fit a linear relationship where something more complex would be better. But we can see that the relationship above is clearly linear. They are rarely so obvious!
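To see why the scatterplot matters, here is a minimal sketch with made-up numbers (not the hedge fund data). The `pearson` helper is a name I chose for illustration. The curved series depends perfectly on x, yet its correlation is zero, while the straight-line series gives a correlation of one.

```python
import math

def pearson(xs, ys):
    """Pearson correlation computed from first principles."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_x = sum((x - x_bar) ** 2 for x in xs)
    ss_y = sum((y - y_bar) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)

# Illustrative values only, chosen to be symmetric around zero
xs = [-3, -2, -1, 0, 1, 2, 3]
curve = [x ** 2 for x in xs]      # perfect dependence, but not linear
line = [2 * x + 1 for x in xs]    # perfect linear dependence

r_curve = pearson(xs, curve)
r_line = pearson(xs, line)
print(r_curve)  # 0.0
print(r_line)   # 1.0
```

A correlation near zero therefore does not mean "no relationship"; it only means no linear one.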
The correlation coefficient (often denoted by the Greek letter rho, $\rho$) is given by:

$$\rho_{X,Y} = \frac{\operatorname{cov}(X, Y)}{\sigma_X \, \sigma_Y}$$
The correlation coefficient is the covariance divided by the product of the standard deviations. The numerator, the covariance, is a measure of co-movement between the two variables. For our scatterplot above, it happens to be about 0.4. But the problem with covariance is that it's hard to make sense of: its size depends on the units of the two variables. To make it sensible, we divide by the product of the standard deviations. That standardizes the covariance and translates it into a unitless number called the correlation coefficient (or just "correlation"). The correlation runs from -1.0 (perfect negative correlation) to 1.0 (perfect positive correlation).
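Here is a rough sketch of why covariance is hard to read on its own. The returns are hypothetical (they are not the Hedge Fund Research figures) and the function names are mine. Quoting the same returns in percent instead of decimals multiplies the covariance by 10,000 but leaves the correlation untouched.

```python
import math

def covariance(xs, ys):
    """Sample covariance: sum of cross-products divided by n-1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Covariance standardized by the product of standard deviations."""
    sd_x = math.sqrt(covariance(xs, xs))  # variance is the covariance of X with itself
    sd_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sd_x * sd_y)

# Hypothetical monthly returns, expressed as decimals
x_dec = [0.010, -0.005, 0.020, 0.003, -0.012, 0.015]
y_dec = [0.008, -0.002, 0.025, 0.001, -0.020, 0.010]

# The same returns expressed in percent
x_pct = [100 * x for x in x_dec]
y_pct = [100 * y for y in y_dec]

cov_dec = covariance(x_dec, y_dec)
cov_pct = covariance(x_pct, y_pct)
corr_dec = correlation(x_dec, y_dec)
corr_pct = correlation(x_pct, y_pct)

print(cov_pct / cov_dec)    # approximately 10000: covariance changes with the units
print(corr_dec, corr_pct)   # the two correlations agree
```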
For our scatterplot above, the correlation is about 0.78, or 78%. That's a number we can evaluate, because it's unitless, and it's very high: 5% or 10% would be low, while anything above 30% or 40% is a pretty strong correlation. (It depends on who you ask. Some think correlations should be higher, but I think that any time you can show any relationship to a single factor, you've got something. Most financial assets are exposed to a dizzying complex of factors, many unknown at the time of testing. If you think about it, a single-factor model is sort of an absurd simplification of reality.)
The steps to compute correlation are the following.
The first step is to compute the covariance, given by:

$$\operatorname{cov}(X, Y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$
The numerator is the sum of cross-products. The x-bar and y-bar are the averages. So, we take the difference between the X-observation and the X-average, then multiply it by the difference between the Y-observation and the Y-average. That's a cross-product. We sum those for all observations, then divide by n-1. Dividing by n-1 treats the observations as a sample; i.e., they are not the full population of observations. As you will see in the spreadsheet below, we can also just divide by n. The covariance itself will differ, but if we divide by n consistently (in the covariance and in both standard deviations), the divisors cancel and we get the same correlation.
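The covariance step can be sketched in a few lines. The five observations are invented so the arithmetic is easy to follow by hand, and the `sample` flag is my own naming for the choice between dividing by n-1 and by n.

```python
def covariance(xs, ys, sample=True):
    """Sum of cross-products divided by n-1 (sample) or n (population)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # One cross-product per observation: (x - x_bar) * (y - y_bar)
    cross_products = [(x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)]
    divisor = n - 1 if sample else n
    return sum(cross_products) / divisor

# Illustrative data only; both averages are 3.0
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 1.0, 4.0, 3.0, 5.0]

cov_sample = covariance(xs, ys)                    # cross-products sum to 8.0; 8.0 / 4
cov_population = covariance(xs, ys, sample=False)  # 8.0 / 5
print(cov_sample)      # 2.0
print(cov_population)  # 1.6
```

The cross-products here are 2, 2, 0, 0 and 4, so the two covariances differ only through the divisor.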
Then we compute standard deviations for both the X and the Y series. For X, for example, it is the square root of the variance:

$$\sigma_X = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$
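As a quick sketch of this step, the hand-rolled function below (`sample_std_dev` is a name I made up) takes the square root of the sample variance. The input values are illustrative, not the fund returns.

```python
import math

def sample_std_dev(xs):
    """Square root of the sample variance (divisor n-1)."""
    n = len(xs)
    x_bar = sum(xs) / n
    variance = sum((x - x_bar) ** 2 for x in xs) / (n - 1)
    return math.sqrt(variance)

# Illustrative data only: squared deviations sum to 10, so the variance is 2.5
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
sd_x = sample_std_dev(xs)
print(round(sd_x, 4))  # 1.5811
```

Python's standard library offers `statistics.stdev` for the same n-1 calculation, which is a handy cross-check on the hand computation.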
Finally, we divide the covariance by the product of the standard deviations:

$$\rho_{X,Y} = \frac{\operatorname{cov}(X, Y)}{\sigma_X \, \sigma_Y}$$
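Putting the three steps together, a minimal sketch might look like the following. The data are made up and the `sample` flag is my own convention. The sketch also illustrates that dividing by n or by n-1 throughout gives the same correlation.

```python
import math

def correlation(xs, ys, sample=True):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    divisor = n - 1 if sample else n   # applied consistently to all three pieces
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / divisor
    sd_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / divisor)
    sd_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / divisor)
    return cov / (sd_x * sd_y)

# Illustrative data only
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 1.0, 4.0, 3.0, 5.0]

r_sample = correlation(xs, ys)                    # divides by n-1 throughout
r_population = correlation(xs, ys, sample=False)  # divides by n throughout
print(round(r_sample, 4), round(r_population, 4))  # 0.8 0.8
```

The divisor appears once in the numerator and, through the two square roots, once in the denominator, so it cancels.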
The best way to understand this is to work through an example. Here is an EditGrid spreadsheet. The covariance is computed in yellow; the standard deviations in blue; and the correlation in green. You can open your own read/write copy here.