
Coskew and Cokurtosis in Miller Chapter 3 (P1 FRM)

David Harper CFA FRM

Staff member
Thread starter #1
In analyzing Miller's spreadsheet, I realized his book displays the formulas for co-skew and co-kurtosis incorrectly. There is a thematic confusion between the (unstandardized) cross-central moments and "skew" and "kurtosis", which are standardized. Formulas 3.42 and 3.47 forget to divide by n, so they aren't the expected values of (X - mu)^m; they are instead the sums. To review:
  • The 3rd central moment (aka, moment about the mean) = E[(X - mu)^3]. If you aren't using a probability distribution (but rather a sample), you need to take the average, i.e., the sum of (X - mu)^3 divided by n, to get E[(X - mu)^3]. But that's a "raw" or un-standardized 3rd central moment, whose units are hard to interpret, in the same way that it's hard to make sense of the units of a variance b/c variance is the un-standardized second central moment E[(X - mu)^2]
  • Skew, then, is the standardized 3rd central moment, S = E[(X - mu)^3]/sigma(X)^3 = 3rd central moment/sigma^3 (dividing by sigma^3 is what standardizes it)
    Kurtosis, then, is the standardized 4th central moment, K = E[(X - mu)^4]/sigma(X)^4
  • So 3.47 is incorrect: if it's a sample, it needs to divide the sum by (n). If it's a probability distribution, then to the same effect, the formula needs to be K = Sigma[p(i)*[x(i) - mu]^4]/sigma^4. And if we have probabilities, then we don't need the sample adjustment.
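To make the distinction between a sum and an expected value concrete, here is a minimal Python sketch of the standardized m-th central moment. The function name `standardized_moment` and the numbers are my own illustration (they are not from Miller's text or XLS), and the sketch deliberately applies no sample adjustment: it either weights by the given probabilities or, for a sample, divides the sum by n.

```python
import math


def standardized_moment(xs, m, probs=None):
    """m-th standardized central moment, E[(X - mu)^m] / sigma^m.

    If probs is None, xs is treated as n equally weighted observations,
    so the sum is divided by n. No sample (bias) adjustment is applied.
    """
    n = len(xs)
    if probs is None:
        probs = [1.0 / n] * n
    mu = sum(p * x for p, x in zip(probs, xs))
    variance = sum(p * (x - mu) ** 2 for p, x in zip(probs, xs))
    central = sum(p * (x - mu) ** m for p, x in zip(probs, xs))
    return central / math.sqrt(variance) ** m


# Hypothetical discrete distribution (made-up numbers, for illustration only)
outcomes = [-1.0, 0.0, 1.0]
probs = [0.25, 0.50, 0.25]

skew = standardized_moment(outcomes, 3, probs)  # 0 because the distribution is symmetric
kurt = standardized_moment(outcomes, 4, probs)  # approximately 2
print(skew, kurt)
```

With m = 3 this gives skew and with m = 4 it gives kurtosis; omitting `probs` is the "divide the sum by n" case for a sample.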
I prepared a draft worksheet that I will eventually fold into the associated learning XLS and the study notes; below is a snapshot. As my (X) & (Y) are not small samples but rather probability distributions, I can avoid the sample adjustment. Notice that, consistent with Miller's Table 3.4, co-skew has two "versions" and co-kurtosis has three versions (the non-trivial cross moments).

Here is the draft XLS, in case anybody wants to take a look @ http://trtl.bz/coskew-cokurt
(same as this dropbox file)


David Harper CFA FRM

Staff member
Thread starter #2
I just wanted to append my correspondence with Mike Miller (author of Mathematics and Statistics for Financial Risk Management), who was generous and helpful in his reply.
Here is the final update:
  • You can retrieve Mike's errata here at https://www.dropbox.com/s/2zh56al7vndv3ek/Errata for MSFRM.pdf?dl=0
    Note that, indeed, Chapter 3 of MSFRM has several corrections
  • The formulas in the text for skew (3.42), kurtosis (3.47), co-skew and co-kurtosis (3.51) do need correction
    (although I was initially wrong about the XLS Table 3.3: those values are correct)
I was not very clear, above, about standardization. For that, I think it's useful to contemplate covariance.
Covariance is the raw ("unstandardized") second cross-central moment, E[(X - mu[X])^a*(Y - mu[Y])^b], where we happen to have a = 1, b = 1, such that a+b = 2
Which is just E[(X - mu[X])*(Y - mu[Y])].
We can then standardize covariance into correlation by dividing it by the product sigma(X)*sigma(Y).

Similarly, there are three ways to compute a fourth cross-central moment because a+b=4 in three different ways: a=1, b=3 OR a=2, b=2 OR a=3, b=1 (a=4, b=0 OR a=0, b=4 gives us a fourth "univariate" central moment instead, not a cross-central moment). A fourth cross-central moment can be any of:
  • mu(XXXY) = E[(X - mu[X])^3*(Y - mu[Y])^1], or
  • mu(XXYY) = E[(X - mu[X])^2*(Y - mu[Y])^2], or
  • mu(XYYY) = E[(X - mu[X])^1*(Y - mu[Y])^3]
Then we standardize these (similar to standardizing covariance into correlation) into co-kurtosis:
  • K(XXXY) = E[(X - mu[X])^3*(Y - mu[Y])^1] / [sigma(X)^3*sigma(Y)^1]
  • K(XXYY) = E[(X - mu[X])^2*(Y - mu[Y])^2] / [sigma(X)^2*sigma(Y)^2]
  • K(XYYY) = E[(X - mu[X])^1*(Y - mu[Y])^3] / [sigma(X)^1*sigma(Y)^3]
Here is one way to look at this: correlation is just a special case of the standardized cross-central moment where a=1, b=1 and a+b=2.
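The (a, b) pattern above can be sketched as a single function. This is only an illustration under my own assumptions: the names `cross_central_moment` and `standardized_cross_moment` and the data are hypothetical, X and Y are given as paired outcomes with joint probabilities (equal weights if none are supplied), and no sample adjustment is applied.

```python
import math


def cross_central_moment(xs, ys, a, b, probs=None):
    """Raw cross-central moment E[(X - mu_X)^a * (Y - mu_Y)^b]."""
    n = len(xs)
    if probs is None:
        probs = [1.0 / n] * n  # equally weighted paired observations
    mu_x = sum(p * x for p, x in zip(probs, xs))
    mu_y = sum(p * y for p, y in zip(probs, ys))
    return sum(
        p * (x - mu_x) ** a * (y - mu_y) ** b
        for p, x, y in zip(probs, xs, ys)
    )


def standardized_cross_moment(xs, ys, a, b, probs=None):
    """Raw cross-central moment divided by sigma(X)^a * sigma(Y)^b."""
    sigma_x = math.sqrt(cross_central_moment(xs, xs, 1, 1, probs))
    sigma_y = math.sqrt(cross_central_moment(ys, ys, 1, 1, probs))
    raw = cross_central_moment(xs, ys, a, b, probs)
    return raw / (sigma_x ** a * sigma_y ** b)


# Hypothetical paired outcomes and their joint probabilities (made-up numbers)
xs = [-2.0, 0.0, 1.0, 3.0]
ys = [1.0, -1.0, 2.0, 4.0]
probs = [0.2, 0.3, 0.3, 0.2]

correlation = standardized_cross_moment(xs, ys, 1, 1, probs)  # a=1, b=1
coskew = {(a, 3 - a): standardized_cross_moment(xs, ys, a, 3 - a, probs)
          for a in (1, 2)}      # two versions: (1,2) and (2,1)
cokurt = {(a, 4 - a): standardized_cross_moment(xs, ys, a, 4 - a, probs)
          for a in (1, 2, 3)}   # three versions: (1,3), (2,2), (3,1)
print(correlation, coskew, cokurt)
```

Passing the same series as both X and Y collapses every version back to the univariate skew or kurtosis, which is a handy sanity check.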

In the XLS, I was thrown by the n/[(n-1)*(n-2)] which multiplies the sum of cross-products. As Mike's email explains (see above), this ratio both takes an average and applies the sample adjustment. To grasp this, I prefer to break it into its two pieces; but keep in mind the sample adjustment applies only if we are computing from a sample, not when we have a probability distribution. But if we have a sample (as in his XLS):
  • The 3rd cross central moment = E[(X-E(X))^2(Y-E(Y))], in the case of a sample (before any adjustment), is given by:
  • mu(AAB) = Sum:[(A - avg A)*(A- avg A)*(B - avg B)] * 1/n
  • The sample adjustment is actually given by n^2/[(n-1)*(n-2)]
  • With the adjustment: mu(AAB) = Sum:[(A - avg A)*(A- avg A)*(B - avg B)] * (1/n) * n^2/[(n-1)*(n-2)]
  • Which explains why we see (1/n) * n^2/[(n-1)*(n-2)] = n/[(n-1)*(n-2)] ...
  • My point is that n/[(n-1)*(n-2)] offers me no intuition at all, but parsing out the 1/n shows the sample adjustment as n^2/[(n-1)*(n-2)], and here we can at least see that it is "plussing up" the estimate, not totally unlike how 1/(n-1) instead of 1/n "plusses up" the unbiased variance. In the case of n=10, for example, the sample adjustment = 100/72, or about 1.39.
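The two-piece breakdown above can be sketched in Python as follows. The function names and the data are hypothetical (they are not the layout of Miller's XLS), and the sketch simply takes the n^2/[(n-1)*(n-2)] adjustment as given rather than deriving why it is unbiased.

```python
def sample_adjustment(n):
    """Adjustment n^2 / [(n-1)*(n-2)] applied on top of the plain average."""
    if n < 3:
        raise ValueError("need at least 3 observations")
    return n ** 2 / ((n - 1) * (n - 2))


def sample_third_cross_moment(a_vals, b_vals):
    """Adjusted sample version of mu(AAB): average first, then adjust."""
    n = len(a_vals)
    avg_a = sum(a_vals) / n
    avg_b = sum(b_vals) / n
    total = sum((a - avg_a) ** 2 * (b - avg_b) for a, b in zip(a_vals, b_vals))
    average = total / n                    # piece 1: the 1/n average
    return average * sample_adjustment(n)  # piece 2: the sample adjustment


# Hypothetical sample (made-up numbers, for illustration only)
a_vals = [1.0, 2.0, 3.0, 6.0]
b_vals = [2.0, 1.0, 4.0, 5.0]
print(sample_adjustment(10))                      # 100/72, about 1.39
print(sample_third_cross_moment(a_vals, b_vals))  # approximately 8
```

For this made-up sample the sum of cross-products is 12 and n = 4, so the one-step ratio n/[(n-1)*(n-2)] = 4/6 gives the same answer of 8 as averaging (12/4 = 3) and then multiplying by 16/6.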
I hope that's helpful,
I would like to thank you for the efforts you have made in writing this post. The information you provided is easy to understand and implement.
Hi David
This discussion was very useful. I would like to investigate this further. Could you please give me any references for the standardized co-kurtosis formulas above (in your reply)? Most of the finance research papers use "beta co-moments"; for example, beta-co-skewness = Cov(R_t, R_m^2)/E[(R_m-mean(R_m))^3] (from Skewness preference and the valuation of risk assets, Kraus, Alan and Litzenberger, Robert H., 1976). I have not found a statistical procedure to test the significance of co-moments in stock returns. I'm trying to understand the differences among the formulas. Could you help me, please?

David Harper CFA FRM

Staff member
Thread starter #6
Hi @npremara I am glad the discussion above was useful. Candidly, the Miller text was introduced into the FRM in 2013, and it remains my only exposure to cross-central moments (eg, co-skew and co-kurtosis). Apologies, I don't have a reference that I can recommend :(
Hi David, Thanks for the reply. I'm just wondering how you figured out the following. Could you please explain it a little bit more?
  • The sample adjustment is actually given by n^2/[(n-1)*(n-2)]

David Harper CFA FRM

Staff member
Thread starter #8
Hi @npremara It sounds like you didn't see the correspondence above at https://www.dropbox.com/s/rrtscj6kxazh07y/0313_MikeMiller_correspond.docx?dl=0 ?

I asked Michael Miller (author of http://amzn.to/1Tvz6uX ) about that adjustment because I couldn't exactly figure out the XLS. My thoughts above were informed by his response. Note that he addressed my question about the sample adjustment, although he didn't provide the mathematical justification (new emphasis mine):
Thank you for your e-mail. I always appreciate feedback. One of the challenges in writing a textbook is that what may seem clear to the author, is not necessarily clear to the reader (a challenge which I'm sure you are familiar with from teaching).

I'm afraid the section on coskewness and cokurtosis was made slightly more confusing by my switching back and forth between cross central moments and standardized cross central moments. There are two points in particular, which I point out in the errata, where I was not as precise as I should have been (http://www.risk256.com/writing/Errata for MSFRM.pdf).

Just to make sure we have the terminology correct, E[(X-E(X))^2(Y-E(Y))] would be a third cross central moment, but E[(X-E(X))^2(Y- E(Y))] / (sigma(X)^2 * sigma(Y)) would be a coskewness. Similarly E[(X-E(X))^2(Y-E(Y))^2] would be a fourth cross central moment, but E[(X-E(X))^2(Y-E(Y))^2] / (sigma(X)^2 * sigma(Y)^2) would be a cokurtosis. (In both cases sigma is meant to be the standard deviation of either X or Y).

To answer your first question: in the workbook, your intuition is correct. You do want an average. What might not be clear is that (I13/((I13-1)*(I13-2))) is giving you that average (and correcting for a sample bias at the same time). (I13/((I13-1)*(I13-2))) is n/(n-1)(n-2), where n is the number of points in our sample. In the limit as n goes to infinity, this is equal to 1/n, so multiplying this by the sum gives us something very close to the average for large n. The reason we multiply by n/(n-1)(n-2) and not 1/n is the same reason we multiply the sum of (X-E[X])^2 by 1/(n-1) instead of 1/n when calculating sample variance. ...It is far from intuitive and the math is tedious, but when we are calculating the third cross central moment or sample coskewness n/(n-1)(n-2) gives us an unbiased estimator, just like 1/(n-1) does with variance. ...If we were calculating the *population* third cross central moment or coskewness you could multiply by 1/n or just take the average instead of the sum, just like with population variance. In the workbook example, because we are calculating the sample coskewness the answer is correct.

For your second question ...
David, your explanation is very helpful. Thank you for your efforts.
I have one request, if I may: could you please explain the meaning of co-skewness and co-kurtosis in very simple terms? Also, very briefly, what are their implications and uses in practical work scenarios? I just couldn't catch that part.
Thanks a lot.