Merton model, a summary of the issues

Discussion in 'David's Notebook' started by David Harper CFA FRM CIPM, Apr 10, 2012.

  1. Since we get a lot of questions on Merton (some issues are merely notational), I wanted to collect my observations into a single place. I hope you like. (fwiw, I don't endorse it, I blame the prominence of this on Stulz' influence in the FRM :rolleyes: ... the math of the BSM is admittedly seductive.)

    The Merton model for credit risk has two steps:
    1. Use the Black-Scholes-Merton option-pricing model (BSM OPM) to estimate the price (value) of the firm's equity
    2. Use the firm's equity value to infer the firm's asset value and asset volatility, then estimate the probability of default (PD) under an assumption that the firm's asset price will follow a lognormal distribution
    What is the role of Black-Scholes-Merton (BSM OPM), here in Merton for credit risk?
    The Black-Scholes OPM solves for a European call option = S(0)*N(d1) - K*exp(-rT)*N(d2).
    1. BSM OPM is directly applied only in the first step, to get the firm's equity value (and maybe to get the firm's debt)
    2. In the second step, N(-d2) is used to estimate PD. It is the same d2, but with one key difference: The riskfree rate (r) in BSM is replaced with a real/physical firm drift (mu). This step uses a component of BSM, so it looks like BSM, but this step is NOT option-pricing at all. It is a simple statistical calculation.
      Again, N(-d2) is the analog to PD, except real asset drift replaces riskfree rate.
    What are the two steps, in more detail?
    Step 1 (derivatives valuation): price firm equity like a call option: The first step above employs the BSM OPM precisely because its central insight is to treat the firm's equity as a call option on the firm's assets. In this way:
    • S(0) is replaced by today's firm asset value, V(0), where V(0) = D(0) + E(0);
      i.e., S(0) in BSM replaced by --> V(0) in Merton
    • The face value of all debt (not "default threshold" here, that's step 2) replaces the strike price; it's total face value of debt because that is the "strike" that must be paid to retire debt and own the firm's assets.
      i.e., K or X in BSM replaced by --> F(t) in Merton
    • We retain the risk-free rate (r) in this step, we do not use the firm's (asset's) expected return. This is theoretically significant: by employing BSM to price equity as a call option, we rely on the brilliant risk-neutral valuation idea, which requires the risk-free rate as the option payoff can be synthesized with riskless certainty.
      i.e., riskfree rate (r) in BSM is retained in Merton
    Summary Step 1:
    • The Familiar BSM OPM which prices a call option on asset: c(0) = S(0)*N(d1) - K*exp(-rT)*N(d2), is re-purposed to:
    • Price the firm's equity as if it were an option (strike = debt face value) on firm's assets: E(0) = V(0)*N(d1) - F(t)*exp(-rT)*N(d2)
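    As a rough illustration of Step 1, here is a minimal Python sketch that prices equity as a call on firm assets. V(0) = 12.75, F = 10 and sigma = 9.6% are the de Servigny inputs used in Step 2 of this post; the risk-free rate r = 3% and the helper names norm_cdf and merton_equity are assumptions for illustration only.

```python
import math

def norm_cdf(x: float) -> float:
    """Standard normal CDF via the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def merton_equity(v0: float, face: float, r: float, sigma: float, t: float) -> float:
    """Equity as a European call on firm assets; strike = face value of debt."""
    vol_t = sigma * math.sqrt(t)
    d1 = (math.log(v0 / face) + (r + 0.5 * sigma**2) * t) / vol_t
    d2 = d1 - vol_t
    return v0 * norm_cdf(d1) - face * math.exp(-r * t) * norm_cdf(d2)

# r = 3% is a hypothetical input; the other numbers are de Servigny's
equity = merton_equity(v0=12.75, face=10.0, r=0.03, sigma=0.096, t=1.0)
print(f"E(0) = {equity:.4f}")
```

    Note the drift here is r, not mu: this is the pricing step, so the risk-free rate is retained.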
    Two details associated with the first step, that can be skipped
    • 1b. Less important, for FRM exam purposes, is that we solve for both equity value (which informs asset value) and firm asset volatility. The full first step is a simultaneous solution of two equations in two unknowns which produces an assumption for the capital structure (MV of debt + MV of equity = MV of assets) for the firm and the firm's asset volatility. This will not matter in the FRM, it is too tedious. You will instead just be given the assumptions for firm (asset) value and firm (asset) volatility.
    • 1c. More important is that a similar option-based insight can be used to price the value of the firm's debt: the value of the firm's "risky" debt = risk-free debt - put option on firm's assets with strike equal to same face value of debt, where risk-free debt is face value of debt discounted at the risk-free rate.
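    Detail 1c can be sketched the same way: value the put on firm assets, subtract it from risk-free debt, and check that debt plus equity adds back to firm value. As before, r = 3% and the function names are hypothetical; V(0), F and sigma are the de Servigny inputs used in Step 2.

```python
import math

def norm_cdf(x: float) -> float:
    """Standard normal CDF via the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def merton_claims(v0: float, face: float, r: float, sigma: float, t: float):
    """Return (equity, risky_debt) for a firm with one zero-coupon debt issue."""
    vol_t = sigma * math.sqrt(t)
    d1 = (math.log(v0 / face) + (r + 0.5 * sigma**2) * t) / vol_t
    d2 = d1 - vol_t
    riskfree_debt = face * math.exp(-r * t)                     # face discounted at r
    call = v0 * norm_cdf(d1) - riskfree_debt * norm_cdf(d2)     # equity
    put = riskfree_debt * norm_cdf(-d2) - v0 * norm_cdf(-d1)    # put, strike = face
    return call, riskfree_debt - put

# r = 3% is a hypothetical input
equity, debt = merton_claims(12.75, 10.0, 0.03, 0.096, 1.0)
print(f"equity = {equity:.4f}, risky debt = {debt:.4f}, sum = {equity + debt:.4f}")
```

    The sum equals V(0) by put-call parity, which is the capital structure identity MV of debt + MV of equity = MV of assets.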
    Step 2 (risk measurement): PD = N(-DD)
    An FRM P2 candidate should try to understand the relatively simple intuition of this step, which is not option pricing, it is just statistics. See image below (page 15, from video 6c). I am using de Servigny's numbers.

    [​IMG]

    From left-to-right:
    • Assume current price of assets (i.e., firm value), V(0) = $12.75
    • Assume assets drift at a rate of mu = +5% per annum
    • At the end of the period, firm will have an expected future value higher than today, due to positive drift. In this case, V(t) = ~ $13.34
    • Assume a future distribution, same assumption we use for equities: log returns are normal --> future prices are lognormal
    • If we are going to make a normal/lognormal assumption, we can treat either, but it is easier to treat the normal log returns. In continuous (log) return terms, our expected future firm value is 28.8% above the default threshold: LN(13.34/10) = 28.8%. As our asset volatility is 9.6%, this implies our expected future firm value will be +3 sigma above the default threshold of $10.
      This final step merely produces a standard normal (Z) variable:
      LN(13.34/10)/9.6% sigma = Z of ~ 3.0 where 3.0 is the (standardized) distance to default
    • Under this series of unrealistic assumptions, future insolvency is characterized by a future firm value that is lower than the default threshold of $10; i.e., the area in the tail.
      PD = N(-DD) = N(-3.0) ~= 0.13%
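    The left-to-right walk above can be reproduced in a few lines. This is only a sketch using de Servigny's inputs as quoted in this post; the variable names are arbitrary.

```python
import math

def norm_cdf(x: float) -> float:
    """Standard normal CDF via the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

v0, threshold = 12.75, 10.0   # firm value today and default threshold
mu, sigma, t = 0.05, 0.096, 1.0

# Real-world distance to default: same shape as d2, but with mu in place of r
dd = (math.log(v0 / threshold) + (mu - 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))
prob_default = norm_cdf(-dd)

print(f"DD = {dd:.2f}, PD = {prob_default:.2%}")
```

    With these inputs the sketch gives a DD of about 3.0 and a PD of about 0.13%, matching the figures above.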
    That's Step 2 and the Merton model. Two related ideas:
    • Risk-free rate (r) vs. asset drift (mu): In BSM, N(d2) = risk-neutral Prob[call option expires ITM] and in Merton N(-DD) = real-world (physical) Prob[Insolvency; i.e., Asset expires OTM].
      BSM risk-neutral d2 = (LN[S(0)/K] + [r - sigma^2/2]*T)/[sigma*SQRT(T)],
      but Merton's step 2 wants real-world DD = (LN[V(0)/F(t)] + [mu - sigma^2/2]*T)/[sigma*SQRT(T)]
    • The usage of risk-free rate (r) in the first step and asset drift (mu) in the second step nicely illustrates Jorion's introductory (Chapter 1) distinction between derivatives pricing versus risk measurement. The 1st step above is derivatives pricing. The 2nd step is risk measurement, which he contrasts in five dimensions: 1. distribution of future values, 2. focused on the tail of the distribution [instead of the center, as in step one], 3. Future value horizon, 4. Requires LESS PRECISION (i.e., approximation), and 5. utilizing an ACTUAL (physical) distribution, rather than a risk-neutral
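    To see the effect of swapping r for mu, the sketch below evaluates the same d2-type expression twice with de Servigny's inputs. The risk-free rate of 2% is a hypothetical value (the post does not give one), as are the function names.

```python
import math

def norm_cdf(x: float) -> float:
    """Standard normal CDF via the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def d2_type(v0: float, strike: float, drift: float, sigma: float, t: float) -> float:
    """(LN[V0/K] + [drift - sigma^2/2]*T) / (sigma*SQRT(T))."""
    return (math.log(v0 / strike) + (drift - 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))

v0, face, sigma, t = 12.75, 10.0, 0.096, 1.0
d2_rn = d2_type(v0, face, 0.02, sigma, t)  # drift = r = 2% (hypothetical): risk-neutral d2
dd_rw = d2_type(v0, face, 0.05, sigma, t)  # drift = mu = 5%: real-world DD

print(f"risk-neutral: d2 = {d2_rn:.2f}, N(-d2) = {norm_cdf(-d2_rn):.3%}")
print(f"real-world:   DD = {dd_rw:.2f}, N(-DD) = {norm_cdf(-dd_rw):.3%}")
```

    Whenever mu > r, the real-world tail probability comes out below the risk-neutral one, which is why the choice of drift matters for risk measurement.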
    Variation #1: use of lognormal prices instead of the more familiar normal log returns
    The more typical approach, above, derives a standard normal Z deviate by assuming log returns are normally distributed: if LN(S2/S1) is normal then S2 is lognormal. As such, the more typical distance-to-default above produced a standardized normal return-based DD of 3.0 = 28.8% continuous return / 9.6% per annum volatility. In BSM, the numerator of d2 is a continuous return, standardized by dividing by the annualized volatility in the denominator, to give a unitless standard normal deviate.

    Alternatively, the distance of default can be expressed as a function of the dollar difference between the future firm asset value and the threshold, in this case: $13.34 - $10 = $3.34. And then standardize that by dividing by the volatility to get the alternative distance to default:
    Lognormal price-based DD = [V(t) - Default]/[sigma*V(t)] = ($13.34 - $10) / (9.6% * $13.34) = 2.607

    This price-based lognormal DD of 2.607 is equivalent to the return-based normal DD of 3.0 (normal log returns --> lognormal prices). See row 31 of XLS 6.c.1. for dynamic translation/proof.
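    For comparison, both versions of the DD can be computed side by side. A minimal sketch using the rounded V(t) = $13.34 from above (so the price-based figure comes out at roughly 2.61, in line with the 2.607 quoted); the variable names are arbitrary.

```python
import math

v_t, default_point, sigma = 13.34, 10.0, 0.096

# Price-based DD: dollar distance divided by dollar volatility of V(t)
dd_price = (v_t - default_point) / (sigma * v_t)

# Return-based DD: continuous return distance divided by return volatility
dd_return = math.log(v_t / default_point) / sigma

print(f"price-based DD = {dd_price:.3f}, return-based DD = {dd_return:.3f}")
```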

    With respect to the exam (I can't judge the testability of any of this, GARP has been uneven here, overall testability may well be low):
    • The historical/sample FRM questions tend to query the lognormal price-based DD maybe because it's a shorter formula: [$V(t) - $DefaultPoint]/[sigma*$V(t)]. You'll notice you can't easily retrieve the inverse lognormal CDF, so naturally this sort of question only asks you the DD and stops short of asking for the PD.
    • You can confirm with an understanding of the above that this formula wants:
      1. Expected future asset value (end of period equity + debt), and
      2. The dollar volatility of V(t), i.e., sigma*V(t); this is more correct than dollar volatility today [i.e., sigma*V(0)], but either is okay.
    • In the simple two-class Merton, MV equity + MV debt = MV of firm assets, V(0) or V(t) ... and debt directly informs the default threshold ... but, otherwise, this DD is entirely a function of firm assets, not equity: asset value today, V(0), drifting at the asset return (mu) to the future expected asset value, V(t), subject to asset volatility, sigma(asset).
    Variation #2: KMV (Merton but with two adjustments)

    The two steps above illustrate the Merton as (i) assuming the firm will default upon insolvency, asset(t) < face value of all debt(t) and (ii) inferring the area in insolvency tail as a function of a normal return (lognormal price) asset distribution. The KMV method (I consulted to KMV years ago) recognizes and addresses these two unrealistic assumptions.
    1. First, debt consists of short-term obligations (including the short-term portion of long-term debt) and long-term debt. A firm has more time to recover with respect to the long-term debt. KMV's research led it to conclude that the default threshold point is really somewhere in between the short-term debt and the total debt. So, if LT/ST < 1.5, the default threshold = short-term debt + 0.5 * long-term debt.
    2. Second, as discussed above, the use of PD = N(-DD) assumes the asset log returns are normally distributed. Let me restate that in, I think, a more meaningful way: by using only the asset volatility, the Merton model tacitly assumes a lognormal distribution of the asset value. As always, this is probably incredibly unrealistic. So, rather than derive the PD parametrically (i.e., inferring PD as the area under a parametric [lognormal] distribution), KMV resorts to history. Their historical database contains actual firms and their default rates; by back-computing the historical distance-to-defaults, they have a historical correspondence (mapping) of DDs and the actual default rates. For example, whereas parametric normal/lognormal tells us (above) that + 3.0 DD = 0.13% PD (area in the tail), maybe their database shows that +3.0 DD corresponds more nearly to a 0.42% default rate. So, this is a historical empirical translation of DD into PD.
    In summary, KMV applies Merton (is Merton-based) through 1.5 of the two steps, but abandons the PD = N(-DD) in favor of PD = historical default rate corresponding to DD. Also, KMV tweaks the default threshold from total face value of debt (Merton) to all short-term plus some fraction of long-term debt.
  2. rickm123

    rickm123 Member

    thanks david clears it up a lot..I asked you too many questions related this topic and this clears it up big time
  3. shanlane

    shanlane Active Member

    Hello,

    This is very helpful but it just misses the question Chad and I were talking about yesterday. The formula we were looking at was the more simplified version of DD, which is:

    (Expected return - Default point)/volatility of returns, which is on the Merton sheet (you use this formula for lognormal DD, but not for normal DD)

    The problem is that this first term is sometimes "Expected value of the assets (as you have used)" or "expected value of the equity". Accordingly, the volatility in the denominator will also change to reflect which ever term is used in the numerator.

    Our problem is that the formula just has too many variations. Is there one that is consistent with the exam?

    Also, it was great of you to include both the normal and lognormal DD, but which of these do we use for exam purposes?

    Finally, (sorry :() is the formula that I gave above (the same one that you used for lognormal DD) ever used for regular DD? I have seen "expected value of the assets" given in terms of simple growth (if u =8%, then expected value of the assets is just 1.08* present asset, or firm, value) and also just as "expected return on assets is $20M", so that the term is just $20M.

    Thanks (and many apologies for being a pest)!

    Shannon
  4. I just added the second section above...

    @Rick: thanks, glad it helps!

    @Shannon: does the new second section answer your question? Do I use "(Expected return - Default point)/volatility of returns" somewhere? Can you point me to it, b/c that looks wrong ... maybe an example will disprove me, but the numerator looks wrong for mixing a return with a dollar value. The "short (lognormal price-based) version" should be:
    [$V(t) - $DefaultPoint]/[sigma*$V(t)], where $V(t) is future, end of period, firm ASSET (=equity +debt) value
  5. shanlane

    shanlane Active Member

    This absolutely answers 90% of my question. THANK YOU!!

    The other examples that I saw (and these are from Kaplan and I know that they tend to make lots of mistakes) are in two different forms. One version says "the return on assets is X" and this X (in dollars) is used as the first term in the numerator. It also says that it can be (asset value - liability value) in the numerator.

    However, there is one thing that is still a little confusing. In the "lognormal price" variation, isn't the 13.34 that you use for the future price derived from the idea of lognormal returns? I may be getting things really mixed up right now, but shouldn't the future asset value be different if assuming lognormal price vs lognormal returns? Maybe 12.75*exp(.05) or something like that?

    Thanks again,

    Shannon
  6. Hi Shannon, great!

    I can't directly speak to the Schweser format, I just don't have it, but please note (I imagine you know this): liability is probably just a word for face value of debt; and return on assets is another way to express the drift. (Asset value - liability) looks to be the same as [V(t) - Default].

    With respect to 13.34, it is given by: V(0)*exp[(mu - asset volatility^2/2)*T] = 12.75*exp[(5% - 9.6%^2/2)*1.0] = $13.34; note consistency with BSM d2 in subtracting variance/2. Yours is actually a fair future distributional moment, but the difference is non-trivial and best explained by Hull in his 14.3. I hope that explains, thanks!
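    The two candidate future values can be compared directly. A minimal sketch with V(0) = 12.75 as in the original post; under the lognormal assumption, V(0)*exp(mu*T) is the mean of V(T), while subtracting sigma^2/2 inside the exponent gives the median, which is the value consistent with d2. Variable names are arbitrary.

```python
import math

v0, mu, sigma, t = 12.75, 0.05, 0.096, 1.0

median_vt = v0 * math.exp((mu - 0.5 * sigma**2) * t)  # used in the DD calculation
mean_vt = v0 * math.exp(mu * t)                       # Shannon's 12.75*exp(.05)

print(f"with -sigma^2/2: {median_vt:.2f}, without: {mean_vt:.2f}")
```

    The first comes out at about $13.34 and the second at about $13.40.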
  7. shanlane

    shanlane Active Member

    It does. Now for the $25,000 question (sorry for the old game show reference): if we are not explicitly told which variation to use, which one should be our default?

    Thanks!

    Shannon
  8. I finished the post today, fwiw, by adding the KMV tweaks (variation #2)

    @Shannon: I've gone into more detail, above, to try and show why there is an equivalence. Candidly, previous exam (sample) questions appear to reflect unawareness of the distinction. For shorthand, I'd probably: try to use prices if you "can get away with it" given the information provided (why? because this is the more likely approach to be asked) and resort to the d2-type returns DD only if you are forced to by assumptions. For example, a question that implies the final step is PD = N(-DD) and asks for DD is obviously looking for the returns-based d2-type DD.

    On the other hand, I perceive the following sort of question is more likely:

    Question: If future expected asset value = $13.0 with volatility of 9.6%, and ST debt = 7.0 and LT debt = 6.0, what is the distance to default?
    ... This gives you no drift (ROA), you can't get d2, and in any case, price-based works just fine: with default point = 7.0 + 0.5*6.0 = 10, (13.0 - 10)/(13.0 * 9.6%) = 2.404 distance to default (and notice, exam-wise, nobody needs to care that this is a lognormal DD!)
    ... hopefully the question is precise to say "future expected" asset value. That tells us we can go the easy way.
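    The sample question can be worked through as a short sketch. The default-point rule coded here is only the LT/ST < 1.5 case quoted in the first post; the function names are hypothetical and the other case is deliberately left unimplemented.

```python
def kmv_default_point(st_debt: float, lt_debt: float) -> float:
    """Default point = ST debt + 0.5 * LT debt, as quoted for LT/ST < 1.5."""
    if lt_debt / st_debt >= 1.5:
        raise ValueError("only the LT/ST < 1.5 rule from this thread is coded")
    return st_debt + 0.5 * lt_debt

def price_based_dd(expected_value: float, default_point: float, sigma: float) -> float:
    """[V(t) - DefaultPoint] / [sigma * V(t)]."""
    return (expected_value - default_point) / (sigma * expected_value)

default_point = kmv_default_point(st_debt=7.0, lt_debt=6.0)
dd = price_based_dd(expected_value=13.0, default_point=default_point, sigma=0.096)
print(f"default point = {default_point:.1f}, DD = {dd:.3f}")
```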

    It's just my opinion (b/c the returns-based d2-type approach is actually the assigned approach, in de Servigny) but knowing what i know, practically, I would look for the easier price-based approach and only resort to returns-based if explicitly forced to. I hope that helps .... I will definitely add this to the for-GARP feedback forum (although of course it can have no effect on May). thanks!
  9. shanlane

    shanlane Active Member

    That was very thorough and very instructive.

    Thank you for taking the time to break this down into a much easier to interpret format than was presented by the readings.

    Shannon
  10. nikogeorgiev

    nikogeorgiev Member

    Hi David,

    I've been through your tutorial on the Merton model and I got carried away and wrote some Matlab code to estimate asset (firm) value and asset volatility with BS, using Matlab optimisation. However, I used Anadarko Petroleum with market cap of $36.23bn, debt of $16.43bn and sigma of 4, and RFR = 1.96%. The model gives me a firm value of $52.66bn and asset volatility of 2.77%.

    Using your instructions how to calculate the PD it gives me 0% Where am I making the mistake? shouldn't the PD be >0 with so much debt?

    many thanks
  11. Hi Niko,

    The combination of low leverage (52.66/16.43) and really low asset volatility (2.77%?) may give you a really low PD. The drivers are:
    • asset volatility of 2.77% per annum is suspiciously low, i'd check that calculation: with such a low volatility, the merton may return PD near to zero
    • low leverage (with extremely low volatility, over short horizon ... )
    • The distributional assumption. As above, if you assume normal/lognormal, then PD tends to zero pretty quickly even for just a few "distance to default" standard deviations. You don't have to assume normal/lognormal, can use heavy tailed distribution.
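    The third driver is easy to see numerically. A minimal sketch (the helper name norm_cdf is just for illustration) of how quickly the normal tail PD = N(-DD) shrinks as DD grows:

```python
import math

def norm_cdf(x: float) -> float:
    """Standard normal CDF via the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

# PD = N(-DD) for DD = 1, 2, ..., 6 standard deviations
pd_by_dd = {dd: norm_cdf(-dd) for dd in range(1, 7)}
for dd, pd_value in pd_by_dd.items():
    print(f"DD = {dd}: PD = {pd_value:.2e}")
```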
    I hope that helps,
  12. nikogeorgiev

    nikogeorgiev Member

    Hi David

    You're amazing! Thanks so much.

    I will check my model again, but with Equity volatility of 4, it is reasonable to think that the assets are less volatile than that no? or am I wrong?
  13. Hi niko, sure thing. Yes, it is very reasonable to expect volatility[assets] < volatility [equity]: with higher leverage, we can expect higher volatility [equity]. But, then, your volatility is suspiciously low. I assume it is per annum, because typically the per annum volatility input is time-scaled by the sigma*SQRT(time), so you might want to check to ensure you aren't inputting (eg) a monthly volatility or somehow "double time scaling" the volatility. Thanks,
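    One common way to write the link between the two volatilities in the Merton setup is volatility[equity] = N(d1) * V/E * volatility[assets]. The sketch below uses that relation with hypothetical inputs (V = 100, asset volatility = 20%, r = 3%, T = 1, and three face values of debt); none of these numbers come from the Anadarko case.

```python
import math

def norm_cdf(x: float) -> float:
    """Standard normal CDF via the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def equity_value_and_vol(v0: float, face: float, r: float, sigma_v: float, t: float):
    """Return (equity value, equity volatility) implied by asset value and asset volatility."""
    vol_t = sigma_v * math.sqrt(t)
    d1 = (math.log(v0 / face) + (r + 0.5 * sigma_v**2) * t) / vol_t
    d2 = d1 - vol_t
    equity = v0 * norm_cdf(d1) - face * math.exp(-r * t) * norm_cdf(d2)
    sigma_e = norm_cdf(d1) * v0 / equity * sigma_v
    return equity, sigma_e

# All inputs below are hypothetical; face value of debt rises to increase leverage
results = {}
for face in (40.0, 60.0, 80.0):
    equity, sigma_e = equity_value_and_vol(100.0, face, 0.03, 0.20, 1.0)
    results[face] = sigma_e
    print(f"debt face = {face:.0f}: equity = {equity:.2f}, equity vol = {sigma_e:.1%}")
```

    Equity volatility stays above the 20% asset volatility in each case and rises as the face value of debt rises.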
  14. nikogeorgiev

    nikogeorgiev Member

    Hi David,

    Many thanks once again. One last question I promise regarding your excel spreadsheet and model. When you refer to time, e.g. T=1, do you mean as T=1 day or as in one year time frame, because in my model I presume T=1 as in one year time frame, so I scale it by sqrt(1)

    huge thank you. You've helped me understand the model so much thoroughly
  15. Hi niko,

    Yes, 1 = 1 year. But any "periodicity" can be represented, the thing is to treat it consistently.

    The denominator in d2/DD is: volatility*SQRT(T). If the volatility input is per annum, eg., 30% per annum, then 30%*SQRT(1.0) scales over 1.0 year and 30%*SQRT(10/250) scales down to 10 days. If the volatility input were 1% standard deviation per day, which it currently is NOT, then 1.0%*SQRT(250) scales to one year. In this way, the periodicity of the (T) matches the periodicity of the sigma.
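    The same scaling as a small sketch; the function name is hypothetical and 250 trading days per year is the convention used in the examples above.

```python
import math

def scale_volatility(sigma: float, periods: float) -> float:
    """Scale a per-period volatility over `periods` periods (square-root-of-time rule)."""
    return sigma * math.sqrt(periods)

# 30% per annum scaled down to 10 of 250 trading days
ten_day_from_annual = scale_volatility(0.30, 10 / 250)
# 1% per day scaled up to 250 trading days
annual_from_daily = scale_volatility(0.01, 250)

print(f"10-day: {ten_day_from_annual:.2%}, annual from daily: {annual_from_daily:.2%}")
```

    The key is that T is measured in the same unit as the volatility input: years for a per annum sigma, days for a daily sigma.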
  16. nikogeorgiev

    nikogeorgiev Member

    Thank you David,

    Indeed, I was using daily volatility e.g. 2.77% without scaling it. Silly mistake
  17. nikogeorgiev

    nikogeorgiev Member

    I can't let it go, sorry :)) I try to ask only questions that have not been answered before.

    I opened Servigny again (p.65) and he says "the firm is assumed to be financed with equity with value S and pure discount bonds with value P and maturity T". The firm can only default at Maturity or at time T. It is not reasonable then to assume very low values of T=1,2,3. So my question is, how do we choose the maturity for the model's sake? Should we use something more like T=15 i.e. debt with maturity of 15 years.

    Can we expect questions of that sort in the exam?

    Many thanks
  18. Hi Niko,

    It is an EXCELLENT observation. My perspective is that your question implicates Merton, in a way: it speaks to weakness of the model (IMO, another glaring weakness is dependence on stock price --> equity value, such that default prediction varies inordinately due to market-based factors. As one of the readings says, it "over-reacts")

    The basic Merton is limited to a single period (single maturity) and the theory is to use the maturity equal to the long-term debt (or WAM of the LT debt). If debt is due in 5 years, then use T = 5. Is this *really* a 5-year cumulative PD? In my view: no, it is merely a "single" one-period 5-year PD. What i mean is: it's an unconditional 5-year PD, not a cumulative 5-year PD. The mitigant (the thing that salvages this as a proxy for a cumulative PD) is precisely the time scaling to which you previously referred: de Servigny's volatility per annum is 9.6%, so the 5-year volatility (in the d2/DD denominator) = 9.6%*SQRT(5) = 21.5%. See how the 5-year volatility scales to make a 5-year default SQRT(5) more likely than a 1-year default, for a given per annum volatility? In this way, for a reasonable drift, the longer maturity is "conservative" (producing a higher PD).

    (although: the positive expected return decreases the PD, so what happens is, and this makes no sense: initially, the PD increases with maturity due to the volatility effect, but then reverses due to the drift effect. If de Servigny's example assumed ZERO asset return, the volatility and PD would only increase with maturity, but the positive return has the opposite effect, introducing overall ambivalence. Ultimately, strict adherence to Merton in this regard simply illustrates how silly it can be, just IMO, at some point the model asks us to suspend too much disbelief)
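    The interaction between the volatility effect and the drift effect can be checked numerically. A minimal sketch with de Servigny's inputs (V(0) = 12.75, F = 10, mu = 5%, sigma = 9.6%); the maturities in the loop and the function names are arbitrary choices for illustration.

```python
import math

def norm_cdf(x: float) -> float:
    """Standard normal CDF via the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def merton_dd(v0: float, face: float, mu: float, sigma: float, t: float) -> float:
    """Real-world distance to default for a single horizon t (in years)."""
    return (math.log(v0 / face) + (mu - 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))

v0, face, mu, sigma = 12.75, 10.0, 0.05, 0.096

for t in (1, 2, 5, 10, 20):
    dd = merton_dd(v0, face, mu, sigma, t)
    print(f"T = {t:>2}: DD = {dd:.3f}, PD = {norm_cdf(-dd):.3%}")

# Horizon at which d(DD)/dT = 0 (needs V(0) > F and mu > sigma^2/2)
t_turn = math.log(v0 / face) / (mu - 0.5 * sigma**2)
print(f"turning point at T = {t_turn:.2f} years")
```

    With a positive log drift, the turning point sits at T = LN[V(0)/F] / (mu - sigma^2/2), which is about 5.35 years here.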

    From a practical/exam standpoint: it will only be one maturity, the question should tell you, and it will either correspond to the maturity of the long-term debt (e.g., "debt matures in 5 years ... what is the PD at the end of 5 years"), OR equally fine is "debt matures in 5 years .... what is Merton PD at the end of 1 year?" i.e., there is no violation to select a single period horizon that is shorter. Hope that helps,
  19. nikogeorgiev

    nikogeorgiev Member

    David,
    thank you so much. Very thorough explanation as always.

    It just needs time to settle in my brain now :)
  20. shanlane

    shanlane Active Member

    Hello,

    I am sorry to have to ask this question, but if the 5 yr PD is NOT a cumulative PD, as you stated above, how would we get the 1 yr PD from this 5 yr PD?

    Thanks!

    Shannon
