
computing probability from the t stat

Thread starter #1
Hi.
I was doing practice questions from the BT Notes. Some of the questions require you to compute a probability from the t stat (Miller, Chapter 7). Can anyone help me with that?
Specifically, I'm referring to End of chapter Q&A Questions 1 and 2 on page 139.

Also (please refer to the attached image), in Q1, shouldn't the probability that the true mean lies above 40 be 30%, instead of 70%?

70% should be the probability from negative infinity till 0.54. So the probability of mean being greater than 40 should be 30%. Where am I going wrong here?

Miller.png
 

David Harper CFA FRM

Staff member
Subscriber
#2
Hi @Harshit Chawla It is true that the test statistic is given by (X - µ)/SE = (45 - 40)/[29.3/sqrt(10)] = 0.54 and that, if we want to consider the +0.54 quantile on the student's t distribution (with 9 df), then 70% of the area is to the left (i.e., your point that "70% should be the probability from negative infinity till +0.54") and only 30% of the distribution area is to the right of +0.54 (after all, +0.54 is right of the median!). But this is a good example of why we can't get too attached to memorization ... why not?

Because Miller set up this question in an atypical way! We normally get a question about accepting/rejecting a two-side null, or sometimes about rejecting the one-side null, but this question asks about the probability of accepting a one-side alternative which, I admit, gives me pause if only because we don't usually see it this way. Notice how the null hypothesized mean is 40.0 and we are asked for the probability that the true mean is greater than 40.0 in light of an observation that is already above 40.0: a sample mean of 45.0 is pretty good corroboration that the true mean is above 40.0! This is not typically how the question is asked.

So, the null is µ = 40.0 and our observed sample mean is 45.0, which translates into a test statistic of 0.54; that is, our sample mean is only +0.54 standard errors greater than the hypothesized mean. The one-sided rejection region corresponds to a one-sided null H(0): µ ≤ 40.0 where the alternative is H(A): µ > 40.0.
  • As our sample mean of 45.0 is above the null hypothesized mean of 40.0, if the question were instead "what is the probability that the [population] mean is less than 40.0?" then we would have a more typical one-sided question, such that the p-value (i.e., area in the one-sided rejection region) is about T.DIST.RT(0.54, 9) = 30.1%
  • But we are instead asked about the "other side" of that distribution: the larger area of the acceptance region, which is 69.9%.
I hope that's helpful,
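For anyone who wants to check the two areas numerically without Excel, here is a minimal Python sketch. It is only an illustration, not Miller's or the forum's method: the helper names `t_pdf` and `t_right_tail` are made up for this example, and the tail area is approximated by integrating the student's t density with Simpson's rule rather than by calling a statistics library.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of the student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_right_tail(t: float, df: int, steps: int = 2000) -> float:
    """Approximate P(T > t) for t >= 0 (hypothetical helper).

    Uses symmetry: the area right of 0 is 0.5, so we subtract the
    Simpson's-rule integral of the density over [0, t]. steps must be even.
    """
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


# Inputs taken from the thread: sample mean 45, null mean 40, s = 29.3, n = 10
sample_mean, null_mean, sample_sd, n = 45.0, 40.0, 29.3, 10
df = n - 1

t_stat = (sample_mean - null_mean) / (sample_sd / math.sqrt(n))
right_tail = t_right_tail(t_stat, df)  # plays the role of T.DIST.RT(t, 9)
left_area = 1.0 - right_tail           # area from negative infinity up to t

print(f"t statistic       : {t_stat:.2f}")
print(f"area right of t   : {right_tail:.1%}")
print(f"area left of t    : {left_area:.1%}")
```

The right-tail area is the one-sided p-value and the left area is its complement, so the two printed percentages should land near the 30.1% and 69.9% quoted above.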
 
Thread starter #3
"... this question asks about the probability of accepting a one-side alternative which ..."
Hi David, I'm sorry, but I still don't understand this. Specifically, shouldn't the probability of accepting a one-sided alternative be equal to the probability of rejecting a one-sided null?

I defined my hypothesis as:
H(0): mu<=40
H(A): mu>40

Could you please explain from this hypothesis' point of view what the question is asking?

Also, how would you identify in the question that we are supposed to reject/not reject the null or accept the alternative?

Finally, can we be asked in the exam to compute our exact confidence? Doesn't that require us to compute the p-value?
 

David Harper CFA FRM

Staff member
Subscriber
#4
Hi @Harshit Chawla Maybe we should separate the math from the language, because it's Miller's question (I would not write a question like this; I presume you understand that this is Miller's question) and the language is imprecise (and I probably over-explained it above :rolleyes:).

The math is simpler than the language. The observed sample mean is 45 and the null value is 40 so that, to agree with you, a candidate one-sided hypothesis is
  • Null H(0): µ ≤ 40 --> Alternative H(1): µ > 40 (not a bad time to remind the reader that the null must contain the "=")
  • The test statistic is (45 - 40) / [29.3/sqrt(10)] = 0.54; this signifies that our observation of 45 is "merely" 0.54 standard errors away from a null hypothesized value of 40.0
  • The implied one-sided p-value = T.DIST.RT(0.54, 9) = 30.12% and the two-sided (the more typical) = T.DIST.2T(0.54, 9) = 60.2%; large p-values are expected because we are too close to the null to reject it!
If we really want to do this correctly, we should pause on these simple facts because the question is flawed. A technically correct question would be:
  • If the population mean truly is 40 or less (aka, conditional on a true one-sided null hypothesis), what is the probability of observing this sample mean of 45 or one that is more extreme? Answer: the p-value of 30.12%, hence our inability to reject the null.
  • A typical "shortcut paraphrase" of this correct question, in turn, would be: What is the probability the true mean is less than or equal to 40; i.e., what is the probability that the null is true? ... that's what I mean by the question we'd normally expect in this setup. Statistics students will understand that the shortcut paraphrase itself is imprecise, but at the same time, I would point out that it's common! A proper understanding should pause here because ...
Miller commits a fallacy: because the p-value is 30%, he assumes the probability that the alternative is true is (1 - p) or 70.0%. Visually (on the distribution), it's understandable, but the one-sided p-value of 30.0% is an exact significance level (aka, the probability of committing a Type I error, which as part of its definition refers to an error that is conditional on a true null). The p-value of 30.0% does not imply a 70% probability that the alternative is true, which is what the question asks; hence the question commits a fallacy.

Re: "Finally, can we be asked in the exam to compute our exact confidence? Doesn't that require us to compute the p value?" Yes, but we can infer the p-value from the lookup table. Please search the forum; there are many discussions and examples on this already.

Okay, phew, I need to get to some other things now. I hope that is helpful,

(P.S. Although I do think your first impression reverses the basic distribution, so you may want to re-visit that. Our sample mean of 45 is to the right of the null hypothesized 40. The null of 40 is the median of the student's distribution; and to the right of it, not very far, is the quantile of 45.0, and the area in the tail to its right is ~30.0%; aka, the one-tailed rejection region is a large 30.0% tail).
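The "technically correct" wording in this post (conditional on a true null, how often do we see a test statistic at least this extreme?) can be checked by brute force. Below is a rough Monte Carlo sketch under assumptions that are mine and not Miller's: the population is normal with a mean of exactly 40 (the boundary of the one-sided null), the population standard deviation is set to 29.3 purely for convenience, and the function names, seed and trial count are arbitrary.

```python
import math
import random
import statistics


def t_statistic(sample, null_mean):
    """(sample mean - null mean) / (sample sd / sqrt(n))."""
    n = len(sample)
    return (statistics.mean(sample) - null_mean) / (statistics.stdev(sample) / math.sqrt(n))


def tail_frequency_under_null(observed_t, null_mean=40.0, sd=29.3, n=10,
                              trials=20000, seed=1):
    """Share of simulated samples, drawn with the null TRUE, whose t statistic
    is at least as large as the observed one (hypothetical helper)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Assumption: normal population centered exactly on the null mean
        sample = [rng.gauss(null_mean, sd) for _ in range(n)]
        if t_statistic(sample, null_mean) >= observed_t:
            hits += 1
    return hits / trials


# Observed test statistic from the thread: (45 - 40) / (29.3 / sqrt(10))
observed_t = (45.0 - 40.0) / (29.3 / math.sqrt(10))
freq = tail_frequency_under_null(observed_t)
print(f"share of null-true samples with t >= {observed_t:.2f}: {freq:.1%}")
```

The printed share should come out near 30%, and it is a statement about samples drawn when the null is true. Nothing in the simulation says how likely the alternative is, which is the point of the fallacy discussion.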
 

David Harper CFA FRM

Staff member
Subscriber
#6
@Harshit Chawla sure thing, it's the sort of bad question that ends up being instructive! btw, I just realized that I glossed right past the fact that we have here a small sample without any knowledge of the distribution: we are not actually justified in using the student's t! CLT would justify the Z/t if the sample were large, but this is a small sample so we need to be told the distribution is normal, see https://www.bionicturtle.com/forum/threads/confidence-intervals-p1-t2-miller-chpt-7.10669/post-52063
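To see why the normality assumption matters with n = 10, one can run a small simulation that compares a normal population with a skewed one. This is a hedged sketch only: the exponential population with mean 40 is my own arbitrary choice of a non-normal example, and `tail_frequency`, the seed and the trial count are illustrative names and values, not anything from Miller.

```python
import math
import random
import statistics


def tail_frequency(draw, true_mean, threshold, n=10, trials=20000, seed=7):
    """Share of simulated samples of size n whose t statistic, measured
    against the population's true mean, is at least `threshold`."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [draw(rng) for _ in range(n)]
        se = statistics.stdev(sample) / math.sqrt(n)
        if (statistics.mean(sample) - true_mean) / se >= threshold:
            hits += 1
    return hits / trials


threshold = 0.54  # the test statistic discussed in the thread

# Normal population: the student's t with 9 df applies exactly
normal_freq = tail_frequency(lambda rng: rng.gauss(40.0, 29.3), 40.0, threshold)

# Assumed skewed population: exponential with mean 40 (rate = 1/40)
skewed_freq = tail_frequency(lambda rng: rng.expovariate(1 / 40.0), 40.0, threshold)

print(f"normal population : {normal_freq:.1%}")
print(f"skewed population : {skewed_freq:.1%}")
```

With normal draws the frequency should sit close to the student's t tail area of about 30%. With the skewed draws there is no such guarantee at this sample size, which is the caveat in this post.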
 