In selection bias, the fund manager drops the non-performers from the list when reporting past performance and shows only the best performers; the manager is biased toward selecting the best performers.
In survivorship bias, by contrast, the best funds perform well and are the ones that survive, while the non-performing funds drop out of the list. The fund manager therefore reports only the returns of the surviving funds. The manager is again biased toward the best performers, namely the funds that have survived, when in fact he should report the performance of the non-surviving funds as well.
This same question used to drive me crazy in the past. I think the "intent" of the source of the data could be the subtle difference you are looking for. In selection bias, given the voluntary nature of reporting for most funds, only those wanting to provide their track record do so, and these are typically funds that are performing well. In survivorship bias, only the firms that "survive" through a period are included; again, this could be more a technicality than an intention to kick poor performers out. Ultimately, though, both lead to the same result: an upward bias in fund performance. That's just how I look at it.
As I don't see a crisp distinction in Constantinides, I looked in the CAIA text (see below, blue emphasis mine). I think this supports @ShaktiRathore's and @Mkaim's interpretations. Logically, it seems to me you could argue that survivorship bias is a sub-class of selection bias since, here, selection bias is the (general) feature of a sample that is not representative of the population. The implication, though, is that sampling bias is due to the method (e.g., @Mkaim's intention), whereas survivorship bias is due specifically to fund death/performance. From CAIA Level I:
"8.8 Sampling and Testing Problems:
This section discusses potential problems when the sample being analyzed is not representative of the population or is not correctly interpreted.
8.8.1 Unrepresentative Data Sets
The validity of a statistical analysis depends on the extent to which the sample or data set on which the analysis is performed is representative of the entire population for which the analyst is concerned. When a sample, subsample, or data set is a biased representation of the population, then statistical tests may be unreliable. A bias is when a sample is obtained or selected in a manner that systematically favors inclusion of observations with particular characteristics that affect the statistical analysis. For example, as privately placed investment pools, the total population or universe of hedge funds is unknown. Suppose that a researcher forms a sample of 100 funds for an in-depth analysis. If the 100 funds were selected at random, then the sample would be an unbiased representation of the population. However, if the 100 funds were selected on the basis of size or years in existence, then the sample would not be representative of the general hedge fund population. Statistical inferences about the entire population should not be made based on this biased sample with regard to such issues as return performance, since return performance is probably related to size and longevity. If the sample tends to contain established and large funds, the sample is likely to contain an upward bias in long-term returns, since these large, established funds probably became large and established by generating higher long-term returns. This is an example of selection bias. Selection bias is a distortion in relevant sample characteristics from the characteristics of the population, caused by the sampling method of selection or inclusion. If the selection bias originates from the decision of fund managers to report or not to report their returns, then the bias is referred to as a self-selection bias.
A number of other related biases have been recognized in alternative investment analysis, especially with regard to the construction of databases of hedge fund returns. For example, survivorship bias is a common problem in investment databases in which the sample is limited to those observations that continue to exist through the end of the period of study. Funds that liquidated, failed, or closed, perhaps due to poor returns, would be omitted." -- Chambers, Donald R.; Anson, Mark J. P.; Black, Keith H.; Kazemi, Hossein (2015-08-18). Alternative Investments: CAIA Level I (Wiley Finance) (Kindle Locations 6559-6572). Wiley. Kindle Edition.
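To see how the two mechanisms in the CAIA passage both push a reported average upward, here is a rough Python sketch. Everything in it is an assumption made for illustration, not something from the CAIA text: returns are i.i.d. normal, a fund "dies" the first time its value falls below a hypothetical `death_level`, and a manager "self-selects" into the database only if the fund's own average return beats a hypothetical hurdle.

```python
import random
import statistics


def simulate_funds(n_funds=2000, n_years=10, mu=0.05, sigma=0.15,
                   death_level=0.7, seed=42):
    """Return a list of (yearly_returns, survived) tuples, one per fund.

    All parameters are hypothetical. Each fund starts at a value of 1.0
    and liquidates the first time its value drops below death_level.
    """
    rng = random.Random(seed)
    funds = []
    for _ in range(n_funds):
        value, returns, survived = 1.0, [], True
        for _ in range(n_years):
            r = rng.gauss(mu, sigma)
            returns.append(r)
            value *= 1.0 + r
            if value < death_level:
                survived = False  # fund closes; no further returns exist
                break
        funds.append((returns, survived))
    return funds


def mean_return(funds):
    """Average of every yearly return across the given funds."""
    return statistics.mean(r for returns, _ in funds for r in returns)


HURDLE = 0.05  # hypothetical: managers report only if their average beats this

funds = simulate_funds()

# Full population: every fund, dead or alive, reporting or not.
population_mean = mean_return(funds)

# Survivorship bias: the sample keeps only funds alive at the end of the period.
survivor_mean = mean_return([f for f in funds if f[1]])

# Self-selection bias: the sample keeps only funds whose managers chose to report.
self_selected_mean = mean_return(
    [f for f in funds if statistics.mean(f[0]) > HURDLE]
)

print(f"population:    {population_mean:.4f}")
print(f"survivors:     {survivor_mean:.4f}")
print(f"self-selected: {self_selected_mean:.4f}")
```

Under these assumptions both restricted samples overstate the population average. The difference between them is only in the filter: one drops funds because they ceased to exist, the other because the manager decided not to report.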
Despite all the good explanations, I am still confused about the definitions of the different biases. Those definitions don't seem mutually exclusive to me. As you mentioned, survivorship bias might be a sub-class of selection bias, but it is not presented as such in the FRM curriculum.
1/ Sorry in advance for posting a question from Schweser: "Blue Sky Funds, a private equity fund, has suffered low returns for the last 5 years. As a result, the fund has decided to quit reporting returns. The fund did report returns each year for the last 10 years when performance was strong. This problem of reporting leads to:
A) Survivorship bias
B) Sample selection bias
C) Infrequent trading bias
D) Attrition bias"
Why is it Survivorship bias and not sample selection bias, since the fund did survive? Am I missing something?
2/ In the following study material, I do not see the difference between measurement bias and selection bias:
R80.P2.T8.Constantinides Study Notes
Say a hedge fund does not report to any database (as in the Schweser example above). What should the bias then be called?
- According to the above study material, it could be qualified as measurement bias: "and an unknown number (of hedge funds) do not report to any database, past and present. "
- According to Schweser it would be survivorship bias
- According to the above study material, it could be selection bias as well since selection bias = self-reporting bias.
3/ In the following study material, selection bias is defined differently:
R75.P2.T8.Ang "Sample selection bias results from the tendency of returns only to be observed when underlying asset values are high. For instance, buildings tend to be sold when their values are high – otherwise, many sellers postpone sales until property values recover"
Here the bias does not arise from the reporter deciding whether or not to report data; it comes from an observed tendency for certain assets to be sold only when the price is high. The concept sounds completely different to me.
Does "selection bias" mean two different things depending on the context (illiquid assets vs. hedge fund performance)?
Could you please help shed some light on this topic?
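On the Ang definition in point 3/, a small Python sketch may help show that it has the same statistical shape as the hedge fund case: the sample is filtered by something correlated with the return itself. The numbers and the selling rule are assumptions for illustration only (normally distributed value changes, and an owner who sells, so that a price is observed, only when the value has gone up).

```python
import random
import statistics

rng = random.Random(7)

# Hypothetical true one-period value changes for 10,000 buildings.
true_changes = [rng.gauss(0.02, 0.10) for _ in range(10_000)]

# Hypothetical selling rule: a transaction (and hence a return) is only
# observed when the value has risen; otherwise the owner postpones the sale.
observed_changes = [c for c in true_changes if c > 0.0]

true_mean = statistics.mean(true_changes)
observed_mean = statistics.mean(observed_changes)

print(f"buildings:        {len(true_changes)}")
print(f"observed sales:   {len(observed_changes)}")
print(f"true mean change: {true_mean:.4f}")
print(f"observed mean:    {observed_mean:.4f}")
```

In this sketch nobody decides whether to report; the filter is the decision to sell. The observed average is still biased upward, which is consistent with reading Ang's usage and the hedge fund usage as two instances of the same general selection problem rather than two unrelated concepts.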