Wed Mar 06 2024
False Discovery Rate (FDR) Control
FDR Introduction
The relationship between the true state of a hypothesis (null vs. non-null) and the test outcome (reject vs. not reject) can be summarized in the following table:
| | Null is True | Null is False |
|---|---|---|
| Reject Null | Type I Error (False Positive) | True Positive |
| Not Reject Null | True Negative | Type II Error (False Negative) |
From this perspective, a false discovery looks just like a Type I error; the difference is only in the denominator. The Type I error rate is computed over the true nulls, while the false discovery proportion is computed over the rejections.
However, consider the following situation. We run multiple tests at $\alpha = 0.05$ on 100 features, of which only one is truly significant. Among the 99 truly non-significant features, about 5% of the tests (roughly five) are expected to come out as false positives, so even a perfectly powered test of the one real signal leaves a false discovery proportion around 5/6.
From this we can see that the per-test significance level cannot suppress the FDR, especially when the number of true positives is very low. This situation is not uncommon in biostatistics and other high-dimensional (sparse) data.
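A quick simulation can make this concrete. The setup below is my own illustration matching the numbers above (one well-powered true signal, here simply assigned a tiny p-value; 99 uniform null p-values), not an example from the book:

```python
import numpy as np

rng = np.random.default_rng(0)
n_reps, m, alpha = 2000, 100, 0.05
fdp = []  # false discovery proportion per replication
for _ in range(n_reps):
    # 99 null features have Uniform(0, 1) p-values; the single non-null
    # test is assumed to be well powered, so it gets a tiny p-value.
    p = np.concatenate([rng.uniform(size=m - 1), [1e-6]])
    reject = p <= alpha
    false_disc = reject[:-1].sum()        # rejections among the 99 nulls
    fdp.append(false_disc / max(reject.sum(), 1))
print(f"average false discovery proportion: {np.mean(fdp):.2f}")  # around 0.8
```

Even though each individual test holds its 5% level, roughly four out of five discoveries are false.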
Benjamini-Hochberg Procedure (BH(q))
The most classic method for controlling FDR is the Benjamini-Hochberg procedure. Its basic idea: instead of comparing every p-value to a fixed level $\alpha$ (e.g., 0.05), compare the sorted p-values to a threshold that depends on the rank $i$ and a fixed constant $q$, thereby controlling the overall false discovery rate. The specific steps are as follows:
- Set a constant $q$, for example 0.1 (a controversial choice, discussed later);
- Sort the p-values of the $m$ hypothesis tests from low to high, denoted $p_{(1)} \le p_{(2)} \le \cdots \le p_{(m)}$;
- Find the maximum index $i_{\max}$ such that $p_{(i_{\max})} \le \frac{i_{\max}}{m} q$:
  - If $i \le i_{\max}$: reject $H_{(i)}$;
  - Otherwise, accept $H_{(i)}$.
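The steps above can be sketched in a few lines (the function name and interface are mine, not the book's):

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.1):
    """Return a boolean mask of hypotheses rejected by BH(q)."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)                      # ranks: p_(1) <= ... <= p_(m)
    thresholds = q * np.arange(1, m + 1) / m   # i * q / m for each rank i
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        i_max = np.flatnonzero(below).max()    # largest rank meeting the bound
        reject[order[: i_max + 1]] = True      # reject H_(1), ..., H_(i_max)
    return reject

# Example: the first four sorted p-values are rejected, the last is not.
print(benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.5], q=0.1))
```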
It requires the p-values to be independent. In that case the expected FDR is bounded above by $q$ (more precisely, by $\frac{m_0}{m} q$, where $m_0$ is the number of true null hypotheses).
Due to the similarity of their definitions, FDR is often compared with the family-wise error rate (FWER). The difference: FWER controls the probability of at least one false rejection among all hypothesis tests, while FDR controls the expected proportion of false rejections among the rejected tests only. FWER tends to be more conservative, as it guards against any error at all, which reduces power (the ability to discover true significance); FDR methods are more popular for high-dimensional data because they offer a better balance between controlling error rates and maintaining high statistical power.
The requirement that the p-values be independent can be relaxed. Under positive regression dependence (PRDS) the procedure remains valid as stated, and under arbitrary dependence it still controls FDR after introducing a correction term (the Benjamini-Yekutieli adjustment, which shrinks the thresholds by the harmonic sum $c(m) = \sum_{i=1}^{m} 1/i$).
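A sketch of that corrected variant (my own illustration; the function name is hypothetical). It is the same scan as BH, with every threshold divided by the harmonic factor $c(m)$:

```python
import numpy as np

def benjamini_yekutieli(pvals, q=0.1):
    """BH scan with thresholds shrunk by c(m); valid under arbitrary dependence."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    c_m = np.sum(1.0 / np.arange(1, m + 1))            # harmonic correction factor
    order = np.argsort(p)
    thresholds = q * np.arange(1, m + 1) / (m * c_m)   # i * q / (m * c(m))
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        reject[order[: np.flatnonzero(below).max() + 1]] = True
    return reject
```

On the example p-values from before, the stricter thresholds reject fewer hypotheses than plain BH, which is the price paid for robustness to dependence.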
Proof of the Upper Bound of FDR Expectation under BH(q)
Algebra
Define $V$ as the number of tests for which $H_i$ is rejected but $H_i$ is actually true, and $R$ as the total number of rejections. Then:

$$\mathrm{FDR} = E\left[\frac{V}{\max(R, 1)}\right]$$

Let $i_{\max}$ be the largest index selected by the procedure; then $R = i_{\max}$, and every rejected p-value satisfies $p_{(i)} \le p_{(i_{\max})}$.

Therefore, the rejection region for the p-values is $[0, t]$, having:

$$t = \frac{i_{\max}}{m} q = \frac{R}{m} q$$

Let $m_0$ denote the number of true null hypotheses.

And the null distribution of each $p_i$ is known (uniform on $[0, 1]$, so $P(p_i \le t) = t$), thus:

$$E[V \mid t] \approx m_0 t = \frac{m_0}{m} q R$$

Therefore:

$$\frac{V}{R} \approx \frac{m_0}{m} q \le q$$

(This treats the threshold $t$ as if it were fixed in advance; a rigorous argument must account for $t$ depending on the data, and under independence yields exactly $E[V/\max(R, 1)] = \frac{m_0}{m} q$.)
Visual and Algebraic Integration
The figure in the book suggests an intuitive argument that is not strictly rigorous but hard to fault.

(Figure: the sorted p-values $p_{(i)}$ plotted against their rank $i$, together with the rejection line of slope $q/m$.)

The thresholds $\frac{i}{m} q$ lie on a line through the origin with constant slope $q/m$. The procedure rejects everything up to the last crossing, at height $h = \frac{i_{\max}}{m} q$, so we have:

$$R = i_{\max} = \frac{m h}{q}$$

For independently distributed p-values, each of the $m_0$ true nulls (the true negatives included) falls below $h$, and is therefore judged positive, with probability $h$, so:

$$E[V] \approx m_0 h, \qquad \mathrm{FDR} \approx \frac{m_0 h}{m h / q} = \frac{m_0}{m} q \le q$$
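The bound can also be checked by simulation. The setup below is an assumption of mine, not from the book: $m_0 = 90$ uniform nulls plus 10 strong signals drawn from a Beta(0.05, 5), so the empirical FDR should sit near $(m_0/m)q = 0.09$:

```python
import numpy as np

rng = np.random.default_rng(1)
m, m0, q, n_reps = 100, 90, 0.1, 2000
fdp = []
for _ in range(n_reps):
    # First m0 entries are true nulls; the rest are non-nulls with small p-values.
    p = np.concatenate([rng.uniform(size=m0), rng.beta(0.05, 5, size=m - m0)])
    order = np.argsort(p)
    below = p[order] <= q * np.arange(1, m + 1) / m   # BH(q) scan
    v, r = 0, 0
    if below.any():
        rejected = order[: np.flatnonzero(below).max() + 1]
        r = rejected.size
        v = int(np.sum(rejected < m0))                # false discoveries
    fdp.append(v / max(r, 1))
print(f"empirical FDR: {np.mean(fdp):.3f}  (bound: {m0 / m * q:.3f})")
```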
Questions
The proof gives a rough feeling for why the BH method can control FDR, but some details of its criterion still raise questions:
- Why find the largest index $i_{\max}$ satisfying the condition and reject all $H_{(i)}$ with $i \le i_{\max}$, instead of rejecting only those $i$ for which $p_{(i)} \le \frac{i}{m} q$ holds? The examples given in the book reach the same conclusion either way, but counterexamples are easy to imagine.
- The details of choosing $q$. Since FDR is already more permissive than FWER, in that it controls the error rate while maintaining high statistical power, why is the customary choice of $q$ (e.g., 0.1) even more lenient than the usual p-value threshold?
- The combination with empirical Bayes.
References:
- https://cpb-us-w2.wpmucdn.com/blog.nus.edu.sg/dist/0/3425/files/2018/10/Understanding-Benjamini-Hochberg-method-2ijolq0.pdf
- Bradley Efron, *Large-Scale Inference: Empirical Bayes Methods for Estimation, Testing and Prediction*