Abstract
Gaussian Graphical Models (GGMs) have been used to construct genetic regulatory networks, where regularization techniques are widely used because network inference usually falls into a high-dimension-low-sample-size scenario. Yet finding the right amount of regularization can be challenging, especially in an unsupervised setting where traditional methods such as BIC or cross-validation often do not work well. In this paper, we propose a new method, Bootstrap Inference for Network COnstruction (BINCO), to infer networks by directly controlling the false discovery rates (FDRs) of the selected edges. This method fits a mixture model for the distribution of edge selection frequencies to estimate the FDRs, where the selection frequencies are calculated via model aggregation. The method applies to a wide range of problems beyond network construction. When we applied it to building a gene regulatory network from microarray expression data for breast cancer, we were able to identify high-confidence edges and well-connected hub genes that could potentially play important roles in understanding the underlying biological processes of breast cancer.
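
The idea of edge selection frequencies obtained by model aggregation can be illustrated with a minimal sketch. This is not the BINCO procedure itself: the edge selector below (a ridge-stabilized inverse covariance followed by a threshold on partial correlations) is a simple stand-in for whichever regularized GGM fit is actually used, and the function name `selection_frequencies` and the parameters `n_boot`, `threshold`, and `ridge` are hypothetical names chosen for this illustration.

```python
import numpy as np


def selection_frequencies(X, n_boot=50, threshold=0.3, ridge=0.1, seed=0):
    """Fraction of bootstrap resamples in which each edge is selected.

    Assumption: an edge (i, j) counts as selected when the absolute partial
    correlation from a ridge-stabilized precision matrix exceeds `threshold`.
    This is a stand-in selector, not the estimator used in the paper.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros((p, p))
    for _ in range(n_boot):
        # Resample the n observations with replacement.
        idx = rng.integers(0, n, size=n)
        cov = np.cov(X[idx], rowvar=False)
        # The ridge term keeps the covariance invertible when p is large.
        prec = np.linalg.inv(cov + ridge * np.eye(p))
        d = np.sqrt(np.diag(prec))
        # Partial correlations are the negated, rescaled precision entries.
        pcor = -prec / np.outer(d, d)
        selected = np.abs(pcor) > threshold
        np.fill_diagonal(selected, False)
        counts += selected
    return counts / n_boot
```

Each entry of the returned matrix lies between 0 and 1. A mixture model of the kind the abstract describes would then be fitted to the distribution of these frequencies to estimate the FDR at a given frequency cutoff; that step is not shown here.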
Original language | English |
---|---|
Pages (from-to) | 391-417 |
Number of pages | 27 |
Journal | Annals of Applied Statistics |
Volume | 7 |
Issue number | 1 |
State | Published - Mar 2013 |
Externally published | Yes |
Keywords
- FDR
- GGM
- High dimensional data
- Mixture model
- Model aggregation