Towards scalable and data efficient learning of Markov boundaries

  • Jose M. Peña
  • Roland Nilsson
  • Johan Björkegren
  • Jesper Tegnér

Research output: Contribution to journal › Article › peer-review

194 Scopus citations

Abstract

We propose algorithms for learning Markov boundaries from data without having to learn a Bayesian network first. We study their correctness, scalability and data efficiency. The last two properties are important because we aim to apply the algorithms to identify the minimal set of features that is needed for probabilistic classification in databases with thousands of features but few instances, e.g. gene expression databases. We evaluate the algorithms on synthetic and real databases, including one with 139,351 features.

Original language: English
Pages (from-to): 211-232
Number of pages: 22
Journal: International Journal of Approximate Reasoning
Volume: 45
Issue number: 2
State: Published - Jul 2007
Externally published: Yes

Keywords

  • Bayesian networks
  • Classification
  • Feature subset selection
