Bayesian network basics

A short crash course on Bayesian networks.

Julián Equihua https://example.com/norajones
10-09-2021

Bayesian networks

Bayesian networks were invented by the computer scientist Judea Pearl.

Figure 1: Paper introducing Bayesian networks.

BNs are probabilistic graphical models, a family of models in which conditional dependencies between random variables are expressed by means of a graph.

In particular, BNs only work with graphs that are directed and contain no cycles: directed acyclic graphs (DAGs).
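As a rough illustration of the acyclicity requirement, the sketch below checks whether a candidate structure is a DAG by attempting a topological sort with Python's standard `graphlib` module (Python 3.9 or later). The helper name `is_dag` and the two example graphs are hypothetical and not taken from any BN library.

```python
from graphlib import TopologicalSorter, CycleError

def is_dag(parents):
    """Return True if the graph described by `parents` has no directed cycle.

    `parents` maps each node to the list of its parent nodes.
    """
    try:
        # Consuming static_order() raises CycleError if a directed cycle exists.
        list(TopologicalSorter(parents).static_order())
        return True
    except CycleError:
        return False

chain = {"X": [], "Y": ["X"], "Z": ["Y"]}       # X -> Y -> Z
cyclic = {"X": ["Z"], "Y": ["X"], "Z": ["Y"]}   # X -> Y -> Z -> X

print(is_dag(chain))   # True
print(is_dag(cyclic))  # False
```

Only the first structure could serve as the graph of a BN; the second one contains a directed cycle.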

Figure 2: A simple three-node DAG.

Each node (or vertex) represents a random variable.

Each edge (or arc) represents a conditional dependence between the variables it connects.

The DAG defines a factorized version of the joint probability distribution of the set of random variables.

\(P(X,Y,Z) = P(X)\,P(Y|X)\,P(Z|Y)\)

Figure 3: A simple three-node BN.

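This factorization follows from the chain rule of probability, with each factor simplified using the conditional independences encoded in the DAG.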

Figure 4: Conditional independence in Judea Pearl's paper.

Each node is conditionally independent of its non-descendants given the values of its parents. In the previous BN, this means that once the state of \(Y\) is observed, \(Z\) is no longer influenced by \(X\).
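The conditional independence statement can be checked numerically. The sketch below assumes a chain \(X \to Y \to Z\) over binary variables, builds the joint distribution from the factors \(P(X)\), \(P(Y|X)\) and \(P(Z|Y)\), and then compares \(P(Z|X,Y)\) across values of \(X\). All probability values are made up for illustration.

```python
from itertools import product

# Hypothetical CPTs for binary variables in a chain X -> Y -> Z.
p_x = {0: 0.7, 1: 0.3}                                     # P(X)
p_y_given_x = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}   # P(Y | X), keyed [x][y]
p_z_given_y = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.1, 1: 0.9}}   # P(Z | Y), keyed [y][z]

# Joint distribution from the factorization P(X) P(Y|X) P(Z|Y).
joint = {
    (x, y, z): p_x[x] * p_y_given_x[x][y] * p_z_given_y[y][z]
    for x, y, z in product((0, 1), repeat=3)
}

def p_z_given_xy(z, x, y):
    """P(Z = z | X = x, Y = y), computed from the joint table."""
    return joint[(x, y, z)] / sum(joint[(x, y, k)] for k in (0, 1))

# The eight joint probabilities should add up to one.
total = sum(joint.values())

# Once Y is fixed, changing X should leave the distribution of Z unchanged.
max_gap = max(
    abs(p_z_given_xy(z, 0, y) - p_z_given_xy(z, 1, y))
    for y, z in product((0, 1), repeat=2)
)
print(total, max_gap)
```

Up to floating-point error, `total` is one and `max_gap` is zero: after conditioning on \(Y\), the value of \(X\) carries no further information about \(Z\).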

BNs make the estimation of complex, high-dimensional probability distributions computationally feasible.

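Suppose we have \(n\) nodes corresponding to binary random variables and \(m\) arcs joining them. An unrestricted joint distribution needs \(2^n - 1\) free parameters, whereas the BN needs only \(\sum_{i=1}^{n} 2^{k_i}\), where \(k_i\) is the number of parents of node \(i\) and \(\sum_i k_i = m\).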
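To make the saving concrete, the sketch below counts free parameters for binary variables in two ways: for an unrestricted joint table, and for a BN in which each node needs one free probability per configuration of its parents. The function names and the ten-node chain are hypothetical examples.

```python
def full_joint_params(n):
    """Free parameters of an unrestricted joint table over n binary variables."""
    return 2 ** n - 1

def bn_params(parents):
    """Free parameters of a BN over binary variables.

    Each node needs one free probability per configuration of its parents,
    that is, 2 ** (number of parents).
    """
    return sum(2 ** len(pa) for pa in parents.values())

# Hypothetical chain of 10 binary nodes: V0 -> V1 -> ... -> V9 (9 arcs).
chain = {f"V{i}": ([f"V{i - 1}"] if i > 0 else []) for i in range(10)}

print(full_joint_params(10))  # 1023
print(bn_params(chain))       # 19
```

The gap widens quickly with \(n\): the joint table grows exponentially in the number of nodes, while the BN grows exponentially only in the number of parents per node.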

Differences in BN structure matter: two DAGs that look alike can encode different conditional independence assumptions.

Figure 5:  Two apparently similar BNs are very different.

Why Bayesian networks (revisited)?

Figure 6: Variables related in some way to EI.

How do these variables influence EI?

Linear regression, for example, would model EI as a weighted sum of these variables.

Figure 7: Linear model caricature.

But don’t we also want to explain how apex predators influence carbon capture?

Figure 8: What about relations between explanatory variables?

BNs allow us to propose a complex criss-cross structure for relations between variables.

Figure 9: Our current EI model.

Types of BNs

Discrete: all nodes are categorical. The parameters of the local distribution are specified as conditional probability tables. Each column of such a table represents the probability distribution of the node conditional on a particular configuration of levels of the parents.
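A conditional probability table can be estimated by simple counting. The sketch below is a minimal illustration with a single categorical parent. The data, the level names, and the helper `estimate_cpt` are hypothetical. Each inner dictionary plays the role of one column of the table, that is, the distribution of the node for one level of the parent.

```python
from collections import Counter

# Hypothetical observations of (parent level, node level).
data = [("low", "no"), ("low", "no"), ("low", "yes"),
        ("high", "yes"), ("high", "yes"), ("high", "no"), ("high", "yes")]

def estimate_cpt(pairs):
    """Maximum-likelihood CPT: cpt[parent_level][node_level] = P(node | parent)."""
    pair_counts = Counter(pairs)
    parent_counts = Counter(parent for parent, _ in pairs)
    node_levels = sorted({node for _, node in pairs})
    return {
        parent: {node: pair_counts[(parent, node)] / n for node in node_levels}
        for parent, n in parent_counts.items()
    }

cpt = estimate_cpt(data)
for parent, column in cpt.items():
    # Each column is a probability distribution, so it sums to one.
    print(parent, column, sum(column.values()))
```

With several parents, the same idea applies with one column per joint configuration of the parents' levels.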

Continuous: all nodes are continuous. Continuous nodes are modeled as linear regressions in Gaussian Bayesian networks; as such, the relevant parameters of each local distribution are the regression coefficients (one for each of the parents) and the standard deviation of the residuals.
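The local distribution of a Gaussian node can be sketched as an ordinary least-squares fit. The example below simulates a hypothetical node \(Y\) with a single parent \(X\) and recovers the intercept, the regression coefficient for the parent, and the residual standard deviation. The true values (1.0, 2.0 and 0.5) and the helper `fit_gaussian_node` are assumptions chosen for illustration.

```python
import random
from statistics import fmean

random.seed(0)

# Simulate a hypothetical Gaussian node Y with a single parent X:
# Y = 1.0 + 2.0 * X + noise, with noise ~ N(0, 0.5^2).
xs = [random.gauss(0.0, 1.0) for _ in range(5000)]
ys = [1.0 + 2.0 * x + random.gauss(0.0, 0.5) for x in xs]

def fit_gaussian_node(xs, ys):
    """Ordinary least squares for one parent: (intercept, slope, residual sd)."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    # n - 2 degrees of freedom: two coefficients were estimated.
    sd = (sum(r ** 2 for r in residuals) / (len(xs) - 2)) ** 0.5
    return intercept, slope, sd

intercept, slope, sd = fit_gaussian_node(xs, ys)
print(round(intercept, 2), round(slope, 2), round(sd, 2))
```

The estimates should land close to the simulated values. A node with several parents would use a multiple regression with one coefficient per parent.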

Hybrid: nodes may be categorical or continuous.