Tuesday, July 19, 2016

Independence Day


In Daphne Koller's excellent "Probabilistic Graphical Models", she discusses independence between variables and leaves it as an exercise for the reader (viewer) to convince themselves that, of the following two Conditional Probability Distributions (CPDs), one represents independent variables and the other represents dependent variables.



Being a bit slow, I took some time to work it out. One must remember that independence is defined as:

For events α and β, P ⊨ α ⊥ β (read: "P satisfies that α and β are independent") if:

P(α, β)  = P(α) P(β)
P(α | β) = P(α) 
P(β | α) = P(β)

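These three equations are equivalent, so checking any one of them is enough (the two conditional forms assume that the event being conditioned on has non-zero probability).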
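A short derivation of why the three forms coincide may help. This is only a sketch, and it assumes P(α) > 0 and P(β) > 0 so that both conditional probabilities are defined:

```latex
\begin{align*}
P(\alpha \mid \beta) &= \frac{P(\alpha, \beta)}{P(\beta)}
  && \text{(definition of conditional probability)} \\
P(\alpha, \beta) = P(\alpha)\,P(\beta)
  &\iff \frac{P(\alpha, \beta)}{P(\beta)} = P(\alpha)
  \iff P(\alpha \mid \beta) = P(\alpha) \\
P(\alpha, \beta) = P(\alpha)\,P(\beta)
  &\iff \frac{P(\alpha, \beta)}{P(\alpha)} = P(\beta)
  \iff P(\beta \mid \alpha) = P(\beta)
\end{align*}
```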

Now, back to our two CPDs. In the left-hand one, the ratio of d0 to d1 is the same whether we are given i0 or i1 (in SQL-land, imagine a "group by"; it's called marginalization in probability distributions). That is, having been told the value of i, the distribution over d is unchanged: P(d | i) = P(d). Therefore, the equations above are satisfied and i and d are independent.

The same is not true of the CPD on the right-hand side; therefore, in this CPD, i and d are not independent.
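The check can be sketched in a few lines of Python. The numbers below are hypothetical, made up purely for illustration (they are not the tables from the lectures), and the function names are my own. The idea is to marginalize the joint table to get P(i) and P(d), then test whether every cell equals the product of its two marginals.

```python
from itertools import product


def marginal(joint, axis):
    """Sum the joint table over the other variable (the SQL "group by")."""
    out = {}
    for key, p in joint.items():
        out[key[axis]] = out.get(key[axis], 0.0) + p
    return out


def is_independent(joint, tol=1e-9):
    """True if P(i, d) == P(i) * P(d) for every cell, within a tolerance."""
    p_i = marginal(joint, 0)
    p_d = marginal(joint, 1)
    return all(
        abs(joint.get((i, d), 0.0) - p_i[i] * p_d[d]) <= tol
        for i, d in product(p_i, p_d)
    )


# Hypothetical joint tables over (i, d); each sums to 1.
left = {
    ("i0", "d0"): 0.42, ("i0", "d1"): 0.28,
    ("i1", "d0"): 0.18, ("i1", "d1"): 0.12,
}
right = {
    ("i0", "d0"): 0.35, ("i0", "d1"): 0.35,
    ("i1", "d0"): 0.06, ("i1", "d1"): 0.24,
}

print(is_independent(left))   # d0:d1 is 3:2 for both i0 and i1
print(is_independent(right))  # d0:d1 is 1:1 for i0 but 1:4 for i1
```

Comparing against the product of marginals rather than comparing ratios directly avoids dividing by zero when a cell is empty.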

In other words, if we call the left-hand graph of dependencies G1 and the right-hand one G2, then I(G1) = { D ⊥ I } and I(G2) = ∅. This introduces the notion of I-Maps, defined below.

I-Maps

"Let P be a distribution over X. We define I(P) to be the set of independence assertions of the form (X ⊥ Y | Z) that hold in P . We can now rewrite the statement that “P satisfies the local independencies associated with G” simply as I(G) ⊆ I(P). In this case, we say that G is an I-map (independency map) for P."

What "satisfies" means here is tied to how P factorizes: "a distribution P satisfies the local independencies associated with a graph G if and only if P is representable as a set of CPDs associated with the graph G" [1]
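To make the subset condition concrete, here is a toy sketch using the two graphs from earlier. The representation (an unordered pair of names standing for one independence assertion) and the function name are my own assumptions, not anything from the book, and conditioning sets are ignored since the two-variable example has none.

```python
# One independence assertion, (D ⊥ I), encoded as an unordered pair.
D_INDEP_I = frozenset({"D", "I"})

# Independencies asserted by each graph, as stated in the text.
I_G1 = {D_INDEP_I}  # no edge between D and I
I_G2 = set()        # D and I connected, so nothing is asserted


def is_i_map(i_graph, i_dist):
    """G is an I-map for P when I(G) is a subset of I(P)."""
    return i_graph <= i_dist


# Independencies that actually hold in two hypothetical distributions.
I_P_independent = {D_INDEP_I}  # D and I really are independent
I_P_dependent = set()          # D and I are dependent

print(is_i_map(I_G1, I_P_independent))  # True
print(is_i_map(I_G2, I_P_independent))  # True
print(is_i_map(I_G1, I_P_dependent))    # False
print(is_i_map(I_G2, I_P_dependent))    # True
```

Note that G2, which asserts nothing, is an I-map for both distributions: a graph can only fail to be an I-map by claiming an independence that P does not have.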

[1] Probabilistic Graphical Models: Principles and Techniques, Daphne Koller and Nir Friedman.
