A Brief Introduction to Mutual Information and a Demonstration with R
$\newcommand{\entropfrac}[2]{\frac{#1}{#2} \log \left( \frac{#1}{#2} \right)}$
Mutual Information (MI)
Introduction
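Mutual Information (MI) distance is used to measure the distance between two gene vectors, for example $x_1 = \{1, 0, 1, 1, 1, 1, 0\}$ and $y_1 = \{0, 1, 1, 1, 1, 1, 0\}$. The two vectors are easily summarized in a binary table, where rows give the symbol in $X$, columns give the symbol in $Y$, and $a$, $b$, $c$, $d$ count the positions showing each combination: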
| X / Y | 1 (Presence) | 0 (Absence) | Sum |
|---|---|---|---|
| 1 (Presence) | $a$ | $b$ | $a+b$ |
| 0 (Absence) | $c$ | $d$ | $c+d$ |
| Sum | $a+c$ | $b+d$ | $n=a+b+c+d$ |
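As a small illustration of how the counts can be obtained, the sketch below tabulates the two example vectors with base R. The object name `counts` and the explicit level order `c(1, 0)` are choices made here so that the layout matches the table; they are not part of the original text.

```r
# The two example gene vectors from the text.
x1 <- c(1, 0, 1, 1, 1, 1, 0)
y1 <- c(0, 1, 1, 1, 1, 1, 0)

# Fix the level order to 1, 0 so that the layout matches the binary table
# (rows: X, columns: Y) and so that empty cells are kept as 0.
counts <- table(X = factor(x1, levels = c(1, 0)),
                Y = factor(y1, levels = c(1, 0)))

# Append the row and column sums, i.e. the "Sum" margins of the table.
addmargins(counts)
```

For these vectors the cells come out as $a = 4$ and $b = c = d = 1$, with $n = 7$.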
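For two discrete variables $X$ and $Y$, such as the gene vectors $x_1$ and $y_1$, the mutual information is

\begin{equation}
I(X; Y) = \sum_{y \in Y} \sum_{x \in X} p(x, y) \log \left( \frac{p(x, y)}{p(x)\,p(y)} \right)
\label{eq:1}
\end{equation}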
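Equation $\eqref{eq:1}$ is equivalent to

\begin{equation}
I(X; Y) = \sum_{y \in Y} \sum_{x \in X} p(x, y) \log p(x, y) - \sum_{x \in X} p(x) \log p(x) - \sum_{y \in Y} p(y) \log p(y)
\label{eq:2}
\end{equation}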
Here $p(x)$ is the probability that a symbol (here 0 or 1) appears in gene vector $X$, regardless of the symbol at the same position in gene vector $Y$; $p(y)$ is defined analogously for $Y$. $p(x, y)$ is the probability that a given combination of symbols appears at the same position in $X$ and $Y$. In this example there are four possible combinations: $(1, 1)$, $(1, 0)$, $(0, 1)$ and $(0, 0)$.
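Expressed in terms of the binary table, $\eqref{eq:1}$ becomes:

\begin{equation}
\begin{aligned}
I(X; Y) ={} & \frac{a}{n} \log \left( \frac{a/n}{\frac{a+b}{n} \cdot \frac{a+c}{n}} \right)
            + \frac{b}{n} \log \left( \frac{b/n}{\frac{a+b}{n} \cdot \frac{b+d}{n}} \right) \\
          & + \frac{c}{n} \log \left( \frac{c/n}{\frac{c+d}{n} \cdot \frac{a+c}{n}} \right)
            + \frac{d}{n} \log \left( \frac{d/n}{\frac{c+d}{n} \cdot \frac{b+d}{n}} \right)
\end{aligned}
\label{eq:3}
\end{equation}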
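Equation $\eqref{eq:3}$ is mathematically equivalent to:

\begin{equation}
\begin{aligned}
I(X; Y) ={} & \entropfrac{a}{n} + \entropfrac{b}{n} + \entropfrac{c}{n} + \entropfrac{d}{n} \\
          & - \entropfrac{a+b}{n} - \entropfrac{c+d}{n} - \entropfrac{a+c}{n} - \entropfrac{b+d}{n}
\end{aligned}
\label{eq:4}
\end{equation}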
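To check that the two table-based forms agree, here is a minimal sketch that computes MI directly from the four cell counts in both ways. The helper names `plogp`, `mi_from_counts` and `mi_direct` are hypothetical, the natural logarithm is assumed, and terms with a zero count are taken to contribute 0.

```r
# Hypothetical helper: (k/n) * log(k/n), with the convention 0 * log(0) = 0.
plogp <- function(k, n) ifelse(k > 0, (k / n) * log(k / n), 0)

# Entropy-style form: four cell terms minus the four marginal terms.
mi_from_counts <- function(a, b, c, d) {
  n <- a + b + c + d
  plogp(a, n) + plogp(b, n) + plogp(c, n) + plogp(d, n) -
    plogp(a + b, n) - plogp(c + d, n) - plogp(a + c, n) - plogp(b + d, n)
}

# Definition-style form: sum of p(x, y) * log(p(x, y) / (p(x) * p(y))).
mi_direct <- function(a, b, c, d) {
  joint <- matrix(c(a, b, c, d), nrow = 2, byrow = TRUE) / (a + b + c + d)
  expected <- outer(rowSums(joint), colSums(joint))  # p(x) * p(y) per cell
  nz <- joint > 0                                    # skip empty cells
  sum(joint[nz] * log(joint[nz] / expected[nz]))
}

# Counts for the example vectors x1 and y1: a = 4, b = c = d = 1.
mi_from_counts(4, 1, 1, 1)
all.equal(mi_from_counts(4, 1, 1, 1), mi_direct(4, 1, 1, 1))
```

For the example counts both functions give roughly 0.043 (in nats), and a table with equal counts in all four cells gives 0, as expected for independent vectors.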
Example
We can use R to calculate the MI between the two gene vectors $x_1$ and $y_1$ introduced above.
 Use basic R function
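One way to do this with base R only is sketched below: it estimates the joint and marginal probabilities with `table()` and combines their entropies. The helper `entropy` and the use of the natural logarithm are assumptions made for this illustration.

```r
x1 <- c(1, 0, 1, 1, 1, 1, 0)
y1 <- c(0, 1, 1, 1, 1, 1, 0)

# Hypothetical helper: entropy of a probability vector or table,
# ignoring zero entries (0 * log(0) is taken as 0).
entropy <- function(p) {
  p <- p[p > 0]
  -sum(p * log(p))
}

# Joint probabilities p(x, y); the margins give p(x) and p(y).
p_xy <- table(x1, y1) / length(x1)
p_x <- rowSums(p_xy)
p_y <- colSums(p_xy)

# MI as marginal entropies minus the joint entropy.
mi <- entropy(p_x) + entropy(p_y) - entropy(p_xy)
mi
```

The result is about 0.043 nats; dividing by `log(2)` would express it in bits.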
 Use R package bioDist
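A possible call with the Bioconductor package bioDist is sketched below. It assumes that `mutualInfo()` and `MIdist()` accept a numeric matrix with one gene per row and an `nbin` argument for the number of bins, as described in the package documentation; the matrix name `genes` is chosen here, and the block only runs the calls if the package is installed.

```r
x1 <- c(1, 0, 1, 1, 1, 1, 0)
y1 <- c(0, 1, 1, 1, 1, 1, 0)

# One gene vector per row (assumed input layout for bioDist).
genes <- rbind(x1 = x1, y1 = y1)

if (requireNamespace("bioDist", quietly = TRUE)) {
  # nbin = 2 because each vector only takes the two values 0 and 1.
  print(bioDist::mutualInfo(genes, nbin = 2))
  # MIdist() returns the MI-based distance rather than the MI itself.
  print(bioDist::MIdist(genes, nbin = 2))
} else {
  message("bioDist is not installed; see BiocManager::install(\"bioDist\").")
}
```

Because the package bins the data internally, it is worth comparing its output against a base R calculation before relying on it for binary vectors.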
