In probability theory, when we assert that two events are independent, we intuitively mean that knowing whether or not one of them occurred makes it neither more probable nor less probable that the other occurred. For example, the events "today is Tuesday" and "it rains today" are independent.

Similarly, when we assert that two random variables are independent, we intuitively mean that knowing something about the value of one of them does not yield any information about the value of the other. For instance, the height of a person and their IQ are often modeled as independent random variables. Another typical example of two independent variables is given by repeating an experiment: roll a die twice, let X be the number you get the first time, and Y the number you get the second time. These two variables are independent.

Independent events

We define two events E1 and E2 of a probability space to be independent iff

P(E1 ∩ E2) = P(E1) · P(E2).
Here E1 ∩ E2 (the intersection of E1 and E2) is the event that E1 and E2 both occur; P denotes the probability of an event.

If P(E2) ≠ 0, then the independence of E1 and E2 can also be expressed with conditional probabilities:

P(E1 | E2) = P(E1)
which is closer to the intuition given above: the information that E2 happened does not change our estimate of the probability of E1.
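
The definition can be checked by direct enumeration on a small sample space. The following Python sketch uses two rolls of a fair die; the events chosen (e1: the first roll is even, e2: the two rolls sum to 7) and the helper prob are illustrative assumptions, not part of the definition.

```python
from fractions import Fraction
from itertools import product

# Sample space: all ordered pairs of two fair die rolls, equally likely.
omega = list(product(range(1, 7), repeat=2))

def prob(event):
    """Exact probability of an event given as a predicate on outcomes."""
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

def e1(w):
    return w[0] % 2 == 0        # first roll is even

def e2(w):
    return w[0] + w[1] == 7     # the two rolls sum to 7

def both(w):
    return e1(w) and e2(w)      # intersection of e1 and e2

p1, p2, p12 = prob(e1), prob(e2), prob(both)

# Product form of independence.
independent = (p12 == p1 * p2)

# Conditional form P(E1 | E2), defined here because P(E2) != 0.
conditional = p12 / p2

print(p1, p2, p12, independent, conditional == p1)
```

Exact fractions are used instead of floating point so that the equality test is not affected by rounding.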

If we have more than two events, then pairwise independence is insufficient to capture the intuitive sense of independence. So a set S of events is said to be independent if every finite nonempty subset { E1, ..., En } of S satisfies

P(E1 ∩ ... ∩ En) = P(E1) · ... · P(En).

This is called the multiplication rule for independent events.
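
A standard illustration of why pairwise independence is not enough uses two tosses of a fair coin. In the Python sketch below, the event names a, b, c and the helper prob are hypothetical choices made for this example: every pair of events satisfies the product rule, yet the three events together do not.

```python
from fractions import Fraction
from itertools import combinations, product

# Two fair coin tosses: four equally likely outcomes.
omega = list(product("HT", repeat=2))

def prob(event):
    """Exact probability of an event given as a set of outcomes."""
    return Fraction(sum(1 for w in omega if w in event), len(omega))

a = {w for w in omega if w[0] == "H"}     # first toss is heads
b = {w for w in omega if w[1] == "H"}     # second toss is heads
c = {w for w in omega if w[0] == w[1]}    # the two tosses agree

events = [a, b, c]

# Every pair satisfies the product rule ...
pairwise = all(
    prob(x & y) == prob(x) * prob(y) for x, y in combinations(events, 2)
)

# ... but the triple does not: P(a ∩ b ∩ c) is 1/4, the product is 1/8.
triple = prob(a & b & c) == prob(a) * prob(b) * prob(c)

print(pairwise, triple)
```

Knowing that a and b both occurred determines c completely, which is why the set {a, b, c} fails the definition even though each pair passes.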

Independent random variables

We define random variables X and Y to be independent if

P(X ∈ A and Y ∈ B) = P(X ∈ A) · P(Y ∈ B)
for all Borel subsets A and B of the real numbers.
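
For discrete random variables with finitely many values, it suffices to test subsets of those values rather than all Borel sets. As a minimal sketch under that assumption, the Python code below checks the product rule for X and Y, the results of two rolls of a fair die, over every pair of subsets of {1, ..., 6}; the helper names prob and subsets are illustrative.

```python
from fractions import Fraction
from itertools import chain, combinations, product

faces = range(1, 7)
# Outcome w = (X, Y): results of the first and second roll.
omega = list(product(faces, repeat=2))

def prob(pred):
    """Exact probability of the set of outcomes satisfying pred."""
    return Fraction(sum(1 for w in omega if pred(w)), len(omega))

def subsets(values):
    """All subsets of values, including the empty set."""
    vals = list(values)
    return chain.from_iterable(
        combinations(vals, r) for r in range(len(vals) + 1)
    )

all_subsets = [frozenset(s) for s in subsets(faces)]

# Check P(X in A and Y in B) == P(X in A) * P(Y in B) for all 64 * 64 pairs.
product_rule_holds = all(
    prob(lambda w: w[0] in A and w[1] in B)
    == prob(lambda w: w[0] in A) * prob(lambda w: w[1] in B)
    for A in all_subsets
    for B in all_subsets
)

print(len(all_subsets), product_rule_holds)
```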

If X and Y are independent and have finite expectations, then the expectation operator is multiplicative:

E[X · Y] = E[X] · E[Y]
and, if the variances are also finite, the variance is additive:
Var(X + Y) = Var(X) + Var(Y).
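
Both identities can be verified exactly for the two-dice example. The following Python sketch is an illustration under the assumption that X and Y are the results of two rolls of a fair die; the helpers expect and var are hypothetical names introduced here.

```python
from fractions import Fraction
from itertools import product

# Outcomes (x, y) of two fair die rolls, each with probability 1/36.
omega = list(product(range(1, 7), repeat=2))
p = Fraction(1, len(omega))

def expect(f):
    """Exact expectation of f(X, Y) under the uniform distribution."""
    return sum(f(x, y) * p for x, y in omega)

def var(f):
    """Exact variance of f(X, Y), computed as E[(f - E[f])^2]."""
    mean = expect(f)
    return expect(lambda x, y: (f(x, y) - mean) ** 2)

e_x = expect(lambda x, y: x)
e_y = expect(lambda x, y: y)
e_xy = expect(lambda x, y: x * y)

var_x = var(lambda x, y: x)
var_y = var(lambda x, y: y)
var_sum = var(lambda x, y: x + y)

print(e_xy == e_x * e_y)          # multiplicativity of expectation
print(var_sum == var_x + var_y)   # additivity of variance
```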

Furthermore, if X and Y are independent and have probability densities fX(x) and fY(y), then (X, Y) has a joint density given by
fXY(x, y) dx dy = fX(x) dx · fY(y) dy.

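More generally, random variables X1, ..., Xn are said to be independent if

P(X1 ∈ A1 and ... and Xn ∈ An) = P(X1 ∈ A1) · ... · P(Xn ∈ An)
for all Borel subsets A1, ..., An of the real numbers. An arbitrary collection of random variables is independent if every finite subcollection is.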