10. MULTIVARIATE NORMAL
see that this joint density integrates to 1, go over to polar coordinates x = r cos φ,
y = r sin φ, i.e., compute the joint distribution of r and φ from that of x and y: the
absolute value of the Jacobian determinant is r, i.e., dx dy = r dr dφ, therefore
(10.1.2)   \int_{y=-\infty}^{\infty}\int_{x=-\infty}^{\infty} \frac{1}{2\pi}\, e^{-\frac{x^2+y^2}{2}}\, dx\, dy = \int_{\varphi=0}^{2\pi}\int_{r=0}^{\infty} \frac{1}{2\pi}\, e^{-\frac{r^2}{2}}\, r\, dr\, d\varphi.

By substituting t = r²/2, therefore dt = r\,dr, the inner integral becomes \bigl[-\frac{1}{2\pi} e^{-t}\bigr]_0^{\infty} = \frac{1}{2\pi}; therefore the whole integral is 1. Therefore the product of the integrals of the marginal densities is 1, and since each such marginal integral is positive and they are equal, each of the marginal integrals is 1 too.
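As a quick numerical illustration (not part of the text's derivation), one can confirm on a grid that the bivariate standard normal density integrates to 1; the integration range and grid size below are arbitrary choices:

```python
import numpy as np

# Tabulate the joint density (1/2pi) exp(-(x^2+y^2)/2) on a grid and
# integrate it with the trapezoidal rule; the tails beyond |8| are negligible.
x = np.linspace(-8.0, 8.0, 2001)
y = np.linspace(-8.0, 8.0, 2001)
X, Y = np.meshgrid(x, y)
density = np.exp(-(X**2 + Y**2) / 2) / (2 * np.pi)
total = np.trapz(np.trapz(density, y, axis=0), x)
print(total)  # ≈ 1
```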
Problem 161. 6 points The Gamma function can be defined as \Gamma(r) = \int_0^{\infty} x^{r-1} e^{-x}\, dx. Show that \Gamma(\tfrac{1}{2}) = \sqrt{\pi}. (Hint: after substituting r = 1/2, apply the variable transformation x = z²/2 for nonnegative x and z only, and then reduce the resulting integral to the integral over the normal density function.)
Answer. Then dx = z\,dz and \frac{dx}{\sqrt{x}} = \sqrt{2}\,dz. Therefore one can reduce it to the integral over the normal density:

(10.1.3)   \int_0^{\infty} \frac{1}{\sqrt{x}}\, e^{-x}\, dx = \sqrt{2}\int_0^{\infty} e^{-z^2/2}\, dz = \frac{1}{\sqrt{2}}\int_{-\infty}^{\infty} e^{-z^2/2}\, dz = \frac{\sqrt{2\pi}}{\sqrt{2}} = \sqrt{\pi}.
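The substitution's payoff can be seen numerically: the transformed integrand \sqrt{2}\,e^{-z^2/2} is smooth at 0, unlike x^{-1/2}e^{-x}, so simple quadrature works; the result is compared against the standard library's Gamma function (grid choices are illustrative):

```python
import math

import numpy as np

# Integrate the substituted form of Gamma(1/2) and compare with math.gamma
# and sqrt(pi); the upper limit 10 truncates a negligible tail.
z = np.linspace(0.0, 10.0, 100001)
integral = np.trapz(np.sqrt(2) * np.exp(-z**2 / 2), z)
print(integral, math.gamma(0.5), math.sqrt(math.pi))
```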
10.1. MORE ABOUT THE UNIVARIATE CASE
A univariate normal variable with mean µ and variance σ² is a variable x whose standardized version z = \frac{x-\mu}{\sigma} ∼ N(0, 1). In this transformation from x to z, the Jacobian determinant is \frac{dz}{dx} = \frac{1}{\sigma}; therefore the density function of x ∼ N(µ, σ²) is (two notations, the second is perhaps more modern):

(10.1.4)   f_x(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} = (2\pi\sigma^2)^{-1/2} \exp\bigl(-(x-\mu)^2/2\sigma^2\bigr).
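A small sketch (illustrative, with arbitrary parameter values) checks (10.1.4) against the standardization it was derived from: the N(µ, σ²) density at x equals the N(0, 1) density at z = (x − µ)/σ times the Jacobian 1/σ:

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    # Density (10.1.4): (2*pi*sigma^2)^(-1/2) * exp(-(x-mu)^2 / (2*sigma^2))
    return (2 * math.pi * sigma**2) ** -0.5 * math.exp(-((x - mu) ** 2) / (2 * sigma**2))

mu, sigma, x = 3.0, 2.0, 4.5
z = (x - mu) / sigma
print(normal_pdf(x, mu, sigma), normal_pdf(z) / sigma)  # agreeing values
```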
Problem 162. 3 points Given n independent observations of a Normally distributed variable y ∼ N(µ, 1). Show that the sample mean ȳ is a sufficient statistic for µ. Here is a formulation of the factorization theorem for sufficient statistics, which you will need for this question: Given a family of probability densities f_y(y_1, \ldots, y_n; θ) defined on R^n, which depend on a parameter θ ∈ Θ. The statistic T : R^n → R, (y_1, \ldots, y_n) ↦ T(y_1, \ldots, y_n) is sufficient for parameter θ if and only if there exists a function of two variables g : R × Θ → R, (t, θ) ↦ g(t; θ), and a function of n variables h : R^n → R, (y_1, \ldots, y_n) ↦ h(y_1, \ldots, y_n), so that

(10.1.5)   f_y(y_1, \ldots, y_n; \theta) = g\bigl(T(y_1, \ldots, y_n); \theta\bigr) \cdot h(y_1, \ldots, y_n).
Answer. The joint density function can be written (factorization indicated by ·):

(10.1.6)   (2\pi)^{-n/2}\exp\Bigl(-\frac{1}{2}\sum_{i=1}^{n}(y_i-\mu)^2\Bigr) = (2\pi)^{-n/2}\exp\Bigl(-\frac{1}{2}\sum_{i=1}^{n}(y_i-\bar y)^2\Bigr)\cdot\exp\Bigl(-\frac{n}{2}(\bar y-\mu)^2\Bigr) = h(y_1,\ldots,y_n)\cdot g(\bar y;\mu).
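Sufficiency can be illustrated numerically (a sketch, with made-up samples): for two samples with the same sample mean, the ratio of their joint densities is the same for every µ, because µ enters only through g(ȳ; µ):

```python
import math

def joint_density(ys, mu):
    # Joint N(mu, 1) density of independent observations, as in (10.1.6).
    n = len(ys)
    return (2 * math.pi) ** (-n / 2) * math.exp(-0.5 * sum((y - mu) ** 2 for y in ys))

sample_a = [1.0, 2.0, 3.0]  # sample mean 2.0
sample_b = [0.0, 2.0, 4.0]  # sample mean 2.0 as well
ratios = [joint_density(sample_a, mu) / joint_density(sample_b, mu) for mu in (-1.0, 0.5, 3.0)]
print(ratios)  # all three ratios are equal
```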
10.2. Definition of Multivariate Normal

The multivariate normal distribution is an important family of distributions with very nice properties. But one must be a little careful how to define it. One might naively think a multivariate Normal is a vector random variable each component of which is univariate Normal. But this is not the right definition. Normality of the components is a necessary but not sufficient condition for a multivariate normal vector. If u = (x, y)⊤ with both x and y multivariate normal, u is not necessarily multivariate normal.
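A standard counterexample (not worked out in the text) makes this concrete: take x ∼ N(0, 1) and y = sx, where s is an independent random sign. Then y is also N(0, 1), but x + y vanishes exactly half the time, which is impossible for a normal variable, so (x, y)⊤ is not bivariate normal:

```python
import random

random.seed(0)

# Simulate the pair (x, y) with y = s*x; count how often x + y is exactly 0.
n = 100_000
zeros = 0
for _ in range(n):
    x = random.gauss(0.0, 1.0)
    s = random.choice((-1.0, 1.0))
    y = s * x
    if x + y == 0.0:
        zeros += 1
print(zeros / n)  # ≈ 0.5
```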
Here is a recursive definition from which one gets all multivariate normal distributions:

(1) The univariate standard normal z, considered as a vector with one component, is multivariate normal.

(2) If x and y are multivariate normal and they are independent, then u = (x, y)⊤ is multivariate normal.

(3) If y is multivariate normal, and A a matrix of constants (which need not be square and is allowed to be singular), and b a vector of constants, then Ay + b is multivariate normal. In words: a vector consisting of linear combinations of the same set of multivariate normal variables is again multivariate normal.
For simplicity we will now turn to the bivariate Normal distribution.
10.3. Special Case: Bivariate Normal
The following two simple rules allow one to obtain all bivariate Normal random variables:
(1) If x and y are independent and each of them has a (univariate) normal
distribution with mean 0 and the same variance σ 2 , then they are bivariate normal.
(They would be bivariate normal even if their variances were different and their
means not zero, but for the calculations below we will use only this special case, which
together with principle (2) is sufficient to get all bivariate normal distributions.)
(2) If x = (x, y)⊤ is bivariate normal and P is a 2 × 2 nonrandom matrix and µ a nonrandom column vector with two elements, then P x + µ is bivariate normal as well.
All other properties of bivariate Normal variables can be derived from this.
First let us derive the density function of a bivariate Normal distribution. Write x = (x, y)⊤, where x and y are independent N(0, σ²). Therefore by principle (1) above the
vector x is bivariate normal. Take any nonsingular 2 × 2 matrix P and a 2-vector µ = (µ, ν)⊤, and define u = (u, v)⊤ = P x + µ. We need nonsingularity because otherwise the resulting variable would not have a bivariate density; its probability mass would be concentrated on one straight line in the two-dimensional plane. What is the joint density function of u? Since P is nonsingular, the transformation is one-to-one, therefore we can apply the transformation theorem for densities. Let us first write down the density function of x, which we know:

(10.3.1)   f_{x,y}(x, y) = \frac{1}{2\pi\sigma^2} \exp\Bigl(-\frac{1}{2\sigma^2}(x^2 + y^2)\Bigr).

For the next step, remember that we have to express the old variable in terms of the new one: x = P^{-1}(u − µ). The Jacobian determinant is therefore J = \det(P^{-1}). Also notice that, after the substitution \begin{bmatrix} x \\ y \end{bmatrix} = P^{-1}\begin{bmatrix} u-\mu \\ v-\nu \end{bmatrix}, the exponent in the joint density function of x and y is

-\frac{1}{2\sigma^2}(x^2 + y^2) = -\frac{1}{2\sigma^2}\begin{bmatrix} x \\ y \end{bmatrix}^{\top}\begin{bmatrix} x \\ y \end{bmatrix} = -\frac{1}{2\sigma^2}\begin{bmatrix} u-\mu \\ v-\nu \end{bmatrix}^{\top}(P^{-1})^{\top}P^{-1}\begin{bmatrix} u-\mu \\ v-\nu \end{bmatrix}.

Therefore the transformation theorem of density
functions gives

(10.3.2)   f_{u,v}(u, v) = \frac{1}{2\pi\sigma^2}\,\bigl|\det(P^{-1})\bigr| \exp\Bigl(-\frac{1}{2\sigma^2}\begin{bmatrix} u-\mu \\ v-\nu \end{bmatrix}^{\top}(P^{-1})^{\top}P^{-1}\begin{bmatrix} u-\mu \\ v-\nu \end{bmatrix}\Bigr).

This expression can be made nicer. Note that the covariance matrix of the transformed variables is V[(u, v)⊤] = σ² P P⊤ = σ²Ψ, say. Since (P^{-1})^{\top}P^{-1}\, P P^{\top} = I, it follows that (P^{-1})^{\top}P^{-1} = \Psi^{-1} and |\det(P^{-1})| = 1/\sqrt{\det(\Psi)}, therefore

(10.3.3)   f_{u,v}(u, v) = \frac{1}{2\pi\sigma^2\sqrt{\det(\Psi)}} \exp\Bigl(-\frac{1}{2\sigma^2}\begin{bmatrix} u-\mu \\ v-\nu \end{bmatrix}^{\top}\Psi^{-1}\begin{bmatrix} u-\mu \\ v-\nu \end{bmatrix}\Bigr).

This is the general formula for the density function of a bivariate normal with nonsingular covariance matrix σ²Ψ and mean vector µ. One can also use the following notation, which is valid for the multivariate Normal variable with n dimensions, with mean vector µ and nonsingular covariance matrix σ²Ψ:

(10.3.4)   f_x(x) = (2\pi\sigma^2)^{-n/2}(\det\Psi)^{-1/2}\exp\Bigl(-\frac{1}{2\sigma^2}(x-\mu)^{\top}\Psi^{-1}(x-\mu)\Bigr).

Problem 163. 1 point Show that the matrix product of (P^{-1})^{\top}P^{-1} and P P^{\top} is the identity matrix.
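One can sanity-check (10.3.3) numerically: for an arbitrary nonsingular P (an illustrative choice below), the density computed through Ψ = PP⊤ must agree with the change-of-variables expression through P⁻¹:

```python
import numpy as np

sigma2 = 1.5
P = np.array([[2.0, 1.0], [0.5, 3.0]])  # arbitrary nonsingular matrix
mu = np.array([1.0, -2.0])
point = np.array([0.7, 0.3])            # (u, v) at which to evaluate
d = point - mu

# Formula (10.3.3) via Psi = P P^T
Psi = P @ P.T
q = d @ np.linalg.inv(Psi) @ d
f1 = np.exp(-q / (2 * sigma2)) / (2 * np.pi * sigma2 * np.sqrt(np.linalg.det(Psi)))

# Formula (10.3.2): transform back with P^{-1} and use the N(0, sigma^2 I) density
Pinv = np.linalg.inv(P)
x = Pinv @ d
f2 = abs(np.linalg.det(Pinv)) * np.exp(-(x @ x) / (2 * sigma2)) / (2 * np.pi * sigma2)

print(f1, f2)  # identical up to rounding
```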
Problem 164. 3 points All vectors in this question are n × 1 column vectors. Let y = α + ε, where α is a vector of constants and ε is jointly normal with E[ε] = o. Often, the covariance matrix V[ε] is not given directly, but an n × n nonsingular matrix T is known which has the property that the covariance matrix of Tε is σ² times the n × n unit matrix, i.e.,

(10.3.5)   V[T\varepsilon] = \sigma^2 I_n.

Show that in this case the density function of y is

(10.3.6)   f_y(y) = (2\pi\sigma^2)^{-n/2}\,|\det(T)|\,\exp\Bigl(-\frac{1}{2\sigma^2}\bigl(T(y-\alpha)\bigr)^{\top} T(y-\alpha)\Bigr).

Hint: define z = Tε, write down the density function of z, and make a transformation between z and y.

Answer. Since E[z] = o and V[z] = σ² I_n, its density function is (2πσ²)^{−n/2} exp(−z⊤z/2σ²). Now express z, whose density we know, as a function of y, whose density function we want to know: z = T(y − α), or

(10.3.7)   z_1 = t_{11}(y_1 - \alpha_1) + t_{12}(y_2 - \alpha_2) + \cdots + t_{1n}(y_n - \alpha_n)
(10.3.8)   \vdots
(10.3.9)   z_n = t_{n1}(y_1 - \alpha_1) + t_{n2}(y_2 - \alpha_2) + \cdots + t_{nn}(y_n - \alpha_n),

therefore the Jacobian determinant is det(T). This gives the result.
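A numerical sketch (with an illustrative T and α for n = 2) confirms (10.3.6): since V[ε] = σ²(T⊤T)⁻¹, the formula must agree with the generic multivariate normal density evaluated with that covariance matrix:

```python
import numpy as np

sigma2 = 0.8
T = np.array([[1.0, 0.4], [-0.3, 2.0]])  # arbitrary nonsingular transform
alpha = np.array([0.5, 1.5])
y = np.array([1.0, 1.0])                 # evaluation point
d = y - alpha

# Formula (10.3.6)
Td = T @ d
f1 = (2 * np.pi * sigma2) ** -1.0 * abs(np.linalg.det(T)) * np.exp(-(Td @ Td) / (2 * sigma2))

# Generic n = 2 normal density with covariance sigma^2 (T^T T)^{-1}
Sigma = sigma2 * np.linalg.inv(T.T @ T)
f2 = np.exp(-0.5 * d @ np.linalg.inv(Sigma) @ d) / (2 * np.pi * np.sqrt(np.linalg.det(Sigma)))

print(f1, f2)  # identical up to rounding
```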
10.3.1. Most Natural Form of Bivariate Normal Density.

Problem 165. In this exercise we will write the bivariate normal density in its most natural form. For this we set the multiplicative “nuisance parameter” σ² = 1, i.e., write the covariance matrix as Ψ instead of σ²Ψ.

• a. 1 point Write the covariance matrix Ψ = V[(u, v)⊤] in terms of the standard deviations σ_u and σ_v and the correlation coefficient ρ.

• b. 1 point Show that the inverse of a 2 × 2 matrix has the following form:

(10.3.10)   \begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \frac{1}{ad - bc}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix}.

• c. 2 points Show that

(10.3.11)   q^2 = \begin{bmatrix} u-\mu & v-\nu \end{bmatrix}\Psi^{-1}\begin{bmatrix} u-\mu \\ v-\nu \end{bmatrix}
(10.3.12)       = \frac{1}{1-\rho^2}\Bigl(\frac{(u-\mu)^2}{\sigma_u^2} - 2\rho\,\frac{(u-\mu)(v-\nu)}{\sigma_u\sigma_v} + \frac{(v-\nu)^2}{\sigma_v^2}\Bigr).
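The expansion in (10.3.12) is easy to verify numerically for arbitrary (illustrative) parameter values:

```python
import numpy as np

su, sv, rho = 1.3, 0.7, 0.6   # sigma_u, sigma_v, correlation
du, dv = 0.9, -0.4            # u - mu and v - nu

Psi = np.array([[su**2, rho * su * sv], [rho * su * sv, sv**2]])
d = np.array([du, dv])
q2_matrix = d @ np.linalg.inv(Psi) @ d   # quadratic form (10.3.11)
q2_expanded = (du**2 / su**2 - 2 * rho * du * dv / (su * sv) + dv**2 / sv**2) / (1 - rho**2)
print(q2_matrix, q2_expanded)  # identical up to rounding
```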
• d. 2 points Show the following quadratic decomposition:

(10.3.13)   q^2 = \frac{(u-\mu)^2}{\sigma_u^2} + \frac{1}{(1-\rho^2)\sigma_v^2}\Bigl(v - \nu - \rho\,\frac{\sigma_v}{\sigma_u}(u-\mu)\Bigr)^2.

• e. 1 point Show that (10.3.13) can also be written in the form

(10.3.14)   q^2 = \frac{(u-\mu)^2}{\sigma_u^2} + \frac{\sigma_u^2}{\sigma_u^2\sigma_v^2 - (\sigma_{uv})^2}\Bigl(v - \nu - \frac{\sigma_{uv}}{\sigma_u^2}(u-\mu)\Bigr)^2.

• f. 1 point Show that d = \sqrt{\det\Psi} can be split up, not additively but multiplicatively, as follows: d = \sigma_u \cdot \sigma_v\sqrt{1-\rho^2}.
• g. 1 point Using these decompositions of d and q², show that the density function f_{u,v}(u, v) reads

(10.3.15)   f_{u,v}(u, v) = \frac{1}{\sqrt{2\pi\sigma_u^2}}\exp\Bigl(-\frac{(u-\mu)^2}{2\sigma_u^2}\Bigr)\cdot\frac{1}{\sqrt{2\pi\sigma_v^2}\sqrt{1-\rho^2}}\exp\Bigl(-\frac{\bigl((v-\nu) - \rho\frac{\sigma_v}{\sigma_u}(u-\mu)\bigr)^2}{2(1-\rho^2)\sigma_v^2}\Bigr).

The second factor in (10.3.15) is the density of a N\bigl(\nu + \rho\frac{\sigma_v}{\sigma_u}(u-\mu),\,(1-\rho^2)\sigma_v^2\bigr) evaluated at v, and the first factor does not depend on v. Therefore if I integrate v out to get the marginal density of u, this simply gives me the first factor. The conditional density of v given u = u is the joint divided by the marginal, i.e., it is the second factor. In other words, by completing the square we wrote the joint density function in its natural form as the product of a marginal and a conditional density function: f_{u,v}(u, v) = f_u(u) · f_{v|u}(v; u).
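The marginal-times-conditional factorization can be checked pointwise (illustrative parameter values):

```python
import math

su, sv, rho = 1.2, 0.8, -0.5
mu, nu = 0.3, -1.0
u, v = 1.1, -0.6

# Joint density via q^2 from (10.3.12) and d = su*sv*sqrt(1-rho^2)
q2 = ((u - mu)**2 / su**2 - 2 * rho * (u - mu) * (v - nu) / (su * sv)
      + (v - nu)**2 / sv**2) / (1 - rho**2)
joint = math.exp(-q2 / 2) / (2 * math.pi * su * sv * math.sqrt(1 - rho**2))

# Marginal of u times conditional of v given u, as in (10.3.15)
marginal = math.exp(-(u - mu)**2 / (2 * su**2)) / math.sqrt(2 * math.pi * su**2)
cond_mean = nu + rho * (sv / su) * (u - mu)
cond_var = (1 - rho**2) * sv**2
conditional = math.exp(-(v - cond_mean)**2 / (2 * cond_var)) / math.sqrt(2 * math.pi * cond_var)

print(joint, marginal * conditional)  # identical up to rounding
```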
From this decomposition one can draw the following conclusions:

• u ∼ N(0, σ_u²) is normal (taking the means µ and ν to be zero here) and, by symmetry, v is normal as well. Note that u (or v) can be chosen to be any nonzero linear combination of x and y. Any nonzero linear transformation of independent standard normal variables is therefore univariate normal.

• If ρ = 0 then the joint density function is the product of two independent univariate normal density functions. In other words, if the variables are normal, then they are independent whenever they are uncorrelated. For general distributions only the converse is true: independence implies zero correlation, not vice versa.

• The conditional density of v conditionally on u = u is the second factor on the rhs of (10.3.15), i.e., it is normal too.

• The conditional mean is

(10.3.16)   E[v|u = u] = \rho\,\frac{\sigma_v}{\sigma_u}\,u,
i.e., it is a linear function of u. If the (unconditional) means are not zero, then the conditional mean is

(10.3.17)   E[v|u = u] = \mu_v + \rho\,\frac{\sigma_v}{\sigma_u}(u - \mu_u).

Since ρ = \frac{\mathrm{cov}[u,v]}{\sigma_u\sigma_v}, (10.3.17) can also be written as follows:

(10.3.18)   E[v|u = u] = E[v] + \frac{\mathrm{cov}[u, v]}{\mathrm{var}[u]}\,(u - E[u]).

• The conditional variance is the same whatever value of u was chosen; its value is

(10.3.19)   \mathrm{var}[v|u = u] = \sigma_v^2(1 - \rho^2),

which can also be written as

(10.3.20)   \mathrm{var}[v|u = u] = \mathrm{var}[v] - \frac{(\mathrm{cov}[u, v])^2}{\mathrm{var}[u]}.
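A simulation sketch (illustrative parameters, and a crude conditioning window around a fixed u-value) compares the empirical conditional mean and variance of v with the zero-mean predictions (10.3.16) and (10.3.19):

```python
import math
import random

random.seed(1)

rho, su, sv = 0.7, 1.0, 2.0
n = 200_000

# Draw (u, v) with zero means, standard deviations su, sv, correlation rho.
pairs = []
for _ in range(n):
    z1, z2 = random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)
    u = su * z1
    v = sv * (rho * z1 + math.sqrt(1 - rho**2) * z2)
    pairs.append((u, v))

# Condition crudely on u near u0 by keeping draws in a narrow window.
u0 = 0.8
window = [v for (u, v) in pairs if abs(u - u0) < 0.05]
m = sum(window) / len(window)
s2 = sum((v - m) ** 2 for v in window) / (len(window) - 1)
print(m, rho * sv / su * u0)        # empirical vs theoretical conditional mean
print(s2, sv**2 * (1 - rho**2))     # empirical vs theoretical conditional variance
```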
We did this in such detail because any bivariate normal with zero mean has this
form. A multivariate normal distribution is determined by its means and variances
and covariances (or correlation coefficients). If the means are not zero, then the
densities merely differ from the above by an additive constant in the arguments, i.e.,