15. DIGRESSION ABOUT CORRELATION COEFFICIENTS
Answer. The minimum MSE with only a constant is var[y], and (14.2.32) says that MSE[constant term and x; y] = var[y] − (cov[x, y])²/var[x]. Therefore the difference in MSEs is (cov[x, y])²/var[x], and if one divides by var[y] to get the relative difference, one gets exactly the squared correlation coefficient.
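This identity can be checked mechanically. The sketch below (plain Python; the simulated data and all variable names are illustrative, not part of the text) computes the relative MSE reduction and the squared correlation and confirms they agree:

```python
import random
import statistics as st

random.seed(0)
x = [random.gauss(0, 1) for _ in range(1000)]
y = [2 * xi + random.gauss(0, 1) for xi in x]

var_x, var_y = st.pvariance(x), st.pvariance(y)
mx, my = st.fmean(x), st.fmean(y)
cov_xy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)

# Best constant predictor of y has MSE var[y]; adding x as a regressor
# lowers it to var[y] - cov[x,y]^2 / var[x], as in (14.2.32).
mse_const = var_y
mse_with_x = var_y - cov_xy**2 / var_x

rel_reduction = (mse_const - mse_with_x) / mse_const
rho2 = cov_xy**2 / (var_x * var_y)   # squared correlation coefficient
```

The two quantities coincide up to floating-point rounding, for any sample.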
Multiple Correlation Coefficients. Now assume x is a vector while y remains a scalar. Their joint mean vector and dispersion matrix are

(15.1.3)    [x; y] ∼ ( [μ; ν], σ² [Ωxx, ωxy; ω⊤xy, ωyy] )
By theorem ??, the best linear predictor of y based on x has the formula

(15.1.4)    y∗ = ν + ω⊤xy Ω⁻xx (x − μ)
y∗ has the following additional extremal value property: no linear combination b⊤x has a higher squared correlation with y than y∗. This maximal value of the squared correlation is called the squared multiple correlation coefficient
(15.1.5)    ρ²y(x) = (ω⊤xy Ω⁻xx ωxy) / ωyy
The multiple correlation coefficient itself is the positive square root, i.e., it is always
nonnegative, while some other correlation coefficients may take on negative values.
The squared multiple correlation coefficient can also be defined in terms of the proportionate reduction in MSE. It is equal to the proportionate reduction in the MSE of the best predictor of y if one goes from predictors of the form y∗ = a to predictors of the form y∗ = a + b⊤x, i.e.,
(15.1.6)    ρ²y(x) = (MSE[constant term; y] − MSE[constant term and x; y]) / MSE[constant term; y]
There are therefore two natural definitions of the multiple correlation coefficient.
These two definitions correspond to the two formulas for R² in (14.3.6).
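The equivalence of the two definitions can be verified numerically. The sketch below (Python with NumPy; simulated data and names are illustrative) computes ρ²y(x) once from the dispersion matrix as in (15.1.5) and once as the proportionate MSE reduction, i.e. the R² of an OLS fit with a constant term:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
X = rng.normal(size=(n, 2))                      # the predictor vector x
y = X @ np.array([1.0, -0.5]) + rng.normal(size=n)

# Route 1: omega_xy' Omega_xx^{-1} omega_xy / omega_yy from sample moments
Omega_xx = np.cov(X, rowvar=False)
omega_xy = np.array([np.cov(X[:, j], y)[0, 1] for j in range(X.shape[1])])
rho2_moments = omega_xy @ np.linalg.solve(Omega_xx, omega_xy) / np.var(y, ddof=1)

# Route 2: proportionate reduction in MSE when going from a constant
# predictor to a linear predictor with a constant term (the R^2)
A = np.column_stack([np.ones(n), X])
resid = y - A @ np.linalg.lstsq(A, y, rcond=None)[0]
rho2_mse = 1 - np.var(resid) / np.var(y)
```

Both routes give the same number for any sample, not merely in expectation.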
Partial Correlation Coefficients. Now assume y = (y1, y2)⊤ is a vector with two elements and write

(15.1.7)    [x; y1; y2] ∼ ( [μ; ν1; ν2], σ² [Ωxx, ωy1, ωy2; ω⊤y1, ω11, ω12; ω⊤y2, ω21, ω22] )
Let y∗ be the best linear predictor of y based on x. The partial correlation coefficient ρ12.x is defined to be the simple correlation between the residuals, corr[(y1 − y1∗), (y2 − y2∗)]. This measures the correlation between y1 and y2 which is “local,” i.e., which does not follow from their association with x. Assume for instance that both y1 and y2 are highly correlated with x. Then they will also have a high correlation with each other. Subtracting yi∗ from yi eliminates this dependency on x; therefore any remaining correlation is “local.”
remaining correlation is “local.” Compare [Krz88, p. 475].
The squared partial correlation coefficient can be defined as the relative reduction in the MSE if one adds y1 to x as a predictor of y2:

(15.1.8)    ρ²12.x = (MSE[constant term and x; y2] − MSE[constant term, x, and y1; y2]) / MSE[constant term and x; y2].
Problem 218. Using the definitions in terms of MSE’s, show that the following
relationship holds between the squares of multiple and partial correlation coefficients:
(15.1.9)    1 − ρ²2(x,1) = (1 − ρ²21.x)(1 − ρ²2(x))
15.1. A UNIFIED DEFINITION OF CORRELATION COEFFICIENTS
Answer. In terms of the MSE, (15.1.9) reads

(15.1.10)    MSE[constant term, x, and y1; y2] / MSE[constant term; y2] = (MSE[constant term, x, and y1; y2] / MSE[constant term and x; y2]) · (MSE[constant term and x; y2] / MSE[constant term; y2]).
From (15.1.9) follows the following weighted average formula:

(15.1.11)    ρ²2(x,1) = ρ²2(x) + (1 − ρ²2(x)) ρ²21.x
An alternative proof of (15.1.11) is given in [Gra76, pp. 116/17].
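Using the MSE definitions directly, both (15.1.9) and (15.1.11) can be confirmed on simulated data. The sketch below (Python with NumPy; all names and data illustrative) computes each MSE as the mean squared residual of the corresponding least-squares fit:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 400
x = rng.normal(size=(n, 2))
y1 = x @ np.array([0.5, 0.2]) + rng.normal(size=n)
y2 = x @ np.array([0.3, -0.4]) + 0.6 * y1 + rng.normal(size=n)

def mse(predictors, target):
    """Mean squared residual of an OLS fit with a constant term."""
    A = np.column_stack([np.ones(len(target))] + predictors)
    resid = target - A @ np.linalg.lstsq(A, target, rcond=None)[0]
    return np.mean(resid**2)

m0  = mse([], y2)        # constant term only
mx  = mse([x], y2)       # constant term and x
mx1 = mse([x, y1], y2)   # constant term, x, and y1

rho2_2_x1 = (m0 - mx1) / m0   # squared multiple correlation of y2 with x and y1
rho2_2_x  = (m0 - mx) / m0    # squared multiple correlation of y2 with x
rho2_21_x = (mx - mx1) / mx   # squared partial correlation of y2 and y1 given x
```

Since the three ratios share the same nested MSEs, (15.1.9) holds exactly in every sample, not just in the population.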
Mixed cases: One can also form multiple correlation coefficients with some of the variables partialled out. The dot notation used here is due to Yule, [Yul07]. The notation, definition, and formula for the squared correlation coefficient are
(15.1.12)    ρ²y(x).z = (MSE[constant term and z; y] − MSE[constant term, z, and x; y]) / MSE[constant term and z; y]

(15.1.13)    = (ω⊤xy.z Ω⁻xx.z ωxy.z) / ωyy.z
CHAPTER 16
Specific Datasets
16.1. Cobb Douglas Aggregate Production Function
Problem 219. 2 points The Cobb-Douglas production function postulates the following relationship between annual output qt and the inputs of labor ℓt and capital kt:

(16.1.1)    qt = μ ℓt^β kt^γ exp(εt).
qt, ℓt, and kt are observed, and μ, β, γ, and the εt are to be estimated. By the variable transformation xt = log qt, yt = log ℓt, zt = log kt, and α = log μ, one obtains the linear regression

(16.1.2)    xt = α + βyt + γzt + εt

Sometimes the following alternative variable transformation is made: ut = log(qt/ℓt), vt = log(kt/ℓt), and the regression

(16.1.3)    ut = α + γvt + εt
is estimated. How are the regressions (16.1.2) and (16.1.3) related to each other?
Answer. Write (16.1.3) as

(16.1.4)    xt − yt = α + γ(zt − yt) + εt

and collect terms to get

(16.1.5)    xt = α + (1 − γ)yt + γzt + εt

From this it follows that running the regression (16.1.3) is equivalent to running the regression (16.1.2) with the constraint β + γ = 1 imposed.
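This equivalence can also be seen numerically. In the sketch below (plain Python; the simulated data, with γ = 0.6 and β = 0.4, are purely illustrative), the slope of regression (16.1.3) implies β̂ = 1 − γ̂, and the same β̂ comes out of the symmetric substitution that eliminates γ instead:

```python
import random

random.seed(3)
n = 200
y = [random.gauss(0, 1) for _ in range(n)]          # log labor
z = [random.gauss(0, 1) for _ in range(n)]          # log capital
x = [0.1 + 0.4 * yi + 0.6 * zi + 0.05 * random.gauss(0, 1)
     for yi, zi in zip(y, z)]                       # log output

def slope(dep, reg):
    """OLS slope of dep on reg, intercept included."""
    md, mr = sum(dep) / len(dep), sum(reg) / len(reg)
    num = sum((d - md) * (r - mr) for d, r in zip(dep, reg))
    den = sum((r - mr) ** 2 for r in reg)
    return num / den

u = [xi - yi for xi, yi in zip(x, y)]   # log(q/l)
v = [zi - yi for zi, yi in zip(z, y)]   # log(k/l)
gamma_hat = slope(u, v)                 # regression (16.1.3)
beta_hat = 1 - gamma_hat                # implied by the constraint

# Eliminating gamma instead (regress x - z on y - z) recovers the same beta
beta_check = slope([xi - zi for xi, zi in zip(x, z)],
                   [yi - zi for yi, zi in zip(y, z)])
```

The agreement of beta_hat and beta_check is an algebraic identity of least squares, so it holds to rounding error in every sample.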
The assumption here is that output is the only random variable. The regression
model is based on the assumption that the dependent variables have more noise in
them than the independent variables. One can justify this by the argument that
any noise in the independent variables will be transferred to the dependent variable,
and also that variables which affect other variables have more steadiness in them
than variables which depend on others. This justification often has merit, but in this specific case there is much more measurement error in the labor and capital inputs than in the output. Therefore the assumption that only the output has an error
term is clearly wrong, and problem 221 below will look for possible alternatives.
Problem 220. Table 1 shows the data used by Cobb and Douglas in their original article [CD28] introducing the production function which would bear their name. The variable output is “Day’s index of the physical volume of production (1899 = 100)” described in [DP20]; capital is the capital stock in manufacturing in millions of 1880 dollars [CD28, p. 145]; labor is the “probable average number of wage earners employed in manufacturing” [CD28, p. 148]; and wage is an index of the real wage (1899–1908 = 100).
year   output   capital   labor   wage
1899      100      4449    4713     99
1900      101      4746    4968     98
1901      112      5061    5184    101
1902      122      5444    5554    102
1903      124      5806    5784    100
1904      122      6132    5468     99
1905      143      6626    5906    103
1906      152      7234    6251    101
1907      151      7832    6483     99
1908      126      8229    5714     94
1909      155      8820    6615    102
1910      159      9240    6807    104
1911      153      9624    6855     97
1912      177     10067    7167     99
1913      184     10520    7277    100
1914      169     10873    7026     99
1915      189     11840    7269     99
1916      225     13242    8601    104
1917      227     14915    9218    103
1918      223     16265    9446    107
1919      218     17234    9096    111
1920      231     18118    9110    114
1921      179     18542    6947    115
1922      240     19192    7602    119

Table 1. Cobb Douglas Original Data
• a. A text file with the data is available on the web at www.econ.utah.edu/ehrbar/data/cobbdoug.txt, and an SDML file (XML for statistical data which can be read by R, Matlab, and perhaps also SPSS) is available at www.econ.utah.edu/ehrbar/data/cobbdoug.sdml. Load these data into your favorite statistics package.
Answer. In R, you can simply issue the command cobbdoug <- read.table("http://www.econ.utah.edu/ehrbar/data/cobbdoug.txt", header=TRUE). If you run R on unix, you can also do the following: download cobbdoug.sdml from the web, then issue the command library(StatDataML) followed by readSDML("cobbdoug.sdml"). When I tried this last, the XML package necessary for StatDataML was not available on Windows, but chances are it will be when you read this.
In SAS, you must issue the commands

data cobbdoug;
  infile ’cobbdoug.txt’;
  input year output capital labor;
run;

But for this to work you must delete the first line in the file cobbdoug.txt which contains the variable names. (It is also possible to tell SAS to skip the first line by adding the firstobs=2 option to the infile statement.) And you may have to tell SAS the full pathname of the text file with the data. If you want a permanent instead of a temporary dataset, give it a two-part name, such as ecmet.cobbdoug.
Here are the instructions for SPSS: 1) Begin SPSS with a blank spreadsheet. 2) Open up a file with the following commands and run:

SET BLANKS=SYSMIS UNDEFINED=WARN.
DATA LIST FILE=’A:\Cbbunst.dat’ FIXED RECORDS=1 TABLE
  /1 year 1-4 output 5-9 capital 10-16 labor 17-22 wage 23-27 .
EXECUTE.

This file assumes the data file to be in the same directory, and again the first line in the data file with the variable names must be deleted. Once the data are entered into SPSS the procedures (regression, etc.) are best run from the point-and-click environment.
• b. The next step is to look at the data. On [CD28, p. 150], Cobb and Douglas
plot capital, labor, and output on a logarithmic scale against time, all 3 series
normalized such that they start in 1899 at the same level =100. Reproduce this graph
using a modern statistics package.
• c. Run both regressions (16.1.2) and (16.1.3) on Cobb and Douglas’s original dataset. Compute 95% confidence intervals for the coefficients of capital and labor in the unconstrained and the constrained models.
Answer. SAS does not allow you to transform the data on the fly; it insists that you first go through a data step creating the transformed data before you can run a regression on them. Therefore the next set of commands creates a temporary dataset cdtmp. The data step data cdtmp copies all the data from cobbdoug into cdtmp and then creates some transformed variables as well. Then one can run the regressions. Here are the commands; they are in the file cbbrgrss.sas on your data disk:
data cdtmp;
set cobbdoug;
logcap = log(capital);
loglab = log(labor);
logout = log(output);
logcl = logcap-loglab;
logol = logout-loglab;
run;
proc reg data = cdtmp;
model logout = logcap loglab;
run;
proc reg data = cdtmp;
model logol = logcl;
run;
Careful! In R, the command lm(log(output)-log(labor) ~ log(capital)-log(labor), data=cobbdoug) does not give the right results. It does not complain, but the result is wrong nevertheless: on the right-hand side of a model formula, - means dropping a term from the model, not arithmetic subtraction. The right way to write this command is lm(I(log(output)-log(labor)) ~ I(log(capital)-log(labor)), data=cobbdoug); wrapping the expressions in I() restores their arithmetic meaning.
• d. The regression results are graphically represented in Figure 1. The big
ellipse is a joint 95% confidence region for β and γ. This ellipse is a level line of the
SSE. The vertical and horizontal bands represent univariate 95% confidence regions
for β and γ separately. The diagonal line is the set of all β and γ with β + γ = 1,
representing the constraint of constant returns to scale. The small ellipse is that level
line of the SSE which is tangent to the constraint. The point of tangency represents
the constrained estimator. Reproduce this graph (or as much of this graph as you
can) using your statistics package.
Remark: In order to make the hand computations easier, Cobb and Douglas reduced the data for capital and labor to index numbers (1899=100), which were rounded to integers, before running the regressions, and Figure 1 was constructed using these rounded data. Since you are using the nonstandardized data, you may get slightly different results.
Answer. lines(ellipse.lm(cbbfit, which=c(2, 3)))
Problem 221. In this problem we will treat the Cobb-Douglas data as a dataset
with errors in all three variables. See chapter ?? and problem ?? about that.
• a. Run the three elementary regressions for the whole period, then choose at
least two subperiods and run it for those. Plot all regression coefficients as points
in a plane, using different colors for the different subperiods (you have to normalize
them in a special way that they all fit on the same plot).
Answer. Here are the results in R:
> outputlm<-lm(log(output)~log(capital)+log(labor),data=cobbdoug)
> capitallm<-lm(log(capital)~log(labor)+log(output),data=cobbdoug)
> laborlm<-lm(log(labor)~log(output)+log(capital),data=cobbdoug)
Figure 1. Coefficients of capital (vertical) and labor (horizontal), dependent variable output, unconstrained and constrained,
1899–1922
> coefficients(outputlm)
 (Intercept) log(capital)   log(labor)
  -0.1773097    0.2330535    0.8072782
> coefficients(capitallm)
 (Intercept)   log(labor)  log(output)
 -2.72052726  -0.08695944   1.67579357
> coefficients(laborlm)
 (Intercept)  log(output) log(capital)
  1.27424214   0.73812541  -0.01105754
#Here is the information for the confidence ellipse:
> summary(outputlm,correlation=T)
Call:
lm(formula = log(output) ~ log(capital) + log(labor), data = cobbdoug)
Residuals:
      Min        1Q    Median        3Q       Max
-0.075282 -0.035234 -0.006439  0.038782  0.142114
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  -0.17731    0.43429  -0.408  0.68721
log(capital)  0.23305    0.06353   3.668  0.00143 **
log(labor)    0.80728    0.14508   5.565  1.6e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.05814 on 21 degrees of freedom
Multiple R-Squared: 0.9574,    Adjusted R-squared: 0.9534
F-statistic: 236.1 on 2 and 21 degrees of freedom,  p-value: 3.997e-15

Correlation of Coefficients:
             (Intercept) log(capital)
log(capital)  0.7243
log(labor)   -0.9451     -0.9096
#Quantile of the F-distribution:
> qf(p=0.95, df1=2, df2=21)
[1] 3.4668
• b. The elementary regressions will give you three fitted equations of the form

(16.1.6)    output = α̂1 + β̂12 labor + β̂13 capital + residual1
(16.1.7)    labor = α̂2 + β̂21 output + β̂23 capital + residual2
(16.1.8)    capital = α̂3 + β̂31 output + β̂32 labor + residual3.
In order to compare the slope parameters in these regressions, first rearrange them
in the form
(16.1.9)     −output + β̂12 labor + β̂13 capital + α̂1 + residual1 = 0
(16.1.10)    β̂21 output − labor + β̂23 capital + α̂2 + residual2 = 0
(16.1.11)    β̂31 output + β̂32 labor − capital + α̂3 + residual3 = 0
This gives the following table of coefficients:

output        labor         capital       intercept
−1            0.8072782     0.2330535     −0.1773097
0.73812541    −1            −0.01105754   1.27424214
1.67579357    −0.08695944   −1            −2.72052726
Now divide the second and third rows by the negative of their first coefficient, so that the coefficient of output becomes −1:

output   labor                   capital                  intercept
−1       0.8072782               0.2330535                −0.1773097
−1       1/0.73812541            0.01105754/0.73812541    −1.27424214/0.73812541
−1       0.08695944/1.67579357   1/1.67579357             2.72052726/1.67579357
After performing the divisions the following numbers are obtained:

output   labor        capital       intercept
−1       0.8072782    0.2330535     −0.1773097
−1       1.3547833    0.014980570   −1.726322
−1       0.05189149   0.59673221    1.6234262
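The divisions behind the last two rows are plain arithmetic and can be checked mechanically, for instance with the following Python sketch (coefficient values copied from the regressions above):

```python
# Coefficients of the labor and capital elementary regressions quoted above
b21, b23, a2 = 0.73812541, -0.01105754, 1.27424214   # labor on output, capital
b31, b32, a3 = 1.67579357, -0.08695944, -2.72052726  # capital on output, labor

# Rearranged rows: (coefficient of output, labor, capital, intercept)
row2 = [b21, -1.0, b23, a2]
row3 = [b31, b32, -1.0, a3]

# Divide each row by the negative of its first coefficient,
# so that the coefficient of output becomes -1
norm2 = [c / -b21 for c in row2]
norm3 = [c / -b31 for c in row3]
```

norm2 and norm3 reproduce the second and third rows of the table above.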
These results can also be re-written in the form given by Table 2.

                                            Intercept   Slope of output   Slope of output
                                                        wrt labor         wrt capital
Regression of output on labor and capital
Regression of labor on output and capital
Regression of capital on output and labor

Table 2. Comparison of coefficients in elementary regressions
Fill in the values for the whole period and also for several sample subperiods.
Make a scatter plot of the contents of this table, i.e., represent each regression result
as a point in a plane, using different colors for different sample periods.
[Figure 2: scatter of coefficient points labeled capital, output, and labor, together with the point “output no error, crs” and Cobb Douglas’s original result.]
Figure 2. Coefficients of capital (vertical) and labor (horizontal), dependent variable output, 1899–1922
[Figure 3: coefficient points and fitted lines labeled “capital, all errors,” “output no error, crs,” “output all errors,” and “labor, all errors.”]
Figure 3. Coefficient of capital (vertical) and labor (horizontal)
in the elementary regressions, dependent variable output, 1899–1922
Problem 222. Given a univariate problem with three variables all of which have
zero mean, and a linear constraint that the coefficients of all variables sum to 0. (This
is the model apparently appropriate to the Cobb-Douglas data, with the assumption
of constant returns to scale, after taking out the means.) Call the observed variables
x, y, and z, with underlying systematic variables x∗ , y ∗ , and z ∗ , and errors u, v,
and w.
• a. Write this model in the form (??).
Answer.

(16.1.12)    [−1, β, 1−β] [x∗; y∗; z∗] = 0,  or  x∗ = βy∗ + (1−β)z∗;  and  [x; y; z] = [x∗; y∗; z∗] + [u; v; w],  i.e.  x = x∗ + u,  y = y∗ + v,  z = z∗ + w.
• b. The moment matrix of the systematic variables can be written fully in terms of σ²y∗, σ²z∗, σy∗z∗, and the unknown parameter β. Write out the moment matrix and therefore the Frisch decomposition.
Answer. The moment matrix is the middle matrix in the following Frisch decomposition:

(16.1.13)    [σ²x, σxy, σxz; σxy, σ²y, σyz; σxz, σyz, σ²z] =

(16.1.14)    = [β²σ²y∗ + 2β(1−β)σy∗z∗ + (1−β)²σ²z∗,  βσ²y∗ + (1−β)σy∗z∗,  βσy∗z∗ + (1−β)σ²z∗;
                βσ²y∗ + (1−β)σy∗z∗,  σ²y∗,  σy∗z∗;
                βσy∗z∗ + (1−β)σ²z∗,  σy∗z∗,  σ²z∗]  +  [σ²u, 0, 0; 0, σ²v, 0; 0, 0, σ²w].
• c. Show that the unknown parameters are not yet identified. However, if one makes the additional assumption that one of the three error variances σ²u, σ²v, or σ²w is zero, then the equations are identified. Since the quantity of output presumably has less error than the other two variables, assume σ²u = 0. Under this assumption, show that

(16.1.15)    β = (σ²x − σxz) / (σxy − σxz)

and this can be estimated by replacing the variances and covariances by their sample counterparts. In a similar way, derive estimates of all other parameters of the model.
Answer. Solving (16.1.14), one gets from the yz element of the covariance matrix

(16.1.16)    σy∗z∗ = σyz

and from the xz element

(16.1.17)    σ²z∗ = (σxz − βσyz) / (1 − β)

Similarly, one gets from the xy element:

(16.1.18)    σ²y∗ = (σxy − (1 − β)σyz) / β

Now plug (16.1.16), (16.1.17), and (16.1.18) into the equation for the xx element:

(16.1.19)    σ²x = β(σxy − (1 − β)σyz) + 2β(1 − β)σyz + (1 − β)(σxz − βσyz) + σ²u

(16.1.20)    = βσxy + (1 − β)σxz + σ²u

Since we are assuming σ²u = 0, this last equation can be solved for β:

(16.1.21)    β = (σ²x − σxz) / (σxy − σxz)

If we replace the variances and covariances by the sample variances and covariances, this gives an estimate of β.
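Formula (16.1.21) can be checked at the population level: pick hypothetical parameter values, build the moments implied by the model under σ²u = 0, and verify that the formula returns the β one started with. (All numbers below are made up for the check.)

```python
# Hypothetical population parameters; sigma_u^2 = 0 is the identifying assumption
beta = 0.7
s_y2, s_z2, s_yz = 1.0, 1.5, 0.4   # var[y*], var[z*], cov[y*, z*]

# Moments implied by x = beta*y* + (1-beta)*z*, y = y* + v, z = z* + w;
# the independent errors v and w drop out of these covariances with x
var_x  = beta**2 * s_y2 + 2 * beta * (1 - beta) * s_yz + (1 - beta)**2 * s_z2
cov_xy = beta * s_y2 + (1 - beta) * s_yz
cov_xz = beta * s_yz + (1 - beta) * s_z2

beta_recovered = (var_x - cov_xz) / (cov_xy - cov_xz)   # formula (16.1.21)
```

beta_recovered equals the beta used to generate the moments, for any admissible choice of the hypothetical variances.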
• d. Evaluate these formulas numerically. In order to get the sample means and
the sample covariance matrix of the data, you may issue the SAS commands
proc corr cov nocorr data=cdtmp;
var logout loglab logcap;
run;
These commands are in the file cbbcovma.sas on the disk.
Answer. Mean vector and covariance matrix are

(16.1.22)    [LOGOUT; LOGLAB; LOGCAP] ∼ ( [5.07734; 4.96272; 5.35648], [0.0724870714, 0.0522115563, 0.1169330807; 0.0522115563, 0.0404318579, 0.0839798588; 0.1169330807, 0.0839798588, 0.2108441826] )
Therefore equation (16.1.15) gives

(16.1.23)    β̂ = (0.0724870714 − 0.1169330807) / (0.0522115563 − 0.1169330807) = 0.686726861149148

In Figure 3, the point (β̂, 1 − β̂) is exactly the intersection of the long dotted line with the constraint.
• e. The fact that all 3 points lie almost on the same line indicates that there may
be 2 linear relations: log labor is a certain coefficient times log output, and log capital
is a different coefficient times log output. I.e., y ∗ = δ1 + γ1 x∗ and z ∗ = δ2 + γ2 x∗ .
In other words, there is no substitution. What would be the two coefficients γ1 and
γ2 if this were the case?
Answer. Now the Frisch decomposition is

(16.1.24)    [σ²x, σxy, σxz; σxy, σ²y, σyz; σxz, σyz, σ²z] = σ²x∗ [1, γ1, γ2; γ1, γ1², γ1γ2; γ2, γ1γ2, γ2²] + [σ²u, 0, 0; 0, σ²v, 0; 0, 0, σ²w].
Solving this gives (obtain γ1 by dividing the 32-element by the 31-element, γ2 by dividing the 32-element by the 12-element, σ²x∗ by dividing the 21-element by γ1, etc.):

(16.1.25)
γ1 = σyz/σxz = 0.0839798588/0.1169330807 = 0.7181873452513939
γ2 = σyz/σxy = 0.0839798588/0.0522115563 = 1.608453467992104
σ²x∗ = σxy σxz/σyz = 0.0522115563 · 0.1169330807/0.0839798588 = 0.0726990758
σ²u = σ²x − σxy σxz/σyz = 0.0724870714 − 0.0726990758 = −0.000212
σ²v = σ²y − σxy σyz/σxz
σ²w = σ²z − σxz σzy/σxy

This model is just barely rejected by the data since it leads to a slightly negative variance for σ²u.
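The negative implied variance can be reproduced directly from the covariance matrix in (16.1.22); here is a quick check in Python (values copied from above):

```python
# Sample variances and covariances from (16.1.22);
# x = log output, y = log labor, z = log capital
s_x2 = 0.0724870714
s_xy, s_xz, s_yz = 0.0522115563, 0.1169330807, 0.0839798588

s_xstar2 = s_xy * s_xz / s_yz   # implied variance of the systematic x*
s_u2 = s_x2 - s_xstar2          # implied error variance of x: slightly negative
```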
• f. The assumption that there are two linear relations is represented as the
light-blue line in Figure 3. What is the equation of this line?
Answer. If y = γ1 x and z = γ2 x, then the equation x = β1 y + β2 z holds whenever β1γ1 + β2γ2 = 1. This is a straight line in the β1, β2-plane, going through the points (0, 1/γ2) = (0, 0.0522115563/0.0839798588) = (0, 0.6217152189353289) and (1/γ1, 0) = (0.1169330807/0.0839798588, 0) = (1.3923943475361023, 0). This line is in the figure, and it is just a tiny bit on the wrong side of the dotted line connecting the two estimates.
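The slopes and axis intercepts can be recomputed from the covariances in (16.1.22); the following Python sketch (values copied from above) reproduces γ1, γ2, and the points where the line β1γ1 + β2γ2 = 1 meets the axes:

```python
# Sample covariances from (16.1.22); x = log output, y = log labor, z = log capital
s_xy, s_xz, s_yz = 0.0522115563, 0.1169330807, 0.0839798588

gamma1 = s_yz / s_xz   # slope of log labor on log output, per (16.1.25)
gamma2 = s_yz / s_xy   # slope of log capital on log output

beta1_intercept = 1 / gamma1   # where the line meets the beta1 (labor) axis
beta2_intercept = 1 / gamma2   # where the line meets the beta2 (capital) axis
```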
16.2. Houthakker’s Data
For this example we will use Berndt’s textbook [Ber91], which discusses some
of the classic studies in the econometric literature.
One example described there is the estimation of a demand function for electricity [Hou51], which was the first multiple regression with several variables run on a computer. In this exercise you are asked to do all steps in exercises 1 and 3 in chapter 7 of Berndt, and to use the additional facilities of R to perform other steps of data analysis which Berndt did not ask for, such as, for instance, exploring the best subset of regressors using leaps and the best nonlinear transformation using avas, do some