
Chapter 16. General Principles of Econometric Modelling






associated evidence.” The other extreme is “‘Data-driven’ approaches, where models

are developed to closely describe the data . . . These suffer from sample dependence in

that accidental and transient data features are embodied as tightly in the model as

permanent aspects, so that extension of the data set often reveal predictive failure.”

Hendry proposes the following useful distinction of 4 levels of knowledge:

A Consider the situation where we know the complete structure of the process

which generates economic data and the values of all its parameters. This is the

equivalent of a probability theory course (example: rolling a perfect die), but involves

economic theory and econometric concepts.

B Consider a known economic structure with unknown values of the parameters.

Equivalent to an estimation and inference course in statistics (example: independent

rolls of an imperfect die and estimating the probabilities of the different faces) but

focusing on econometrically relevant aspects.

C is “the empirically relevant situation where neither the form of the data-generating process nor its parameter values are known.” (Here one does not know

whether the rolls of the die are independent, or whether the probabilities of the

different faces remain constant.) This level covers model discovery, evaluation, data mining, model-search procedures, and associated methodological issues.

D Forecasting the future when the data outcomes are unknown. (Model of money

demand under financial innovation).






The example of Keynes’s consumption function in [Gre97, pp. 221/22] sounds at

the beginning as if it were close to B, but in the further discussion Greene moves more

and more toward C. It is remarkable here that economic theory usually does not yield

functional forms. Greene then says: the most common functional form is the linear

one c = α + βx with α > 0 and 0 < β < 1. He does not mention the aggregation

problem hidden in this. Then he says: “But the linear function is only approximate;

in fact, it is unlikely that consumption and income can be connected by any simple

relationship. The deterministic relationship is clearly inadequate.” Here Greene

uses a random relationship to model a relationship which is quantitatively “fuzzy.”

This is an interesting and relevant application of randomness.

A sentence later Greene backtracks from this insight and says: “We are not so

ambitious as to attempt to capture every influence in the relationship, but only those

that are substantial enough to model directly.” The “fuzziness” is not due to a lack

of ambition of the researcher, but the world is inherently quantitatively fuzzy. It is

not that we don’t know the law, but there is no law; not everything that happens in

an economy is driven by economic laws. Greene’s own example, in Figure 6.2, that

during the war years consumption was below the trend line, shows this.

Greene’s next example is the relationship between income and education. This

illustrates multiple instead of simple regression: one must also include age, and then

also the square of age, even if one is not interested in the effect which age has, but






in order to “control” for this effect, so that the effects of education and age will not

be confounded.

Problem 224. Why should a regression of income on education include not only

age but also the square of age?

Answer. Because the effect of age becomes smaller with increases in age.
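
The answer can be made concrete with a worked equation. The specification below is only an illustrative assumption: the coefficient symbols $\beta_0,\dots,\beta_3$ and the error term $\varepsilon$ are introduced here and do not come from Greene's text.

```latex
\begin{align*}
\text{income} &= \beta_0 + \beta_1\,\text{education} + \beta_2\,\text{age}
                 + \beta_3\,\text{age}^2 + \varepsilon, \\
\frac{\partial\,\mathrm{E}[\text{income}]}{\partial\,\text{age}}
              &= \beta_2 + 2\beta_3\,\text{age}.
\end{align*}
```

With $\beta_2 > 0$ and $\beta_3 < 0$, the marginal effect of age shrinks as age increases and changes sign at $\text{age} = -\beta_2/(2\beta_3)$. Without the squared term, the marginal effect would be forced to equal the constant $\beta_2$ at every age.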



Critical Realist approaches are [Ron02] and [Mor02].



CHAPTER 17



Causality and Inference

This chapter establishes the connection between critical realism and Holland and

Rubin’s modelling of causality in statistics as explained in [Hol86] and [WM83, pp.

3–25] (and the related paper [LN81] which comes from a Bayesian point of view). A

different approach to causality and inference, [Roy97], is discussed in chapter/section

2.8. Regarding critical realism and econometrics, [Dow99] should also be mentioned:

this is written by a Post Keynesian econometrician working in an explicitly realist

framework.

Everyone knows that correlation does not mean causality. Nevertheless, experience shows that statisticians can on occasion make valid inferences about causality. It is therefore legitimate to ask: how and under which conditions can causal




conclusions be drawn from a statistical experiment or a statistical investigation of

nonexperimental data?

Holland starts his discussion with a description of the “logic of association”

(= a flat empirical realism) as opposed to causality (= depth realism). His model

for the “logic of association” is essentially the conventional mathematical model of

probability by a set U of “all possible outcomes,” which we described and criticized

on p. 12 above.

After this, Rubin describes his own model (developed together with Holland).

Rubin introduces “counterfactual” (or, as Bhaskar would say, “transfactual”) elements since he is not only talking about the value a variable takes for a given

individual, but also the value this variable would have taken for the same individual

if the causing variables (which Rubin also calls “treatments”) had been different.

For simplicity, Holland assumes here that the treatment variable has only two levels:

either the individual receives the treatment, or he/she does not (in which case he/she

belongs to the “control” group). The correlational view would simply measure the

average response of those individuals who receive the treatment, and of those who

don’t. Rubin recognizes in his model that the same individual may or may not be

subject to the treatment, therefore the response variable has two values, one being

the individual’s response if he or she receives the treatment, the other the response

if he or she does not.






A third variable indicates who receives the treatment. I.e., he has the “causal indicator” s, which can take two values, t (treatment) and c (control), and two variables

$y_t$ and $y_c$, which, evaluated at individual ω, indicate the responses this individual

would give in case he or she was subject to the treatment, and in case he or she was not.

Rubin defines $y_t - y_c$ to be the causal effect of treatment t versus the control

c. But this causal effect cannot be observed. We cannot observe how those individuals who received the treatment would have responded if they had not received

the treatment, despite the fact that this non-actualized response is just as real as

the response which they indeed gave. This is what Holland calls the Fundamental

Problem of Causal Inference.

Problem 225. Rubin excludes race as a cause because the individual cannot do

anything about his or her race. Is this argument justified?

Does this Fundamental Problem mean that causal inference is impossible? Here

are several scenarios in which causal inference is possible after all:











• Temporal stability of the response, and transience of the causal effect.

• Unit homogeneity.

• Constant effect, i.e., $y_t(\omega) - y_c(\omega)$ is the same for all ω.

• Independence of the response with respect to the selection process regarding who gets the treatment.






For an example of this last case, consider the following problem.

Problem 226. Our universal set U consists of patients who have a certain disease. We will explore the causal effect of a given treatment with the help of three

events, T, C, and S, the first two of which are counterfactual; compare [Hol86].

These events are defined as follows: T consists of all patients who would recover

if given treatment; C consists of all patients who would recover if not given treatment (i.e., if included in the control group). The event S consists of all patients

actually receiving treatment. The average causal effect of the treatment is defined as

$\Pr[T] - \Pr[C]$.

• a. 2 points Show that

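\[
\Pr[T] = \Pr[T \mid S]\,\Pr[S] + \Pr[T \mid S']\,(1 - \Pr[S]) \tag{17.0.6}
\]

and that

\[
\Pr[C] = \Pr[C \mid S]\,\Pr[S] + \Pr[C \mid S']\,(1 - \Pr[S]) \tag{17.0.7}
\]

where $S'$ denotes the complement of $S$, i.e., the patients not receiving treatment.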



Which of these probabilities can be estimated as the frequencies of observable outcomes

and which cannot?

Answer. This is a direct application of (2.7.9). The problem here is that for all $\omega \in S'$, i.e.,

for those patients who do not receive treatment, we do not know whether they would have recovered

if given treatment, and for all $\omega \in S$, i.e., for those patients who do receive treatment, we do not

know whether they would have recovered if not given treatment. In other words, neither $\Pr[T \mid S']$

nor $\Pr[C \mid S]$ can be estimated as the frequencies of observable outcomes. By contrast, $\Pr[S]$, $\Pr[T \mid S]$, and $\Pr[C \mid S']$ are frequencies of observable outcomes.



• b. 2 points Assume now that S is independent of T and C, because the subjects

are assigned randomly to treatment or control. How can this be used to estimate those

elements in the equations (17.0.6) and (17.0.7) which could not be estimated before?

Answer. In this case, $\Pr[T \mid S] = \Pr[T \mid S']$ and $\Pr[C \mid S'] = \Pr[C \mid S]$. Therefore, the average

causal effect can be simplified as follows:

\begin{align*}
\Pr[T] - \Pr[C] &= \Pr[T \mid S]\Pr[S] + \Pr[T \mid S'](1 - \Pr[S]) - \bigl(\Pr[C \mid S]\Pr[S] + \Pr[C \mid S'](1 - \Pr[S])\bigr) \\
&= \Pr[T \mid S]\Pr[S] + \Pr[T \mid S](1 - \Pr[S]) - \bigl(\Pr[C \mid S']\Pr[S] + \Pr[C \mid S'](1 - \Pr[S])\bigr) \\
&= \Pr[T \mid S] - \Pr[C \mid S'] \tag{17.0.8}
\end{align*}



• c. 2 points Why were all these calculations necessary? Could one not have

defined from the beginning that the causal effect of the treatment is $\Pr[T \mid S] - \Pr[C \mid S']$?

Answer. $\Pr[T \mid S] - \Pr[C \mid S']$ is only the empirical difference in recovery frequencies between

those who receive treatment and those who do not. It is always possible to measure these differences,

but these differences are not necessarily due to the treatment but may be due to other reasons.
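
The point of parts b and c can be illustrated with a small simulation. This is a sketch under invented assumptions, not part of [Hol86]: each patient gets a hypothetical `health` score, recovery under treatment (event T) is assumed to require `health > 0.3`, and recovery without treatment (event C) is assumed to require `health > 0.6`. By construction the average causal effect Pr[T] − Pr[C] is 0.3 in the population. The observable difference in recovery frequencies is then computed once under random assignment and once under self-selection, where the sicker half of the patients seek treatment.

```python
import random

def naive_difference(outcomes, treated):
    """Estimate Pr[T|S] - Pr[C|S'] from what is actually observable."""
    # Treated patients reveal only their outcome under treatment ...
    treated_recoveries = [t for (t, c), s in zip(outcomes, treated) if s]
    # ... and untreated patients reveal only their outcome without treatment.
    control_recoveries = [c for (t, c), s in zip(outcomes, treated) if not s]
    return (sum(treated_recoveries) / len(treated_recoveries)
            - sum(control_recoveries) / len(control_recoveries))

rng = random.Random(0)
n = 100_000

# Hypothetical health score in [0, 1); higher means healthier (an assumption).
health = [rng.random() for _ in range(n)]

# Potential outcomes per patient: (recovers if treated, recovers if untreated).
outcomes = [(h > 0.3, h > 0.6) for h in health]

# Average causal effect Pr[T] - Pr[C]; computable only because the simulation
# knows both potential outcomes of every patient.
true_effect = (sum(t for t, _ in outcomes) - sum(c for _, c in outcomes)) / n

# Scenario 1: S independent of T and C (coin-flip assignment).
randomized = [rng.random() < 0.5 for _ in range(n)]

# Scenario 2: self-selection, the sicker patients get the treatment.
self_selected = [h < 0.5 for h in health]

diff_randomized = naive_difference(outcomes, randomized)
diff_self_selected = naive_difference(outcomes, self_selected)

print("average causal effect:     ", round(true_effect, 3))
print("observed, randomized:      ", round(diff_randomized, 3))
print("observed, self-selected:   ", round(diff_self_selected, 3))
```

Under random assignment the observed difference stays close to the average causal effect. Under self-selection the same observable quantity is negative (its population value in this setup is 0.4 − 0.8 = −0.4), although the treatment helps every patient it affects.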






The main message of the paper is therefore: before drawing causal conclusions

one should ascertain whether one of the conditions that make causal conclusions possible applies.

In the rest of the paper, Holland compares his approach with other approaches.

Suppes’s definitions of causality are interesting:

• If r < s denote two time values, event $C_r$ is a prima facie cause of $E_s$ iff

$\Pr[E_s \mid C_r] > \Pr[E_s]$.

• $C_r$ is a spurious cause of $E_s$ iff it is a prima facie cause of $E_s$ and for some

q < r < s there is an event $D_q$ so that $\Pr[E_s \mid C_r, D_q] = \Pr[E_s \mid D_q]$ and

$\Pr[E_s \mid C_r, D_q] \ge \Pr[E_s \mid C_r]$.

• Event $C_r$ is a genuine cause of $E_s$ iff it is a prima facie but not a spurious

cause.

This is quite different from Rubin’s analysis. Suppes concentrates on the causes of a

given effect, not the effects of a given cause. Suppes has a Popperian falsificationist

view: a hypothesis is good if one cannot falsify it, while Holland has the depth-realist

view which says that the empirical is only a small part of reality, and which looks at

the underlying mechanisms.

Problem 227. Construct an example of a probability field with a spurious cause.
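
One possible construction, offered as a sketch rather than as the author's intended answer: an earlier event D acts as a common cause of C and E, and C and E are assumed to be conditionally independent given D. All numerical probabilities below are invented for illustration. The code builds the eight-point probability field exactly with fractions and checks Suppes's conditions.

```python
from fractions import Fraction
from itertools import product

# Illustrative assumptions: D happens with probability 1/2, and it raises
# the probability of both C and E from 1/5 to 4/5.
p_d = Fraction(1, 2)
p_c_given_d = {True: Fraction(4, 5), False: Fraction(1, 5)}
p_e_given_d = {True: Fraction(4, 5), False: Fraction(1, 5)}

def bernoulli(p, value):
    """Probability that a yes/no event with success probability p equals value."""
    return p if value else 1 - p

# Joint distribution over outcomes (d, c, e); C and E are independent given D.
joint = {
    (d, c, e): bernoulli(p_d, d)
               * bernoulli(p_c_given_d[d], c)
               * bernoulli(p_e_given_d[d], e)
    for d, c, e in product([True, False], repeat=3)
}

def pr(event, given=lambda d, c, e: True):
    """Conditional probability Pr[event | given] in the field defined above."""
    numerator = sum(p for w, p in joint.items() if event(*w) and given(*w))
    denominator = sum(p for w, p in joint.items() if given(*w))
    return numerator / denominator

pr_e = pr(lambda d, c, e: e)
pr_e_given_c = pr(lambda d, c, e: e, lambda d, c, e: c)
pr_e_given_d = pr(lambda d, c, e: e, lambda d, c, e: d)
pr_e_given_cd = pr(lambda d, c, e: e, lambda d, c, e: c and d)

# Prima facie cause: Pr[E|C] > Pr[E].
is_prima_facie = pr_e_given_c > pr_e
# Spurious: additionally Pr[E|C,D] = Pr[E|D] and Pr[E|C,D] >= Pr[E|C].
is_spurious = (is_prima_facie
               and pr_e_given_cd == pr_e_given_d
               and pr_e_given_cd >= pr_e_given_c)

print(pr_e, pr_e_given_c, pr_e_given_d, pr_e_given_cd, is_spurious)
```

In this field Pr[E] = 1/2 and Pr[E|C] = 17/25, so C is a prima facie cause of E. Since Pr[E|C, D] = Pr[E|D] = 4/5 ≥ 17/25, the earlier event D screens C off from E, and C is spurious in Suppes's sense.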






Granger causality (see chapter/section 67.2.1) is based on the idea: knowing

a cause ought to improve our ability to predict. It is more appropriate to speak

here of “noncausality” instead of causality: a variable does not cause another if

knowing that variable does not improve our ability to predict the other variable.

Granger formulates his theory in terms of a specific predictor, the BLUP, while

Holland extends it to all predictors. Granger works on it in a time series framework,

while Holland gives a more general formulation. Holland’s formulation strips off the

unnecessary detail in order to get at the essence of things. Holland defines: x is not

a Granger cause of y relative to the information in z (which in the time-series context

contains the past values of y) if and only if x and y are conditionally independent

given z. Problem 40 explains why this can be tested by testing predictive power.
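
The idea of testing noncausality through predictive power can be sketched in a few lines. Everything in this example is an assumption made for illustration: the simulated first-order processes, the coefficients 0.5 and 0.8, and the use of an in-sample least-squares linear predictor without intercept as a simple stand-in for the BLUP. The function `rss_reduction` is a hypothetical helper; it reports the relative drop in the residual sum of squares when the lagged x is added to a predictor of y that already uses the lagged y.

```python
import random

def simulate(n, effect_of_x, rng):
    """Simulate y[t] = 0.5*y[t-1] + effect_of_x*x[t-1] + noise, with x white noise."""
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [0.0]
    for t in range(1, n):
        y.append(0.5 * y[-1] + effect_of_x * x[t - 1] + rng.gauss(0, 1))
    return y, x

def rss_reduction(y, x):
    """Relative drop in residual sum of squares from adding x[t-1] to a
    linear predictor of y[t] based on y[t-1] (zero-mean series, no intercept)."""
    target, ylag, xlag = y[1:], y[:-1], x[:-1]
    syy = sum(a * a for a in ylag)
    sxx = sum(b * b for b in xlag)
    sxy = sum(a * b for a, b in zip(ylag, xlag))
    sy = sum(a * t for a, t in zip(ylag, target))
    sx = sum(b * t for b, t in zip(xlag, target))

    # Restricted predictor: y[t-1] only.
    b_r = sy / syy
    rss_restricted = sum((t - b_r * a) ** 2 for a, t in zip(ylag, target))

    # Unrestricted predictor: solve the 2x2 normal equations by Cramer's rule.
    det = syy * sxx - sxy * sxy
    b_y = (sy * sxx - sx * sxy) / det
    b_x = (sx * syy - sy * sxy) / det
    rss_unrestricted = sum((t - b_y * a - b_x * b) ** 2
                           for a, b, t in zip(ylag, xlag, target))
    return (rss_restricted - rss_unrestricted) / rss_restricted

rng = random.Random(1)
y_caused, x_causing = simulate(2000, 0.8, rng)   # x feeds into y
y_alone, x_unrelated = simulate(2000, 0.0, rng)  # x plays no role in y

gain_causal = rss_reduction(y_caused, x_causing)
gain_noncausal = rss_reduction(y_alone, x_unrelated)
print("gain when x feeds into y:    ", round(gain_causal, 3))
print("gain when x is unrelated to y:", round(gain_noncausal, 3))
```

When x feeds into y, adding the lagged x reduces the prediction error substantially. When x is unrelated to y, the reduction is close to zero, which is what noncausality in Granger's sense predicts. A formal test would compare the two residual sums of squares with an F statistic instead of inspecting the ratio by eye.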


