Case 14.2 Integrated Case: The Global Motors Survey Associative Analysis


CHAPTER 15

Understanding Regression Analysis Basics

Learning Objectives

• To understand the basic concept of prediction
• To learn how marketing researchers use regression analysis
• To learn how marketing researchers use bivariate regression analysis
• To see how multiple regression differs from bivariate regression
• To learn how to obtain and interpret multiple regression analyses with SPSS

“Where We Are”

1 Establish the need for marketing research.
2 Define the problem.
3 Establish research objectives.
4 Determine research design.
5 Identify information types and sources.
6 Determine methods of accessing data.
7 Design data collection forms.
8 Determine the sample plan and size.
9 Collect data.
10 Analyze data.
11 Prepare and present the final research report.

A Marketing Research Practitioner’s Comments on Multiple Regression Analysis

Marketing research practitioners have two basic missions: describe the current state of the marketplace and predict how the marketplace will react to changes in current product offerings or the introduction of new offerings. Prediction is the more difficult of these two missions simply because so many variables must be measured.

Consider the introduction of a new type of yogurt. To predict sales among regular yogurt purchasers, one would need to initially measure awareness of the new offering, consumer importance of differing product attributes (e.g., taste, calories, claimed benefits, package size), relative price, availability, and so forth. Some of these product attributes may be important predictors of sales, and some may not. But it is obvious that it is highly unlikely one single attribute will adequately predict sales. Rather it is most likely it will require a combination of measures to provide an acceptable sales prediction model.

One of the best tools for unraveling this prediction complexity is linear regression. Executed correctly, regression can help you understand whether the variables you have measured can aid in predicting sales (or consideration, or the likelihood of choice). Linear regression is one of the fundamental and most important tools in the researcher’s toolbox.

When I first studied regression in graduate school, we were required to do the analysis manually using only a simple hand calculator. That was a great learning experience and taught me how to analyze residuals to understand whether the data was truly linear and whether the data was a good fit to the model I developed. In this age, we have powerful computers and sophisticated software to do all those things. However, since the advent of these new tools, I have also seen too many instances where researchers who lack a deep understanding of regression analysis produce a regression-based model they represented to be a good predictor when it is not, leading to erroneous conclusions. Thus, a note of caution is in order: Test it, test it, and test it again.

William D. Neal, Founder and Senior Partner, SDR Consulting

Text and Images: By permission, William D. Neal, Founder and Senior Partner, SDR Consulting.

As you can surmise from reading the opening vignette, this chapter takes up the subject of multiple regression analysis. Undoubtedly, your reading of William Neal’s description has left you with more questions than answers. For example, what is a residual? Truly linear? Good fit? Your questions should alert you to the fact that we are going to describe a complex analytical technique. We will endeavor to describe regression analysis in a slow and methodical manner, and when we end our description, we will warn you that, while you have learned to run it and interpret its findings, we have barely scratched the surface of this complicated analysis.

Bivariate Linear Regression Analysis

In this chapter, we will deal exclusively with linear regression analysis, a predictive model technique often used by marketing researchers. However, regression analysis is a complex statistical technique with a large number of requirements and nuances.1 You must understand that this

chapter offers only a basic introduction to this area, and as we will warn you toward the end of

the material, a great many aspects of regression analysis are beyond the scope of this textbook.

We define regression analysis as a predictive analysis technique in which one or more

variables are used to predict the level of another by use of the straight-line formula. Bivariate

regression means only two variables are being analyzed, and researchers sometimes refer to

this case as simple regression. We will review the equation for a straight line and introduce

basic terms used in regression. We also describe some basic computations and significance

with bivariate regression.

A straight-line relationship underlies regression, and it is a powerful predictive model. Figure 15.1 illustrates a straight-line relationship, and you should refer to it as we describe the elements in a general straight-line formula. The formula for a straight line is:

Formula for a straight-line relationship

y = a + bx

where
y = the predicted variable
x = the variable used to predict y
a = the intercept, or point where the line cuts the y axis when x = 0
b = the slope, or the change in y for any 1 unit change in x

With bivariate regression, one variable is used to predict another variable using the straight-line formula.

Figure 15.1 General Equation for a Straight Line in Graph Form (a graph of y against x, marking a = intercept, the point on the y-axis that the line cuts when x = 0, and b = the slope, the change in the line for each one-unit change in x)

The straight-line equation is the basis of regression analysis.

You should recall the straight-line relationship we described underlying the correlation coefficient: When the scatter diagram for two variables appears as a thin ellipse, there is a high correlation between them. Regression is directly related to correlation.

Regression is directly related to correlation by the underlying straight-line relationship.

Basic Concepts in Regression Analysis

We now define the variables and show how the intercept and slope are computed.

Independent and Dependent Variables

As we indicated, bivariate regression analysis is a case

in which only two variables are involved in the predictive model. When we use only two variables, one is termed dependent and the other is termed independent. The dependent variable

is that which is predicted, and it is customarily termed y in the regression straight-line equation. The independent variable is that which is used to predict the dependent variable, and

it is the x in the regression formula. We must quickly point out that the terms dependent and independent are arbitrary designations that are customary in regression analysis. They do not imply a cause-and-effect relationship or true dependence between the dependent and the independent variable. Regression identifies a statistical relationship, not a causal one, between these two variables.

The least squares criterion

used in regression analysis

guarantees that the

“best” straight-line slope

and intercept will be

calculated.

Computing the Slope and the Intercept

To compute a (intercept) and b (slope),

you must work with a number of observations of the various levels of the dependent

variable paired with different levels of the independent variable, identical to the scatter

diagrams we illustrated previously when we were demonstrating how to perform

correlation analysis.

The formulas for calculating the slope (b) and the intercept (a) are rather complicated, but

some instructors are in favor of their students learning these formulas, so we have included

them in Marketing Research Insight 15.1.

When SPSS or any other statistical analysis program computes the intercept and the

slope in a regression analysis, it does so on the basis of the least squares criterion. The least

squares criterion is a way of guaranteeing that the straight line that runs through the points on the scatter diagram is positioned to minimize the sum of the squared vertical distances of the various points from the line. In other words, if you draw the calculated regression line and measure the vertical distance of each point from that line (called a residual), it would be impossible to draw any other line that would result in a lower sum of squared residuals. Each residual is squared to avoid a cancellation effect of positive and negative residuals.
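The least squares criterion can be checked numerically. The following is a minimal sketch, not a description of how SPSS works internally: the five (x, y) pairs are made up purely for illustration, and the function names are our own. It fits the least squares line, shows that the raw residuals cancel to essentially zero (which is why they are squared), and confirms that several slightly shifted lines all produce a larger sum of squared residuals.

```python
# Hypothetical (x, y) observations, invented only to illustrate the criterion.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]


def fit_least_squares(xs, ys):
    """Return (a, b) for the least squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Deviation form of the slope; algebraically the same as the raw-sums formula.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def residuals(a, b, xs, ys):
    """Vertical distance of each point from the line y = a + b*x."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]


def sum_squared_residuals(a, b, xs, ys):
    return sum(r ** 2 for r in residuals(a, b, xs, ys))


a, b = fit_least_squares(xs, ys)
best = sum_squared_residuals(a, b, xs, ys)
raw_sum = sum(residuals(a, b, xs, ys))  # positives and negatives cancel

# Shift the intercept and/or slope a little and recompute the criterion.
nudges = [(0.5, 0.0), (-0.5, 0.0), (0.0, 0.2), (0.0, -0.2), (0.3, -0.1)]
worse = [sum_squared_residuals(a + da, b + db, xs, ys) for da, db in nudges]

print(f"a = {a:.3f}, b = {b:.3f}")
print(f"sum of raw residuals     = {raw_sum:.6f}")
print(f"sum of squared residuals = {best:.4f}")
print("every nudged line is worse:", all(w > best for w in worse))
```

Trying other values in `nudges` is a quick way to convince yourself that no other straight line beats the least squares line on this criterion.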

To learn about linear regression, launch www.youtube.com, and search for “Intro to Linear Regression.”

How to Improve a Regression Analysis Finding

When a researcher wants to improve a regression analysis, the researcher can use a

scatter diagram to identify outlier pairs of points. An outlier2 is a data point that is substantially outside the normal range of the data points being analyzed. As one author has noted,

outliers “stick out like sore thumbs.”3 When using a scatter diagram to identify outliers,4 the

researcher draws an ellipse that encompasses most of the points that appear to be in an elliptical pattern.5 He or she then eliminates outliers from the data and reruns the regression

analysis. Generally, this approach will improve the regression analysis results.

In regression, the

independent variable

is used to predict the

dependent variable.

Marketing Research Insight 15.1

Practical Application

How to Calculate the Intercept and Slope of a Bivariate Regression

In this example, we are using the Novartis pharmaceuticals company sales territory and number of salespersons data found in Table 15.1. Intermediate regression calculations are included in Table 15.2.

Table 15.1 Bivariate Regression Analysis Data and Intermediate Calculations

Territory (i)   Sales ($ millions) (y)   Number of salespersons (x)       xy      x²
 1              102                       7                               714      49
 2              125                       5                               625      25
 3              150                       9                             1,350      81
 4              155                       9                             1,395      81
 5              160                       9                             1,440      81
 6              168                       8                             1,344      64
 7              180                      10                             1,800     100
 8              220                      10                             2,200     100
 9              210                      12                             2,520     144
10              205                      12                             2,460     144
11              230                      12                             2,760     144
12              255                      15                             3,825     225
13              250                      14                             3,500     196
14              260                      15                             3,900     225
15              250                      16                             4,000     256
16              275                      16                             4,400     256
17              280                      17                             4,760     289
18              240                      18                             4,320     324
19              300                      18                             5,400     324
20              310                      19                             5,890     361
Sums            4,325                    251                           58,603   3,469
Average         216.25                   12.55

The formula for computing the regression parameter b is:

Formula for b, the slope, in bivariate regression

$$b = \frac{n\sum_{i=1}^{n} x_i y_i - \left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}$$

where
x_i = an x variable value
y_i = a y value paired with each x_i value
n = the number of pairs


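The calculations for b, the slope, are as follows:

Calculation of b, the slope, in bivariate regression using Novartis sales territory data

$$
\begin{aligned}
b &= \frac{n\sum_{i=1}^{n} x_i y_i - \left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2} \\
  &= \frac{20 \times 58{,}603 - 251 \times 4{,}325}{20 \times 3{,}469 - 251^2} \\
  &= \frac{1{,}172{,}060 - 1{,}085{,}575}{69{,}380 - 63{,}001} \\
  &= \frac{86{,}485}{6{,}379} \\
  &= 13.56
\end{aligned}
$$

Notes:
n = 20
Sum of xy = 58,603
Sum of x = 251
Sum of y = 4,325
Sum of x² = 3,469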

The formula for computing the intercept is:

Formula for a, the intercept, in bivariate regression

$$a = \bar{y} - b\bar{x}$$

The computations for a, the intercept, are as follows:

Calculation of a, the intercept, in bivariate regression using Novartis sales territory data

$$
\begin{aligned}
a &= \bar{y} - b\bar{x} \\
  &= 216.25 - 13.56 \times 12.55 \\
  &= 216.25 - 170.178 \\
  &= 46.07
\end{aligned}
$$

Notes:
ȳ (average of y) = 216.25
x̄ (average of x) = 12.55

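In other words, the bivariate regression equation has been found to be:

Novartis sales regression equation

y = 46.07 + 13.56x

The interpretation of this equation is as follows. The intercept means that predicted annual sales are $46.07 million for a territory with no salespersons, a value that lies outside the range of the data (5 to 19 salespersons) and serves mainly to position the line. The slope means that predicted annual sales increase by $13.56 million with each additional salesperson.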
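The hand calculations in this Marketing Research Insight can be reproduced in a few lines of code. The sketch below uses the Table 15.1 data and the raw-sums formulas for b and a; the variable names and the helper `predict_sales` are our own and are not part of SPSS or any library.

```python
# Table 15.1 data: sales ($ millions) and number of salespersons, territories 1-20.
sales = [102, 125, 150, 155, 160, 168, 180, 220, 210, 205,
         230, 255, 250, 260, 250, 275, 280, 240, 300, 310]
salespersons = [7, 5, 9, 9, 9, 8, 10, 10, 12, 12,
                12, 15, 14, 15, 16, 16, 17, 18, 18, 19]

n = len(sales)
sum_x = sum(salespersons)
sum_y = sum(sales)
sum_xy = sum(x * y for x, y in zip(salespersons, sales))
sum_x2 = sum(x * x for x in salespersons)

# Slope: b = (n*Sum(xy) - Sum(x)*Sum(y)) / (n*Sum(x^2) - Sum(x)^2)
slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)

# Intercept: a = mean(y) - b * mean(x)
intercept = sum_y / n - slope * (sum_x / n)


def predict_sales(num_salespersons):
    """Predicted territory sales ($ millions) from the fitted straight line."""
    return intercept + slope * num_salespersons


print(f"sums: n={n}, x={sum_x}, y={sum_y}, xy={sum_xy}, x2={sum_x2}")
print(f"slope b     = {slope:.2f}")
print(f"intercept a = {intercept:.2f}")
print(f"predicted sales with 10 salespersons = {predict_sales(10):.2f}")
```

Because the code carries the unrounded slope forward, its intercept comes out at about 46.10 rather than 46.07; the hand calculation rounds b to 13.56 before computing a.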

Multiple Regression Analysis

We follow up our introduction to bivariate regression analysis by discussing multiple regression analysis. You will find that all of the concepts in bivariate regression apply to multiple

regression analysis, except you will be working with multiple independent variables.

There is an underlying

general conceptual model

in multiple regression

analysis.

An Underlying Conceptual Model

A model is a structure that ties together various constructs and their relationships. It is beneficial for the marketing manager and the market researcher to have some sort of model in

mind when designing the research plan. The bivariate regression equation is a model that ties

together an independent variable and its dependent variable. The dependent variables that

interest market researchers are typically sales, potential sales, or some attitude held by those

who make up the market. For example, in the Novartis example, the dependent variable was

territory sales. If Dell Computers commissioned a survey, it might want information on those

who intend to purchase a Dell computer, or it might want information on those who intend to

buy a competing brand as a means of understanding these consumers and perhaps dissuading

them. The dependent variable would be purchase intentions for Dell computers. If Maxwell

House Coffee were considering a brand of gourmet iced coffee, it would want to know how

coffee drinkers feel about gourmet iced coffee; attitudes toward buying, preparing, and drinking iced coffee would be the dependent variable.

Figure 15.2 provides a general conceptual model that fits many marketing research situations, particularly those that are investigating consumer behavior. A general conceptual model identifies independent and dependent variables and shows their expected basic relationships to one another. In Figure 15.2, you can see that purchases, intentions to purchase, and preferences are in the center, meaning they are dependent. The surrounding concepts are possible independent variables. That is, any one could be used to predict any dependent variable. For example, one’s intentions to purchase an expensive automobile like a Lexus could depend on one’s income. It could also depend on friends’ recommendations (word of mouth), one’s opinions about how a Lexus would enhance one’s self-image, or experiences riding in or driving a Lexus.

[Figure 15.2 diagram labels: center box: Purchases; Intentions to Purchase; Preferences; or Satisfaction. Surrounding boxes: Attitudes, Opinions, Feelings; Past Behavior, Experience, Knowledge.]

In truth, consumers’ preferences, intentions, and actions are potentially influenced by a

great number of factors as would be evident if you listed all of the subconcepts that make up

each concept in Figure 15.2. For example, there are probably a dozen demographic variables;

there could be dozens of lifestyle dimensions, and a person is exposed to a great many types

of advertising media every day. Of course, in the problem definition stage, the researcher and

manager reduce the myriad of independent variables down to a manageable number to be

included on the questionnaire. That is, they have the general model structure in Figure 15.2

in mind, but they identify and measure specific variables that pertain to the problem at hand.

Because bivariate regression analysis treats only one dependent–independent pair, it would

take a great many bivariate regression analyses to account for all possible relevant dependent–

independent pairs of variables in a general model such as Figure 15.2. Fortunately, there is no

need to perform a great many bivariate regressions, as multiple regression analysis is a much

better tool, and a technique we are about to describe in some detail.

Active Learning

The General Conceptual Model for Global Motors

Understandably, Nick Thomas, CEO of Global Motors, a new division of a large automobile

manufacturer, ZEN Motors, wants everyone to intend to purchase a new gasoline alternative technology automobile; however, this will not be the case due to different beliefs and

predispositions in the driving public. Regression analysis will assist Nick by revealing what

variables are good predictors of intentions to buy the various new technology automobile

models under consideration at Global Motors. What is the general conceptual model apparent in the Global Motors survey dataset?

In order to answer this question and to portray the general conceptual model in the format of Figure 15.2, you must inspect the several variables in this SPSS dataset or otherwise

come up with a list of the variables in the survey. Using any “Desirability” variable as the dependent variable, diagram the general types of independent or predictor variables that are

apparent in this study. Comment on the usefulness of this general conceptual model to Nick

Thomas; that is, assuming that the regression results are significant, what marketing strategy

implications will become apparent?

Figure 15.2 A General Conceptual Model for Multiple Regression Analysis (surrounding boxes also include Media Exposure, Word of Mouth; and Demographics, Lifestyle)

The researcher and

manager identify, measure,

and analyze specific

variables that pertain to

the general conceptual

model in mind.


Multiple regression means

you have more than one

independent variable to

predict a single dependent

variable.

With multiple regression,

you work with a regression

plane rather than a line.

A multiple regression

equation has two or more

independent variables (x’s).

Multiple Regression Analysis Described

Multiple regression analysis is an expansion of bivariate regression analysis in that more

than one independent variable is used in the regression equation. The addition of independent variables complicates the conceptualization by adding more dimensions or axes to the

regression situation. But it makes the regression model more realistic because, as we have

just explained with our general model discussion, predictions normally depend on multiple

factors, not just one.

Basic Assumptions in Multiple Regression

Consider our Novartis example with the number of salespeople as the independent variable and territory sales as the dependent variable. A second independent variable, such as advertising levels, can be added to the equation. The addition of a second variable turns the regression line into a regression plane because there are three dimensions if we were to try to graph it: territory sales (Y), number of salespeople (X1), and advertising level (X2). A regression plane is the shape that the predicted values of the dependent variable take in multiple regression analysis. If other independent variables are added to the regression analysis, it would be necessary to envision each one as a new and separate axis existing at right angles to all other axes. Obviously, it is impossible to draw more than three dimensions at right angles. In fact, it is difficult to even conceive of a multiple dimension diagram, but the assumptions of multiple regression analysis require this conceptualization.

Everything about multiple regression is largely equivalent to bivariate regression except

you are working with more than one independent variable. The terminology is slightly different in places, and some statistics are modified to take into account the multiple aspect, but for

the most part, concepts in multiple regression are analogous to those in the simple bivariate

case. We note these similarities in our description of multiple regression.

The equation in multiple regression has the following form:

Multiple regression equation

$$y = a + b_1 x_1 + b_2 x_2 + b_3 x_3 + \dots + b_m x_m$$

where
y = the dependent, or predicted, variable
x_i = independent variable i
a = the intercept
b_i = the slope for independent variable i
m = the number of independent variables in the equation

As you can see, the addition of other independent variables has done nothing more than

to add bixi’s to the equation. We have retained the basic y = a + bx straight-line formula,

except now we have multiple x variables, and each one is added to the equation, changing y

by its individual slope. The inclusion of each independent variable in this manner preserves

the straight-line assumptions of multiple regression analysis. This is sometimes known as

additivity because each new independent variable is added to the regression equation.

Let’s look at a multiple regression analysis result so you can better understand the multiple regression equation. Here is a possible result using our Lexus example:

Lexus purchase intention multiple regression equation example

Intention to purchase a Lexus = 2
+ 1.0 × attitude toward Lexus (1–5 scale)
− .5 × attitude toward current auto (1–5 scale)
+ 1.0 × income level (1–10 scale)

Notes:
a = 2
b1 = 1.0
b2 = −.5
b3 = 1.0

This multiple regression equation says that you can predict a consumer’s level of intention to buy a Lexus if you know three variables: (1) attitude toward the Lexus brand, (2) attitude toward the automobile he/she owns now, and (3) income level using a scale with 10 income levels. Further, we can see the impact of each of these variables on Lexus purchase intentions. Here is how to interpret the equation. First, the intercept of 2 is the baseline intention level before the three variables are taken into account, indicating some small propensity to want to buy a Lexus. Attitude toward Lexus is measured on a 1–5 scale; with each attitude scale point, intention goes up one point. That is, an individual with a strong positive attitude of 5 will have a greater intention than one with a strong negative attitude of 1. With attitude toward the current automobile he/she owns (for example, a potential Lexus buyer may currently own a Cadillac or a BMW), the intention decreases by .5 for each level on the 5-point scale. Of course we are assuming that these potential buyers own automobile makes other than a Lexus. Finally, the intention increases by 1 with each increasing income level.

Here is a numerical example for a potential Lexus buyer whose Lexus attitude is 4,

current automobile make attitude is 3, and income is 5:

Calculation of Lexus purchase intention using the multiple regression equation

Intention to purchase a Lexus = 2
+ 1.0 × 4
− .5 × 3
+ 1.0 × 5
= 9.5

Notes:
Intercept = 2
Attitude toward Lexus (x1) = 4
Attitude toward current auto (x2) = 3
Income level (x3) = 5
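The same calculation can be written as a small function, which makes the additive structure of the multiple regression equation explicit: the prediction is the intercept plus each slope times its own variable. This is a minimal sketch; the coefficients are the chapter's illustrative Lexus values, while the dictionary keys and the function name `predict_intention` are our own labels.

```python
# Coefficients from the chapter's illustrative Lexus equation.
INTERCEPT = 2.0
SLOPES = {
    "attitude_toward_lexus": 1.0,          # b1, measured on a 1-5 scale
    "attitude_toward_current_auto": -0.5,  # b2, measured on a 1-5 scale
    "income_level": 1.0,                   # b3, measured on a 1-10 scale
}


def predict_intention(values):
    """Apply y = a + b1*x1 + b2*x2 + b3*x3 to one respondent's values."""
    return INTERCEPT + sum(SLOPES[name] * values[name] for name in SLOPES)


# The potential buyer from the numerical example.
buyer = {
    "attitude_toward_lexus": 4,
    "attitude_toward_current_auto": 3,
    "income_level": 5,
}

intention = predict_intention(buyer)
print(intention)  # 9.5, matching the hand calculation
```

Changing one value in `buyer` while holding the others fixed shows the meaning of each slope: raising the Lexus attitude from 4 to 5, for instance, raises the prediction by exactly b1.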

Multiple regression is a powerful tool, because it tells us what factors are related to the

dependent variable, how (the sign) each factor influences the dependent variable, and how

much (the size of bi) each factor influences it.

While you have not yet learned how to run multiple regression analysis on SPSS, you

have sufficient knowledge to realize that this analysis can provide interesting insights into

consumer behavior. Marketing Research Insight 15.2 presents an application of multiple

regression analysis in the social media marketing research arena.

As with bivariate regression analysis in which we alluded to the correlation between y and x, it is possible to inspect the strength of the linear relationship between the independent variables and the dependent variable with multiple regression. Multiple R is the correlation between the actual and the predicted values of the dependent variable, and its square, R Square (also called the coefficient of determination), is a handy measure of the strength of the overall linear relationship. As with bivariate regression analysis, the multiple regression analysis model assumes that a straight-line (plane) relationship exists among the variables. R Square ranges from 0 to +1.0 and represents the proportion of the variation in the dependent variable “explained,” or accounted for, by the combined independent variables. High R Square values indicate that the regression plane applies well to the scatter of points, whereas low values signal that the straight-line model does not apply well. At the same time, a multiple regression result is an estimate of the population multiple regression equation, and, as was the case with other estimated population parameters, it is necessary to test for statistical significance.

R Square is like a lead indicator of the multiple regression analysis findings. As you will see soon, it is one of the first pieces of information provided in a multiple regression output. Many researchers mentally convert the R Square value into a percentage. For example, an R Square of .75 means that the regression findings will explain 75% of the variation in the dependent variable. The greater the explanatory power of the multiple regression finding, the better and more useful it is for the researcher.
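To make the idea of “explained” variation concrete, the sketch below computes the strength-of-relationship statistics for the bivariate Novartis regression from Table 15.1 (so there is one independent variable, k = 1). It uses the standard definitions, which the chapter does not spell out: R Square is 1 minus the ratio of the residual sum of squares to the total sum of squares, multiple R is its square root, and adjusted R Square is 1 − (1 − R²)(n − 1)/(n − k − 1). The variable names are our own.

```python
import math

# Table 15.1 data: sales ($ millions) and number of salespersons, territories 1-20.
sales = [102, 125, 150, 155, 160, 168, 180, 220, 210, 205,
         230, 255, 250, 260, 250, 275, 280, 240, 300, 310]
salespersons = [7, 5, 9, 9, 9, 8, 10, 10, 12, 12,
                12, 15, 14, 15, 16, 16, 17, 18, 18, 19]

n = len(sales)
k = 1  # number of independent variables in this (bivariate) regression

mean_x = sum(salespersons) / n
mean_y = sum(sales) / n

# Least squares slope and intercept (deviation form).
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(salespersons, sales))
sxx = sum((x - mean_x) ** 2 for x in salespersons)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

predicted = [intercept + slope * x for x in salespersons]

# Total variation in y, and the part the line leaves unexplained.
ss_total = sum((y - mean_y) ** 2 for y in sales)
ss_residual = sum((y - p) ** 2 for y, p in zip(sales, predicted))

r_square = 1 - ss_residual / ss_total
multiple_r = math.sqrt(r_square)
adjusted_r_square = 1 - (1 - r_square) * (n - 1) / (n - k - 1)

print(f"Multiple R        = {multiple_r:.3f}")
print(f"R Square          = {r_square:.3f}")
print(f"Adjusted R Square = {adjusted_r_square:.3f}")
```

Note that the percentage interpretation applies to R Square, not to multiple R itself: a multiple R of .75 corresponds to an R Square of about .56.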

Before we show you how to run a multiple regression analysis using SPSS, consider this

caution: The independence assumption stipulates that the independent variables must be

statistically independent and uncorrelated with one another. The independence assumption

is crucial because if it is violated, the estimated betas become unreliable and the multiple regression findings can be misleading. The presence of moderate or stronger correlations among the independent variables is termed multicollinearity, and it violates the independence assumption of multiple regression analysis.6 It is up to the researcher to test for and remove multicollinearity if it is present.

Multiple R indicates how

well the independent

variables can predict the

dependent variable in

multiple regression.

With multiple regression,

the independent

variables should have

low correlations with one

another.


Marketing Research Insight 15.2

Social Media Marketing

Multiple Regression Analysis Gives Insights into Students’

Use of Twitter and Facebook

Two researchers recently took note of the widespread adoption of Twitter and Facebook by university students. 7 They

pondered the possible factors and reasons why these two

social media tools are so popular. Using adoptions of innovations theory, they identified several independent variables that

might be related to the use of one or both vehicles. Drawing

samples of undergraduate students from both a large Midwest university and a large Southeastern university, they used

an online survey to obtain information about some demographic variables (such as gender), several behavioral variables

(such as amount of mobile phone usage), and a number of

other variables (such as popularity or degree to which friends

use Twitter/Facebook).

The researchers then used multiple regression analysis

with these items treated as independent variables, and the

degree of use of Twitter/Facebook as the dependent variables. As reported by the researchers, the statistically significant independent variables for amount of use of Twitter and

Facebook are summarized in the following table by “Yes.”

A “No” means that no statistically significant relationship

was found.

Variable                            Twitter Use   Facebook Use
Amount of mobile phone usage        Yes           No
How long used account               No            Yes
Friends expect me to use            Yes           Yes
Attitude toward Twitter/Facebook    Yes           No
To pass the time                    No            Yes
Substitute for face-to-face         Yes           No
My friends use it                   No            Yes

The findings reveal that university students use Twitter if: (1) they are heavy mobile users, (2) they believe their friends expect them to tweet, (3) they have a positive attitude toward Twitter, and (4) they prefer to use Twitter in place of face-to-face meetings with their friends. On the other hand, university students use Facebook if: (1) they are established users of Facebook, (2) they have a positive attitude toward Facebook, (3) they are bored and want to do something to pass the time, and (4) many of their friends use it. It is interesting to note that gender and age were not found to be significantly related to the use of either Twitter or Facebook.

Multicollinearity can be assessed and eliminated in multiple regression with the VIF statistic.

The way to avoid multicollinearity is to use warning statistics issued by most statistical analysis programs to identify this problem. One commonly used method is the variance

inflation factor (VIF). The VIF is a single number, and a rule of thumb is that as long as the

VIF is less than 10, multicollinearity is not a concern. With a VIF of greater than 10 associated with any independent variable in the multiple regression equation, it is prudent to remove

that variable from consideration or to otherwise reconstitute the set of independent variables.8

In other words, when examining the output of any multiple regression, the researcher should

inspect the VIF number associated with each independent variable that is retained in the final

multiple regression equation by the procedure. If the VIF is greater than 10, the researcher

should remove that variable from the independent variable set and rerun the multiple regression.9 This iterative process is used until only independent variables that are statistically significant and that have acceptable VIFs are in the final multiple regression equation.
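The chapter does not give the formula behind the VIF. The standard definition is VIF_j = 1 / (1 − R_j²), where R_j² is the R Square obtained when independent variable j is regressed on all the other independent variables. The sketch below covers only the simplest case of two independent variables, where R_j² is just their squared correlation; with three or more, each R_j² requires its own regression, which a statistics package computes for you. The data values and function names are hypothetical and chosen only to show one pair that fails the rule of thumb and one that passes.

```python
import math


def pearson_r(u, v):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    suv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    suu = sum((a - mean_u) ** 2 for a in u)
    svv = sum((b - mean_v) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def vif_two_predictors(x1, x2):
    """VIF shared by both variables when there are only two predictors."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)


# Hypothetical independent variables, invented for illustration only.
x1 = [1, 2, 3, 4, 5, 6]
x2_nearly_duplicate = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2]  # roughly 2 * x1
x3_unrelated = [3, 1, 4, 1, 5, 2]

for label, other in [("x1 with x2", x2_nearly_duplicate),
                     ("x1 with x3", x3_unrelated)]:
    vif = vif_two_predictors(x1, other)
    verdict = "remove or rework" if vif > 10 else "acceptable"
    print(f"{label}: VIF = {vif:.2f} ({verdict})")
```

The threshold of 10 in the code is the chapter's rule of thumb; a VIF of 10 corresponds to an R_j² of .90, meaning 90% of that variable's variation is already carried by the other independent variables.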

Integrated Case

Global Motors: How to Run and Interpret Multiple Regression Analysis on SPSS

Running multiple regression first requires specification of the dependent and independent

variables. Let’s select the desirability of the standard size gasoline automobile model as

the dependent variable, and think about a general conceptual model that might pertain to

Global Motors. We already know from basic marketing strategy that demographics are often used for target marketing, and we have hometown size, age, income, education, and household size. Also, beliefs are often useful for predicting market segments, and we have some variables that pertain to beliefs about gasoline emissions and global warming. To summarize, we have determined our conceptual model: the desirability of a standard-size gasoline automobile related to (1) household demographics and (2) beliefs about global warming. Where appropriate, we have recoded the ordinal demographic variables with midpoints to convert them to ratio scales.

The ANALYZE-REGRESSION-LINEAR command sequence is used to run a multiple

regression analysis, and the variable, Desirability: standard size gasoline model, is selected

as the dependent variable, while the other nine are specified as the independent variables. You

will find this annotated SPSS clickstream in Figure 15.3.

As the computer output in Figure 15.4 shows, the Adjusted R Square value in the Model Summary table, which indicates the strength of relationship between the independent variables and the dependent variable, is .235, signifying that there is some linear relationship present. Next, the printout reveals that the ANOVA F is significant, signaling that the null hypothesis of no linear relationship is rejected, and it is justifiable to use a straight-line relationship to model the variables in this case.

It is necessary in multiple regression analysis to test for statistical significance of the

bi (beta) determined for each independent variable. In other words, you must determine

whether sampling error is influencing the results and giving a false reading. One must test for

significance from zero (the null hypothesis) through the use of separate t tests for each bi. The

SPSS output in Figure 15.4 indicates the levels of statistical significance in the Coefficients

table in the column labeled “Sig.”; we have highlighted in yellow the cases where the significance level is .05 or less (95% level of confidence). It is apparent that size of hometown,

gender, number of people in the household, age, income, and the two attitude variables are

statistically significant. The other independent variables fail this test, meaning that their computed betas must be treated as zeros. No VIF value is greater than 10, so multicollinearity is

not a concern here.

The SPSS ANALYZE-REGRESSION-LINEAR command is used to run multiple regression.

With multiple

regression, look at the

significance level of each

calculated beta.

Figure 15.3 SPSS

Clickstream for

Multiple Regression

Analysis

Source: Reprint courtesy

of International Business

Machines Corporation,

© SPSS, Inc., an IBM

Company.


Figure 15.4 SPSS

Output for Multiple

Regression Analysis

Source: Reprint courtesy

of International Business

Machines Corporation,

© SPSS, Inc., an IBM

Company.

A trimmed regression

means that you eliminate

the nonsignificant

independent variables and

rerun the regression.

Run trimmed regressions

iteratively until all betas

are significant.

“Trimming” the Regression for Significant Findings

What do you do with the mixed significance results we have just found in our multiple regression analysis? Before we answer this question, you should be aware that this mixed result

is very likely, so understanding how to handle it is vital to developing the ability to perform

multiple regression analysis successfully.

It is standard practice in multiple regression analysis to systematically eliminate one by

one those independent variables that are shown to be insignificant through a process called

trimming. You successively rerun the trimmed model and inspect the significance levels each

time. This series of eliminations or iterations helps to achieve the simplest model by eliminating the nonsignificant independent variables. The trimmed multiple regression model with all

significant independent variables is presented in Figure 15.5.

This trimming process enables the marketing researcher to think in terms of fewer

dimensions within which the dependent variable relationship operates. Successive iterations can cause the R Square value to change somewhat, and it is advisable to scrutinize this value after each run. You can see that the new value is now .236, compared with .235 before trimming, so in our example, there has been very little change. Iterations will also cause the beta values and

the intercept value to shift slightly; consequently, it is necessary to inspect all significance

levels of the betas once again. Through a series of iterations, the marketing researcher finally arrives at the final regression equation expressing the salient independent variables

and their linear relationships with the dependent variable. A concise predictive model has

been found.
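The trimming procedure is a simple loop, and it can be sketched in code independently of SPSS. In the sketch below, `fit` stands for whatever actually runs the regression and reports a significance level for each beta; here it is replaced by a stand-in that returns fixed, made-up p-values, so the variable names and numbers are hypothetical and are not Global Motors results. Dropping the variable with the largest p-value first is an assumption on our part; the chapter says only that nonsignificant variables are eliminated one by one.

```python
def trim_regression(variables, fit, alpha=0.05):
    """Drop nonsignificant variables one at a time, refitting after each drop.

    `fit` is a caller-supplied function that runs the regression on the given
    variable names and returns a dict of {variable name: p-value}.
    """
    kept = list(variables)
    while kept:
        p_values = fit(kept)                 # rerun the (trimmed) regression
        worst = max(kept, key=lambda name: p_values[name])
        if p_values[worst] <= alpha:         # every remaining beta is significant
            break
        kept.remove(worst)                   # eliminate one variable, then refit
    return kept


# Stand-in for a real regression run: fixed, made-up p-values. A real refit
# would return new p-values each time, because betas shift as variables leave.
MADE_UP_P_VALUES = {"x1": 0.01, "x2": 0.03, "x3": 0.40, "x4": 0.20}


def fake_fit(names):
    return {name: MADE_UP_P_VALUES[name] for name in names}


final_variables = trim_regression(list(MADE_UP_P_VALUES), fake_fit)
print(final_variables)  # ['x1', 'x2']
```

Passing the fitting step in as a function keeps the loop itself honest about what it does: it only decides which variable to remove and when to stop, while the statistics come from the regression procedure you actually use.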
