Introduction to Factor Analysis

Professor Ron Fricker
Naval Postgraduate School
Monterey, California

3/26/13

Reading: Fricker, Kulzy & Appleget (2012)

Goals for this Lecture
• Learn about factor analysis as a tool for:
  – Deriving unobserved latent variables from observed survey question responses
  – Data reduction
• Understand the steps in conducting factor analysis and the R functions/syntax
• Illustrate the application of factor analysis to survey data

Why Factor Analysis?
• Factor analysis is a method for identifying latent traits from question-level survey data
  – Useful in survey analysis whenever the phenomenon of interest is complex and not directly measurable via a single question
• In such situations, one must ask a series of questions about the phenomenon, then appropriately combine the resulting responses into a single measure or “factor”
  – Such factors, then, become the observed measures of the unobservable or latent phenomenon

Goal of Factor Analysis
• Factor analysis is a hybrid of social and statistical science
• Dates to the early 1900s, when the goal was multivariate data reduction
• Idea is to explain the correlation structure observed in p dimensions via a linear combination of r factors, where:
  – the number of factors is smaller than the number of observed variables (r < p), and
  – the factors achieve both “statistical simplicity and scientific meaningfulness” (Harman, 1976)

Factor Analysis and Survey Data
• Common use of exploratory factor analysis is to “determine what sets of items hang together in a questionnaire” (DeCoster, 1998)
  – Particularly important for instruments with a large number of items (i.e., for data reduction)
  – Also useful when sets of items need to be summarized in terms of their commonalities (i.e., to express results in terms of latent variables)
• Practically, can make interpreting and summarizing (complex) survey results easier / more meaningful / efficient

Three Types of Factor Analysis
• Principal components
  – Empirical data reduction methodology, but not focused on achieving “scientific meaningfulness”
• Exploratory factor analysis
  – Also an empirical data reduction methodology, but one that often does derive scientifically meaningful factors
  – Focus of this lecture
• Confirmatory factor analysis
  – Variety of methods focused on testing hypotheses about structure of factors
  – See Maj Steve Jones’ thesis (2012) for more info.

A Bit About Principal Components
• Standard statistical method for data reduction
• Seeks to explain as much variance as possible in a small number of orthogonal linear combinations of the original data
• Useful when the goal is to reduce the number of variables in a model/analysis while capturing much of the variability
• However, as just stated, resulting components do not necessarily achieve “scientific meaningfulness”
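As a rough illustration of the data-reduction idea, the sketch below runs principal components with base R's prcomp() and reports how much variance the leading components capture. The simulated data, the variable names (x1 to x6), and the seed are assumptions made for illustration; they are not the lecture's survey data.

```r
# Simulated stand-in data (an assumption, not the lecture's survey data):
# 6 variables driven by 2 underlying sources plus noise.
set.seed(1)
n  <- 500
f1 <- rnorm(n)
f2 <- rnorm(n)
dat <- data.frame(
  x1 = f1 + rnorm(n, sd = 0.5), x2 = f1 + rnorm(n, sd = 0.5),
  x3 = f1 + rnorm(n, sd = 0.5), x4 = f2 + rnorm(n, sd = 0.5),
  x5 = f2 + rnorm(n, sd = 0.5), x6 = f2 + rnorm(n, sd = 0.5)
)

# Principal components on centered, standardized variables.
pc <- prcomp(dat, center = TRUE, scale. = TRUE)

# Proportion of total variance captured by each (orthogonal) component,
# and the running total over the leading components.
var_explained <- pc$sdev^2 / sum(pc$sdev^2)
cum_explained <- cumsum(var_explained)
print(round(cum_explained, 3))
```

The components are ordered by variance explained, so a small number of them can stand in for the original variables, but nothing in the method pushes them toward interpretable groupings of questions.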

A Bit About Confirmatory Factor Analysis
• Intended as a way to test theories/hypotheses about factor constructs
• My preference: Whenever possible, test results via reproducibility (on separate data) rather than confirmatory factor analysis (CFA)
  – “Finally, the process of reproducing Factor Analysis on out-of-sample data (the 2011 survey) proved much more useful than conducting CFA. Although CFA most undoubtedly has uses for some models and some data sets, it is neither powerful enough, nor informative enough, to justify its use compared to the reproduction of Factor Analysis” (Jones, 2012).
  ✓ Reproducibility is the appropriate scientific standard and important to do for any statistical analysis

Exploratory Factor Analysis in a Picture
• Example: Six questions that are functions of two underlying (unobserved) factors
[Figure: six observed questions as functions of two unobserved factors]

Mathematically
• The idea is to find a set of r common factors, F1,…,Fr, such that, when they are used to estimate the data, the correlation structure of the estimated data is close to the correlation structure of the actual data
[Equation on slide, annotated with: loadings; common factors; unique loading (and its factor)]
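One common way to write the model that these annotations describe is sketched below. The symbols \lambda_{ij}, \psi_i, and U_i are notation assumed here for illustration, since the slide's own equation did not survive in the text.

```latex
x_i = \lambda_{i1} F_1 + \lambda_{i2} F_2 + \cdots + \lambda_{ir} F_r + \psi_i U_i,
\qquad i = 1, \ldots, p
```

Here x_i is the response to question i, the \lambda_{ij} are the loadings of question i on the common factors F_1, …, F_r, and \psi_i is the unique loading on the question-specific (unique) factor U_i.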

Steps in (Exploratory) Factor Analysis
• Determine the number of factors
  – Seems like a Catch-22 (“How can I know the number of factors if they’re unobserved?”), but there is a way that works well
• Fit the exploratory factor analysis model
• Rotate the model to achieve desired solution
  – Two main approaches: promax and varimax
  – Decide whether to keep all variables in each factor or use a cut-off for the loadings
• Interpret the resulting factors
  – Re-rotate as necessary

Determining the Number of Factors
• Getting the number of factors right is critical
  – Too few and factors load with irrelevant items
  – Too many and items spread out over many factors
  – Both make interpreting the resulting factors hard and may obscure the real underlying factors
• Variety of methods proposed:
  – Kaiser rule, scree plot, etc.
• What works well is parallel analysis
  – Idea: Factors derived from real data should have larger eigenvalues than the corresponding factors derived from simulated data of the same size
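The parallel analysis idea can be sketched in base R as follows. This is a simplified version that compares eigenvalues of the correlation matrix (the principal-components variant) against the average eigenvalues of pure-noise data with the same dimensions; it is not the fa.parallel function used later in the lecture. The simulated items, the number of replicates (n_sims), and the seed are assumptions for illustration.

```r
# Simulated stand-in data (an assumption): 6 items driven by 2 latent factors.
set.seed(42)
n <- 500
p <- 6
f1 <- rnorm(n)
f2 <- rnorm(n)
items <- cbind(
  0.8 * f1 + 0.6 * rnorm(n), 0.8 * f1 + 0.6 * rnorm(n),
  0.8 * f1 + 0.6 * rnorm(n), 0.8 * f2 + 0.6 * rnorm(n),
  0.8 * f2 + 0.6 * rnorm(n), 0.8 * f2 + 0.6 * rnorm(n)
)

# Eigenvalues of the correlation matrix of the actual data.
actual_eig <- eigen(cor(items), symmetric = TRUE, only.values = TRUE)$values

# Eigenvalues from pure-noise data of the same size, averaged over replicates.
n_sims <- 100
sim_eig <- replicate(n_sims, {
  noise <- matrix(rnorm(n * p), nrow = n, ncol = p)
  eigen(cor(noise), symmetric = TRUE, only.values = TRUE)$values
})
sim_mean <- rowMeans(sim_eig)

# Retain the leading factors whose eigenvalues exceed the simulated ones.
exceeds <- actual_eig > sim_mean
n_factors <- if (all(exceeds)) p else which(!exceeds)[1] - 1
print(n_factors)
```

Averaging the simulated eigenvalues is one design choice; a stricter variant would compare against an upper quantile of the simulated eigenvalues instead of the mean.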

Parallel Analysis with QOL Data
• Consider question 7 from QOL survey
  – 5-point Likert rating of 15 NPS services
  – Four of the 15 items were removed because too much data was missing (864, 505, 950, and 818 out of 1,368)

Data Preparation
• Re-coded Likert scale: 1=Very Satisfied to 5=Very Unsatisfied
• Deleted all records where respondents failed to answer one or more of the 11 parts (casewise deletion)
  – Only did it here for convenience to illustrate factor analysis
  – Would have used nearest neighbor hot deck imputation, based on demographics, in a real analysis
• Final result: 11 questions for 555 respondents
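A minimal sketch of these two preparation steps in base R follows. The toy data frame, the column names (svc1 to svc3), and the three middle response labels are hypothetical; only the end points "Very Satisfied" and "Very Unsatisfied" come from the slide.

```r
# Assumed 5-point labels; only the two end points appear in the lecture.
levels5 <- c("Very Satisfied", "Satisfied", "Neutral",
             "Unsatisfied", "Very Unsatisfied")

# Hypothetical toy responses for 3 rated services (not the QOL data).
raw <- data.frame(
  svc1 = c("Very Satisfied", "Neutral", NA, "Unsatisfied"),
  svc2 = c("Satisfied", "Satisfied", "Neutral", NA),
  svc3 = c("Very Unsatisfied", "Neutral", "Satisfied", "Satisfied"),
  stringsAsFactors = FALSE
)

# Re-code each column so that 1 = Very Satisfied, ..., 5 = Very Unsatisfied.
# Missing answers stay NA.
coded <- as.data.frame(lapply(raw, function(col) match(col, levels5)))

# Casewise deletion: keep only respondents who answered every item.
complete <- coded[complete.cases(coded), ]
print(complete)
```

Casewise deletion is the simplest option but discards every respondent with any missing item, which is why the lecture recommends imputation for a real analysis.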

Results for QOL Q7 Data
[Figure: Parallel Analysis Scree Plots. x-axis: Factor Number; y-axis: eigenvalues of principal components and factor analysis. Legend: PC Actual Data, PC Simulated Data, PC Resampled Data, FA Actual Data, FA Simulated Data, FA Resampled Data]
• Indicates 6 factors appropriate for Q7

Fitting the Model
• Idea: Find factors and associated loadings so that the covariance of their linear combination is “close to” the covariance of the original data
  – I.e., find the estimated data $\hat{X} = \hat{\Lambda}\hat{F} + \hat{\Psi}$ so that $\mathrm{cor}(\hat{X}) \approx \mathrm{cor}(X)$
• Mathematics beyond the scope of what we’ll cover today
• Because factors and their loadings are all unknown, there is no unique solution
  – In fact, there are an infinite number of solutions
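The lack of a unique solution can be seen numerically: rotating a loadings matrix by any orthogonal matrix leaves the common part of the implied covariance unchanged. The sketch below assumes uncorrelated, unit-variance factors, so that this common part is the loadings matrix times its transpose. The loadings values and the 30-degree rotation are made up for illustration.

```r
# Hypothetical loadings for 4 items on 2 factors (illustrative values).
Lambda <- matrix(c(0.8, 0.7, 0.1, 0.2,
                   0.1, 0.2, 0.9, 0.6), nrow = 4, ncol = 2)

# Any orthogonal matrix works as a rotation; here, a 30-degree rotation.
theta <- pi / 6
Tmat <- matrix(c(cos(theta), sin(theta),
                 -sin(theta), cos(theta)), nrow = 2, ncol = 2)

# Rotated loadings differ from the originals...
Lambda_rot <- Lambda %*% Tmat

# ...but the common-factor part of the implied covariance is the same.
common     <- Lambda %*% t(Lambda)
common_rot <- Lambda_rot %*% t(Lambda_rot)
print(all.equal(common, common_rot))
```

Since every rotation angle gives an equally good fit, the choice among solutions has to come from a separate rotation criterion such as varimax or promax.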

Fitting the Model in R
• Given the desired number of factors, use the factanal() function in base R
• Basic syntax is factanal(dataframe,nr_factors)
  – Here “dataframe” contains only those variables to be used in the factor analysis
  – And “nr_factors” is an integer
  – Default rotation is varimax, but can also specify promax
    • Varimax results in orthogonal factors
    • Promax allows for correlated factors
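A short sketch of this syntax, using the slide's placeholder names dataframe and nr_factors, is shown below. The simulated data and the item names (q1 to q6) are assumptions for illustration and stand in for real survey responses.

```r
# Simulated stand-in data (an assumption): 6 items, 2 latent factors.
set.seed(123)
n <- 500
f1 <- rnorm(n)
f2 <- rnorm(n)
dataframe <- data.frame(
  q1 = 0.8 * f1 + 0.6 * rnorm(n), q2 = 0.8 * f1 + 0.6 * rnorm(n),
  q3 = 0.8 * f1 + 0.6 * rnorm(n), q4 = 0.8 * f2 + 0.6 * rnorm(n),
  q5 = 0.8 * f2 + 0.6 * rnorm(n), q6 = 0.8 * f2 + 0.6 * rnorm(n)
)
nr_factors <- 2

# Default rotation is varimax (orthogonal factors).
fit_varimax <- factanal(dataframe, factors = nr_factors)

# Promax allows the factors to be correlated.
fit_promax <- factanal(dataframe, factors = nr_factors, rotation = "promax")

# Loadings matrix: one row per item, one column per factor.
print(fit_varimax$loadings)
```

The fitted object also carries the estimated uniquenesses (fit_varimax$uniquenesses), which give the share of each item's variance not explained by the common factors.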

Varimax Rotation
• Varimax finds the rotation that makes the high loadings as high as possible while also making the low loadings as low as possible
• I.e., varimax finds an orthogonal transformation that maximizes a criterion (formula shown on the slide) with two parts:
  – Essentially, the variance of the jth factor’s (rescaled) loadings over the p questions
  – Sum of the “variances” over the r factors
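For reference, the standard textbook form of the (normalized) varimax criterion is sketched below. The slide's own formula did not survive in the text, so the notation here is an assumption: \tilde{\lambda}_{ij} denotes the rotated loading of question i on factor j, rescaled by the square root of that question's communality.

```latex
V = \frac{1}{p} \sum_{j=1}^{r} \left[ \sum_{i=1}^{p} \tilde{\lambda}_{ij}^{4}
    - \frac{1}{p} \left( \sum_{i=1}^{p} \tilde{\lambda}_{ij}^{2} \right)^{2} \right]
```

For each factor j, the bracketed term divided by p is the variance of the squared rescaled loadings over the p questions; the outer sum adds these "variances" over the r factors.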

Example #1: QOL Results
• In the end, I found the following 6 factors using a loadings cut-off of 0.4 (a subjective choice):
  – Exchange and Comm.
  – Fitness Services
  – Healthcare Services
  – Auto Services
  – MWR Services
  – NPS Student Services
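Applying a loadings cut-off can be sketched as below. The loadings matrix, the item names, and the factor names are hypothetical values chosen for illustration; only the 0.4 cut-off comes from the slide.

```r
# Hypothetical rotated loadings for 5 items on 2 factors (illustrative values).
rot_loadings <- matrix(
  c(0.75, 0.62, 0.35, 0.10, 0.05,
    0.08, 0.15, 0.30, 0.81, 0.45),
  nrow = 5, ncol = 2,
  dimnames = list(paste0("item", 1:5), c("Factor1", "Factor2"))
)
cutoff <- 0.4  # the subjective cut-off used on the slide

# For each factor, keep the items whose absolute loading meets the cut-off.
members <- lapply(colnames(rot_loadings), function(f) {
  rownames(rot_loadings)[abs(rot_loadings[, f]) >= cutoff]
})
names(members) <- colnames(rot_loadings)
print(members)
```

With a fit from factanal(), a similar view is available directly via print(fit$loadings, cutoff = 0.4), which blanks out loadings below the cut-off. Note that an item such as item3 here, which loads below the cut-off on every factor, ends up assigned to no factor.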

Compare to Principal Components

Example #1: Discussion
• This is only an illustration
  – Use of casewise deletion was extreme
    • Better to use demographics and nearest neighbor hot deck imputation
  – Also, only running factor analysis on a small subset of the survey questions was extreme
    • Better to run factor analysis on all the questions
  – How might the additional information have affected the factor formulation? What else might have entered into the factors?
• Compared to principal components, the resulting factors are more intuitively interpretable

Example #2: Modeling Trust in Government
• Advance understanding of how citizens’ trust in their government is related to observable / measurable government characteristics
  – Specifically, is the “integrative model of organizational trust” empirically supported by Sahel survey data?
• Assuming trust is related to support, could facilitate insight into causes of unstable governments
  – In particular, what stabilizes a government?
  – Could be useful for determining where and how to apply resources

Trust Critical to Human Interaction
• Psychology: “trust is one of the most important components – and perhaps the most essential ingredient – for the development and maintenance of … well-functioning relationships”
  (Social Psychology: Handbook of Basic Principles, A.W. Kruglanski and E.T. Higgins (eds.), 2007, p. 587)
• International relations: trust within the international system is “the underpinning of all human contact and institutional interaction”
  (Building Trust in Government in the Twenty-first Century: Review of Literature and Emerging Issues, P.K. Blind, 2006, p. 3)
• Counterinsurgency: trust building is the military’s “true main effort: everything else is secondary”
  (Counterinsurgency, D. Kilcullen, 2010, p. 37)

Integrative Model of Organizational Trust
Trust: “the willingness of an individual to be vulnerable to the actions of another party based on the expectation that the other will perform a particular action important to the individual, irrespective of the ability to monitor or control the other party”
Mayer, R.C., Davis, J.H., and F.D. Schoorman (1995). An Integrative Model of Organizational Trust, Academy of Management Review, 20, p. 712.

Definitions
• Ability is defined as “that group of skills, competencies, and characteristics that enable a party to have influence within some specific domain”
  – In this domain: citizens’ confidence that the government is competent in providing desired services
• Benevolence is the “extent to which a trustor believes that a trustee wants to do good for the trustor”
  – In this domain: the belief that the government acts with kindness and goodwill towards its citizens
• Integrity is the trustor’s perception “that the trustee adheres to some set of principles that the trustor finds acceptable”
  – In this domain: citizens’ perception that the government adheres to and supports ethical and socially beneficial principles of governance, including fairness, justice, democracy, etc.

An Empirical Assessment: Is the Model Supported by Data?

The Data: Four National Surveys
• 140 questions common across four countries
• Fielded in 2010 to:
  – 3,770 respondents in Country “A”
  – 1,661 respondents in Country “B”
  – 1,874 respondents in Country “C”
  – 1,481 respondents in Country “D”
• Survey asked about
  – quality of life
  – governance, politics, and international relations
  – security, social tolerance

Example #2 (continued)
• Figure shows the results from fa.parallel for Country A, which resulted in setting r = 27
  – Sensitivity analysis using other values of r confirmed that r = 27 was appropriate
• For Country B: r = 28; for Country C: r = 25; etc.

“Government Trust” Factors
[Figure panels: Country “A”, Country “B”, Country “C”, Country “D”]

“Trustor Propensity” Factors
[Figure panels: Country “A”, Country “B”, Country “C”, Country “D”]

“Ability” Factors
[Figure: Ability Factors & Loadings. Panels: Country “A”, Country “B”, Country “C”, Country “D”]

“Benevolence/Integrity” Factors
[Figure panels: Country “A”, Country “B”, Country “C”, Country “D”]

Model Results
Note: Interaction terms suppressed and significance “rolled up” to main effects for presentation clarity.

Conclusions
• The data empirically support the general form of the Integrative Model of Organizational Trust when applied to government trust
  – Ability and benevolence/integrity are important
  – However, each country does differ in terms of what specifically is important to the citizens in terms of government ability and integrity/benevolence
• But it also is clear that there are other important terms in a model of government trust
  – Raises the question of whether there are other relevant terms we could not assess because the questions were not asked in our surveys

Proposed Integrative Model of Government/Organizational Trust
Factors of perceived trustworthiness:
• Ability: perception that the government is competent at internally providing services desired by citizens and at externally maintaining national security and attracting external aid
  – Internal: essential services; economics; individual safety, security
  – External: national security; aid / assistance (as appropriate)
• Benevolence/Integrity: perception that the government operates according to ethical / socially beneficial principles and promotes societal conditions desired by the citizenry
  – Organizational: democratic; open / transparent
  – Societal: free and fair; peaceful, tolerant
• Reputation: perception that the government conducts effective international relations and fosters national status
  – Effective international relations
  – National status

What We Have Just Learned
• Learned about factor analysis as a tool for:
  – Deriving unobserved latent variables from observed survey question responses
  – Data reduction
• Discussed the steps in conducting factor analysis and the R functions/syntax
• Illustrated the application of factor analysis to survey data