Factor Analysis (FA) will be applied to identify the core factors affecting the dependent variable. The technique is appropriate because it requires no pre-specified functional relationships and is a well-known method of data reduction: it reduces a large number of variables to a small number of core factors.

Before we begin, we must decide how to conduct FA, as there are two general approaches: Exploratory Factor Analysis (EFA) and Confirmatory Factor Analysis (CFA). I'll try to explain the difference between CFA and EFA in simple terms.

  • CFA evaluates a priori hypotheses and is largely driven by theory, while EFA identifies factors from the data and seeks to maximize the amount of variance explained.
  • CFA requires the researcher to hypothesize, in advance, the number of factors, whether or not these factors are correlated, and which items or measures load onto and reflect which factors. In EFA, the researcher is not required to have any specific hypotheses about how many factors will emerge or which items or variables these factors will comprise.
  • EFA looks for patterns, while CFA does statistical hypothesis testing on proposed models.
  • If we are unsure of which factors to include in our model, we apply EFA. Once we have eliminated some factors and settled on what to include, we use CFA to test the model formally and see whether the chosen factors are significant.
  • EFA is a data-driven, or inductive, approach: it follows a bottom-up strategy, drawing conclusions from specific observations. CFA is a deductive approach that follows a top-down strategy, in which conclusions are developed from theory.
  • In EFA, we use the data to determine the underlying structure; an orthogonal rotation is typically used, and cross-loadings are permitted as long as they are relatively small. In CFA, we specify the factor structure on the basis of a 'good' theory and then determine whether there is empirical support for it; no rotation is involved, the factors are usually allowed to correlate (comparable to an oblique solution), and cross-loadings are normally fixed at zero.
  • CFA is usually performed with statistical software such as AMOS, LISREL, EQS, or SAS, whereas EFA can be performed simply in SPSS, Stata, or R (for example, through RStudio).
  • To sum up, EFA is a method for finding latent variables in data, usually in data sets with many variables. CFA is a method for confirming that a certain structure in the data is correct; often there is a hypothesized model based on theory, and we want to confirm it.

Here, I illustrate how EFA is performed in SPSS. First, let's take a data set; the example used here is the SALES data file (SALES.sav).

Methodology 

Thirteen variables (X6 to X18) are available for FA, each measured on a 0 to 10 graphical rating scale. These variables measure the perceptions of 100 customers of the company's performance. The sample size of 100, an approximately 8:1 ratio of observations to variables, appears adequate for FA. We now proceed more systematically: we first remove any variables that may cause problems when the factors are interpreted. To do so, we examine the following.
First, we check whether FA is feasible on the data by running the KMO and Bartlett's tests.
Note: The Kaiser-Meyer-Olkin (KMO) statistic is a measure of sampling adequacy that varies between 0 and 1. Values closer to 1 are better, and 0.6 is the suggested minimum. KMO values between 0.8 and 1 indicate that the sampling is adequate; values below 0.6 indicate that it is not and that remedial action should be taken.
Bartlett's Test of Sphericity tests the null hypothesis that the correlation matrix is an identity matrix. A high KMO value together with a small p-value for the sphericity test indicates that the variables are related, which establishes the relevance of factor analysis.
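For readers working outside SPSS, both statistics can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the SPSS implementation: the function name `kmo_and_bartlett` and the simulated demo data are made up for the example. KMO is computed from squared correlations and squared partial correlations, and Bartlett's statistic uses the usual chi-square approximation.

```python
import numpy as np

def kmo_and_bartlett(data):
    """Return overall KMO, per-variable MSA, Bartlett chi-square and its df."""
    data = np.asarray(data, dtype=float)
    n, p = data.shape
    corr = np.corrcoef(data, rowvar=False)

    # Partial correlations come from the rescaled inverse of the correlation matrix.
    inv = np.linalg.inv(corr)
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale

    off = ~np.eye(p, dtype=bool)              # mask selecting off-diagonal cells
    r2 = np.where(off, corr ** 2, 0.0)        # squared correlations
    a2 = np.where(off, partial ** 2, 0.0)     # squared partial correlations

    kmo = r2.sum() / (r2.sum() + a2.sum())
    msa = r2.sum(axis=0) / (r2.sum(axis=0) + a2.sum(axis=0))

    # Bartlett: chi2 = -(n - 1 - (2p + 5) / 6) * ln|R|, with df = p(p - 1) / 2
    _, logdet = np.linalg.slogdet(corr)
    chi2 = -(n - 1 - (2 * p + 5) / 6) * logdet
    df = p * (p - 1) // 2
    return kmo, msa, chi2, df

# Simulated stand-in for survey data: 100 respondents, six items,
# three items per hidden factor (all values are made up).
rng = np.random.default_rng(0)
common = rng.normal(size=(100, 2))
demo = np.hstack([common[:, [0]] + 0.5 * rng.normal(size=(100, 3)),
                  common[:, [1]] + 0.5 * rng.normal(size=(100, 3))])

kmo, msa, chi2, df = kmo_and_bartlett(demo)
print("KMO:", round(kmo, 3))
print("Variable MSA:", np.round(msa, 3))
print("Bartlett chi-square:", round(chi2, 1), "on", df, "degrees of freedom")
```

The p-value of Bartlett's test is obtained by referring the statistic to a chi-square distribution with the reported degrees of freedom.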

The overall MSA (KMO = 0.609) and the p-value of the test of sphericity (0.000) suggest that we can proceed with FA. The following process is followed:
  • Examine the correlation matrix of the 13 variables and the diagonal elements of the anti-image correlation matrix, which are the variable-level MSAs. The SPSS outputs summarized below are obtained when FA is run on the 13 perception variables (X6 to X18) of the SALES data file without any preliminary work.

Table 1.1: Correlation Matrix

In the 13 × 13 correlation matrix, one variable, X15, has no statistically significant correlation (at the 0.05 level) with any other variable. Since such a variable would contribute nothing to the analysis, it is dropped.
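The screening step above can be sketched in Python as follows. This is an illustration only: the function name `unrelated_variables`, the variable names A to D, and the demo data are hypothetical, and the p-value uses the Fisher z approximation rather than the t-based test that SPSS reports (the two are close for samples of about 100).

```python
import math
import numpy as np

def unrelated_variables(data, names, alpha=0.05):
    """Names of variables with no significant correlation with any other variable."""
    data = np.asarray(data, dtype=float)
    n, p = data.shape
    corr = np.corrcoef(data, rowvar=False)
    flagged = []
    for i in range(p):
        significant = False
        for j in range(p):
            if i == j:
                continue
            # Clip to keep atanh finite if two columns are perfectly correlated.
            r = float(np.clip(corr[i, j], -0.999999, 0.999999))
            # Fisher z approximation to the two-sided p-value (an assumption).
            z = math.atanh(r) * math.sqrt(n - 3)
            p_value = math.erfc(abs(z) / math.sqrt(2))
            if p_value < alpha:
                significant = True
                break
        if not significant:
            flagged.append(names[i])
    return flagged

# Made-up data: A, B, and C share a trend, D alternates and is unrelated to it.
t = np.arange(100, dtype=float)
demo = np.column_stack([t, t ** 2, np.sqrt(t), np.where(t % 2 == 0, 1.0, -1.0)])
demo_names = ["A", "B", "C", "D"]

print(unrelated_variables(demo, demo_names))  # ['D']
```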

  • With deletion of variable X15:

Table 1.2: Anti-image Matrix

In the 12 × 12 anti-image correlation matrix, some measures of sampling adequacy (MSA) fall below the acceptance level (< 0.5). The lowest MSA is that of variable X17 (0.444), so this variable is dropped as well. We do not drop X11 at the same time, because variables need to be removed one at a time (starting with the lowest value) to see how the result changes with each deletion.
  • With deletion of variable X17:

Table 1.3: Anti-image Matrix

The minimum value on the principal diagonal of the anti-image correlation (AIC) matrix is 0.509 > 0.5, so no further variable needs to be deleted.
  • In this 11 × 11 anti-image correlation matrix, all MSAs are at an acceptable level. Factor analysis is carried out on these 11 variables.
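The one-at-a-time deletion rule described above can be sketched as a loop. This is a minimal sketch, assuming the 0.5 cutoff used in the text; the function names `variable_msa` and `prune_by_msa`, the item names, and the simulated data are hypothetical and are not taken from the SALES file.

```python
import numpy as np

def variable_msa(corr):
    """Per-variable MSA (diagonal of the anti-image correlation matrix)."""
    inv = np.linalg.inv(corr)
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)
    off = ~np.eye(len(corr), dtype=bool)
    r2 = np.where(off, corr ** 2, 0.0).sum(axis=0)
    a2 = np.where(off, partial ** 2, 0.0).sum(axis=0)
    return r2 / (r2 + a2)

def prune_by_msa(data, names, threshold=0.5):
    """Drop the lowest-MSA variable, recompute, and repeat until all pass."""
    data = np.asarray(data, dtype=float)
    names = list(names)
    dropped = []
    while data.shape[1] > 2:   # with two variables the MSA is 0.5 by construction
        msa = variable_msa(np.corrcoef(data, rowvar=False))
        worst = int(np.argmin(msa))
        if msa[worst] >= threshold:
            break
        dropped.append(names.pop(worst))
        data = np.delete(data, worst, axis=1)
    return names, dropped

# Simulated data: six items on two hidden factors plus two pure-noise items.
rng = np.random.default_rng(1)
common = rng.normal(size=(100, 2))
demo = np.hstack([common[:, [0]] + 0.5 * rng.normal(size=(100, 3)),
                  common[:, [1]] + 0.5 * rng.normal(size=(100, 3)),
                  rng.normal(size=(100, 2))])
demo_names = ["V1", "V2", "V3", "V4", "V5", "V6", "N1", "N2"]

kept, dropped = prune_by_msa(demo, demo_names)
print("Kept:", kept)
print("Dropped:", dropped)
```

Recomputing the MSAs after every deletion matters because removing one variable changes the partial correlations, and therefore the MSAs, of all the others.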
Variable X11 shows relatively high cross-loadings on two factors, F1 and F4 (0.591 on F1 and 0.642 on F4), which complicates the interpretation of the factors. This variable is therefore dropped as well.
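A cross-loading check of this kind can be sketched in a few lines of Python. The 0.5 cutoff, the function name `cross_loaded`, the variable `X_a`, and the F2 and F3 loadings shown for X11 are assumptions made for illustration; only the two X11 loadings on F1 and F4 come from the text.

```python
def cross_loaded(loadings, cutoff=0.5):
    """Return variables with two or more absolute loadings at or above the cutoff."""
    flagged = []
    for name, row in loadings.items():
        high = [value for value in row if abs(value) >= cutoff]
        if len(high) >= 2:
            flagged.append(name)
    return flagged

# Loadings on F1..F4. X11's F1 and F4 values are from the text; the rest are made up.
loadings = {
    "X11": [0.591, 0.10, 0.05, 0.642],
    "X_a": [0.85, 0.12, 0.03, 0.10],   # hypothetical variable with a clean structure
}

print(cross_loaded(loadings))  # ['X11']
```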
  • Finally, 10 variables are retained for factor analysis. The 10 eigenvalues of the 10 × 10 correlation matrix R, the percentage of the variance of R, and the cumulative percentage of variance explained by each eigenvalue are presented in Table 1.5.
Table 1.5: Eigenvalues and their contributions
  • The number of factors turns out to be four (eigenvalues > 1), and they explain around 30.9, 22.7, 16.6, and 10.4 per cent of the total variance, respectively. Together, these four eigenvalues explain around 80.6 per cent of the total variance.
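The eigenvalue rule can be sketched with NumPy. Because the eigenvalues of a correlation matrix sum to the number of variables, each percentage is simply the eigenvalue divided by that number; with 10 variables, a share of 30.9 per cent corresponds to an eigenvalue of about 3.09. The function name `kaiser_summary` and the small block-structured correlation matrix below are assumptions for illustration, not the SALES data.

```python
import numpy as np

def kaiser_summary(corr):
    """Eigenvalues (largest first), % variance, cumulative %, and count above 1."""
    corr = np.asarray(corr, dtype=float)
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]   # eigvalsh returns ascending order
    pct = 100 * eigenvalues / corr.shape[0]
    cumulative = np.cumsum(pct)
    n_factors = int((eigenvalues > 1).sum())       # eigenvalue-greater-than-1 rule
    return eigenvalues, pct, cumulative, n_factors

# Made-up 6 x 6 correlation matrix: two groups of three variables,
# correlated 0.6 within a group and 0 across groups.
block = np.full((3, 3), 0.6)
np.fill_diagonal(block, 1.0)
corr = np.block([[block, np.zeros((3, 3))],
                 [np.zeros((3, 3)), block]])

eigenvalues, pct, cumulative, n_factors = kaiser_summary(corr)
print("Eigenvalues:", np.round(eigenvalues, 2))    # two of 2.2, four of 0.4
print("% of variance:", np.round(pct, 1))
print("Cumulative %:", np.round(cumulative, 1))
print("Factors retained:", n_factors)              # 2
```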

The overall Measure of Sampling Adequacy (MSA), as given by the Kaiser-Meyer-Olkin (KMO) statistic, is 0.669, which is above the acceptance level. The p-value of Bartlett's test of sphericity is 0.000, which indicates that the correlation matrix differs from the identity matrix, a desirable property for factor analysis. These two results suggest that we can go ahead with FA.

Main Results of FA 

Four factors have been identified in the present FA, since there are four eigenvalues greater than 1. The factor loadings of the 10 variables on these four factors, the communalities of the 10 variables, and the contribution of each factor (in absolute and percentage terms) are summarized in Table 1.7, where underlined loadings are relatively higher than the other loadings within each column.

Table 1.7: Rotated Component Matrix

The main findings are as follows:

  • The lowest communality is 0.585, for the variable Advertising, which means the four factors extract around 58.5 per cent of the total variance of Advertising. The highest communality is 0.894, for the variables Delivery Speed and Technical Support, which means the four factors extract around 89.4 per cent of the total variance of each of these variables.
  • The four factors F1, F2, F3, and F4 extract 25.9, 22.2, 18.5, and 14.1 per cent of the total variance, respectively, and the four factors together extract 80.6 per cent.
  • Three variables (Complaint Resolution, Delivery Speed, and Order & Billing) constitute the first factor, F1, since their loadings on F1 are markedly high. This factor reflects the post-sale performance of the company, so it is named Post Sale Customer Service. Similarly, F2, F3, and F4 are named Marketing, Technical Support, and Product Value, respectively.
  • The four factors correspond to four dimensions, which encompass a wide range of elements in the customer perception, from the tangible product attributes (Product Value) to the relationship with the company (Customer Service and Technical Support) to even the outreach efforts (Marketing). 
  • Business planners within the company can now discuss plans revolving around these four areas instead of having to deal with the separate variables.
  • Factor analysis also provides the basis for data reduction through either summated scales or factor scores. These new composite variables rather than the individual variables can be used for various analyses.
  • The correlation matrix of the 10 variables, arranged by factor group, is summarized in Table 2. Variables within a factor group are strongly correlated, while variables in different factor groups are weakly correlated. This is the desired structure for the correlation matrix.
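The communalities and the variance extracted by each factor, as reported in the findings above, both come from the squared loadings: summing across a row gives a variable's communality, and summing down a column gives the variance extracted by that factor. A minimal sketch follows, using a hypothetical loading matrix of four variables on two factors (not the values of Table 1.7).

```python
import numpy as np

# Hypothetical rotated loadings: rows are variables, columns are factors.
loadings = np.array([[0.9, 0.1],
                     [0.8, 0.2],
                     [0.1, 0.7],
                     [0.2, 0.6]])

squared = loadings ** 2
communalities = squared.sum(axis=1)      # share of each variable's variance explained
factor_variance = squared.sum(axis=0)    # variance extracted by each factor
pct_variance = 100 * factor_variance / loadings.shape[0]

print("Communalities:", np.round(communalities, 2))       # about 0.82, 0.68, 0.50, 0.40
print("Variance by factor:", np.round(factor_variance, 2))  # about 1.50, 0.90
print("% of total variance:", np.round(pct_variance, 1))    # about 37.5, 22.5
```

The communalities and the factor variances add up to the same total, which is why the four factors in the text jointly account for the same share of variance before and after rotation.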

Table 2: Correlation Within and Between Factor Groups

Finally, it remains to examine the reliability of each factor. Before doing so, the perception score on competitive pricing needs to be reversed, since it is negatively correlated with product quality. The value of Cronbach's alpha for each factor is summarized below.
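The reverse-scoring step and the reliability coefficient can be sketched as follows. This is an illustration with made-up scores for five respondents, not the SALES data; the function name `cronbach_alpha` is hypothetical, and reversal is taken as 10 minus the score because the items are rated on a 0 to 10 scale.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for a respondents-by-items score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()   # sum of item variances
    total_variance = items.sum(axis=1).var(ddof=1)     # variance of the summed scale
    return k / (k - 1) * (1 - item_variances / total_variance)

# Made-up 0-10 ratings from five respondents.
quality = np.array([8, 6, 9, 4, 7])
pricing = np.array([3, 5, 2, 6, 4])     # negatively related to quality

alpha_raw = cronbach_alpha(np.column_stack([quality, pricing]))
alpha_reversed = cronbach_alpha(np.column_stack([quality, 10 - pricing]))

print("Alpha without reversing:", round(alpha_raw, 2))       # negative
print("Alpha after reversing:", round(alpha_reversed, 3))    # about 0.984
```

Without the reversal, the negative correlation between the two items cancels out in the summed scale, which drives alpha below zero.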


Conclusion

Factor analysis has thus identified four core factors that affect the performance of the sales department in the company. They can be categorized as follows:
1. Post Sale Customer Service
2. Marketing
3. Technical Support, and
4. Product Value.
These factors can now be used for further analysis, as per the objectives of our study.
