CORRESP performs correspondence analysis using the Carroll, Green and Schaffer (CGS) algorithm for mapping two-way contingency table data. The PC-MDS implementation of this algorithm has been developed to accept large data sets. The CGS algorithm is described in detail in two articles in the Journal of Marketing Research: "Interpoint Distance Comparisons in Correspondence Analysis" (August 1986) and "Comparing Interpoint Distances in Correspondence Analysis: A Clarification" (November 1987).
CORRESP permits squared distance comparisons within rows, within columns, and between rows and columns. The ability to compare distances between rows and columns is a marked improvement over other correspondence analysis scaling approaches.
In non-technical terms, correspondence analysis can be described as a singular value decomposition (as in factor analysis) of a matrix of chi-square distances. The decomposition produces a set of matrices (eigenvalues and eigenvectors) which are applied to the row and column distance matrices to produce the interpoint distances for correspondence analysis mapping.
The correspondence analysis map may be interpreted as a point-point model. The distances between points are, however, in the chi-square metric rather than the Euclidean metric. 3-D GRAPHICS are supported.
Correspondence analysis is a metric method of multidimensional scaling for categorical data. The method has been most recently associated with the work of the French school (Benzécri; Lebart, Morineau, and Warwick; see the PC-MDS program documentation for CORAN).
Normal methods of analysis for contingency table data entail the computation of a chi-square statistic for the data. Although this approach indicates whether the row and column variables are independent, it does not identify the relationships between the rows and columns; more specifically, it does not locate them in a metric space. CORRESP displays these data geometrically as a set of points in two-dimensional space.
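The conventional chi-square computation referred to above can be sketched in a few lines of Python. This is an illustration only, not CORRESP code: the function name chi_square is hypothetical, and the table is the 4 x 3 data set used in the input examples of this documentation.

```python
def chi_square(table):
    """Return (statistic, degrees of freedom, expected frequencies)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    # Expected frequency under independence: row total * column total / N.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # Sum the (O - E)^2 / E terms over every cell.
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof, expected


# 4 x 3 table taken from the input examples in this documentation.
F = [[1, 5, 1], [5, 1, 3], [2, 10, 2], [1, 1, 7]]
statistic, dof, expected = chi_square(F)
print(f"chi-square = {statistic:.3f} on {dof} degrees of freedom")
```

The single statistic summarizes departure from independence but says nothing about which rows and columns lie close together, which is the gap the map is meant to fill.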
Mathematical Algorithm:
Given a two-way contingency table, F, CORRESP normalizes F by pre- and post-multiplying it by diagonal matrices containing the reciprocals of the square roots of the row and column sums. Chi-square distances are then computed on this matrix. It is this matrix of distances that is decomposed, rotated, and centered again to produce the distance matrix.
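Consider a contingency table F with 4 rows and 3 columns.

F can be normalized using the form

$$H = R^{-1/2} F C^{-1/2}$$

where $R^{-1/2}$ is a 4 x 4 diagonal matrix and $C^{-1/2}$ is a 3 x 3 diagonal matrix, whose entries consist of the reciprocals of the square roots of the row and column marginals. The H matrix is next converted to a chi-square distance matrix using the standard $(O-E)^2/E$ formula. (The sample output below prints the signed form of each entry, $(O-E)/\sqrt{E}$, whose square is the corresponding $(O-E)^2/E$ term.)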
H is next decomposed using the singular value decomposition:

$$H = P \Delta Q'$$

with $P'P = Q'Q = I$, where $\Delta$ is a diagonal matrix of singular values.
The correspondence analysis next defines the coordinates of the row and column points, X and Y:

$$X = R^{-1/2} P (\Delta + I)^{1/2}$$

and

$$Y = C^{-1/2} Q (\Delta + I)^{1/2}$$
This transformation stretches the vertical axis so that the resulting configuration is more spherical than the configurations produced by other approaches to correspondence analysis.
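The normalization, decomposition, and coordinate steps above can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the CORRESP source: the function name cgs_coordinates is hypothetical, the row and column counts are used directly as the marginals, the trivial leading singular value of H (always 1) is dropped in place of an explicit centering step, and the rotation performed by the program is omitted. Singular vectors are determined only up to sign, so the axes may be reflected relative to program output.

```python
import numpy as np


def cgs_coordinates(F):
    """Row and column coordinates under the scaling given in the text."""
    F = np.asarray(F, dtype=float)
    r = F.sum(axis=1)                      # row marginals (diagonal of R)
    c = F.sum(axis=0)                      # column marginals (diagonal of C)
    H = F / np.sqrt(np.outer(r, c))        # H = R^{-1/2} F C^{-1/2}

    # Singular value decomposition H = P diag(d) Q'.
    P, d, Qt = np.linalg.svd(H, full_matrices=False)

    # The leading singular value of H is always 1 and reflects only the
    # marginals, so it is dropped. This has the same effect as decomposing
    # the centered residuals instead of H itself.
    P, d, Q = P[:, 1:], d[1:], Qt.T[:, 1:]

    stretch = np.sqrt(d + 1.0)             # diagonal of (Delta + I)^{1/2}
    X = P / np.sqrt(r)[:, None] * stretch  # X = R^{-1/2} P (Delta + I)^{1/2}
    Y = Q / np.sqrt(c)[:, None] * stretch  # Y = C^{-1/2} Q (Delta + I)^{1/2}
    return X, Y, d


# 4 x 3 table taken from the input examples in this documentation.
F = [[1, 5, 1], [5, 1, 3], [2, 10, 2], [1, 1, 7]]
X, Y, d = cgs_coordinates(F)
print("squared singular values:", np.round(d ** 2, 4))
print("row coordinates:\n", np.round(X, 4))
print("column coordinates:\n", np.round(Y, 4))
```

With this normalization, the squared singular values multiplied by the grand total of F sum to the chi-square statistic of the table, which gives a convenient check on any implementation.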
Parameters may be ENTERED from the keyboard or READ as data:
    Press ENTER to READ as data
    OR Type 1 for KEYBOARD ENTRY
        ENTER THE NUMBER OF ROWS IN THE MATRIX (1 to 100)
        ENTER THE NUMBER OF COLUMNS IN THE MATRIX (1 to 100)

ENTER the INPUT FORMAT for reading the data:
    Press ENTER if INPUT FORMAT is READ as part of the data file
    OR Type FREE if data is in free format.
        In FREE format, data points are separated by spaces.

Indicate if ROW AND COLUMN LABELS are desired:
    Press ENTER if NO LABELS are desired
    Type 1 to ENTER LABELS from keyboard
    Type 2 to READ LABELS from datafile
Example 1: All Parameters, Format and Labels are part of the data file.

    4 3
    (3F3.0)
    1 5 1
    5 1 3
    2 10 2
    1 1 7
    ROW 1
    ROW 2
    ROW 3
    ROW 4
    COLUMN 1
    COLUMN 2
    COLUMN 3

Example 2: Data Only. Parameters entered from keyboard, FREE format, No labels, or Labels entered interactively.

    1 5 1
    5 1 3
    2 10 2
    1 1 7
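For readers preparing data files outside the program, the layout of Example 1 can be read with a short Python routine. This is a sketch under assumptions, not a description of how CORRESP itself reads files: the function name parse_corresp_file is hypothetical, the file is assumed to hold the parameter line, the format line, the data rows, and then the row and column labels in that order, and only simple format lines of the form (nFw.d) or the word FREE are handled. The sample string below pads each value to three characters to match (3F3.0).

```python
import re


def parse_corresp_file(text):
    """Parse a data file laid out like Example 1 (assumed layout)."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n_rows, n_cols = (int(tok) for tok in lines[0].split())
    fmt = lines[1].strip()
    data_lines = lines[2:2 + n_rows]

    fixed = re.fullmatch(r"\((\d+)F(\d+)\.(\d+)\)", fmt, flags=re.IGNORECASE)
    if fixed:
        # Fixed-width fields: n values per line, each w characters wide.
        # Implied decimal places (d > 0 with no decimal point) are ignored.
        count, width = int(fixed.group(1)), int(fixed.group(2))
        table = [
            [float(ln[i * width:(i + 1) * width]) for i in range(count)]
            for ln in data_lines
        ]
    else:
        # FREE format: values separated by spaces.
        table = [[float(tok) for tok in ln.split()] for ln in data_lines]

    labels = [ln.strip() for ln in lines[2 + n_rows:]]
    row_labels = labels[:n_rows]
    col_labels = labels[n_rows:n_rows + n_cols]
    return table, row_labels, col_labels


sample = (
    "4 3\n(3F3.0)\n"
    "  1  5  1\n  5  1  3\n  2 10  2\n  1  1  7\n"
    "ROW 1\nROW 2\nROW 3\nROW 4\n"
    "COLUMN 1\nCOLUMN 2\nCOLUMN 3\n"
)
table, row_labels, col_labels = parse_corresp_file(sample)
print(table)
print(row_labels, col_labels)
```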
SAMPLE CORRESPONDENCE OUTPUT
C O R R E S P |
TWO WAY CORRESPONDENCE ANALYSIS |
MATHEMATICAL ALGORITHM BY J.D. CARROLL AND P.E. GREEN |
PC - MDS VERSION |
|
|
ANALYSIS TITLE: H-F JMR POP TEST DATA 34 SUBJECTS X 8 BRANDS |Print out of file information
DATA IS READ FROM FILE: B:HFJMR.DAT |Input file
OUTPUT FILE IS: HFDAT.PRN |Output file
INPUT FORMAT: FREE |"FREE" Format used to read data
|
INPUT DATA AND EXPECTED CHI-SQUARE FREQUENCIES: |
OBSERVED | |
EXPECTED | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8|Print out of observed input
---------+---------+---------+---------+---------+---------+---------+---------+------|data file (read as frequencies)
| 1.00 | .00 | .00 | .00 | 1.00 | 1.00 | .00 | 1.|and the expected value of the
1| .81 | .69 | .32 | .28 | .65 | .44 | .44 | .|cell in the contingency table.
---------+---------+---------+---------+---------+---------+---------+---------+------|
| 1.00 | .00 | .00 | .00 | 1.00 | .00 | .00 | .|
2| .40 | .34 | .16 | .14 | .32 | .22 | .22 | .|(Note there are 8 columns plus
---------+---------+---------+---------+---------+---------+---------+---------+------|a column for row Totals).
| 1.00 | .00 | .00 | .00 | 1.00 | .00 | .00 | .|
3| .40 | .34 | .16 | .14 | .32 | .22 | .22 | .|
---------+---------+---------+---------+---------+---------+---------+---------+------|The Actual data file appears:
| .00 | 1.00 | .00 | 1.00 | .00 | .00 | 1.00 | .| 1 0 0 0 1 1 0 1
4| .61 | .52 | .24 | .21 | .48 | .33 | .33 | .| 1 0 0 0 1 0 0 0
---------+---------+---------+---------+---------+---------+---------+---------+------| 1 0 0 0 1 0 0 0
| 1.00 | .00 | .00 | .00 | 1.00 | .00 | .00 | .| 0 1 0 1 0 0 1 0
5| .40 | .34 | .16 | .14 | .32 | .22 | .22 | .| 1 0 0 0 1 0 0 0
---------+---------+---------+---------+---------+---------+---------+---------+------| 1 0 0 0 1 1 0 0
| 1.00 | .00 | .00 | .00 | 1.00 | 1.00 | .00 | .| 0 1 1 1 0 0 1 0
6| .61 | .52 | .24 | .21 | .48 | .33 | .33 | .| 1 1 0 0 1 1 0 1
---------+---------+---------+---------+---------+---------+---------+---------+------| 1 1 0 0 0 1 1 1
| .00 | 1.00 | 1.00 | 1.00 | .00 | .00 | 1.00 | .| 1 0 0 0 1 0 0 1
7| .81 | .69 | .32 | .28 | .65 | .44 | .44 | .| 1 0 0 0 1 1 0 0
---------+---------+---------+---------+---------+---------+---------+---------+------| 0 1 0 0 0 0 1 0
| 1.00 | 1.00 | .00 | .00 | 1.00 | 1.00 | .00 | 1.| 0 0 1 1 0 1 0 1
8| 1.01 | .86 | .40 | .35 | .81 | .56 | .56 | .| 1 0 0 0 0 1 0 0
---------+---------+---------+---------+---------+---------+---------+---------+------| 0 1 1 0 0 0 1 0
| 1.00 | 1.00 | .00 | .00 | .00 | 1.00 | 1.00 | 1.| 0 0 0 0 1 1 0 0
9| 1.01 | .86 | .40 | .35 | .81 | .56 | .56 | .| 0 1 0 0 0 1 0 0
---------+---------+---------+---------+---------+---------+---------+---------+------| 1 1 0 0 1 0 0 0
| 1.00 | .00 | .00 | .00 | 1.00 | .00 | .00 | 1.| 1 0 0 0 0 0 0 1
10| .61 | .52 | .24 | .21 | .48 | .33 | .33 | .| 1 1 1 0 1 0 0 0
---------+---------+---------+---------+---------+---------+---------+---------+------| 1 0 0 0 1 0 0 0
| 1.00 | .00 | .00 | .00 | 1.00 | 1.00 | .00 | .| 1 0 0 0 1 0 0 0
11| .61 | .52 | .24 | .21 | .48 | .33 | .33 | .| 0 1 0 1 0 0 1 0
---------+---------+---------+---------+---------+---------+---------+---------+------|
---------+---------+---------+---------+---------+---------+---------+---------+------| 1 1 0 0 1 0 0 0
| .00 | 1.00 | .00 | .00 | .00 | .00 | 1.00 | .| 0 1 1 1 0 0 0 0
12| .40 | .34 | .16 | .14 | .32 | .22 | .22 | .| 0 1 0 1 0 0 1 0
---------+---------+---------+---------+---------+---------+---------+---------+------| 0 1 0 0 0 0 1 0
| .00 | .00 | 1.00 | 1.00 | .00 | 1.00 | .00 | 1.| 1 0 0 0 0 1 0 1
13| .81 | .69 | .32 | .28 | .65 | .44 | .44 | .| 1 0 0 0 0 1 0 0
---------+---------+---------+---------+---------+---------+---------+---------+------| 0 1 1 0 0 0 1 0
| 1.00 | .00 | .00 | .00 | .00 | 1.00 | .00 | .| 1 0 0 0 1 0 0 1
14| .40 | .34 | .16 | .14 | .32 | .22 | .22 | .| 0 1 1 0 0 0 1 0
---------+---------+---------+---------+---------+---------+---------+---------+------| 1 0 0 0 1 0 0 1
| .00 | 1.00 | 1.00 | .00 | .00 | .00 | 1.00 | .| 0 1 1 1 0 0 1 0
15| .61 | .52 | .24 | .21 | .48 | .33 | .33 | .| (42 lines of labels would
---------+---------+---------+---------+---------+---------+---------+---------+------| follow if read in)
| .00 | .00 | .00 | .00 | 1.00 | 1.00 | .00 | .|
16| .40 | .34 | .16 | .14 | .32 | .22 | .22 | .| Rows are people,
---------+---------+---------+---------+---------+---------+---------+---------+------| Columns are pop brands:
| .00 | 1.00 | .00 | .00 | .00 | 1.00 | .00 | .| 1= COKE
17| .40 | .34 | .16 | .14 | .32 | .22 | .22 | .| 2= DIET COKE
---------+---------+---------+---------+---------+---------+---------+---------+------| 3= DIET PEPSI
| 1.00 | 1.00 | .00 | .00 | 1.00 | .00 | .00 | .| 4= DIET 7-UP
18| .61 | .52 | .24 | .21 | .48 | .33 | .33 | .| 5= PEPSI
---------+---------+---------+---------+---------+---------+---------+---------+------| 6= SPRITE
| 1.00 | .00 | .00 | .00 | .00 | .00 | .00 | 1.| 7= TAB
19| .40 | .34 | .16 | .14 | .32 | .22 | .22 | .| 8= 7-UP
---------+---------+---------+---------+---------+---------+---------+---------+------|
.
.
---------+---------+---------+---------+---------+---------+---------+---------+------|
| .00 | 1.00 | 1.00 | 1.00 | .00 | .00 | 1.00 | .|
34| .81 | .69 | .32 | .28 | .65 | .44 | .44 | .|
---------+---------+---------+---------+---------+---------+---------+---------+------|
TOTALS | 20.00 | 17.00 | 8.00 | 7.00 | 16.00 | 11.00 | 11.00 | 9.|
|
MATRIX OF CHI-SQUARE DISTANCES |
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8|
---------+---------+---------+---------+---------+---------+---------+---------+------|Matrix of Chi-Square
1| .21 | -.83 | -.57 | -.53 | .44 | .83 | -.67 | 1.|distances. This matrix
---------+---------+---------+---------+---------+---------+---------+---------+------|is computed from a
2| .94 | -.59 | -.40 | -.38 | 1.19 | -.47 | -.47 | -.|transformed
---------+---------+---------+---------+---------+---------+---------+---------+------|(normalized) frequency
3| .94 | -.59 | -.40 | -.38 | 1.19 | -.47 | -.47 | -.| matrix computed by:
---------+---------+---------+---------+---------+---------+---------+---------+------|(R*(-1/2)) * F * (C*(-1/2))
4| -.78 | .68 | -.49 | 1.71 | -.70 | -.58 | 1.15 | -.|
---------+---------+---------+---------+---------+---------+---------+---------+------|where the pre and post
5| .94 | -.59 | -.40 | -.38 | 1.19 | -.47 | -.47 | -.|multiply matrices are
---------+---------+---------+---------+---------+---------+---------+---------+------|diagonal matrices containing
6| .51 | -.72 | -.49 | -.46 | .74 | 1.15 | -.58 | -.|transformed row and
---------+---------+---------+---------+---------+---------+---------+---------+------|column totals
.
---------+---------+---------+---------+---------+---------+---------+---------+------|
34| -.90 | .38 | 1.19 | 1.35 | -.80 | -.67 | .83 | -.|
---------+---------+---------+---------+---------+---------+---------+---------+------|
|
CHI-SQUARE STATISTIC FOR DISTANCE MATRIX = 165.495 |
|
EIGENVALUES OF THE CHI-SQUARE DISTANCE MATRIX |
1 2 3 4 5 6 7 |
------------------------------------------------------------------------------ |The Chi-Square distance
VALUES .7764 | .2669 | .2074 | .1674 | .1481 | .0553 | .0502 | |matrix is analyzed and
% OF TOTAL 46.4444 | 15.9639 | 12.4079 | 10.0168 | 8.8575 | 3.3091 | 3.0002 | |canonical decomposition
CUMULATIVE % 46.4444 | 62.4083 | 74.8162 | 84.8331 | 93.6906 | 96.9998 | 100.0000 | 10|is performed.
------------------------------------------------------------------------------ |Eigenvalues are
|interpreted as in
PLOT COORDINATES OF THE ROW VARIABLES (FIRST 5 DIMENSIONS) |factor analysis.
LABELS DIMENSION |
1 2 3 4 5 |
------------------------------------------------------------------------------ |The plot coordinates
1 1.3059 | .9340 | .4591 | -.2025 | -.0778 | |of the varimax rotated
2 1.4618 | -2.0918 | -.3365 | .5629 | .5660 | |solution are given for
3 1.4618 | -2.0918 | -.3365 | .5629 | .5660 | |the first five
4 -1.8552 | .0009 | .1010 | -.6585 | 2.6364 | |dimensions of the
5 1.4618 | -2.0918 | -.3365 | .5629 | .5660 | |solution.
6 1.3391 | .3441 | -1.5978 | .6987 | .3392 | |These coordinates
7 -1.8405 | -.0443 | .4420 | 1.0455 | .2623 | |identify the points
8 .7709 | .6117 | .0741 | -.5489 | -.1791 | |of both the row and
9 .0540 | 1.1197 | .0708 | -1.7049 | -.3823 | |the column stimuli.
10 1.3767 | -.4932 | 1.9856 | -.5934 | -.0655 | |
11 1.3391 | .3441 | -1.5978 | .6987 | .3392 | |
12 -1.7340 | -.6070 | -1.1036 | -3.1774 | -.3228 | |
13 -.3985 | 2.2392 | 1.6212 | 2.1503 | .0629 | |
14 1.2656 | 2.0544 | -2.0344 | .3681 | .0313 | |
15 -1.7549 | -.4646 | -.2474 | -.0658 | -2.5019 | |
16 1.2898 | 1.0696 | -2.4226 | 1.1650 | .4202 | |
17 -.1378 | 2.2690 | -2.7931 | -.4823 | -.3495 | |
18 .5182 | -1.6205 | -.7129 | -.2697 | .1825 | |
19 1.3221 | .7985 | 3.3408 | -1.5700 | -.5757 | |
20 -.0605 | -1.2603 | -.1684 | 1.3371 | -1.5781 | |
21 1.4618 | -2.0918 | -.3365 | .5629 | .5660 | |
22 1.4618 | -2.0918 | -.3365 | .5629 | .5660 | |
23 -1.8552 | .0009 | .1010 | -.6585 | 2.6364 | |
24 .5182 | -1.6205 | -.7129 | -.2697 | .1825 | |
25 -1.7544 | .1197 | .8365 | 2.8674 | .3701 | |
26 -1.8552 | .0009 | .1010 | -.6585 | 2.6364 | |
27 -1.7340 | -.6070 | -1.1036 | -3.1774 | -.3228 | |
28 1.2459 | 2.2709 | .8537 | -.7232 | -.4220 | |
29 1.2656 | 2.0544 | -2.0344 | .3681 | .0313 | |
30 -1.7549 | -.4646 | -.2474 | -.0658 | -2.5019 | |
31 1.3767 | -.4932 | 1.9856 | -.5934 | -.0655 | |
32 -1.7549 | -.4646 | -.2474 | -.0658 | -2.5019 | |
33 1.3767 | -.4932 | 1.9856 | -.5934 | -.0655 | |
34 -1.8405 | -.0443 | .4420 | 1.0455 | .2623 | |
VARIABLE COORDINATES OF THE COLUMN VARIABLES (5 DIMENSIONS) |
LABELS DIMENSION |
1 2 3 4 5 |
------------------------------------------------------------------------------ |
1 1.2667 | -.5719 | .0235 | -.0958 | .0682 | |
2 -1.2064 | -.3502 | -.6676 | -.7917 | -.2249 | |
3 -1.5830 | -.0929 | .6672 | 2.5197 | -2.6397 | |
4 -1.8482 | .6286 | 1.1432 | 1.7921 | 3.2919 | |
5 1.3094 | -1.5894 | -.3300 | .5564 | .3674 | |
6 .9636 | 2.6944 | -1.8766 | .3970 | -.0441 | |
7 -1.8494 | -.2770 | -.3377 | -1.8087 | -.0235 | |
8 1.0631 | 1.3969 | 3.0195 | -1.1891 | -.5112 | |
------------------------------------------------------------------------------ |
|
*****IDENTIFICATION KEY FOR PLOTS WITH IDENTIFIED POINTS***** |
PT # 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
CHAR 1 2 3 4 5 6 7 8 9 A B C D E F |
|
PT # 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 |
CHAR G H I J K L M N O P Q R S T U |
|
PT # 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 |
CHAR V W X Y Z + / = * & $ @ % ? < |
|
PT # 46 47 48 49 50 |
CHAR ( ) . ; ^ |
POINT NUMBERS ABOVE 50 IDENTIFIED AS >, MULTIPLE POINTS IDENTIFIED AS # |
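The identification key above amounts to a simple lookup from point number to plot character. The following Python sketch reproduces that lookup so that plot symbols can be decoded outside the program; the names PLOT_CHARS and plot_char are hypothetical, and the brand offset of 34 assumes the 34-row sample data set, where column points are numbered after the row points.

```python
import string

# Points 36-50 use these symbols, in the order listed in the key.
SYMBOLS = "+/=*&$@%?<().;^"

# Points 1-9 use digits, points 10-35 use the letters A-Z.
PLOT_CHARS = string.digits[1:] + string.ascii_uppercase + SYMBOLS


def plot_char(point_number):
    """Return the plot character for a 1-based point number."""
    if point_number < 1:
        raise ValueError("point numbers start at 1")
    if point_number > len(PLOT_CHARS):
        return ">"  # the key identifies every point above 50 this way
    return PLOT_CHARS[point_number - 1]


# With 34 row points, the 8 column points are numbered 35 through 42.
brands = ["COKE", "DIET COKE", "DIET PEPSI", "DIET 7-UP",
          "PEPSI", "SPRITE", "TAB", "7-UP"]
for offset, brand in enumerate(brands, start=1):
    print(plot_char(34 + offset), brand)
```

Points that fall on the same print position are shown as # by the program, so the lookup cannot be inverted for those positions.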
DIM 2 |
3.233 +.......+.......+.......+.......+.......+.......+.......+.......+.......+.......+|Plot of Dim. 1 & 2.
3.104 . | .|
2.975 . | .|The points are identified
2.845 . | &(SPRITE) .|by codes.
2.716 . | .|
2.587 + | +|The row points are
2.457 . | .|identified first,
2.328 . D H | S .|starting with 1 and
2.199 . | # .|continuing to the
2.069 . | .|end of the rows.
1.940 + | +|The columns then
1.811 . | .|start with the next
1.681 . | .|alphanumeric char.
1.552 . | @(7-UP) .|contained in the
1.423 . | .|graph legend.
1.293 + | +|
1.164 . 9 G .|Plot labels have
1.035 . | 1 .|been entered using
.905 . | J .|a text editor and
.776 . | .|are not printed
.647 + =(DIET 7-UP) | 8 +|by the program
.517 . | .|
.388 . | # .|
.259 . | .|
.129 . # P | .|
DIM 1 +---------------#---/(DIET PEPSI)-------0---------------------------------------+|
-.129 . | .|
-.259 . $(TAB) +(DIET COKE) | .|
-.388 . # | # .|
-.517 . # | Z(COKE) .|
-.647 + | +|
-.776 . | .|
-.905 . | .|
-1.035 . | .|
-1.164 . K| .|
-1.293 + | +|
-1.423 . | .|
-1.552 . | # *(PEPSI) .|
-1.681 . | .|
-1.811 . | .|
-1.940 + | +|
-2.069 . | # .|
-2.199 . | .|
-2.328 . | .|
-2.457 . | .|
-2.587 + | +|
-2.716 . | .|
-2.845 . | .|
-2.975 . | .|
-3.104 +.......+.......+.......+.......+.......+.......+.......+.......+.......+.......+|
-3.233 -2.587 -1.940 -1.293 -.647 .000 .647 1.293 1.940 2.587 3.|
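The figures in the sample output can be cross-checked with a little arithmetic. In correspondence analysis the eigenvalues sum to the chi-square statistic divided by the grand total of the table, so dividing the printed statistic by the eigenvalue sum should recover the number of observations. The sketch below uses only values printed in the eigenvalue table and the chi-square line of the sample output; the variable names are illustrative. The result of about 99 agrees with the printed column totals if the truncated eighth total is 9.00, which is an assumption.

```python
# Values copied from the sample output above.
eigenvalues = [0.7764, 0.2669, 0.2074, 0.1674, 0.1481, 0.0553, 0.0502]
chi_square_stat = 165.495

total = sum(eigenvalues)

# Percent of total and cumulative percent, as in the eigenvalue table.
percents = [100.0 * value / total for value in eigenvalues]
cumulative = []
running = 0.0
for pct in percents:
    running += pct
    cumulative.append(running)

# Grand total implied by chi-square = N * (sum of eigenvalues).
implied_n = chi_square_stat / total

for value, pct, cum in zip(eigenvalues, percents, cumulative):
    print(f"{value:.4f}  {pct:8.4f}  {cum:8.4f}")
print(f"implied grand total = {implied_n:.2f}")
```

Small differences from the printed percentages are expected because the eigenvalues are shown to only four decimal places.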