To illustrate the concept, consider a group of subjects who prefer brand 1 to brand 2. Because brand 1 is clearly preferred, the proportion of comparisons in which brand 1 is chosen over brand 2 is near 100%. The situation is more complicated, however, when brand 2 is compared to brand 3. Suppose a smaller percentage, say 40%, prefer brand 2 to brand 3. Because the first proportion lies much farther from 50% than the second, we would expect the scale values of brands 1 and 2 to be farther apart than those of brands 2 and 3. Thurstone's Case 5 scaling provides a means of developing an interval scale from the proportions associated with such comparative judgments.
Thurstone CASE 5 computes scale scores from:
(1) Raw paired comparisons data,
(2) A frequency matrix containing the number of times stimulus (j) (column) was preferred to stimulus (i) (row), or
(3) Ranked data.
The basic reference for algorithm and test problems is:
Allen L. Edwards (1957), "Techniques of Attitude Scale Construction," New York, N.Y.: Appleton-Century-Crofts, Inc., pp. 33,42.
Line 1: PARAMETERS (Input in FREE Format)
1. Number of Respondents:
2. Number of Stimuli: Maximum = 60
3. Input Data Form:
0 = Raw data, i.e., a 1 or 0 for each stimulus pair, in "normal pair ordering", for each respondent. For pair (i,j), a "1" indicates that i was preferred over j; a "0" indicates that i was not preferred over j. Place 1 in the diagonals.
1 = Frequency matrix. Each row of the matrix must be entered as a unit on one or more lines of the data file. Start each row on a new line. Entries indicate the number of times the column stimulus (j) was preferred or rated over the row stimulus.
2 = Ranked data input. Column labels represent the stimulus numbers (in order) and entries are the rank orders of those stimuli. Ties may be entered. See respondents 5-7 in the example below.
3 = Ranked data input. Column labels represent rank order numbers and entries are stimulus numbers.
4. Variable names for scale (If used, Parameter 5, the Output Scale, must be set to 1)
5. Output Scale Parameter:
6. Print Option Parameter:
Line 2: Input FORMAT describing the input data field
(80 columns maximum. F type format required)
DATA: DATA SET IS PLACED AFTER THE FORMAT STATEMENT
Lines 3: Labels (One line for each variable. Max=28 characters)
Lines 4 and 5: Titles (2 Lines, each up to 80 Characters wide)
When the scale graph is printed, the variables are drawn on several different scales. If you make multiple runs on segmented data, the upper limits may differ from run to run, which makes comparison difficult. It is suggested that a single common scale be selected for all segments and the other scales be discarded.
NOTES
1. An excellent discussion of Thurstone's Case 5 scaling is found in Green, P. E. and D. S. Tull, Research for Marketing Decisions, Prentice Hall (1978), pp. 180-187.
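The computation behind the scale values can be sketched in a few lines of Python. This is a minimal illustration, not the program's own code: it follows the steps visible in the sample output for data set #2 (Z matrix, column differences, means, cumulative scale values), and the function name `case5_scale_values` and its arguments are hypothetical. It assumes that proportions outside the printed P limits (.0250 and .9750) are simply skipped, and it uses exact normal deviates.

```python
from statistics import NormalDist


def case5_scale_values(p, low=0.025, high=0.975):
    """Sketch of Case 5 scale values from a proportion matrix.

    p[i][j] is the proportion of judgments in which column stimulus j
    was preferred to row stimulus i. Proportions outside [low, high]
    are treated as missing (an assumption based on the P limits).
    """
    inv_cdf = NormalDist().inv_cdf
    n = len(p)
    # Z matrix: standard normal deviate for each usable proportion.
    z = [[inv_cdf(v) if low <= v <= high else None for v in row] for row in p]
    scale = [0.0]  # the first stimulus anchors the scale at zero
    for j in range(1, n):
        # Column-difference step: column j minus column j-1, row by row.
        diffs = [z[i][j] - z[i][j - 1] for i in range(n)
                 if z[i][j] is not None and z[i][j - 1] is not None]
        # The mean difference is the distance between adjacent stimuli.
        scale.append(scale[-1] + sum(diffs) / len(diffs))
    return scale


# Proportion matrix copied from the sample output for data set #2.
P = [
    [0.50, 0.80, 0.80, 0.75, 0.80],
    [0.20, 0.50, 0.70, 0.65, 0.80],
    [0.20, 0.30, 0.50, 0.70, 0.90],
    [0.25, 0.35, 0.30, 0.50, 0.85],
    [0.20, 0.20, 0.10, 0.15, 0.50],
]
scale = case5_scale_values(P)
print([round(s, 3) for s in scale])
```

The results should agree with the printed FINAL SCALE VALUES to roughly three decimals. Small differences are expected because the listing shows rounded deviates (for example .8420 for a proportion of .80), whereas the sketch uses exact ones. The sketch does not handle a column pair in which every entry is missing.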
CASE 5 SAMPLE DATA SET #1
In this example, 7 brands are evaluated for preference using a paired comparisons task. Respondents were asked which brand they preferred, A or B, for all of the possible brand combinations.
| Pair            | B1/B2 | B1/B3 | B1/B4 | B1/B5 | B1/B6 | B1/B7 | B2/B3 | B2/B4 | B2/B5 | B2/B6 | B2/B7 | B3/B4 | B3/B5 | B3/B6 | B3/B7 | B4/B5 | B4/B6 | B4/B7 | B5/B6 | B5/B7 | B6/B7 |
| Base - Total N  | 27    | 26    | 21    | 25    | 29    | 27    | 25    | 32    | 24    | 26    | 27    | 27    | 22    | 27    | 25    | 26    | 25    | 27    | 26    | 32    | 20    |
| Brand 1 N       | 15    | 15    | 12    | 12    | 15    | 17    | 0     | 0     | 0     | 0     | 0     | 0     | 0     | 0     | 0     | 0     | 0     | 0     | 0     | 0     | 0     |
| Brand 1 Percent | 56%   | 58%   | 57%   | 48%   | 52%   | 63%   | -     | -     | -     | -     | -     | -     | -     | -     | -     | -     | -     | -     | -     | -     | -     |
| Brand 2 N       | 12    | 0     | 0     | 0     | 0     | 0     | 15    | 14    | 9     | 17    | 18    | 0     | 0     | 0     | 0     | 0     | 0     | 0     | 0     | 0     | 0     |
| Brand 2 Percent | 44%   | -     | -     | -     | -     | -     | 60%   | 44%   | 38%   | 65%   | 67%   | -     | -     | -     | -     | -     | -     | -     | -     | -     | -     |
| Brand 3 N       | 0     | 11    | 0     | 0     | 0     | 0     | 10    | 0     | 0     | 0     | 0     | 15    | 12    | 15    | 17    | 0     | 0     | 0     | 0     | 0     | 0     |
| Brand 3 Percent | -     | 42%   | -     | -     | -     | -     | 40%   | -     | -     | -     | -     | 56%   | 55%   | 56%   | 68%   | -     | -     | -     | -     | -     | -     |
| Brand 4 N       | 0     | 0     | 9     | 0     | 0     | 0     | 0     | 18    | 0     | 0     | 0     | 12    | 0     | 0     | 0     | 17    | 13    | 11    | 0     | 0     | 0     |
| Brand 4 Percent | -     | -     | 43%   | -     | -     | -     | -     | 56%   | -     | -     | -     | 44%   | -     | -     | -     | 65%   | 52%   | 41%   | -     | -     | -     |
| Brand 5 N       | 0     | 0     | 0     | 13    | 0     | 0     | 0     | 0     | 15    | 0     | 0     | 0     | 10    | 0     | 0     | 9     | 0     | 0     | 13    | 18    | 0     |
| Brand 5 Percent | -     | -     | -     | 52%   | -     | -     | -     | -     | 63%   | -     | -     | -     | 45%   | -     | -     | 35%   | -     | -     | 50%   | 56%   | -     |
| Brand 6 N       | 0     | 0     | 0     | 0     | 14    | 0     | 0     | 0     | 0     | 9     | 0     | 0     | 0     | 12    | 0     | 0     | 12    | 0     | 13    | 0     | 4     |
| Brand 6 Percent | -     | -     | -     | -     | 48%   | -     | -     | -     | -     | 35%   | -     | -     | -     | 44%   | -     | -     | 48%   | -     | 50%   | -     | 20%   |
| Brand 7 N       | 0     | 0     | 0     | 0     | 0     | 10    | 0     | 0     | 0     | 0     | 9     | 0     | 0     | 0     | 8     | 0     | 0     | 16    | 0     | 14    | 16    |
| Brand 7 Percent | -     | -     | -     | -     | -     | 37%   | -     | -     | -     | -     | 33%   | -     | -     | -     | 32%   | -     | -     | 59%   | -     | 44%   | 80%   |
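These data show that 15 people (15/27=56%) preferred brand 1 over brand 2 and 12 people (12/27=44%) preferred brand 2 over brand 1. These percentages are entered into a square data matrix with the diagonal values equal to .50. Each entry is the proportion preferring the column brand over the row brand, so .44 appears in row 1, column 2 and .56 in row 2, column 1.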
0.50 0.44 0.42 0.43 0.52 0.48 0.37
0.56 0.50 0.40 0.56 0.63 0.35 0.33
0.58 0.60 0.50 0.44 0.45 0.44 0.32
0.57 0.44 0.56 0.50 0.35 0.48 0.59
0.48 0.38 0.55 0.65 0.50 0.50 0.44
0.52 0.65 0.56 0.52 0.50 0.50 0.80
0.63 0.67 0.68 0.41 0.56 0.20 0.50
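As a check on the matrix above, the following Python sketch rebuilds it from the pair counts in the table. It is an illustration only: the names `PAIR_COUNTS`, `percent_half_up` and `proportion_matrix` are hypothetical, and it assumes that percentages are rounded half up to whole percents (so that 15/24 becomes 63%, as in the table).

```python
# (first brand, second brand): (times first preferred, times second preferred)
PAIR_COUNTS = {
    (1, 2): (15, 12), (1, 3): (15, 11), (1, 4): (12, 9), (1, 5): (12, 13),
    (1, 6): (15, 14), (1, 7): (17, 10), (2, 3): (15, 10), (2, 4): (14, 18),
    (2, 5): (9, 15), (2, 6): (17, 9), (2, 7): (18, 9), (3, 4): (15, 12),
    (3, 5): (12, 10), (3, 6): (15, 12), (3, 7): (17, 8), (4, 5): (17, 9),
    (4, 6): (13, 12), (4, 7): (11, 16), (5, 6): (13, 13), (5, 7): (18, 14),
    (6, 7): (4, 16),
}


def percent_half_up(wins, total):
    """Whole percent, rounded half up, using integer arithmetic only."""
    return (200 * wins + total) // (2 * total)


def proportion_matrix(pair_counts, n_stimuli):
    """Build the square matrix: row = brand passed over, column = brand preferred."""
    p = [[0.50] * n_stimuli for _ in range(n_stimuli)]
    for (a, b), (a_wins, b_wins) in pair_counts.items():
        total = a_wins + b_wins
        p[b - 1][a - 1] = percent_half_up(a_wins, total) / 100  # a over b
        p[a - 1][b - 1] = percent_half_up(b_wins, total) / 100  # b over a
    return p


matrix = proportion_matrix(PAIR_COUNTS, 7)
for row in matrix:
    print(" ".join(f"{v:.2f}" for v in row))
```

Integer arithmetic is used for the rounding because Python's built-in `round` rounds exact halves to the nearest even digit, which would turn 15/24 = .625 into .62 rather than .63.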
The resulting data file appears as follows:
7 7 1 1 1 1
(7F5.2)
0.50 0.44 0.42 0.43 0.52 0.48 0.37
0.56 0.50 0.40 0.56 0.63 0.35 0.33
0.58 0.60 0.50 0.44 0.45 0.44 0.32
0.57 0.44 0.56 0.50 0.35 0.48 0.59
0.48 0.38 0.55 0.65 0.50 0.50 0.44
0.52 0.65 0.56 0.52 0.50 0.50 0.80
0.63 0.67 0.68 0.41 0.56 0.20 0.50
Brand 1 Label
Brand 2 Label
Brand 3 Label
Brand 4 Label
Brand 5 Label
Brand 6 Label
Brand 7 Label
Scale:Overall Brand Preference
7 Brands
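A file in this layout can also be written programmatically. The Python sketch below assembles the same lines in the order given earlier (parameter line, FORMAT line, data rows, labels, two titles). The function `build_case5_file` is a hypothetical helper, not part of PC-MDS, and the sketch assumes every data value is below 10 so that two decimals plus a separating blank fill each F5.2 field.

```python
def build_case5_file(params, fmt, rows, labels, titles):
    """Return the text of a CASE 5 input file in the documented line order."""
    if any(len(label) > 28 for label in labels):
        raise ValueError("labels are limited to 28 characters")
    if len(titles) != 2 or any(len(t) > 80 for t in titles):
        raise ValueError("expected two titles of at most 80 characters each")
    lines = [" ".join(str(v) for v in params), fmt]
    # One matrix row per line; "0.50 " occupies one 5-column field.
    lines += [" ".join(f"{v:.2f}" for v in row) for row in rows]
    lines += list(labels)
    lines += list(titles)
    return "\n".join(lines) + "\n"


rows = [
    [0.50, 0.44, 0.42, 0.43, 0.52, 0.48, 0.37],
    [0.56, 0.50, 0.40, 0.56, 0.63, 0.35, 0.33],
    [0.58, 0.60, 0.50, 0.44, 0.45, 0.44, 0.32],
    [0.57, 0.44, 0.56, 0.50, 0.35, 0.48, 0.59],
    [0.48, 0.38, 0.55, 0.65, 0.50, 0.50, 0.44],
    [0.52, 0.65, 0.56, 0.52, 0.50, 0.50, 0.80],
    [0.63, 0.67, 0.68, 0.41, 0.56, 0.20, 0.50],
]
text = build_case5_file(
    params=[7, 7, 1, 1, 1, 1],  # parameter values copied from the sample file
    fmt="(7F5.2)",
    rows=rows,
    labels=[f"Brand {k} Label" for k in range(1, 8)],
    titles=["Scale:Overall Brand Preference", "7 Brands"],
)
print(text, end="")
```

The length checks mirror the limits stated for labels and titles. The parameter values are taken from the sample file as given and are not interpreted by the sketch.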
CASE 5 SAMPLE DATA SET #2
10 05 02 1 1 1
(5F2.0)
0102030405
0504030201
0402010304
0302010405
0202020101
0303030302
0501010204
0502010204
0201030201
0102020204
VAR LABEL 1
VAR LABEL 2
VAR LABEL 3
VAR LABEL 4
VAR LABEL 5
SCALE TITLE LABEL
SUB TITLE LABEL
THURSTONE CASE 5
PC-MDS VERSION
ANALYSIS TITLE: CASE5 TEST DATA
DATA IS READ FROM FILE: CASE5.DAT
OUTPUT PRINT FILE IS: CASE5.PRN
NUMBER OF RESPONDENTS = 10.
NUMBER OF STIMULI = 5
LOW LIMIT OF P = .0250
HIGH LIMIT OF P = .9750
INPUT IS RANKED DATA, COLUMNS REPRESENT STIMULI
(5F2.0)
TALLY OF COMPARISONS
8.0 8.0 7.5 8.0 7.0 6.5 8.0 7.0 9.0 8.5
FREQUENCY MATRIX - NUMBER OF TIMES STIMULUS (J) (COLUMN)
PREFERRED OR RATED OVER STIMULUS (I) (ROW)
1 2 3 4 5
1 5.0000 8.0000 8.0000 7.5000 8.0000
2 2.0000 5.0000 7.0000 6.5000 8.0000
3 2.0000 3.0000 5.0000 7.0000 9.0000
4 2.5000 3.5000 3.0000 5.0000 8.5000
5 2.0000 2.0000 1.0000 1.5000 5.0000
SUMS 13.5000 21.5000 24.0000 27.5000 38.5000
PROPORTION MATRIX - FREQUENCY MATRIX / NUMBER OF RESPONDENTS
1 2 3 4 5
1 .5000 .8000 .8000 .7500 .8000
2 .2000 .5000 .7000 .6500 .8000
3 .2000 .3000 .5000 .7000 .9000
4 .2500 .3500 .3000 .5000 .8500
5 .2000 .2000 .1000 .1500 .5000
SUMS .8500 1.6500 1.9000 2.2500 3.3500
THETA FOR P MATRIX
1 2 3 4
2 26.5650
3 26.5650 33.2108
4 29.9999 36.2711 33.2108
5 26.5650 26.5650 18.4349 22.7864
Z MATRIX -- STANDARD NORMAL DEVIATES CORRESPONDING TO
THE ENTRIES IN THE PROPORTION (P) MATRIX
**** INDICATES CORRESPONDING PROPORTION IS ABOVE THE HIGHER
LIMIT OR BELOW THE LOWER LIMIT OF P
1 2 3 4 5
1 .0000 .8420 .8420 .6740 .8420
2 -.8420 .0000 .5240 .3850 .8420
3 -.8420 -.5240 .0000 .5240 1.2820
4 -.6740 -.3850 -.5240 .0000 1.0360
5 -.8420 -.8420 -1.2820 -1.0360 .0000
SUMS -3.2000 -.9090 -.4400 .5470 4.0020
ZD (COLUMN DIFFERENCE) MATRIX
ENTRIES ARE DIFFERENCES BETWEEN THE INDICATED COLUMN ENTRIES OF THE Z MATRIX
**** INDICATES A MISSING ENTRY IN EITHER COLUMN OF THE Z MATRIX
2- 1 3- 2 4- 3 5- 4
1 .8420 .0000 -.1680 .1680
2 .8420 .5240 -.1390 .4570
3 .3180 .5240 .5240 .7580
4 .2890 -.1390 .5240 1.0360
5 .0000 -.4400 .2460 1.0360
SUMS 2.2910 .4690 .9870 3.4550
N 5 5 5 5
MEANS .4582 .0938 .1974 .6910
*****FINAL SCALE VALUES*****
STIMULUS # 1 2 3 4 5
SCALE VALUE .0000 .4582 .5520 .7494 1.4404
SCALE TITLE LABEL
SUB TITLE LABEL
NUMBER OF SUBJECTS= 10
1.45 - VAR LABEL 5
.75 - VAR LABEL 4
.55 - VAR LABEL 3
.46 - VAR LABEL 2
.00 - VAR LABEL 1
FINAL DRAWING ON 1.50 SCALE
SCALE TITLE LABEL
SUB TITLE LABEL
NUMBER OF SUBJECTS= 10
1.45 - VAR LABEL 5
.75 - VAR LABEL 4
.55 - VAR LABEL 3
.46 - VAR LABEL 2
.00 - VAR LABEL 1
FINAL DRAWING ON 2.00 SCALE
SCALE TITLE LABEL
SUB TITLE LABEL
NUMBER OF SUBJECTS= 10
1.45 - VAR LABEL 5
.75 - VAR LABEL 4
.56 - VAR LABEL 3
.46 - VAR LABEL 2
.00 - VAR LABEL 1
FINAL DRAWING ON 2.50 SCALE
*****INTERNAL CONSISTENCY CHECK*****
DETERMINATION OF HOW WELL OBSERVED PROPORTION MATRIX (P) AGREES WITH
THEORETICAL PROPORTIONS (P-PRIME) CALCULATED FROM DERIVED SCALE VALUES
Z-PRIME MATRIX, - THEORETICAL NORMAL DEVIATES
CORRESPONDING TO SCALE VALUE DIFFERENCES
1 2 3 4
2 -.4582
3 -.5520 -.0938
4 -.7494 -.2912 -.1974
5 -1.4404 -.9822 -.8884 -.6910
P-PRIME MATRIX, - THEORETICAL PROPORTIONS,
CORRESPONDING TO Z-PRIME MATRIX ABOVE
1 2 3 4
2 .3230
3 .2900 .4620
4 .2260 .3850 .4210
5 .0740 .1620 .1870 .2440
THETA PRIME FOR P PRIME MATRIX
1 2 3 4
2 34.6338
3 32.5826 42.8206
4 28.3850 38.3514 40.4545
5 15.7850 23.7340 25.6221 29.6014
CHI SQ = 5.7904 Z VALUE = .0864
DISCREPANCY MATRIX, -- "P" MINUS "P-PRIME"
1 2 3 4
2 -.1230
3 -.0900 -.1620
4 .0240 -.0350 -.1210
5 .1260 .0380 -.0870 -.0940
SUMS .3630 .2350 .2080 .0940
N 4 3 2 1
AVERAGE DISCREPANCY = .0900
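The internal consistency check in the listing above can be approximated with the Python sketch below. It is a hedged reconstruction, not the program's code: the function name `consistency_check` is hypothetical, the theoretical proportions are taken from the exact normal distribution rather than a table, and the chi-square formula (the number of respondents times the sum of squared differences between the arcsine-transformed angles, in degrees, divided by 821) is an assumption that appears consistent with the printed THETA, THETA PRIME and CHI SQ values.

```python
from math import asin, degrees, sqrt
from statistics import NormalDist


def consistency_check(p, scale, n_respondents):
    """Return (average absolute discrepancy, chi-square) for a Case 5 fit."""
    cdf = NormalDist().cdf
    abs_discrepancies = []
    sum_sq_theta = 0.0
    for i in range(1, len(scale)):
        for j in range(i):  # lower triangle only, as in the listing
            z_prime = scale[j] - scale[i]      # theoretical normal deviate
            p_prime = cdf(z_prime)             # theoretical proportion
            abs_discrepancies.append(abs(p[i][j] - p_prime))
            theta = degrees(asin(sqrt(p[i][j])))        # observed angle
            theta_prime = degrees(asin(sqrt(p_prime)))  # theoretical angle
            sum_sq_theta += (theta - theta_prime) ** 2
    avg = sum(abs_discrepancies) / len(abs_discrepancies)
    chi_sq = n_respondents * sum_sq_theta / 821.0  # assumed constant, see lead-in
    return avg, chi_sq


# Proportion matrix and final scale values copied from the listing.
P = [
    [0.50, 0.80, 0.80, 0.75, 0.80],
    [0.20, 0.50, 0.70, 0.65, 0.80],
    [0.20, 0.30, 0.50, 0.70, 0.90],
    [0.25, 0.35, 0.30, 0.50, 0.85],
    [0.20, 0.20, 0.10, 0.15, 0.50],
]
SCALE = [0.0, 0.4582, 0.5520, 0.7494, 1.4404]
avg_disc, chi_sq = consistency_check(P, SCALE, n_respondents=10)
print(round(avg_disc, 4), round(chi_sq, 4))
```

Because the listing shows theoretical proportions to three decimals only, the sketch's results should be close to, but not necessarily identical with, the printed AVERAGE DISCREPANCY and CHI SQ. The printed Z VALUE is not reproduced here.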