Multidimensional Scaling Tutorial
Multidimensional Scaling Overview
In multidimensional scaling models, we assume the existence of an underlying multidimensional
space that describes the items displayed in it. For example, this space may represent stimuli
(concepts such as brands) and the attributes that describe them, the similarities between the objects
themselves, or even the relationships between the objects and the groups of respondents that scale them.
The objects in such models are represented by points in a space of two or more dimensions. The
dimensions of the space represent attributes that are perceived as characterizing the stimuli or respondents.
Multidimensional scaling is generally characterized by respondent judgments concerning either (1)
the degree of similarity of pairs of stimuli, or (2) the preference for a stimulus as measured on attributes
describing the stimuli. The judgments are measured as variables that are either metric (interval or ratio
scaled) or nonmetric (ordinally scaled). The objective of multidimensional scaling is to display either
the similarity between the objects or the preference for the objects in a space that represents both the set
of objects considered and the underlying rationale for the similarities or preferences. Because the most
common form of multidimensional scaling used in business and marketing is multidimensional preference
analysis, we will consider only this technique and its computer implementation, known as MDPREF.
Multidimensional Analysis of Preference Data
In the analysis of preference data, average preferences are measured for a set of objects that are
evaluated on a set of attributes. This evaluation produces an attribute-by-object matrix of preference
evaluations. Each value in the matrix is an average over the respondents who gave preference
evaluations.
Using the nomenclature of MDPREF, the rows of the input data matrix are subjects, which produce
vectors on the map. These rows are often defined as attributes that describe the stimuli. The
stimuli are the objects defined by the columns of the matrix. In an application that does not involve an
attribute-by-object matrix, the subject vectors could take any number of forms, but they are usually
attributes descriptive of the stimuli (groups, entities, items, points) defined by the columns.
MDPREF is what is known as a vector model. This means that the objective of the MDPREF
analysis is to identify a perceptual map displaying subject (attribute) vectors. The vector model assumes
that preference is linear: it increases steadily in the direction of the subject vector and keeps increasing
without limit as one moves farther along the vector. To form the subject vectors visually, lines are drawn
from the origin of the plot to each subject point. Next, the stimulus (object) points are plotted by MDPREF.
Each stimulus point projects (at a 90 degree angle to the vector) onto each subject vector. This projection
shows the average metric preference for the stimulus with respect to that subject vector.
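The projection step can be illustrated with a minimal sketch in Python. It takes the coordinates of
subject vector 1 from the POPULATION MATRIX and the stimulus coordinates from the STIMULUS MATRIX
(NORMALIZED) in the sample output later in this tutorial. The helper name `project` is hypothetical, and
the pairing of brand names with stimulus numbers assumes the brands are listed in column order.

```python
# Coordinates copied from the sample MDPREF output in this tutorial.
# Subject vector 1 (the "Fruity" attribute) from the POPULATION MATRIX:
fruity = (0.6091, 0.7931)

# STIMULUS MATRIX (NORMALIZED). Assumption: brands are in data-set column order.
stimuli = {
    "Coke": (0.3362, 0.0781),
    "Coke Cl.": (0.4431, 0.1634),
    "Diet Pepsi": (-0.0336, 0.3953),
    "Diet Slice": (-0.4604, 0.0198),
    "Diet 7-up": (-0.4186, 0.2549),
    "Dr Pepper": (-0.0442, -0.0451),
    "Pepsi": (0.4583, 0.0027),
    "Slice": (-0.2090, -0.6446),
    "Tab": (-0.1843, 0.2769),
    "7-up": (0.1125, -0.5013),
}


def project(point, vector):
    """Signed length of the orthogonal projection of `point` onto `vector`."""
    vx, vy = vector
    length = (vx * vx + vy * vy) ** 0.5
    return (point[0] * vx + point[1] * vy) / length


projections = {brand: project(xy, fruity) for brand, xy in stimuli.items()}

# Stimuli sorted from the largest to the smallest projection on the vector.
ranking = sorted(projections, key=projections.get, reverse=True)

for brand in ranking:
    print(f"{brand:<11}{projections[brand]:8.4f}")
```

The projections computed this way can be compared with row 1 of the SECOND SCORE MATRIX in the sample
output, which holds the same quantities for subject 1.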
Operationally, preferences may be measured as a simple ranking (1-8 if 8 items are ranked on
attribute 1), or on a value scale.
TECHNICAL INTRODUCTION
MDPREF is designed to do multidimensional scaling of preference or evaluation data. MDPREF is
a metric model based on a principal components analysis (Eckart-Young decomposition). In this analysis, a
data matrix of dimension i subjects by j stimuli is decomposed into two smaller matrices whose product
approximates the original data matrix in a least squares sense.
The first of these resulting matrices is called a principal component score (or factor score) matrix of
size (i x r), where r is the number of principal components. This matrix depicts the i subjects in the r
principal component dimensions and is designated as [PCS].
The second matrix is called the principal component loading matrix (or factor loading matrix), and is
of size (r x j). This matrix depicts the j stimuli in the r principal component dimensions and is designated as
[PCL].
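As a rough illustration of this decomposition, the sketch below applies NumPy's singular value
decomposition to the example data set given later in this tutorial. It is not the MDPREF code itself.
The variable names and the decision to place the singular values in [PCS] rather than [PCL] are
assumptions of the sketch, and the listed data are rounded to two decimals, so results may differ
slightly from the sample output.

```python
import numpy as np

# Example data set from this tutorial: 8 subjects (attributes) x 10 stimuli (brands).
data = np.array([
    [5.79, 6.49, 5.80, 2.91, 4.29, 4.03, 5.73, 1.38, 5.22, 2.86],
    [3.42, 3.89, 4.87, 5.66, 4.93, 4.36, 3.14, 5.18, 5.24, 3.89],
    [4.68, 5.57, 3.36, 3.47, 3.63, 5.40, 4.61, 4.84, 3.80, 4.50],
    [3.32, 4.24, 5.01, 6.08, 6.22, 4.47, 2.71, 3.73, 5.35, 3.52],
    [4.56, 4.19, 5.56, 5.08, 5.52, 4.77, 4.15, 2.77, 5.24, 2.78],
    [3.35, 2.21, 4.05, 5.86, 6.31, 5.10, 2.24, 5.63, 5.35, 3.98],
    [3.95, 3.70, 5.28, 5.21, 5.61, 4.89, 3.71, 4.03, 5.17, 2.98],
    [3.07, 2.71, 4.73, 6.33, 6.31, 4.24, 3.08, 5.07, 5.12, 4.15],
])

r = 2  # number of principal components retained

# Subtract each subject's (row's) mean, as the sample output does.
scores = data - data.mean(axis=1, keepdims=True)

# Singular value decomposition: scores = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(scores, full_matrices=False)

# Scaling choice (an assumption of this sketch): singular values go into PCS.
PCS = U[:, :r] * s[:r]  # (i x r): subjects in the r component dimensions
PCL = Vt[:r, :]         # (r x j): stimuli in the r component dimensions

# The product is the best rank-r least squares approximation of `scores`.
approx = PCS @ PCL
residual = np.sum((scores - approx) ** 2)

print("PCS shape:", PCS.shape, " PCL shape:", PCL.shape)
print("residual sum of squares:          ", residual)
print("sum of discarded squared sing. values:", np.sum(s[r:] ** 2))
```

The last two printed numbers agree, which is the Eckart-Young property: the squared error of the
rank-r approximation equals the sum of the squared singular values that were dropped.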
The original MDPREF program recognized two forms of input: paired comparisons data and
stimulus evaluation data. The PC-MDS version of MDPREF has deleted the paired comparisons data option
because such data are infrequently collected and used. Originally, MDPREF derived from the
paired-comparison matrices a single matrix called the 'first score matrix', of dimension i rows and j
columns. In the PC version of MDPREF, the first score matrix is the data matrix input by the user, and is
designated S*. Each cell (i,j) of the S* matrix contains a numerical entry that represents the ith subject's
rating of the jth stimulus, as measured by the researcher's survey instrument.
The first score matrix, which again is a subject-by-stimulus matrix of evaluation scores, is
decomposed into r dimensions or principal components, producing the [PCS] and [PCL] matrices
discussed above. Subsequent to this analysis, a second score matrix is produced, having dimensions
(i x j). The second score matrix contains derived projections of stimuli onto subject vectors. The values
of the second score matrix agree as nearly as possible, in a least squares sense, with those of the first
score matrix.
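One way to picture how a second score matrix can arise from the first is sketched below in Python
with NumPy. The sketch assumes, based on the sample output later in this tutorial, that subject
vectors are scaled to unit length and that stimulus coordinates are the unit-norm component
directions. The variable names are hypothetical, this is not the original MDPREF code, and the
two-decimal data listed in this tutorial will not reproduce the printed values exactly.

```python
import numpy as np

# Example data set from this tutorial: 8 subjects (attributes) x 10 stimuli (brands).
raw = np.array([
    [5.79, 6.49, 5.80, 2.91, 4.29, 4.03, 5.73, 1.38, 5.22, 2.86],
    [3.42, 3.89, 4.87, 5.66, 4.93, 4.36, 3.14, 5.18, 5.24, 3.89],
    [4.68, 5.57, 3.36, 3.47, 3.63, 5.40, 4.61, 4.84, 3.80, 4.50],
    [3.32, 4.24, 5.01, 6.08, 6.22, 4.47, 2.71, 3.73, 5.35, 3.52],
    [4.56, 4.19, 5.56, 5.08, 5.52, 4.77, 4.15, 2.77, 5.24, 2.78],
    [3.35, 2.21, 4.05, 5.86, 6.31, 5.10, 2.24, 5.63, 5.35, 3.98],
    [3.95, 3.70, 5.28, 5.21, 5.61, 4.89, 3.71, 4.03, 5.17, 2.98],
    [3.07, 2.71, 4.73, 6.33, 6.31, 4.24, 3.08, 5.07, 5.12, 4.15],
])
r = 2  # dimensions retained

# First score matrix: raw scores minus each subject's (row's) mean.
first = raw - raw.mean(axis=1, keepdims=True)

U, s, Vt = np.linalg.svd(first, full_matrices=False)

# Subject points in r dimensions, then rescaled so each vector has length 1.
subjects = U[:, :r] * s[:r]
lengths = np.linalg.norm(subjects, axis=1, keepdims=True)
unit_subjects = subjects / lengths

# Stimulus points in r dimensions (one row per stimulus).
stimulus_points = Vt[:r, :].T

# Second score matrix: projection of every stimulus on every subject vector.
second = unit_subjects @ stimulus_points.T  # (i x j)

# How closely each subject's projections track that subject's first scores.
for k in range(first.shape[0]):
    corr = np.corrcoef(first[k], second[k])[0, 1]
    print(f"subject {k + 1}: correlation(first, second) = {corr:.4f}")
```

Under these assumptions each row of the second score matrix is the rank-r approximation of the
corresponding first score row divided by the length of that subject's vector, so the two rows differ
only by a scale factor plus the least squares residual.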
MDPREF is valued as an analytical procedure because the resulting values in the [PCS] and [PCL]
matrices project the stimuli onto subject vectors within the multidimensional stimulus attribute space. This
multidimensional space allows for visual evaluation of the j stimuli in an r-dimensional space, where r < j.
[Figure: Sample 3-D plot]
EXAMPLE MDPREF DATA SET
5.79 6.49 5.80 2.91 4.29 4.03 5.73 1.38 5.22 2.86
3.42 3.89 4.87 5.66 4.93 4.36 3.14 5.18 5.24 3.89
4.68 5.57 3.36 3.47 3.63 5.40 4.61 4.84 3.80 4.50
3.32 4.24 5.01 6.08 6.22 4.47 2.71 3.73 5.35 3.52
4.56 4.19 5.56 5.08 5.52 4.77 4.15 2.77 5.24 2.78
3.35 2.21 4.05 5.86 6.31 5.10 2.24 5.63 5.35 3.98
3.95 3.70 5.28 5.21 5.61 4.89 3.71 4.03 5.17 2.98
3.07 2.71 4.73 6.33 6.31 4.24 3.08 5.07 5.12 4.15
Columns (stimuli), left to right: Coke, Coke Cl., Diet Pepsi, Diet Slice, Diet 7-up, Dr Pepper, Pepsi,
Slice, Tab, 7-up
Rows (subjects), top to bottom: Fruity, Carbonation, Calories, Tart, Thirst, Popularity, Aftertaste,
Pick-up
SAMPLE MDPREF OUTPUT
M D P R E F |INPUT DATA: 8 ATTRIBUTES X 10 BRANDS
MULTIDIMENSIONAL ANALYSIS OF PREFERENCE DATA | 8 10 2 2 1 0
PROGRAM WRITTEN BY DR. J. D. CARROLL AND JIH JIE CHANG |(10F5.2)
PC - MDS VERSION |5.79 6.49 5.80 2.91 4.29 4.03 5.73 1.38 5.22 2.86
ANALYSIS TITLE: POP DATA ATTRIBUTE BY OBJECTS IN 2 DIMENSIONS |3.42 3.89 4.87 5.66 4.93 4.36 3.14 5.18 5.24 3.89
DATA IS READ FROM FILE: ATTXBR.POP |4.68 5.57 3.36 3.47 3.63 5.40 4.61 4.84 3.80 4.50
OUTPUT FILE IS: ATTXBR.PRN |3.32 4.24 5.01 6.08 6.22 4.47 2.71 3.73 5.35 3.52
|4.56 4.19 5.56 5.08 5.52 4.77 4.15 2.77 5.24 2.78
NP (NO. OF SUBJECTS) 8 |3.35 2.21 4.05 5.86 6.31 5.10 2.24 5.63 5.35 3.98
NS (NO. OF STIMULI) 10 |3.95 3.70 5.28 5.21 5.61 4.89 3.71 4.03 5.17 2.98
NF (NO. OF DIMENSIONS) 2 |3.07 2.71 4.73 6.33 6.31 4.24 3.08 5.07 5.12 4.15
NFP (NO. OF DIMENSIONS PLOTTED) 2 |
|INPUT SPECIFICATIONS:
IREAD 1=NP X NS SCORE MATRIX WITH ROW MEAN SUBTRACTED 1 | 8 = NUMBER OF SUBJECTS (ATTRIBUTE VECTORS)
2=SAME AS 1 WITH SCORES DIVIDED BY ROW S. D. |10 = NUMBER OF STIMULI (OBJECTS)
|The parameters call for a 2 dimensional principal
NORP 0=NORMALIZE SUBJ. VECTORS 0 |components solution. Two dimensions will be
1=DO NOT |plotted.
|The data form and normalization option are
*****IDENTIFICATION KEY FOR PLOTS WITH IDENTIFIED POINTS*****|specified
|
PT # 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |Codes for reading the two dimensional plots.
CHAR 1 2 3 4 5 6 7 8 9 A B C D E F |
PT # 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 |
CHAR G H I J K L M N O P Q R S T U |
PT # 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 |
CHAR V W X Y Z + / = * & $ @ % ? < |
PT # 46 47 48 49 50 |
CHAR ( ) " ; @ |
POINT NUMBERS ABOVE 50 IDENTIFIED AS >, MULTIPLE POINTS IDENTIFIED AS # |
|
IN JOINT SPACE PLOTS, THE FIRST 10 POINTS ARE STIMULI |
AND THE NEXT 8 POINTS ARE SUBJECTS |
|
INPUT FORMAT = (3X,10F7.4) |
|
MEAN OF THE RAW SCORES (BY SUBJECT) |Mean of the subject variables.
4.4543 4.4624 4.3913 4.4703 4.4666 4.4107|These are the row variables. In the example
4.4561 4.4860 |the means of each of the 8 attributes are given.
|
FIRST SCORE MATRIX (SUBJECT BY STIMULUS) |The original data matrix values minus the subject
1 1.3347 2.0367 1.3527 -1.5423 -.1563 -.4193|(row) mean (i.e., 1.3347 = 5.789 - 4.4543, or in
1.2827 -3.0683 .7737 -1.5943 |other words, fruity/coke - average fruitiness).
. |
. |The first score matrix is a preference score matrix,
. |where each entry is the preference rating made by
. |the ith subject on the jth stimulus.
8 -1.4160 -1.7670 .2510 1.8470 1.8300 -.2400|The first score matrix is decomposed by MDPREF into
-1.3980 .5840 .6370 -.3280 |NF dimensions.
CROSS PRODUCT MATRIX OF SUBJECTS |The cross products matrix is an intermediate matrix
1 24.5385 -6.2786 .7973 -1.8374 7.8784 -14.3155|used in the computation of the subject x subject
.6241 -10.7526 |(attribute x attribute) correlation matrix.
. |
. |
. |
8 -10.7526 8.6405 -6.5557 11.2627 4.1261 15.7736|
7.3943 14.8171 |
|
CORRELATION MATRIX OF SUBJECTS |The subject x subject (attribute x attribute)
1 1.0000 -.4985 .0680 -.1042 .5223 -.6536|correlation matrix is the basis for computing
.0475 -.5639 |the underlying dimensionality of the data matrix.
. |
. |
. |
8 -.5639 .8829 -.7200 .8223 .3520 .9268|
.7249 1.0000 |
|
CROSS PRODUCT MATRIX OF STIMULI |
1 7.6801 9.1070 .1489 -9.9256 -8.0465 -.7597|
9.7955 -5.9361 -3.1618 1.0983 |
. |
. |
. |The eigenvalues or characteristic roots of the
10 1.0983 .5196 -6.0104 -3.2425 -6.6664 -.5405|principal components factor analysis. For principal
3.3761 7.9758 -5.5578 9.0480 |components analysis, the eigenvalues equal the sum
|of the squared correlations (squared loadings)
ROOTS OF THE FIRST SCORE MATRIX |of the subjects (attributes) on stimuli (objects).
62.5203 30.2225 3.3362 2.0768 .9772 .5864|In other words, this is the sum of the r squares
.1766 .0212 |and shows the amount of variance accounted for by
|each component or dimension underlying the principal
PROPORTION OF VARIANCE ACCOUNTED FOR BY EACH FACTOR |components factor analysis.
1 2 |
.6257 .3025 |The proportion of variance accounted for by dimensions
|one and two shows that 62.57% of all variance is
CUMULATIVE PROPORTION OF VARIANCE ACCOUNTED FOR |accounted for by dimension 1 and 30.25% of variance
1 2 |is accounted for by dimension 2.
.6257 .9282 |
|The cumulative sum of variance accounted for shows
SECOND SCORE MATRIX (SUBJECT BY STIMULUS) |that 92.82% of all preference variance is accounted
1 .2667 .3995 .2930 -.2647 -.0529 -.0627|for by the first two dimensions.
.2813 -.6385 .1074 -.3291 |
. |The second score matrix is derived projections of
. |stimuli (objects) onto subject (attribute) vectors.
. |This is as nearly proportional as possible to the
8 -.3323 -.4352 .0518 .4608 .4299 .0421|first score matrix.
-.4577 .1791 .1969 -.1354 |
POPULATION MATRIX |
FACTOR |The population matrix is the dimension 1 and 2 plot
1 .6091 .7931 |projections of subjects (attributes) vectors.
2 -.9974 .0726 |The coordinates of the subject vectors are on the
3 .8406 -.5416 |unit circle (Euclidean distance from origin = 1.0).
4 -.8559 .5172 |
5 -.3318 .9434 |
6 -.9953 -.0972 |
7 -.7584 .6518 |
8 -.9989 .0460 |
|
STIMULUS MATRIX (NORMALIZED) |
FACTOR |The projections of ten stimuli (brands) on to
1 .3362 .0781 |dimensions 1 and 2. These are the coordinates used
2 .4431 .1634 |in graphs 2 and 3.
3 -.0336 .3953 |
4 -.4604 .0198 |
5 -.4186 .2549 |
6 -.0442 -.0451 |
7 .4583 .0027 |
8 -.2090 -.6446 |
9 -.1843 .2769 |
10 .1125 -.5013 |
|
STIMULUS MATRIX (STRETCHED BY SQ. ROOT OF THE EIGENVALUES) |
FACTOR |By stretching stimuli (objects) relative to the
1 2.6584 .4291 |square root of the eigenvalues, the scales are
2 3.5039 .8983 |weighted for the amount of variance explained by
3 -.2659 2.1731 |each dimension (analogy: weighting by importance).
4 -3.6401 .1090 |This matrix is not included in the plots.
5 -3.3101 1.4012 |
6 -.3496 -.2480 |
7 3.6236 .0146 |
8 -1.6522 -3.5437 |
9 -1.4575 1.5225 |
10 .8893 -2.7561 |
|
|
PLOT OF STIMULI AND SUBJECTS IN DIMENSIONS 1 AND 2 |
+....+....+....+....+....+....+....+....+....+....+....+....+ |
. | . |
. | . |
.97+ F | + |1-9 + A = 10 Stimuli (objects)
. | B . |B - I = 8 subject (attribute) vectors
. H | . |
.55+ E | + | The projection of stimuli (objects) on each subject
. 3| . | vector is as similar as possible to the order of
. 5 9 | . | preference expressed by the subject in the original
.14+ C | 1 2 + | preference data.
.---------------I-------4-----60------7-----------------------. |
. G | . |
-.28+ | + |
. | . |
. | A D . |
-.69+ 8 | + |
. | . |
. | . |
-1.11+ | + |
+....+....+....+....+....+....+....+....+....+....+....+....+ |
-2.0 -1.7 -1.3 -1.0 -.7 -.3 .0 .3 .7 1.0 1.3 1.7 2.0 |
|
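
The arithmetic behind the variance proportions and the stretched stimulus coordinates in the output
above can be checked with a short Python sketch. All numbers are copied from the listing. The variable
names are hypothetical, and the stretched values match only to about three decimals because the
printed inputs are themselves rounded.

```python
# Eigenvalues copied from "ROOTS OF THE FIRST SCORE MATRIX" in the sample output.
roots = [62.5203, 30.2225, 3.3362, 2.0768, 0.9772, 0.5864, 0.1766, 0.0212]

# Proportion of variance for each dimension: its root divided by the sum of all roots.
total = sum(roots)
proportions = [root / total for root in roots]

# Running total of the proportions.
cumulative = []
running = 0.0
for p in proportions:
    running += p
    cumulative.append(running)

for dim in range(2):
    print(f"dimension {dim + 1}: proportion = {proportions[dim]:.4f}, "
          f"cumulative = {cumulative[dim]:.4f}")

# Stretching: each normalized coordinate is multiplied by the square root
# of the eigenvalue of its dimension. Stimulus 1, normalized coordinates:
stimulus_1_normalized = (0.3362, 0.0781)
stimulus_1_stretched = tuple(
    coord * root ** 0.5 for coord, root in zip(stimulus_1_normalized, roots)
)
print("stimulus 1 stretched:", [round(c, 4) for c in stimulus_1_stretched])
```

The first two proportions agree with the .6257 and .3025 printed under PROPORTION OF VARIANCE ACCOUNTED
FOR BY EACH FACTOR, and the stretched coordinates agree with the first row of the stretched stimulus
matrix (2.6584, .4291) up to rounding.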