| REQUIRED FILES | SOURCE OF FILE |
| PC-MDS COMMAND FILE => Is created with the PC-MDS EDITOR or your word processor. | |
| See "THE PC-MDS COMMAND FILE" documentation. Unless new variable transformations are required for different analyses, the command file is the same for the FACTOR, REGRESSION, DISCRIMINANT AND FREQUENCY programs. | |
| ASCII DATA FILE => Is created using the PC-MDS EDITOR or your word processor, saving the file as a DOS text file. | |
| The Data File may be downloaded from a mainframe computer or produced by another computer program. The Data File MUST be an ASCII file (numbers and letters only). ASCII Data Files do not use commas and do not require spaces to separate variables. | |
Example:
1775113118861141175114117113115817711771114711567430829120987615347821096453
05720938761452209876154360295318742198763450305982174610987624531098762453
057330987416526644142666441646642235132432343324323433142244346643236243234333
0574544334352542354322111023313435 57 1 2 2 2 2
| The same data file is used for all analyses. | |
| PC-MDS OUTPUT FILE => Is automatically created by PC-MDS. | |
| The name is specified interactively when running each PC-MDS program. | |
The PC-MDS Command File
The command file defines the various variables, their format and locations, defines missing values for variables and recodes the values of the variables, if desired. The same command file may be used for all analyses.
For purposes of clarification, the command files are designated as files with an "SPS" extension (i.e., *.SPS). The name of the PC-MDS command file is specified interactively by the user when each program is run. (Note that the .SPS designation is used for instructional clarity only. The command file may have any name and does not require the .SPS extension).
EXAMPLE # 1
TITLE SAMPLE COMMAND FILE
FILE NAME 'SAMPLE.DAT'
DATA LIST V1 TO V20, V21
21 (3X,F4.0,19F4.1,F4.0)
VARIABLE LABELS
 V1 'CITY'
 V2 'SEX'
 V3 'FIRST NEWS SOURCE'
 V4 'SECOND NEWS SOURCE'
 V5 'THIRD NEWS SOURCE'
 V20 'IMP ADS'
 V21 'READ HERALD'
COMPUTE V1 = 20*0.5
COMPUTE V2 = 81**.5
IF(V1 EQ 10) V33=1
IF(V2 EQ 9) V34=2
IF(V1 EQ 10) V35=1
COMPUTE V29 = COS(0.5+0.5*2)
COMPUTE V39 = COS(0.5+0.5*2)
VARIABLE LABELS
 V29 'AGE'
 V39 'INCOME'
MISSING VALUES V29 (0)/V21(1110)/V1 TO V19 (5,55)
RECODE V5,V7 TO V10 (LOW TO 10 = 55) (LOW TO 20 = 66)/
 V12 TO V15 (1 TO HIGH = 5) /
 V19 TO V20 (1,2,3 = 7) (4,5,6= 8)
EXAMPLE # 2
TITLE 7-11 CREDIT CARD ANALYSIS
FILE NAME 'SEVEN.DAT'
DATA LIST V1 TO V30
30 (12F1.0,F2.0,2F1.0,9F3.1,5F1.0,F3.0)
VARIABLE LABELS
 V1 'AVERAGE WEEKLY VISITS'
 V2 'AVERAGE WEEKLY GAS PURCHASES'
 V3 'GAS QUALITY IMPORTANCE'
 V4 '7-11 GAS QUALITY IMPRESSION'
 V5 'SERVICE QUALITY'
 V6 'MERCHANDISE QUALITY'
 V7 'HIGHEST QUALITY GAS'
 V8 'CREDIT CARD FOR GAS'
 V9 'MOST COMMON GAS PAYMENT'
 V10 'SECOND GAS PAYMENT CHOICE'
 V11 'HOW OFTEN GAS CREDIT CARD PAID'
 V12 'HOW FAR AWAY FROM 7-11'
 V13 'AGE'
 V14 'SEX'
IF (V2 LE 1) NEWVAR=1
IF ((V2 GT 2) AND (V4 LE 3)) NEWVAR=2
IF ((V2 GT 5) AND (V4 GT 3)) NEWVAR=3
COMPUTE V31 = COS(0.5+0.5*2)
COMPUTE V32 = LG10(V1)
MISSING VALUES V1 TO V12,V14 TO V29(0)
RECODE V13 (0 TO 17=1) (18 TO 23=2) (24 TO 29=3) (30 TO 39=4) (40 TO 49=5) (50 TO 64=6) (65 TO 72=7)/
 V14 (4=0) (6=0)
SELECT IF ( V14 EQ 2)
The Data File
The data file contains the data in the format described in the Command File. The data files are usually named with a "DAT" extension (i.e., *.DAT). The example command file above identifies the data file as 'SAMPLE.DAT'. The data file is specified in line 2 of the command file (the FILE NAME command).
(Note that the .DAT designation is used for instructional clarity only. The data file may have any name and does not, in reality, require the .DAT extension).
533351221103241110 41112307
4224432141111721 5 16 41122308
3233532212122011 5 41121309
544342221505271110 41112310
5423511251013222 5 16 41122311
452342125101291311 6 41121312
322343215111311312 11 41121313
5544532115513113 6 32 41121314
4335524111312913107 12 41121315
1111542011143021 5 6 541121316
2334421112223613208 41122317
4244522151413014115 7 3 2 541122318
342342121502261215 41122319
The Output File
An output file name must be interactively specified by the user while running each of the PC-MDS programs. The output file is the file to which the analysis is printed. A common convention is to name the file with a "PRN" extension to signify a print file (i.e., *.PRN).
1. PRINTBACK (OPTIONAL)
Use: To print the command file as part of the output file. The default is PRINTBACK YES
PRINTBACK NO (Does not print command file)
2. TITLE (REQUIRED)
Use: To specify a title.
The word TITLE is followed by one or more blank spaces and a title (50 characters maximum length). For example:
TITLE MATURE MARKET STUDY
3. FILE NAME (REQUIRED)
Use: To specify the name and location of the data file.
Specify the complete path for reading the data file. The path and file name must be placed in single quotes. The entire path and file name is limited to 50 characters. Generic examples of the FILE NAME command include:
FILE NAME 'C:\SUB1\FILENAM.EXT' or
FILE NAME 'FILENAM.EXT'
In the first example, the data file is called FILENAM.EXT and is found in the subdirectory called
SUB1 within the C drive. In the second example the data file is found in the default directory and
is called FILENAM.EXT.
4. DATA LIST (REQUIRED)
Use: To specify the variables to be read in from the data file, as well as the column in which each variable is found in the data file.
The DATA LIST command identifies the variables included in the data file. The data file name is specified by the FILE NAME command. The DATA LIST has two parts. The first part specifies the list of variables. The second part specifies how the data is to be read. A DATA LIST might appear as follows:
DATA LIST V17,V28 TO V37,V40,V41,V49,V108 TO V116,
V145,V146,V148,CLUS1
V149,V151,V152,CLUS2
V155,V156,V158,CLUS3
14 (34X,F1.0,10X,9F1.0,F2.0,3X,2F1.0,7X,F1.0)
9 (57X,9F1.0)
4 (25X,2F1.0,2X,F1.0/19X,F1.0)
The words DATA and LIST must be separated by a space and followed by a space. The DATA LIST specification of variables can continue on to several lines of the file, as long as the first columns are blank. The variable names may be up to 8 characters long and must start with a letter (Variable names cannot start with a number). The "TO" convention may be used to specify a continuously numbered sequence. For example, V28 TO V37 is used to specify 10 variables. The word "TO" must be preceded and followed by a blank space. Up to 500 total variables may be defined in the Command file using compute or IF statements. The maximum number of variables that can be included in a given analysis varies by analysis. For example, 50 of the 500 defined variables may be specified interactively for inclusion in the DISCRIM, FACTOR, and REGRESS programs.
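The "TO" convention can be illustrated with a short Python sketch. This is not part of PC-MDS; the function name `expand_variable_list` is hypothetical, and the sketch assumes that both ends of a "TO" range share the same letter prefix and differ only in their trailing number.

```python
import re

def expand_variable_list(text):
    """Expand a DATA LIST style variable list, including 'A TO B' ranges.

    Assumption: both ends of a range share a prefix (e.g. V28 TO V37).
    """
    tokens = text.replace(",", " ").split()
    names = []
    i = 0
    while i < len(tokens):
        if i + 2 < len(tokens) and tokens[i + 1].upper() == "TO":
            first = re.fullmatch(r"([A-Za-z]\w*?)(\d+)", tokens[i])
            last = re.fullmatch(r"([A-Za-z]\w*?)(\d+)", tokens[i + 2])
            if not (first and last and first.group(1) == last.group(1)):
                raise ValueError("range ends must share a prefix")
            prefix = first.group(1)
            start, stop = int(first.group(2)), int(last.group(2))
            names.extend(f"{prefix}{k}" for k in range(start, stop + 1))
            i += 3
        else:
            names.append(tokens[i])
            i += 1
    return names

variables = expand_variable_list(
    "V17,V28 TO V37,V40,V41,V49,V108 TO V116, V145,V146,V148,CLUS1"
)
print(len(variables))
```

Under these assumptions, V28 TO V37 expands to 10 names, and the list above expands to 27 names in total.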
The second part of the DATA LIST is the input format for reading each of the variables identified in the first part. The format specification may be up to 15 lines in length (This is 15 lines of specification, not 15 lines per respondent in the data file). Each line begins in column 1 by indicating the number of variables to read. This number is followed by one or more blank spaces and then the FORTRAN format statement for reading the number of variables specified. Note that each format line may specify more than one line of data in the data file. The format statement in each format line is limited to 120 characters in length. Examples of valid statements include:
14 (34X,F1.0,10X,9F1.0,F2.0,3X,2F1.0,7X,F1.0)
9 (57X,9F1.0)
4 (25X,2F1.0,2X,F1.0/19X,F1.0)
The example is a valid DATA LIST sequence for reading 27 variables from 4 lines in the data file. Note that the last format specification reads from two lines within the data file (the "/" causes a skip to the next line of data).
Preparing A Format Statement: Format statements tell PC-MDS how the data file is to be read
for a single observation or respondent. Format statements are prepared using the standard
FORTRAN language coding conventions. These conventions are quite simple and easy to learn
and have previously been discussed in this manual.
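To make the reading rules concrete, the following Python sketch interprets a small subset of FORTRAN format statements (only the nX and Fw.d descriptors and the "/" record skip used in the examples above). It is an illustration, not the PC-MDS reader; the names `expand_format` and `read_fields` are hypothetical, and returning `None` for a blank field is an assumption of this sketch.

```python
import re

def expand_format(spec):
    """Expand e.g. '(25X,2F1.0,2X,F1.0/19X,F1.0)' into one item list per data line.

    Items are ('X', width) skips or ('F', width, decimals) numeric fields.
    """
    body = spec.strip()
    if body.startswith("(") and body.endswith(")"):
        body = body[1:-1]
    records = []
    for part in body.split("/"):              # "/" skips to the next data line
        items = []
        for token in (t.strip() for t in part.split(",")):
            if not token:
                continue
            skip = re.fullmatch(r"(\d*)X", token)
            if skip:
                items.append(("X", int(skip.group(1) or 1)))
                continue
            field = re.fullmatch(r"(\d*)F(\d+)\.(\d+)", token)
            if not field:
                raise ValueError(f"unsupported descriptor: {token}")
            repeat = int(field.group(1) or 1)
            items.extend([("F", int(field.group(2)), int(field.group(3)))] * repeat)
        records.append(items)
    return records

def read_fields(spec, lines):
    """Read numeric fields from consecutive data lines according to spec."""
    values = []
    for items, line in zip(expand_format(spec), lines):
        pos = 0
        for item in items:
            if item[0] == "X":
                pos += item[1]
                continue
            _, width, decimals = item
            text = line[pos:pos + width].strip()
            pos += width
            if not text:
                values.append(None)           # assumption: blank means no value
            elif "." in text:
                values.append(float(text))    # explicit decimal point wins
            else:
                values.append(int(text) / 10 ** decimals)  # implied decimals
    return values

formats = ["(34X,F1.0,10X,9F1.0,F2.0,3X,2F1.0,7X,F1.0)",
           "(57X,9F1.0)",
           "(25X,2F1.0,2X,F1.0/19X,F1.0)"]
field_counts = [sum(1 for rec in expand_format(f) for it in rec if it[0] == "F")
                for f in formats]
line_count = sum(len(expand_format(f)) for f in formats)
```

For the three format lines shown above, `field_counts` comes out as 14, 9 and 4 and `line_count` as 4, which matches the 27 variables read from 4 data lines.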
5. VARIABLE LABELS (OPTIONAL)
Use: To identify labels defining the variables.
Contains a label (up to 40 characters) identifying variables from the DATA LIST.
Labels must follow a valid variable name. At least one space must appear between the variable name and the label. The label must be enclosed in single quotes. The variable names must start in column two or beyond. VARIABLE LABELS may be declared for new variables after they have been created using COMPUTE or IF statements.
VARIABLE LABELS
V17 'OVERALL HEALTH CONDITION'
V28 'ABILITY TO PERFORM EVERYDAY ACTIVITIES'
V29 'ABILITY TO...TRANSPORTATION'
V30 'ABILITY TO...HOUSE CLEANING'
CLUS 'CLUSTER GROUPING'
6. MISSING VALUES (OPTIONAL)
Use: To specify variable values that are missing or are to be otherwise excluded from the analysis.
The missing values are used to specify those variables for which values are deemed to be missing. The convention used to specify MISSING VALUES is to list the variable names using either individual names, or a "TO" argument which specifies a range of variables for which values are missing. The variable list is followed by a list of the values to be declared as missing.
The keywords "LO" and "HIGH" are valid operators. The slash "/" is used to separate the variable lists having different values declared as missing. The values to be declared as missing are identified within parentheses. If multiple values are declared as missing, they must be separated by commas. Two examples follow:
MISSING VALUES V17,V28 TO V37,V40,V41,V49,V108 TO V116,V145,V146,V148,CLUS(0)/
V28(6)/V145(8)/
V29 TO V36,V40,V41(4,5)
MISSING VALUES V29 (0)/V21(1110)/V1 TO V19 (5,55)
Note that the last missing value specification must NOT end with a slash. Respondent data that has missing values may be treated for LISTWISE or PAIRWISE deletion, depending on the analysis conducted.
PAIRWISE deletion means that if a value is missing (for the dependent and/or an independent variable), only the respondent's data for the variable in question is eliminated from the analysis.
LISTWISE deletion means that the entire case is excluded from analysis for all variables.
PAIRWISE OR LISTWISE deletion is specified interactively by the user in all bivariate or
multivariate programs. Several programs use only LISTWISE deletion, because pairwise deletion
does not provide complete information required for the analysis. In this case, the LISTWISE
option is invoked automatically if missing values are declared.
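The difference between the two deletion rules can be sketched in Python. This is only an illustration of the definitions above, not PC-MDS code; the function names and the three sample cases are invented for the example, and the missing-value codes follow the MISSING VALUES V29 (0)/V21(1110)/V1 TO V19 (5,55) example.

```python
def listwise(cases, missing):
    """Keep only cases that have no missing value on any variable."""
    return [c for c in cases
            if not any(c[v] in missing.get(v, ()) for v in c)]

def pairwise(cases, missing, x, y):
    """Keep cases that are complete on the two variables x and y only."""
    return [c for c in cases
            if c[x] not in missing.get(x, ())
            and c[y] not in missing.get(y, ())]

# Hypothetical cases; codes declared missing: V1 (5,55), V21 (1110), V29 (0).
missing = {"V1": {5, 55}, "V21": {1110}, "V29": {0}}
cases = [
    {"V1": 3, "V21": 2,    "V29": 34},
    {"V1": 4, "V21": 1110, "V29": 51},   # V21 is missing
    {"V1": 2, "V21": 1,    "V29": 0},    # V29 is missing
]

complete = listwise(cases, missing)
v1_v21 = pairwise(cases, missing, "V1", "V21")
v1_v29 = pairwise(cases, missing, "V1", "V29")
print(len(complete), len(v1_v21), len(v1_v29))
```

With these sample cases, listwise deletion keeps one case for every statistic, while pairwise deletion keeps two cases for each pair of variables, because each incomplete case is dropped only where its missing variable is involved.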
7. IF (OPTIONAL)
Use: To specify conditional relationships between variables.
The IF command specifies a conditional relationship between the variables. The IF command may be used to assign values to existing variables or to new variables that are created automatically when defined in the IF statement. Variables may be transformed using the IF statement. Mathematical expressions are supported on the right side of the equality expression. Examples of general form include:
IF (VAR1 EQ 3) VAR2 = 1
IF (VAR1 EQ 3) VAR2 = VAR5
IF (VAR1 EQ V2) VAR3 = COS(VAR4*.5)
IF ((VAR1 EQ 3) AND (VAR2 EQ 4)) VAR3 = 5
IF ((VAR1 EQ VAR3) OR (VAR2 EQ 22)) VAR4 = 25
The command word IF must begin in column 1 of the command file and be followed by a blank space.
One of the following RELATIONAL OPERATORS is used to specify the IF relationship; AND and OR combine two conditions:
EQ EQUALS
GT GREATER THAN
LT LESS THAN
GE GREATER THAN OR EQUAL
LE LESS THAN OR EQUAL
NE NOT EQUAL
AND TWO CONDITIONS HOLD
OR EITHER CONDITION HOLDS
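As an illustration of how these operators combine, the Python sketch below applies the three IF statements from the second sample command file (EXAMPLE # 2) to individual cases. The helper names `holds` and `classify` are hypothetical, and the sketch assumes the statements are evaluated in the order written.

```python
import operator

# Relational keywords mapped to the equivalent Python comparisons.
RELATIONAL = {
    "EQ": operator.eq, "NE": operator.ne,
    "GT": operator.gt, "LT": operator.lt,
    "GE": operator.ge, "LE": operator.le,
}

def holds(case, left, op, right):
    """Evaluate (left op right); an operand is a variable name or a number."""
    lhs = case[left] if isinstance(left, str) else left
    rhs = case[right] if isinstance(right, str) else right
    return RELATIONAL[op](lhs, rhs)

def classify(case):
    """Apply the three IF statements in order (assumed evaluation order)."""
    if holds(case, "V2", "LE", 1):
        case["NEWVAR"] = 1
    if holds(case, "V2", "GT", 2) and holds(case, "V4", "LE", 3):
        case["NEWVAR"] = 2
    if holds(case, "V2", "GT", 5) and holds(case, "V4", "GT", 3):
        case["NEWVAR"] = 3
    return case

first = classify({"V2": 4, "V4": 2})    # second statement applies
second = classify({"V2": 2, "V4": 5})   # no statement applies
```

The second case shows that a set of IF statements can leave gaps: a respondent with V2 equal to 2 satisfies none of the three conditions, so NEWVAR is never assigned in this sketch.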
8. COMPUTE (OPTIONAL)
Use: To unconditionally compute values for new or existing variables.
The COMPUTE statement allows the user to create new variables or compute intermediate values for each case read by the program. The COMPUTE statement may be used to compute new variables not defined in the DATA LIST or to transform current variables. The COMPUTE statement may be used with any of the mathematical and function operators listed below. The following 13 examples would each begin in the first column of the line.
COMPUTE V1 = 20*0.5
COMPUTE V2 = 81**.5
COMPUTE V3 = ABS(12)
COMPUTE V4 = ABS(V1-V2)
COMPUTE V5 = TRUNC(2.345)
COMPUTE V6 = TRUNC(5.95)
COMPUTE V7 = 256
COMPUTE V8 = 50
COMPUTE V9 = SQRT(V8)
COMPUTE V10 = LG10(V9)
COMPUTE V11 = ARSIN(0.14)
COMPUTE V33 = 0
COMPUTE V29 = COS(0.5+0.5*2)
MATHEMATICAL AND FUNCTION OPERATORS
| KEYWORD | MEANING | EXAMPLE |
| ABS(value) | Absolute Value | VAR3=ABS(V2-V1) |
| ARCOS(value) | Arc Cosine | VAR3=ARCOS(V2-V1) |
| ARSIN(value) | Arc Sine | VAR3=ARSIN(V2-V1) |
| ARTAN(value) | Arc Tangent | VAR3=ARTAN(V2-V1) |
| COS(value) | Cosine | VAR3=COS(V2-V1) |
| EXP(value) | Exponentiation | VAR3=EXP(V2-V1) |
| LG10(value) | Log Base 10 | VAR3=LG10(V2-V1) |
| LN(value) | Natural Logarithm | VAR3=LN(V2-V1) |
| MOD(value,value) | Remainder of V1/V2 | VAR3=MOD(V1,V2) |
| RND(value) | Round to whole # | VAR3=RND(V2-V1) |
| SIN(value) | Sine | VAR3=SIN(V2-V1) |
| SQRT(value) | Square Root | VAR3=SQRT(V2-V1) |
| TAN(value) | Tangent | VAR3=TAN(V2-V1) |
| TRUNC(value) | Truncate | VAR3=TRUNC(V2-V1) |
| / | Division | VAR3=V1/V2 |
| * | Multiplication | VAR3=V1*V2 |
| + | Addition | VAR3=V1+V2 |
| - | Subtraction | VAR3=V1-V2 |
| ** | Exponentiation | VAR3=V1**V2 |
NOTE: For any of the above COMPUTE operators and functions, you may substitute VARIABLE
NAMES or VALUES for V1 and V2.
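For readers who want to check what some of the COMPUTE examples evaluate to, the following Python sketch reproduces them with the standard `math` module. It assumes that PC-MDS follows the usual operator precedence and that trigonometric functions take radians; neither is stated in this section, so treat the trigonometric results as illustrative.

```python
import math

v = {}
v["V1"] = 20 * 0.5                   # multiplication
v["V2"] = 81 ** .5                   # exponentiation: the square root of 81
v["V3"] = abs(12)                    # ABS
v["V4"] = abs(v["V1"] - v["V2"])     # ABS of a difference between variables
v["V5"] = math.trunc(2.345)          # TRUNC drops the fractional part
v["V6"] = math.trunc(5.95)           # TRUNC does not round up
v["V8"] = 50
v["V9"] = math.sqrt(v["V8"])         # SQRT
v["V10"] = math.log10(v["V9"])       # LG10
v["V29"] = math.cos(0.5 + 0.5 * 2)   # COS(1.5), radians assumed

for name, value in v.items():
    print(name, value)
```

V1 evaluates to 10 and V2 to 9, which is why the statements IF(V1 EQ 10) and IF(V2 EQ 9) in the first sample command file are satisfied for every case.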
9. RECODE (OPTIONAL)
Use: To combine or re-assign data values for defined variables.
The RECODE statement is used to recode data values. A variable list is specified, followed by
one or more value lists specifying the values to be recoded and the value to which they are to be
recoded. The slash "/" operator separates the different variable lists. The keywords LOW, TO
and HIGH are valid arguments.
RECODE V5,V7 TO V10 (LOW TO 10 = 55) (LOW TO 20 = 66) /
V12 TO V15 (1 TO HIGH = 5) /
V18 (2,3,4 TO 10 = 6) /
V19 TO V20 (1,2,3 = 7) (4,5,6= 8)
The last list in the recode statement must NOT end with a slash.
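The RECODE value lists can be sketched in Python as follows. This is an illustration only: the `recode` function is hypothetical, and it assumes that when a value matches more than one value list, the first matching list is used. The section does not state how PC-MDS resolves overlapping lists such as (LOW TO 10 = 55) (LOW TO 20 = 66).

```python
import math

LOW, HIGH = -math.inf, math.inf   # stand-ins for the LOW and HIGH keywords

def recode(value, rules):
    """Return the recoded value, or the original value if no rule matches.

    rules is a list of (sources, new_value); each source is either a single
    value or a (low, high) tuple for an inclusive 'low TO high' range.
    Assumption: the first matching rule wins.
    """
    for sources, new_value in rules:
        for source in sources:
            if isinstance(source, tuple):
                low, high = source
                if low <= value <= high:
                    return new_value
            elif value == source:
                return new_value
    return value

# V5,V7 TO V10 (LOW TO 10 = 55) (LOW TO 20 = 66)
rules_v5 = [([(LOW, 10)], 55), ([(LOW, 20)], 66)]
# V18 (2,3,4 TO 10 = 6)
rules_v18 = [([2, 3, (4, 10)], 6)]
# V19 TO V20 (1,2,3 = 7) (4,5,6 = 8)
rules_v19 = [([1, 2, 3], 7), ([4, 5, 6], 8)]

print(recode(7, rules_v5), recode(15, rules_v5), recode(25, rules_v5))
print(recode(3, rules_v18), recode(11, rules_v18))
print(recode(2, rules_v19), recode(5, rules_v19))
```

Under the first-match assumption, values up to 10 become 55, values above 10 and up to 20 become 66, and values that match no list are left unchanged.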
10. WEIGHT (OPTIONAL)
Use: To weight the sample size used in computation of statistics.
The WEIGHT statement is used to weight the sample size for all variables included in the analysis.
The weight applied is the value of the variable declared as the weight variable. For
example, if VAR1 is declared as the weight variable, then for each respondent, the respondent's
value on VAR1 is multiplied by 1.0 and becomes that respondent's increment to the sample size
when computing the statistics for variables defined in the analysis. Weights are most
often used for adjusting sampling proportions. The form of the statement is:
WEIGHT VAR1
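A minimal Python sketch of the weighting idea follows. It assumes, as described above, that each respondent adds the value of the weight variable to the sample size; the weighted mean is shown as one typical statistic, and the function name and sample cases are invented for the illustration rather than taken from PC-MDS.

```python
def weighted_n_and_mean(cases, variable, weight_variable):
    """Return the weighted sample size and the weighted mean of variable."""
    n = sum(case[weight_variable] for case in cases)
    total = sum(case[weight_variable] * case[variable] for case in cases)
    return n, total / n

# Hypothetical respondents; VAR1 is the weight variable.
cases = [
    {"VAR1": 2.0, "V5": 3},   # counts as two respondents
    {"VAR1": 0.5, "V5": 5},   # counts as half a respondent
    {"VAR1": 1.0, "V5": 4},
]

n, mean = weighted_n_and_mean(cases, "V5", "VAR1")
print(n, mean)
```

Here three respondents yield a weighted sample size of 3.5, and the respondent with weight 2.0 influences the mean four times as much as the respondent with weight 0.5.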
11. SELECT IF (OPTIONAL)
Use: To select or exclude individual respondents' data.
The SELECT IF statement is used to include or exclude an entire respondent's data for analysis.
The statement is of the same form as the IF statement, but does not assign a mathematical value.
Instead, the condition results in an internal 0 or 1 that determines whether the case is included in the analysis. The proper form
is:
SELECT IF (VAR2 EQ 2)
The statement must include a variable name, one of the relational operators (EQ GT LT GE LE NE) and then either a value or another variable name.
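The selection rule can be sketched in Python as a filter over cases. The function `select_if` and the sample cases are hypothetical; the sketch only mirrors the description above, in which each case receives an internal 0 or 1 and only cases flagged 1 are analysed.

```python
import operator

OPS = {
    "EQ": operator.eq, "NE": operator.ne,
    "GT": operator.gt, "LT": operator.lt,
    "GE": operator.ge, "LE": operator.le,
}

def select_if(cases, variable, op, operand):
    """Return the cases for which (variable op operand) holds.

    operand is either a number or the name of another variable.
    """
    selected = []
    for case in cases:
        rhs = case[operand] if isinstance(operand, str) else operand
        keep = 1 if OPS[op](case[variable], rhs) else 0   # internal 0/1 flag
        if keep:
            selected.append(case)
    return selected

# Hypothetical cases; equivalent of SELECT IF (V14 EQ 2)
cases = [{"V13": 3, "V14": 2}, {"V13": 5, "V14": 1}, {"V13": 2, "V14": 2}]
kept = select_if(cases, "V14", "EQ", 2)
print(len(kept))
```

The same function also covers the variable-to-variable form, for example `select_if(cases, "V13", "GT", "V14")`.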