SAS Data Sets (2)

Basic Concepts

Data Portion

The data portion of a SAS data set is a collection of data values arranged in a rectangular table. In the example below, the name Jones is a data value, the weight 158.3 is a data value, and so on.

Data portion

Name	Sex	Age	Weight
Jones	M	48	128.6
Laverne	M	58	158.3
Jaffe	F	.	115.5
Wilson	M	28	170.1
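
As an illustration, a data set like the one above could be created with a DATA step and in-stream data. This is a minimal sketch: the data set name work.people is hypothetical, and list input with default lengths is assumed. The dollar sign marks Name and Sex as character variables, and the period in the third data line is read as a missing value for Age.

```sas
/* Hypothetical data set name: work.people */
data work.people;
   /* $ marks character variables; Age and Weight are numeric */
   input Name $ Sex $ Age Weight;
   datalines;
Jones M 48 128.6
Laverne M 58 158.3
Jaffe F . 115.5
Wilson M 28 170.1
;
run;

/* Display the data portion as a table */
proc print data=work.people;
run;
```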

Observations (Rows)

Rows (called observations) in the data set correspond to records or data lines in a raw data file or external database. An observation holds all of the information about one entity in a SAS data set. In the data set shown below, the values Jones, M, 48, and 128.6 make up a single observation.

Observation

Name	Sex	Age	Weight
Jones	M	48	128.6
Laverne	M	58	158.3
Jaffe	F	.	115.5
Wilson	M	28	170.1

The data set shown above has four observations, each containing information about an individual. A SAS data set can store any number of observations.

Variables (Columns)

Columns (called variables) in the data set correspond to fields in a raw data file or external database. A variable is the set of data values that describes a given characteristic. The values Jones, Laverne, Jaffe, and Wilson make up the variable Name in the data set shown below.

Variable

Name	Sex	Age	Weight
Jones	M	48	128.6
Laverne	M	58	158.3
Jaffe	F	.	115.5
Wilson	M	28	170.1

The data set above contains four variables, or categories of information, about each person: Name, Sex, Age, and Weight. A SAS data set can store thousands of variables. Only the capacity of your storage device limits the number of variables in your SAS data sets.
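
The row and column view can be tried out with PROC PRINT. The sketch below assumes the same hypothetical data set name, work.people, and rebuilds the data so that it runs on its own. The OBS= data set option limits the listing to the first observation, and the VAR statement limits it to a single variable.

```sas
/* Hypothetical data set name: work.people */
data work.people;
   input Name $ Sex $ Age Weight;
   datalines;
Jones M 48 128.6
Laverne M 58 158.3
Jaffe F . 115.5
Wilson M 28 170.1
;
run;

/* One observation (row): OBS=1 stops after the first observation */
proc print data=work.people(obs=1);
run;

/* One variable (column): VAR limits the listing to Name */
proc print data=work.people;
   var Name;
run;
```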

Missing Values

The rectangular arrangement of rows and columns in a SAS data set implies that every variable must exist for each observation. If a data value is unknown for a particular observation, a missing value is recorded in the SAS data set. In the data set shown below, the period in the Age column for Jaffe represents a missing numeric value.

Missing value

Name	Sex	Age	Weight
Jones	M	48	128.6
Laverne	M	58	158.3
Jaffe	F	.	115.5
Wilson	M	28	170.1
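
One way to find missing values is a WHERE statement with the IS MISSING operator. The sketch below again assumes the hypothetical data set name work.people and rebuilds the data so that it runs on its own. With the values shown above, only the observation for Jaffe should be listed.

```sas
/* Hypothetical data set name: work.people */
data work.people;
   input Name $ Sex $ Age Weight;
   datalines;
Jones M 48 128.6
Laverne M 58 158.3
Jaffe F . 115.5
Wilson M 28 170.1
;
run;

/* List only the observations whose Age value is missing */
proc print data=work.people;
   where Age is missing;
run;
```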