
Reading Raw Data in Fixed Fields
Lesson Summary

This page contains

  • I. Text Summary
  • II. Syntax
  • III. Sample Program
  • IV. Points to Remember


I. Text Summary


Column Input Review
When data are arranged in columns or fixed fields, you can read them using column input. With column input, you specify the beginning and ending columns for each field. Character variables are identified by a dollar sign ($).

Column input has several features.

  • Fields can be read in any order.

  • Character variables can be up to 32K (32,767 characters) long and can contain embedded blanks.

  • No placeholder is required for missing data. A blank field is read as missing and does not cause other fields to be read incorrectly.

  • Fields or parts of fields can be reread.

  • Fields do not have to be separated by blanks or other delimiters.
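The features above can be sketched with a small in-stream example. The data set name, variable names, and data values are made up for illustration. Age is read before Name, Initial rereads column 1 of the Name field, and the blank Age field in the second record is read as missing without disturbing Dept.

```sas
data work.colinput;
   /* Fields are read out of order; Initial rereads column 1 */
   input Age 16-17 Name $ 1-14 Initial $ 1 Dept $ 19-23;
   datalines;
Mary Ann Smith 34 SALES
Jo Lee            ADMIN
;
run;

proc print data=work.colinput;
run;
```

Name keeps its embedded blanks because column input reads the whole range of columns 1-14 rather than stopping at the first blank.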

Identifying Nonstandard Numeric Data
Standard numeric data values contain only digits, decimal points, plus or minus signs, and scientific (E) notation. Numeric data that contain other characters, such as commas or dollar signs, are considered nonstandard. Nonstandard numeric data include
  • values containing special characters, such as percent signs, dollar signs, and commas
  • date and time values
  • data in fraction, integer binary and real binary, and hexadecimal forms.

Choosing an Input Style
Nonstandard data values require an input style with more flexibility than column input. Formatted input combines the features of column input with the ability to read nonstandard as well as standard data. Whenever you encounter raw data that are organized into fixed fields, you can use

  • column input to read standard data only
  • formatted input to read both standard and nonstandard data.
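As a minimal sketch of that choice, the step below uses formatted input because two of the three fields are nonstandard. The data set, variables, and values are hypothetical. Price contains a dollar sign and a comma, so it is read with the COMMAw.d informat, and SaleDate is a date value, so it is read with MMDDYYw.

```sas
data work.mixed;
   /* Item is character data; Price and SaleDate are nonstandard numeric */
   input @1 Item $8. @10 Price comma9. @20 SaleDate mmddyy10.;
   /* Formats affect only how the stored values are displayed */
   format Price dollar10.2 SaleDate date9.;
   datalines;
Widget   $1,234.50 03/15/2002
Gadget     $987.00 11/02/2001
;
run;

proc print data=work.mixed;
run;
```

Column input could read Item from these records, but not Price or SaleDate, because the special characters in those fields are not valid in standard numeric data.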

Using Formatted Input
Formatted input uses column pointer controls to position the input pointer on a specified column. A column pointer control is optional when the first variable is in the first column.

The @n is an absolute pointer control that moves the input pointer to a specific column number. You can read columns in any order with the @n column pointer control.

The +n is a relative pointer control that moves the input pointer forward n columns from its current position.
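The two pointer controls can be traced in a short step. This is an illustrative sketch with made-up variables and data; the comments give the pointer position before and after each field is read.

```sas
data work.pointers;
   input @11 Year 4.    /* @11 jumps to column 11; Year is read from 11-14   */
         @1  Code $5.   /* @1 returns to column 1; Code is read from 1-5     */
         +1  Qty 3.;    /* pointer is at 6; +1 moves to 7; Qty is read 7-9   */
   datalines;
AB123 042 2001
CD456 007 2002
;
run;

proc print data=work.pointers;
run;
```

After Code is read, the pointer rests on column 6, the column after the last one read, which is why +1 is enough to reach the Qty field.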

Using Informats
An informat is a set of instructions that tells SAS software how to read raw data. There are informats for reading character values, standard numeric values, and nonstandard numeric values.

Informats always contain a w value to indicate the width of the raw data field. A period (.) ends the informat or separates the w value from the optional d value, which specifies the number of implied decimal places.
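The effect of the w and d values can be seen in a small sketch. The variable names and the data line are hypothetical. Both numeric fields use a d value of 2, but only the field without a decimal point has decimal places implied.

```sas
data work.decimals;
   input @1  Implied  5.2   /* 12345 has no decimal point: read as 123.45      */
         @7  Explicit 6.2   /* 123.45 has a decimal point: d is not applied    */
         @14 Label    $4.;  /* character informat: $ prefix, width, period     */
   datalines;
12345 123.45 ABCD
;
run;

proc print data=work.decimals;
run;
```

Both Implied and Explicit end up with the value 123.45.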

Record Formats
A record format specifies the characteristics of the organization of records in a file. Some operating systems have different types of record formats; the two most common are fixed-length records and variable-length records.

When you read variable-length records that contain fixed-field data into a SAS data set, some records may be shorter than others because values at the end of the record are short or missing. The PAD option in the INFILE statement pads each record with blanks to the logical record length so that all data lines have the same length.
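The following sketch shows one way to try the PAD option. It assumes a host such as Windows or UNIX, where a file written through a temporary fileref is stored with variable-length records. The fileref shortrec, the data set names, and the values are invented for the example. The first step writes three records, one of which ends before the Amount field; the second step reads them back with PAD.

```sas
/* Write a small variable-length file to a temporary location */
filename shortrec temp;

data _null_;
   file shortrec;
   put 'Anderson 1200';
   put 'Lee';             /* short record: ends before the Amount field */
   put 'Martinez  875';
run;

data work.padded;
   /* PAD extends each record with blanks to the logical record length */
   infile shortrec pad;
   input Name $ 1-8 Amount 10-13;
run;

proc print data=work.padded;
run;
```

With PAD, the short record yields Name=Lee and a missing Amount. Removing PAD from the INFILE statement and rerunning the second step is a quick way to see how the short record is handled by default.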


II. Syntax


LIBNAME libref  'SAS-data-library';
FILENAME fileref 'filename';
DATA SAS-data-set;
         INFILE raw-data-file;
         INPUT pointer-control variable informat.;
RUN;
PROC PRINT DATA=SAS-data-set;
RUN;


III. Sample Program
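     libname perm 'c:\data\sales';
     filename vandata 'c:\records\vans.dat';
     data perm.vansales;
        infile vandata;
        /* +12 moves from column 1 to column 13, where Quarter is read.  */
        /* @1 returns to column 1; Region is read from columns 1-9.      */
        /* +6 moves from column 10 to 16; TotalSales is read from 16-26. */
        input +12 Quarter 1. @1 Region $9.
            +6 TotalSales comma11.;
     run;
     proc print data=perm.vansales;
     run;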


IV. Points to Remember
  • When you use column or formatted input, the input pointer rests on the column after the last column read.

  • When using informats, it is not necessary to specify a d value if the data value already contains a decimal point. In that case, any d value is ignored.

  • Column input can be used to read standard data only.

  • Formatted input can be used to read both standard and nonstandard data.

  • You can avoid problems when reading variable-length records that contain fixed-field data by using the PAD option in the INFILE statement.



 

Copyright © 2002 SAS Institute Inc., Cary, NC, USA. All rights reserved.