SAS: How can I point to a specific observation of a value?

I am very new to SAS and I am trying to figure out some basic things available in other languages.

I have a table

    ID Number
    -- ------
    1  2
    2  5
    3  6
    4  1

I would like to create a new variable that adds the value of Number from one specific observation to Number in every observation, for example:

    Number2 = Number + Number[3]

    ID Number Number2
    -- ------ -------
    1  2      8
    2  5      11
    3  6      12
    4  1      7

How do I get the value of Number in the third observation and add it to Number in every observation, storing the result in a new variable?

+4
2 answers

I will start by saying that Base SAS does not really work this way, as a rule. It is not that it cannot, but you can usually solve most problems without pointing to a specific row.

So, although this answer will solve your immediate problem, it is probably not something useful in a real-world scenario. In the real world you will usually have a match key or some other element, other than the row number, to combine on, and if you do, you can do this much more efficiently. You can probably also restructure the data in a way that makes this operation more convenient.

However, the specific example you give is trivial:

    data have;
      input ID Number;
      datalines;
    1 2
    2 5
    3 6
    4 1
    ;;;;
    run;

    data want;
      set have;       *read the rows sequentially;
      _t = 3;         *row number to look up;
      set have(rename=(number=number3) keep=number) point=_t;  *random-access read of row 3;
      number2 = number + number3;
    run;

If you have SAS/IML (the SAS matrix language), which is somewhat similar to R, then it is a different story, both in how likely you are to perform this operation and in how you do it.

    proc iml;
      a = {1 2, 2 5, 3 6, 4 1};  *create initial matrix;
      b = a[,2] + a[3,2];        *create a new matrix which is the 2nd column of a added elementwise to the value in the third row second column;
      c = a||b;                  *append new matrix to a - could be done in same step of course;
      print b c;
    quit;

Doing this with the first observation is much simpler.

    data want;
      set have;
      retain _firstpoint;                   *prevents _firstpoint from being set to missing each iteration;
      if _n_ = 1 then _firstpoint = number; *on the first iteration (usually first row) set to number value;
      number = number - _firstpoint;        *now subtract that from number to get relative value;
    run;

To elaborate a bit: SAS operates at the record-by-record level, where each record is processed independently in the DATA step. (PROCs, on the other hand, may not behave this way, although many do at some level.) SAS, like SQL and similar databases, does not really recognize any row as "first" or "second" or "nth"; however, unlike SQL, it lets you pretend that it does, based on the current sort order. The POINT= random access method is one way to do this.

In most cases, you are going to use something in the data to determine what you want to do, rather than something tied to the ordering of the data. Here is how to do the same thing as the POINT= method, but using the ID value:

    data want;
      if _n_ = 1 then set have(where=(ID=3) rename=(number=number3));
      set have;
      number2 = number + number3;
    run;

On the first iteration of the data step (_N_ = 1), this takes the row from HAVE where ID = 3, and then takes the rows of HAVE in order. What it really does is this:

    *check to see if _n_=1; it is; so take row id=3;
    *take first row (id=1);
    *check to see if _n_=1; it is not;
    *take second row (id=2);
    ... continue ...

Variables read by a SET statement are automatically retained, so NUMBER3 is automatically retained (yay!) and is not reset to missing between iterations of the data step loop. As long as you do not change the value, it stays the same for every iteration.
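
For comparison, the same key-based lookup can be sketched in PROC SQL. This is only an illustration, not part of the approach above: it assumes ID is unique in HAVE, so the subquery returns exactly one value, and the output name WANT_SQL is made up for the example.

```sas
proc sql;
   /* want_sql is a hypothetical output name used only for this sketch */
   create table want_sql as
   select ID,
          Number,
          /* scalar subquery: Number from the row where ID = 3 (assumes ID is unique) */
          Number + (select Number from have where ID = 3) as Number2
   from have
   order by ID;
quit;
```

With the HAVE data set from the question, Number2 should match the values requested there (8, 11, 12, 7), since the looked-up value is 6.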

+2

There are several ways to do this; here is one using the SAS POINT= option:

    data have;
       input ID Number;
       datalines;
    1 2
    2 5
    3 6
    4 1
    ;
    run;

    data want;
       retain adder;
       drop adder;
       if _n_=1 then do;
          adder = 3;
          set have point=adder;
          adder = number;
          end;
       set have;
       number = number + adder;
    run;

The RETAIN and DROP statements define a temporary variable to hold the value you want to add. RETAIN means that the value should not be reinitialized to missing on each pass through the data step, and DROP means that you do not want to include this variable in the output data set.

The POINT= option allows you to read a specific observation from a SAS data set. The _n_=1 check is a control mechanism that executes this bit of code only once, assigning the value of the third observation to the adder variable.

The next section reads the data set one observation at a time, and the addition applies your change.

Note that the same data set is read twice, which is a convenient SAS feature.

+4
