Assign a value if the number is between two numbers

I'm trying to assign a value of -1 to every number in my vector that is between 2 and 5. I thought an if statement would work, but I am having problems. I do not think (2<x<5) is right.

    x <- c(3.2,6,7.8,1,3,2.5)
    if (2<x<5){
      cat(-1)
    } else {
      cat (x)
    }
+6
5 answers

There are several syntax errors in the code.

Try using findInterval

    x[findInterval(x, c(2,5)) == 1L] <- -1
    x
    ## [1] -1.0  6.0  7.8  1.0 -1.0 -1.0

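See ?findInterval for more details. Note that by default its intervals are closed on the left and open on the right, so a value exactly equal to 2 would also be replaced here.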
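As a small sketch of how findInterval classifies values, the snippet below uses the vector from the question with the two boundary values 2 and 5 appended (those two values are an addition made only for illustration). It suggests that findInterval(x, c(2,5)) == 1L selects 2 <= x < 5, which differs from the strict comparison only at x == 2.

```r
# Vector from the question, plus the boundary values 2 and 5
# (appended here for illustration only).
x <- c(3.2, 6, 7.8, 1, 3, 2.5, 2, 5)

# For each element, findInterval() counts how many break points are <= it:
# 0 for x < 2, 1 for 2 <= x < 5, 2 for x >= 5 (default arguments).
idx <- findInterval(x, c(2, 5))
idx
# [1] 1 2 2 0 1 1 1 2

# The strict comparison excludes the boundary value 2.
strict <- x > 2 & x < 5

# Position(s) where the two selections disagree: only the element equal to 2.
which((idx == 1L) != strict)
# [1] 7
```

If the lower bound must be excluded, the explicit comparison x > 2 & x < 5 is the simpler choice.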

You can also use replace

 replace(x, x > 2 & x < 5, -1) 

Note that

  • instead of 2<x<5 you need to write x > 2 & x < 5
  • cat writes its output to the console or to a file/connection. It does not assign anything.
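The two notes above can be illustrated with a short sketch. The variable names in_range and printed are introduced here only for the example.

```r
x <- c(3.2, 6, 7.8, 1, 3, 2.5)

# The comparison is vectorised: it yields one TRUE/FALSE per element,
# whereas if() expects a single TRUE or FALSE.
in_range <- x > 2 & x < 5
in_range
# [1]  TRUE FALSE FALSE FALSE  TRUE  TRUE

# cat() only prints; its return value is NULL, so nothing is stored.
printed <- cat(-1, "\n")
is.null(printed)
# [1] TRUE

# Assigning through the logical mask changes x itself.
x[in_range] <- -1
x
# [1] -1.0  6.0  7.8  1.0 -1.0 -1.0
```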
+14

You might just want to replace these elements with -1.

    > x[x > 2 & x < 5] <- -1; x
    [1] -1.0  6.0  7.8  1.0 -1.0 -1.0

You can also use ifelse.

    > ifelse(x > 2 & x < 5, -1, x)
    [1] -1.0  6.0  7.8  1.0 -1.0 -1.0
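One difference worth keeping in mind: ifelse returns a new vector and leaves x untouched, while the indexed assignment modifies x in place. A minimal sketch, where y is a name introduced only for this example:

```r
x <- c(3.2, 6, 7.8, 1, 3, 2.5)

# ifelse() builds and returns a new vector ...
y <- ifelse(x > 2 & x < 5, -1, x)
y
# [1] -1.0  6.0  7.8  1.0 -1.0 -1.0

# ... while x itself still holds the original values.
x
# [1] 3.2 6.0 7.8 1.0 3.0 2.5

# To keep the result, assign it back.
x <- y
```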
+9

I compared the solutions with microbenchmark:

    library(microbenchmark)
    library(TeachingDemos)

    x = runif(100000) * 1000
    microbenchmark(
        200 %<% x %<% 500,
        x > 200 & x < 500,
        findInterval(x, c(200, 500)) == 1,
        findInterval(x, c(200, 500)) == 1L,
        times = 1000L
    )

Here are the results:

                                   expr       min        lq      mean    median        uq       max neval
                      200 %<% x %<% 500 17.089646 17.747136 20.477348 18.910708 21.302945 113.71473  1000
                      x > 200 & x < 500  6.774338  7.092153  8.746814  7.233512  8.284603 103.64097  1000
      findInterval(x, c(200, 500)) == 1  3.578305  3.734023  5.724540  3.933615  6.777687  91.09649  1000
     findInterval(x, c(200, 500)) == 1L  2.042831  2.115266  2.920081  2.227426  2.434677  85.99866  1000

I would go with findInterval. Compare its result with 1L instead of 1; in this benchmark that was almost twice as fast.

+2

Here is another approach that looks a bit like the original:

    library(TeachingDemos)
    x <- c(3.2,6,7.8,1,3,2.5)
    (x <- ifelse( 2 %<% x %<% 5, -1, x ) )
+1

My preference for assigning a value to a variable based on a well-defined numeric interval is to use the base R syntax:

    DF$NewVar[DF$LowerLimit <= DF$OriginalVar & DF$OriginalVar < DF$UpperLimit] = "Normal"
    DF$NewVar[DF$OriginalVar < DF$LowerLimit] = "Low"
    DF$NewVar[DF$OriginalVar >= DF$UpperLimit] = "High"

I think this syntax is clearer than any number of R functions, mainly because the code can be quickly adjusted to specify inclusive and exclusive intervals. In practice, an interval often needs to be inclusive (i.e. [-x, +x]), exclusive (i.e. (-x, +x)), or a combination (i.e. [-x, +x)).

In addition, the base syntax keeps the code clear for anyone who reads it later. Each function has its own slightly different syntax for reaching the same level of specificity that the base R comparisons state explicitly.
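As a runnable sketch of this approach, the example below builds a small data frame. The column names follow the answer, but the values and limits are made up for illustration, and the "Low" condition is written as OriginalVar < LowerLimit.

```r
# Hypothetical data: column names follow the answer, values are invented.
DF <- data.frame(
  OriginalVar = c(1, 2, 3.5, 5, 7),
  LowerLimit  = 2,
  UpperLimit  = 5
)

# Start with missing labels, then fill each interval explicitly.
DF$NewVar <- NA_character_
DF$NewVar[DF$OriginalVar < DF$LowerLimit] <- "Low"             # (-Inf, lower)
DF$NewVar[DF$LowerLimit <= DF$OriginalVar &
          DF$OriginalVar < DF$UpperLimit] <- "Normal"          # [lower, upper)
DF$NewVar[DF$OriginalVar >= DF$UpperLimit] <- "High"           # [upper, Inf)

DF$NewVar
# "Low" "Normal" "Normal" "High" "High"
```

Changing <= to <, or < to <=, in a single line is enough to switch a boundary between inclusive and exclusive.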

0

Source: https://habr.com/ru/post/927983/

