How do I create a multiple-criteria index in R, including > and < operators?

I am trying to look up and merge two data frames according to several criteria, and I am having a lot of trouble figuring it out.

I have two datasets: one that I want to index, which contains the validity dates of some products, and another that records the use of those products, as follows:

    library(data.table)

    Indexset <- data.table(
      validfrom   = as.Date(c("2015-08-01", "2015-08-02", "2015-08-03", "2015-08-04")),
      validto     = as.Date(c("2015-08-07", "2015-08-08", "2015-08-09", "2015-08-10")),
      username    = c("Smith", "Cole", "Amos", "Richardson"),
      productcode = c(1, 2, 3, 4)
    )

    Useset <- data.table(
      usedate  = as.Date(c("2015-08-04", "2015-08-06", "2015-08-06", "2015-08-09")),
      username = c("Smith", "Richardson", "Cole", "Amos")
    )

What I want to do is add a column to Useset that contains the productcode from Indexset, checking that usedate falls between validfrom and validto and that the username matches.

I tried various approaches with the merge function, but couldn't figure out the syntax for the greater-than and less-than operators.

I also tried rolling joins, but struggled to make them work.

I am currently moving over from Excel, where I would just match on multiple criteria in an array formula, but I am not sure how to carry this over to R.

To be clear, I am not getting any errors; I am just completely lost on the syntax.

Once I have the productcode values, I think I can handle the merge, but getting there is the whole problem!

I would appreciate any help anyone can offer!

1 answer

You can try the non-equi join feature from the current development version of data.table, v1.9.7. Please see the installation instructions here.

    Useset[Indexset,
           on = .(username, usedate >= validfrom, usedate <= validto),  # which rows?
           productcode := productcode][]                                # what to do?
    #       usedate   username productcode
    # 1: 2015-08-04      Smith           1
    # 2: 2015-08-06 Richardson           4
    # 3: 2015-08-06       Cole           2
    # 4: 2015-08-09       Amos           3

Useset is updated in place (by reference).
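To make the in-place update easier to see, here is a minimal self-contained sketch. It assumes an installed data.table that supports non-equi joins, and it adds one hypothetical row (Cole on 2015-08-20) that is not part of the question's data, purely to show what happens when a use date falls outside every validity window. The `i.` prefix is used only to make explicit that the value comes from Indexset.

```r
library(data.table)

Indexset <- data.table(
  validfrom   = as.Date(c("2015-08-01", "2015-08-02", "2015-08-03", "2015-08-04")),
  validto     = as.Date(c("2015-08-07", "2015-08-08", "2015-08-09", "2015-08-10")),
  username    = c("Smith", "Cole", "Amos", "Richardson"),
  productcode = c(1, 2, 3, 4)
)

# The last row (Cole, 2015-08-20) is a hypothetical addition for illustration:
# it lies outside Cole's validity window.
Useset <- data.table(
  usedate  = as.Date(c("2015-08-04", "2015-08-06", "2015-08-06",
                       "2015-08-09", "2015-08-20")),
  username = c("Smith", "Richardson", "Cole", "Amos", "Cole")
)

# Non-equi update join: for every row of Indexset, find the Useset rows with
# the same username and a usedate inside [validfrom, validto], then write the
# matching productcode into Useset by reference.
Useset[Indexset,
       on = .(username, usedate >= validfrom, usedate <= validto),
       productcode := i.productcode]

# usedate and username are untouched; productcode is 1, 4, 2, 3 for the
# original rows and NA for the unmatched extra row.
print(Useset)
```

Because `:=` modifies Useset by reference, no reassignment is needed, and rows without a matching validity window simply keep NA in the new column.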


If performance is not a big concern, joining on username and then filtering should also work; this does not require the development version of data.table:

    Useset[Indexset, on = "username"][
      usedate >= validfrom & usedate <= validto,
      .(usedate, username, productcode)]
    #       usedate   username productcode
    # 1: 2015-08-04      Smith           1
    # 2: 2015-08-06       Cole           2
    # 3: 2015-08-09       Amos           3
    # 4: 2015-08-06 Richardson           4
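Since the question mentions attempts with merge, the same join-then-filter idea can also be sketched in base R. This is only an illustration under the assumption that plain data frames are acceptable; the names `joined`, `in_window`, and `result` are made up for this example. The range conditions are not part of `merge()` itself, which only does equality matching; they are applied as an ordinary row filter afterwards.

```r
Indexset <- data.frame(
  validfrom   = as.Date(c("2015-08-01", "2015-08-02", "2015-08-03", "2015-08-04")),
  validto     = as.Date(c("2015-08-07", "2015-08-08", "2015-08-09", "2015-08-10")),
  username    = c("Smith", "Cole", "Amos", "Richardson"),
  productcode = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)

Useset <- data.frame(
  usedate  = as.Date(c("2015-08-04", "2015-08-06", "2015-08-06", "2015-08-09")),
  username = c("Smith", "Richardson", "Cole", "Amos"),
  stringsAsFactors = FALSE
)

# Step 1: equality join on username only.
joined <- merge(Useset, Indexset, by = "username")

# Step 2: keep only rows whose usedate lies inside the validity window.
in_window <- joined$usedate >= joined$validfrom & joined$usedate <= joined$validto
result <- joined[in_window, c("usedate", "username", "productcode")]

print(result)
```

This builds every username match before filtering, so it can get slow or memory-hungry when one user has many validity windows, which is the performance caveat mentioned above.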