Filter based on conditional criteria in R

I have a data frame in my R environment that I would like to subset based on certain criteria, a kind of conditional filter. The data frame holds daily values for every day from 2004 to 2014, with each day as a separate observation. Every year in the data has 366 days. I would like to subset the data so that only leap years keep the 366th day. There are three leap years in this interval: 2004, 2008, and 2012. I have separate columns for the year and the day of the year. In other words, I need a script that returns the dataset without the 366th day for every year except 2004, 2008, and 2012.

I managed to accomplish this as follows: I pasted my year and day columns together (for example, "2006-366") and simply used the dplyr filter command to drop each unwanted combination (2005-366, 2006-366, 2007-366, 2009-366, 2010-366, 2011-366, 2013-366, 2014-366). This, however, is a very crude method. I was hoping someone could point me in the right direction. Here are some reproducible data along with the workflow that I used.

 # Create the data frame
 library(dplyr)

 year <- rep(2004:2014, each = 366)
 day <- rep(1:366, times = 11)
 df <- data.frame(day, year)

 # My crude method: build a "year-day" key and drop each unwanted key
 df$reduc <- paste(df$year, df$day, sep = "-")

 df <- df %>%
    filter(reduc != "2005-366") %>%
    filter(reduc != "2006-366") %>%
    filter(reduc != "2007-366") %>%
    filter(reduc != "2009-366") %>%
    filter(reduc != "2010-366") %>%
    filter(reduc != "2011-366") %>%
    filter(reduc != "2013-366") %>%
    filter(reduc != "2014-366")
+4
2 answers

Set up the data:

df  <- expand.grid(year=2004:2014,day=1:366)
nrow(df) ## 4026

Now we exclude the cases where (the year is not divisible by 4) AND (the day is 366). Identifying non-leap years would be more difficult if your data set included 2000 and/or other century years, because a century year is a leap year only when it is divisible by 400.

library(dplyr)
df2 <- df %>% filter(!(year %% 4 > 0 & day==366))
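
If the data ever extend to century years, the divisible-by-4 shortcut would misclassify a year such as 1900. The following is a minimal sketch of the full Gregorian rule; the helper name `is_leap` is hypothetical and not part of the original answer.

```r
library(dplyr)

# Hypothetical helper: full Gregorian leap-year rule.
# A year is a leap year if it is divisible by 4,
# except century years, which must also be divisible by 400.
is_leap <- function(year) {
  (year %% 4 == 0 & year %% 100 != 0) | year %% 400 == 0
}

df <- expand.grid(year = 2004:2014, day = 1:366)

# Keep every row except day 366 of a non-leap year
df2 <- df %>% filter(is_leap(year) | day != 366)

nrow(df2) ## 4018 (4026 rows minus one per non-leap year)
```

For 2004 through 2014 this gives the same result as the `year %% 4` filter above; the two only differ once a century year that is not divisible by 400 enters the data.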
+5

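Another option is to convert to the Date class. First paste the year together with January 1 and coerce the result to Date, then add the day (minus 1) to that Date.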

df$date <- as.Date(paste0(df$year,'-01-01'))+(df$day-1L);

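Then compare the year of the Date with the year column. For an invalid year/day combination, such as day 366 of a non-leap year, the date rolls over to January 1 of the following year, so the two years no longer match and the row is dropped.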

df[df$year==as.integer(strftime(df$date,'%Y')),];
##      day year       date
## 1      1 2004 2004-01-01
## ...
## 366  366 2004 2004-12-31
## 367    1 2005 2005-01-01
## ...
## 731  365 2005 2005-12-31
## 733    1 2006 2006-01-01
## ...
## 1097 365 2006 2006-12-31
## 1099   1 2007 2007-01-01
## ...
## 1463 365 2007 2007-12-31
## 1465   1 2008 2008-01-01
## ...
## 1830 366 2008 2008-12-31
## 1831   1 2009 2009-01-01
## ...
## 2195 365 2009 2009-12-31
## 2197   1 2010 2010-01-01
## ...
## 2561 365 2010 2010-12-31
## 2563   1 2011 2011-01-01
## ...
## 2927 365 2011 2011-12-31
## 2929   1 2012 2012-01-01
## ...
## 3294 366 2012 2012-12-31
## 3295   1 2013 2013-01-01
## ...
## 3659 365 2013 2013-12-31
## 3661   1 2014 2014-01-01
## ...
## 4025 365 2014 2014-12-31
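
To see why this comparison works, it may help to look at a single date in isolation. This is only an illustrative sketch; the variable names `rolled` and `kept` are made up for the example.

```r
# Day 366 of a non-leap year: adding 365 days to January 1
# lands on January 1 of the following year.
rolled <- as.Date("2005-01-01") + (366 - 1L)
format(rolled, "%Y-%m-%d") ## "2006-01-01", year no longer matches 2005

# Day 366 of a leap year: the result is still December 31
# of the same year, so the row is kept.
kept <- as.Date("2004-01-01") + (366 - 1L)
format(kept, "%Y-%m-%d") ## "2004-12-31"
```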
+2
