I have a tab-delimited text file that I imported into R with the following command:
data = read.table(soubor, header = TRUE, sep = "\t", dec = ".", colClasses = c("numeric", "numeric", "character", "Date", "numeric", "numeric"))
When I run str(data) to check the data types of my columns, I get:
'data.frame': 211931 obs. of 6 variables:
 $ DataValue   : num 0 0 0 0 0 0 0 0 0 NA ...
 $ SiteID      : num 1 1 1 1 1 1 1 1 1 1 ...
 $ VariableCode: chr "Sucho" "Sucho" "Sucho" "Sucho" ...
 $ DateTimeUTC : Date, format: "2012-07-01" "2012-07-02" "2012-07-03" "2012-07-04" ...
 $ Latitude    : num 50.8 50.8 50.8 50.8 50.8 ...
 $ Longitude   : num 15.6 15.6 15.6 15.6 15.6 ...
Here is a reproducible sample of the first 20 lines of my data:
my_sample = dput(data[1:20, ])
structure(list(DataValue = c(0, 0, 0, 0, 0, 0, 0, 0, 0, NA, NA, NA, NA, NA, NA, NA, NA, 0, 0, 0), SiteID = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), VariableCode = c("Sucho", "Sucho", "Sucho", "Sucho", "Sucho", "Sucho", "Sucho", "Sucho", "Sucho", "Sucho", "Sucho", "Sucho", "Sucho", "Sucho", "Sucho", "Sucho", "Sucho", "Sucho", "Sucho", "Sucho"), DateTimeUTC = structure(c(15522, 15523, 15524, 15525, 15526, 15527, 15528, 15529, 15530, 15531, 15532, 15533, 15534, 15535, 15536, 15537, 15538, 15539, 15540, 15541), class = "Date"), Latitude = c(50.77, 50.77, 50.77, 50.77, 50.77, 50.77, 50.77, 50.77, 50.77, 50.77, 50.77, 50.77, 50.77, 50.77, 50.77, 50.77, 50.77, 50.77, 50.77, 50.77), Longitude = c(15.55, 15.55, 15.55, 15.55, 15.55, 15.55, 15.55, 15.55, 15.55, 15.55, 15.55, 15.55, 15.55, 15.55, 15.55, 15.55, 15.55, 15.55, 15.55, 15.55)), .Names = c("DataValue", "SiteID", "VariableCode", "DateTimeUTC", "Latitude", "Longitude"), row.names = c(NA, 20L), class = "data.frame")
Now I want to filter my table by date. Note that I run my code inside a for loop: first I select the rows for July 1, 2012 and do some processing, then I select the rows for July 2 and do some processing, and so on. For example, I want to get all rows with a date equal to July 6, 2012. I tried the code:
startDate = as.Date("2012-07-01")
endDate = as.Date("2012-07-20")
all_dates = seq(startDate, endDate, 1)
But starting from step 7 of the loop, filtering by these dates returns an empty data set.
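The loop itself is not shown above. As a rough sketch of the per-date filtering described here, assuming a small stand-in data frame (`demo` and `rows_per_step` are hypothetical names, not part of my real script) with one row per day, it could look like this:

```r
# Hypothetical stand-in for my_sample: one row per day, 2012-07-01 to 2012-07-20
demo <- data.frame(
  DataValue   = c(rep(0, 9), rep(NA, 8), rep(0, 3)),
  DateTimeUTC = seq(as.Date("2012-07-01"), by = "day", length.out = 20)
)

startDate <- as.Date("2012-07-01")
endDate   <- as.Date("2012-07-20")
all_dates <- seq(startDate, endDate, by = "day")

# Record how many rows are matched at each step of the loop
rows_per_step <- integer(length(all_dates))
for (i in seq_along(all_dates)) {
  # which() drops NA comparisons, so rows with a missing date are never returned
  daily <- demo[which(demo$DateTimeUTC == all_dates[i]), ]
  rows_per_step[i] <- nrow(daily)
}
print(rows_per_step)
```

Printing the row count for every step should make it easy to see at which date the matches stop, rather than inspecting each subset by hand.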
So for example:
subset_one = my_sample[my_sample$DateTimeUTC == all_dates[6],]
returns: 3 obs of 6 variables.
But, for some unknown reason, this example:
subset_two = my_sample[my_sample$DateTimeUTC == all_dates[7],]
returns: 0 obs of 6 variables.
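One check that might help narrow this down, though I have not confirmed it is the cause: a Date is stored as a number of days, and a value with a fractional part prints as a normal date but does not compare equal to the whole-day date. A minimal sketch with made-up values (`d`, `target` and `d_whole` are hypothetical, not taken from my real data):

```r
# Made-up values: the second element carries half a day
d <- structure(c(15527, 15527.5, 15528), class = "Date")
target <- as.Date("2012-07-07")   # stored internally as 15528

print(format(d))        # printing shows only the calendar day
print(unclass(d))       # the underlying numbers reveal any fractional part
print(d == target)      # == compares the underlying numbers

# Dropping the fractional part gives whole-day dates again
d_whole <- as.Date(floor(unclass(d)), origin = "1970-01-01")
print(d_whole == target)
```

Running `unclass()` on the real DateTimeUTC column and on `all_dates[7]` would show whether both sides of the comparison are whole numbers.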
(note: I edited the code above to make my problem 100% reproducible)
Any ideas what I'm doing wrong?