Remove duplicate rows from an XTS object

I am having trouble removing duplicate rows from an xts object. I have an R script that downloads tick data for a currency and converts it to an xts object in OHLC format. The script fetches new data every 15 minutes, covering the period from today's first trade to the latest recorded trade. Previously downloaded data is stored in an .Rdata file and loaded on each run; the new data is then appended to the old data and the .Rdata file is overwritten.

Here is an example of what my data looks like:

                          .Open   .High    .Low  .Close   .Volume .Adjusted
    2012-01-07 00:00:11 6.69683 7.01556 6.38000 6.81000  48387.58   6.81000
    2012-01-08 00:00:09 6.78660 7.20000 6.73357 7.11358  57193.53   7.11358
    2012-01-09 00:00:57 7.08362 7.19100 5.81000 6.32570 148406.85   6.32570
    2012-01-10 00:01:01 6.32687 6.89000 6.00100 6.36000 110210.25   6.36000
    2012-01-11 00:00:07 6.44904 7.13800 6.41266 6.90000  99442.07   6.90000
    2012-01-12 00:01:02 6.90000 6.99700 6.33700 6.79999 140116.52   6.79999
    2012-01-13 00:02:01 6.78211 6.80400 6.40000 6.41000  60228.77   6.41000
    2012-01-14 00:00:23 6.42000 6.50000 6.23150 6.31894  25392.98   6.31894

Now, if I run the script again, the new data is appended to the xts object:

                          .Open   .High    .Low  .Close   .Volume .Adjusted
    2012-01-07 00:00:11 6.69683 7.01556 6.38000 6.81000  48387.58   6.81000
    2012-01-08 00:00:09 6.78660 7.20000 6.73357 7.11358  57193.53   7.11358
    2012-01-09 00:00:57 7.08362 7.19100 5.81000 6.32570 148406.85   6.32570
    2012-01-10 00:01:01 6.32687 6.89000 6.00100 6.36000 110210.25   6.36000
    2012-01-11 00:00:07 6.44904 7.13800 6.41266 6.90000  99442.07   6.90000
    2012-01-12 00:01:02 6.90000 6.99700 6.33700 6.79999 140116.52   6.79999
    2012-01-13 00:02:01 6.78211 6.80400 6.40000 6.41000  60228.77   6.41000
    2012-01-14 00:00:23 6.42000 6.50000 6.23150 6.31894  25392.98   6.31894
    2012-01-14 00:00:23 6.42000 6.75000 6.22010 6.57157  75952.01   6.57157

As you can see, the last row has the same timestamp as the second-to-last row. I want to keep the last row for that timestamp and delete the second-to-last one. When I run the following code to remove the duplicate rows, it does not do what I expect:

    xx <- mt.xts[!duplicated(mt.xts$Index),]
    xx
         .Open .High .Low .Close .Volume .Adjusted

The result is empty. How can I remove duplicate rows from an xts object, using the index to identify duplicates?

2 answers

Shouldn't it be index(mt.xts) rather than mt.xts$Index? The following seems to work.

    # Sample data
    library(xts)
    x <- xts( 1:10, rep( seq.Date( Sys.Date(), by = "day", length.out = 5 ), each = 2 ) )

    # Remove rows with a duplicated timestamp
    y <- x[ ! duplicated( index(x) ), ]

    # Remove rows with a duplicated timestamp, but keep the latest one
    z <- x[ ! duplicated( index(x), fromLast = TRUE ), ]
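Applied to the data in the question, a minimal sketch might look like the following. It rebuilds only the last three rows and the .Close column from the question's sample; the construction of mt.xts here is an assumption made for illustration, not the asker's actual script. It also suggests why the original attempt came back empty: the object has no column named Index, so mt.xts$Index appears to be NULL and duplicated() then returns a zero-length vector.

```r
library(xts)

# Illustrative stand-in for the question's object: last three rows, .Close only.
idx <- as.POSIXct(
  c("2012-01-13 00:02:01", "2012-01-14 00:00:23", "2012-01-14 00:00:23"),
  tz = "UTC"
)
mt.xts <- xts(
  matrix(c(6.41000, 6.31894, 6.57157), ncol = 1,
         dimnames = list(NULL, ".Close")),
  order.by = idx
)

# There is no "Index" column, so there is nothing for duplicated() to test.
no_index_column <- is.null(mt.xts$Index)
n_flags <- length(duplicated(mt.xts$Index))

# Use index() to get the timestamps; fromLast = TRUE keeps the newest row
# among those sharing a timestamp.
latest <- mt.xts[!duplicated(index(mt.xts), fromLast = TRUE), ]
print(latest)
```

With fromLast = TRUE the row appended most recently survives, which matches the question's requirement of keeping the last row for the repeated timestamp.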

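In my case,

    x <- x[!duplicated(index(x)), ]

did not work as intended, because the timestamps had somehow ended up unique on every row, so no index value was ever flagged as a duplicate. Comparing the data values instead worked:

    x <- x[!duplicated(coredata(x)), ]

Try this if the previous solution does not help. Note that it compares row contents rather than timestamps, so it also drops rows whose values repeat at a different time.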
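The difference between de-duplicating on index(x) and on coredata(x) can be seen on a small made-up series. The dates and values below are invented purely for illustration: one timestamp occurs twice with different values, and one value occurs twice at different timestamps.

```r
library(xts)

# Invented sample: 2012-01-14 appears twice; the value 1 appears twice.
idx <- as.Date(c("2012-01-13", "2012-01-14", "2012-01-14", "2012-01-15"))
x <- xts(c(1, 2, 3, 1), order.by = idx)

# De-duplicate on timestamps: drops the second 2012-01-14 row.
by_time <- x[!duplicated(index(x)), ]

# De-duplicate on values: keeps both 2012-01-14 rows (their values differ)
# and drops the 2012-01-15 row (its value repeats the first row's).
by_value <- x[!duplicated(coredata(x)), ]

print(by_time)
print(by_value)
```

For the situation in the question, where the repeated timestamp carries updated prices, the value-based variant would leave both rows in place, so the index-based variant is the closer fit there.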

