I often overwrite data in place as I transform it with dplyr, especially when working with large data sets. I'm not sure how to do this elegantly with SQL-backed data sets, at least not with SQLite.
I could not find any discussion of this goal in the dplyr databases vignette or on SO, which makes me wonder whether my approach is wrong in the first place; still, it seems like a natural way to work with large data sets.
Anyway, the most intuitive approach doesn't work:
    library(dplyr)
    library(RSQLite)

    db2 <- src_sqlite("trouble.sqlite", create = TRUE)
    trouble <- data.frame(values = c(5, 1, 3))
    trouble.db <- copy_to(db2, trouble, temporary = FALSE)
    collect(trouble.db)                           # 5, 1, 3
    trouble.db <- trouble.db %>% arrange(values)
    collect(trouble.db)                           # 1, 3, 5
    trouble.in <- tbl(db2, sql("SELECT * FROM trouble"))
    collect(trouble.in)                           # 5, 1, 3 -- table on disk is unchanged
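To confirm that the table on disk really is untouched, you can bypass dplyr and query the DBI connection directly (the question already uses db2$con below, so I assume the src exposes it):

    # Query the underlying table via DBI -- rows come back in insertion order
    DBI::dbGetQuery(db2$con, "SELECT * FROM trouble")
    #   values
    # 1      5
    # 2      1
    # 3      3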
Another intuitive syntax for replacing the table in place fails with the error "table already exists":
    trouble.db <- copy_to(db2, as.data.frame(trouble.db), name = "trouble",
                          temporary = FALSE)
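Depending on the dplyr version, copy_to() may accept an overwrite argument that does this in one call; a sketch, assuming your version supports it (mine apparently does not, hence the manual workaround below):

    # May work on newer dplyr/dbplyr versions; older ones lack `overwrite`
    trouble.db <- copy_to(db2, collect(trouble.db), name = "trouble",
                          temporary = FALSE, overwrite = TRUE)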
One workaround is to drop the table manually and re-create it, which is what I ended up doing:
    trouble <- collect(trouble.db)                # pull the sorted rows into R first
    db2$con %>% db_drop_table(table = "trouble")  # then drop the stale table
    trouble.db <- copy_to(db2, trouble, temporary = FALSE)
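If this pattern recurs, it can be wrapped in a small helper; a hypothetical replace_table() sketch (the name and argument order are my own, not part of dplyr):

    # Hypothetical helper: materialize a lazy tbl, drop the old table,
    # then write the collected result back under the same name.
    replace_table <- function(src, tbl_lazy, name) {
      local_df <- collect(tbl_lazy)            # collect before dropping
      db_drop_table(src$con, table = name)     # remove the stale table
      copy_to(src, local_df, name = name, temporary = FALSE)
    }

    trouble.db <- replace_table(db2, trouble.db %>% arrange(values), "trouble")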
Another is to give up on replacing the table altogether and chain a series of temporary tables instead, which I find inelegant but which, I believe, may be the recommended paradigm:
    trouble_temp <- data.frame(values = c(5, 1, 3))
    trouble_temp.db <- copy_to(db2, trouble_temp, temporary = TRUE)
    trouble <- trouble_temp.db %>% arrange(values) %>% collect()
    trouble.db <- copy_to(db2, trouble, temporary = FALSE)
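A variant that avoids round-tripping the rows through R is to materialize the query server-side with compute(); a sketch, assuming compute() in your dplyr version can write a permanent table (it cannot overwrite an existing name, so a new name or a prior drop is still needed):

    # Materialize the sorted query as a permanent table inside SQLite,
    # without collecting the rows into R
    sorted <- trouble_temp.db %>% arrange(values)
    trouble.db <- compute(sorted, name = "trouble_sorted", temporary = FALSE)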
I suspect the answer will come down to "drop and copy", but out of an abundance of love for elegant solutions, I thought I would ask whether there is a better way.