Fixing multiple unknown column warning

Question


I keep getting multiple "Unknown column" warnings for all kinds of commands (anything from str(x) to installing package updates), and I am not sure how to debug or fix this.

The "Unknown column" warning is clearly related to a variable in a tbl_df that I renamed, but the warning appears with all kinds of commands that apparently have nothing to do with that tbl_df (for example, installing package updates, or str(x) where x is simply a character vector).

+137

r dplyr

ssp3nc3r · Aug 19 '16


9 answers

zeehio · Answer 1 · 2017-03-01 15:47

Update: this problem was partially fixed by @kevin-ushey in RStudio v1.1.103 and later. It still appears (albeit less frequently).

This is a problem with the diagnostic tool in RStudio (a tool that displays warnings and possible errors in your code).

https://support.rstudio.com/hc/en-us/community/posts/115001180488-Diagnostics-and-tibble-warning

As a workaround, you can add the following line at the beginning of the affected open file(s):

# !diagnostics off

Then save the files and warnings should stop appearing.

You can also simply disable the diagnostics in Preferences / Code / Diagnostics.

I believe the warnings appear because the diagnostic tool in RStudio parses the source code to detect errors, and while it performs those checks it accesses columns in your tibble that are not initialized, which generates the warning we see. The warnings do not appear because you run unrelated things; they appear whenever the RStudio diagnostics run (when a file is saved or modified, when you run something, and so on).
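To make the mechanism concrete, here is a minimal sketch of the kind of access that raises the warning. It assumes the tibble package is installed; the exact wording of the message differs between tibble versions, and the sketch does not reproduce what RStudio does internally, only the effect of touching a column that does not exist.

```r
library(tibble)

tb <- tibble(id = 1:3, name = c("mary", "jill", "steve"))

# Capture the warning instead of printing it, so its text can be inspected.
msg <- tryCatch(
  {
    tb$age         # this column does not exist in the tibble
    NA_character_  # only reached if no warning was raised
  },
  warning = function(w) conditionMessage(w)
)
print(msg)

# A base data frame returns NULL silently for the same access.
df <- as.data.frame(tb)
is.null(df$age)
```

Any tool that evaluates `tb$age` behind the scenes would surface the same warning, which is why it can seem unrelated to the command you just ran.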

sabre · Answer 2 · 2016-09-28 21:43

I faced the same problem, and although I do not know why it happens, I was able to pin down when it happens, and thus prevent it.

The problem seems to be related to adding a new column, derived from indexing, to a base R data frame versus a tibble. Take this example, where you add a new column (age) to a base R data frame:

    base_df <- data.frame(id = c(1:3), name = c("mary", "jill", "steve"))
    base_df$age[base_df$name == "mary"] <- 47

This works without returning a warning. But when the same thing is done with a tibble, it throws a warning (and this, I think, is what causes the strange, seemingly unprovoked problem with multiple warnings):

    library(tibble)
    tibble_df <- tibble(id = c(1:3), name = c("mary", "jill", "steve"))
    tibble_df$age[tibble_df$name == "mary"] <- 47
    # Warning message:
    # Unknown column 'age'

There are, of course, better ways to avoid this, but I found that creating a vector of NAs first does the job:

    tibble_df$age <- NA
    tibble_df$age[tibble_df$name == "mary"] <- 47
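One possible alternative, sketched here as an illustration rather than as the answer's own method, is to build the whole column in a single assignment so that the not-yet-existing column is never read. This assumes the tibble package is installed and uses base R's ifelse().

```r
library(tibble)

tibble_df <- tibble(id = c(1:3), name = c("mary", "jill", "steve"))

# Assign the full column at once: rows that do not match get NA.
# Nothing reads tibble_df$age before it exists, so no warning is triggered.
tibble_df$age <- ifelse(tibble_df$name == "mary", 47, NA_real_)

tibble_df$age
```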

Varun · Answer 3 · 2017-01-31 10:21

I encountered this problem when using the "dplyr" package.
For those who encounter this problem after using the "group_by" function in the "dplyr" library:

I found that ungrouping the variables solves the problem of warning about an unknown column. Sometimes I had to repeat the ungrouping many times until the problem was resolved.
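A minimal sketch of that ungrouping step, assuming dplyr is installed; the data frame and the names `grouped` and `flat` are made up for illustration. A single call to ungroup() without arguments drops all grouping variables.

```r
library(dplyr)

df <- data.frame(id = c(1, 1:3), name = c("mary", "jo", "jill", "steve"))

# group_by() attaches grouping metadata to the result.
grouped <- df %>% group_by(id)
inherits(grouped, "grouped_df")

# ungroup() removes the grouping and leaves a plain tibble.
flat <- ungroup(grouped)
inherits(flat, "grouped_df")
```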

stok · Answer 4 · 2017-12-16 17:28

Converting the class to data.frame solved the problem for me:

    library(dplyr)
    df <- data.frame(id = c(1, 1:3), name = c("mary", "jo", "jill", "steve"))
    dfTbl <- df %>% group_by(id) %>% summarize(n = n())
    class(dfTbl)
    # [1] "tbl_df" "tbl" "data.frame"
    dfTbl = as.data.frame(dfTbl)
    class(dfTbl)
    # [1] "data.frame"

Part of this script is borrowed from @adts.

adts · Answer 5 · 2017-01-20 18:18

I ran into this problem too, except with a tibble created by a dplyr pipeline. Here is a small modification of sabre's code to show how I got the same warning.

    library(dplyr)
    df <- data.frame(id = c(1, 1:3), name = c("mary", "jo", "jill", "steve"))
    t <- df %>% group_by(id) %>% summarize(n = n())
    t
    str(t)
    t$newvar[t$id == 1] <- 0

JelenaČuklina · Answer 6 · 2017-03-10 11:30

Say I wanted to select the following column(s):

    best.columns = 'id'

For me, the following gave the warning:

    df %>% select_(one_of(best.columns))

whereas this worked as expected, although, as far as I know dplyr, the two should be equivalent:

    df %>% select_(.dots = best.columns)
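In current dplyr releases select_() is deprecated, so the same selection can be written with select() and all_of(). The sketch below assumes dplyr 1.0.0 or later; the contents of `df` are invented here because the answer does not show them.

```r
library(dplyr)

# Hypothetical data: the original answer does not define df.
df <- data.frame(id = c(1, 1:3), name = c("mary", "jo", "jill", "steve"))

best.columns <- "id"

# all_of() selects the columns named in a character vector and
# raises an error if any of them is missing from df.
selected <- df %>% select(all_of(best.columns))

names(selected)
```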

user2279564 · Answer 7 · 2019-01-30 05:37

I am new to R.

I get the error below while executing the following code snippet from the User-Guide.Rmd file:

      target = "y", index_var = "index", name = "example") %>%
      add_holdout_samples(splits = c(.6, .2, .2)) %>%
      set_measure(RMSE) %>%
      add_model(pipe = NULL, method = "auto.arima",
                param_map = list(lambda = c(0, .5, 1)), uid = "auto-arima") %>%
      add_model(pipe = roll_pipe, method = "auto.arima",
                param_map = list(), uid = "auto-arima-roll") %>%
      add_model(pipe = NULL, method = "ets",
                param_map = list(), uid = "ets") %>%
      train_models()
    map_df(f1$models, "status")

    Failed to create bus connection: No such file or directory
    Warning in log(x) : NaNs produced
    Warning in log(x) : NaNs produced
    Warning in InvBoxCox(pred$pred, lambda, biasadj, var(residuals.Arima(object), :
      biasadj information not found, defaulting to FALSE.
    Warning in InvBoxCox(pred$pred, lambda, biasadj, var(residuals.Arima(object), :
      biasadj information not found, defaulting to FALSE.
    Warning in InvBoxCox(pred$pred, lambda, biasadj, var(residuals.Arima(object), :
      biasadj information not found, defaulting to FALSE.
    Quitting from lines 534-559 (User-Guide.Rmd)
    Error: processing vignette 'User-Guide.Rmd' failed with diagnostics:
    Unknown column NA

hemu123 · Answer 8 · 2019-06-15 17:43

Maybe there is a space at the beginning or at the end of the column name. This also causes an unknown column error. I encountered a similar error in an R Markdown file.
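A quick way to check for and remove such stray spaces, sketched with base R only; the data frame and its column names are made up for illustration.

```r
# Hypothetical data frame whose column names carry stray whitespace.
df <- data.frame(1:3, c("a", "b", "c"))
names(df) <- c(" id", "name ")

# The padded name does not match the name you expect to use.
"id" %in% names(df)

# trimws() strips leading and trailing whitespace from every name.
names(df) <- trimws(names(df))
names(df)
```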

michael joseph · Answer 9 · 2019-07-30 15:31

I had this problem when I was using tibble together with the lapply function. The tibble seemed to store the results as a list inside the data frame.

I solved this by calling unlist before adding the results of the lapply function to the tibble.
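A minimal sketch of the difference, assuming the tibble package is installed; the column names `len_list` and `len` are invented for illustration and this is not necessarily the code the answer was using.

```r
library(tibble)

tb <- tibble(name = c("mary", "jill", "steve"))

# lapply() always returns a list, here three length-one integer vectors.
lens <- lapply(tb$name, nchar)

# Assigned directly, the result is stored as a list column.
tb$len_list <- lens
is.list(tb$len_list)

# Flattened with unlist(), it becomes an ordinary integer column.
tb$len <- unlist(lens)
is.integer(tb$len)
```

When every element has length one, sapply() or vapply() would return an atomic vector directly and avoid the extra unlist() step.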
