Rename columns in multiple data frames, R

I am trying to rename the columns of multiple data.frames.

To give an example, let's say I have a list of data.frames dfA, dfB and dfC. I wrote the ChangeNames function to set the names accordingly, and then used lapply as follows:

    dfs <- list(dfA, dfB, dfC)
    ChangeNames <- function(x) {
      names(x) <- c("A", "B", "C")
    }
    lapply(dfs, ChangeNames)

However, this does not work properly. It seems that I am not assigning the new names to the data.frames, only creating the names. What am I doing wrong here?

Thank you in advance!

+8
r dataframe
3 answers

There are two things here:

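  • 1) Your function must return the value you want. Without an explicit return, a function returns the value of its last evaluated expression. In your case that is the assignment names(x) <- c("A", "B", "C"), whose value is the character vector of new names, not the data frame. So add a final line, return(x) or just x. Your function will then look like this: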

     ChangeNames <- function(x) {
       names(x) <- c("A", "B", "C")
       return(x)
     }
  • 2) lapply does not modify your source objects by reference. It works on a copy, so you need to assign its result back to dfs. Alternatively, use a for loop instead of lapply:

     # option 1: assign the result of lapply back
     dfs <- lapply(dfs, ChangeNames)

     # option 2: rename each list element in a for loop
     for (i in seq_along(dfs)) {
       names(dfs[[i]]) <- c("A", "B", "C")
     }
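Both points can be checked with a small self-contained sketch. The data frames and the name BadChangeNames below are made up for illustration; they are not from the question.

```r
# Hypothetical example data
dfA <- data.frame(x = 1:2, y = 3:4, z = 5:6)
dfB <- data.frame(p = 1:2, q = 3:4, r = 5:6)
dfs <- list(dfA, dfB)

# Without a final x, the function returns the value of the assignment,
# which is the character vector of new names, not the data frame.
BadChangeNames <- function(x) {
  names(x) <- c("A", "B", "C")
}
bad <- lapply(dfs, BadChangeNames)
class(bad[[1]])      # "character"

# With a final x, the renamed data frame is returned.
ChangeNames <- function(x) {
  names(x) <- c("A", "B", "C")
  x
}
renamed <- lapply(dfs, ChangeNames)
names(renamed[[1]])  # "A" "B" "C"

# lapply worked on copies: the original object still has its old names.
names(dfA)           # "x" "y" "z"
```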

Even with a for loop, you still make a copy (because names(.) <- . does). You can verify this using tracemem.

    df <- data.frame(x=1:5, y=6:10, z=11:15)
    tracemem(df)
    # [1] "<0x7f98ec24a480>"
    names(df) <- c("A", "B", "C")
    tracemem(df)
    # [1] "<0x7f98e7f9e318>"

If you want to change the names by reference, you can use setnames from the data.table package:

    df <- data.frame(x=1:5, y=6:10, z=11:15)
    library(data.table)
    tracemem(df)
    # [1] "<0x7f98ec76d7b0>"
    setnames(df, c("A", "B", "C"))
    tracemem(df)
    # [1] "<0x7f98ec76d7b0>"

You can see that the memory location of df has not changed. The names were changed by reference.

+13

If the data frames are not in a list but simply in the global environment, you can refer to them using a character vector of their names.

    dfs <- c("dfA", "dfB", "dfC")
    for (df in dfs) {
      df.tmp <- get(df)
      names(df.tmp) <- c("A", "B", "C")
      assign(df, df.tmp)
    }

EDIT

To simplify the code above, you can use

 for(df in dfs) assign(df, setNames(get(df), c("A", "B", "C"))) 

or use data.table, which does not require reassignment.

 for(df in c("dfA", "dfB")) data.table::setnames(get(df), c("G", "H")) 
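
A loop-free variant of the get/assign idea is possible in base R with mget and list2env. This is only a sketch under assumptions: the data frames are hypothetical, and env is simply whichever environment the code runs in (the global environment at top level).

```r
# Hypothetical example data; env is the environment this code runs in.
env <- environment()
dfA <- data.frame(x = 1:2, y = 3:4, z = 5:6)
dfB <- data.frame(p = 1:2, q = 3:4, r = 5:6)

# Collect the data frames into a named list, looked up by name.
dfs <- mget(c("dfA", "dfB"), envir = env)

# Rename every element; setNames returns a renamed copy.
dfs <- lapply(dfs, setNames, c("A", "B", "C"))

# Write the renamed copies back over the original variables.
invisible(list2env(dfs, envir = env))

names(dfA)  # "A" "B" "C"
names(dfB)  # "A" "B" "C"
```

Like assign, this overwrites the original variables with renamed copies rather than changing them by reference.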
+10

I ran into this after importing a public dataset: I had to rename each data frame and, within each one, rename every column to trim surrounding spaces, convert to lowercase and replace internal spaces with periods.

Combining the above methods, I got:

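    library(stringr)
    # dfs is a character vector of data frame names, e.g. c("dfA", "dfB", "dfC")
    for (eachdf in dfs) {
      df.tmp <- get(eachdf)
      # trim first, so leading/trailing spaces are not turned into periods
      colnames(df.tmp) <- str_replace_all(str_to_lower(str_trim(colnames(df.tmp))), " ", ".")
      assign(eachdf, df.tmp)
    }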
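
The same cleanup can be sketched in base R, without stringr, as a small helper. The function name clean_names and the sample column names are hypothetical, chosen only to illustrate the three steps.

```r
# Hypothetical helper: trim, lowercase, then replace internal spaces.
clean_names <- function(x) {
  x <- trimws(x)                   # drop leading/trailing whitespace
  x <- tolower(x)                  # convert to lowercase
  gsub(" ", ".", x, fixed = TRUE)  # replace remaining spaces with periods
}

# Made-up data frame with messy column names
raw <- data.frame(a = 1:2, b = 3:4)
names(raw) <- c(" First Name ", "Zip Code")

names(raw) <- clean_names(names(raw))
names(raw)  # "first.name" "zip.code"
```

Trimming before replacing matters: otherwise the surrounding spaces would already have become periods by the time the trim runs.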
-1