Split a string on more than one space

I am trying to load some data into R which is in the following format (as a text file)

Name                  Country            Age
John,Smith            United Kingdom     20
Washington,George     USA                50
Martin,Joseph         Argentina          43

The problem is that the "columns" are separated by spaces so that they line up neatly, but one line may have 5 spaces between values and the next 10. So when I load it using read.delim, I get a data.frame with a single column, with

"John,Smith            United Kingdom     20"

as a first observation, etc.

Is there a way I can:

  • Load data into R in a convenient format? or
  • Separate the character strings into separate columns once I have loaded them in single-column format?

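Ideally I would split on 2 or more spaces, since splitting on a single space breaks values that contain one (for example, "United Kingdom" becomes "United" "" "Kingdom").

I tried strsplit(data.frame[,1], split="\\s"), but that returns:

"John,Smith" "" "" "" "" "" "" "" "United" "" "Kingdom" "" ""...

which is not what I want.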


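Typically, data like this is called "fixed width" data.

Assume the file is called "x". Replace "x" with your own file path/name, as you would with read.delim.

Sample data: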

x <- tempfile()
cat("Name                  Country            Age\nJohn,Smith            United Kingdom     20\nWashington,George     USA                50\nMartin,Joseph         Argentina          43\n", file = x)

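Base R has a function for this (read.fwf), but you need to supply the column widths yourself. For this data, it would look something like this: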

read.fwf(x, c(22, 18, 4), strip.white = TRUE, skip = 1, 
         col.names = c("Name", "Country", "Age"))
#                Name        Country Age
# 1        John,Smith United Kingdom  20
# 2 Washington,George            USA  50
# 3     Martin,Joseph      Argentina  43
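
If counting the widths by hand is tedious, they can often be derived from the header line. The following is only a sketch, and it assumes that every column is left-aligned under its header and that no header name contains a space. The sample file is rebuilt here with sprintf so the column positions are known; the names rows, starts, line_len and widths are just illustrative.

```r
# Build a small fixed-width file with known column positions
# (same sample data as in the question)
x <- tempfile()
rows <- sprintf("%-22s%-19s%s",
                c("Name", "John,Smith", "Washington,George", "Martin,Joseph"),
                c("Country", "United Kingdom", "USA", "Argentina"),
                c("Age", "20", "50", "43"))
writeLines(rows, x)

header <- readLines(x, n = 1)

# Start position of each header word (a run of non-space characters)
starts <- as.integer(gregexpr("\\S+", header)[[1]])

# Each width is the distance to the next start; the last column
# runs to the end of the longest line in the file
line_len <- max(nchar(readLines(x)))
widths <- diff(c(starts, line_len + 1L))

# Read the data rows, taking the column names from the header
df <- read.fwf(x, widths = widths, skip = 1, strip.white = TRUE,
               col.names = strsplit(header, "\\s+")[[1]])
```

This will not work if a value starts to the left of its header, as can happen with right-aligned numeric columns.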

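The "readr" package also has read_fwf. You can give it the widths with fwf_widths, or let fwf_empty guess the column positions from the blank columns: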

library(readr)
read_fwf(x, fwf_empty(x, col_names = c("Name", "Country", "Age")), skip = 1)
#                Name        Country Age
# 1        John,Smith United Kingdom  20
# 2 Washington,George            USA  50
# 3     Martin,Joseph      Argentina  43

In base R, you can also split wherever there is more than 1 space:

txt = "Name                  Country            Age
John,Smith            United Kingdom     20
Washington,George     USA                50
Martin,Joseph         Argentina          43"

conn = textConnection(txt)
do.call(rbind, lapply(readLines(conn), function(u) strsplit(u,'\\s{2,}')[[1]]))
#     [,1]                [,2]             [,3] 
#[1,] "Name"              "Country"        "Age"
#[2,] "John,Smith"        "United Kingdom" "20" 
#[3,] "Washington,George" "USA"            "50" 
#[4,] "Martin,Joseph"     "Argentina"      "43" 
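
The result above is a character matrix whose first row is the header. A minimal sketch of turning it into a data.frame with proper column names and types follows. It assumes that adjacent values are always separated by at least two spaces and that no value is empty, so every line splits into the same number of pieces.

```r
txt <- c("Name                  Country            Age",
         "John,Smith            United Kingdom     20",
         "Washington,George     USA                50",
         "Martin,Joseph         Argentina          43")

# Split each line wherever two or more whitespace characters occur
m <- do.call(rbind, strsplit(txt, "\\s{2,}"))

# The first row holds the column names; the rest is data
df <- as.data.frame(m[-1, , drop = FALSE], stringsAsFactors = FALSE)
names(df) <- m[1, ]

# Convert columns that look numeric (here Age) from character
df[] <- lapply(df, type.convert, as.is = TRUE)
```

After this, str(df) should show Name and Country as character columns and Age as an integer column.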