Divide the "Name" into the "FirstName" and "LastName" columns of the data frame

Question

I am trying to figure out how to take one column, "Name", in a data frame and split it into two other columns, FirstName and LastName, in the same data frame. The problem is that some of my names have several last names. Essentially, I want to take the first word and put it in the FirstName column, and then put all the remaining text (minus the leading space, of course) in the LastName column.

This is my data frame, "tteam":

NAME <- c('John Doe','Peter Gynn','Jolie Hope-Douglas', 'Muhammad Arnab Halwai')
TITLE <- c("assistant", "manager", "assistant", "specialist")
tteam<- data.frame(NAME, TITLE)

My desired result:

FirstName <- c("John", "Peter", "Jolie", "Muhammad")
LastName <- c("Doe", "Gynn", "Hope-Douglas", "Arnab Halwai")
tteamdesire <- data.frame(FirstName, LastName, TITLE)

I tried the following code to create a new data frame containing only the names, which lets me extract the first names from the first column. However, I cannot get the last names into a usable form.

names <- tteam$NAME  ## puts full names into the names vector
## splits all names out into a data frame: THE PROBLEM IS HERE!
namesdf <- data.frame(do.call('rbind', strsplit(as.character(names), ' ', fixed = TRUE)))
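
A minimal sketch of why this attempt goes wrong, using two of the names above: strsplit returns pieces of unequal length, and rbind recycles the shorter rows to fill the extra column. The object names parts and m are illustrative only.

```r
# Two names with different word counts
parts <- strsplit(c("John Doe", "Muhammad Arnab Halwai"), " ", fixed = TRUE)
lengths(parts)   # word counts differ between the two names

# rbind warns and recycles the shorter row to fill the third column
m <- suppressWarnings(do.call(rbind, parts))
m[1, ]           # "John" "Doe" "John": the first name is repeated
```

So the first column is reliable, but the remaining columns mix real last names with recycled values.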

+5

r strsplit

Ryanl Oct 21 '14 at 14:32

4

Try:

> firstname = sapply(strsplit(NAME, ' '), function(x) x[1])
> firstname
[1] "John"     "Peter"    "Jolie"    "Muhammad"

> lastname = sapply(strsplit(NAME, ' '), function(x) paste(x[-1], collapse = ' '))
> lastname
[1] "Doe"          "Gynn"         "Hope-Douglas" "Arnab Halwai"

Using x[-1] rather than x[length(x)] keeps every word after the first, so multi-word last names stay intact. To avoid splitting twice, store the result of strsplit first:

> ll = strsplit(NAME, ' ')
> firstname = sapply(ll, function(x) x[1])
> lastname = sapply(ll, function(x) paste(x[-1], collapse = ' '))
> firstname
[1] "John"     "Peter"    "Jolie"    "Muhammad"
> lastname
[1] "Doe"          "Gynn"         "Hope-Douglas" "Arnab Halwai"
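
A minimal sketch of how the split pieces could be assembled into the desired data frame, assuming everything after the first word belongs to the last name. The names parts and result are illustrative, and vapply is used in place of sapply only to guarantee a character result.

```r
NAME  <- c("John Doe", "Peter Gynn", "Jolie Hope-Douglas", "Muhammad Arnab Halwai")
TITLE <- c("assistant", "manager", "assistant", "specialist")
tteam <- data.frame(NAME, TITLE, stringsAsFactors = FALSE)

# Split each full name once; as.character guards against a factor column
parts <- strsplit(as.character(tteam$NAME), " ", fixed = TRUE)

result <- data.frame(
  # First word of each name
  FirstName = vapply(parts, function(x) x[1], character(1)),
  # All remaining words, re-joined with single spaces
  LastName  = vapply(parts, function(x) paste(x[-1], collapse = " "), character(1)),
  TITLE     = tteam$TITLE,
  stringsAsFactors = FALSE
)
result
```

A name with no space ends up with an empty string as its LastName under this approach.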

+4

rnso Oct 21 '14 at 15:13

1) sub

data.frame(FirstName = sub(" .*", "", tteam$NAME), 
           LastName = sub("^\\S* ", "", tteam$NAME),
           tteam[-1])

2) gsubfn::read.pattern. In the NAME <- line we can omit as.character if NAME is already character (as opposed to factor):

library(gsubfn)

cn <- c("FirstName", "LastName")
NAME <- as.character(tteam$NAME)

cbind( read.pattern(text = NAME, pattern = "^(\\S*) (.*)", col.names = cn), tteam[-1])
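
One edge case worth noting for the sub approach in 1): if a name contains no space, neither pattern matches, so both columns would receive the same word. The following is a hedged sketch of one way to handle that, not part of the original solution; split_name is a hypothetical helper, and returning NA for a missing last name is an assumption.

```r
# Hypothetical helper: split on the first run of whitespace,
# returning NA as LastName when the name is a single word
split_name <- function(x) {
  x <- trimws(as.character(x))
  has_space <- grepl(" ", x, fixed = TRUE)
  data.frame(
    FirstName = sub(" .*", "", x),
    LastName  = ifelse(has_space, sub("^\\S+\\s+", "", x), NA_character_),
    stringsAsFactors = FALSE
  )
}

res <- split_name(c("John Doe", "Cher", "Muhammad Arnab Halwai"))
res
```

The result can be combined with the remaining columns in the same way as above, for example cbind(split_name(tteam$NAME), tteam[-1]).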

Update: Restated the solution in terms of tteam and added a second solution.

+3

G. Grothendieck Oct 21 '14 at 14:37

You can use the unglue package:

library(unglue)
unglue_unnest(tteam, NAME, "{FirstName} {LastName}")
#>        TITLE FirstName     LastName
#> 1  assistant      John          Doe
#> 2    manager     Peter         Gynn
#> 3  assistant     Jolie Hope-Douglas
#> 4 specialist  Muhammad Arnab Halwai

0

Moody_Mudskipper Oct 08 '19 at 15:17

akrun · Accepted Answer · 2014-10-21T15:10:07+0000

Use extract from tidyr:

 library(tidyr)
 extract(tteam, NAME, c("FirstName", "LastName"), "([^ ]+) (.*)")
 #  FirstName     LastName      TITLE
 #1      John          Doe  assistant
 #2     Peter         Gynn    manager
 #3     Jolie Hope-Douglas  assistant
 #4  Muhammad Arnab Halwai specialist
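
As a hedged alternative within the same package, tidyr's separate can split on the first space only when extra = "merge" is set. This sketch assumes tidyr is installed; the object name out is illustrative.

```r
library(tidyr)

tteam <- data.frame(
  NAME  = c("John Doe", "Peter Gynn", "Jolie Hope-Douglas", "Muhammad Arnab Halwai"),
  TITLE = c("assistant", "manager", "assistant", "specialist"),
  stringsAsFactors = FALSE
)

# into has two names, so at most one split is made per row;
# extra = "merge" keeps any further words in LastName,
# fill = "right" yields NA in LastName for single-word names
out <- separate(tteam, NAME, into = c("FirstName", "LastName"),
                sep = " ", extra = "merge", fill = "right")
out
```

Unlike the regular expression passed to extract, this needs no capture groups, at the cost of being limited to a single separator pattern.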
