Messy messaging

I have some data in a text file (opened in Notepad) that is a mess. There is no separator between the columns that hold the different fields, but I know the character positions of each field. For example, columns 1-2 hold X, columns 7-10 hold Y, and so on.

How can I arrange this? Can this be done in R? What is the best way to do this?

2 answers

read.fwf (see ?read.fwf) may be a good bet here, since it reads fixed-width files.

Set the file path:

 temp <- "\\pathto\\file.txt"  # backslashes must be doubled in R strings (or use forward slashes)

Then set the width of the variables in the file as shown below.

 # columns 1-2 = X, columns 3-10 = Y
 widths <- c(2,8)

Then specify the column names.

 cols <- c("X","Y") 

Finally, import the data into a new variable in your session:

 dataset <- read.fwf(temp,widths,header=FALSE,col.names=cols) 
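
The question mentions a gap between the fields (X in columns 1-2, Y in columns 7-10). A minimal sketch of how that layout might be handled is shown here: a negative entry in widths tells read.fwf to skip that many characters. The sample lines and the temporary file are made up for illustration only.

```r
# Hypothetical sample data: X in columns 1-2, filler in 3-6, Y in 7-10.
lines <- c("ABxxxx1234",
           "CDyyyy5678")
tmp <- tempfile(fileext = ".txt")
writeLines(lines, tmp)

# A negative width tells read.fwf to skip that many characters.
widths <- c(2, -4, 4)
cols <- c("X", "Y")

# Extra arguments such as stringsAsFactors are passed on to read.table.
dataset <- read.fwf(tmp, widths, header = FALSE, col.names = cols,
                    stringsAsFactors = FALSE)
print(dataset)
```

Only the non-skipped fields need a name in col.names, so two names are enough for the three widths above.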

Something I have done in the past to handle this kind of mess is to import it into Excel as fixed-width text and then save it as a CSV.

Just a suggestion. If this is a one-off project, that should be fine, and it needs no coding at all. But if it's a repeat offender ... then you could look at regular expressions.

i.e. ^(.{6})(.{7})(.{2})(.{5})$ for 4 fields with widths of 6, 7, 2 and 5 characters, in that order.
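
Since the question asks about R, here is a rough sketch of how that pattern could be applied there using base R's regexec and regmatches. The two sample records and the column names f1 to f4 are invented for illustration; they are not from the original data.

```r
# Hypothetical records: fields of width 6, 7, 2 and 5 (20 characters total).
records <- c("ABCDEF1234567XY98765",
             "GHIJKL7654321ZW12345")

pattern <- "^(.{6})(.{7})(.{2})(.{5})$"

# regexec finds the match positions; regmatches extracts the full match
# followed by each captured group, one character vector per record.
matches <- regmatches(records, regexec(pattern, records))

# Drop the first element (the full match) and bind the groups into rows.
fields <- do.call(rbind, lapply(matches, function(m) m[-1]))
colnames(fields) <- c("f1", "f2", "f3", "f4")
parsed <- as.data.frame(fields, stringsAsFactors = FALSE)
print(parsed)
```

All columns come back as character here, so numeric fields would still need converting, for example with as.numeric.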
