I have a large data set with three variables (State, Zipcode, Name). Here's a little extract:
zz <- "State Zipcode Name
IL 60693 THISISTHEFIRST
IL 60693 TISISTHEFIRS
OH 45271 THISISTHEFIRST
CA 94085 THISISTHESECOND
CA 94085 THISISTHESECOND
CA 94085 THISISTHESECCOND
SC 29645 THISISTHETHIRD
SC 29645 THISISTHETHIRD
SC 29645 THISISTHETHIRD
SC 29645 THISISTHEFOURTH
SC 29645 ISISTHEFOURTH"
Data <- read.table(text=zz, header = TRUE)
I need to create a unique identifier for observations that share the same State, Zipcode, and Name. However, some of the names contain typos even though they refer to the same object (for example, THISISTHEFIRST vs. TISISTHEFIRS), so an exact match on Name is not enough.
I would like to get something similar to this:
State Zipcode Name ID
IL 60693 THISISTHEFIRST 1
IL 60693 TISISTHEFIRS 1
OH 45271 THISISTHEFIRST 2
CA 94085 THISISTHESECOND 3
CA 94085 THISISTHESECOND 3
CA 94085 THISISTHESECCOND 3
SC 29645 THISISTHETHIRD 4
SC 29645 THISISTHETHIRD 4
SC 29645 THISISTHETHIRD 4
SC 29645 THISISTHEFOURTH 5
SC 29645 ISISTHEFOURTH 5
How can I create such an identifier quickly and efficiently?
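To make the goal concrete, here is a rough sketch of the kind of result I am after, not a definitive solution. It uses only base R: adist() for Levenshtein distance, then hclust() and cutree() to group similar names within each State/Zipcode pair. The max_dist threshold of 3 and the helper name cluster_names are my own assumptions, and single linkage is just one possible choice (it can chain near-matches together on messier data).

```r
zz <- "State Zipcode Name
IL 60693 THISISTHEFIRST
IL 60693 TISISTHEFIRS
OH 45271 THISISTHEFIRST
CA 94085 THISISTHESECOND
CA 94085 THISISTHESECOND
CA 94085 THISISTHESECCOND
SC 29645 THISISTHETHIRD
SC 29645 THISISTHETHIRD
SC 29645 THISISTHETHIRD
SC 29645 THISISTHEFOURTH
SC 29645 ISISTHEFOURTH"
Data <- read.table(text = zz, header = TRUE, stringsAsFactors = FALSE)

# Assumed threshold: names whose edit distance is at most this value
# are treated as the same object. Tune it for the real data.
max_dist <- 3

# Hypothetical helper: label the names of ONE State/Zipcode group so that
# names within max_dist edits of each other share a label.
cluster_names <- function(x, max_dist) {
  u <- unique(x)
  if (length(u) == 1) return(rep(1L, length(x)))  # hclust() needs >= 2 items
  d <- as.dist(adist(u))                           # pairwise Levenshtein distances
  grp <- cutree(hclust(d, method = "single"), h = max_dist)
  grp[match(x, u)]                                 # map labels back to every row
}

# Exact match on State and Zipcode, fuzzy match on Name inside each group
key <- paste(Data$State, Data$Zipcode)
within <- ave(seq_len(nrow(Data)), key,
              FUN = function(i) cluster_names(Data$Name[i], max_dist))

# Number the combined keys in order of first appearance
full_key <- paste(key, within)
Data$ID <- match(full_key, unique(full_key))
print(Data)
```

Because the distance matrix is built only from the unique names inside each State/Zipcode group, the cost stays manageable as long as no single group has a very large number of distinct names.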