Invalid string length.

I am trying to use the seqdef function from TraMineR (version 1.8.4) to define a sequence object, but I always get an error message that makes no sense to me: Error in row.names<-.data.frame(*tmp*, value = value): Invalid string length.

My code:

 sample.sts <- seqdef(sample,
     var=c("jan2005", "feb2005", "mar2005", "apr2005", "may2005", "jun2005",
           "jul2005", "aug2005", "sep2005", "oct2005", "nov2005", "dec2005"),
     alphabet=c("Employee (full-time)", "Employee (part-time)",
                "Self-employed (full-time)", "Self-employed (part-time)",
                "unemployed", "Retired", "Student", "Other inactive",
                "Compulsory military service"),
     states=c("EF", "EP", "SF", "SP", "UE", "RE", "ST", "IA", "MS"),
     id="pidc")

The data frame sample looks like this:

       pidc           jan2005  feb2005  ...  dec2005  sex  edufirst  age05
  -------------------------------------------------------------------------
   1.  150163920001   .        .        ...  .        1    5         62
   2.  211518110003   .        .        ...  .        2    2         17
   3.  170295160002   .        .        ...  .        2    1         47
   4.  240386550002   2        2        ...  2        2    2         50
   5.  320099920001   .        .        ...  .        1    3         38
  -------------------------------------------------------------------------
   6.  200167850001   .        .        ...  .        1    5         39
   7.  340401190002   6        6        ...  6        1    3         61
   8.  180501260002   .        .        ...  .        1    3         29
   9.  230083560001   .        .        ...  .        1    3         61
  10.  240335270002   3        3        ...  3        2    3         30

The full output is:

[!] found "-" character in state codes, not recommended
[>] found missing values ("NA") in sequence data
[>] preparing 3266 sequences
[>] coding void elements with "%" and missing values with "*"
[!] sequence with index: 1,2,3, ...
[>] state coding:
[alphabet] [label] [long label]
1 Employee (full-time) EF EF
2 Employee (part-time) EP EP
3 Self-employed (full-time) SF SF
4 Self-employed (part-time) SP SP
5 unemployed UE UE
6 Retired RE RE
7 Student ST ST
8 Other inactive IA IA
9 Compulsory military service MS MS
[>] 3266 sequences in the data set
[>] minimum/maximum sequence length: 12/12
Error in row.names<-.data.frame(*tmp*, value = value):
Invalid string length.

I repeated this after relabeling the states without a "-", but that did not change the error. Does anyone know what causes it?

1 answer

The id argument of seqdef must be a vector with one entry per sequence, so its length must equal the number of sequences. Passing id="pidc" supplies a single string rather than the values of that column. Try id = as.character(sample$pid); id = sample$pid (without as.character) may also work:

 sample.sts <- seqdef(sample,
     var=c("jan2005", "feb2005", "mar2005", "apr2005", "may2005", "jun2005",
           "jul2005", "aug2005", "sep2005", "oct2005", "nov2005", "dec2005",
           "jan2006", "feb2006", "mar2006", "apr2006", "may2006", "jun2006",
           "jul2006", "aug2006", "sep2006", "oct2006", "nov2006", "dec2006",
           "jan2007", "feb2007", "mar2007", "apr2007", "may2007", "jun2007",
           "jul2007", "aug2007", "sep2007", "oct2007", "nov2007", "dec2007",
           "jan2008", "feb2008", "mar2008", "apr2008", "may2008", "jun2008",
           "jul2008", "aug2008", "sep2008", "oct2008", "nov2008", "dec2008"),
     alphabet=c("Employee (full-time)", "Employee (part-time)",
                "Self-employed (full-time)", "Self-employed (part-time)",
                "unemployed", "Retired", "Student", "Other inactive",
                "Compulsory military service"),
     states=c("EF", "EP", "SF", "SP", "UE", "RE", "ST", "IA", "MS"),
     id=as.character(sample$pid))
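
The failure can probably be reproduced in base R without TraMineR, assuming seqdef ends up assigning the id value as row names of a data frame (which is what the error's call to row.names<-.data.frame suggests). A minimal sketch; the data frame toy and its values are made up for illustration:

```r
# Toy stand-in for `sample`; the values are made up for illustration
toy <- data.frame(pidc = c("a1", "b2", "c3"), jan2005 = c(2, 6, 3))

# One string for three rows: the row-name assignment raises an error,
# which tryCatch turns into the marker value "error"
attempt <- tryCatch(
  {
    row.names(toy) <- "pidc"
    "no error"
  },
  error = function(e) "error"
)

# One name per row: the assignment succeeds
row.names(toy) <- as.character(toy$pidc)
```

The first assignment fails because a single name is supplied for three rows; the second works because the vector has one entry per row, which is the same requirement seqdef places on id.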

There are also several mismatches between the states in the data and the alphabet argument, because "-" has been replaced by ".". You should probably change the alphabet argument (use the seqstatl function to find out which state labels actually occur in your data).
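
A minimal sketch of that check, assuming TraMineR is installed. The data frame toy, its labels, and the intended vector are invented for illustration and are not taken from the real data:

```r
library(TraMineR)

# Made-up data in which "-" has been turned into "." in one label
toy <- data.frame(
  jan2005 = c("Employee (full.time)", "Retired", NA),
  feb2005 = c("Employee (full.time)", "Retired", "Student"),
  stringsAsFactors = FALSE
)
months <- c("jan2005", "feb2005")

# Distinct states that actually occur in the sequence columns
found <- seqstatl(toy, var = months)

# Labels we meant to pass as `alphabet` (hypothetical)
intended <- c("Employee (full-time)", "Retired", "Student")

# States present in the data but absent from the intended alphabet
mismatch <- setdiff(as.character(found), intended)
```

Anything left in mismatch is a state that seqdef would not find in the supplied alphabet, so either the alphabet or the data needs to be adjusted.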
