Invalid string length.

Question

I am trying to use the seqdef function from TraMineR (version 1.8.4) to define a sequence object, but I always get an error message that makes no sense to me: Error in `row.names<-.data.frame`(`*tmp*`, value = value) : Invalid string length.

My code:

    sample.sts <- seqdef(sample,
        var=c("jan2005", "feb2005", "mar2005", "apr2005", "may2005", "jun2005",
              "jul2005", "aug2005", "sep2005", "oct2005", "nov2005", "dec2005"),
        alphabet=c("Employee (full-time)", "Employee (part-time)",
                   "Self-employed (full-time)", "Self-employed (part-time)",
                   "unemployed", "Retired", "Student", "Other inactive",
                   "Compulsory military service"),
        states=c("EF", "EP", "SF", "SP", "UE", "RE", "ST", "IA", "MS"),
        id="pidc")

The data set sample looks like this:

                  pidc   jan2005   feb2005   ...   dec2005   sex   edufirst   age05
          -------------------------------------------------------------------------
      1.  150163920001         .         .   ...         .     1          5      62
      2.  211518110003         .         .   ...         .     2          2      17
      3.  170295160002         .         .   ...         .     2          1      47
      4.  240386550002         2         2   ...         2     2          2      50
      5.  320099920001         .         .   ...         .     1          3      38
          -------------------------------------------------------------------------
      6.  200167850001         .         .   ...         .     1          5      39
      7.  340401190002         6         6   ...         6     1          3      61
      8.  180501260002         .         .   ...         .     1          3      29
      9.  230083560001         .         .   ...         .     1          3      61
     10.  240335270002         3         3   ...         3     2          3      30

The full output is:

    [!] found the "-" symbol in state codes, not recommended
    [>] found missing values ("NA") in sequence data
    [>] preparing 3266 sequences
    [>] coding void elements with "%" and missing values with "*"
    [!] sequence with index: 1,2,3, ...
    [>] state coding:
           [alphabet]                   [label]  [long label]
         1 Employee (full-time)         EF       EF
         2 Employee (part-time)         EP       EP
         3 Self-employed (full-time)    SF       SF
         4 Self-employed (part-time)    SP       SP
         5 unemployed                   UE       UE
         6 Retired                      RE       RE
         7 Student                      ST       ST
         8 Other inactive               IA       IA
         9 Compulsory military service  MS       MS
    [>] 3266 sequences in the data set
    [>] min/max sequence length: 12/12
    Error in `row.names<-.data.frame`(`*tmp*`, value = value) :
      Invalid string length.

I re-ran it after relabeling the states without a "-", but that made no difference to the error. Can anyone tell me what causes it?

r

user1870829 Dec 6 '12 at 1:12

1 answer

Matthias Studer · Accepted Answer · 2012-12-06T09:23:30+0000

The id argument of seqdef must be a vector with one entry per sequence (that is, the length of the id vector must equal the number of sequences). Passing id="pidc" supplies a single string instead. Try id = as.character(sample$pidc). You can also try id = sample$pidc (without as.character):

    sample.sts <- seqdef(sample,
        var=c("jan2005", "feb2005", "mar2005", "apr2005", "may2005", "jun2005",
              "jul2005", "aug2005", "sep2005", "oct2005", "nov2005", "dec2005",
              "jan2006", "feb2006", "mar2006", "apr2006", "may2006", "jun2006",
              "jul2006", "aug2006", "sep2006", "oct2006", "nov2006", "dec2006",
              "jan2007", "feb2007", "mar2007", "apr2007", "may2007", "jun2007",
              "jul2007", "aug2007", "sep2007", "oct2007", "nov2007", "dec2007",
              "jan2008", "feb2008", "mar2008", "apr2008", "may2008", "jun2008",
              "jul2008", "aug2008", "sep2008", "oct2008", "nov2008", "dec2008"),
        alphabet=c("Employee (full-time)", "Employee (part-time)",
                   "Self-employed (full-time)", "Self-employed (part-time)",
                   "unemployed", "Retired", "Student", "Other inactive",
                   "Compulsory military service"),
        states=c("EF", "EP", "SF", "SP", "UE", "RE", "ST", "IA", "MS"),
        # id must be a vector with one entry per row, not a column name
        id=as.character(sample$pidc))
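To see why a single string fails as an id, here is a minimal base R sketch that does not need TraMineR. It assumes that seqdef ends up assigning the id values as row names, which is what the `row.names<-.data.frame` call in the error message suggests. The data frame df and its values are made up for illustration.

```r
# Toy data frame standing in for `sample`; the values are invented.
df <- data.frame(pidc = c(150163920001, 211518110003, 170295160002),
                 jan2005 = c(NA, 2, 6))

# A single string has length 1, but df has 3 rows, so it cannot be
# used as row names. Capture the error message instead of stopping.
bad <- tryCatch({
  row.names(df) <- "pidc"
  "no error"
}, error = function(e) conditionMessage(e))
print(bad)

# One identifier per row works.
ids <- as.character(df$pidc)
row.names(df) <- ids
print(row.names(df))
```

The same length rule applies to the id argument: it needs as many entries as there are rows in the data.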

There also seem to be some discrepancies between the states in your data and the alphabet argument, because "-" has been replaced by ".". You should probably adjust the alphabet argument (use the seqstatl function to find out which state labels actually occur in your data).
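One way to check for such mismatches is sketched below. The data frame toy and its labels are invented to mimic a "-" versus "." difference. The comparison uses base R so that it runs without TraMineR, and the seqstatl call is only executed if the package is installed.

```r
# Toy data; labels are made up to mimic a "-" vs "." mismatch.
toy <- data.frame(jan2005 = c("Employee (full.time)", "Retired"),
                  feb2005 = c("Employee (full.time)", "Student"),
                  stringsAsFactors = FALSE)
alphabet <- c("Employee (full-time)", "Retired", "Student")

# Distinct states that actually occur in the sequence columns.
observed <- sort(unique(unlist(toy, use.names = FALSE)))

# States in the data that the alphabet does not cover, and vice versa.
missing_from_alphabet <- setdiff(observed, alphabet)
unused_in_alphabet <- setdiff(alphabet, observed)
print(missing_from_alphabet)
print(unused_in_alphabet)

# With TraMineR installed, seqstatl lists the observed states directly.
if (requireNamespace("TraMineR", quietly = TRUE)) {
  print(TraMineR::seqstatl(toy))
}
```

Any label that shows up in either difference needs to be corrected in the alphabet argument or in the data before calling seqdef.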
