sprintf in R does not account for umlauts

I have a character vector and want all of its elements to have the same length, so I pad the shorter elements with spaces. For example:

vec <- c("fjdlksa01dada","rau","sjklf")
x <- sprintf("%-15s", vec)
nchar(x)
# returns
[1] 15 15 15

as suggested in the answers to my previous question. This works as expected, but umlauts seem to cause problems. For example, if my vector looks like this:

vec2 <- c("fjdlksa01dada","rauü","sjklf")
y <- sprintf("%-15s", vec2)
nchar(y)
# returns
[1] 15 14 15

I am running R on Mac OS X (10.6). How can I fix this?

EDIT: Note, I do not want to correct the nchar output because it is correct. The problem is that sprintf is losing the umlaut.

EDIT: I updated R and changed to DWin's locale, with no change at all. But:

vec2 <- c("fjdlksa01dada","rauü","sjklf")
Encoding(vec2)
# returns
[1] "unknown" "UTF-8"   "unknown"

Weird.

+5
2 answers

There may be a cleaner way ... but this works:

sapply(vec, function(x){
      paste(x, paste(rep(" ", 13-nchar(x)), collapse=""), "")
      })

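(The 13 is deliberate: paste() adds a separator space on each side of the padding, so every result is 15 characters wide.)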
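A variant of the same idea, as a minimal sketch: pad by character count rather than by bytes, so that an umlaut counts once. The helper name pad_right is made up for illustration, and strrep() assumes R 3.3.0 or later.

```r
# Hypothetical helper (not part of base R): right-pad strings to `width` characters.
pad_right <- function(x, width) {
  # Count characters, not bytes, so "ü" counts once even when stored as UTF-8.
  n_chars <- nchar(x, type = "chars")
  # Strings already longer than `width` get no padding instead of an error.
  n_pad <- pmax(width - n_chars, 0)
  paste0(x, strrep(" ", n_pad))
}

# "\u00fc" is "ü", written as an escape so the example does not depend on the file encoding.
vec2 <- c("fjdlksa01dada", "rau\u00fc", "sjklf")
padded <- pad_right(vec2, 15)
nchar(padded, type = "chars")
# [1] 15 15 15
```

Unlike the paste() version, this one takes the target width directly and leaves over-long strings unchanged.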

+1

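From ?sprintf:

- If any element of fmt or any character argument is declared as UTF-8, the element of the result will be in UTF-8 and have the encoding declared as UTF-8. Otherwise it will be in the current locale's encoding.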

Rgui [...].

In my case, this gives:

> vec2 <- c("fjdlksa01dada","rauü","sjklf")
> y <- sprintf("%-15s", vec2)
> nchar(y)
[1] 15 15 15

On MacOs, you could try starting R with the encoding set explicitly:

Rgui --encoding=utf-8
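To check whether byte-based padding is what produces the 14 on a given setup, it may help to compare the character and byte counts of the affected string. This is only a sketch, and it assumes the string is stored as UTF-8 (forced here with the "\u00fc" escape):

```r
# "rauü", written with an escape so the string is stored as UTF-8.
s <- "rau\u00fc"

n_chars <- nchar(s, type = "chars")  # 4: "ü" is one character
n_bytes <- nchar(s, type = "bytes")  # 5: "ü" takes two bytes in UTF-8

# If a width of 15 is counted in bytes, only 15 - n_bytes spaces are added,
# so the padded string ends up with this many characters:
padded_chars <- n_chars + (15 - n_bytes)
padded_chars
# [1] 14
```

That matches the 14 reported in the question, which would be consistent with the width being counted in bytes on that system.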
+1
