R: Generate a histogram from data samples

Suppose I have a vector a:

 c(1, 6, 2, 4.1, 1, 2) 

And a vector of counts b:

 c(2,3,2,1,1,0) 

I would like to generate a vector c:

 c(1, 1, 6, 6, 6, 2, 2, 4.1, 1) 

So that I can call:

 hist(c) 

How can I build c, or is there a way to generate a histogram directly from a and b? Note the duplicates in a, as well as the unequal spacing.

A vectorized solution is required: a and b are too big for lapply and friends.

+6
3 answers

?rep

 > rep(a, b)
 [1] 1.0 1.0 6.0 6.0 6.0 2.0 2.0 4.1 1.0
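As a minimal sketch of the whole workflow with the vectors from the question: the expanded vector is named `x` here (an illustrative choice, to avoid masking the base function `c`), and `plot = FALSE` is used only so the histogram object can be inspected without opening a graphics device.

```r
# Values and their repeat counts, taken from the question
a <- c(1, 6, 2, 4.1, 1, 2)
b <- c(2, 3, 2, 1, 1, 0)

# Repeat each element of a by the matching count in b;
# a count of 0 simply drops that element
x <- rep(a, b)

# Build the histogram object without drawing it;
# drop plot = FALSE to draw the plot as usual
h <- hist(x, plot = FALSE)

# The bin counts should add up to the total number of samples
sum(h$counts) == sum(b)
```

The default breaks are chosen by `hist` itself, so the unequal spacing in `a` does not need special handling unless specific bins are wanted.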

Edit, as I was curious!

 a <- sample(1:10, 1e6, replace=TRUE)
 b <- sample(1:10, 1e6, replace=TRUE)
 > system.time(rep(a, b))
    user  system elapsed
   0.140   0.016   0.156
 > system.time(inverse.rle(list(lengths=b, values=a)))
    user  system elapsed
   0.024   0.004   0.028
+10

Just to offer something other than rep:

 > inverse.rle(list(lengths=b,values=a))
 [1] 1.0 1.0 6.0 6.0 6.0 2.0 2.0 4.1 1.0
+5

Some benchmarking and a faster solution. rep.int is a faster implementation of rep for the standard use case (from ?rep):

 rep.int(a, b) 

I was not sure about the benchmarking above.

inverse.rle is just a wrapper for rep.int, and rep.int should be faster than rep. I would expect the wrapper overhead in inverse.rle to make it slower than calling rep(), which is a primitive function.
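Before comparing timings, it may help to confirm that the three calls really produce the same vector. The sketch below uses the small vectors from the question; the names `via_rep`, `via_rep_int`, `via_rle` and `all_same` are illustrative only.

```r
# Values and their repeat counts, taken from the question
a <- c(1, 6, 2, 4.1, 1, 2)
b <- c(2, 3, 2, 1, 1, 0)

# The three approaches discussed in the answers
via_rep     <- rep(a, b)
via_rep_int <- rep.int(a, b)
via_rle     <- inverse.rle(list(lengths = b, values = a))

# Check that all three give exactly the same result
all_same <- identical(via_rep, via_rep_int) && identical(via_rep, via_rle)
print(all_same)
```

Since the results agree, the choice between them comes down to speed and readability.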

Some microbenchmarks:

 library(microbenchmark)
 microbenchmark(rep(a,b), rep.int(a,b), inverse.rle(list(values = a, lengths =b)))
 Unit: milliseconds
                                         expr      min       lq   median       uq      max
 1 inverse.rle(list(values = a, lengths = b)) 29.06968 29.26267 29.36191 29.67501 72.80645
 2                                  rep(a, b) 25.65125 25.76246 25.84869 26.52348 69.00169
 3                              rep.int(a, b) 20.38604 23.31840 23.38940 23.69600 66.40759

There is little in it, but rep.int appears to be the winner, as it should be.

+4
