How to separate the two leftmost histogram bars in R

Suppose I build a data set and plot its histogram as shown below:

set.seed(1)
dataset <- sample(1:7, 1000, replace=T)
hist(dataset)

As you can see in the figure below, the two leftmost bars have no space between them, unlike the other bars.

[Figure: histogram of dataset, in which the two leftmost bars touch]

I tried changing xlim, but that didn't work. Basically, I would like every number (from 1 to 7) to be represented by its own bin, and, in addition, I would like any two neighboring bins to have space between them. Thank you!

+7
2 answers

The best way is to set the breaks argument manually. Using the data from your code,

 hist(dataset,breaks=rep(1:7,each=2)+c(-.4,.4)) 

gives the following graph:

[Figure: histogram of dataset drawn with the manual breaks]

The first part, rep(1:7,each=2), gives the values that you want the bars to be centered on. The second part determines how wide the bars are; if you change it to c(-.49,.49) the bars will almost touch, and if you change it to c(-.3,.3) you will get narrower bars. If you set it to c(-.5,.5), then R will yell at you because you are not allowed to have the same number in your breaks vector twice.
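As a rough sketch of how the half-width controls the bars, the breaks construction can be wrapped in a small function. The name gapped_breaks and its arguments are hypothetical (they are not part of base R), and the sketch assumes the centers are spaced exactly 1 apart, as in the question.

```r
# Hypothetical helper (not part of base R): build a breaks vector with one
# bin of width 2 * half_width centered on each value in `centers`.
# Assumes the centers are spaced exactly 1 apart.
gapped_breaks <- function(centers, half_width) {
  stopifnot(half_width > 0, half_width < 0.5)
  rep(centers, each = 2) + c(-half_width, half_width)
}

set.seed(1)
dataset <- sample(1:7, 1000, replace = TRUE)

wide   <- gapped_breaks(1:7, 0.49)  # bars almost touch
narrow <- gapped_breaks(1:7, 0.3)   # narrower bars, wider gaps

hist(dataset, breaks = narrow)
```

Swapping narrow for wide in the last line should show the other extreme described above.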

Why does it work?

If you split the breaks vector, you get one part that looks like this:

 > rep(1:7,each=2)
 [1] 1 1 2 2 3 3 4 4 5 5 6 6 7 7

and the second part, which looks like this:

 > c(-.4,.4)
 [1] -0.4  0.4

When you add them together, R recycles the second vector as many times as necessary to match the length of the first vector. So you get

  1-0.4 1+0.4 2-0.4 2+0.4 3-0.4 3+0.4 [etc.] = 0.6 1.4 1.6 2.4 2.6 3.4 [etc.] 

Thus, you have one bar from 0.6 to 1.4, centered at 1 with a width of 2 * .4, another bar from 1.6 to 2.4, centered at 2 with a width of 2 * .4, and so on. If you had data between them (for example, 2.5), then the histogram would look silly, because it would create a bar from 2.4 to 2.6, and the bar widths would not be equal (this bar would be only .2 wide, while all the others are .8 wide). But with integer values that is not a problem.
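To check this reasoning without looking at a plot, one option is to ask hist for the counts only. The sketch below reuses the data from the question; the variable names breaks, h, data_bins and gap_bins are just illustrative.

```r
set.seed(1)
dataset <- sample(1:7, 1000, replace = TRUE)

# c(-.4, .4) is recycled seven times to match the 14 elements of rep(...)
breaks <- rep(1:7, each = 2) + c(-.4, .4)
print(breaks)

# plot = FALSE returns the bin counts without drawing anything
h <- hist(dataset, breaks = breaks, plot = FALSE)

# 14 breaks define 13 bins: 7 data bins alternating with 6 gap bins
data_bins <- h$counts[seq(1, 13, by = 2)]
gap_bins  <- h$counts[seq(2, 13, by = 2)]

print(data_bins)  # one count per integer value 1..7
print(gap_bins)   # bins such as 1.4 to 1.6, which integer data never hits
```

With integer data the gap bins should all be zero, which is why the bars appear separated.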

+9

You need six bars, not seven bars; this is what your histogram has. But then you get seven bars. This is a mistake.

Use sample(1:6, 1000, replace = T) instead of sample(1:7, 1000, replace = T).

If you need seven bars, then seed with 0

-3
