NumPy histogram cumulative density does not sum to 1

Taking a hint from another thread (@EnricoGiampieri's answer to a question about cumulative distribution plots in Python), I wrote:

    # plot cumulative density function of nearest nbr distances
    # evaluate the histogram
    values, base = np.histogram(nearest, bins=20, density=1)
    # evaluate the cumulative
    cumulative = np.cumsum(values)
    # plot the cumulative function
    plt.plot(base[:-1], cumulative, label='data')

I set density=1 because of the np.histogram documentation, which says:

"Note that the sum of the histogram values will not be equal to 1 unless bins of unity width are chosen; it is not a probability mass function."

Well, indeed, when plotted, the values do not sum to 1. But I do not understand "bins of unity width." When I set bins to 1, of course, I get an empty plot; when I set them to the population size, the sum is still not 1 (more like 0.2). When I use the 40 bins suggested, they sum to about 0.006.

Can someone give me some advice? Thanks!

+7
python numpy
2 answers

You need to make sure your bins all have a width of 1. That is:

    np.all(np.diff(base) == 1)

To achieve this, you have to specify your bins manually:

    # integer-spaced edges; the + 1 is needed because arange excludes its stop
    # value, so without it data above ceil(max) - 1 would be dropped
    bins = np.arange(np.floor(nearest.min()), np.ceil(nearest.max()) + 1)
    values, base = np.histogram(nearest, bins=bins, density=True)

And you get:

    In [18]: np.all(np.diff(base)==1)
    Out[18]: True

    In [19]: np.sum(values)
    Out[19]: 0.99999999999999989
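
To see why unit-width bins matter: with density=True the returned values are densities, so it is each value times its bin width that sums to 1, whatever the widths are. The following is a minimal sketch of that idea, not part of the original answer; the `nearest` array is made-up stand-in data and `rng`, `widths`, `total_density` and `total_probability` are names introduced only for illustration.

```python
import numpy as np

# Hypothetical stand-in for the asker's `nearest` distances.
rng = np.random.default_rng(0)
nearest = rng.exponential(scale=3.0, size=500)

values, base = np.histogram(nearest, bins=20, density=True)
widths = np.diff(base)  # one width per bin

# Plain sum of densities: equals 1 only if every width is 1.
total_density = values.sum()

# Density times width is the probability mass in each bin.
total_probability = (values * widths).sum()

# Cumulative distribution evaluated at the right edge of each bin.
cumulative = np.cumsum(values * widths)

print(total_density, total_probability, cumulative[-1])
```

Plotting `cumulative` against `base[1:]` should then give a curve that ends at 1 without forcing the bins to unit width.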
+6

You can simply normalize your values variable like this:

unity_values = values / values.sum()

A complete example would look something like this:

    import numpy as np
    import matplotlib.pyplot as plt

    x = np.random.normal(size=37)
    # density=True alone is enough; `normed` is a deprecated alias that
    # newer NumPy versions reject
    density, bins = np.histogram(x, density=True)
    unity_density = density / density.sum()

    fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(nrows=2, ncols=2, sharex=True, figsize=(8, 4))
    widths = np.diff(bins)  # positive bin widths

    # draw each bar from the left edge of its bin across the full bin width
    ax1.bar(bins[:-1], density, width=widths, align='edge')
    ax2.bar(bins[:-1], density.cumsum(), width=widths, align='edge')
    ax3.bar(bins[:-1], unity_density, width=widths, align='edge')
    ax4.bar(bins[:-1], unity_density.cumsum(), width=widths, align='edge')

    ax1.set_ylabel('Not normalized')
    ax3.set_ylabel('Normalized')
    ax3.set_xlabel('PDFs')
    ax4.set_xlabel('CDFs')
    fig.tight_layout()
    plt.show()
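
Dividing by the sum gives per-bin probabilities here because an integer bin count produces equal-width bins. A hedged sketch of that caveat follows; the unequal `edges` array is arbitrary and chosen only for illustration, and the variable names are not from the original answer. With unequal widths, multiplying each density by its own bin width is what recovers the per-bin probabilities.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=37)

# Equal-width bins: dividing by the sum matches multiplying by the width.
density, bins = np.histogram(x, density=True)
by_sum = density / density.sum()
by_width = density * np.diff(bins)
equal_case_matches = np.allclose(by_sum, by_width)

# Unequal-width bins (edges are arbitrary, for illustration only).
edges = np.array([-10.0, -1.0, -0.5, 0.0, 0.25, 1.0, 10.0])
density_u, _ = np.histogram(x, bins=edges, density=True)
counts_u, _ = np.histogram(x, bins=edges)

# Density times width gives the fraction of samples in each bin.
prob_u = density_u * np.diff(edges)

print(equal_case_matches, prob_u.sum())
```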


+12