Matplotlib logarithmic scale with zero value

Question

I have a very large and sparse dataset of Twitter spam accounts, and I need to scale the x axis to be able to visualize the distribution (histogram, KDE, etc.) and the CDF of various variables (tweets_count, number of followers / following, etc.).

 > describe(spammers_class1$tweets_count)
   var       n   mean      sd median trimmed mad min    max  range  skew kurtosis   se
 1   1 1076817 443.47 3729.05     35   57.29  43   0 669873 669873 53.23  5974.73 3.59

In this dataset, the value 0 is of great importance (in fact, 0 should have the highest density). However, on a logarithmic scale, these values are dropped. I was thinking of changing the value to 0.1, for example, but it doesn't make sense to have spam accounts with 10^-1 followers.

So what is the workaround in Python and matplotlib?

python matplotlib logarithm

amaatouq May 05 '13 at 8:53

2 answers

 ax1.set_xlim(0, 1e3)

Here is an example from the matplotlib documentation.

There, the axis limits are set as follows:

 ax1.set_xlim(1e1, 1e3)
 ax1.set_ylim(1e2, 1e3)

Stephane Rolland May 05 '13 at 9:25

unutbu · Accepted Answer · 2013-05-05T09:35:44+0000

Add 1 to each x value, then take the log:

 import matplotlib.pyplot as plt
 import numpy as np
 import matplotlib.ticker as ticker

 fig, ax = plt.subplots()

 x = [0, 10, 100, 1000]
 y = [100, 20, 10, 50]

 # Shift x by 1 so that x = 0 lands on log(1) = 0
 x = np.asarray(x) + 1
 y = np.asarray(y)

 ax.plot(x, y)
 ax.set_xscale('log')
 ax.set_xlim(x.min(), x.max())

 # Label each tick with the original (unshifted) x value
 ax.xaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: '{0:g}'.format(x-1)))
 ax.xaxis.set_major_locator(ticker.FixedLocator(x))

 plt.show()

Use

 ax.xaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: '{0:g}'.format(x-1)))
 ax.xaxis.set_major_locator(ticker.FixedLocator(x))

to make the tick labels show the original, unshifted x values instead of the shifted values that are actually plotted.

(My initial suggestion was to use plt.xticks(x, x-1), but that would affect all axes. To confine the changes to a single axes, I changed all the plt calls to calls on ax.)

Matplotlib drops points containing NaN, inf or -inf. Since log(0) is -inf, the point corresponding to x=0 is removed from a log plot.
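As a quick check of the claim above, the following sketch uses NumPy alone (no plotting) to show that the logarithm of zero is not a finite number, so it has no position on a log axis. The sample values are the ones from the answer's example; the variable names are only illustrative.

```python
import numpy as np

x = np.array([0, 10, 100, 1000], dtype=float)

# log10(0) evaluates to -inf; silence the divide-by-zero warning for the demo.
with np.errstate(divide="ignore"):
    logx = np.log10(x)

# Only points with a finite logarithm can be placed on a log axis.
finite = np.isfinite(logx)

print(logx)
print(x[finite])  # the x values that survive on a log scale
```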

If you increase all x-values by 1, then, since log(1) = 0, the point corresponding to x=0 will now be plotted at x=log(1)=0 on the log plot.
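The same shift can be applied to a histogram, which is closer to what the question asks for. The sketch below is only an illustration under assumptions: the data are synthetic (a hypothetical stand-in for a count variable such as tweets_count), and the bin count and tick positions are arbitrary choices. The Agg backend is selected only so that the code runs without a display; drop that line and call plt.show() to view the figure.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove for interactive use
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import numpy as np

# Hypothetical data: many exact zeros plus a heavy right tail of integer counts.
rng = np.random.default_rng(0)
counts = np.concatenate([
    np.zeros(500),
    rng.lognormal(mean=3.0, sigma=2.0, size=1500).round(),
])

shifted = counts + 1  # an original value of 0 becomes 1, i.e. log(1) = 0

# Log-spaced bin edges starting at 1, so the first bin holds the original zeros.
bins = np.geomspace(1, shifted.max() + 1, 30)

fig, ax = plt.subplots()
n, edges, patches = ax.hist(shifted, bins=bins)
ax.set_xscale('log')

# Put ticks at shifted positions and label them with the original values.
tick_positions = np.array([0, 10, 100, 1000, 10000]) + 1
ax.xaxis.set_major_locator(ticker.FixedLocator(tick_positions))
ax.xaxis.set_major_formatter(
    ticker.FuncFormatter(lambda v, pos: '{0:g}'.format(v - 1)))
ax.set_xlabel('count (plotted as count + 1)')
```

Because the counts are integers and the first bin edge above 1 is smaller than 2, the leftmost bar contains exactly the accounts whose original value is 0.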

The remaining x values will also be shifted by one, but this makes no visible difference, since log(x+1) is very close to log(x) for large x values.
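To put a number on "very close", this small sketch computes how far each point moves along a base-10 log axis when x + 1 is plotted instead of x. The displacement equals log10(1 + 1/x): about 0.3 decades at x = 1, but less than 0.001 decades by x = 1000. The sample values are arbitrary.

```python
import numpy as np

x = np.array([1, 10, 100, 1000, 100000], dtype=float)

# Horizontal displacement on a log10 axis caused by plotting x + 1 instead of x.
shift = np.log10(x + 1) - np.log10(x)

for xi, s in zip(x, shift):
    print(f"x = {xi:>8g}: shift = {s:.6f} decades")
```

The distortion is therefore concentrated in the smallest nonzero values, which is worth keeping in mind when reading the left end of the axis.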
