Slightly modified from: Python Pandas Dataframe: normalize data between 0.01 and 0.99? , but based on some of the comments I thought it was relevant here (sorry if this counts as a repost though...)
I needed a customized normalization because the usual normalization based on percentiles or z-scores was not adequate. Sometimes I knew what the feasible max and min of the population were, and so I wanted to define them myself rather than take them from my sample, or I wanted a different midpoint, or whatever! This is often useful for rescaling and normalizing data for neural networks, where you may want all inputs between 0 and 1 but some of your data needs to be scaled in a more customized way, because percentiles and stdevs assume your sample covers the population, and sometimes we know that is not true. It was also very useful for me when visualizing data in heatmaps. So I built a custom function (I used extra steps in the code here to make it as readable as possible):
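    # mean/median were not imported in the original post; statistics is one option
    from statistics import mean, median

    def NormData(s, low='min', center='mid', hi='max', insideout=False, shrinkfactor=0.):
        # Resolve the string options into numbers taken from the sample itself
        if low == 'min':
            low = min(s)
        elif low == 'abs':
            low = max(abs(min(s)), abs(max(s))) * -1.
        if hi == 'max':
            hi = max(s)
        elif hi == 'abs':
            hi = max(abs(min(s)), abs(max(s))) * 1.
        if center == 'mid':
            center = (max(s) + min(s)) / 2
        elif center == 'avg':
            center = mean(s)
        elif center == 'median':
            center = median(s)

        # Shift everything so that the center sits at 0
        s2 = [x - center for x in s]
        hi = hi - center
        low = low - center
        center = 0.

        # Map [low, center] onto [0, 0.5] and [center, hi] onto [0.5, 1], clipping outliers
        r = []
        for x in s2:
            if x < low:
                r.append(0.)
            elif x > hi:
                r.append(1.)
            else:
                if x >= center:
                    r.append((x - center) / (hi - center) * 0.5 + 0.5)
                else:
                    r.append((x - low) / (center - low) * 0.5 + 0.)

        # Optionally fold around 0.5 so the center becomes the highest value
        if insideout:
            ir = [(1. - abs(z - 0.5) * 2.) for z in r]
            r = ir

        # Pull the results toward 0.5, away from the 0 and 1 endpoints
        rr = [x - (x - 0.5) * shrinkfactor for x in r]
        return rr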
This takes a Pandas series, or even just a list, and normalizes it to the low, center, and high points you specify. There is also a shrink factor, so you can scale the data in from the endpoints 0 and 1 (I had to do this when combining colormaps in matplotlib: A single pcolormesh with more than one color scheme using Matplotlib ). You can probably see how the code works, but basically, say you have the values [-5, 2, 10] in a sample and want to normalize based on a range of -7 to 7 (so anything above 7, like our "10", is effectively treated as a 7) with a midpoint of 1, and then shrink the result to fit a 256-color colormap:
    #In[1]
    NormData([-5,2,10],low=-7,center=1,hi=7,shrinkfactor=2./256)
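To make the arithmetic behind that call easier to follow, here is a minimal sketch of the same mapping for a single value. The helper name `scale_one` is hypothetical (it is not part of the function above) and it assumes numeric `low`, `center` and `hi` rather than the string options.

```python
def scale_one(x, low, center, hi, shrinkfactor=0.0):
    """Hypothetical helper: map one value onto [0, 1] with `center` landing on 0.5."""
    if x < low:
        r = 0.0                                        # below the stated floor: clipped
    elif x > hi:
        r = 1.0                                        # above the stated ceiling: clipped
    elif x >= center:
        r = (x - center) / (hi - center) * 0.5 + 0.5   # upper half -> [0.5, 1]
    else:
        r = (x - low) / (center - low) * 0.5           # lower half -> [0, 0.5]
    # Pull the result toward 0.5 by the shrink factor
    return r - (r - 0.5) * shrinkfactor


values = [scale_one(x, low=-7, center=1, hi=7, shrinkfactor=2 / 256)
          for x in [-5, 2, 10]]
print(values)  # roughly [0.1279, 0.5827, 0.9961]
```

Working it by hand: -5 sits a quarter of the way from -7 to 1, so it maps to 0.125; 2 sits a sixth of the way from 1 to 7, so it maps to about 0.583; 10 is above 7 and is clipped to 1. The shrink factor of 2/256 then nudges each result slightly toward 0.5.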
It can also turn your data inside out... this may seem odd, but I found it useful for heatmaps. Say you want a darker color for values close to the center rather than for the high/low extremes. You could build the heatmap from data normalized with insideout=True:
    #In[2]
    NormData([-5,2,10],low=-7,center=1,hi=7,insideout=True,shrinkfactor=2./256)
So now "2", which is closest to the center (defined as "1"), has the highest value.
Anyway, I thought my application was relevant if you are looking to rescale data in other ways that could be useful to you.
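The inside-out step on its own is just a fold around 0.5. A minimal sketch, with a hypothetical helper name `inside_out`, and with the input values computed by hand for [-5, 2, 10] using low=-7, center=1, hi=7 before any shrinking:

```python
def inside_out(r):
    """Hypothetical helper: fold values in [0, 1] around 0.5.

    0.5 becomes 1.0 and both endpoints (0 and 1) become 0.0.
    """
    return [1.0 - abs(z - 0.5) * 2.0 for z in r]


# -5, 2 and 10 normalized with low=-7, center=1, hi=7 (no shrink factor)
normalized = [0.125, 7 / 12, 1.0]
folded = inside_out(normalized)
print(folded)  # roughly [0.25, 0.833, 0.0]
```

The clipped "10" ends up at 0, while "2" ends up highest because it was nearest to 0.5 before the fold. In the function above the shrink factor is applied after this fold, so the final values are again pulled slightly toward 0.5.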