How is the hard sigmoid defined?

I am working on deep nets using Keras, which has a hard sigmoid activation. What is its mathematical definition?

I know what a sigmoid is. Someone asked a similar question on Quora: https://www.quora.com/What-is-hard-sigmoid-in-artificial-neural-networks-Why-is-it-faster-than-standard-sigmoid-Are-there-any-disadvantages-over-the-standard-sigmoid

But I could not find the exact mathematical definition anywhere.

3 answers

It is:

max(0, min(1, (x + 1)/2)) 
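
For illustration, here is a minimal NumPy sketch of that formula (the function name and test values are my own, not part of Keras):

    import numpy as np

    def hard_sigmoid(x):
        # clip((x + 1) / 2, 0, 1): 0 for x <= -1, 1 for x >= 1, linear in between
        return np.clip((x + 1.0) / 2.0, 0.0, 1.0)

    print(hard_sigmoid(np.array([-2.0, -0.5, 0.0, 0.5, 2.0])))
    # -> [0.   0.25 0.5  0.75 1.  ]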

Since Keras supports both TensorFlow and Theano, the exact implementation may differ for each backend; I will cover only Theano here. With the Theano backend, Keras uses T.nnet.hard_sigmoid, which in turn is a piecewise-linear approximation of the standard sigmoid:

    slope = tensor.constant(0.2, dtype=out_dtype)
    shift = tensor.constant(0.5, dtype=out_dtype)
    x = (x * slope) + shift
    x = tensor.clip(x, 0, 1)

i.e., it is max(0, min(1, x * 0.2 + 0.5))
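
To see how this compares with the standard sigmoid, here is a small NumPy sketch of the same formula (the function names are mine; only the 0.2 slope and 0.5 shift come from the Theano code above). It avoids the exponential, which is where the speed-up over the standard sigmoid comes from:

    import numpy as np

    def keras_hard_sigmoid(x):
        # Same computation as Theano's T.nnet.hard_sigmoid: slope 0.2, shift 0.5, then clip to [0, 1]
        return np.clip(0.2 * x + 0.5, 0.0, 1.0)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    xs = np.array([-5.0, -1.0, 0.0, 1.0, 5.0])
    print(keras_hard_sigmoid(xs))  # [0.  0.3 0.5 0.7 1. ]
    print(sigmoid(xs))             # approximately [0.0067 0.2689 0.5 0.7311 0.9933]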


For reference, a hard sigmoid function can be defined differently in different places. In Courbariaux et al. 2016 [1] it is defined as:

σ is the "hard sigmoid" function: σ(x) = clip((x + 1)/2, 0, 1) = max(0, min(1, (x + 1)/2))

The goal is to produce a probability value (hence it must lie between 0 and 1) for use in stochastic binarization of neural network parameters (e.g. weights, activations, gradients). You use the probability p = σ(x) returned by the hard sigmoid to set the parameter x to +1 with probability p, or to -1 with probability 1 - p.
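
As a rough sketch of that binarization scheme (the names and test values are mine, not taken from the paper's code):

    import numpy as np

    rng = np.random.default_rng(0)

    def hard_sigmoid(x):
        # clip((x + 1) / 2, 0, 1), as defined in Courbariaux et al. 2016
        return np.clip((x + 1.0) / 2.0, 0.0, 1.0)

    def stochastic_binarize(w):
        # Each entry becomes +1 with probability p = hard_sigmoid(w), otherwise -1
        p = hard_sigmoid(w)
        return np.where(rng.random(w.shape) < p, 1.0, -1.0)

    w = np.array([-1.5, -0.2, 0.0, 0.2, 1.5])
    print(stochastic_binarize(w))  # random draw of +/-1; weights below -1 almost surely map to -1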

[1] Matthieu Courbariaux, Itay Hubara, Daniel Soudry, Ran El-Yaniv, Yoshua Bengio, "Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1", https://arxiv.org/abs/1602.02830 (submitted 9 Feb 2016 (v1), last revised 17 Mar 2016 (v3)).

