Uddeshya Singh
1 min readSep 22, 2018

--

Akshat Pathak , Well, mostly auto encoder like architectures use sigmoid as activation functions. But this being the decoding architecture, I believe the researchers took the liberty and used the non-linearity as it is. Mostly the advantage it offers is the presence of non zero gradients everywhere and hence helps in back-propagation without the risk of “dying “ case!

--

--

Uddeshya Singh
Uddeshya Singh

Written by Uddeshya Singh

Software Engineer @Gojek | GSoC’19 @fossasia | Loves distributed systems, football, anime and good coffee. Writes sometimes, reads all the time.

Responses (1)