Activation Function - Should I use sigmoid or ReLU?


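For hidden layers, ReLU is usually the better default. It is cheaper to compute than sigmoid, since it needs only a comparison with zero rather than an exponential, and it largely avoids the vanishing gradient problem: its derivative is exactly 1 for positive inputs, whereas the sigmoid's derivative never exceeds 0.25 and shrinks toward zero as the input moves away from zero.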
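
To make the gradient argument concrete, here is a minimal sketch in plain Python. The helper names (`sigmoid_grad`, `relu_grad`, `chained_gradient`) are made up for this illustration, and the model is deliberately simplified: it assumes every layer sees the same pre-activation value and ignores the weights, so it only shows how the activation's own derivative compounds across layers.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid: 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x: float) -> float:
    """Derivative of the sigmoid; peaks at 0.25 when x == 0."""
    s = sigmoid(x)
    return s * (1.0 - s)


def relu(x: float) -> float:
    """ReLU: max(0, x)."""
    return max(0.0, x)


def relu_grad(x: float) -> float:
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


def chained_gradient(grad_fn, pre_activation: float, depth: int) -> float:
    """Multiply the same local derivative across `depth` layers.

    Simplifying assumption: identical pre-activation at every layer,
    weights ignored.
    """
    return grad_fn(pre_activation) ** depth


if __name__ == "__main__":
    for depth in (1, 5, 10):
        sig = chained_gradient(sigmoid_grad, 0.0, depth)  # sigmoid's best case
        rel = chained_gradient(relu_grad, 1.0, depth)     # any positive input
        print(f"depth={depth:2d}  sigmoid={sig:.2e}  relu={rel:.2e}")
```

Even at the sigmoid's best case (input 0, derivative 0.25), ten layers scale the gradient by 0.25^10, roughly one millionth, while ReLU passes it through unchanged for positive inputs. For non-positive inputs the ReLU derivative is 0, so that unit passes no gradient at all for those inputs.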