What is the difference between torch.nn.Softmax and torch.nn.functional.softmax ...

Why would you need a log softmax? An example lies in the docs of nn.Softmax: "This module doesn't work directly with NLLLoss, which expects the Log to be computed between the Softmax and itself. Use LogSoftmax instead (it's faster and has better numerical properties)." See also: What is the difference between log_softmax and softmax?
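On the first question, a minimal sketch (the tensor shape and dim are just illustrative) showing that nn.Softmax is the Module form of the computation, suited to nn.Sequential and layer lists, while torch.nn.functional.softmax is the plain function you would call inside a forward() method; both produce the same values:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(2, 5)  # a batch of 2 rows of raw scores (logits)

# nn.Softmax is a Module: you construct it once (e.g. as a layer in
# an nn.Sequential) and then call it like any other layer.
softmax_layer = nn.Softmax(dim=1)
out_module = softmax_layer(x)

# F.softmax is a plain function: no object to construct, you call it
# directly with the tensor and the dimension to normalize over.
out_functional = F.softmax(x, dim=1)

# The two forms compute exactly the same thing.
print(torch.allclose(out_module, out_functional))  # True
```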
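And a sketch of the point the docs are making about NLLLoss (sizes and target labels here are made up for illustration): NLLLoss consumes log-probabilities, so you pair it with log_softmax rather than softmax, which is exactly the combination that CrossEntropyLoss fuses into one call:

```python
import torch
import torch.nn.functional as F

logits = torch.randn(4, 10)           # 4 samples, 10 classes
targets = torch.tensor([1, 0, 3, 9])  # ground-truth class indices

# NLLLoss expects log-probabilities, so apply log_softmax first ...
log_probs = F.log_softmax(logits, dim=1)
loss_nll = F.nll_loss(log_probs, targets)

# ... which is what cross_entropy does internally in a single call.
loss_ce = F.cross_entropy(logits, targets)
print(torch.allclose(loss_nll, loss_ce))  # True

# Computing log(softmax(x)) as two separate steps can underflow to
# log(0) = -inf for very negative logits; log_softmax computes the
# same quantity in a numerically stable way.
naive = torch.log(F.softmax(logits, dim=1))
print(torch.allclose(naive, log_probs))  # True here, but fragile for extreme logits
```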