From 661548a32fb570b458fa289da7cfc607458ceee8 Mon Sep 17 00:00:00 2001
From: greentec
Date: Fri, 8 Feb 2019 15:47:09 +0900
Subject: [PATCH] correct value of softmax

The softmax of [4, 2, 6] is [0.11731043, 0.01587624, 0.86681333],
not (0.4, 0.2, 0.6).

Python code from https://stackoverflow.com/a/38250088/2689257:

```
import numpy as np

def softmax(x):
    """Compute softmax values for each set of scores in x."""
    # Subtract the max before exponentiating for numerical stability.
    e_x = np.exp(x - np.max(x))
    return e_x / e_x.sum(axis=0)
```
---
 .../Cartpole REINFORCE Monte Carlo Policy Gradients.ipynb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Policy Gradients/Cartpole/Cartpole REINFORCE Monte Carlo Policy Gradients.ipynb b/Policy Gradients/Cartpole/Cartpole REINFORCE Monte Carlo Policy Gradients.ipynb
index d661e15..59a3d50 100644
--- a/Policy Gradients/Cartpole/Cartpole REINFORCE Monte Carlo Policy Gradients.ipynb
+++ b/Policy Gradients/Cartpole/Cartpole REINFORCE Monte Carlo Policy Gradients.ipynb
@@ -194,7 +194,7 @@
 "The idea is simple:\n",
 "- Our state which is an array of 4 values will be used as an input.\n",
 "- Our NN is 3 fully connected layers.\n",
-"- Our output activation function is softmax that squashes the outputs to a probability distribution (for instance if we have 4, 2, 6 --> softmax --> (0.4, 0.2, 0.6)"
+"- Our output activation function is softmax that squashes the outputs to a probability distribution (for instance if we have 4, 2, 6 --> softmax --> (0.11731043, 0.01587624, 0.86681333)"
 ]
 },
 {
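
Reviewer note: the corrected numbers can be reproduced with a short, self-contained check. This is only a sketch, assuming NumPy is available. The names `scores`, `probs`, and `old_values` are illustrative and do not appear in the notebook. It also shows why the old example could not be a softmax output: 0.4 + 0.2 + 0.6 = 1.2, so those values do not form a probability distribution.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array of scores."""
    x = np.asarray(x, dtype=float)
    # Shift by the max so the largest exponent is exp(0) = 1.
    e_x = np.exp(x - np.max(x))
    return e_x / e_x.sum()


scores = [4, 2, 6]                      # example logits from the notebook text
probs = softmax(scores)                 # corrected values from this patch
old_values = np.array([0.4, 0.2, 0.6])  # values the notebook showed before

print("softmax:", probs)
print("sum of softmax:", probs.sum())
print("sum of old values:", old_values.sum())
```

The softmax output sums to 1 and preserves the ordering of the inputs, while the gap between the largest score and the others is amplified exponentially, which is why 6 ends up with roughly 87% of the probability mass.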