2. Neural Networks / L2. Implementing Gradient Descent - Implementing Backpropagation
chrisysl 2018. 7. 13. 00:41
Implementing backpropagation
- The formula above is how the error term for the output layer is computed (also written out below).
- And the formula above is how the error term for the hidden layer is computed.
- Now let's work through an example with a network that has one hidden layer and one output unit.
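Written out to match what the exercise code below computes, with a sigmoid activation f so that f'(x) = f(x)(1 - f(x)), the two error terms are:

\[
\delta^{o} = (y - \hat{y})\, f'(z), \qquad
\delta^{h}_{j} = \delta^{o}\, W_{j}\, f'(h_{j})
\]

where z is the input to the output unit, h_j is the input to hidden unit j, and W_j is the hidden-to-output weight.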
Algorithm for updating the weights via backpropagation
· Set the weight steps for each layer to zero
- input → hidden weight steps = 0
- hidden → output weight steps = 0
· For each record in the training data:
- Make a forward pass through the network to compute yHat
- Then compute the error gradient at the output unit as above (the update equations are collected after this list)
- Here, z can be computed as above (it is the input going into the output unit)
- Then propagate the error back to the hidden layer (hidden_error)
- Update the weight steps (gradients) as above
· Update the weights, applying the learning rate and dividing by the number of records
- update the weights as above
· Repeat for as many epochs as set by e.
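The weight-step accumulation and the final weight update referenced in the steps above, again in the notation matching the exercise code below (η is the learning rate, m the number of records, x_i the inputs, a_j the hidden activations):

\[
\begin{aligned}
\Delta w_{ij} &\leftarrow \Delta w_{ij} + \delta^{h}_{j}\, x_{i}, &\qquad
\Delta W_{j} &\leftarrow \Delta W_{j} + \delta^{o}\, a_{j} \\
w_{ij} &\leftarrow w_{ij} + \eta\, \Delta w_{ij} / m, &\qquad
W_{j} &\leftarrow W_{j} + \eta\, \Delta W_{j} / m
\end{aligned}
\]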
Backpropagation exercise
Write the code focusing on:
- implementing the forward pass
- implementing the backpropagation algorithm
- updating the weights
import numpy as np
from data_prep import features, targets, features_test, targets_test

np.random.seed(21)

def sigmoid(x):
    """
    Calculate sigmoid
    """
    return 1 / (1 + np.exp(-x))

# Hyperparameters
n_hidden = 2  # number of hidden units
epochs = 900
learnrate = 0.005

n_records, n_features = features.shape
last_loss = None

# Initialize weights
weights_input_hidden = np.random.normal(scale=1 / n_features ** .5,
                                        size=(n_features, n_hidden))
weights_hidden_output = np.random.normal(scale=1 / n_features ** .5,
                                         size=n_hidden)

for e in range(epochs):
    del_w_input_hidden = np.zeros(weights_input_hidden.shape)
    del_w_hidden_output = np.zeros(weights_hidden_output.shape)
    for x, y in zip(features.values, targets):
        ## Forward pass ##
        # TODO: Calculate the output
        hidden_input = np.dot(x, weights_input_hidden)
        hidden_output = sigmoid(hidden_input)
        output = sigmoid(np.dot(hidden_output, weights_hidden_output))

        ## Backward pass ##
        # TODO: Calculate the network's prediction error
        error = y - output

        # TODO: Calculate error term for the output unit
        output_error_term = error * output * (1 - output)

        ## propagate errors to hidden layer

        # TODO: Calculate the hidden layer's contribution to the error
        hidden_error = output_error_term * weights_hidden_output

        # TODO: Calculate the error term for the hidden layer
        hidden_error_term = hidden_error * hidden_output * (1 - hidden_output)

        # TODO: Update the change in weights
        del_w_hidden_output += output_error_term * hidden_output
        del_w_input_hidden += hidden_error_term * x[:, None]

    # TODO: Update weights
    weights_input_hidden += learnrate * del_w_input_hidden / n_records
    weights_hidden_output += learnrate * del_w_hidden_output / n_records

    # Printing out the mean square error on the training set
    if e % (epochs / 10) == 0:
        # use the full training set here (not just the last record x)
        hidden_output = sigmoid(np.dot(features, weights_input_hidden))
        out = sigmoid(np.dot(hidden_output, weights_hidden_output))
        loss = np.mean((out - targets) ** 2)

        if last_loss and last_loss < loss:
            print("Train loss: ", loss, "  WARNING - Loss Increasing")
        else:
            print("Train loss: ", loss)
        last_loss = loss

# Calculate accuracy on test data
hidden = sigmoid(np.dot(features_test, weights_input_hidden))
out = sigmoid(np.dot(hidden, weights_hidden_output))
predictions = out > 0.5
accuracy = np.mean(predictions == targets_test)
print("Prediction accuracy: {:.3f}".format(accuracy))
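The data_prep module is provided by the exercise and is not shown in the post. If you want to run the script above on its own, a hypothetical stand-in with synthetic data could look like the sketch below (the file name data_prep.py and all of the data here are placeholders, not the original dataset):

# data_prep.py -- hypothetical stand-in, only to make the exercise code runnable;
# the real module ships with the exercise.
import numpy as np
import pandas as pd

rng = np.random.RandomState(42)

n_samples, n_inputs = 400, 6

# Synthetic inputs and a roughly linearly separable binary target
X = rng.normal(size=(n_samples, n_inputs))
true_w = rng.normal(size=n_inputs)
y = (X.dot(true_w) + 0.5 * rng.normal(size=n_samples) > 0).astype(int)

# Hold out the last 10% as a test set; the exercise code expects
# DataFrames for the features and 1-D arrays for the targets
split = int(0.9 * n_samples)
features, features_test = pd.DataFrame(X[:split]), pd.DataFrame(X[split:])
targets, targets_test = y[:split], y[split:]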
Resources on the backpropagation algorithm
- From Andrej Karpathy: Yes, you should understand backprop
- From Andrej Karpathy: a lecture from Stanford's CS231n course