Deep neural networks, or more precisely here fully connected neural networks, can be built very flexibly, which makes their application more challenging than that of other ML algorithms.
In the following, we use the ‘keras’ package (Allaire and Chollet 2022; Chollet et al. 2015) (Python: ‘keras’ (Chollet et al. 2015); Julia: ‘Flux’ (Innes et al. 2018)), which is a higher-level API built on top of the Python ‘tensorflow’ framework (Abadi et al. 2016).
library(keras)

X = scale(as.matrix(iris[,1:4]))
Y = as.integer(iris$Species)

# We need to one hot encode our response classes
YT = k_one_hot(Y - 1L, num_classes = 3)

DNN = keras_model_sequential() %>%
  # first hidden layer
  layer_dense(input_shape = ncol(X), units = 10, activation = "relu") %>%
  # second hidden layer with regularization
  layer_dense(units = 20, activation = "relu",
              kernel_regularizer = regularizer_l1()) %>%
  # output layer, 3 output neurons for our three classes
  # and softmax activation to get quasi probabilities
  # that sum up to 1 for each observation
  layer_dense(units = 3, activation = "softmax")

# print architecture
summary(DNN)
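To actually train this classifier, the model still needs a loss function and an optimizer. A minimal sketch, assuming categorical cross-entropy and the Adamax optimizer with a learning rate of 0.01 (the same optimizer settings as in the regression example further below):

# assumption: categorical cross-entropy loss and Adamax optimizer
DNN %>% compile(loss = loss_categorical_crossentropy,
                optimizer = optimizer_adamax(0.01))

# train on the one-hot encoded classes
DNN %>% fit(X, YT, epochs = 50, verbose = 0)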
from tensorflow import keras
from tensorflow.keras.layers import *
from sklearn import datasets
from sklearn.preprocessing import scale

iris = datasets.load_iris()
X = scale(iris.data)
Y = iris.target

# We need to one hot encode our response classes
YT = keras.utils.to_categorical(Y, num_classes = 3)

DNN = keras.Sequential()
# first hidden layer
DNN.add(Dense(input_shape = [X.shape[1]], units = 10, activation = "relu"))
# second hidden layer with regularization
DNN.add(Dense(units = 20, activation = "relu",
              kernel_regularizer = keras.regularizers.l1()))
# output layer, 3 output neurons for our three classes
# and softmax activation to get quasi probabilities
# that sum up to 1 for each observation
DNN.add(Dense(units = 3, activation = "softmax"))

# print architecture
DNN.summary()

# add loss function and optimizer
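The Python model can be compiled and trained in the same way; a minimal sketch, assuming categorical cross-entropy and the Adamax optimizer with a learning rate of 0.01, mirroring the R code:

# assumption: categorical cross-entropy loss and Adamax optimizer
DNN.compile(loss = "categorical_crossentropy",
            optimizer = keras.optimizers.Adamax(learning_rate = 0.01))

# train on the one-hot encoded classes
DNN.fit(X, YT, epochs = 50, verbose = 0)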
library(keras)

X = scale(as.matrix(iris[,2:4]))
Y = as.matrix(iris[,1,drop=FALSE])

DNN = keras_model_sequential() %>%
  # first hidden layer
  layer_dense(input_shape = ncol(X), units = 10, activation = "relu") %>%
  # second hidden layer with regularization
  layer_dense(units = 20, activation = "relu",
              kernel_regularizer = regularizer_l1()) %>%
  # output layer, one output neuron for one response
  # and no activation function
  layer_dense(units = 1)

# print architecture
summary(DNN)
# add loss function and optimizer
DNN %>% compile(loss = loss_mean_squared_error,
                optimizer = optimizer_adamax(0.01))

# train model
DNN %>% fit(X, YT, epochs = 50, verbose = 0)
Make predictions:
head(predict(DNN, X), n = 3)
[,1]
[1,] 0.3252823
[2,] 0.3261368
[3,] 0.3257285
from tensorflow import keras
from tensorflow.keras.layers import *
from sklearn import datasets
from sklearn.preprocessing import scale

iris = datasets.load_iris()
data = iris.data
X = scale(data[:,1:4])
Y = data[:,0]

DNN = keras.Sequential()
# first hidden layer
DNN.add(Dense(input_shape = [X.shape[1]], units = 10, activation = "relu"))
# second hidden layer with regularization
DNN.add(Dense(units = 20, activation = "relu",
              kernel_regularizer = keras.regularizers.l1()))
# output layer, one output neuron for one response
# and no activation function
DNN.add(Dense(units = 1, activation = None))

# print architecture
DNN.summary()

# add loss function and optimizer
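As in the R example, the regression model still needs a loss function and an optimizer before it can be trained. A minimal sketch, assuming mean squared error and the Adamax optimizer with a learning rate of 0.01, and fitting against the continuous response Y:

# assumption: mean squared error loss and Adamax optimizer, as in the R code above
DNN.compile(loss = "mean_squared_error",
            optimizer = keras.optimizers.Adamax(learning_rate = 0.01))

# train model on the continuous response
DNN.fit(X, Y, epochs = 50, verbose = 0)

# make predictions for the first three observations
DNN.predict(X)[:3]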
Abadi, Martín, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, et al. 2016. “Tensorflow: A System for Large-Scale Machine Learning.” In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), 265–83.