I would really appreciate constructive feedback and suggestions for this fully-connected neural network I have written in Python 3 with TensorFlow 2. It estimates the period and amplitude of a sine curve given 100 sampled y-points on the curve.

I am mainly interested in how this can be optimised for speed, how the styling and readability could be improved, whether I have made any strange programming choices, and whether there is a powerful DL technique I am missing that might be useful. Snippets of this script will be used in a pedagogical document to introduce DNN concepts alongside the TensorFlow methods, so improvements given this context would be very beneficial.

One immediate flaw, which I can address later, is that the model is currently far too big, so it will not actually generalise outside the range of parameters in the training examples. That's fine; for now I just wanted to get everything working, and I will tune hyperparameters and add regularisation later.

Any advice would be deeply appreciated. Additionally, I have yet to implement a normalisation layer: I would like it to normalise inputs (treating them as whole curves) using statistics from the entire training set, and to be applied automatically to inputs once the model is trained. I have also yet to vectorise the make_curve function. Suggestions for either of these next steps would be fantastic as well.
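For the vectorisation, here is one possible sketch — a hypothetical `make_curves` helper (the plural name is my invention) that relies on NumPy broadcasting so all curves, phases, and offsets are generated in one call instead of a Python loop:

```python
import numpy as np

num_sample_pts = 100  # Same constant as in the script below


def make_curves(periods, amps, aug=False):
    """Hypothetical vectorised variant of make_curve: one curve per row.

    periods, amps: 1-D arrays of length n; returns shape (n, num_sample_pts).
    """
    periods = np.asarray(periods, dtype=float).reshape(-1, 1)  # Column vectors so they
    amps = np.asarray(amps, dtype=float).reshape(-1, 1)        # broadcast over samples
    x = np.linspace(-4, 4, num_sample_pts)                     # Shared sample grid
    n = periods.shape[0]
    phase = aug * np.random.rand(n, 1)   # One random shift per curve (zeros when aug=False)
    offset = aug * np.random.rand(n, 1)  # One random offset per curve
    return amps * np.sin((x + phase) * (2 * np.pi / periods)) + offset
```

This would let `data()` replace its per-pair loop with a single `make_curves(pairs[:, 0], pairs[:, 1], aug)` call.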

This is of course a toy problem, and I will adapt the network to a different problem where efficiency and high-dimensional inputs will matter. I have access to both cluster CPU and GPU cores, as well as my laptop with a GeForce GTX 1050 Ti Max-Q GPU, so I would be interested in optimising this to take advantage of the available parallel computing.
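On the parallelism point, a minimal sketch of how `tf.distribute.MirroredStrategy` spreads training across all visible GPUs (falling back to a single CPU replica when none are available); the tiny model here is just a stand-in, not the one from the script:

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras

# MirroredStrategy replicates the model on every visible GPU and splits each
# minibatch between the replicas; gradients are aggregated automatically.
strategy = tf.distribute.MirroredStrategy()
print("Replicas in sync:", strategy.num_replicas_in_sync)

with strategy.scope():
    # Any model and optimizer created inside the scope are mirrored.
    model = keras.Sequential([
        keras.Input(shape=(100,)),
        keras.layers.Dense(37, activation='elu'),
        keras.layers.Dense(2),
    ])
    model.compile(optimizer='adam', loss='mse')
# A subsequent model.fit(...) distributes batches across the replicas.
```

For a single-GPU laptop this is a no-op in practice, but the same code scales to the multi-GPU cluster nodes without changes.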

The 3D plot is just for fun and shows how the squared error of a prediction blows up for degenerate cases such as zero period or amplitude. Would I be right to assume that a network which has generalised well would have lower error specifically at these boundaries?

With the current settings, this takes about 2.5 minutes to run on my 2018 Dell XPS laptop (on its CPU, I believe) with the following output:

```
Average test loss: 8.497130045554286
Average val loss: 7.136056077585638
Time taken: 146.38214015960693
```

Here is the code:

```
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Dense, Dropout, BatchNormalization
import tensorflow.keras.backend as kb
from tensorflow import math as tm
import math, time
import numpy as np
from datetime import datetime
#import warnings
#warnings.filterwarnings("ignore")
kb.set_floatx('float64')
start = time.time()
num_sample_pts = 100
train_size = (5*10**2)**2 # Funky format so it is a square and easily configurable during development
train_sqrt = int(train_size**0.5)
epoch_nums = 2**3
minibatch = 2**6 # Remember to have minibatch << train_size*epochs
callbacks = False # TensorBoard logging is much slower than the learning itself
learning_rate = 0.00063
num_layers = 28
reg1_rate = 0.001
reg2_rate = 0.001
act_func = 'elu'
dropout = 0.2
units = 37
def make_curve(period, amp, aug=False):
    # TODO: vectorise this so the random augmentations are drawn as vectors.
    return ( amp * np.sin(( np.linspace(-4, 4, num_sample_pts) + aug*np.random.rand() ) * (2*np.pi/period))
             + aug*np.random.rand() ).reshape(num_sample_pts)
def data(sample_interval=(-4, 4), amp_interval=(0, 30), aug=False, gridsize=train_sqrt):
    # Note: sample_interval is currently unused; make_curve hardcodes the range.
    mesh = np.meshgrid(0.05 + 10*np.pi*np.random.rand(gridsize),  # Periods
                       amp_interval[0] + (amp_interval[1] - amp_interval[0]) *
                       np.random.rand(gridsize))                  # Amplitudes
    pairs = np.array(mesh).T.reshape(-1, 2)
    curves = np.array([make_curve(w, a, aug) for w, a in pairs])  # Change when make_curve is vectorised
    glob_centre, glob_max = np.mean(curves), max(amp_interval)  # Globally centre all curves within pm 1
    curves = (curves - glob_centre) / glob_max  # Replace with a normalisation layer
    return (curves, pairs)
# Returns list of 2 arrays (x,y):
# - x is an array of each curve sample list
# - y is an array of the period-amplitude pairs corresponding to curve samples in x
df = data(aug=True)
# Not used. Another possibility could be RMSE against a sample from the predicted curve.
def custom_mean_percentage_loss(y_true, y_pred):  # Mean absolute percentage error
    diff = y_pred - y_true
    non_zero = y_true + 10**-8  # Guard against division by zero
    return tm.reduce_mean(tm.abs(tm.divide(diff, non_zero)))
if callbacks:
    logdir = "logs/scalars/" + datetime.now().strftime("%Y%m%d-%H%M-%S")
    tensorboard_callback = [keras.callbacks.TensorBoard(log_dir=logdir, update_freq='epoch')]
else:
    tensorboard_callback = []
def model_builder():
    initializer = keras.initializers.TruncatedNormal(mean=0., stddev=0.5)
    model = keras.Sequential()
    model.add(Dense(units=num_sample_pts,  # First layer width matches the input dimension
                    kernel_initializer=initializer,  # Initialise weights
                    activation=act_func,
                    kernel_regularizer=keras.regularizers.l1_l2(reg1_rate, reg2_rate),
                    dtype='float64'))
    model.add(BatchNormalization())  # Layers must be added to the model, not just constructed
    for _ in range(num_layers):
        model.add(Dropout(dropout))
        model.add(Dense(units=units,  # Hidden-layer width
                        activation=act_func,
                        kernel_regularizer=keras.regularizers.l1_l2(reg1_rate, reg2_rate),
                        dtype='float64'))
        model.add(BatchNormalization())
    model.add(Dense(units=2, activation='linear', dtype='float64'))
    # Outputting an amplitude-period pair requires 2 nodes in the output layer.
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=learning_rate),
        loss='mse',
        metrics=['mse'])  # Measures train & validation performance
    return model
model = model_builder()
training_history = model.fit(*df,
                             batch_size=minibatch,  # Number of samples per gradient step
                             epochs=epoch_nums,
                             verbose=0,
                             validation_split=min(0.2, (train_sqrt**2)/5000),  # Fraction held out for validation
                             callbacks=tensorboard_callback)
# Note: history['loss'] is the training loss, despite the label below.
print("Average test loss: ", np.average(training_history.history['loss'][:10]))
print("Average val loss: ", np.average(training_history.history['val_loss'][:10]))
print('Time taken: ', time.time()-start)
print(' ')
import winsound  # Windows-only; audible notification when the run finishes
for _ in range(2):
    winsound.Beep(1000, 250)
```
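For the normalisation layer mentioned above, one hedged sketch: Keras ships a `Normalization` preprocessing layer (`tf.keras.layers.Normalization`, TF ≥ 2.6) that learns its statistics from the training data via `adapt()` and then travels with the model, so raw curves can be fed in at inference time. With `axis=None` it uses a single global mean and variance, roughly matching the global centring done in `data()` (though it scales by the standard deviation rather than the maximum amplitude). The placeholder data and the tiny model here are illustrative, not the ones from the script:

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras

train_curves = np.random.rand(1000, 100)  # Placeholder for the real training curves

# axis=None -> one scalar mean/variance over the whole dataset, so every
# curve is shifted and scaled by the same global statistics.
norm_layer = keras.layers.Normalization(axis=None)
norm_layer.adapt(train_curves)  # Learns mean/variance from the training data only

model = keras.Sequential([
    keras.Input(shape=(100,)),
    norm_layer,              # Normalisation is now baked into the model
    keras.layers.Dense(2),
])
```

Once built this way, the manual `(curves - glob_centre) / glob_max` step in `data()` could be dropped, and anything passed to `model.predict` is normalised automatically with the training-set statistics.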

This is the first neural network I have written. Thank you very much for your thoughts, improvements and contributions.