machine learning – Removing unnecessary neurons from a neural network for a particular output

Suppose we have a neural network with a binary output (0 or 1). What I am trying to do is remove neurons or layers from the network while maintaining a correct classification for all the instances that were classified as 1 by the original network, and the same for output 0. Put differently, is there any way to spot the neurons that are paramount to the correct classification of a particular output? The aim is to remove all the neurons that are unnecessary for that output.

A research track could be compiling the network into a Boolean formula and reasoning on it to spot the neurons that do not contribute to the chosen output, but this compilation is not always easy to carry out.
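One empirical way to probe this is ablation: zero out one hidden neuron at a time and check whether any prediction of interest flips. Below is a minimal NumPy sketch of that idea, assuming a small fully connected network whose weights and biases are available as arrays; all names here are illustrative, not a library API:

import numpy as np

def predict(x, weights, biases, ablate=None):
    """Forward pass through a small MLP; optionally zero out one hidden neuron.
    ablate is a (layer, unit) pair or None."""
    a = x
    for k, (W, b) in enumerate(zip(weights, biases)):
        z = a @ W + b
        a = 1 / (1 + np.exp(-z)) if k == len(weights) - 1 else np.tanh(z)
        if ablate is not None and ablate[0] == k:
            a = a.copy()
            a[:, ablate[1]] = 0.0  # remove this neuron's contribution downstream
    return (a > 0.5).astype(int)

def essential_neurons(x, weights, biases):
    """Return the hidden neurons whose removal flips at least one prediction."""
    base = predict(x, weights, biases)
    essential = []
    for k in range(len(weights) - 1):            # hidden layers only
        for j in range(weights[k].shape[1]):     # units in layer k
            if not np.array_equal(base, predict(x, weights, biases, (k, j))):
                essential.append((k, j))
    return essential

Neurons that never change a prediction on the relevant instances are candidates for removal; the Boolean-formula approach would certify this, whereas ablation only tests it on the given data.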

python – LSTM Neural Network

I have tried to build a neural network comprising ten neurons, but I don't know whether my code in Python is any good. I applied the equations for an LSTM neuron in the code so that it would work, and then I made a network of neurons. Would this work, or is it just a failed attempt?

h = hidden layer

num_hl = number of hidden layers

Could you please give some feedback on my code and tell me what I can improve? Thank you! Here is my code:

import math

num_hl = 10
inp = list(range(num_hl))
h = list(range(num_hl))

fg_p1 = list(range(num_hl))
fg_p2 = list(range(num_hl))
fg_p3 = list(range(num_hl))
forget_gate = list(range(num_hl))

for i in range(1, (num_hl - 1)):
    fg_p1[i] = h[1] * h[i - 1]
    fg_p2[i] = h[1] * inp[i]
    fg_p3[i] = fg_p1[i] + fg_p2[i] + len(h)
    forget_gate[i] = 1 / (1 + math.exp(-fg_p3[i]))

inp_p1 = list(range(num_hl))
inp_p2 = list(range(num_hl))
inp_p3 = list(range(num_hl))
input_gate = list(range(num_hl))

for i in range(1, (num_hl - 1)):
    inp_p1[i] = h[2] * h[i - 1]
    inp_p2[i] = h[2] * inp[i]
    inp_p3[i] = inp_p1[i] + inp_p2[i] + len(h)
    input_gate[i] = 1 / (1 + math.exp(-inp_p3[i]))

act_vec_p1 = list(range(num_hl))
act_vec_p2 = list(range(num_hl))
act_vec_p3 = list(range(num_hl))
activation_vector = list(range(num_hl))

for i in range(1, (num_hl - 1)):
    act_vec_p1[i] = h[3] * h[i - 1]
    act_vec_p2[i] = h[3] * inp[i]
    act_vec_p3[i] = act_vec_p1[i] + act_vec_p2[i] + len(h)
    activation_vector[i] = math.tanh(act_vec_p3[i])

state_vector = list(range(num_hl))

for i in range(1, (num_hl - 1)):
    state_vector[i] = forget_gate[i] * state_vector[i - 1] + input_gate[i] * activation_vector[i]

out_p1 = list(range(num_hl))
out_p2 = list(range(num_hl))
out_p3 = list(range(num_hl))
out_gate = list(range(num_hl))

for i in range(1, (num_hl - 1)):
    out_p1[i] = h[3] * h[i - 1]
    out_p2[i] = h[3] * inp[i]
    out_p3[i] = out_p1[i] + out_p2[i] + len(h)
    out_gate[i] = 1 / (1 + math.exp(-out_p3[i]))

for i in range(1, (num_hl - 1)):
    h[i] = out_gate[i] * math.tanh(state_vector[i])
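For comparison, a standard LSTM cell computes each gate from the current input and the previous hidden state using learned weight matrices, rather than from fixed entries of h. Here is a minimal NumPy sketch of one time step; the containers W, U, b are hypothetical placeholders holding one weight matrix or bias vector per gate:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step. W, U, b map each gate name to its parameters."""
    f = sigmoid(W["f"] @ x_t + U["f"] @ h_prev + b["f"])  # forget gate
    i = sigmoid(W["i"] @ x_t + U["i"] @ h_prev + b["i"])  # input gate
    o = sigmoid(W["o"] @ x_t + U["o"] @ h_prev + b["o"])  # output gate
    g = np.tanh(W["g"] @ x_t + U["g"] @ h_prev + b["g"])  # candidate cell state
    c_t = f * c_prev + i * g       # new cell state
    h_t = o * np.tanh(c_t)         # new hidden state
    return h_t, c_t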

Add an interpolated function to neural networks via ElementwiseLayer

1. Try to add an interpolated function to a neural network.

ifun = Interpolation[Table[{x, Tanh[x]}, {x, -100, 100, 0.2}], 
  InterpolationOrder -> 1]
Plot[{ifun[x], Tanh[x]}, {x, -4, 4}]

2. We use the function in a network.

net = NetChain[{30, ElementwiseLayer[ifun], 20, 
   ElementwiseLayer[ifun], 3, SoftmaxLayer[]}, "Input" -> {2}, 
  "Output" -> NetDecoder[{"Class", {Red, Green, Blue}}]]

3. Errors

ElementwiseLayer::invscf: InterpolatingFunction[{{-100.,100.}},... could not be symbolically evaluated as a unary scalar function.

4. Please tell me how I can fix this problem.

Python – preprocessing audio data sets (cleaning useless WAV files) to train a neural network

I need to remove bad audio files in order to train a neural network for command recognition. I am using Google's Speech Commands dataset, but many files have too much background noise or are cut off in the middle of the word, so they cannot be fed to the neural network.
I tried using PCA (Principal Component Analysis) to draw a 3D diagram, following an example I saw here:

https://www.kaggle.com/davids1992/speech-representation-and-data-exploration

but it didn't yield much.
P.S. I tried 300 audio files of a single word, not a full folder, and manually compared the file names against the 3D plot.
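A possible automated screening step (a sketch only, assuming librosa and scikit-learn are available; the folder path is a placeholder): summarize each file by MFCC statistics, compress with PCA, and let an outlier detector flag candidates instead of reading the 3D plot by eye.

import glob
import numpy as np
import librosa
from sklearn.decomposition import PCA
from sklearn.ensemble import IsolationForest

def features(path, sr=16000):
    y, _ = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    # summarize each file by the mean and std of its MFCCs
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

paths = sorted(glob.glob("speech_commands/yes/*.wav"))  # placeholder folder
X = np.stack([features(p) for p in paths])

X2 = PCA(n_components=10).fit_transform(X)              # compress before scoring
flags = IsolationForest(contamination=0.05, random_state=0).fit_predict(X2)
bad = [p for p, f in zip(paths, flags) if f == -1]      # candidates to discard
print("\n".join(bad))

Files flagged with -1 would still deserve a quick manual listen before being discarded.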

Create loss ports for a neural network with multiple outputs

I am creating a neural network with multiple classifications for a data set. I created the network, but I think I have to provide a loss port for each classification.

Here are the labels for the classification and the encoders & decoders.

labels = {"Dark Colour", "Light Colour", "Mixture"}
sublabels = {"Blue", "Yellow", "Mauve"}
labeldec = NetDecoder[{"Class", labels}];
sublabdec = NetDecoder[{"Class", sublabels}];
bothdec = NetDecoder[{"Class", Flatten@{labels, sublabels}}]

enc = NetEncoder[{"Class", {"Dark Colour", "Light Colour", "Mixture", 
    "Blue", "Yellow", "Mauve"}}]

Here is the network

SNNnet[inputno_, outputno_, dropoutrate_, nlayers_, class_: True] := 
 Module[{nhidden, linin, linout, bias},
  nhidden = Flatten[{Table[{(nlayers*100) - i},
      {i, 0, (nlayers*100), 100}]}];
  linin = Flatten[{inputno, nhidden[[;; -2]]}];
  linout = Flatten[{nhidden[[1 ;; -2]], outputno}];
  NetChain[
   Join[
    Table[
     NetChain[
      {BatchNormalizationLayer[],
       LinearLayer[linout[[i]], "Input" -> linin[[i]]],
       ElementwiseLayer["SELU"],
       DropoutLayer[dropoutrate]}],
     {i, Length[nhidden] - 1}],
    {LinearLayer[outputno],
     If[class, SoftmaxLayer[],
      Nothing]}]]]

net = NetInitialize@SNNnet[4, 6, 0.01, 8, True];

Here are the nodes used for the NetGraph function

nodes = Association["net" -> net, "l1" -> LinearLayer[3], 
   "sm1" -> SoftmaxLayer[], "l2" -> LinearLayer[3], 
   "sm2" -> SoftmaxLayer[],
   "myloss1" -> CrossEntropyLossLayer["Index", "Target" -> enc],
   "myloss2" -> CrossEntropyLossLayer["Index", "Target" -> enc]];

Here's what the NetGraph should do

connectivity = {NetPort["Data"] -> 
    "net" -> "l1" -> "sm1" -> NetPort["Label"],
   "sm1" -> NetPort["myloss1", "Input"],
   NetPort[sublabels] -> NetPort["myloss1", "Target"], 
   "myloss1" -> NetPort["Loss1"],
   "net" -> "l2" -> "sm2" -> NetPort["Sublabel"],
   "myloss2" -> NetPort["Loss2"],
   "sm2" -> NetPort["myloss2", "Input"],
   NetPort[labels] -> NetPort["myloss2", "Target"]};

At "net" the data diverge for each classification and pass through the subsequent linear and softmax layers to the corresponding NetPort.
The problem is the loss port that branches off each softmax layer.

When I run this code

NetGraph[nodes, connectivity, "Label" -> labeldec, 
 "Sublabel" -> sublabdec]

I get the error message: NetGraph::invedgesrc: NetPort[{Blue, Yellow, Mauve}] is not a valid source for NetPort[{myloss1, Target}].

Can anyone tell me why this happens?

Thank you for reading.

Neural Networks – Zero-Sum Games and the Halting Problem

Wikipedia states on the halting problem page: "For any program f that might determine whether programs halt, a 'pathological' program g, called with some input, can pass its own source and its input to f and then specifically do the opposite of what f predicts g will do."

Suppose we have two neural networks that approximate f and g (where infinite size and depth are allowed in the limit). Is it wrong to assume that these two networks are playing a zero-sum game against each other, and that the Nash equilibrium of these two networks contains the solution to the halting problem?

object-oriented – MNIST neural network in C++

While reading an online book about neural networks (Michael A. Nielsen, Neural Networks and Deep Learning, Determination Press, 2015), I decided to build a neural network that does not require a predefined network size, i.e. the number of layers and the layer sizes are determined by input arguments.

My goal was to make the network object modular so that different training principles can be attached to it later. Main is then responsible for calling the modules in such a way that they result in training, testing, or displaying the results.

I tried to program with OOP concepts in mind. However, I struggle with what the NeuralNetwork object should handle and what should be handled in Main. Stack Overflow mentions that an object should be responsible for all of its own affairs, including I/O. But where do I draw the line? For example, the network is responsible for saving and loading its results, but not for reading the parameter file that specifies the network size to be loaded.

As a fairly inexperienced C++ programmer, I welcome any insights that help me improve my skills.

The code is also on GitHub: https://github.com/vanderboog/mnist-neural-network

You can find the manual in the GitHub link.

Neural_Network.h

struct CLayer
{
    // Variables used in each layer
    arma::dvec a;   // activations
    arma::dvec z;   // weighted inputs (pre-activation)
    arma::dvec b;   // biases
    arma::dvec db;  // accumulated bias gradients
    arma::dmat w;   // weights
    arma::dmat dw;  // accumulated weight gradients
    arma::dvec kD;  // error term (dC/dz)
};

class NeuralNetwork
{
    int numOfLayers_;
    int learnSetSize_;
    double learnRate_;
    double regularization_;
    double halfRegularization_;
    int learnReductionCycle_;
    double learnReductionFactor_;
    int iCountEpoch_;
    int digit_;
    std::string setSavePath_;

    // Smart pointers are used to ensure memory is freed. The arrays are not always
    // allocated and can therefore not simply be freed in a destructor.
    std::unique_ptr<int[]> sizeLayer;
    std::unique_ptr<CLayer[]> pLayer;

public:
    arma::dvec cost;

    NeuralNetwork();

    void initializeLayers(int, int *, std::string);
    void setHyperParameters(double, double, double);
    void layerInfo();
    void training(const arma::dmat &, const arma::uvec &);
    arma::uvec yVectorGenerator(const arma::uword &);
    arma::dvec sigmoid(arma::dvec &);
    arma::dvec Dsigmoid(arma::dvec &);
    int computePerformance(const arma::dmat &, const arma::uvec &);
    int feedForward(const arma::dvec &);
    void setLearningReductionParameters(double, int);
    void reduceLearnRate(double);
    void storeResults();
    void loadResults(const std::string &, int, int *);
};

Neural_Network.cpp

#include <armadillo>
#include <chrono>
#include <iostream>
#include <random>
#include "Neural_Network.h"

NeuralNetwork::NeuralNetwork() :
    learnSetSize_(100),
    learnReductionCycle_(1000),
    learnReductionFactor_(1),
    learnRate_(0.1),
    regularization_(0),
    halfRegularization_(regularization_ / 2),
    iCountEpoch_(0)
{}


void NeuralNetwork::initializeLayers(int numOfLayers, int *pLayerSize, std::string setSavePath)
{
    ///////////////////////////////////////////////////////
    /// Creates layers and sets component sizes.
    /// Layers are initialized ready for training
    //////////////////////////////////////////////////////
    setSavePath_ = setSavePath;
    numOfLayers_ = numOfLayers;
    sizeLayer = std::unique_ptr<int[]>(new int[numOfLayers_]);
    for (int iLayer = 0; iLayer < numOfLayers_; iLayer++)
    {
        sizeLayer[iLayer] = pLayerSize[iLayer];
    }

    /// Create the layers and initialize parameters;
    pLayer = std::unique_ptr<CLayer[]>(new CLayer[numOfLayers_]);
    pLayer[0].a.set_size(sizeLayer[0]); // Treat first layer different as it does not have b, w, nor kD
    for (int iLayer = 1; iLayer < numOfLayers_; iLayer++)
    {
        // Initialize: matrix and vector sizes
        pLayer[iLayer].a.set_size(sizeLayer[iLayer]);
        pLayer[iLayer].z.set_size(sizeLayer[iLayer]);
        pLayer[iLayer].b = arma::randn(sizeLayer[iLayer]);
        pLayer[iLayer].w.set_size(sizeLayer[iLayer], sizeLayer[iLayer - 1]);
        pLayer[iLayer].kD.set_size(sizeLayer[iLayer]);
        pLayer[iLayer].db = pLayer[iLayer].b;
        pLayer[iLayer].dw = pLayer[iLayer].w;

        /// Generate Gaussian random values with standard deviation dependent on layer sizes.
        std::default_random_engine generator{static_cast<unsigned>(std::chrono::high_resolution_clock::now().time_since_epoch().count())}; // Use high precision time to determine random seed
        std::normal_distribution<double> distribution(0.0, sqrt((double)sizeLayer[iLayer - 1]));                                            // Generate random values with stdev based on incoming layer
        for (arma::uword iRow = 0; iRow < sizeLayer[iLayer]; iRow++)
        {
            for (arma::uword iCol = 0; iCol < sizeLayer[iLayer - 1]; iCol++)
            {
                pLayer[iLayer].w(iRow, iCol) = distribution(generator);
            }
        }
    }
}

void NeuralNetwork::setHyperParameters(double learnSetSize, double learnRate, double regularization)
{
    learnSetSize_ = learnSetSize;
    learnRate_ = learnRate;
    regularization_ = regularization;
    halfRegularization_ = regularization_ / 2;
    std::cout << "Hyper parameters settings:nt- Learning set size = " << learnSetSize_ << "nt- Learning parameter (learnRate_) = " << learnRate_ << "nt- Regularization_ parameter (lambda) = " << regularization_ << "n";
}

void NeuralNetwork::layerInfo()
{
    /// Outputs layers information
    std::cout << "Number of layers: t" << numOfLayers_ << "n";
    // std::cout << "Number of neurons in layer 1: t" << sizeLayer(0) << "n";
    for (int iLayer = 0; iLayer < numOfLayers_; iLayer++)
    {
        std::cout << "Number of neurons in layer " << iLayer + 1 << ": t" << sizeLayer(iLayer) << "n";
    }

    for (int iLayer = 1; iLayer < numOfLayers_; iLayer++)
    {
        std::cout << "Weight matrix size (rows by cols) to layer " << iLayer + 1 << ": t" << pLayer(iLayer).w.n_rows << " x " << pLayer(iLayer).w.n_cols << "n";
    }
}

void NeuralNetwork::training(const arma::dmat &trainingSet, const arma::uvec &trainingLabels)
{
    ///////////////////////////////////////////////////////
    /// Training the neural network by feeding it one epoch
    ///////////////////////////////////////////////////////
    /// Initialize
    int numOfCol = trainingSet.n_cols;
    int numOfRow = trainingSet.n_rows;
    arma::uvec yVector(sizeLayer[numOfLayers_ - 1]);
    arma::uvec oneVector(sizeLayer[numOfLayers_ - 1], arma::fill::ones);
    arma::uvec sampleStack_i = arma::randperm(numOfCol);

    /// Reduce learnRate_ if -reduceLearnRate is used
    if(iCountEpoch_ % learnReductionCycle_ == 0 && iCountEpoch_ != 0)
    {
        reduceLearnRate(learnReductionFactor_);
    }

    int numOfCyclesPerEpoch = numOfCol / learnSetSize_; // Compute amount of cycles making up one epoch and only loop over complete cycles, omitting remaining samples
    /// Cycle through the epoch and apply learning after each cycle
    cost = arma::zeros(numOfCyclesPerEpoch);
    for (int iCycle = 0; iCycle < numOfCyclesPerEpoch; iCycle++)
    {
        int iSampleOffset = iCycle * learnSetSize_;

        /// Set dw and db to zero before each cycle
        for (int iLayer = 1; iLayer < numOfLayers_; iLayer++)
        {
            pLayer[iLayer].db.zeros(pLayer[iLayer].db.n_rows, pLayer[iLayer].db.n_cols);
            pLayer[iLayer].dw.zeros(pLayer[iLayer].dw.n_rows, pLayer[iLayer].dw.n_cols);
        }

        for (int iSample = 0; iSample < learnSetSize_; iSample++)
        {
            /// Load the image and create label vector (yVector)
            pLayer[0].a = trainingSet.col(sampleStack_i(iSample + iSampleOffset));
            yVector = yVectorGenerator(trainingLabels(sampleStack_i(iSample + iSampleOffset)));

            /// Feed forward
            digit_ = feedForward(pLayer[0].a);

            /// Compute cost (-= is used instead of -1*)
            cost(iCycle) -= as_scalar(trans(yVector) * log(pLayer[numOfLayers_ - 1].a) + trans(oneVector - yVector) * log(oneVector - pLayer[numOfLayers_ - 1].a));
            /// Add regularization_ term:
            if (regularization_ != 0)  // Skip overhead computation in case of 0
            {
                for (int iLayer = 1; iLayer < numOfLayers_; iLayer++)
                {
                    cost(iCycle) += halfRegularization_ * accu(pLayer[iLayer].w % pLayer[iLayer].w);  // Expensive
                }
            }

            /// Back propagation
            /// Compute error terms: dC/dz
            pLayer[numOfLayers_ - 1].kD = pLayer[numOfLayers_ - 1].a - yVector;
            for (int iLayer = numOfLayers_ - 2; iLayer > 0; iLayer--)
            {
                pLayer[iLayer].kD = pLayer[iLayer + 1].w.t() * pLayer[iLayer + 1].kD % Dsigmoid(pLayer[iLayer].z);
            }
            /// Compute gradient descent of w and b (separate loop for clarity)
            for (int iLayer = 1; iLayer < numOfLayers_; iLayer++)
            {
                pLayer[iLayer].dw += arma::kron(pLayer[iLayer].kD, pLayer[iLayer - 1].a.t());
                pLayer[iLayer].db += pLayer[iLayer].kD;
            }
            }
        }

        /// Apply gradient descent on w and b
        for (int iLayer = 1; iLayer < numOfLayers_; iLayer++)
        {
            pLayer[iLayer].w -= learnRate_ * (pLayer[iLayer].dw + regularization_ * pLayer[iLayer].w) / learnSetSize_; // with regularization_ term
            pLayer[iLayer].b -= learnRate_ * pLayer[iLayer].db / learnSetSize_;
        }

        cost = cost / learnSetSize_;
    }
    iCountEpoch_++;
}
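
// Summary of the backpropagation identities implemented in training() above
// (sigmoid activations, cross-entropy cost with L2 regularization):
//   kD_L = a_L - y                               (output-layer error, dC/dz_L)
//   kD_l = w_{l+1}^T kD_{l+1} % Dsigmoid(z_l)    (hidden-layer error)
//   dw_l += kD_l * a_{l-1}^T,  db_l += kD_l      (gradients summed over the batch)
// Each cycle then applies w -= learnRate_ * (dw + regularization_ * w) / learnSetSize_.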

arma::uvec NeuralNetwork::yVectorGenerator(const arma::uword &label)
{
    /// Generates a vector representation of the label: vector of zeros, with at the labelth index a 1
    arma::uvec y = arma::zeros<arma::uvec>(sizeLayer[numOfLayers_ - 1]);
    y(label) = 1;
    return y;
}

arma::dvec NeuralNetwork::sigmoid(arma::dvec &z)
{
    return 1 / (1 + exp(-z));
}

arma::dvec NeuralNetwork::Dsigmoid(arma::dvec &z)
{
    arma::dvec dS = sigmoid(z);
    return (dS % (1 - dS)); // %: Schur product, i.e. element-wise product
}

int NeuralNetwork::computePerformance(const arma::dmat &testSet, const arma::uvec &testLabels)
{
    ////////////////////////////////////////////
    /// Compute network performance based on the test set
    ////////////////////////////////////////////

    int iCountCorrect = 0;
    int sizeSet = testSet.n_cols;
    for (int iSample = 0; iSample < sizeSet; iSample++)
    {
        // Load testimage & apply feedforward. Count the correct answers
        if (feedForward(testSet.col(iSample)) == testLabels(iSample))
        {
            iCountCorrect++;
        }
    }
    std::cout << "Performance: " << iCountCorrect << " / " << sizeSet << "n";
    return iCountCorrect;
}

int NeuralNetwork::feedForward(const arma::dvec &imVector)
{
    /// Apply feedforward to determine and return the network answer
    pLayer[0].a = imVector;
    for (int iLayer = 1; iLayer < numOfLayers_; iLayer++)
    {
        pLayer[iLayer].z = pLayer[iLayer].w * pLayer[iLayer - 1].a + pLayer[iLayer].b;
        pLayer[iLayer].a = sigmoid(pLayer[iLayer].z);
    }
    return pLayer[numOfLayers_ - 1].a.index_max();
}

void NeuralNetwork::setLearningReductionParameters(double learnReductionFactor, int learnReductionCycle)
{
    learnReductionFactor_ = learnReductionFactor;
    learnReductionCycle_ = learnReductionCycle;
    std::cout << "Learning rate reduction factor: " << learnReductionFactor_ << "n";
    std::cout << "Learning rate reduction cycle: " << learnReductionCycle_ << "n";
}

void NeuralNetwork::reduceLearnRate(double factor)
{
    learnRate_ = learnRate_ / factor;
    std::cout << "learnRate_ reduced to:t" << learnRate_ << "n";
}

void NeuralNetwork::storeResults()
{
    /// Store essential parameters of the network: weights and biases
    for (int iLayer = 1; iLayer < numOfLayers_; iLayer++)
    {
        pLayer[iLayer].w.save(setSavePath_ + "/w" + std::to_string(iLayer + 1));
        pLayer[iLayer].b.save(setSavePath_ + "/b" + std::to_string(iLayer + 1));
    }
}

void NeuralNetwork::loadResults(const std::string &setSavePath, int numOfLayers, int *layerSize)
{
    setSavePath_ = setSavePath;
    numOfLayers_ = numOfLayers;

    /// Load the actual stored data
    for (int iLayer = 1; iLayer < numOfLayers_; iLayer++)
    {
        std::cout << "Loading file: " << (setSavePath_ + "/w" + std::to_string(iLayer + 1)) << "n";
        pLayer(iLayer).w.load(setSavePath_ + "/w" + std::to_string(iLayer + 1));
        std::cout << "Loading file: " << (setSavePath_ + "/b" + std::to_string(iLayer + 1)) << "n";
        pLayer(iLayer).b.load(setSavePath_ + "/b" + std::to_string(iLayer + 1));
    }

    layerInfo();
}

Main.cpp

#include <chrono>
#include <fstream>
#include <iomanip>
#include <iostream>
#include <string>
#include <armadillo>
#include <boost/filesystem.hpp>
#include <opencv2/core.hpp>
#include <opencv2/highgui.hpp>
#include "Neural_Network.h"
#include "ReadMNIST.h"
#include "Visualization.h"


std::string setPathSave(std::string const setPath)
{
    /// Make sure Result_Network directory exists
    if (!boost::filesystem::exists(setPath))
    {
        boost::filesystem::create_directory(setPath);
    }

    /// Set save path to a unique path of 'Results_##', found by incrementing from 1 
    /// to 32. If the full range is used, the save path is set to 'Result_32'
    std::string setSavePath;
    for (int iFolder = 1; iFolder < 33; iFolder++)
    {
        setSavePath = setPath + "/Results_" + std::to_string(iFolder);
        if (!boost::filesystem::exists(setSavePath))
        {
            boost::filesystem::create_directory(setSavePath);
            break;
        }
    }

    std::cout << "Save path is set to: " << setSavePath << "n";
    return setSavePath;
}

void showUsage()
{
    std::cout << std::left << std::setw(92) << "Options available in this program:" << std::endl;
    std::cout << std::setw(2) << "" << std::setw(18) << "-train" << std::setw(72) << "Train a new neural network. This mode requires the training set and " << std::endl;
    std::cout << std::setw(20) << "" << std::setw(72) << "labels. See training options below for more details." << std::endl;
    std::cout << std::setw(2) << "" << std::setw(18) << "-test" << std::setw(72) << "Test a trained network. This mode requires a trained network stored in " << std::endl;
    std::cout << std::setw(20) << "" << std::setw(72) << "Results_Network and the test set. After '-test' refer to the folder " << std::endl;
    std::cout << std::setw(20) << "" << std::setw(72) << "containing the results by the trailing number in the folder name, e.g." << std::endl;
    std::cout << std::setw(20) << "" << std::setw(72) << "'-test 1' to test the network in 'Network_Results/Results_1'. See test " << std::endl;
    std::cout << std::setw(20) << "" << std::setw(72) << "options below for more details.\n"
              << std::endl;

    std::cout << std::left << std::setw(92) << "Training options: " << std::endl;
    std::cout << std::setw(2)  << "" << std::setw(18) << "-layers" << std::setw(72) << "Set the total amount of layers and layer sizes used in the network," << std::endl;
    std::cout << std::setw(20) << "" << std::setw(72) << "including the input and output layer. After '-layers', the total number" << std::endl;
    std::cout << std::setw(20) << "" << std::setw(72) << "of layers is required. Thereafter, the layer size should be given in" << std::endl;
    std::cout << std::setw(20) << "" << std::setw(72) << "curly brackets, e.g. 'layers 3 {784,30,10}'." << std::endl;
    std::cout << std::setw(2)  << "" << std::setw(18) << "-param" << std::setw(72) << "Set learning hyperparameters. Parameters which are to be set are: batch" << std::endl;
    std::cout << std::setw(20) << "" << std::setw(72) << "size before learning step, learning rate, and the regularization" << std::endl;
    std::cout << std::setw(20) << "" << std::setw(72) << "parameter, respectively. In case no regularization is to be used, the" << std::endl;
    std::cout << std::setw(20) << "" << std::setw(72) << "parameter is to be set to zero, e.g, '-param {1000,0.1,0}'" << std::endl;
    std::cout << std::setw(2)  << "" << std::setw(18) << "-reduceLearning" << std::setw(72) << "Used to reduce the learning parameter by {factor x, per y epoch}," << std::endl;
    std::cout << std::setw(20) << "" << std::setw(72) << "e.g. -reduceLearning {2,20}.\n"
              << std::endl;

    std::cout << std::left << std::setw(92) << "Test options:" << std::endl;
    std::cout << std::setw(2)  << "" << std::setw(18) << "-display" << std::setw(72) << "Opens a window to visualize the test images in a random sequence." << std::endl;
    std::cout << std::setw(20) << "" << std::setw(72) << "Visualization can be stopped by pressing <q>." << std::endl;
}

int main(int argc, const char **argv)
{
    /// Test if sufficient arguments are given
    if (argc < 2)
    {
        std::cerr << "No arguments are given. Use --help to show options.nTerminating program." << std::endl;
        return 1;
    }

    /// Initialize paths
    std::string const setPath = getCurrentDir(); // part of "readmnist.h"
    std::string const setPathTrainingImages = setPath + "/../Training_images/train-images.idx3-ubyte";
    std::string const setPathTrainingLabels = setPath + "/../Training_images/train-labels.idx1-ubyte";
    std::string const setPathTestImages = setPath + "/../Test_images/t10k-images.idx3-ubyte";
    std::string const setPathTestLabels = setPath + "/../Test_images/t10k-labels.idx1-ubyte";
    std::string const setPathResults = setPath + "/../Results_Network";

    NeuralNetwork network;

    /// Interpret if program is used for training or testing
    if (std::string(argv[1]) == "-train")
    {
        /// Determine path to store results:
        std::string setSavePath = setPathSave(setPathResults);

        /// Store file containing input arguments:
        std::ofstream outputFile;
        outputFile.open(setSavePath + "/Input_parameters");
        for (int iArgv = 2; iArgv < argc + 1; iArgv++)
        {
            outputFile << argv[iArgv] << "\t";
        }
        outputFile.close();

        // Cycle through arguments given and apply settings to the neural network
        for (int iArgc = 2; iArgc < argc; iArgc++)
        {
            if (std::string(argv[iArgc]) == "-layers")
            {
                /// Used to set the layers of the neural network.
                /// The first trailing argument should be the number of layers. Subsequently, the layer sizes are to be given in separate arguments,
                /// starting from the input layer, up to the output layer. E.g. '-layers 3 {784,30,10}'
                int *pLayers = new int[atoi(argv[iArgc + 1])];
                std::cout << "Layers found: \n";
                for (int iLayer = 0; iLayer < atoi(argv[iArgc + 1]); iLayer++)
                {
                    pLayers[iLayer] = atoi(argv[iArgc + 2 + iLayer]);
                    std::cout << pLayers[iLayer] << "\t";
                }
                std::cout << "\n";
                network.initializeLayers(atoi(argv[iArgc + 1]), pLayers, setSavePath);
                delete[] pLayers;
                network.layerInfo();
                iArgc += atoi(argv[iArgc + 1]) + 1;
            }
            else if (std::string(argv[iArgc]) == "-param")
            {
                /// Used to set hyperparameters directly related to learning: { sample size before learning step, eta (learning rate), lambda (regularization) }
                network.setHyperParameters(atof(argv[iArgc + 1]), atof(argv[iArgc + 2]), atof(argv[iArgc + 3]));
                iArgc += 3;
            }
            else if (std::string(argv[iArgc]) == "-reduceLearning")
            {
                /// Used to reduce the learning rate at given intervals. Parameter order: { reduction factor, after # cycles }
                network.setLearningReductionParameters(atof(argv[iArgc + 1]), atoi(argv[iArgc + 2]));
                iArgc += 2;
            }
            else
            {
                std::cerr << "The argument '" << argv[iArgc] << "' is unknown to the program. Use --help to show viable options." << std::endl;
                return 2;
            }
        }

        /// Load data for training:
        std::cout << "Loading data...n";
        // Reads images and returns a matrix(pxValue, numOfImages)
        arma::dmat const trainingSet = readMnistImages(setPathTrainingImages);
        arma::uvec const trainingLabels = readMnistLabels(setPathTrainingLabels, trainingSet.n_cols);

        // Read test images to determine the score
        arma::dmat const testSet = readMnistImages(setPathTestImages);
        arma::uvec const testLabels = readMnistLabels(setPathTestLabels, testSet.n_cols);

        /// Start training:
        int iCountScore = 0;
        int iEpocheCount = 0;
        while (iEpocheCount < 70)
        {
            // Perform a training cycle (one epoch)
            network.training(trainingSet, trainingLabels);
            iEpocheCount += 1;

            std::cout << "Epoch counter: " << iEpocheCount << "\t\tAverage cost: " << arma::mean(network.cost) << "\n";
            iCountScore = network.computePerformance(testSet, testLabels);

            /// Store results every epoch
            network.storeResults();
        }
    }
    else if (std::string(argv[1]) == "-test")
    {
        /// Load test files
        std::cout << "Loading data...n";
        arma::dmat const testSet = readMnistImages(setPathTestImages);
        arma::uvec const testLabels = readMnistLabels(setPathTestLabels, testSet.n_cols);

        /// Read parameters from parameter file
        std::ifstream inFile;
        std::string const setPathToLoad = setPathResults + "/Results_" + argv[2] + "/Input_parameters";

        inFile.open(setPathToLoad);
        if (inFile.is_open())
        {
            /// Read parameters to determine set correct network size
            int numOfLayers;
            int *pLayer;
            std::string arg;
            while (inFile >> arg)
            {
                if (arg == "-layers")
                {
                    inFile >> arg;
                    numOfLayers = stoi(arg);
                    pLayer = new int[numOfLayers];
                    for (int iLayer = 0; iLayer < numOfLayers; iLayer++)
                    {
                        inFile >> arg;
                        pLayer[iLayer] = stoi(arg);
                    }

                    /// Initialize weights and biases sizes and load results
                    network.initializeLayers(numOfLayers, pLayer, setPathResults + "/Results_" + argv[2]);
                    network.loadResults(setPathResults + "/Results_" + argv[2], numOfLayers, pLayer);
                }
            }
            /// Compute and output the score
            network.computePerformance(testSet, testLabels);
            inFile.close();
            delete[] pLayer;
        }
        else
        {
            std::cerr << "Unable to open a result file: " << setPathToLoad << std::endl;
            return 3;
        }

        // Cycle through arguments given and apply settings
        for (int iArgc = 3; iArgc < argc; iArgc++)
        {
            if (std::string(argv[iArgc]) == "-display")
            {
                /// Display results in random order
                arma::arma_rng::set_seed(std::chrono::high_resolution_clock::now().time_since_epoch().count());
                arma::uvec sequence = arma::randperm(testSet.n_cols);

                int digit;
                std::string strDigit;
                int countDisplays = 0;
                arma::Mat<double> imArma;
                for (arma::uword iSequence : sequence)
                {
                    /// Run a sample through the network and obtain result
                    digit = -1;
                    digit = network.feedForward(testSet.col(iSequence));
                    strDigit = std::to_string(digit);

                    /// Reshape the image vector into a matrix and convert to openCV format
                    imArma = reshape(round(testSet.col(iSequence) * 256), 28, 28);
                    cv::Mat imDigit(28, 28, CV_64FC1, imArma.memptr());

                    /// Display the sample image with the networks answer
                    displayImage(imDigit, strDigit);
                    countDisplays++;

                    /// Give option to end the program
                    if (cv::waitKey(3000) == 'q')
                    {
                        break;
                    };
                }
            }
            else
            {
                std::cerr << "The argument '" << argv(iArgc) << "' is unknown to the program. Use --help to show viable options." << std::endl;
                return 2;
            }
        }
    }
    else if (std::string(argv(1)) == "--help")
    {
        showUsage();
    }
    else
    {
        std::cerr << "The argument " << argv(1) << " is unknown to this program. Use --help to show viable options." << std::endl;
        return 2;
    }
    return 0;
}

ReadMNIST.h

arma::dmat readMnistImages( std::string);
arma::uvec readMnistLabels( std::string, arma::uword );
std::string getCurrentDir();

ReadMNIST.cpp

#include <fstream>
#include <iostream>
#include <armadillo>
#include "ReadMNIST.h"

#ifdef WINDOWS
#include <direct.h>
#define GetCurrentDir _getcwd
#else
#include <unistd.h>
#define GetCurrentDir getcwd
#endif

// Miscellaneous function
int reverseInt(int iSample)
{
    unsigned char ch1, ch2, ch3, ch4;
    ch1 = iSample & 255;
    ch2 = (iSample >> 8) & 255;
    ch3 = (iSample >> 16) & 255;
    ch4 = (iSample >> 24) & 255;
    return ((int)ch1 << 24) + ((int)ch2 << 16) + ((int)ch3 << 8) + ch4;
}

// Return a matrix containing the trainingset images. Format: (numOfImages, pxValue)
arma::dmat readMnistImages(std::string setPath)
{
    arma::umat imSet;
    std::ifstream file(setPath, std::ios::binary);
    if (file.is_open())
    {
        int magicNumber = 0;
        int numOfImages = 0;
        int imRows = 0;
        int imCols = 0;
        file.read((char *)&magicNumber, sizeof(magicNumber));
        magicNumber = reverseInt(magicNumber);
        file.read((char *)&numOfImages, sizeof(numOfImages));
        numOfImages = reverseInt(numOfImages);
        file.read((char *)&imRows, sizeof(imRows));
        imRows = reverseInt(imRows);
        file.read((char *)&imCols, sizeof(imCols));
        imCols = reverseInt(imCols);

        std::cout << "Images in the set: " << numOfImages << "n";
        std::cout << "Image size: " << imRows << "*" << imCols << "n";
        imSet.resize(numOfImages, imRows * imCols);

        for (int i = 0; i < numOfImages; ++i)
        {
            for (int r = 0; r < (imRows * imCols); ++r)
            {
                unsigned char input = 0;
                file.read((char *)&input, sizeof(input));
                imSet(i, r) = (double)input;
            }
        }
    }
    return (arma::conv_to<arma::dmat>::from(imSet.t()) / 256);
}

// Return a column containing the labels per image
arma::uvec readMnistLabels(std::string setPath, arma::uword numOfLabels)
{
    arma::uvec labelVector(numOfLabels);
    std::cout << "Number of labels: " << numOfLabels << "nn";

    std::ifstream file(setPath, std::ios::binary);
    if (file.is_open())
    {
        int magicNumber = 0;
        int numOfLabels = 0;
        file.read((char *)&magicNumber, sizeof(magicNumber));
        magicNumber = reverseInt(magicNumber);
        file.read((char *)&numOfLabels, sizeof(numOfLabels));
        numOfLabels = reverseInt(numOfLabels);

        for (int iSample = 0; iSample < numOfLabels; ++iSample)
        {
            unsigned char input = 0;
            file.read((char *)&input, sizeof(input));
            labelVector(iSample) = (double)input;
        }
    }
    return labelVector;
}

std::string getCurrentDir() {
   char buff[FILENAME_MAX]; // create string buffer to hold path
   GetCurrentDir( buff, FILENAME_MAX );
   std::string currentWorkingDir(buff);
   return currentWorkingDir;
}

Visualization.h

void displayImage( const cv::Mat &, const std::string );

Visualization.cpp

#include <opencv2/opencv.hpp>
#include "Visualization.h"

void displayImage(const cv::Mat &im, const std::string strDigit)
{
    ////////////////////////////////////////////////////////////////////////////////////////////
    /// Scales the image into readable size and prints the network result onto image
    ////////////////////////////////////////////////////////////////////////////////////////////

    cv::Mat imScaled;
    cv::resize(im, imScaled, cv::Size(280, 280));

    // Write digit label on image
    cv::putText(imScaled,
                strDigit,
                cv::Point(5, 20),               // Coordinates
                cv::FONT_HERSHEY_COMPLEX_SMALL, // Font
                1.0,                            // Scale. 2.0 = 2x bigger
                cv::Scalar(255, 0, 0),          // BGR Color
                1);                             // Line Thickness (Optional)

    /// Write required action to close the program
    cv::putText(imScaled,
                "Press  to close",
                cv::Point(5, 275),              // Coordinates
                cv::FONT_HERSHEY_COMPLEX_SMALL, // Font
                0.5,                            // Scale. 2.0 = 2x bigger
                cv::Scalar(255, 0, 0),          // BGR Color
                1);                             // Line Thickness (Optional)

    cv::imshow("Test image", imScaled);
}