Machine Learning – Neural network from scratch in Python

After watching Week 5 of Andrew Ng's machine learning course on Coursera, I decided to write a simple neural network from scratch in Python. Here is my code:

import numpy as np
import csv
global e
global epsilon
global a
global lam
global itr
e = 2.718281828
epsilon = 0.12
a = 1
lam = 1
itr = 1000

# The sigmoid function(and its derivative)
def sigmoid(x, derivative=False):
    if derivative:
        return sigmoid(x) * (1 - sigmoid(x))
    return 1 / (1 + e**-x)

# The cost function
def J(X, theta1, theta2, y, lam, m):
    j = 0 
    for i in range(m):
        # The current case
        currX = X[i].reshape(X[i].shape[0], 1)
        z2 = theta1 @ currX
        a2 = sigmoid(z2)
        a2 = np.append([1], a2).reshape(a2.shape[0] + 1, 1)
        z3 = theta2 @ a2
        a3 = sigmoid(z3)
        j += sum(-y[i] * np.log(a3) - (1 - y[i]) * np.log(1 - a3)) / m + (lam / (2 * m)) * (sum(sum(theta1[:, 1:] ** 2)) + sum(sum(theta2[:, 1:] ** 2)))
    return j

# The gradients
def gradient(X, theta1, theta2, y, lam, m):
    theta1Grad = np.zeros(theta1.shape)
    theta2Grad = np.zeros(theta2.shape)
    Delta1 = np.zeros(theta1.shape)
    Delta2 = np.zeros(theta2.shape)
    for i in range(m):
        # The current case
        currX = X[i].reshape(X[i].shape[0], 1)
        z2 = theta1 @ currX
        a2 = sigmoid(z2)
        a2 = np.append([1], a2).reshape(a2.shape[0] + 1, 1)
        z3 = theta2 @ a2
        a3 = sigmoid(z3)
        delta3 = a3 - y[i]
        delta2 = theta2[:, 1:].T @ delta3 * sigmoid(z2, derivative=True)
        Delta1 += delta2 @ currX.reshape(1, -1)
        Delta2 += delta3 * a2.reshape(1, -1)
    theta1Grad = Delta1 / m
    theta2Grad = Delta2 / m
    theta1Grad[:, 1:] += (lam / m) * theta1[:, 1:]
    theta2Grad[:, 1:] += (lam / m) * theta2[:, 1:]
    thetaGrad = np.append(theta1Grad.reshape(theta1Grad.shape[0] * theta1Grad.shape[1], 1), theta2Grad.reshape(theta2Grad.shape[0] * theta2Grad.shape[1], 1))
    thetaGrad = thetaGrad.reshape(thetaGrad.shape[0], 1)
    return thetaGrad

# Gradient descent
def gradientDescent(X, theta1, theta2, y, lam, m):
    for i in range(itr):
        grad = gradient(X, theta1, theta2, y, lam, m)
        theta1Grad = grad[0:theta1.shape[0] * theta1.shape[1]].reshape(theta1.shape)
        theta2Grad = grad[theta1.shape[0] * theta1.shape[1]:].reshape(theta2.shape)
        theta1 = theta1 - a * theta1Grad
        theta2 = theta2 - a * theta2Grad
    return (theta1, theta2)

with open('data.csv', 'r') as f:
    data = csv.reader(f)
    d = []
    c = 0
    # Read the data
    for row in data:
        # Don't add the first line(it's our features' labels)
        if c == 0:
            c += 1
            continue
        curr_row = []
        k = 0
        for j in row:
            if j != '':
                if k == 1:
                    # Add a 1 between the y and x values(for the bias)
                    curr_row.append(1)
                curr_row.append(float(j))   
                k += 1
        d.append(curr_row)
    d = np.array(d)
    x = d[:, 1:]
    y = d[:, 0]
    # Split the data into training cases (80%) and test cases (20%)
    x_train = x[0:(d.shape[0]//5) * 4, :]
    y_train = y[0:(d.shape[0]//5) * 4]
    x_test = x[(d.shape[0]//5) * 4 : d.shape[0], :]
    y_test = y[(d.shape[0]//5) * 4 : d.shape[0]]
    # Initialize theta(s)
    theta1 = np.random.rand(5, x[0].shape[0]) * 2 * epsilon - epsilon
    theta2 = np.random.rand(1, 6) * 2 * epsilon - epsilon
    print(J(x_train, theta1, theta2, y_train, lam, x_train.shape[0]))
    theta1, theta2 = gradientDescent(x_train, theta1, theta2, y_train, lam, x_train.shape[0])

Please note that my data has only 2 possible outputs, so no one-vs-all classification is required.
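
For context, here is a minimal sketch (not part of my script above) of how the held-out 20% could be evaluated: it reuses the same forward pass and thresholds the single sigmoid output at 0.5, assuming the labels are 0/1.

def predict(X, theta1, theta2):
    preds = []
    for i in range(X.shape[0]):
        currX = X[i].reshape(X[i].shape[0], 1)
        a2 = sigmoid(theta1 @ currX)
        a2 = np.append([1], a2).reshape(-1, 1)   # add the bias unit
        a3 = sigmoid(theta2 @ a2)
        preds.append(1 if a3[0, 0] >= 0.5 else 0)
    return np.array(preds)

# accuracy on the 20% test split produced above
print(np.mean(predict(x_test, theta1, theta2) == y_test))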

Thank you in advance!

Settings – How can I disable the downloaded neural network for speech input?

Voice-to-text worked great on my Pixel 3 until I decided to accept the optional downloaded neural network, which promises "faster recognition" and works offline. The result was a drastic drop in accuracy, to the point where the feature became unusable, and I want to disable it and return to the normal recognition.

How can I do that?

Tensorflow – Can a neural network in R independently adjust its training parameters until the error between predicted and actual values is minimal?

I ran a script that predicts the USD/BTC pair. The data comes from an open source:
https://www.cryptodatadownload.com/apac/
https://www.cryptodatadownload.com/cdd/Binance_BTCUSDT_1h.csv
Here is the dput output (but you can also use the records from the original website):

structure(list(Date = structure(c(106L, 103L, 101L, 99L, 97L, 
95L, 93L, 91L, 89L, 87L, 85L, 83L, 105L, 102L, 100L, 98L, 96L, 
94L, 92L, 90L, 88L, 86L, 84L, 82L, 104L, 79L, 77L, 75L, 73L, 
71L, 69L, 67L, 65L, 63L, 61L, 59L, 81L, 78L, 76L, 74L, 72L, 70L, 
68L, 66L, 64L, 62L, 60L, 58L, 80L, 55L, 53L, 51L, 49L, 47L, 45L, 
43L, 41L, 39L, 37L, 35L, 57L, 54L, 52L, 50L, 48L, 46L, 44L, 42L, 
40L, 38L, 36L, 34L, 56L, 31L, 29L, 27L, 25L, 23L, 21L, 19L, 17L, 
15L, 13L, 11L, 33L, 30L, 28L, 26L, 24L, 22L, 20L, 18L, 16L, 14L, 
12L, 10L, 32L, 9L, 8L, 7L, 6L, 5L, 4L, 3L, 2L, 1L), .Label = c("2019-08-09 03-PM", 
"2019-08-09 04-PM", "2019-08-09 05-PM", "2019-08-09 06-PM", "2019-08-09 07-PM", 
"2019-08-09 08-PM", "2019-08-09 09-PM", "2019-08-09 10-PM", "2019-08-09 11-PM", 
"2019-08-10 01-AM", "2019-08-10 01-PM", "2019-08-10 02-AM", "2019-08-10 02-PM", 
"2019-08-10 03-AM", "2019-08-10 03-PM", "2019-08-10 04-AM", "2019-08-10 04-PM", 
"2019-08-10 05-AM", "2019-08-10 05-PM", "2019-08-10 06-AM", "2019-08-10 06-PM", 
"2019-08-10 07-AM", "2019-08-10 07-PM", "2019-08-10 08-AM", "2019-08-10 08-PM", 
"2019-08-10 09-AM", "2019-08-10 09-PM", "2019-08-10 10-AM", "2019-08-10 10-PM", 
"2019-08-10 11-AM", "2019-08-10 11-PM", "2019-08-10 12-AM", "2019-08-10 12-PM", 
"2019-08-11 01-AM", "2019-08-11 01-PM", "2019-08-11 02-AM", "2019-08-11 02-PM", 
"2019-08-11 03-AM", "2019-08-11 03-PM", "2019-08-11 04-AM", "2019-08-11 04-PM", 
"2019-08-11 05-AM", "2019-08-11 05-PM", "2019-08-11 06-AM", "2019-08-11 06-PM", 
"2019-08-11 07-AM", "2019-08-11 07-PM", "2019-08-11 08-AM", "2019-08-11 08-PM", 
"2019-08-11 09-AM", "2019-08-11 09-PM", "2019-08-11 10-AM", "2019-08-11 10-PM", 
"2019-08-11 11-AM", "2019-08-11 11-PM", "2019-08-11 12-AM", "2019-08-11 12-PM", 
"2019-08-12 01-AM", "2019-08-12 01-PM", "2019-08-12 02-AM", "2019-08-12 02-PM", 
"2019-08-12 03-AM", "2019-08-12 03-PM", "2019-08-12 04-AM", "2019-08-12 04-PM", 
"2019-08-12 05-AM", "2019-08-12 05-PM", "2019-08-12 06-AM", "2019-08-12 06-PM", 
"2019-08-12 07-AM", "2019-08-12 07-PM", "2019-08-12 08-AM", "2019-08-12 08-PM", 
"2019-08-12 09-AM", "2019-08-12 09-PM", "2019-08-12 10-AM", "2019-08-12 10-PM", 
"2019-08-12 11-AM", "2019-08-12 11-PM", "2019-08-12 12-AM", "2019-08-12 12-PM", 
"2019-08-13 01-AM", "2019-08-13 01-PM", "2019-08-13 02-AM", "2019-08-13 02-PM", 
"2019-08-13 03-AM", "2019-08-13 03-PM", "2019-08-13 04-AM", "2019-08-13 04-PM", 
"2019-08-13 05-AM", "2019-08-13 05-PM", "2019-08-13 06-AM", "2019-08-13 06-PM", 
"2019-08-13 07-AM", "2019-08-13 07-PM", "2019-08-13 08-AM", "2019-08-13 08-PM", 
"2019-08-13 09-AM", "2019-08-13 09-PM", "2019-08-13 10-AM", "2019-08-13 10-PM", 
"2019-08-13 11-AM", "2019-08-13 11-PM", "2019-08-13 12-AM", "2019-08-13 12-PM", 
"2019-08-14 12-AM"), class = "factor"), Symbol = structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "BTCUSDT", class = "factor"), 
    Open = c(10892.71, 10897.12, 10892.47, 10914.24, 10936.18, 
    10925.37, 10910, 10850, 10951.09, 11000.17, 11152.52, 11253.83, 
    11280.39, 11260.02, 11302.77, 11347.1, 11364.96, 11363.33, 
    11393, 11353.07, 11394.06, 11393.18, 11399.06, 11427.68, 
    11396.08, 11418.78, 11397.93, 11438.14, 11443.89, 11419, 
    11415.14, 11410.12, 11399.11, 11380.48, 11379.18, 11397.54, 
    11328.01, 11379.51, 11357.8, 11371.91, 11411.97, 11380.99, 
    11399, 11435.95, 11446.44, 11450, 11494.76, 11480.8, 11549.97, 
    11514.18, 11503.27, 11435, 11383.14, 11397.01, 11372.02, 
    11402.03, 11378, 11388.74, 11404.22, 11458.91, 11469.94, 
    11342.82, 11364, 11324.68, 11371.71, 11353.5, 11374.91, 11378.54, 
    11399.1, 11377.64, 11365.17, 11402.6, 11309.31, 11323.43, 
    11335.54, 11346.21, 11329.1, 11376.13, 11383.37, 11359.42, 
    11413.87, 11430.38, 11469.13, 11449.35, 11395.23, 11875.71, 
    11892.24, 11841.89, 11829.75, 11824.98, 11847.48, 11803.62, 
    11839.03, 11801.73, 11897.05, 11928.71, 11879.99, 11845, 
    11928.11, 11889.97, 11837.18, 11853.15, 11809.07, 11795.64, 
    11743.37, 11791.96), High = c(10897.48, 10942.67, 10938.99, 
    10924.67, 10981.69, 10962, 10958.87, 10935.93, 10993, 11024.57, 
    11152.52, 11269.62, 11298.93, 11309.98, 11310.73, 11349.28, 
    11364.96, 11388.81, 11393, 11394, 11402.39, 11403.15, 11429.96, 
    11456.16, 11435.54, 11426.83, 11450.86, 11452.75, 11474.03, 
    11449.15, 11429.77, 11441, 11441.03, 11400, 11402.83, 11412.51, 
    11425, 11394.23, 11399.8, 11391.35, 11415.01, 11440.03, 11414.42, 
    11453.2, 11471.29, 11465, 11499, 11539.3, 11577.89, 11599.98, 
    11533, 11566.37, 11450, 11409.32, 11417, 11420, 11445.57, 
    11418.3, 11427, 11482.65, 11479.97, 11482.86, 11400, 11425, 
    11392.64, 11376.99, 11395.23, 11391.99, 11424.94, 11421.05, 
    11399.99, 11411.84, 11465, 11368.19, 11336.24, 11390.33, 
    11397, 11379, 11405, 11431.65, 11422.91, 11443.43, 11469.13, 
    11517, 11472.22, 11879.78, 11904.36, 11897.65, 11884.94, 
    11856, 11879.26, 11890.2, 11850, 11839.04, 11906.55, 11940, 
    11985, 11901.36, 11965, 11933, 11900, 11910, 11859.97, 11823.26, 
    11880, 11793.31), Low = c(10807.94, 10880.01, 10863.64, 10858.04, 
    10913.74, 10905.2, 10882.13, 10788.45, 10850, 10921, 10911, 
    11150, 11250, 11240, 11200, 11255.01, 11326, 11311, 11361, 
    11346.7, 11350, 11370, 11383.3, 11399, 11355.99, 11380.39, 
    11375.28, 11395, 11382.56, 11395.57, 11385.48, 11390, 11375, 
    11340, 11361.42, 11357, 11235.32, 11305.55, 11350, 11343, 
    11313.14, 11343.13, 11340, 11371.15, 11420, 11410, 11420.06, 
    11463.74, 11450, 11441.01, 11480.02, 11402.57, 11301, 11369.89, 
    11360.69, 11360, 11362.77, 11350.01, 11372, 11366.81, 11406.53, 
    11342.82, 11330, 11112.11, 11286.46, 11343.48, 11330.11, 
    11354.14, 11373.01, 11375.12, 11365, 11361.43, 11286.3, 11307.92, 
    11270, 11275, 11316.96, 11270, 11325.91, 11272.1, 11330, 
    11381.82, 11368.88, 11412.48, 11310, 11350, 11851, 11839.95, 
    11827.28, 11809.52, 11801.9, 11794, 11795.47, 11770, 11751.61, 
    11869, 11878.07, 11842.87, 11832.31, 11865.01, 11825.6, 11808.41, 
    11801, 11746.75, 11700, 11720), Close = c(10819.24, 10892.71, 
    10897.12, 10892.47, 10914.24, 10936.18, 10925.37, 10910, 
    10850, 10951.09, 11000.17, 11152.52, 11253.83, 11280.39, 
    11260.02, 11302.77, 11347.1, 11364.96, 11363.33, 11393, 11353.07, 
    11394.06, 11393.18, 11399.06, 11427.68, 11396.08, 11418.78, 
    11397.93, 11438.14, 11443.89, 11419, 11415.14, 11410.12, 
    11399.11, 11380.48, 11379.18, 11397.54, 11328.01, 11379.51, 
    11357.8, 11371.91, 11411.97, 11380.99, 11399, 11435.95, 11446.44, 
    11450, 11494.76, 11480.8, 11549.97, 11514.18, 11503.27, 11435, 
    11383.14, 11397.01, 11372.02, 11402.03, 11378, 11388.74, 
    11404.22, 11458.91, 11469.94, 11342.82, 11364, 11324.68, 
    11371.71, 11353.5, 11374.91, 11378.54, 11399.1, 11377.64, 
    11365.17, 11402.6, 11309.31, 11323.43, 11335.54, 11346.21, 
    11329.1, 11376.13, 11383.37, 11359.42, 11413.87, 11430.38, 
    11469.13, 11449.35, 11395.23, 11875.71, 11892.24, 11841.89, 
    11829.75, 11824.98, 11847.48, 11803.62, 11839.03, 11801.73, 
    11897.05, 11928.71, 11879.99, 11845, 11928.11, 11889.97, 
    11837.18, 11853.15, 11809.07, 11795.64, 11743.37), Volume.BTC = c(1119.56, 
    579.5, 561.43, 850.09, 682.56, 617.53, 1105.72, 3020.41, 
    2357.78, 2000.39, 5458.14, 2918.86, 654.17, 896.48, 2089.63, 
    1631.4, 565.8, 1377.86, 381.81, 619.2, 570.32, 387.16, 336.93, 
    428.26, 610.35, 342.97, 542.96, 409.19, 813.26, 509.92, 458.3, 
    684.83, 963.02, 672.14, 521.18, 799.34, 1657.41, 896.51, 
    565.17, 546.71, 970.66, 635.42, 791.37, 664.24, 390.8, 654.81, 
    483.14, 469.44, 912.2, 1359.48, 811.04, 2042.8, 1133.41, 
    332.52, 555.95, 513.05, 973.45, 662.91, 655.41, 907.32, 966.88, 
    2136.92, 1358.66, 3653.8, 960.84, 905.05, 669.35, 625.36, 
    572.19, 446.91, 514.58, 747.89, 1357.45, 559.18, 931.69, 
    851.28, 729.3, 764.09, 884.2, 2064.46, 1316.81, 1697.72, 
    2195.76, 4019.31, 4265.79, 8173.95, 748.19, 740.37, 665.21, 
    599.46, 1253.97, 755.9, 526.47, 996.29, 1576.33, 1025.63, 
    1801.16, 691.4, 1428.81, 1437.94, 950.33, 2398.67, 1082.97, 
    994.33, 1750.67, 1014.27), Volume.USDT = c(12142045.76, 6322216.18, 
    6120312.86, 9255061.4, 7475780.27, 6752012.58, 12080784.85, 
    32806359.69, 25740945.71, 21954343.98, 60252668.31, 32718259.34, 
    7371427.35, 10114964.27, 23525692.98, 18433278.8, 6416881.08, 
    15645401.55, 4343721.6, 7037666.2, 6482416.59, 4405999.98, 
    3843857.8, 4891800.85, 6954726.39, 3909560.13, 6196097.79, 
    4672977.87, 9308300.35, 5825173.51, 5226352.78, 7818725.4, 
    10982543.81, 7647921.11, 5934175.73, 9096983.31, 18828481.01, 
    10168243.59, 6430176.56, 6216727, 11021222.23, 7232285.47, 
    9005976.08, 7576153.04, 4475591.03, 7494911.65, 5540002.69, 
    5398186.75, 10495847.17, 15674811.19, 9334072.12, 23483919.93, 
    12912854.58, 3786993.9, 6334674.14, 5845465.04, 11110376.11, 
    7543539.36, 7468688.36, 10358829.15, 11067786.55, 24363820.15, 
    15443493.02, 41088066.01, 10896026.28, 10280920.56, 7607472.65, 
    7112394.03, 6519276.04, 5093968.98, 5857007.17, 8516092.75, 
    15445410.51, 6342057.6, 10531093.76, 9640800.68, 8286164.41, 
    8664287.91, 10055172.12, 23445598.6, 14990727.87, 19389301.77, 
    25075139.28, 46112760.38, 48557111.06, 94104710.42, 8883384.19, 
    8787286.45, 7886632.06, 7090759.07, 14848256.74, 8954629.74, 
    6219185.44, 11761440.64, 18626258.93, 12202594.7, 21495300.48, 
    8207614.58, 17002179.53, 17113122.14, 11281462.88, 28477001.51, 
    12816752.5, 11713825.29, 20658712.36, 11921074)), .Names = c("Date", 
"Symbol", "Open", "High", "Low", "Close", "Volume.BTC", "Volume.USDT"
), class = "data.frame", row.names = c(NA, -106L))

And here is my own script. I use the TensorFlow backend:

reticulate::use_condaenv("r-tensorflow")
library(data.table)
library(keras)
library(ggplot2)
dt <- fread("Binance_BTCUSDT_1h.csv", skip = 1)
dt


dt[, Time := purrr::transpose(strsplit(Date, " "))[[2]]]
# dt[, pm_am := substring(Time, regexpr("[A-z]", Time))]
dt[, Time := gsub("-[A-z]*", "", Time)]
dt[, Time := as.numeric(Time)]


dt[grep("PM", Date), Time := Time + 12]
dt[, Time := paste0(Time, ":00", ":00")]

dt[grep("12-AM", Date), Time := "00:00:00"]

dt[, Date := purrr::transpose(strsplit(Date, " "))[[1]]]

dt[, Date := as.POSIXct(paste(Date, Time), 
                        format = "%Y-%m-%d %H:%M:%S", tz = "GMT")]

dt[, Time := NULL]


dt <- dt[Date >= as.POSIXct("2019-01-01", format = "%Y-%m-%d")]


dt <- dt[.N:1, ]
data <- dt[, .(Open, High, Low,  Close)]



p1 <- dt %>%
  ggplot(aes(Date, Open)) +
  geom_point(alpha = 0.5) +
  labs(title = "open")
p1

p2 <- dt %>%
  ggplot(aes(Date, Close)) +
  geom_point(alpha = 0.5) +
  labs(title = "close")
p2

p3 <- dt %>%
  ggplot(aes(Date, High)) +
  geom_point(alpha = 0.5) +
  labs(title = "MAX")
p3

p4 <- dt %>%
  ggplot(aes(Date, Low )) +
  geom_point(alpha = 0.5) +
  labs(title = "MIN")
p4


lookback <- 7 * 24
step <- 1
delay <- 24

batch_size <- 32


generator <- function(data, 
                      lookback,
                      delay,
                      min_index,
                      max_index,
                      batch_size = 32,
                      step = 1) {

  if (is.null(max_index)) max_index <- nrow(data) - delay - 1

  i <- min_index + lookback

  function() {

    if (i + batch_size >= max_index) i <<- min_index + lookback
    rows <- c(i:min(i + batch_size - 1, max_index))
    i <<- i + length(rows)

    samples <- array(0, dim = c(length(rows),
                                lookback / step,
                                dim(data)[[-1]]))
    # targets <- array(0, dim = c(length(rows), delay, dim(data)[[-1]]))

    # targets <- array(0, dim = c(length(rows), dim(data)[[-1]]))
    # For delay points
    targets <- array(0, dim = c(length(rows), dim(data)[[-1]] * delay))

    for (j in 1:length(rows)) {
      indices <- seq(rows[[j]] - lookback, rows[[j]] - 1,
                     length.out = dim(samples)[[2]])
      samples[j, , ] <- as.matrix(data[indices, ])
      # targets[j, , ] <- as.matrix(data[rows[[j]]:(rows[[j]] + delay - 1), ])
      # targets[j, ] <- as.matrix(data[rows[[j]] + delay - 1, ])

      targ <- as.matrix(data[rows[[j]]:(rows[[j]] + delay - 1), ])

      targets[j, ] <- matrix(targ)
    }
    list(samples, targets)
  }

}


train_gen <- generator(
  data,
  lookback = lookback,
  delay = delay,
  min_index = 1,
  max_index = 4200,
  step = step, 
  batch_size = batch_size
)

val_gen <- generator(
  data,
  lookback = lookback,
  delay = delay,
  min_index = 4201,
  max_index = 4800,
  step = step,
  batch_size = batch_size
)

test_gen <- generator(
  data,
  lookback = lookback,
  delay = delay,
  min_index = 4801,
  max_index = 5404,
  step = step,
  batch_size = 1
)

train_steps <- (4201 - lookback) / batch_size

val_steps <- (4800 - 4201 - lookback) / batch_size

test_steps <- (nrow(data) - 4801 - lookback) / batch_size


model <- keras_model_sequential() %>%
  layer_flatten(input_shape = c(lookback / step, dim(data)[-1])) %>%
  layer_dense(units = 16, activation = "relu") %>%
  layer_dense(units = 4 * delay)

model %>% compile(
  optimizer = "adam", #optimizer_rmsprop()
  loss = "mae"
)

history <- model %>% fit_generator(
  train_gen,
  steps_per_epoch = train_steps,
  epochs = 12,
  validation_data = val_gen,
  validation_steps = val_steps
)


# model <- keras_model_sequential() %>%
#     layer_lstm(units            = 50, 
#                input_shape      = c(lookback / step, dim(data)[-1]), 
#                batch_size       = batch_size,
#                return_sequences = TRUE, 
#                stateful         = TRUE) %>% 
#     layer_lstm(units            = 50, 
#                return_sequences = FALSE, 
#                stateful         = TRUE) %>% 
#     layer_dense(units = 4 * 1)
# 
# model %>% 
#     compile(loss = "mae", optimizer = "adam")
# 
# history <- model %>% fit_generator(
#   train_gen,
#   steps_per_epoch = as.integer(train_steps),
#   epochs = 5,
#   validation_data = val_gen,
#   validation_steps = as.integer(val_steps)
# )

plot(history)



preds <- predict_generator(model, 
                           test_gen, steps = 5404-4801)



preds_open <- preds[-1, 24]
preds_high <- preds[-1, 48]
preds_low <- preds[-1, 72]
preds_close <- preds[-1, 96]

res <- data.table("id" = rep(1:24, 4),
                  "preds" = preds[1, ], 
                  "var" = rep(c("Open", "High", "Low", "Close"), each = 24))
res <- dcast(res,  id ~ var, value.var = "preds")
res[, id := NULL]
res <- rbind(res, data.table(Open = preds_open, 
                             High = preds_high, 
                             Low = preds_low, 
                             Close = preds_close))
res <- res[1:604, ]
res[, Date := dt[4801:5404, Date]]

res <- rbind(dt[4801:5404, .(Date, Open, High, Low,  Close)], res)
res[, true_pred := rep(c("true", "pred"), each = 604)]
write.csv(res, "res.csv")

ggplot(res, aes(Date, Open)) +
  geom_point(aes(colour = true_pred)) +
  labs(title = "OPEN")

After running the script I see large inaccuracies. For example, consider the plot for the Open price (actual vs. predicted values).

There is a big difference between the actual values and the predicted ones. In some places the difference is only about 46 (very good) and in others about 2000 (not good), because BTC is very volatile.

So the question is whether there are ways to "make" the neural network configure its training parameters on its own (i.e., self-tune) until the error becomes minimal. For example, iterate until the maximum error for any date in the training sample is no more than 100, i.e. the difference between the actual and predicted value never exceeds 100.

Or what else should I do to improve accuracy? Maybe add a covariate, but which one?
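
To illustrate the kind of mechanism I am asking about: Keras has callbacks that adjust training hyperparameters automatically, e.g. ReduceLROnPlateau lowers the learning rate when the validation loss stops improving and EarlyStopping stops training once it no longer improves. A minimal sketch in Python Keras (the R interface has the analogous callback_reduce_lr_on_plateau() and callback_early_stopping(); the tiny dense model and random data here are placeholders, not my model above):

import numpy as np
from tensorflow import keras

# placeholder data standing in for the (samples, lookback, 4) windows above
x = np.random.rand(500, 168, 4)
y = np.random.rand(500, 96)

model = keras.Sequential([
    keras.layers.Flatten(input_shape=(168, 4)),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(96),
])
model.compile(optimizer="adam", loss="mae")

callbacks = [
    # halve the learning rate whenever val_loss has not improved for 3 epochs
    keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=3),
    # stop entirely once val_loss has not improved for 10 epochs
    keras.callbacks.EarlyStopping(monitor="val_loss", patience=10,
                                  restore_best_weights=True),
]
model.fit(x, y, validation_split=0.2, epochs=30, callbacks=callbacks)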

Neural Networks – Place categorization and semantic mapping on a mobile robot

I am currently working on a project where I have to create a semantic map of the environment through which my robot moves. An RGB-D camera is mounted on the robot, so as it moves through the environment it perceives a continuous stream of images, each fed through a CNN trained to identify various place categories such as kitchen, elevator, corridor, door, etc. The resulting location classification is then passed through a Bayesian filter to build an occupancy grid map of the environment labeled with the different place categories (colors corresponding to different locations), as described in this paper. An example is shown below:

(example image: occupancy grid map with colored location labels)

Now here is the problem: look at point X. Although the robot is in the corridor at this moment, the camera already sees into an office (orange), and therefore part of the corridor gets labeled as office.

I am looking for a way to work around this and segment the map as accurately as possible in real time while the robot moves through the environment. This could also be done iteratively, e.g. refining the result when the robot passes point X again at some point in the future.

P.S. The robot is also equipped with two laser rangefinders used to build the occupancy map. Maybe I can somehow combine the laser range point cloud with the depth point cloud of the RGB-D camera.
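
To make the filtering step concrete, here is a rough sketch of the kind of per-cell recursive Bayesian update I mean (Python; the category list and the crude sensor model are made up for illustration, not taken from the paper):

import numpy as np

# hypothetical place categories; the real set comes from the CNN's classes
CATEGORIES = ["corridor", "office", "kitchen", "elevator", "door"]

def update_cell(belief, cnn_probs, reliability=0.8):
    # belief      : current probability over CATEGORIES for this cell
    # cnn_probs   : softmax output of the CNN for the current observation
    # reliability : assumed probability that the observation is informative;
    #               the rest is spread uniformly (a very crude sensor model)
    likelihood = reliability * cnn_probs + (1 - reliability) / len(CATEGORIES)
    posterior = likelihood * belief
    return posterior / posterior.sum()

belief = np.full(len(CATEGORIES), 1.0 / len(CATEGORIES))  # uninformative prior
belief = update_cell(belief, np.array([0.15, 0.60, 0.10, 0.10, 0.05]))
print(dict(zip(CATEGORIES, belief.round(3))))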

Mathematical Optimization – Custom SGD Optimizer in the Mathematica Framework for Neural Networks?

I have a new approach to the SGD optimizer and want to test it in Mathematica. It uses the gradients to simultaneously maintain an online parabola model of the loss for smarter step-size selection – I just need to be able to query the gradients and update the parameters manually.

Is this feasible within Mathematica's neural network framework? I see NetPortGradient, but how do I change the parameters?

Perhaps there is a simple example somewhere, since this seems like a common research direction?

AI – Neural Network Library for Java

I've written a neural network library for Java that seems to work fine on simpler problems (for example, it reaches 94.94% accuracy on the XOR problem), but on harder problems I only get an accuracy of 0.07%. I suspect there is a hidden problem in the implementation itself.

Repository: https://github.com/anirudhgiri/TinyNN4J

Example of using the library:

// (number of inputs, number of nodes in the hidden layer, number of outputs)
NeuralNetwork n = new NeuralNetwork(2, 3, 1);

float[] trainingInputs = {0.0f, 1.0f};
float[] trainingOutputs = {1.0f};

// trains the neural network with the given inputs and expected outputs
n.train(trainingInputs, trainingOutputs);

float[] testData = {1.0f, 0.0f};
float[] outputs = n.predict(testData);

Machine Learning – Is this the right way to use a neural network for name generation?

I decided to write a neural network from scratch and want it to generate names. This is the first time I have learned anything related to machine learning.

Before you generate anything, some assumptions are required:

  • The names can be up to 12 characters long
  • All names are given in lowercase letters
  • Each character belongs to the English alphabet

Here is the plan:

       Outputs random          A trained
         characters          neural network
       -------------         --------------
       | generator | ------> | classifier | ------> valid name?  yes / no
       -------------         --------------                           |
             ^                                                      no |
             +---------------------------------------------------------+

The neural network itself looks something like this:
(image of the network architecture)

After a string is entered (suitably encoded), a single value is output – the network's confidence that the given string is a valid name.
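
By "suitably encoded" I mean something like the following sketch (an assumption for illustration, not necessarily how the network in the picture does it): 12 character positions, 26 letters each, one-hot per position, with unused trailing positions left as zeros.

import numpy as np

MAX_LEN = 12
ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def encode_name(name):
    # turn a lowercase name into a fixed-length 12 * 26 input vector
    vec = np.zeros(MAX_LEN * len(ALPHABET))
    for pos, ch in enumerate(name[:MAX_LEN]):
        vec[pos * len(ALPHABET) + ALPHABET.index(ch)] = 1.0
    return vec

x = encode_name("anna")   # shape (312,), fed to the classifier network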

This is the page from which I got my idea: http://www.cs.bham.ac.uk/~jxb/INC/nn.html.

Now: is my plan sensible? Are there better ways to generate plausible names? Please tell me if I missed something.

Neural Networks – Implement early stopping after a certain number of epochs

How can I make Keras' EarlyStopping() callback take effect only after a certain number of epochs have passed?

My model's accuracy and loss fluctuate a lot in the early epochs, so if early stopping is active from the beginning, training is terminated far too soon. How can I apply early stopping only after a certain number of epochs (e.g. 20), once my model has stabilized?
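
For illustration, a minimal sketch of one way this might be done (assuming TensorFlow's bundled Keras): subclass EarlyStopping so it simply ignores the first epochs and only starts monitoring afterwards.

from tensorflow import keras

class DelayedEarlyStopping(keras.callbacks.EarlyStopping):
    # EarlyStopping that only becomes active after `start_epoch` epochs
    def __init__(self, start_epoch=20, **kwargs):
        super().__init__(**kwargs)
        self.start_epoch = start_epoch

    def on_epoch_end(self, epoch, logs=None):
        # skip the noisy early epochs entirely
        if epoch >= self.start_epoch:
            super().on_epoch_end(epoch, logs)

# hypothetical usage:
# model.fit(x, y, epochs=200,
#           callbacks=[DelayedEarlyStopping(start_epoch=20,
#                                           monitor="val_loss", patience=5)])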

Machine Learning – How to create and train a neural network to predict a stock close at T + 5min using OHLC and other data

I am trying to create and train a neural network to test the predictability of short-term stock price movements.

I've put together a 1-minute open, high, low, close, and volume dataset for a particular stock. The idea is to train a network to analyze the data for times T-1, T-2, T-3, T-4, T-5 and predict the closing price at T+5. Ideally, the network receives 7 input vectors

{{open}, {high}, {low}, {close}, {volume}, {dayofweek}, {minutes_since_open}}

for the last 5 minutes, i.e. 7 inputs × 5 time steps, to produce a single output: {close} at T+5.
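
To make the shape of the problem concrete, this is how I picture the training pairs being assembled (a sketch in Python/NumPy rather than Mathematica; the random array just stands in for the real per-minute rows):

import numpy as np

# made-up data: one row per minute, columns =
# open, high, low, close, volume, day_of_week, minutes_since_open
data = np.random.rand(1000, 7)

LOOKBACK = 5   # use minutes T-5 .. T-1
HORIZON = 5    # predict the close at T+5

X, y = [], []
for t in range(LOOKBACK, len(data) - HORIZON):
    X.append(data[t - LOOKBACK:t])   # shape (5, 7): 7 inputs x 5 time steps
    y.append(data[t + HORIZON, 3])   # column 3 = close at T+5
X, y = np.array(X), np.array(y)
print(X.shape, y.shape)              # (990, 5, 7) (990,)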

I'm still a bit rusty and my Mathematica skills have lapsed since I last used it at V7, above all with the new features 🙂

I would really appreciate any help …

Here is a link to the dataset (CSV) 🙂