I Made a Neural Network Library in NodeJS to Learn ML Fundamentals


The best way to learn is to start from the fundamentals. This hands-on approach allowed me to grasp essential concepts, understand the inner workings of neural networks, and appreciate the complexity and beauty of machine learning. In this journey, I not only built a functional library but also gained invaluable knowledge and experience.

To learn machine learning fundamentals, I created my own neural network library “mikir” from scratch, and the project is available on GitHub!

So, let’s get started with a walkthrough of how I made it.

The neuron concept

The first thing that comes to mind with neural networks and machine learning is the neuron. It’s inspired by biological neurons in the brain. In biology, neurons receive signals, process them, and transmit the result to other neurons.

Neuron activation

Components of an artificial neuron

  • Inputs: these are the incoming signals, often outputs from other neurons or initial data.
  • Weights: each input has an associated weight, indicating its importance.
  • Bias: an additional parameter that allows the neuron to shift its activation function.
  • Summation Function: Combines all weighted inputs and the bias.
  • Activation Function: Determines the neuron’s output based on the summation.

How a neuron works

  • It receives inputs
  • Each input is multiplied by its corresponding weight
  • All weighted inputs are summed together, and the bias is added
  • This sum is passed through an activation function to produce the output

Understanding weights and biases

These two are confusing concepts if you don’t have a solid example. So let’s break them down:

Weights determine the strength of the connection between neurons. They multiply the input values, increasing or decreasing their importance. During training, weights are adjusted to make the network’s predictions more accurate. Think of weights like the importance you give to different factors when making a decision.

Bias allows the neuron to shift its activation function left or right. It is added to the sum of weighted inputs before the activation function is applied. Like weights, biases are adjusted during training. Think of bias as a threshold that the weighted sum needs to overcome to “activate” the neuron.

Let’s use a simple example to illustrate:

Imagine a neuron that decides whether to recommend an ice cream flavour based on two inputs: temperature (in Celsius) and your hunger level (on a scale of 0-10).

class IceCreamNeuron {
    constructor() {
        this.weightTemperature = 0.7;  // Temperature is quite important
        this.weightHunger = 0.3;       // Hunger is less important
        this.bias = -15;               // You need some convincing to eat ice cream
    }

    recommend(temperature, hungerLevel) {
        const weightedSum = (temperature * this.weightTemperature) + 
                            (hungerLevel * this.weightHunger);
        const activation = weightedSum + this.bias;
        
        // Simple step function instead of sigmoid for this example
        return activation > 0 ? "Yes, eat ice cream!" : "No, skip the ice cream.";
    }
}

const iceCreamDecider = new IceCreamNeuron();

console.log(iceCreamDecider.recommend(30, 8));  // "Yes, eat ice cream!"
console.log(iceCreamDecider.recommend(15, 2));  // "No, skip the ice cream."

In this example:

  1. Weights: temperature has a higher weight than hunger, meaning it influences the decision more.
  2. Bias: the negative bias means that the weighted sum needs to exceed 15 for the neuron to recommend ice cream. This could represent a general reluctance to eat ice cream unless conditions are really favorable.
  3. Decision making:
  • On a hot day (30°C) when you’re quite hungry (8/10), the neuron recommends ice cream: (30 * 0.7) + (8 * 0.3) - 15 = 8.4
  • On a cool day (15°C) when you’re not very hungry (2/10), it doesn’t: (15 * 0.7) + (2 * 0.3) - 15 = -3.9

In a neural network, these weights and biases are adjusted during training to make better predictions. For instance, if this neuron consistently recommended ice cream when it shouldn’t, the training process might decrease the weights or lower the bias to make it “more conservative”.
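
To get a feel for what such an adjustment looks like, here is a rough sketch that nudges the same IceCreamNeuron by hand; the new values are purely illustrative, not learned by any training procedure:

// Illustrative only: manually nudging the parameters in the direction
// training might push them if the neuron recommended ice cream too often.
const conservativeDecider = new IceCreamNeuron();
conservativeDecider.weightTemperature = 0.5;  // temperature now matters a bit less
conservativeDecider.bias = -18;               // a stricter threshold to overcome

console.log(conservativeDecider.recommend(30, 8));  // (30 * 0.5) + (8 * 0.3) - 18 = -0.6 -> "No, skip the ice cream."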

Activation function

The activation function introduces non-linearity, allowing the neural network to learn complex patterns. Common activation functions include:

  • Sigmoid: maps input to a value between 0 and 1. Useful for predicting probabilities.
  • ReLU (Rectified Linear Unit): returns 0 for negative inputs, and the input value for positive input.
  • Tanh: similar to sigmoid, but maps to values between -1 and 1.
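
Here is a minimal sketch of these three functions; the Neuron class implemented later in this post uses only sigmoid, so the other two are just for illustration:

// Minimal sketches of the three activation functions listed above
const sigmoid = (x: number): number => 1 / (1 + Math.exp(-x));  // squashes any input to (0, 1)
const relu = (x: number): number => Math.max(0, x);             // 0 for negatives, identity for positives
const tanh = (x: number): number => Math.tanh(x);               // squashes any input to (-1, 1)

console.log(sigmoid(0));  // 0.5
console.log(relu(-3));    // 0
console.log(tanh(1));     // ~0.76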

What does it mean to be non-linear

non-linear vs linear

  • A linear function is a straight line when graphed. It has the form y = mx + b.
  • A non-linear function produces a curve or any shape that’s not a straight line when graphed.

Let’s illustrate this with some examples:

Linear example: Temperature conversion from Fahrenheit to Celsius: C = (F - 32) * 5/9

This is linear because it’s a straight line when graphed: every 9°F increase always results in a 5°C increase.

Non-linear example: Compound interest: A = P(1 + r)^t, where A is the final amount, P is the principal, r is the interest rate, and t is time.

This is non-linear because it curves upward when graphed: the growth accelerates over time rather than staying constant.
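
To make the difference concrete, here is a quick numeric sketch of both formulas:

// Linear: equal steps in Fahrenheit always produce equal steps in Celsius
const toCelsius = (f: number): number => (f - 32) * 5 / 9;
console.log(toCelsius(41) - toCelsius(32));  // 5
console.log(toCelsius(50) - toCelsius(41));  // 5 (the change is constant)

// Non-linear: compound interest grows by a larger amount each year
const compound = (p: number, r: number, t: number): number => p * Math.pow(1 + r, t);
console.log(compound(100, 0.1, 1) - compound(100, 0.1, 0));  // ~10
console.log(compound(100, 0.1, 2) - compound(100, 0.1, 1));  // ~11 (the change keeps growing)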

For instance, in image recognition, the relationship between pixel values and the object in the image is highly non-linear. Non-linear activation functions enable neural networks to capture these complex patterns.

This broader predictive capability is why neural networks with non-linear activation functions are so powerful. They can approximate a wide variety of functions, allowing them to model complex relationships in data that linear models simply cannot capture.

Implement a basic neuron

Now, let’s implement a basic neuron class to illustrate these concepts:

class Neuron {
    public weights: number[]
    public bias: number

    constructor(inputCount: number) {
        // Initialize each weight and the bias with a random value between 0 and 1
        this.weights = new Array(inputCount).fill(0).map(() => Math.random());
        this.bias = Math.random();
    }

    activate(inputs: number[]): number {
        // Summation: multiply each input by its weight, add them up, then add the bias
        const sum = inputs.reduce((sum, input, index) => sum + input * this.weights[index], 0);
        return this.sigmoid(sum + this.bias);
    }

    sigmoid(x: number): number {
        // Activation: squash the sum into a value between 0 and 1
        return 1 / (1 + Math.exp(-x));
    }
}

export default Neuron;

Let’s break down this implementation

  1. Constructor:
  • We declare the class properties: weights as an array of numbers and bias as a number.
  • Weights and the bias are initialized with random values between 0 and 1 using Math.random.
  • The number of weights matches the number of inputs.
  2. Activate method:
  • This is the summation function.
  • The sum variable multiplies each input by its corresponding weight and sums the results.
  • The bias is then added to the sum.
  3. Sigmoid method:
  • This is the activation function.
  • It takes the sum and maps it to a value between 0 and 1 using the sigmoid formula.
  • The sigmoid function is useful because:
  1. It introduces non-linearity, allowing the network to learn complex patterns
  2. Its output can be interpreted as a probability
  3. It has a smooth gradient, which is beneficial for learning

By the end of the process, the goal is to feed the neuron either raw input or another neuron’s output, process it, and produce an output that is either fed to other neurons or used to make a decision (a classification) between 0 and 1.
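
With that in place, a quick sanity check of the class could look like this (the exact output will vary because the weights start out random):

import Neuron from './neuron.js';

// A neuron with 3 inputs; weights and bias are random, so the output differs on every run
const neuron = new Neuron(3);
const output = neuron.activate([0.5, 0.1, 0.9]);

console.log(output);  // some value between 0 and 1, e.g. 0.73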

Neural network components

  • Input layer: receives initial data
  • Hidden layer(s): processes information
  • Output layer: produces final results
  • Learning: adjusting weights and biases to improve predictions, often using backpropagation and gradient descent

Every layer has a bunch of neurons. Each neuron (or node) receives inputs, applies weights and a bias, and produces an output.

Neurons are connected between layers. Each connection has an associated weight.
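
As a small sketch of how the layer sizes determine the number of connections, here is the parameter count for a 2-3-1 network (the same shape used in the usage example later):

// For a 2-3-1 network: each hidden neuron holds 2 weights + 1 bias,
// and each output neuron holds 3 weights + 1 bias
const inputCount = 2, hiddenCount = 3, outputCount = 1;

const hiddenParams = hiddenCount * (inputCount + 1);   // 3 * 3 = 9
const outputParams = outputCount * (hiddenCount + 1);  // 1 * 4 = 4

console.log(hiddenParams + outputParams);  // 13 trainable parameters in total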

Forward propagation vs Backpropagation

Forward Propagation:

Purpose: To generate predictions or outputs from input data.
Direction: Moves from the input layer through hidden layers to the output layer.
Process:

  • Takes input data and feeds it through the network.
  • At each neuron, computes weighted sum of inputs and applies activation function.
  • Passes results to the next layer until reaching the output.

When used: During both training and inference (prediction).
In the code: Implemented in the predict method.

Backpropagation:

Purpose: To train the network by adjusting weights and biases.
Direction: Moves backwards from the output layer through hidden layers to the input layer.
Process:

  • Calculates the error at the output layer.
  • Propagates this error backwards through the network.
  • Computes gradients to determine how each weight contributed to the error.
  • Updates weights and biases to minimize error.

When used: Only during training, not during inference.
In the code: Implemented in the train method.

Key Differences:

  1. Purpose: Forward propagation makes predictions, backpropagation improves those predictions.
  2. Direction: Forward propagation moves forward through the network, backpropagation moves backward.
  3. Calculation: Forward propagation calculates outputs, backpropagation calculates gradients and errors.
  4. Timing: Forward propagation happens first, followed by backpropagation during training.

You’ll see forward propagation in both the predict and train methods, and backpropagation in the train method right after the forward pass. See the implementation section later.

Error Calculation

Error calculation is the process of determining how far off the network’s predictions are from the actual target values.

const outputErrors = outputs.map((output, i) => {
    const error = targets[i] - output
    return error * output * (1 - output)
})

This code calculates the error for each output neuron:

targets[i] - output is the raw error (the difference between target and prediction), and error * output * (1 - output) is the error gradient, using the derivative of the sigmoid function.
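
Here is a tiny worked example with made-up numbers, just to show the two steps:

// Suppose a neuron outputs 0.8 while the target is 1.0 (illustrative numbers)
const target = 1.0;
const output = 0.8;

const rawError = target - output;                   // 0.2
const gradient = rawError * output * (1 - output);  // 0.2 * 0.8 * 0.2 = 0.032

console.log(gradient);  // ~0.032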

Weight Updates

Weight updates are how the network “learns” from its errors. The general formula is: new_weight = old_weight + learning_rate * error * input

neuron.weights = neuron.weights.map((weight, j) => {
    return weight + learningRate * outputErrors[i] * hiddenOutputs[j]
})

This updates each weight based on:

  1. The current weight value
  2. The learning rate
  3. The calculated error
  4. The input that contributed to this error (hiddenOutputs[j] for output layer, inputs[j] for hidden layer)
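
Plugging illustrative numbers into the formula above shows how small each step is:

// new_weight = old_weight + learning_rate * error * input (illustrative numbers)
const oldWeight = 0.5;
const learningRate = 0.1;
const error = 0.032;  // the gradient from the previous section
const input = 0.6;    // the hidden output that fed into this weight

const newWeight = oldWeight + learningRate * error * input;
console.log(newWeight);  // ~0.50192, a small nudge in the direction that reduces the error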

Learning Rate

The learning rate (often denoted as η or alpha) determines the step size at each iteration while moving toward a minimum of the loss function.

Key points about learning rate:

  • A high learning rate can cause the model to converge quickly but may overshoot the optimal solution.
  • A low learning rate leads to slower learning but can find more precise solutions.
  • It often requires tuning to find the optimal value for a specific problem.
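
A toy illustration of that trade-off, minimizing f(x) = x^2 with plain gradient descent (this is just a sketch, not part of mikir):

// Gradient descent on f(x) = x^2, whose gradient is 2x; the minimum is at x = 0
function descend(learningRate: number, steps: number): number {
    let x = 5;  // starting point
    for (let i = 0; i < steps; i++) {
        x -= learningRate * 2 * x;  // step against the gradient
    }
    return x;
}

console.log(descend(0.01, 20));  // ~3.34: small steps, still far from 0 after 20 iterations
console.log(descend(0.4, 20));   // ~0: larger steps converge quickly here
console.log(descend(1.1, 20));   // ~192: too large, each step overshoots and the value diverges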

Epochs

These pieces work together in the learning process:

  1. Forward propagation produces an output.
  2. Error calculation determines how far off this output is from the target.
  3. Backpropagation distributes this error back through the network.
  4. Weight updates adjust the network parameters to reduce this error.
  5. The learning rate controls how big these adjustments are.

This process repeats over the training data many times (each full pass over the examples is called an epoch), gradually improving the network’s performance.

Implement the neural network

import Neuron from './neuron.js'

class NeuralNetwork {
    private hiddenLayer: Neuron[]
    private outputLayer: Neuron[]

    constructor(inputCount: number, hiddenCount: number, outputCount: number) {
        this.hiddenLayer = new Array(hiddenCount).fill(null).map(() => new Neuron(inputCount))
        this.outputLayer = new Array(outputCount).fill(null).map(() => new Neuron(hiddenCount))
    }

    predict(inputs: number[]) {
        const hiddenOutputs = this.hiddenLayer.map(neuron => neuron.activate(inputs))
        return this.outputLayer.map(neuron => neuron.activate(hiddenOutputs))
    }

    train(inputs: number[], targets: number[], learningRate: number = 0.1) {
        // Forward pass
        const hiddenOutputs = this.hiddenLayer.map(neuron => neuron.activate(inputs))
        const outputs = this.outputLayer.map(neuron => neuron.activate(hiddenOutputs))
        // Calculate output layer errors (raw error scaled by the sigmoid derivative)
        const outputErrors = outputs.map((output, i) => {
            const error = targets[i] - output
            return error * output * (1 - output)
        })
        // Calculate hidden layer errors before updating the output weights,
        // so the error is propagated back through the weights used in the forward pass
        const hiddenErrors = this.hiddenLayer.map((_, i) => {
            return this.outputLayer.reduce((sum, outputNeuron, j) => {
                return sum + outputErrors[j] * outputNeuron.weights[i]
            }, 0) * hiddenOutputs[i] * (1 - hiddenOutputs[i])
        })
        // Update output layer weights and biases
        this.outputLayer.forEach((neuron, i) => {
            neuron.weights = neuron.weights.map((weight, j) => {
                return weight + learningRate * outputErrors[i] * hiddenOutputs[j]
            })
            neuron.bias += learningRate * outputErrors[i]
        })
        // Update hidden layer weights and biases
        this.hiddenLayer.forEach((neuron, i) => {
            neuron.weights = neuron.weights.map((weight, j) => {
                return weight + learningRate * hiddenErrors[i] * inputs[j]
            })
            neuron.bias += learningRate * hiddenErrors[i]
        })
    }
}

export default NeuralNetwork

Let’s break down this implementation

  1. Constructor
  • Takes inputCount, hiddenCount and outputCount; these parameters determine how many neurons the hidden and output layers contain.
  • Creates arrays of Neuron objects for the hidden and output layers.
  2. Predict method
  • Performs forward propagation through the network.
  • Activates each neuron in the hidden layer, then uses those outputs to activate the output layer.
  3. Train method
  • Implements the backpropagation algorithm for training the network:
  1. Forward pass (same as predict)
  2. Calculate output layer errors
  3. Calculate hidden layer errors
  4. Update output layer weights and biases
  5. Update hidden layer weights and biases
  4. Error calculation
  • Uses the derivative of the sigmoid function (output * (1 - output)) in the error calculations.
  5. Weight updates
  • Uses a simple gradient descent approach with a learning rate.

To use this NeuralNetwork class, you would typically:

Create an instance of the network:

const nn = new NeuralNetwork(2, 3, 1);  // 2 inputs, 3 hidden neurons, 1 output

Train the network on a dataset:

for (let i = 0; i < 10000; i++) {
  nn.train([0, 1], [1]);  // Example: training for XOR function
  nn.train([1, 0], [1]);
  nn.train([0, 0], [0]);
  nn.train([1, 1], [0]);
}

Use the trained network for predictions:

console.log(nn.predict([0, 1]));  // Should be close to [1]

This implementation is a good starting point for understanding neural networks. Some potential improvements or extensions could include:

  • Adding more hidden layers for deep learning.
  • Implementing different activation functions (like ReLU).
  • Adding regularization to prevent overfitting.
  • Implementing batch training for better efficiency.
  • Adding methods to save and load trained networks.

Summary

This implementation provides a solid foundation for understanding the basics of neural networks and neurons. It demonstrates key concepts like weights, biases, activation functions, forward propagation, backpropagation, and gradient descent in a simple feedforward network with one hidden layer.

As you continue your journey in machine learning and neural networks, exploring these enhancements will deepen your understanding and allow you to tackle more challenging problems. Remember, the field of neural networks is vast and constantly evolving, so there’s always more to learn and explore beyond this starting point.

I’m excited to keep extending the library and deepening my understanding of machine learning and neural networks, but this is a good point to wrap up and let our own neurons grasp the concepts and rest.

Hopefully helpful, Farrel Nikoson 😁.

Save for Yourself, Share with Others.