Neural Networks: The Basics
Introduction
Neural networks are computational models inspired by biological neurons. They form the foundation of modern deep learning.
Core Concepts
The Perceptron
The simplest unit:
- Takes weighted inputs
- Applies activation function
- Produces output
Mathematical representation:
y = f(w₁x₁ + w₂x₂ + ... + wₙxₙ + b)
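To make the formula concrete, here is a minimal NumPy sketch of a single perceptron; the step activation and the hand-picked AND weights are illustrative choices, not part of the definition:

```python
import numpy as np

def perceptron(x, w, b, f=lambda z: np.where(z > 0, 1.0, 0.0)):
    """Weighted sum of inputs plus bias, passed through activation f."""
    return f(np.dot(w, x) + b)

# Example: weights chosen by hand so the unit computes logical AND
x = np.array([1.0, 1.0])
w = np.array([0.5, 0.5])
b = -0.7
print(perceptron(x, w, b))  # 1.0
```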
Activation Functions
Sigmoid: σ(x) = 1/(1 + e⁻ˣ)
- Output: (0, 1)
- Use: Binary classification
ReLU: f(x) = max(0, x)
- Most popular choice for hidden layers
- Mitigates the vanishing-gradient problem (though units can "die" on persistently negative inputs)
Tanh: tanh(x) = (eˣ - e⁻ˣ)/(eˣ + e⁻ˣ)
- Output: (-1, 1)
- Zero-centered
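A small NumPy sketch of the three activations above, handy for verifying their output ranges (the helper names are ours):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))  # output in (0, 1)

def relu(x):
    return np.maximum(0.0, x)        # output in [0, inf)

def tanh(x):
    return np.tanh(x)                # output in (-1, 1)

z = np.linspace(-3, 3, 7)
print(sigmoid(z), relu(z), tanh(z), sep="\n")
```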
Forward Propagation
- Input layer receives data
- Hidden layers process information
- Output layer produces the prediction (see the sketch after this list)
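A minimal sketch of those three steps, assuming an arbitrary 3-4-1 architecture with ReLU in the hidden layer and sigmoid at the output; the layer sizes and random weights are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 3 inputs, 4 hidden units, 1 output
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x):
    h = np.maximum(0.0, W1 @ x + b1)  # hidden layer: ReLU
    return sigmoid(W2 @ h + b2)       # output layer: sigmoid

print(forward(np.array([0.5, -1.0, 2.0])))
```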
Backpropagation
The algorithm that makes learning possible:
- Calculate loss
- Compute gradients
- Update weights
- Repeat
The chain rule is key:
∂L/∂w = ∂L/∂y × ∂y/∂z × ∂z/∂w
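The sketch below applies exactly that chain rule to a single sigmoid unit with a squared-error loss; the inputs, target, and learning rate are made up for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One sigmoid unit trained on a single example (values are made up)
x, target = np.array([0.5, -1.0]), 1.0
w, b, lr = np.array([0.1, 0.2]), 0.0, 0.5

for step in range(100):
    z = w @ x + b             # pre-activation
    y = sigmoid(z)            # prediction
    loss = (y - target) ** 2  # squared-error loss

    # Chain rule: dL/dw = dL/dy * dy/dz * dz/dw
    dL_dy = 2.0 * (y - target)
    dy_dz = y * (1.0 - y)     # derivative of sigmoid
    dz_dw = x
    grad_w = dL_dy * dy_dz * dz_dw
    grad_b = dL_dy * dy_dz

    w -= lr * grad_w          # gradient-descent update
    b -= lr * grad_b

print(f"loss after training: {loss:.4f}")
```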
Loss Functions
Mean Squared Error (Regression):
MSE = (1/n)Σ(yᵢ - ŷᵢ)²
Cross-Entropy (Classification):
H(p,q) = -Σ p(x)log(q(x))
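Both losses translate directly into NumPy; the `eps` term in the cross-entropy is a common numerical-stability guard we have added, not part of the formula:

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error for regression."""
    return np.mean((y_true - y_pred) ** 2)

def cross_entropy(p, q, eps=1e-12):
    """Cross-entropy between true distribution p and predicted q."""
    return -np.sum(p * np.log(q + eps))

print(mse(np.array([1.0, 2.0]), np.array([1.5, 1.5])))               # 0.25
print(cross_entropy(np.array([0, 1, 0]), np.array([0.1, 0.8, 0.1]))) # ~0.223
```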
Optimization
Gradient Descent
w = w - α∇L(w)
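A toy sketch of the update rule, minimizing the one-dimensional function L(w) = (w - 3)², whose gradient 2(w - 3) we supply by hand:

```python
import numpy as np

def gradient_descent(grad, w0, lr=0.1, steps=100):
    """Repeatedly step against the gradient: w <- w - lr * grad(w)."""
    w = np.asarray(w0, dtype=float)
    for _ in range(steps):
        w = w - lr * grad(w)
    return w

# Minimize L(w) = (w - 3)^2, whose gradient is 2(w - 3)
print(gradient_descent(lambda w: 2 * (w - 3), w0=0.0))  # converges to ~3.0
```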
Adam Optimizer
Adam combines momentum (a running average of gradients) with per-parameter adaptive learning rates (scaling by a running average of squared gradients).
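A simplified sketch of the Adam update, following the standard formulation with bias-corrected moment estimates; the toy objective and hyperparameter values are illustrative:

```python
import numpy as np

def adam(grad, w0, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8, steps=2000):
    """Adam: momentum (m) plus adaptive per-parameter scaling (v)."""
    w = np.asarray(w0, dtype=float)
    m = np.zeros_like(w)  # first moment: moving average of gradients
    v = np.zeros_like(w)  # second moment: moving average of squared gradients
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias correction
        v_hat = v / (1 - beta2 ** t)
        w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w

# Same toy objective as above: minimize (w - 3)^2
print(adam(lambda w: 2 * (w - 3), w0=np.array([0.0])))  # converges toward 3.0
```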
Key Takeaways
- Neural networks learn by adjusting weights
- Activation functions introduce non-linearity
- Backpropagation enables gradient-based learning
- Choice of architecture depends on the task