
AI/ML Dictionary

The comprehensive A-Z guide to Artificial Intelligence concepts, formulas, and terminology. Demystifying the jargon one term at a time.

Topics: NLP · Deep Learning · Math · General · Reinforcement Learning

A

Attention Mechanism

NLP · Trending

A mechanism that allows neural networks to focus on specific parts of the input sequence when generating output, enabling long-range dependencies.

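A minimal NumPy sketch of the scaled dot-product attention used in Transformers; the shapes and random inputs here are purely illustrative.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)   # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the keys
    return weights @ V                # attention-weighted sum of the values

# 3 query positions attending over 4 key/value positions, dimension 8
Q = np.random.randn(3, 8)
K = np.random.randn(4, 8)
V = np.random.randn(4, 8)
out = scaled_dot_product_attention(Q, K, V)  # shape (3, 8)
```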

Activation Function

Deep Learning

A mathematical gate that decides a neuron's output, introducing non-linearity to the network.

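For illustration, ReLU, one of the most common activation functions; without such non-linearities, stacked linear layers would collapse into a single linear map.

```python
import numpy as np

def relu(x):
    # Zeroes out negative inputs, passes positive ones through unchanged
    return np.maximum(0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5])))  # [0.  0.  0.  1.5]
```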

Autoencoder

Deep Learning

A neural network that learns to compress input data into a latent representation and then reconstruct it.

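A minimal PyTorch sketch; the layer sizes (784-dimensional input, 32-dimensional latent code) are illustrative, not canonical.

```python
import torch.nn as nn

autoencoder = nn.Sequential(
    nn.Linear(784, 32), nn.ReLU(),    # encoder: input -> latent code
    nn.Linear(32, 784), nn.Sigmoid()  # decoder: latent code -> reconstruction
)
# Trained by minimizing reconstruction error,
# e.g. nn.MSELoss()(autoencoder(x), x)
```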

B

Backpropagation

Deep Learning

The algorithm used to train neural networks by computing the gradient of the loss with respect to each weight via the chain rule.
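
A worked one-neuron example of the chain rule that backpropagation applies layer by layer:

```python
# Forward pass: y = w * x, loss = (y - t)^2
x, w, t = 2.0, 0.5, 3.0
y = w * x                    # y = 1.0
loss = (y - t) ** 2          # loss = 4.0

# Backward pass: chain rule dL/dw = dL/dy * dy/dw
dloss_dy = 2 * (y - t)       # -4.0
dy_dw = x                    #  2.0
dloss_dw = dloss_dy * dy_dw  # -8.0, the gradient used to update w
```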

Batch Normalization

Deep Learning

A technique to standardize inputs to a layer for each mini-batch, stabilizing the learning process.
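
A minimal sketch of the normalization step; the learnable scale and shift parameters (gamma, beta) of a full BatchNorm layer are omitted for brevity.

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    # x: (batch_size, features); standardize each feature over the batch
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    return (x - mu) / np.sqrt(var + eps)  # eps avoids division by zero
```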

BERT (Bidirectional Encoder Representations from Transformers)

NLP · Popular

A transformer-based model focusing on understanding context from both left and right directions.


Bias (Inductive)

Math

Assumptions built into a model to help it learn effectively (e.g., CNNs assume spatial locality).


C

Chain of Thought (CoT)

NLP · Trending

A prompting technique enabling LLMs to decompose complex problems into intermediate reasoning steps.


Convolutional Neural Network (CNN)

Deep Learning

A network architecture specialized for processing grid-like data such as images.


Cross-Entropy Loss

Math

A loss function typically used in classification tasks, measuring the difference between true and predicted distributions.
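
A minimal NumPy sketch of cross-entropy for a single example with a one-hot true label:

```python
import numpy as np

def cross_entropy(p_true, p_pred, eps=1e-12):
    # H(p, q) = -sum_i p_i * log(q_i); eps guards against log(0)
    return -np.sum(p_true * np.log(p_pred + eps))

# True class is index 1; the model assigns it probability 0.8
print(cross_entropy(np.array([0, 1, 0]), np.array([0.1, 0.8, 0.1])))  # ~0.223
```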

D

Data Augmentation

General

Increasing the diversity of training data by modifying existing samples.


Decoder

NLP

The component of a Transformer that generates the output sequence.

Diffusion Model

Deep Learning · Trending

Generative models that create data by reversing a noise addition process.


Dropout

Deep Learning

Regularization technique dropping random neurons during training.

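A minimal sketch of "inverted" dropout, the variant used by most frameworks; scaling the survivors by 1/(1-p) keeps the expected activation unchanged, so no rescaling is needed at inference time.

```python
import numpy as np

def dropout(x, p=0.5, training=True):
    if not training:
        return x                           # inference: identity
    mask = np.random.rand(*x.shape) >= p   # keep each unit with prob 1 - p
    return x * mask / (1 - p)
```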

E

Embedding

NLP · Popular

A continuous vector representation of discrete variables (words, images) where semantic similarity translates to geometric proximity.

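A toy illustration of "semantic similarity as geometric proximity"; these 4-dimensional vectors are made up, and real embeddings have hundreds or thousands of dimensions.

```python
import numpy as np

king = np.array([0.9, 0.8, 0.1, 0.0])
queen = np.array([0.85, 0.9, 0.1, 0.05])
banana = np.array([0.0, 0.1, 0.9, 0.8])

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine(king, queen))   # ~0.99: semantically close
print(cosine(king, banana))  # ~0.12: semantically distant
```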

Encoder

NLP

The part of a model that processes input data into a context vector or embedding.

Epoch

General

One full cycle through the entire training dataset.


F

Few-Shot Learning

NLP · Trending

Providing a model with a small number of examples (shots) to guide its performance on a new task.


Fine-Tuning

Deep Learning · Popular

Taking a pre-trained model and training it further on a specific dataset.


G

Gradient Descent

Math

An optimization algorithm used to minimize the loss function by iteratively moving in the direction of steepest descent.

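A minimal sketch minimizing f(w) = (w - 3)^2, whose gradient is f'(w) = 2(w - 3):

```python
w, lr = 0.0, 0.1            # initial weight and learning rate
for _ in range(100):
    grad = 2 * (w - 3)      # gradient of the loss at the current w
    w -= lr * grad          # step against the gradient
print(w)                    # ~3.0, the minimum
```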

GAN (Generative Adversarial Network)

Deep Learning · Popular

Framework where a Generator and Discriminator compete to create realistic data.

GPT (Generative Pre-trained Transformer)

NLP · Trending

A series of decoder-only transformer models developed by OpenAI.

H

Hallucination

NLP · Popular

Confident but incorrect outputs generated by an AI.

Hyperparameter

General

Parameters set before training (e.g., learning rate) rather than learned.

I

Inference

General

Using a trained model to make predictions.

K

Knowledge Distillation

Deep Learning · Trending

Transferring knowledge from a large 'teacher' model to a smaller 'student' model.

L

LoRA (Low-Rank Adaptation)

Deep Learning · Trending

A parameter-efficient fine-tuning technique that freezes pre-trained weights and injects trainable rank decomposition matrices.
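
A minimal NumPy sketch of the idea: the frozen weight W is adapted as W + BA, where only the low-rank factors A and B are trained. The dimensions are illustrative; initializing B to zero (as in the LoRA paper) leaves the model's behavior unchanged at the start of fine-tuning.

```python
import numpy as np

d_out, d_in, r = 512, 512, 8            # r << d_in, d_out
W = np.random.randn(d_out, d_in)        # pre-trained weight, frozen
A = np.random.randn(r, d_in) * 0.01     # trainable
B = np.zeros((d_out, r))                # trainable, zero-initialized

def adapted_forward(x):
    return W @ x + B @ (A @ x)          # same output shape as W @ x
```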

Latent Space

Math

A compressed, abstract representation of data within a model.

Learning Rate

Math

Step size for the optimization algorithm.

Logits

Math

The raw, unnormalized predictions generated by the last layer of a neural network before Softmax.

LSTM (Long Short-Term Memory)

Deep Learning

A type of RNN capable of learning long-term dependencies, mitigating the vanishing gradient problem.

M

Model Collapse

General · Trending

Degradation of generative models trained recursively on AI-generated data.

N

NLP (Natural Language Processing)

NLP

The branch of AI focused on understanding and generating human language.

O

Objective Function

Math

The function the model aims to maximize or minimize (also Loss Function).

One-Hot Encoding

Math

Representing categorical variables as binary vectors.
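
For example, encoding three categories:

```python
import numpy as np

categories = ["cat", "dog", "bird"]
# Each category becomes a binary vector with a single 1 at its own index
one_hot = {c: np.eye(len(categories))[i] for i, c in enumerate(categories)}
print(one_hot["dog"])  # [0. 1. 0.]
```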

Overfitting

General

Memorizing training data instead of generalizing.

P

Parameter

Math

Internal variables (weights/biases) learned by the model.

Perceptron

Deep Learning

The simplest type of feedforward neural network classifier.

Prompt Engineering

NLP · Trending

The art of crafting inputs (prompts) to guide Large Language Models to produce desired outputs.


Q

Quantization

Deep Learning · Trending

The process of reducing the precision of a model's weights (e.g., from 32-bit float to 8-bit integer) to reduce memory usage and increase speed.

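A minimal sketch of symmetric post-training quantization from float32 to int8; real schemes add per-channel scales, zero-points, and calibration.

```python
import numpy as np

w = np.random.randn(1000).astype(np.float32)   # stand-in for model weights
scale = np.abs(w).max() / 127                  # map the largest weight to +/-127
w_int8 = np.round(w / scale).astype(np.int8)   # 4x smaller than float32
w_dequant = w_int8.astype(np.float32) * scale  # approximate reconstruction
print(np.abs(w - w_dequant).max())             # small quantization error
```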

R

Reinforcement Learning from Human Feedback (RLHF)

Reinforcement Learning · Trending

A method to align language models with human values by fine-tuning them using a reward model trained on human preferences.

RAG (Retrieval-Augmented Generation)

NLP · Trending

A technique that retrieves relevant external knowledge and feeds it to an LLM to generate more accurate and up-to-date responses.

ResNet (Residual Network)

Deep Learning

A CNN architecture using 'skip connections' that allow gradients to flow easily during training, enabling extremely deep networks.


Regularization

Math

A set of techniques used to prevent overfitting by penalizing complex models.


RNN (Recurrent Neural Network)

Deep Learning

Network for sequential data processing.

S

Softmax

Math

A function that converts a vector of raw scores (logits) into a probability distribution summing to 1.
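
A minimal NumPy implementation; subtracting the maximum logit is a standard numerical-stability trick that does not change the result.

```python
import numpy as np

def softmax(logits):
    exps = np.exp(logits - np.max(logits))
    return exps / exps.sum()

print(softmax(np.array([2.0, 1.0, 0.1])))  # [0.659 0.242 0.099], sums to 1
```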

Scaling Laws

General · Trending

Empirical observation that model performance improves predictably with model size, data size, and compute.

Self-Attention

NLP · Popular

Mechanism relating different positions of a single sequence to compute a representation of the sequence.

Sigmoid

Math

Activation function mapping any real-valued input to the range (0, 1).

Supervised Learning

General

Training on labelled data.

T

Transformer

NLP · Popular

A deep learning architecture based entirely on attention mechanisms, dispensing with recurrence and convolutions.

Temperature

NLP · Popular

Hyperparameter controlling randomness in LLM generation (High = creative, Low = deterministic).
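
A minimal sketch of how temperature reshapes the output distribution by dividing the logits before softmax:

```python
import numpy as np

def softmax_with_temperature(logits, T):
    z = logits / T                 # T > 1 flattens, T < 1 sharpens
    exps = np.exp(z - z.max())
    return exps / exps.sum()

logits = np.array([2.0, 1.0, 0.1])
print(softmax_with_temperature(logits, 0.5))  # peaked: near-deterministic
print(softmax_with_temperature(logits, 2.0))  # flat: more random sampling
```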

Tensor

Math

A multi-dimensional array, the fundamental data structure in ML frameworks like PyTorch/TensorFlow.

Token

NLP

The basic unit of text for an LLM (roughly 0.75 words).

Tokenization

NLP

Splitting text into tokens.

Transfer Learning

Deep Learning · Popular

Applying knowledge from one task to a related one.

U

Underfitting

General

Model is too simple to learn the data.

Unsupervised Learning

General

Finding patterns in unlabeled data.

V

Validation Set

General

Data used to tune hyperparameters, separate from the training and test sets.

Vanishing Gradient

Math

Gradients becoming too small to train deep networks efficiently.

Vector Database

General · Trending

A database optimized for storing high-dimensional embeddings and searching them by similarity.

W

Weights

Math

The strength of connections between neurons.

Z

Zero-Shot Learning

Deep Learning · Popular

Performing tasks without specific training examples.