Studyify

Technical Lexicon

The comprehensive A-Z guide to Large Language Model architectures, optimization strategies, and transformer mathematics.


A

Attention Mechanism

NLP | Trending

A mechanism that lets a neural network weight specific parts of the input sequence more heavily when generating each output, enabling it to capture long-range dependencies.
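
As a rough illustration of the weighting idea, here is a minimal pure-Python sketch of scaled dot-product attention for a single query. The function name `attend` and the toy key/value vectors are made up for this example; real models operate on batched tensors with learned projections.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    d = len(query)
    # Similarity of the query to every key, scaled by sqrt(d).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    # The output is the attention-weighted average of the value vectors.
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output

# Toy data (illustrative only).
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
weights, output = attend([1.0, 0.0], keys, values)
```

The query aligns with the first and third keys, so those positions receive larger weights than the second.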


Activation Function

Deep Learning

A function applied to a neuron's weighted input that determines its output, introducing non-linearity into the network.


Autoencoder

Deep Learning

A neural network that learns to compress input data into a latent representation and then reconstruct it.


B

Backpropagation

Deep Learning

The algorithm used to train neural networks by computing the gradient of the loss with respect to each weight via the chain rule, propagating errors backward from the output layer to the input.
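
To make the chain rule concrete, the sketch below works through a hypothetical two-weight network, y = w2 * tanh(w1 * x), with a squared-error loss, and compares the hand-derived gradients against finite differences. The network, the values, and all names are assumptions chosen for illustration.

```python
import math

def forward(w1, w2, x):
    h = math.tanh(w1 * x)   # hidden activation
    y = w2 * h              # network output
    return h, y

def loss(w1, w2, x, target):
    _, y = forward(w1, w2, x)
    return 0.5 * (y - target) ** 2

def backward(w1, w2, x, target):
    """Gradients of the loss w.r.t. w1 and w2 via the chain rule."""
    h, y = forward(w1, w2, x)
    dL_dy = y - target                 # derivative of 0.5 * (y - t)^2 w.r.t. y
    dL_dw2 = dL_dy * h                 # because y = w2 * h
    dL_dh = dL_dy * w2                 # error passed back to the hidden unit
    dL_dw1 = dL_dh * (1 - h ** 2) * x  # tanh'(z) = 1 - tanh(z)^2, with z = w1 * x
    return dL_dw1, dL_dw2

w1, w2, x, target = 0.5, -1.5, 2.0, 1.0
g1, g2 = backward(w1, w2, x, target)

# Finite-difference check of the analytic gradients.
eps = 1e-6
n1 = (loss(w1 + eps, w2, x, target) - loss(w1 - eps, w2, x, target)) / (2 * eps)
n2 = (loss(w1, w2 + eps, x, target) - loss(w1, w2 - eps, x, target)) / (2 * eps)
```

Note that the gradient for `w1` reuses `dL_dy` and `w2`: the per-weight gradients share intermediate terms rather than being computed independently.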


Batch Normalization

Deep Learning

A technique to standardize inputs to a layer for each mini-batch, stabilizing the learning process.
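
A minimal sketch of the standardization step for a single feature, assuming the usual learnable scale `gamma` and shift `beta` (not mentioned in the definition above) and a small `eps` for numerical safety:

```python
import math

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Standardize one feature across a mini-batch, then scale and shift."""
    n = len(batch)
    mean = sum(batch) / n
    var = sum((x - mean) ** 2 for x in batch) / n   # biased (population) variance
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]

normalized = batch_norm([2.0, 4.0, 6.0, 8.0])
```

With the default `gamma` and `beta`, the result has approximately zero mean and unit variance. Framework implementations also track running statistics for use at inference time, which this sketch omits.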


BERT (Bidirectional Encoder Representations from Transformers)

NLP | Popular

A transformer-based encoder model that learns the context of each token from both its left and right neighbors.


Bias (Inductive)

Math

Assumptions built into a model to help it learn effectively (e.g., CNNs assume spatial locality).


C

Chain of Thought (CoT)

NLP | Trending

A prompting technique enabling LLMs to decompose complex problems into intermediate reasoning steps.


Convolutional Neural Network (CNN)

Deep Learning

A network architecture specialized for processing grid-like data such as images.


Cross-Entropy Loss

Math

A loss function typically used in classification tasks, measuring the difference between true and predicted distributions.
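
For reference, cross-entropy between a true distribution p and a predicted distribution q is H(p, q) = -sum_i p_i log q_i. The sketch below computes it for a one-hot label; the example distributions and the small `eps` guard against log(0) are assumptions for illustration.

```python
import math

def cross_entropy(true_dist, pred_dist, eps=1e-12):
    """H(p, q) = -sum_i p_i * log(q_i), measured in nats."""
    return -sum(p * math.log(q + eps) for p, q in zip(true_dist, pred_dist))

label = [0.0, 1.0, 0.0]            # one-hot true class
confident = [0.05, 0.90, 0.05]     # most probability on the correct class
unsure = [0.40, 0.30, 0.30]        # most probability on a wrong class

loss_confident = cross_entropy(label, confident)
loss_unsure = cross_entropy(label, unsure)
```

The loss is lower for the prediction that places more probability on the correct class.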


D

Data Augmentation

General

Increasing the diversity of training data by modifying existing samples.


Decoder

NLP

The component of a Transformer that generates the output sequence.


Diffusion Model

Deep Learning | Trending

Generative models that create data by reversing a noise addition process.


Dropout

Deep Learning

A regularization technique that randomly zeroes a fraction of neuron activations during training to reduce overfitting.
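
One common variant is "inverted" dropout, which rescales the surviving activations during training so that nothing needs to change at inference time. The sketch below assumes that variant; the function signature is hypothetical.

```python
import random

def dropout(activations, p=0.5, training=True, rng=random):
    """Inverted dropout: zero each activation with probability p, rescale the rest."""
    if not training or p == 0.0:
        return list(activations)
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
acts = [1.0] * 10
dropped = dropout(acts, p=0.5, rng=rng)           # entries are either 0.0 or 2.0
unchanged = dropout(acts, p=0.5, training=False)  # inference: no-op
```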


E

Embedding

NLP | Popular

A continuous vector representation of discrete items (e.g., words or tokens) or other inputs such as images, where semantic similarity translates to geometric proximity.
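
Geometric proximity is often measured with cosine similarity. The sketch below uses hand-written 3-dimensional vectors that are invented purely for illustration; real embeddings are learned and typically have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings, not taken from any real model.
embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}

sim_cat_dog = cosine_similarity(embeddings["cat"], embeddings["dog"])
sim_cat_car = cosine_similarity(embeddings["cat"], embeddings["car"])
```

In this toy setup, "cat" sits closer to "dog" than to "car".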


Encoder

NLP

The part of a model that processes input data into a context vector or embedding.


Epoch

General

One full cycle through the entire training dataset.


F

Few-Shot Learning

NLP | Trending

Providing a model with a small number of examples (shots) to guide its performance on a new task.


Fine-Tuning

Deep Learning | Popular

Taking a pre-trained model and training it further on a specific dataset.


G

Gradient Descent

Math

An optimization algorithm used to minimize the loss function by iteratively moving in the direction of steepest descent.
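
A minimal sketch of the update rule x <- x - learning_rate * grad(x), applied to a one-dimensional quadratic. The objective, the step count, and the function name are assumptions chosen to keep the example small.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, the direction of steepest descent."""
    x = x0
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x

# Minimize f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
minimum = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
```

Here the iterate converges toward x = 3. With a learning rate that is too large for this objective (above 1.0), the same loop would diverge instead.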


GAN (Generative Adversarial Network)

Deep Learning | Popular

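A framework in which a generator tries to produce realistic data while a discriminator tries to distinguish real samples from generated ones; the two networks are trained adversarially.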


GPT (Generative Pre-trained Transformer)

NLP | Trending

A series of decoder-only transformer models developed by OpenAI.


H

Hallucination

NLP | Popular

Outputs generated by an AI model that sound confident but are factually incorrect or unsupported.


Hyperparameter

General

Settings chosen before training (e.g., learning rate) rather than learned from the data.


I

Inference

General

Using a trained model to make predictions.


K

Knowledge Distillation

Deep Learning | Trending

Transferring knowledge from a large 'teacher' model to a smaller 'student' model.


L

LoRA (Low-Rank Adaptation)

Deep Learning | Trending

A parameter-efficient fine-tuning technique that freezes pre-trained weights and injects trainable low-rank decomposition matrices into the model's layers.
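
The idea can be sketched as an effective weight W + scale * (B @ A), where W stays frozen and only the small matrices B and A are trained. The pure-Python code below is a toy illustration; the dimensions, the `scale` factor, and the helper names are assumptions, and B is zero-initialized so the adapted model starts out identical to the frozen one.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_weight(w_frozen, b, a, scale=1.0):
    """Effective weight W + scale * (B @ A); only B and A would be trained."""
    delta = matmul(b, a)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(w_frozen, delta)]

d_out, d_in, rank = 4, 4, 1
w_frozen = [[1.0 if i == j else 0.0 for j in range(d_in)] for i in range(d_out)]
b = [[0.0] for _ in range(d_out)]      # d_out x rank, zero-initialized
a = [[0.1, 0.2, 0.3, 0.4]]             # rank x d_in

w_eff = lora_weight(w_frozen, b, a)
full_params = d_out * d_in             # weights trained if W were updated directly
lora_params = rank * (d_out + d_in)    # weights trained in B and A
```

The parameter saving grows with layer size: the trainable count scales with `rank * (d_out + d_in)` rather than `d_out * d_in`.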


Latent Space

Math

A compressed, abstract representation of data within a model.


Learning Rate

Math

The step size that scales each parameter update made by the optimization algorithm.


Logits

Math

The raw, unnormalized predictions generated by the last layer of a neural network before Softmax.


LSTM (Long Short-Term Memory)

Deep Learning

A type of RNN that uses gating to learn long-term dependencies, mitigating (though not fully eliminating) the vanishing gradient problem.


M

Model Collapse

General | Trending

Degradation of generative models when they are trained recursively on AI-generated data.


N

NLP (Natural Language Processing)

NLP

The field of AI concerned with enabling computers to process, understand, and generate human language.


O

Objective Function

Math

The function the model aims to maximize or minimize; when it is minimized, it is usually called a loss function.


One-Hot Encoding

Math

Representing categorical variables as binary vectors with a single 1 at the index of the category and 0 everywhere else.
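
A short sketch of the encoding, using a hypothetical list of color categories:

```python
def one_hot(category, categories):
    """Binary vector with a 1 at the index of `category` and 0 elsewhere."""
    return [1 if c == category else 0 for c in categories]

colors = ["red", "green", "blue"]
encoded = one_hot("green", colors)   # [0, 1, 0]
```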


Overfitting

General

When a model fits the training data too closely, including its noise, and fails to generalize to unseen data.


P

Parameter

Math

Internal variables (weights/biases) learned by the model.


Perceptron

Deep Learning

The simplest feedforward neural network: a single-layer linear binary classifier.


Prompt Engineering

NLP | Trending

The art of crafting inputs (prompts) to guide Large Language Models to produce desired outputs.


Q

Quantization

Deep Learning | Trending

The process of reducing the precision of a model's weights (e.g., from 32-bit float to 8-bit integer) to reduce memory usage and increase speed.
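
One simple scheme is symmetric per-tensor quantization, where a single scale factor maps floats to signed 8-bit integers. The sketch below assumes that scheme; production libraries use more elaborate variants (per-channel scales, zero points, calibration), and the function names here are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization of floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Map the integers back to approximate float values."""
    return [v * scale for v in q]

weights = [0.51, -1.27, 0.003, 0.98]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

The round-trip error per weight is bounded by about half the scale, which is the precision given up in exchange for the smaller representation.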


R

Reinforcement Learning from Human Feedback (RLHF)

Reinforcement Learning | Trending

A method to align language models with human values by fine-tuning them using a reward model trained on human preferences.


RAG (Retrieval-Augmented Generation)

NLP | Trending

A technique that retrieves relevant external knowledge and feeds it to an LLM to generate more accurate and up-to-date responses.


ResNet (Residual Network)

Deep Learning

A CNN architecture using 'skip connections' that allow gradients to flow easily during training, enabling extremely deep networks.


Regularization

Math

A set of techniques used to prevent overfitting by penalizing complex models.


RNN (Recurrent Neural Network)

Deep Learning

A network that processes sequential data by maintaining a hidden state that is updated at each time step.


S

Softmax

Math

A function that converts a vector of raw scores (logits) into a probability distribution summing to 1.
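
Concretely, softmax(z)_i = exp(z_i) / sum_j exp(z_j). A minimal sketch follows; the max-subtraction trick is a common numerical-stability measure and does not change the result.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)                       # shift for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
```

Larger logits receive larger probabilities, and the ordering of the inputs is preserved.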


Scaling Laws

General | Trending

The empirical observation that model performance improves predictably, roughly as a power law, with model size, dataset size, and compute.


Self-Attention

NLP | Popular

A mechanism that relates different positions of a single sequence to one another in order to compute a representation of that sequence.


Sigmoid

Math

An activation function that maps any real-valued input to the range (0, 1).
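
The function is sigmoid(x) = 1 / (1 + e^(-x)). The sketch below splits on the sign of x, a common way to avoid overflow in `exp` for large negative inputs.

```python
import math

def sigmoid(x):
    """Logistic function 1 / (1 + e^(-x)), written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

outputs = [sigmoid(x) for x in (-10.0, 0.0, 10.0)]
```

An input of 0 maps to exactly 0.5, while large negative and positive inputs approach 0 and 1 respectively.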


Supervised Learning

General

Training on labelled data.


T

Transformer

NLP | Popular

A deep learning architecture based entirely on attention mechanisms, dispensing with recurrence and convolutions.


Temperature

NLP | Popular

A hyperparameter controlling randomness in LLM generation: higher values produce more varied output, lower values produce more deterministic output.
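
In practice, temperature is usually applied by dividing the logits by T before the softmax. The sketch below assumes that formulation; the logits and the temperature values are arbitrary examples.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Divide logits by the temperature, then apply a numerically stable softmax."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
sharp = softmax_with_temperature(logits, temperature=0.2)   # mass concentrates on the top token
flat = softmax_with_temperature(logits, temperature=2.0)    # closer to uniform
```

Sampling from `sharp` almost always picks the highest-scoring token, whereas sampling from `flat` picks lower-scoring tokens more often.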


Tensor

Math

A multi-dimensional array, the fundamental data structure in ML frameworks like PyTorch/TensorFlow.


Token

NLP

The basic unit of text an LLM processes; one token is roughly 0.75 English words on average.


Tokenization

NLP

Splitting text into tokens (words, subwords, or characters) before it is fed to a model.


Transfer Learning

Deep Learning | Popular

Applying knowledge from one task to a related one.


U

Underfitting

General

When a model is too simple to capture the underlying patterns in the data, so it performs poorly even on the training set.


Unsupervised Learning

General

Finding patterns in unlabeled data.


V

Validation Set

General

Data used to tune hyperparameters and monitor generalization, kept separate from the training and test sets.


Vanishing Gradient

Math

Gradients becoming too small to train deep networks efficiently.


Vector Database

General | Trending

A database optimized for storing high-dimensional embeddings and retrieving them by similarity search.


W

Weights

Math

Learned parameters that set the strength of the connections between neurons.


Z

Zero-Shot Learning

Deep Learning | Popular

Performing a task without any task-specific training examples.
