LLM Lab
The Gym (Training)

Optimization via Gradient Descent.

Epoch: 1

Loss Function

Accuracy: 0.0%
Learning Rate: 3e-4
Batch Size: 512
Optimizer: AdamW
GPU Utilization
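
To make the settings above concrete, here is a minimal sketch of one epoch of mini-batch gradient descent with an AdamW update, using the panel's learning rate (3e-4) and batch size (512). Everything else is an assumption made for illustration: the toy target y = 2x, the single scalar weight, the mean-squared-error loss, the dataset size, and the AdamW defaults (beta1=0.9, beta2=0.999, eps=1e-8, weight_decay=0.01). The names `adamw_step` and `train` are hypothetical and do not come from the page.

```python
import math
import random


def adamw_step(w, g, m, v, t, lr=3e-4, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=0.01):
    """One AdamW update for a single scalar parameter (illustrative only)."""
    # Exponential moving averages of the gradient and the squared gradient.
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g * g
    # Bias correction for the zero-initialised averages (t starts at 1).
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Decoupled weight decay, applied directly to the weight.
    w = w - lr * weight_decay * w
    # Adaptive gradient step.
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v


def train(epochs=1, batch_size=512, lr=3e-4, n_samples=4096, seed=0):
    """Fit y = w * x on assumed toy data; returns the weight and batch losses."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]
    ys = [2.0 * x for x in xs]  # assumed toy target, not from the page

    w, m, v, t = 0.0, 0.0, 0.0, 0
    losses = []
    for _ in range(epochs):
        for start in range(0, n_samples, batch_size):
            xb = xs[start:start + batch_size]
            yb = ys[start:start + batch_size]
            n = len(xb)
            # Mean squared error on the batch and its gradient w.r.t. w.
            loss = sum((w * x - y) ** 2 for x, y in zip(xb, yb)) / n
            grad = sum(2.0 * (w * x - y) * x for x, y in zip(xb, yb)) / n
            t += 1
            w, m, v = adamw_step(w, grad, m, v, t, lr=lr)
            losses.append(loss)
    return w, losses


if __name__ == "__main__":
    w, losses = train()
    print(f"steps: {len(losses)}, weight after one epoch: {w:.6f}")
```

With a learning rate this small, each early AdamW step moves the weight by roughly the learning rate itself, so a single epoch of eight batches barely changes the loss. That is consistent with a dashboard that still shows near-initial metrics at epoch 1.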