LLM Lab
Search
Data Collection
Architecture
Training
Inference
Evals & Testing
AI/ML Quiz
Focus Play
Playground
AI/ML Dictionary
ZeroCasting
Data Collection & Tokenization
Ingesting raw text and converting to tokens.
Common Crawl
Wikipedia Dump
GitHub Code
Interactive panels: Incoming Stream, Tokenized IDs
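The ingest-and-tokenize step described above can be sketched roughly as follows. This is a toy word-level tokenizer, not necessarily the scheme the lab itself uses (production LLMs typically rely on subword methods such as BPE). The sample strings stand in for the three listed sources, and the names `SOURCES`, `split_tokens`, `build_vocab`, and `encode` are hypothetical, introduced only for illustration.

```python
import re
from typing import Dict, Iterable, List

# Hypothetical stand-ins for the three sources named above; a real pipeline
# would stream far larger dumps from disk or the network.
SOURCES: Dict[str, str] = {
    "common_crawl": "The quick brown fox.",
    "wikipedia": "The fox is a mammal.",
    "github_code": "def fox(): return 1",
}

# A word is a run of word characters; any other non-space character
# (punctuation, brackets) becomes its own token.
TOKEN_PATTERN = re.compile(r"\w+|[^\w\s]")


def split_tokens(text: str) -> List[str]:
    """Lowercase the text and split it into word and punctuation tokens."""
    return TOKEN_PATTERN.findall(text.lower())


def build_vocab(texts: Iterable[str]) -> Dict[str, int]:
    """Assign each distinct token an integer ID in order of first appearance."""
    vocab: Dict[str, int] = {"<unk>": 0}  # ID 0 is reserved for unseen tokens
    for text in texts:
        for token in split_tokens(text):
            if token not in vocab:
                vocab[token] = len(vocab)
    return vocab


def encode(text: str, vocab: Dict[str, int]) -> List[int]:
    """Map incoming text to token IDs, falling back to <unk> for new tokens."""
    unk_id = vocab["<unk>"]
    return [vocab.get(token, unk_id) for token in split_tokens(text)]


vocab = build_vocab(SOURCES.values())
ids = encode("The fox is quick.", vocab)
print(ids)  # [1, 4, 6, 2, 5]
```

The vocabulary here is built from the same text it later encodes, which keeps the sketch short. In practice the vocabulary is trained once on a large corpus and then frozen, so any token it has not seen maps to `<unk>` (ID 0 in this sketch).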
Next: Architecture