Papers & Reports — LLM Observatory

Our Layer 1 benchmark paper introduces the observational distribution evaluation framework.

Explore the full codebase used to run our benchmark evaluations.

Go to GitHub

Browse benchmark datasets, predictions, and evaluation artifacts on Hugging Face.

Visit Hugging Face