We train language models that specialize in evaluating other language models, and we optimize evaluation pipelines!
Below are our key projects, with links to their repositories and related publications:
Repository | Description | Paper |
---|---|---|
prometheus-eval | A repository for evaluating LLMs in generation tasks. Supports Prometheus 2, GPT-4, and others. | Link |
prometheus | An open-source evaluator LM that offers reproducible evaluation and is inexpensive to use. | Link |
prometheus-vision | An open-source evaluator VLM that offers reproducible evaluation and is inexpensive to use. | Link |