Spanish Mix (Scale.ai Leaderboard)

A mix that picks the highest-ranked model for Spanish prompts. Rankings are based on Scale's Multilingual Prompts Dataset, which measures how well a model engages with Spanish-speaking users from Spain, Mexico, and the rest of Latin America, and is meant to reflect the complexity of global communication.

Updated Aug 13 · $6.12/M input tokens · $18.36/M output tokens · GitHub
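
As a rough illustration of the per-million-token prices listed above, the sketch below estimates the cost of a single request. The function name `estimate_cost_usd` and the token counts are hypothetical, and the calculation assumes input and output tokens are billed linearly at the listed rates, which the page does not confirm.

```python
# Prices copied from the listing above (USD per one million tokens).
INPUT_USD_PER_M = 6.12
OUTPUT_USD_PER_M = 18.36


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate request cost, assuming simple linear per-token billing."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 2,000 prompt tokens and 500 completion tokens.
    print(f"${estimate_cost_usd(2_000, 500):.5f}")
```

Under that assumption, output tokens cost three times as much as input tokens, so long completions dominate the bill.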

Models

This mix uses the models below:

gpt-4o

Provided by multiple sources

gemini-1.5-pro

Provided by multiple sources

gpt-4-turbo-preview

Provided by multiple sources

Readme

This mix is based on the top models ranked for Spanish by Scale.ai's SEAL leaderboard. Each model's weight in the mix is a function of its Elo score on that leaderboard.

The evaluation process involves assessing model responses across three main dimensions: honesty (understanding and consistency), accuracy (correctness of claims), and helpfulness (instruction following and writing quality). Models are paired against each other and evaluated on these criteria, with a focus on instruction following abilities.

The top-performing models on the Spanish leaderboard are GPT-4o in first place, Gemini 1.5 Pro (May 2024) in second, and GPT-4 Turbo Preview in third.

Categories

  • 💬 General
  • 🗣️ Multilingual

Composition

This mix produces responses from the following models:

Model Name            Weight %
gpt-4o                22.51%
gemini-1.5-pro        21.54%
gpt-4-turbo-preview   23.79%
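
One way to read the weights above is as relative probabilities for choosing which model answers a given prompt. The sketch below illustrates that reading only; the page does not describe the actual routing logic, and the names `MIX_WEIGHTS`, `normalized_weights`, and `pick_model` are hypothetical. The listed percentages do not sum to 100, so the sketch treats them as relative weights and normalizes them.

```python
import random

# Weights copied from the composition table (percent).
# Assumption: they are relative weights, since they do not sum to 100.
MIX_WEIGHTS = {
    "gpt-4o": 22.51,
    "gemini-1.5-pro": 21.54,
    "gpt-4-turbo-preview": 23.79,
}


def normalized_weights(weights: dict[str, float]) -> dict[str, float]:
    """Rescale the weights so they sum to 1."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def pick_model(weights: dict[str, float], rng=random) -> str:
    """Draw one model name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


if __name__ == "__main__":
    for name, share in normalized_weights(MIX_WEIGHTS).items():
        print(f"{name}: {share:.1%}")
    print("selected:", pick_model(MIX_WEIGHTS))
```

`random.choices` accepts unnormalized weights, so the normalization step is only there to make the effective shares easy to inspect.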

Last updated: est. May 2024
Source: https://scale.com/leaderboard/spanish