Polish Beats English in AI Prompts — New Study Reveals Why

A joint University of Maryland and Microsoft study finds Polish the most effective of 26 languages for AI prompting, with English in sixth place. The results challenge assumptions about language, training data, and model accuracy.

A surprise finding from a joint study by the University of Maryland and Microsoft: Polish proved the most effective of 26 languages for prompting large AI models, while English ranked only sixth.

How researchers tested language performance with AI

The research team fed identical prompts translated into 26 languages to multiple large language models — including OpenAI models, Google Gemini, Qwen, Llama, and DeepSeek — and measured task accuracy. Against expectations, Polish came out on top with an average task accuracy of 88%.
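
The article doesn't publish the study's prompts, models, or scoring code, so the following is only a minimal sketch of what such a cross-lingual accuracy loop could look like, assuming an OpenAI-style chat API; the model name, the task set, and the exact-match scoring are all illustrative stand-ins, not the study's method.

```python
# Minimal sketch of a cross-lingual accuracy loop, assuming an
# OpenAI-style chat API. The study's real prompts, models, and
# scoring are not published here; everything below is illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical task set: the same question translated into several
# languages, each paired with the answer the grader expects.
TASKS = {
    "en": ("How many legs do three spiders have in total? Reply with a number.", "24"),
    "pl": ("Ile nóg mają łącznie trzy pająki? Odpowiedz liczbą.", "24"),
    "fr": ("Combien de pattes ont trois araignées au total ? Réponds par un nombre.", "24"),
}

def accuracy_by_language(model: str = "gpt-4o-mini") -> dict[str, float]:
    """Ask the same translated question per language and score exact matches."""
    scores = {}
    for lang, (prompt, expected) in TASKS.items():
        reply = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
            temperature=0,
        )
        answer = (reply.choices[0].message.content or "").strip()
        scores[lang] = 1.0 if expected in answer else 0.0
    return scores

if __name__ == "__main__":
    print(accuracy_by_language())
```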

The report's authors called the results “unexpected” and noted that English was not the universal winner: in longer-text evaluations English placed sixth, while Polish led the pack. The study highlights how language choice can materially affect model output quality.

Top languages for AI prompting — the study's leaderboard

Here are the ten best-performing languages from the study, ranked by average accuracy:

  • Polish — 88%
  • French — 87%
  • Italian — 86%
  • Spanish — 85%
  • Russian — 84%
  • English — 83.9%
  • Ukrainian — 83.5%
  • Portuguese — 82%
  • German — 81%
  • Dutch — 80%

Why might Polish be better for AI prompts?

Several theories could explain this counterintuitive outcome. Polish is morphologically rich and has relatively consistent spelling rules, which might yield tokens that align well with transformer tokenization schemes. That can make prompts clearer to a model even if fewer Polish training examples exist.
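
The tokenization hypothesis is easy to poke at yourself. Below is a rough sketch using OpenAI's tiktoken library to compare how one instruction splits into tokens in English and Polish; the prompts and the cl100k_base encoding are arbitrary choices, and token counts alone say nothing about task accuracy.

```python
# Quick look at the tokenization hypothesis using OpenAI's tiktoken
# library. The prompts and the cl100k_base encoding are arbitrary
# choices; token counts alone say nothing about task accuracy.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

prompts = {
    "English": "Summarize the following text in three sentences.",
    "Polish": "Streść poniższy tekst w trzech zdaniach.",
}

for lang, text in prompts.items():
    tokens = enc.encode(text)
    print(f"{lang}: {len(tokens)} tokens -> {tokens}")
```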

Another factor is ambiguity and phrasing: some languages naturally force more explicit grammatical signals, reducing the chance that a model misinterprets intent. The study also suggests that a language being “hard for humans” doesn’t mean it’s difficult for AI — models can pick up structural patterns regardless of speaker learning difficulty.

On the flip side, Chinese ranked near the bottom (fourth from last) in this evaluation, showing that large training data alone doesn't guarantee superior prompt performance across languages.

Implications for prompt engineering and multilingual AI

So what should developers, researchers, and prompt engineers take away?

  • Don’t assume English is always best: test prompts in multiple languages — you might get more accurate or concise outputs in an unexpected tongue.
  • Consider morphology and tokenization effects when designing multilingual benchmarks or fine-tuning datasets.
  • For international deployments, evaluate model behavior in target languages rather than extrapolating from English-only tests; a test-suite sketch follows this list.
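
To make that last point concrete, here is one way the "evaluate in target languages" advice could live in a CI suite, as a parameterized pytest; call_model and the smoke prompts are placeholders for whatever client and tasks a real deployment uses.

```python
# Sketch of the "evaluate in target languages" advice as a CI test.
# call_model and the smoke prompts are placeholders; wire call_model
# to your actual client and grade real tasks the way the study did.
import pytest

TARGET_LANGUAGES = {
    "en": "Reply with the single word OK.",
    "pl": "Odpowiedz jednym słowem OK.",
    "de": "Antworte nur mit dem Wort OK.",
}

def call_model(prompt: str) -> str:
    """Placeholder for your model client."""
    raise NotImplementedError

@pytest.mark.parametrize("lang,prompt", TARGET_LANGUAGES.items())
def test_instruction_following(lang: str, prompt: str) -> None:
    # Deliberately loose check; real suites would grade per-language
    # task accuracy, not a single smoke prompt.
    assert "OK" in call_model(prompt), f"failed for {lang}"
```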

The Polish Patent Office even posted on social media that the results show Polish is the most precise language for instructing AI, adding a wry note: humans may find Polish hard to learn, but AI does not share that limitation.

What’s next?

Researchers say this isn’t the final word — more work is needed to understand how tokenization, training data distribution, and linguistic structure influence model behavior. Still, the study nudges the AI community to rethink assumptions and to experiment broadly when optimizing prompts for multilingual models.
