Thumbnail Bench

Human evaluation of AI image models for YouTube thumbnail generation. Models are tested on prompt-following using production thumbnail templates.

Leaderboard

Text-to-image model rankings based on AI thumbnail generation.

How We Evaluate

Each model creates thumbnails from production TubeSalt templates using identical prompts and default API settings. Thumbnails are typically scored against 10 to 15 criteria: anatomical accuracy (hands, face, body), skin quality, text and graphics quality, spelling, legibility, composition, framing, and prompt matching. The leaderboard shows each model's average score across multiple template generations.
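The averaging described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the benchmark's actual scoring code: it assumes each criterion is judged pass/fail, that a thumbnail's score is the percentage of criteria passed, and that AVG@10 means the mean of those scores over 10 generations. The function names `thumbnail_score` and `avg_at_k` and the criterion names are hypothetical.

```python
from __future__ import annotations

from statistics import mean


def thumbnail_score(criteria: dict[str, bool]) -> float:
    """Percentage of criteria passed for one thumbnail.

    Assumption: each criterion is a human pass/fail judgment.
    """
    if not criteria:
        raise ValueError("at least one criterion is required")
    return 100.0 * sum(criteria.values()) / len(criteria)


def avg_at_k(generations: list[dict[str, bool]], k: int = 10) -> float:
    """Mean thumbnail score over the first k generations.

    Assumption: this is how an AVG@k column could be computed.
    """
    if len(generations) < k:
        raise ValueError(f"need at least {k} generations, got {len(generations)}")
    return mean(thumbnail_score(g) for g in generations[:k])


# Hypothetical data: 10 criteria per thumbnail, 10 generations.
names = [
    "hands", "face", "body", "skin", "text_graphics",
    "spelling", "legibility", "composition", "framing", "prompt_match",
]
all_pass = {name: True for name in names}
one_fail = {**all_pass, "hands": False}  # a single failed criterion

generations = [all_pass] * 5 + [one_fail] * 5
print(f"{avg_at_k(generations):.1f}%")  # prints "95.0%"
```

If the real rubric uses graded scores or weights some criteria more heavily, only `thumbnail_score` would need to change; the averaging step stays the same.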

Rank  Model                 Score (AVG@10)  Organization
1     Imagen 4 Preview      90.7%           Google
2     Flux Pro Kontext Max  88.0%           Black Forest Labs
3     Flux Pro Kontext      86.0%           Black Forest Labs
4     Ideogram V3           80.2%           Ideogram
5     Seedream V4           79.9%           ByteDance
