AI Can Match Average Human Creativity—But We Still Hold the Edge Where It Matters Most, New Study Finds

A study of 100,000 people shows AI can match average creativity—but the most creative humans still outperform machines.
Tim McMillan · January 27, 2026
Advances in artificial intelligence have fueled a growing belief that machines are on the verge of matching, or even surpassing, human creativity. Large language models can now write poems, spin short stories, and generate clever wordplay in seconds. To many, these outputs feel creative enough to blur the line between human imagination and machine-generated language.


However, a new large-scale empirical study suggests that while today’s most advanced AI systems can rival the average human on certain creativity measures, they still fall short of the most creative minds—and that gap remains significant.


 


The research, published in Scientific Reports, offers one of the most comprehensive head-to-head comparisons yet between human creativity and large language models (LLMs).


By benchmarking multiple AI systems against a dataset of 100,000 human participants, the study moves the conversation beyond anecdotes and viral examples, replacing speculation with quantitative evidence.


“Our study shows that some AI systems based on large language models can now outperform average human creativity on well-defined tasks,” study co-author Dr. Karim Jerbi, a professor at the University of Montreal, said in a press release. “This result may be surprising — even unsettling — but our study also highlights an equally important observation: even the best AI systems still fall short of the levels reached by the most creative humans.”


Measuring creativity without subjectivity
One of the biggest challenges in studying creativity is measurement. Traditional creativity tests often rely on subjective ratings or expert judgment, making comparisons between humans and machines difficult.


To address this, researchers focused on divergent thinking—the ability to generate ideas meaningfully different from one another—using computational tools that can be consistently applied to both people and AI.


Specifically, the team used the Divergent Association Task (DAT), a well-established test in creativity research in which participants generate a list of words that are as different in meaning from one another as possible.


The creativity score is then calculated using semantic distance: how far apart the words are from each other in a high-dimensional language space. Larger distances indicate more diverse associations and, by extension, higher divergent creativity.
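To make the scoring concrete, here is a minimal Python sketch of a DAT-style metric: the mean pairwise cosine distance between pretrained GloVe word embeddings, rescaled to a 0-100 range. The embedding model, word-list length, and preprocessing here are assumptions for illustration; the study's actual pipeline may differ.

# Minimal DAT-style scoring sketch; assumes GloVe vectors via gensim.
# The study's exact embedding model and preprocessing may differ.
import itertools
import numpy as np
import gensim.downloader as api

model = api.load("glove-wiki-gigaword-300")  # pretrained word vectors

def dat_score(words):
    # Mean pairwise cosine distance between word vectors, scaled to 0-100.
    vecs = [model[w] for w in words if w in model]
    dists = []
    for a, b in itertools.combinations(vecs, 2):
        cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
        dists.append(1.0 - cos)
    return 100.0 * float(np.mean(dists))

# Semantically close words score low; remote associations score high.
print(dat_score(["cat", "dog", "mouse", "hamster"]))
print(dat_score(["volcano", "algebra", "perfume", "hinge"]))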


“These divergence-based measures index associative thinking—the ability to access and combine remote concepts in semantic space—an established facet of creative cognition,” researchers explain. 


Because the DAT relies on objective mathematical scoring rather than human judgment, it offers a rare opportunity to compare human and machine creativity on equal footing.


Where AI beats the average human
Using identical scoring methods, researchers tested a wide range of language models, including GPT-4, GPT-4-Turbo, Gemini Pro, Claude, and several open-source systems. The results were intriguing.


On average, some models—most notably GPT-4—matched or even exceeded the mean DAT score of the human population. Gemini Pro performed statistically on par with humans, while several other models fell just below.


This finding helps explain why AI-generated text can feel surprisingly inventive. Compared with everyday human performance, top-tier models can produce word combinations spanning a wide semantic territory. In purely statistical terms, they can appear impressively creative.


However, the study emphasizes that this is only part of the story.


The ceiling AI still cannot break
When the researchers ranked human participants by their performance on the Divergent Association Task and compared AI systems against those highest-scoring groups, a clear gap emerged. Humans in the top 50% of DAT scores, and especially those in the top 10%, consistently outperformed every tested language model.


Even GPT-4, the highest-performing system in the study, failed to reach the creativity levels of humans in the upper tail of the distribution.


“The top performing LLMs are still largely surpassed by the aggregated top half of human participants, underscoring a ceiling that current LLMs still fail to surpass,” researchers write.


In practical terms, this suggests that while AI can imitate creative behaviors common in the general population, it struggles to replicate the kind of associative leaps made by highly creative individuals—writers, artists, and innovators who regularly push conceptual boundaries.


Creativity can be tuned—but only so far
The researchers also explored whether AI creativity could be enhanced through technical adjustments. By increasing a model’s “temperature”—a parameter that controls randomness in word selection—they observed higher creativity scores. Higher temperatures reduced repetitive word choices and encouraged broader exploration of semantic space.
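The mechanism behind this is simple to illustrate. In the NumPy sketch below (not the study's code, and with made-up word scores), dividing a model's next-word logits by the temperature before the softmax sharpens or flattens the distribution the model samples from.

# Illustrative temperature-scaled softmax; the logits are hypothetical.
import numpy as np

def next_word_probs(logits, temperature):
    scaled = np.array(logits) / temperature
    scaled -= scaled.max()  # shift for numerical stability
    probs = np.exp(scaled)
    return probs / probs.sum()

logits = [4.0, 2.0, 1.0, 0.5]  # hypothetical scores for four candidate words
for t in (0.2, 1.0, 1.8):
    print(t, np.round(next_word_probs(logits, t), 3))
# Low temperature concentrates probability on the top word (repetitive output);
# high temperature spreads it out (broader exploration of semantic space).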


Prompting strategies mattered as well. When models were explicitly instructed to use certain approaches, such as drawing on word etymology, their divergent creativity scores improved. These findings suggest that AI creativity is not fixed, but responsive to how models are guided and configured.
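The article does not reproduce the study's prompts, so the pair below is purely illustrative of the contrast it describes: a plain instruction versus one that adds an explicit etymology-based strategy.

# Hypothetical prompt variants; the authors' actual wording is not given.
BASELINE_PROMPT = (
    "List ten words that are as different from each other in meaning "
    "as possible."
)
ETYMOLOGY_PROMPT = (
    "List ten words that are as different from each other in meaning "
    "as possible. Strategy: choose words with unrelated etymological "
    "roots so that their underlying concepts share no common origin."
)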


Still, these gains did not eliminate the gap between AI and the most creative humans. Tuning improved performance, but it did not fundamentally change the ceiling.


From word lists to stories and poems
To test whether DAT performance translated into more realistic creative tasks, the study also examined creative writing. Models were asked to generate haikus, movie synopses, and flash fiction, which were then evaluated using automated measures of semantic diversity and textual complexity.
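The article does not name the specific measures, but one plausible form a "semantic diversity" metric could take is sketched below: the mean pairwise cosine distance between a text's sentence embeddings, computed with the sentence-transformers library. Treat it as an illustrative stand-in, not the paper's actual metric.

# One possible semantic-diversity metric; an assumption, not the paper's method.
from itertools import combinations
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")

def semantic_diversity(text):
    # Mean pairwise cosine distance between unit-normalized sentence embeddings.
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    if len(sentences) < 2:
        return 0.0
    embs = encoder.encode(sentences, normalize_embeddings=True)
    dists = [1.0 - float(np.dot(a, b)) for a, b in combinations(embs, 2)]
    return float(np.mean(dists))

print(semantic_diversity("The tide erased the map. A violin argued with the rain."))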


Here again, AI systems demonstrated impressive abilities, particularly in longer formats such as synopses and short stories. Yet human-written texts retained a consistent advantage, especially in structured, stylistically constrained forms such as haiku.


The results suggest that while AI can approximate certain surface features of creative writing, deeper patterns of originality and integration remain more characteristic of human authors.


What this means for creative work
The findings cut against both extremes of the AI creativity debate. On the one hand, they undermine claims that creativity is an exclusively human trait beyond the reach of machines. On the other hand, they challenge fears that AI is poised to replace top-tier creative professionals.


“The persistent gap between the best-performing humans and even the most advanced LLMs indicates that the most demanding creative roles in industry are unlikely to be supplanted by current artificial intelligence systems,” researchers write.


Rather than a story of replacement, the research points toward collaboration. AI systems may serve as creative amplifiers—useful for brainstorming, variation, and exploration—while humans remain essential for the highest levels of originality, judgment, and synthesis.


A clearer picture of human creativity
Beyond AI benchmarking, the study also reframes how scientists think about creativity itself. If machines can score well on some creativity tests without human-like cognition, it raises important questions about what those tests truly measure.


By combining large human datasets with objective semantic metrics, the research offers a more nuanced view: creativity is not a single ability, but a spectrum. AI may operate convincingly in the middle of that spectrum, while the upper extremes remain distinctly human. At least for now.


As AI systems continue to evolve, this kind of rigorous, data-driven comparison will be essential. Ultimately, the evidence is clear: machines can imitate creativity, but the deepest wells of human imagination are not so easily replicated.


“Even though AI can now reach human-level creativity on certain tests, we need to move beyond this misleading sense of competition,” Dr. Jerbi says. “Generative AI has above all become an extremely powerful tool in the service of human creativity: it will not replace creators, but profoundly transform how they imagine, explore, and create — for those who choose to use it.”


Tim McMillan is a retired law enforcement executive, investigative reporter and co-founder of The Debrief. His writing typically focuses on defense, national security, the Intelligence Community and topics related to psychology. You can follow Tim on Twitter: @LtTimMcMillan. Tim can be reached by email: tim@thedebrief.org or through encrypted email: LtTimMcMillan@protonmail.com
https://thedebrief.org/ai-can-match-average-human-creativity-but-we-still-hold-the-edge-where-it-matters-most-new-study-finds/