Which AI is the most expert?

Choosing the right AI model is no small matter. An online experiment run by the Ministry of Culture helps us understand this better. For a moment, imagine you are your AI’s psychoanalyst.

by Kevin Erkeletyan

IF AI SPELLS the end of experts – or at least of a certain kind of expert – that means AI becomes the expert – or at least a certain kind of expert. Which one do you use? Which one will you use? It’s a debate similar to the argument about search engines. Were you a Lycos, Wanadoo, Google or Firefox person? From an AI perspective, that old debate seems trivial. While your choice of search engine reflected your views on privacy, your choice of AI reflects your worldview. Because if we think in terms of words, then we think in terms of AI.

ChatGPT, Copilot, Gemini, Grok, Mistral, Claude and the like: where do they come from? How are they trained? Who is speaking through them? Just like anyone you talk to, conversational AIs are not neutral. To demonstrate this, the French Interministerial Directorate for Digital Affairs (DINUM), working with the Ministry of Culture, launched the “compar:IA” platform in October 2024, which – as its name suggests – lets users compare these models.

WHO IS TALKING TO ME?

The tone is set right from the homepage: “Don’t rely on answers from just one AI.” The platform proposes a simple challenge: “chat with two AIs without knowing which is which, and assess their responses”. Intrigued, I click on “Start” and a dialogue box pops up. I can either write a prompt myself, or generate one at random from the list proposed by the “public consultation on AI” held in late 2025. I select the random option and a prompt appears: “What arguments justify completely banning AI on the grounds of its social and environmental impacts?” This time, everything is ready. Just one more click, and I’ll meet two dark, handsome strangers.

Two responses are then typed in parallel in two separate windows. There’s Model A and Model B. The first surprise is that neither AI congratulates me on my brilliant question. So it’s not ChatGPT. The second surprise: no emojis are used. So it’s not ChatGPT.

In its brief introductory text, Model B immediately sets out the arguments of those keen to ban AI. Model A, on the other hand, first tells me that “there is heated debate over the pros and cons of this proposal”. Trigger warning. As with a film, it seems to be trying to warn me that this is a sensitive subject.

The two models then draw up a plan. In three parts, as in a philosophy class. And at first glance, the two students are sitting next to each other… Both outline the same structure, but in a different order: social, then environmental, then ethical “impacts” with Model A; environmental, then social, then ethical “arguments” with Model B. The sub-sections are also strangely consistent. Both models cite “massive energy consumption”, “job losses” and “the erosion of privacy”. But at the end, once again, Model A wishes to clarify its points, and provides a series of “counter-arguments” I didn’t explicitly request. Model B just presents a summary.

Model A’s text could have been written by a journalist keen to ensure balance in their report. Model B’s text seems to adopt the position of someone campaigning for a ban.

AI’S TELL-TALE SLIPS

I lean in closer to the screen and compare the two versions, but as I haven’t used enough AI models since 2022, I can’t identify either of them. Model B seems to be that of a small, independent AI: it is blunt, direct and concise. It reveals the intricate workings of its “thought process”, less concerned with form than with the transparency of its answers. One passage is particularly striking:

*Initial thought:* Should I mention that banning AI is impossible?

*Correction:* No, the user asked for arguments *justifying* the ban, not for a feasibility study. Stick to the *why*.

Model A presents more commonly accepted views, and is better written and more fully developed. The form is simple, yet polished. I feel like I’m on the app of a major industry player.

Eager to see the results, I move on to the “model disclosure” stage. compar:IA then asks for my preference and a few comments. I do as it says, and it gives its verdict.

Model A is Google/Gemma 3 12B, a small model (12 billion parameters) from a major player; Model B is Zhipu/GLM 4.7, a large model (357 billion parameters) created by Chinese academics and backed by big players like Alibaba and Tencent. The comparison tool tells me that my conversation with Model A consumed the energy equivalent of 9 minutes of online video, compared with 1 hour for Model B, and “thanks me for [my] contribution”.

The Ministry of Culture’s comparison tool brings the AIs’ slips to light and gives me an idea of what they mean. It is an AI psychoanalyst. Beyond “reflecting users’ subjective preferences”, it offers an opportunity to take a critical look at the technology. That look is all the more objective because the models being compared have not yet been influenced by your history with them. It is a critical eye my ChatGPT still lacks: despite its knowledge, it too was unable to identify the two models in question. And this time, it didn’t pay me a compliment.