Meta Pulled AI Model That Can Write Scientific Papers, Three Days After Its Launch

Galactica demo

AI has become increasingly capable, thanks to the abundance of data, increasingly powerful hardware, and sophisticated algorithms.

The technology has become so advanced that Meta managed to create one that can summarize academic literature, write new research papers, solve mathematical equations, generate Wikipedia-like articles and more. The company calls the AI 'Galactica', and bills it simply as an 'AI for science'.

But it failed, badly.

The thing is, Galactica failed not because it's "dumb."

Quite the contrary: the AI failed because it knows no limits on what it should say.

The AI created such a stir on social media networks and among the researchers who reviewed it that Meta had to pull the plug.

Yann LeCun, Meta's Vice President and Chief AI Scientist, announced this in a tweet.

It all began when Meta, the company formerly known as Facebook, launched a demo version of Galactica.

The text generator was trained on "a large corpus comprising more than 360 million in-context citations and over 50 million unique references normalized across a diverse set of sources. This enables Galactica to suggest citations and help discover related papers," the researchers explained in the Galactica research paper.

It is said that the model was trained on a novel "high-quality scientific dataset called NatureBook, making the models capable of working with scientific terminology, math and chemical formulas as well as source codes."

In the demo, users could enter prompts for virtually any subject and let the AI prepare articles and explainers on the topic.

If users chose to hit the 'Generate More' prompt at the bottom of the screen, Galactica would keep on adding more content to the results.

By adding the right prompts, users could easily generate entire articles from scratch. And because the AI was trained on research papers, the model could also come up with the references and formulas intrinsic to academic writing.

The quality of the text was debatable: some of the results were not as well written as they would have been had a human specialist produced them.

But still, to readers without a keen eye for the specific field, Galactica did excel at creating papers that could pass as credible.

The issue here is that some of the things it spat out could be riddled with falsehoods and potentially harmful stereotypes.

Put another way, Galactica risked becoming another Microsoft Tay, the chatbot that infamously turned to praising Nazis.

Galactica, the large language model designed to "store, combine and reason about scientific knowledge," is intended to accelerate writing scientific literature. But adversarial users running the tests found it could also generate realistic nonsense.

The AI soon spouted racist and fascist text, resembling a mad scientist more than a research assistant.

For example, the AI managed to generate instructions on how to (incorrectly) make napalm in a bathtub, a false wiki entry about the benefits of committing suicide, another wiki entry on the benefits of being White, and numerous research papers about the benefits of eating crushed glass.

In several examples shared by users, the AI also generated articles that sound authoritative and believable but are not backed up by actual scientific research. In some cases, the AI would include the names of real authors but link to non-existent GitHub repositories and research papers.

Many of its results include prejudice and assert falsehoods as facts.

Galactica is little more than statistical nonsense at scale.

According to the published research, Galactica outperforms other models on a range of metrics.

For example, it beats OpenAI's GPT-3 on technical knowledge probes such as LaTeX equations by 68.2% versus 49.0%, surpasses Chinchilla on mathematical MMLU with 41.3% to Chinchilla's 35.7%, and beats PaLM 540B on MATH with a score of 20.4% versus 8.8%. It also outperforms BLOOM and OPT-175B on BIG-bench despite not being trained on a general corpus.

Galactica is, in short, a large language model: an AI capable of generating exceptionally believable text that feels as if it was written by humans.

While the results of these systems are often impressive, Galactica is another example of how the ability to produce believable human language doesn’t mean the system actually understands its contents.
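To see why fluent text does not imply understanding, here is a deliberately tiny sketch of purely statistical text generation: a bigram (Markov-chain) model. This is not Galactica's actual architecture, which is vastly more sophisticated; the toy corpus and function names are made up for illustration. The core idea is the same, though: the next word is chosen from probabilities learned over the training text, with no model of what is true.

```python
import random

# Toy training text: the "model" only ever sees word sequences,
# never facts or meaning.
corpus = (
    "the model predicts the next word "
    "the model learns word statistics "
    "the statistics predict the next word"
).split()

# Count which words follow which (bigram counts).
follows = {}
for prev, nxt in zip(corpus, corpus[1:]):
    follows.setdefault(prev, []).append(nxt)

def generate(start, length=8, seed=0):
    """Sample a word sequence purely from observed bigram statistics."""
    random.seed(seed)
    out = [start]
    for _ in range(length - 1):
        candidates = follows.get(out[-1])
        if not candidates:
            break  # dead end: no observed continuation
        out.append(random.choice(candidates))
    return " ".join(out)

print(generate("the"))
```

The output is locally fluent, because every adjacent word pair was seen in training, yet the sequence as a whole asserts nothing and checks nothing. Scaling this idea up with billions of parameters produces far more convincing text, but the same gap between plausibility and truth remains.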

Some researchers have questioned whether large language models should be used to make any decisions at all.

In particular, this can be a massive problem when it comes to scientific research.

Scientific papers are created through thorough and rigorous methodologies that text-generating AI systems clearly can’t comprehend - at least, not yet.

If some AIs can create deepfakes, AIs like Galactica can open an era of deep "scientific" fakes.

"Some of Galactica’s generated text may appear very authentic and highly-confident, but might be subtly wrong in important ways. This is particularly the case for highly technical content," the company explained.