Apple's 'OpenELM' Is An 'Open-Source Efficient Language Model' That Runs Locally


Business affects everyone, from the biggest players to the smallest, and even giants have to take notice.

Apple is a tech giant that tends to distance itself from the crowd. Throughout the generative AI hype that OpenAI kicked off when it announced ChatGPT, Apple has never shown much interest in the technology. Nor is Apple normally known for its openness.

But this is changing, at least this time.

After previously announcing what it calls MGIE, the company has released a generative AI model called OpenELM, which it says can outperform a set of other language models trained on public datasets.

OpenELM, which stands for Open-Source Efficient Language Models, uses a layer-wise scaling strategy to efficiently allocate parameters within each layer of the transformer model, leading to enhanced accuracy.

Apple pretrained the models using the CoreNet library, and the company is releasing both pretrained and instruction-tuned models with 270M, 450M, 1.1B, and 3B parameters.

OpenELM was pretrained on public data, including the RedPajama dataset (GitHub code, books, Wikipedia, StackExchange, and arXiv papers) and the Dolma set (Reddit, Wikibooks, Project Gutenberg, and more).

The mix also includes RefinedWeb and a deduplicated version of the PILE, bringing the total to approximately 1.8 trillion tokens.

And just like other generative AI products out there, users can simply give it a prompt, and let the AI do the rest.
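
As a rough illustration (not Apple's own example), a short script along these lines could load one of the released checkpoints and complete a prompt through Hugging Face transformers; the repository name and the reuse of a Llama-2 tokenizer here are assumptions made for the sketch.

```python
# Minimal sketch: prompt an OpenELM checkpoint via Hugging Face transformers.
# The model repo "apple/OpenELM-270M-Instruct" and the Llama-2 tokenizer are
# assumptions for illustration, not details confirmed in the article.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "apple/OpenELM-270M-Instruct"   # assumed checkpoint name
tokenizer_id = "meta-llama/Llama-2-7b-hf"  # assumed tokenizer

tokenizer = AutoTokenizer.from_pretrained(tokenizer_id)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# Give the model a prompt and let it generate the rest.
inputs = tokenizer("Once upon a time there was", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```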

But what makes OpenELM unique is the way it uses a technique called layer-wise scaling to allocate parameters more efficiently within the transformer model.

So instead of every layer having the same configuration, OpenELM's transformer layers vary with depth, for example in the number of attention heads and the width of the feed-forward blocks.

The result is better accuracy.
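
To picture what layer-wise scaling looks like in practice, here is an illustrative sketch that grows the attention-head count and feed-forward width with depth; the ranges and the linear schedule are made-up values for the example, not Apple's published configuration.

```python
# Illustrative sketch of layer-wise scaling: rather than giving every
# transformer layer the same width, the number of attention heads and the
# feed-forward (FFN) dimension change with depth. All ranges below are
# invented for the example and are not OpenELM's actual settings.

def layerwise_configs(num_layers: int, model_dim: int,
                      min_heads: int = 4, max_heads: int = 16,
                      min_ffn_mult: float = 1.0, max_ffn_mult: float = 4.0):
    """Return a per-layer (heads, ffn_dim) plan that interpolates linearly
    with depth, so early layers stay narrow and later layers get wide."""
    configs = []
    for i in range(num_layers):
        t = i / max(num_layers - 1, 1)  # 0.0 at the first layer, 1.0 at the last
        heads = round(min_heads + t * (max_heads - min_heads))
        ffn_mult = min_ffn_mult + t * (max_ffn_mult - min_ffn_mult)
        configs.append({"layer": i, "heads": heads,
                        "ffn_dim": int(model_dim * ffn_mult)})
    return configs

# Every layer ends up with its own configuration instead of a uniform one.
for cfg in layerwise_configs(num_layers=8, model_dim=1024):
    print(cfg)
```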

"Diverging from prior practices that only provide model weights and inference code, and pre-train on private datasets, our release includes the complete framework for training and evaluation of the language model on publicly available datasets, including training logs, multiple checkpoints, and pre-training configurations," explain Apple researchers in their research paper.

What's more, OpenELM is accompanied by "code to convert models to MLX library for inference and fine-tuning on Apple devices."

MLX is a framework for running machine learning on Apple silicon.
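
For a sense of what local inference looks like, a converted checkpoint could be run on Apple silicon with the community mlx-lm package roughly like this; the model path is an assumption, and Apple's own release ships its conversion scripts separately.

```python
# Minimal sketch of local inference on Apple silicon with mlx-lm
# (pip install mlx-lm). The repo "mlx-community/OpenELM-270M-Instruct"
# is an assumed name for an already-converted checkpoint.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/OpenELM-270M-Instruct")
text = generate(model, tokenizer,
                prompt="Explain layer-wise scaling in one sentence.",
                max_tokens=60)
print(text)
```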

What this means is that Apple is keen to show off the merits of its homegrown chip architecture for machine learning, and that OpenELM is designed to run locally on Apple devices rather than over the network, inside the cloud.

OpenELM vs. public LLMs: OpenELM outperforms comparably sized existing LLMs pretrained on publicly available datasets.

"The release of OpenELM models aims to empower and enrich the open research community by providing access to state-of-the-art language models," the paper reads.

"Trained on publicly available datasets, these models are made available without any safety guarantees."

Apple's claim to openness comes from its decision to release not just the model, but its training and evaluation framework.

It's worth noting, though, that the accompanying software is not released under a recognized open-source license, despite being shared on GitHub.

While Apple does not place heavy restrictions on the project, it makes clear that the company retains the right to file a patent claim if any derivative work based on OpenELM is deemed to infringe on its rights.

So even though the project is described as open source, Apple's interpretation of openness is somewhat comparable to that of the not-very-open OpenAI.

Published: 
22/04/2024