AI2 releases text-generating AI models and training data.

The Allen Institute for AI (AI2), founded by the late Microsoft co-founder Paul Allen, is releasing several GenAI language models it claims are more “open” than most, licensed so that developers can use them without restriction for training, experimentation, and commercialization.

AI2 senior software developer Dirk Groeneveld says the models, called OLMo (an acronym for “Open Language Models”), were released together with Dolma, the dataset used to train them and one of the largest public datasets of its kind, to advance research into the high-level science of text-generating AI.

“‘Open’ is an overloaded term when it comes to [text-generating models],” Groeneveld told Eltrys by email. “We expect academics and practitioners to use the OLMo framework to examine a model trained on one of the largest public datasets disclosed to date, along with all the components needed to create the model.”


Meta and Mistral already offer powerful open-source text-generating models for developers to use and improve. But Groeneveld argues that many of these models were trained “behind closed doors” on proprietary, opaque data, so they fall short of being truly open.

The OLMo models, by contrast, developed with help from Harvard, AMD, and Databricks, ship with training and evaluation metrics and logs, as well as the code used to generate their training data.

Groeneveld says OLMo 7B, the most capable of the models, is a “compelling and strong” alternative to Meta’s Llama 2, depending on the application. On certain benchmarks, especially reading comprehension, OLMo 7B beats Llama 2; on question-answering tests, however, it lags behind.

Because Dolma is predominantly English, the OLMo models produce low-quality output in other languages, and their code-generating skills are limited. But Groeneveld said it’s early days.

“OLMo is not designed to be multilingual—yet,” he said. And while code generation wasn’t a major aim at this stage, the models’ data mix comprises roughly 15% code to give future code-based fine-tuning efforts a head start.

I asked Groeneveld whether he was worried that bad actors might use the OLMo models, which are licensed for commercial use and can run on consumer GPUs like the Nvidia RTX 3090, in unanticipated, malicious ways. Democracy Reporting International’s Disinfo Radar project, which identifies and addresses disinformation trends and technologies, found that Hugging Face’s Zephyr and Databricks’ Dolly reliably generate toxic content in response to malevolent prompts.

Groeneveld thinks the pros outweigh the cons.

“Building this open platform will actually facilitate more research on how these models can be dangerous and what we can do to fix them,” he said. “Yes, open models may be misused. But this approach fosters technical breakthroughs that lead to more ethical models, makes verification and reproducibility possible by granting access to the complete stack, and reduces the concentration of power, creating more equitable access.”

In the coming months, AI2 plans to release larger and more capable OLMo models, including multimodal ones, along with more datasets for training and fine-tuning. As with the first OLMo and Dolma releases, all materials will be freely available on GitHub and Hugging Face.
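For developers who want to experiment with the release, here is a minimal sketch of loading an OLMo checkpoint through the Hugging Face transformers library. The repository identifier allenai/OLMo-7B, the prompt, and the generation settings are assumptions for illustration; check AI2’s pages on GitHub and Hugging Face for the exact model IDs and any extra dependencies the checkpoints require.

```python
# Minimal sketch: loading an OLMo checkpoint from Hugging Face.
# The repo id "allenai/OLMo-7B" is an assumption for illustration;
# consult AI2's Hugging Face page for the exact identifier.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "allenai/OLMo-7B"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,  # ~14 GB of weights for 7B params
    device_map="auto",          # requires the `accelerate` package
    trust_remote_code=True,     # the checkpoint may ship custom modeling code
)

inputs = tokenizer("Language models are", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Loading in half precision keeps a 7-billion-parameter model around 14 GB of weights, which is what makes a 24 GB consumer card like the RTX 3090 a plausible target.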
