A New Kind of AI Model Lets Data Owners Take Control


A new kind of large language model, developed by researchers at the Allen Institute for AI (AI2), lets data owners control how their training data is used even after a model has been built.

The new model, called FlexOlmo, could challenge the AI industry's prevailing paradigm, in which big artificial intelligence companies slurp up data from the web, books, and other sources, often with little regard for ownership, and then own the resulting models entirely. Once data is baked into an AI model today, extracting it is a bit like trying to recover the eggs from a finished cake.

“Conventionally, your data is either in or out,” says Ali Farhadi, CEO of AI2, based in Seattle, Washington. “Once I train on that data, you lose control. And you have no way out, unless you force me to go through another multimillion-dollar round of training.”

AI2's novel approach divides up training so that data owners can exercise control. Those who want to contribute data to a FlexOlmo model can do so by first copying a publicly shared model known as the “anchor.” They then train a second model using their own data, combine the result with the anchor model, and contribute it back to whoever is building the third and final model.

Contributing in this way means that the data itself never has to be handed over. And because of how the data owner's model is merged with the final one, it is possible to extract the data later. A magazine publisher might, for example, contribute text from its archive of articles to a model but later remove the sub-model trained on that data if there is a legal dispute or if the company objects to how the model is being used.
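The contribute-and-withdraw workflow described above can be sketched in a few lines. Everything here is illustrative: the owner names, the single-weight "model," and the averaging "training" step are invented for the example, and this is not AI2's actual code.

```python
# Toy sketch of the FlexOlmo-style workflow: each owner trains a small
# "expert" on their own data locally, shares only the resulting weights
# (never the raw data), and can withdraw that expert later.
# All names and the toy training step are illustrative, not AI2's method.

def train_expert(anchor_weight, local_data):
    """Pretend training: nudge the shared anchor weight toward the mean
    of the owner's private data. The data never leaves this function."""
    local_mean = sum(local_data) / len(local_data)
    return 0.5 * anchor_weight + 0.5 * local_mean

ANCHOR = 1.0                    # publicly shared anchor model (one weight)
experts = {}                    # owner name -> contributed expert weight

# Two owners contribute independently and asynchronously.
experts["publisher"] = train_expert(ANCHOR, [3.0, 5.0])  # private archive
experts["forum"] = train_expert(ANCHOR, [1.0, 2.0])      # private posts

def combined_model():
    """Final model = anchor averaged with every currently contributed
    expert. Removing an expert just changes which terms are averaged,
    so no retraining of the others is needed."""
    parts = [ANCHOR] + list(experts.values())
    return sum(parts) / len(parts)

before = combined_model()
del experts["publisher"]        # the publisher opts out later
after = combined_model()
print(before, after)
```

The key property the sketch tries to capture is that an owner's contribution stays a separable component of the final model, so opting out is a deletion, not a retraining run.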

“The training is completely asynchronous,” says Sewon Min, a researcher at AI2 who led the technical work. “Data owners don’t have to coordinate, and the training can be done completely independently.”

The FlexOlmo model architecture is what's known as a “mixture of experts,” a popular design normally used to combine several sub-models into a larger, more capable one. AI2's key innovation is a way to merge sub-models that were trained independently. This is achieved through a new scheme for representing the values in a model so that its capabilities can be merged with others when the final combined model is run.

To test the approach, the FlexOlmo researchers created a dataset, which they call Flexmix, drawn from proprietary sources including books and websites. They used the FlexOlmo design to build a model with 37 billion parameters, about a tenth the size of the largest open source model from Meta. They then compared their model to several others. They found that it outperformed any individual model on all tasks and also scored 10 percent better on common benchmarks than two other approaches for merging independently trained models.

The result is a way to have your cake, and get your eggs back too. “You could just opt out of the system without any major damage at inference time,” Farhadi says. “It’s a whole new way of thinking about how to train these models.”


