According to Meta, many companies have until now been forced to compromise on their artificial intelligence systems. To get the best results, an artificial intelligence system that processes text, images and other media must be trained on an immense data set and specialized for a specific task, such as identifying hate speech.
The resulting model might be good at spotting hate speech, but that is all it can do. Because the process is so expensive for teams that want to use artificial intelligence, companies often settle for smaller, simpler and less capable models instead of the most capable ones.
MultiRay makes it possible to reuse the results of that expensive training across many different tasks: multiple models trained for specific tasks can run on the same input while sharing a single processing cost. That lowers the per-model processing cost of building more powerful models.
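The pattern described here can be sketched in a few lines of Python. This is a toy illustration of one large "universal" model running once per input while several cheap task-specific heads reuse its output; all function names and the embedding logic are hypothetical stand-ins, not Meta's actual API.

```python
# Hypothetical sketch of the shared-embedding pattern: one expensive
# universal model runs once per input, and several cheap task heads
# consume the same embedding. Names and logic are illustrative only.

def universal_embed(text: str) -> list[float]:
    """Stand-in for a large universal model: one costly pass per input."""
    codes = [ord(c) for c in text] or [0]
    avg = sum(codes) / len(codes)
    # Toy two-dimensional "embedding" derived from the input.
    return [avg / 255.0, len(codes) / 100.0]

def hate_speech_head(embedding: list[float]) -> bool:
    """Cheap task-specific classifier reusing the shared embedding."""
    return embedding[0] > 0.5  # toy threshold, not a real detector

def topic_head(embedding: list[float]) -> str:
    """Another cheap head consuming the very same embedding."""
    return "long-form" if embedding[1] > 0.2 else "short-form"

embedding = universal_embed("an example post")  # expensive step, done once
flag = hate_speech_head(embedding)              # cheap
topic = topic_head(embedding)                   # cheap
```

The key point the article makes is visible in the last three lines: the costly call happens once, and every additional task only pays for its own small head.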
Meta’s artificial intelligence team wrote in a post that this approach helps amortize the cost of performing those tasks. Because company-wide computation is concentrated in a single model, the company can also trade off between compute power and storage at the company level.
Meta calls MultiRay’s models universal models: they are trained to perform strongly across a wide range of tasks. These jack-of-all-trades models have been shown to deliver higher-quality results, allowing Meta’s teams to improve and iterate quickly on all manner of machine learning models for numerous applications, such as topic tagging for posts, hate speech detection and fake news detection. Meta’s first such model, TextRay, has been up and running since 2020.
Meta is also using MultiRay for artificial intelligence that spans multiple media types. A Facebook post can contain text, images and video, and in that case each element needs to be assessed in the context of the others. Normally this would be done by combining several compute-intensive models into one even larger one.
According to Meta, the increased compute and power consumption slows down its efforts to bring the most advanced machine learning models into production for its products and services.
PostRay brings text and image understanding into a single model. Because it incorporates multiple capabilities, a PostRay model is more complex to train, deploy and maintain. With MultiRay, Meta said, it only has to do that work once, and the resulting model can be used by dozens of different teams within the company.
Meta researchers said that a centralized system allowed them to work directly with cutting-edge research teams and bring their work to production soon after it was published.
Centralizing the models has two advantages. The first is that their cost can be amortized across multiple teams. Training powerful models puts a huge demand on resources such as graphics processing units, and normally each model must be trained separately. With MultiRay, teams effectively train one shared model and split the bill between them, so they all benefit from the same resources.
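The amortization argument can be made concrete with some back-of-the-envelope arithmetic. The numbers below are purely illustrative assumptions, not figures from Meta; the point is only the shape of the trade-off between N separately trained large models and one shared model plus N cheap task heads.

```python
# Toy amortization arithmetic with hypothetical costs (not Meta's numbers).
teams = 10
cost_large_model = 100.0  # e.g. GPU-hours to train one large model (assumed)
cost_task_head = 5.0      # cheap per-task head on top of shared model (assumed)

# Each team trains its own large model from scratch.
separate = teams * cost_large_model

# One shared large model, plus one cheap head per team.
shared = cost_large_model + teams * cost_task_head
per_team_shared = shared / teams

print(separate, shared, per_team_shared)  # 1000.0 150.0 15.0
```

Under these assumed numbers, each team pays 15 GPU-hours instead of 100, and the gap widens as more teams share the central model.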
The second advantage is simplicity. MultiRay consists of a small number of large, centralized models, which a single team can operate on behalf of everyone else. Client teams own only smaller, task-specific models that are easier to manage; many of those teams would not otherwise have the bandwidth to train, deploy and manage cutting-edge artificial intelligence on their own.
Meta admitted that implementing MultiRay introduced many new challenges. For one, the energy required to process queries is affected by query size and hit rates.
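The mention of hit rates suggests a cache sitting in front of the universal model, so that repeated queries skip the expensive computation. A minimal sketch of that idea, assuming a simple in-memory cache (the structure and names here are hypothetical, not MultiRay's actual implementation):

```python
from functools import lru_cache

# Hypothetical sketch: cache embeddings so repeated queries skip the
# expensive universal model. A higher hit rate means less compute per query.

calls = {"model": 0}

@lru_cache(maxsize=1024)
def cached_embed(text: str) -> tuple[float, ...]:
    calls["model"] += 1           # count real model invocations (cache misses)
    return (len(text) / 100.0,)   # stand-in for the expensive embedding

queries = ["hello", "world", "hello", "hello"]
for query in queries:
    cached_embed(query)

hit_rate = 1 - calls["model"] / len(queries)  # 2 unique queries out of 4 -> 0.5
```

This also makes the article's point tangible: the energy spent per query depends directly on how often the cache is hit rather than the model being run.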
Splitting the cost of training also only works if each of the models is widely used, which means they must provide state-of-the-art quality across many use cases. To ensure this, Meta has had to invest heavily in model refreshes and innovate on new model architectures and training flows.
Meta did not mention whether the code that powers MultiRay will be released. But the company has made much of its research available to the community, so others may be able to benefit from MultiRay’s capabilities soon.