Apple quietly unveils MM1, a multimodal LLM

Apple researchers have quietly published a paper describing the company’s work on MM1, a set of multimodal LLMs (large language models) designed to caption images, answer visual questions, and perform natural language inference. This suggests that Apple, which has remained publicly quiet on AI while the rest of the industry has raced to capitalize on this next wave, has been making progress and could soon play a leading role.

“In this work, we discuss building performant Multimodal Large Language Models (MLLMs),” reads the description of MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training on arxiv.org. “We demonstrate that for large-scale multimodal pre-training, using a careful mix of image-caption, interleaved image-text, and text-only data is crucial for achieving state-of-the-art few-shot results across multiple benchmarks, compared to other published pre-training results.”
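To make that data-mixture idea concrete, here is a minimal Python sketch of a weighted sampler that draws pre-training examples from image-caption, interleaved image-text, and text-only sources. The mixture weights, dataset names, and the `sample_batch` helper are purely illustrative assumptions and are not taken from the MM1 paper.

```python
import random

# Hypothetical mixture weights for the three data types named in the
# MM1 abstract; the actual ratios Apple used are not reproduced here.
MIXTURE_WEIGHTS = {
    "image_caption": 0.45,            # (image, caption) pairs
    "interleaved_image_text": 0.45,   # documents that mix images and text
    "text_only": 0.10,                # plain text corpora
}

def sample_batch(datasets, batch_size=8, seed=0):
    """Draw a pre-training batch by picking each example's source
    according to MIXTURE_WEIGHTS, then sampling an example from it."""
    rng = random.Random(seed)
    names = list(MIXTURE_WEIGHTS)
    weights = [MIXTURE_WEIGHTS[n] for n in names]
    batch = []
    for _ in range(batch_size):
        source = rng.choices(names, weights=weights, k=1)[0]
        batch.append((source, rng.choice(datasets[source])))
    return batch

# Toy usage with placeholder examples standing in for real datasets.
toy_data = {
    "image_caption": ["<img_001> A dog running on a beach."],
    "interleaved_image_text": ["Intro text <img_002> more text <img_003>"],
    "text_only": ["A plain text paragraph with no images."],
}
print(sample_batch(toy_data, batch_size=4))
```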


The paper describes MM1 as a family of multimodal models that scale up to 30 billion parameters and “achieve competitive performance after supervised fine-tuning on a range of established multimodal benchmarks.” As Apple’s researchers put it, MLLMs (multimodal large language models) are “the next frontier of foundation models” after traditional LLMs, and are “achieving superior capabilities.”

Apple’s researchers have made breakthroughs in training models on both images and text, and these findings should help anyone looking to scale such models to ever-larger datasets with better performance and reliability. Of course, for now, all we have to go on is the paper, as MM1 is not available to test.

And that may never change: Apple is rumored to be working on an LLM framework called “Ajax” as part of an internal R&D effort.

“We consider AI and machine learning to be foundational technologies, and they’re an integral part of virtually every product we bring to market,” Apple CEO Tim Cook said on a post-earnings conference call in February, after a year of near-silence on the topic. “We’re excited to share the details of our ongoing work in this space later this year.”

Since then, the company has also highlighted the AI prowess of its recently announced M3-based MacBook Air update. But the big push will most likely come in June, when Apple is expected to hold the next edition of its annual developer show, WWDC. It’s reasonable to expect that event to focus on AI, as the upcoming developer shows from Google (I/O) and Microsoft (Build) will.

Paul Thurrott is an award-winning technology journalist and blogger with more than 25 years of industry experience and the author of 30 books. He owns Thurrott.com and hosts three podcasts: Windows Weekly with Leo Laporte and Richard Campbell, Hands-On Windows, and First Ring Daily with Brad Sams. He was previously the senior technology analyst at Windows IT Pro and the writer behind the SuperSite for Windows from 1999 to 2014, and the Thurrott.com major domo at BWW Media Group from 2015 to 2023. You can connect with Paul via email, Twitter, or Mastodon.
