Earlier this week, an investigation revealed that Apple and other tech giants had used YouTube subtitles to train their AI models. The dataset covers over 170,000 videos from creators such as MKBHD, MrBeast, and more. Apple used this dataset to train its open-source OpenELM models, which it released in April.
Apple has now confirmed to 9to5Mac, however, that OpenELM does not power any of its AI or machine learning features, including Apple Intelligence.
Apple says it created the OpenELM model as a way of contributing to the research community and advancing the development of open source large language models. In the past, Apple researchers have described OpenELM as a “state-of-the-art open language model”.
According to Apple, OpenELM was created for research purposes only and is not intended to power any Apple Intelligence features. The model has been released as open source and is widely available, including on Apple’s machine learning research page.
Since OpenELM is not used as part of Apple Intelligence, the “YouTube Subtitles” dataset is not used to power Apple Intelligence either. In the past, Apple has said that Apple Intelligence models are “trained on licensed data, including data selected to enhance certain features as well as publicly available data collected by our web crawlers.”
Finally, Apple informed me that no new version of the OpenELM model will be developed.
As Wired reported earlier this week, companies including Apple, Anthropic, and NVIDIA used this “YouTube Subtitles” dataset to train their AI models. The dataset is part of a larger collection called “The Pile” from the nonprofit EleutherAI.