Teaching AI systems to understand what’s happening in video as fully as a human can is one of the hardest challenges — and biggest potential breakthroughs — in the world of machine learning. Today, Facebook announced a new initiative designed to give it an edge in this work: training AI on public videos uploaded by Facebook users.
Access to training data is one of AI’s greatest competitive advantages, and by collecting this resource from millions of users, tech giants like Facebook, Google, and Amazon have been able to get ahead in a variety of areas. Facebook has already trained machine vision models on billions of images gathered from Instagram, but it hasn’t previously announced a project of similar ambition for video comprehension.
“By learning from a global stream of publicly available videos spanning almost every country and hundreds of languages, our AI systems will not only improve accuracy but also adapt to a fast-moving world and recognize the nuances and visual cues of different cultures and regions,” the company said in a blog post. The project, titled Learning from Videos, is also part of Facebook’s broader effort to build machines that learn like humans.
The resulting machine learning models will be used to create new content recommendation systems and moderation tools, says Facebook, but they could do much more in the future. An AI that understands the content of videos would give Facebook unprecedented insight into users’ lives, letting it analyze their hobbies and interests, their preferences in brands and clothes, and countless other personal details. Of course, Facebook already has access to much of this information through its current ad-targeting operations, but being able to parse video with AI would add an incredibly rich (and invasive) source of data to its stores.
Facebook is vague about its future plans for AI models trained on users’ videos. The company told The Verge that these models could be used for a variety of purposes, from video captioning to creating advanced search features, but it did not answer a question about whether they would be used to gather information for ad targeting. Similarly, when asked whether users must consent to, or can opt out of, having their videos used to train Facebook’s AI, the company responded only by noting that its data policy states content uploaded by users “may be used for product research and development.” Facebook also did not respond to questions about how many videos would be collected to train its AI systems or whether company researchers could access this data.
But in the blog post announcing the project, the social network did point to one speculative future use: using AI to search the “digital memories” captured by smart glasses.
Facebook plans to release a pair of consumer smart glasses sometime this year. Details of the device are vague, but these or subsequent glasses will likely include an integrated camera that captures the wearer’s point of view. If AI systems can be trained to understand the content of video, users could search their past recordings, just as many photo apps let people search for a specific location, object, or person. (That information, incidentally, is often indexed by AI systems trained on user data.)
Recording video with smart glasses is “becoming commonplace,” says Facebook, and “people should be able to recall specific moments from their digital memories just as easily as they capture them.” It gives the example of a user searching with the phrase “show me every time we sang happy birthday,” before being served relevant clips. As the company points out, such searches require AI systems to establish connections between types of data, “matching the phrase ‘happy birthday’ to cakes, candles, people singing various birthday songs, and more.” Just as humans do, AI would need to understand rich concepts composed of different types of sensory input.
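Facebook hasn’t published how such a search would work, but cross-modal retrieval of this kind is typically built on embeddings: a text encoder and a video encoder map queries and clips into a shared vector space, and searching becomes a nearest-neighbor lookup. As a minimal, hypothetical sketch — with made-up toy vectors standing in for the output of real trained encoders:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def search_clips(query_embedding, clip_embeddings):
    """Rank clips by how close their embeddings are to the query embedding."""
    scored = [(name, cosine_similarity(query_embedding, emb))
              for name, emb in clip_embeddings.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy 3-dimensional embeddings; a real system would use a learned,
# high-dimensional joint text/video embedding space.
query = np.array([0.9, 0.1, 0.2])  # e.g. "show me every time we sang happy birthday"
clips = {
    "birthday_party.mp4": np.array([0.8, 0.2, 0.1]),  # cake, candles, singing
    "beach_trip.mp4":     np.array([0.1, 0.9, 0.3]),
    "dog_walk.mp4":       np.array([0.2, 0.3, 0.9]),
}

ranking = search_clips(query, clips)
print(ranking[0][0])  # the clip whose embedding best matches the query
```

The hard part, of course, is not the lookup but training encoders that place a text phrase and footage of people singing near each other in that space.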
Looking further into the future, this combination of smart glasses and machine learning could turn wearers into roving CCTV cameras, enabling “world scraping”: the capture of granular data about the world around them. As The Guardian explained in a report on the practice last year: “Every time someone browses a supermarket, their smart glasses could be recording real-time pricing data, stock levels, and browsing habits; every time they open a newspaper, their glasses would know which stories they read, which adverts they saw, and which celebrity beach photos their eyes lingered on.”
That is an extreme outcome, and not a line of research Facebook says it is currently pursuing. But it illustrates the potential significance of pairing advanced AI video analysis with smart glasses, something the social network is clearly keen to do.
By comparison, the only uses of its new AI video analysis tools that Facebook is unveiling right now are relatively mundane. Alongside today’s Learning from Videos announcement, Facebook said it has deployed a new content recommendation system based on video in its TikTok clone, Reels. “Popular videos often consist of the same music set to the same dance moves, but created and acted by different people,” says Facebook. Its AI can analyze the content of videos in order to suggest similar clips to users.
Such content recommendation algorithms are not without potential problems, though. A recent report from MIT Technology Review highlighted how the social network’s emphasis on growth and user engagement has left its AI team unable to fully address the ways algorithms can spread misinformation and encourage political polarization. As the Technology Review article put it, “[machine learning] models that maximize engagement also favor controversy, misinformation, and extremism.” This creates a conflict between the duties of Facebook’s AI ethics researchers and the company’s commitment to maximizing growth.
Facebook isn’t the only big tech company pursuing advanced AI video analysis, nor is it the only one leveraging users’ data to do so. For example, Google maintains a publicly accessible research dataset of 8 million curated and partially labeled YouTube videos intended to “accelerate research on large-scale video understanding.” The search giant’s ad operations could similarly benefit from AI that understands the content of videos, even if the end result is simply serving more relevant ads on YouTube.
But Facebook arguably has one particular advantage over its competitors: not just plentiful training data, but the growing resources it is pouring into an AI method known as self-supervised learning.
Typically, when AI models are trained on data, those inputs have to be labeled by humans: tagging objects in pictures, for example, or transcribing audio recordings. (If you’ve ever solved a CAPTCHA identifying fire hydrants or crosswalks, you’ve likely labeled data that helps train AI.) Self-supervised learning, however, does away with labels, speeding up the training process and, some researchers believe, enabling deeper and more meaningful analysis as AI systems teach themselves to connect the dots. Facebook is so optimistic about self-supervised learning that it has called it “the dark matter of intelligence.”
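One common self-supervised setup is next-step prediction: the training targets come straight from the data itself, so no human labeling is needed. The following toy sketch — a one-parameter predictor fit to a synthetic signal, not anything Facebook has described — shows the basic idea of the data supervising itself:

```python
import numpy as np

# A toy "video": a sequence of frames, each reduced to a single scalar feature.
rng = np.random.default_rng(0)
frames = np.sin(np.linspace(0, 20, 200)) + rng.normal(0, 0.05, 200)

# Self-supervision: the target for frame t is simply frame t+1.
# The labels are derived from the data itself, for free.
inputs, targets = frames[:-1], frames[1:]

# Fit a one-parameter linear predictor by gradient descent on mean squared error.
w = 0.0
lr = 0.1
for _ in range(500):
    preds = w * inputs
    grad = 2 * np.mean((preds - targets) * inputs)  # d(MSE)/dw
    w -= lr * grad

mse = np.mean((w * inputs - targets) ** 2)
print(f"learned weight: {w:.2f}, prediction error: {mse:.4f}")
```

Real self-supervised video models predict masked or future content in a high-dimensional learned representation rather than a raw scalar, but the principle is the same: the objective is manufactured from the data, so billions of unlabeled clips become usable training material.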
Future work on AI video analysis will focus on semi- and self-supervised learning methods, the company says, which “have already improved our computer vision and speech recognition systems.” With the wealth of video content supplied by Facebook’s 2.8 billion users, skipping the labeling part of AI training certainly makes sense. And if the social network can teach its machine learning models to understand video seamlessly, who knows what they might learn?