Benefiting from AI and deep learning for video summarization

Divya Jain, Director of Machine Learning at Adobe Sensei, explains AI's key role in video summarization and in gauging interest in content.

Author: Divya Jain

Date published: May 9, 2019

The global video market is taking center stage. According to Forbes, over 500 million hours of video are watched on YouTube every day. Google adds that almost 50% of internet users look for videos related to a product or service before visiting a store. Statistics like these show that video content is growing and will remain a mainstream means of sharing information. We are already seeing a shift from copy and text to snapshot stories and visual posts (for instance, on Instagram) for sharing content. Artificial intelligence (AI) is also playing a large role in this shift to video. We can use AI to improve video quality through stabilization, to understand and classify content for editing purposes, or to deliver and target content more effectively.

AI is also playing a key role in video summarization, the process of shortening a video by selecting keyframes or segments that capture its main points. Summarization has many use cases, with one of the most significant being the ability to gauge interest in the content. A flashcard summary can determine how many people will actually watch an entire video. Even a single thumbnail plays a crucial role in determining how many people will click on a video to play it. Beyond driving clicks, video summarization is also necessary for efficient viewing of the material and for adapting video length to different platforms, such as Instagram and Facebook.

Recently, there have been many advances in using deep learning to improve image processing. The ability of AI to understand an image's context has rapidly improved in accuracy. Similar techniques can be used to understand videos too, but this is a much more complex process. A video is not just a large collection of frames or images; it is multi-dimensional, with audio, motion, and a time-series dimension. Each of these dimensions is key to understanding a video, and depending on what the summarization is targeting, different dimensions can be crucial.

The anatomy of AI video summarization

Video summarization can be categorized into two broad areas of machine learning: supervised and unsupervised. Supervised summarization entails learning patterns from previously annotated videos and examples. This works very well for videos where a pattern exists, like sporting events. For these videos, we can annotate some sequences and learn from them. However, the biggest challenge with supervised learning is obtaining labeled data. These well-defined datasets are costly to create. Labeling requires domain knowledge, and the approach does not work well across the wide variety of content present on the web.

The other form of summarization is unsupervised, where a small number of frames is selected from the original video by detecting changes in the video. Low-level features such as color, motion, and texture have commonly been used to build histograms and clusters that identify similar frames within a video. A few frames are then selected for the summary based on the information they convey from the original video. These techniques work best when the video has distinct visual content, for example, a video taken over the different days of a vacation. However, these summaries often lack context and come out as disjointed images.
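To make the histogram idea concrete, here is a minimal sketch of change detection with color histograms. It is only an illustration under simple assumptions, not a production method: the function names, the 8-bin histogram, and the 0.3 threshold are all hypothetical choices, and frames are assumed to be H x W x 3 uint8 NumPy arrays.

```python
import numpy as np

def color_histogram(frame, bins=8):
    """Per-channel color histogram of an H x W x 3 uint8 frame, normalized to sum to 1."""
    per_channel = [
        np.histogram(frame[..., c], bins=bins, range=(0, 256))[0] for c in range(3)
    ]
    hist = np.concatenate(per_channel).astype(float)
    return hist / hist.sum()

def select_keyframes(frames, threshold=0.3):
    """Keep the first frame, then every frame whose histogram differs from the
    last kept frame by more than `threshold` (half the L1 distance, range 0..1)."""
    keyframes = [0]
    last = color_histogram(frames[0])
    for i in range(1, len(frames)):
        hist = color_histogram(frames[i])
        if 0.5 * np.abs(hist - last).sum() > threshold:
            keyframes.append(i)
            last = hist
    return keyframes

def scene(color, n=5):
    """Synthetic scene: n identical flat-colored 4 x 4 frames (illustrative only)."""
    return [np.full((4, 4, 3), color, dtype=np.uint8) for _ in range(n)]

# A toy "video" with three visually distinct scenes of five frames each.
video = scene((255, 0, 0)) + scene((0, 255, 0)) + scene((0, 0, 255))
print(select_keyframes(video))  # [0, 5, 10]
```

The sketch picks one frame per scene, which also shows the limitation noted above: the selected frames are visually distinct, but nothing ties them together into a story.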

Recent forms of deep learning look very promising in addressing the challenges mentioned above. They lend themselves to much more effective creation of video summaries. While supervised deep learning techniques popularized the process, unsupervised techniques such as generative adversarial networks (GANs) and reinforcement learning are showing great promise, with advantages that are making them front-runners in video summarization.

The power of emerging unsupervised deep learning techniques in video summarization

For videos that don't adhere to any pattern and are completely different from each other, GANs work very well. A GAN has two neural nets:

A generator that tries to produce data that mimics the real data.
A discriminator that tries to learn whether the data it is shown is real or generated.

This helps GANs learn the data distribution very effectively and create data that is very difficult to distinguish from the original dataset. In this case, each video can be treated as a dataset, with the GAN selecting a subset of frames that is most representative of the given video. This generates unique summaries for videos while preserving the context and meaning of the videos themselves. Marketers can use this technique to create shorter versions of full-length ads or campaigns tailored to specific devices and to target the right audience. Creative artists can also use it to give a preview of their upcoming releases.
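The tug-of-war between the two networks can be illustrated with the standard GAN losses. In the sketch below, the network that produces data is called the generator and the network that judges it is called the discriminator. The sketch computes only the loss values from discriminator scores, which are probabilities that an input is real; it is a generic illustration, not the summarization method of any specific product, and the function names and sample scores are assumptions made for the example.

```python
import numpy as np

def discriminator_loss(d_real, d_fake, eps=1e-12):
    """Binary cross-entropy: low when real inputs score near 1 and generated ones near 0."""
    d_real = np.asarray(d_real, dtype=float)
    d_fake = np.asarray(d_fake, dtype=float)
    return float(-(np.log(d_real + eps).mean() + np.log(1.0 - d_fake + eps).mean()))

def generator_loss(d_fake, eps=1e-12):
    """Non-saturating generator loss: low when generated inputs score near 1."""
    d_fake = np.asarray(d_fake, dtype=float)
    return float(-np.log(d_fake + eps).mean())

# Hypothetical discriminator scores for two real and two generated samples.
real_scores = [0.9, 0.8]
caught = [0.1, 0.2]   # discriminator spots the generated samples
fooled = [0.9, 0.8]   # discriminator mistakes them for real

print(generator_loss(caught) > generator_loss(fooled))
print(discriminator_loss(real_scores, caught) < discriminator_loss(real_scores, fooled))
```

Both comparisons print True: the generator's loss drops as it fools the discriminator, while the discriminator's loss rises, which is the pressure that drives both networks to improve.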

For videos that have a common structure, like sporting events, reinforcement learning can be more effective than supervised learning because it does not require labeled data. Here, the neural nets learn which frames to choose based on a reward function. The reward can come from previous summaries, for example, whether certain frames were watched or skipped. Reward functions can also be defined so that no previous information is required, using measures such as frame diversity and representativeness or frame category classification. Campaign managers can employ such techniques to create more watchable and memorable summaries from past experience and engage with their customers effectively.
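As a rough illustration of a reward that needs no previous information, the sketch below scores a candidate summary by frame diversity plus representativeness. The exact formulas are assumptions chosen for the example (mean pairwise cosine dissimilarity, and an exponential of the mean distance from each frame to its nearest selected frame), and the feature vectors are toy values rather than real frame features.

```python
import numpy as np

def diversity_reward(features, selected):
    """Mean pairwise cosine dissimilarity among the selected frames."""
    n = len(selected)
    if n < 2:
        return 0.0
    x = features[selected]
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    dissimilarity = 1.0 - x @ x.T          # diagonal entries are ~0
    return float(dissimilarity.sum() / (n * (n - 1)))

def representativeness_reward(features, selected):
    """High when every frame in the video lies close to some selected frame."""
    if len(selected) == 0:
        return 0.0
    diffs = features[:, None, :] - features[selected][None, :, :]
    dists = np.linalg.norm(diffs, axis=2)  # shape: (all frames, selected frames)
    return float(np.exp(-dists.min(axis=1).mean()))

def summary_reward(features, selected):
    """Combined reward for a candidate summary given as a list of frame indices."""
    return diversity_reward(features, selected) + representativeness_reward(features, selected)

# Toy features: frames 0-1 show one scene, frames 2-3 show another.
features = np.array([[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]])
spread = [0, 2]      # one frame from each scene
redundant = [0, 1]   # two frames from the same scene

print(summary_reward(features, spread) > summary_reward(features, redundant))
```

The comparison prints True: picking one frame per scene earns a higher reward than picking two near-duplicates. In a reinforcement learning setup, a score like this would be the signal the frame-selection network is trained to maximize.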

These new unsupervised techniques are just the start of a new era in deep learning technology when it comes to video summarization. Many advances will be made in the near future to create and optimize the best summaries based on the audience, delivery medium, and intent of summarization. Together with efforts across the industry, we’ll make video summarization highly scalable, reliable, and incredibly efficient.

Divya Jain is Director of Machine Learning at Adobe Sensei. She can be found on Twitter @divyajain1.
