The Complete Guide to VP9 Video Processing with FFmpeg

EdgeOne-Dev Team
Nov 7, 2024

In today's digital landscape, VP9 has emerged as a powerful and increasingly popular video codec, offering excellent compression efficiency while maintaining high visual quality. As content creators and developers continue to work with various video formats, understanding how to effectively handle VP9 videos becomes crucial. This comprehensive guide explores the versatile FFmpeg framework and its capabilities in managing VP9 video content, from basic encoding and decoding to advanced manipulation techniques. Whether you're a video professional, developer, or enthusiast, this article will equip you with the essential knowledge and practical commands to work efficiently with VP9 using FFmpeg's powerful toolset.

VP9 Video Processing with FFmpeg

What is VP9?

VP9 is an open and royalty-free video coding format developed by Google. It is the successor to VP8 and aims to provide higher compression efficiency and better video quality, especially when processing high-definition and ultra-high-definition video content. VP9 employs a series of advanced coding technologies such as more efficient prediction modes, transform coding, and entropy coding. It can provide clearer images at the same bitrate than previous coding formats or reduce the bitrate while maintaining picture quality, thereby saving storage space and network bandwidth.

During encoding, VP9 partitions frames into blocks more finely and offers a richer set of prediction modes. For example, it handles texture information and object edges better, giving encoded video an advantage in detail preservation. VP9 also improves the handling of chroma information, bringing color reproduction closer to the original picture.

What is WebM?

VP9 is often used in conjunction with the WebM container format. WebM is an open and free media file format specifically designed for storing and playing video content on the web. It is based on the Matroska media container format and aims to provide a universal and high-quality solution for online video.

WebM supports multiple audio and video coding formats, and its combination with VP9 is particularly close. It has good compatibility and can be played in most modern web browsers without the need for additional plugins. This enables video content based on WebM and VP9 to be easily disseminated and played on a wide range of devices and platforms, providing convenience for online video services. Moreover, the WebM format considers the characteristics of network transmission in its design and can efficiently store and transmit video data, reducing buffering time and enhancing the user viewing experience.

VP9 Codec

1. Encoding

The basic command for encoding a video with VP9 using FFmpeg is as follows:

ffmpeg -i input_video.mp4 -c:v libvpx-vp9 -b:v 2M -crf 30 output_video.webm
  • -i input_video.mp4: Specifies the path and filename of the input video file. Here, it is assumed that the input video is in MP4 format. You can replace it according to the actual situation.
  • -c:v libvpx-vp9: Specifies the VP9 encoder from the libvpx library. Note that plain libvpx selects the VP8 encoder; VP9 requires libvpx-vp9.
  • -b:v 2M: Sets the target video bitrate to 2Mbps. The bitrate determines the amount of data per second during video encoding and directly affects video quality and file size. A higher bitrate can provide better picture quality but will result in a larger file size and higher requirements for network transmission and storage.
  • -crf 30: Constant Rate Factor (CRF), with a value range of 0 - 63. A lower value means better picture quality at a correspondingly higher bitrate; crf = 30 is a reasonable balance between the two. When -crf is combined with a nonzero -b:v, libvpx-vp9 runs in constrained-quality mode, treating the bitrate as an upper bound; for pure constant-quality encoding, use -b:v 0.
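
The encode command above can also be assembled programmatically; a minimal sketch in Python that builds (but does not run) the argument list, with the build_vp9_encode_cmd helper and file names being illustrative:

```python
def build_vp9_encode_cmd(src, dst, bitrate="2M", crf=30):
    """Return the ffmpeg argv for the VP9 encode shown above."""
    return [
        "ffmpeg",
        "-i", src,                 # input file
        "-c:v", "libvpx-vp9",      # VP9 encoder from libvpx
        "-b:v", bitrate,           # target bitrate
        "-crf", str(crf),          # quality factor (0-63, lower = better)
        dst,                       # output file (WebM container)
    ]

cmd = build_vp9_encode_cmd("input_video.mp4", "output_video.webm")
print(" ".join(cmd))
```

A list like this can be passed directly to subprocess.run, which avoids shell-quoting problems with file names that contain spaces.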

2. Decoding

To repackage (remux) a VP9-encoded video into another container without re-encoding, you can use the following command:

ffmpeg -i input_video.webm -c:v copy output_decoded_video.mp4

Here, -c:v copy copies the video stream as-is, without decoding and re-encoding; this is remuxing rather than decoding. It quickly moves the VP9 stream from input_video.webm into an MP4 container as output_decoded_video.mp4, but playing the result still requires a VP9-capable decoder. To actually transcode to another codec, replace copy with an encoder such as libx264.
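
To keep the remux/transcode distinction straight in scripts, one approach is to build the argument list explicitly; a minimal sketch (the helper name and file names are illustrative):

```python
def build_vp9_output_cmd(src, dst, reencode=False):
    """Remux with -c:v copy, or decode and re-encode to H.264."""
    codec = ["-c:v", "libx264"] if reencode else ["-c:v", "copy"]
    return ["ffmpeg", "-i", src] + codec + [dst]

# Remux: fast, lossless, output is still VP9 inside MP4.
remux = build_vp9_output_cmd("input_video.webm", "output_remuxed.mp4")

# Transcode: slower and lossy, but plays anywhere H.264 does.
transcode = build_vp9_output_cmd("input_video.webm", "output_h264.mp4",
                                 reencode=True)
```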

Picture Quality Optimization

1. Adjusting the CRF value

Reducing the crf value can significantly improve picture quality. For example, adjusting the crf value from 30 to 20:

ffmpeg -i input_video.mp4 -c:v libvpx-vp9 -b:v 2M -crf 20 output_video.webm

When the crf value is reduced, the encoder will allocate more data to represent image details, making the picture clearer and sharper. However, the bitrate may increase simultaneously.

2. Adjusting resolution and frame rate

The resolution can be adjusted using the -s parameter. For example, increasing the resolution from the original 1280x720 to 1920x1080:

ffmpeg -i input_video.mp4 -c:v libvpx-vp9 -b:v 3M -s 1920x1080 output_video.webm

Increasing the resolution adds picture detail, but a higher bitrate is also needed to maintain quality; if the bitrate is insufficient, artifacts such as blurring or blocking may appear. The encoder must process more pixels, so encoding time may increase. On the playback side, high-resolution video demands more bandwidth and stronger hardware decoding capability; if the device or network cannot keep up, loading may be slow or playback may stutter. (The scale video filter, e.g. -vf scale=1920:1080, offers finer control over scaling than -s.)

The frame rate can be adjusted using the -r parameter. For example, increasing the frame rate from 30fps to 60fps:

ffmpeg -i input_video.mp4 -c:v libvpx-vp9 -b:v 3M -s 1280x720 -r 60 output_video.webm

A higher frame rate makes the picture smoother, which is especially noticeable in fast-moving scenes. However, raising the frame rate increases the amount of data and requires a higher bitrate. High-frame-rate video also needs faster data transmission during playback; otherwise frames may be dropped, hurting the viewing experience.

Speed Optimization

1. Multi-threaded encoding

Use the -threads parameter to enable multi-threaded encoding. For example:

ffmpeg -i input_video.mp4 -c:v libvpx-vp9 -b:v 2M -crf 30 -threads 4 output_video.webm

Here, four threads are set for encoding. Multi-threaded encoding can fully utilize the resources of multi-core CPUs and accelerate the encoding speed. Different threads can process different parts of the video simultaneously, thereby improving overall encoding efficiency.
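
A sketch of choosing the thread count from the machine itself; the command list is built but not executed, and -row-mt 1 (row-based multithreading, available for libvpx-vp9 on recent FFmpeg builds) is added so the extra threads are actually used:

```python
import os

# Leave one core free for the rest of the system.
threads = max(1, (os.cpu_count() or 1) - 1)

cmd = ["ffmpeg", "-i", "input_video.mp4",
       "-c:v", "libvpx-vp9", "-b:v", "2M", "-crf", "30",
       "-threads", str(threads),
       "-row-mt", "1",          # row-based multithreading for libvpx-vp9
       "output_video.webm"]
```

Without -row-mt 1, libvpx-vp9 parallelizes mainly across tiles, so extra threads may sit idle on single-tile encodes.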

2. Hardware acceleration

If hardware support is available, hardware acceleration can be used. Taking NVIDIA GPU as an example:

ffmpeg -hwaccel cuda -i input_video.mp4 -c:v libvpx-vp9 -b:v 2M -crf 30 output_video.webm

Here, -hwaccel cuda offloads decoding of the input video to the GPU. The VP9 encode itself still runs on the CPU, because libvpx-vp9 is a software encoder and NVIDIA's NVENC hardware does not encode VP9. GPU-assisted decoding nevertheless shortens the overall pipeline, especially for high-resolution, high-frame-rate sources.

Bitrate Optimization

1. Adjusting the target bitrate

Set the target bitrate reasonably according to the video content. For complex dynamic scenes, the bitrate can be appropriately increased:

ffmpeg -i action_video.mp4 -c:v libvpx-vp9 -b:v 3M -crf 30 output_video.webm

For videos with more static scenes, the bitrate can be reduced to reduce file size.

2. Bitrate control mode

In addition to simply setting the target bitrate, parameters such as -maxrate and -bufsize can also be used. For example:

ffmpeg -i input_video.mp4 -c:v libvpx-vp9 -b:v 2M -maxrate 2.5M -bufsize 5M output_video.webm

-maxrate sets the maximum bitrate, and -bufsize sets the buffer size. This method can control the fluctuation of the bitrate to a certain extent and avoid network congestion or storage problems caused by excessive bitrate.
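
A common way to derive these values is to cap peaks at some fraction above the target and size the buffer as a couple of seconds of video at maxrate; a sketch where the ratios are rules of thumb, not FFmpeg requirements:

```python
def vbv_settings(target_mbps, max_ratio=1.25, buf_seconds=2.0):
    """Derive -maxrate and -bufsize (both in Mb) from a target bitrate.

    max_ratio=1.25 caps peaks at 125% of the target; a buffer of
    about two seconds at maxrate is a typical starting point.
    """
    maxrate = target_mbps * max_ratio
    bufsize = maxrate * buf_seconds
    return maxrate, bufsize

# A 2 Mbps target yields maxrate 2.5 Mbps and bufsize 5 Mb,
# matching the example command above.
maxrate, bufsize = vbv_settings(2.0)
```

A larger bufsize tolerates bigger bitrate swings (better quality on complex scenes) at the cost of slower bitrate adaptation during streaming.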

Video Quality Comparison of VP9 at Different Bitrates

The following is a set of approximate data comparisons (the actual situation may vary depending on video content and other factors):

Bitrate (Mbps) | Video quality description
0.5 | Obvious loss of picture detail, and color transitions are not natural. In dynamic scenes, moving objects may appear blurry with jagged edges. For example, facial expressions may become indistinct, and texture details in the background are almost invisible.
1 | Picture quality improves and details can be distinguished, but the image is still not sharp. Color reproduction is more accurate, and blurring in dynamic scenes is reduced. For example, in a simple animation, lines and colors display well, but complex textures (such as the veins of leaves) remain coarse.
2 | A relatively clear picture with fairly rich detail. Motion in dynamic scenes looks natural, and color reproduction is good. For example, in a typical TV drama scene, clothing textures and indoor decoration details are clearly visible.
3 | High video quality: details are clear and sharp, colors are vivid with natural transitions. Both static shots and complex dynamic scenes (such as fast action in sports events) are rendered well, with almost no visible quality loss.

How to Choose a Suitable Bitrate for Encoding VP9 Videos

1. Consider video content complexity

  • Simple content (such as static scenes, slide shows, etc.): If the video is mainly composed of static pictures, such as lecture slide shows or simple landscape picture rotations, a lower bitrate (0.5 - 1Mbps) can usually meet the requirements. Because the elements in these pictures are basically static, and the image details are relatively fixed. The VP9 encoder can accurately represent these contents with less data.
  • Medium complexity content (such as general conversation scenes, simple animations, etc.): When the video contains character conversation scenes, and the characters have some natural actions, such as slight body movements and facial expression changes, or simple animations, the bitrate can be set at 1 - 2Mbps. This can better capture the expression details and actions of characters, making the video look more smooth, and at the same time, the color and picture details can also be better presented.
  • High complexity content (such as sports events, action movies, etc.): For high-dynamic and rich-detail content, such as high-speed sports scenes in sports events and intense fighting and special effects scenes in action movies, a higher bitrate is required. Generally, it is recommended to be 2 - 3Mbps or higher. Only in this way can problems such as blurring and trailing in the picture be avoided to ensure that the audience can see every wonderful moment in the video.
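
The three tiers above can be captured in a small lookup; a sketch in Python where the suggest_bitrate helper and the midpoint heuristic are illustrative, not part of FFmpeg:

```python
# Ranges (Mbps) mirror the article's guidance for typical 720p-class
# content; they are rules of thumb, not encoder limits.
BITRATE_BY_COMPLEXITY = {
    "simple": (0.5, 1.0),   # slides, static scenes
    "medium": (1.0, 2.0),   # conversation scenes, simple animation
    "high":   (2.0, 3.0),   # sports events, action movies
}

def suggest_bitrate(complexity):
    """Return the midpoint of the range as a starting point (Mbps)."""
    low, high = BITRATE_BY_COMPLEXITY[complexity]
    return (low + high) / 2
```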

2. Combine target resolution and frame rate

  • Impact of resolution: Videos with lower resolutions (such as 360p or 480p) can use relatively lower bitrates. For example, for a 360p video, for simple content, 0.3 - 0.5Mbps may be sufficient; for medium complexity content, 0.5 - 1Mbps; for high complexity content, 1 - 1.5Mbps. As the resolution increases, such as 720p, 1080p, or higher 4K, 8K, the bitrate needs to be increased accordingly to maintain picture quality. For 720p simple content, 0.5 - 1Mbps may be needed; for medium complexity content, 1 - 2Mbps; for high complexity content, 2 - 3Mbps.
  • Impact of frame rate: Videos with lower frame rates (such as 15fps or 24fps) have relatively lower requirements for bitrate. But if the frame rate is higher (such as 60fps), in order to ensure the smoothness of the picture, especially for dynamic content, the bitrate needs to be appropriately increased. For example, for a 60fps game video, in order to capture the fast actions and details in the game, it may require a bitrate 30% - 50% higher than the same type of video at 30fps.
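
The 30% - 50% uplift suggested for 60fps content can be expressed as a simple scaling rule; a sketch where adjust_for_framerate is a hypothetical helper and 40% is taken as the midpoint of that range:

```python
def adjust_for_framerate(base_mbps, fps, ref_fps=30, uplift=0.4):
    """Scale a 30 fps bitrate recommendation for higher frame rates.

    uplift=0.4 applies a 40% increase per doubling of the reference
    frame rate, the midpoint of the 30-50% range suggested above.
    """
    if fps <= ref_fps:
        return base_mbps
    return base_mbps * (1 + uplift * (fps / ref_fps - 1))

# 2 Mbps at 30 fps scales to 2.8 Mbps at 60 fps (a 40% uplift).
print(adjust_for_framerate(2.0, 60))
```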

3. Consider the limitations of playback platforms and devices

  • Network bandwidth limitations: If the video is mainly played on network platforms and the target user's network bandwidth is limited, such as in some mobile network environments, a lower bitrate needs to be selected to ensure smooth playback. For example, for some short video applications for mobile users, considering that some users may use 3G or 4G networks, the bitrate can be controlled at about 0.5 - 1Mbps to reduce buffering time and playback stuttering.
  • Device decoding capability limitations: Some older devices may have performance issues when decoding high-bitrate videos. In this case, the appropriate bitrate needs to be selected according to the performance of the target device. For example, some old smartphones may not be able to play high-bitrate VP9 videos smoothly. If you want to ensure that these devices can also play normally, for simple content, the bitrate can be controlled below 0.5Mbps; for medium complexity content, it should not exceed 1Mbps.

Adaptive Bitrate

1. Generate multiple bitrate versions

Use FFmpeg to generate multiple versions of videos with different bitrates:

# Low bitrate version
ffmpeg -i input_video.mp4 -c:v libvpx-vp9 -b:v 1M -crf 35 low_bitrate_video.webm

# Medium bitrate version
ffmpeg -i input_video.mp4 -c:v libvpx-vp9 -b:v 2M -crf 30 medium_bitrate_video.webm

# Medium-high bitrate version
ffmpeg -i input_video.mp4 -c:v libvpx-vp9 -b:v 2.5M -crf 28 medium_high_bitrate_video.webm

# High bitrate version
ffmpeg -i input_video.mp4 -c:v libvpx-vp9 -b:v 3M -crf 25 high_bitrate_video.webm
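
Generating the whole ladder from one table avoids copy-paste drift between the four commands; a sketch that builds the argument lists without running them (the ladder_cmds helper is illustrative):

```python
# (name, bitrate, crf) rungs matching the four commands above.
LADDER = [
    ("low",         "1M",   35),
    ("medium",      "2M",   30),
    ("medium_high", "2.5M", 28),
    ("high",        "3M",   25),
]

def ladder_cmds(src="input_video.mp4"):
    """Return one ffmpeg argv per rung of the bitrate ladder."""
    cmds = []
    for name, bitrate, crf in LADDER:
        cmds.append([
            "ffmpeg", "-i", src,
            "-c:v", "libvpx-vp9", "-b:v", bitrate, "-crf", str(crf),
            f"{name}_bitrate_video.webm",
        ])
    return cmds

cmds = ladder_cmds()
```

Each list could then be run with subprocess.run, sequentially or in parallel depending on available CPU.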

2. Cooperate with streaming media servers

Combine these different-bitrate renditions with a streaming server that supports adaptive bitrate delivery via protocols such as DASH or HLS. The server (or the client player) automatically selects segments of the appropriate bitrate based on the user's network bandwidth and device performance. When network conditions are poor, users receive the low-bitrate version to keep playback smooth; when the network is good, they switch to a high-bitrate version for better picture quality.

Conclusion

FFmpeg provides powerful functions for processing VP9 videos. Through fine-tuning of codec parameters, optimization can be achieved in aspects such as picture quality, speed, and bitrate. Moreover, adaptive bitrate technology can enhance the user viewing experience. In practical applications, factors such as video content, target devices, and network environments need to be comprehensively considered, and these methods need to be flexibly applied to achieve the best video processing and playback effects. Whether it is for online video platforms or local video processing, these technologies have broad application prospects and important values.

If your platform requires support for different streaming protocols and rapid, secure content distribution to global audiences, we invite you to utilize Tencent EdgeOne.

Tencent EdgeOne supports a wide range of streaming protocols, including HLS, DASH, RTMP, and WebRTC, ensuring compatibility with various streaming platforms and devices. With an extensive network of edge servers strategically located around the world, Tencent EdgeOne ensures low-latency and high-speed content delivery, providing a seamless viewing experience for users regardless of their geographical location. The platform incorporates advanced security features such as DDoS protection, SSL encryption, and access control, safeguarding your content and ensuring a secure streaming environment.

We have now launched a Free Trial, welcome to Sign Up or Contact Us for more information.