2015-06-01

FFmpeg is a powerful and flexible multimedia framework with extensive support for managing and processing audio and video files. It can record, transform and convert audio and video content in numerous forms and formats using simple commands. FFmpeg processes an input file in sequence: it demuxes, decodes, encodes and finally muxes the streams to generate the requested output. FFmpeg is available on most platforms, including Linux, Mac and Windows. Check here to compile and set up FFmpeg under Windows and here for Mac.



Under the HOOD – A typical FFmpeg command first demuxes the input file into encoded packets and passes them to the decoder. The decoder produces uncompressed frames (raw video, PCM audio), which can be processed further by filters. The frames are then passed to the encoder, which encodes them into the desired format. Finally, the encoded packets are passed to the muxer, which writes them into the chosen container as the output file. Codec and container are widely misunderstood terms that are often used interchangeably. Let us briefly clear up the confusion.

Codec : A codec is the format in which video or audio is compressed, or simply put, the method used to compress and decompress the audio and video data. E.g. H.264, HEVC, VC-1 and VP8 are video codecs, while AAC, AC3, MP3 and FLAC are audio codecs. A codec defines how audio or video content is encoded and decoded.

Container : A container is the wrapper file format in which the video and audio streams are stored. The container takes responsibility for packaging, transport and presentation. Commonly used containers include .flv, .ogg, .mkv, .mov and .mp4. The file extension generally indicates the container type.

FFmpeg will generally choose the video and audio codecs automatically based on the output file extension. For example, for an AVI file, unless the user specifies otherwise, FFmpeg will use MPEG-4 for the video codec and MP3 for the audio codec (when built with libmp3lame). Listed below is a set of the most widely used FFmpeg commands, which cover most audio and video processing requirements.

1. Basic Help Commands

1.1 File info – Get details of audio or video file

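$ ffmpeg -i Hubble.mp4

The command prints file details such as title, encoder used and duration, along with a brief summary of the audio and video tracks including bitrate, codec and metadata. Because no output file is given, FFmpeg prints this information and then exits with a message that at least one output file must be specified.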

1.2 FFmpeg help commands

$ ffmpeg -h         # prints basic options

$ ffmpeg -h long    # prints more options

$ ffmpeg -h full    # prints all options, including format- and codec-specific ones

1.3 Get details on Audio and Video codecs and format support

FFmpeg supports an incredible range of formats and codecs. libavcodec is the core library and provides the bulk of the encoding/decoding support. FFmpeg uses additional external libraries for encoding with certain codecs, e.g. x264 for H.264/MPEG-4 AVC and x265 for HEVC; support for these libraries is enabled at compile time. Decoding for most codecs is natively supported.

$ ffmpeg -formats    # show available formats

$ ffmpeg -codecs     # show available codecs

2. Audio Format/Codec Conversions/Transformations

FFmpeg provides numerous options for audio conversion through simple options and flags. Some of the key commands are:

2.1 Convert audio format

Command to convert audio from one format to another, e.g. converting .wav to .mp3. The command takes the input file followed by the output file.

$ ffmpeg -i input.wav output.mp3

Explanation: WAV is a high-quality, lossless, uncompressed audio format, and MP3 (MPEG-1 Audio Layer III) is a lossy compressed audio format. The command encodes the WAV file (PCM, uncompressed lossless audio) to MP3 at the default bitrate of 128 kbps, which typically yields about 10% of the original size. FFmpeg's console output shows the codec and format details of the conversion: pcm_s16le (native) -> mp3 (libmp3lame).

2.2 Convert audio format with specific audio bitrate

$ ffmpeg -i input.wav -ab 256k output.mp3

Explanation: If no bitrate is specified, 128 kbps is used as the default. The bitrate can be set explicitly with the -ab option (also spelled -b:a in current FFmpeg). The command above sets the output file bitrate to 256 kbps.

2.3 Convert audio format, bitrate and sampling rate

$ ffmpeg -i input.wav -ar 48000 -ab 256k output.mp3

Explanation: By default the sampling rate is kept unchanged; the output sampling rate can be set with the -ar option. The command converts the format and also changes the sampling rate to 48000 Hz and the bitrate to 256 kbps.

3. Video Format/Codec conversions

FFmpeg provides numerous options for video conversion using various flags. Some of the basic commands are:

3.1 Repackaging a file (without re-encoding)

Command to repackage a file with a new container.
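The original command is not shown here. A minimal sketch of what such a command could look like, wrapped in a small helper (the function name `repackage` and the file names are assumptions for illustration only):

```shell
# Hypothetical helper: copy the existing video and audio streams into a new
# container without decoding or re-encoding them.
# Usage: repackage INPUT OUTPUT
repackage() {
    ffmpeg -i "$1" -vcodec copy -acodec copy "$2"
}

# Example: repackage input.flv output.mp4
```

Stream copy only works when the destination container can hold the source codecs, as is the case for H.264 video going from FLV to MP4.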

Explanation : An H.264 video in an FLV container is repackaged into an MP4 container. The command does not decode or encode the streams; it only changes the container.

3.2 Change video codec – Transcode to a particular format

Transcode a file by changing the video codec from one to another. This is easily achieved with the -vcodec flag.
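The original example is not shown here. A sketch consistent with the explanation that follows, re-encoding the video to MPEG-2 while copying the audio (the helper name `transcode_video` and the file names are assumptions):

```shell
# Hypothetical helper: re-encode the video stream as MPEG-2 and copy the
# audio stream unchanged.
# Usage: transcode_video INPUT OUTPUT
transcode_video() {
    ffmpeg -i "$1" -vcodec mpeg2video -acodec copy "$2"
}

# Example: transcode_video input.mkv output.mkv
```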

Explanation : The source video codec can be changed to a destination codec while keeping the audio codec the same. In the above example the video codec is changed to MPEG-2 by specifying the flag “-vcodec mpeg2video”. Similarly, the audio format can be changed by specifying a destination audio codec (e.g. ac3) instead of copy, and the container can be changed by giving the output file a different extension (e.g. .avi). An example below changes the video codec, the audio codec and the container.
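A sketch of such a command, changing the video codec to MPEG-2, the audio codec to AC3 and the container to AVI (the helper name `transcode_all` and the file names are assumptions; the original example is not available):

```shell
# Hypothetical helper: re-encode video as MPEG-2 and audio as AC3.
# The output container is chosen from the extension of OUTPUT.
# Usage: transcode_all INPUT OUTPUT
transcode_all() {
    ffmpeg -i "$1" -vcodec mpeg2video -acodec ac3 "$2"
}

# Example: transcode_all input.mkv output.avi
```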

4. Basic Audio/Video File Operations

4.1 Extract Video Only

Command to extract video track only from the file and ignore audio track.
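The original command is not shown here. A minimal sketch using the “-an” flag described in the explanation (the helper name `extract_video` and the file names are assumptions):

```shell
# Hypothetical helper: keep the video stream as-is and drop all audio.
# Usage: extract_video INPUT OUTPUT
extract_video() {
    ffmpeg -i "$1" -vcodec copy -an "$2"
}

# Example: extract_video input.mp4 video_only.mp4
```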

Explanation : The command keeps the video track intact and generates the output file by removing the audio track. The “-an” flag stands for “no audio recording”, so the output file has no audio. One can also change the video codec and apply additional transformations, such as a different bitrate or container, by providing the corresponding options.

4.2 Extract Audio Only
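The original command is not shown here. A minimal sketch using the “-vn” flag (the helper name `extract_audio` and the file names are assumptions; the example output extension assumes the source audio is AAC):

```shell
# Hypothetical helper: keep the audio stream as-is and drop all video.
# Usage: extract_audio INPUT OUTPUT
extract_audio() {
    ffmpeg -i "$1" -vn -acodec copy "$2"
}

# Example (assuming the source carries AAC audio):
#   extract_audio input.mp4 audio_only.m4a
```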

Explanation : The command keeps the audio track intact and generates the output file by removing the video track. The command uses the “-vn” flag, which stands for “no video recording”, so the output file has no video. One can also change the audio codec and apply additional transformations, such as a different sampling rate or bitrate, by providing the corresponding options.

4.3 Extract Frame at given timecode location

Extracting a Frame (Picture) out of video at a specific timecode location
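The original command is not shown here. A sketch of one way to grab a single frame (the helper name `grab_frame` and the argument order are assumptions). Note that -ss takes a time position rather than a frame count, so a frame offset has to be written as a fraction of a second; the 0.080 s in the example assumes a 25 fps source. The sketch uses -qscale:v, the video-specific spelling of the -qscale flag.

```shell
# Hypothetical helper: seek to a time position and write one frame as an image.
# Usage: grab_frame INPUT TIMESTAMP OUTPUT
grab_frame() {
    ffmpeg -ss "$2" -i "$1" -vframes 1 -qscale:v 0 "$3"
}

# Example (2 frames past 00:02:14 at an assumed 25 fps is about 0.080 s):
#   grab_frame input.avi 00:02:14.080 output.png
```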

Explanation : The command extracts a single frame from the source file input.avi at timecode 2 minutes 14 seconds and 2 frames and writes it as the image file output.png. The flag “-qscale 0” is used to keep the quality intact.

4.4 Extract part of a file with start timecode and duration

The command to extract part of a file from a given offset or start timecode for a specified duration.
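The original command is not shown here. A minimal sketch using the “-ss” and “-t” flags with stream copy (the helper name `extract_clip` and the file names are assumptions):

```shell
# Hypothetical helper: copy a section of the input without re-encoding.
# With stream copy the cut can only start on a keyframe, so the actual start
# may differ slightly from the requested timecode.
# Usage: extract_clip INPUT START DURATION OUTPUT
extract_clip() {
    ffmpeg -ss "$2" -i "$1" -t "$3" -vcodec copy -acodec copy "$4"
}

# Example (3 minutes starting at 1 min 45 s):
#   extract_clip source.mp4 00:01:45 00:03:00 clip.mp4
```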

Explanation : The flag “-ss” takes the start timecode and “-t” specifies the duration to extract from that start point. In the example above, a 3 minute clip is extracted from the source file starting at timecode 1 minute 45 seconds. The output file has the same audio and video codecs as the source.

5. Encoding and Rate Control Algorithms

5.1 Rate Control using Constant Rate Factor

Rate control is the mechanism by which the encoder decides how many bits to spend on each part of the video. It is advisable to either use a constant rate factor (CRF) or perform two-pass encoding, with the latter preferred when the output must hit a target size.

CRF - Constant Rate Factor aims for constant quality and suits cases where output file size is not important. It provides maximum compression efficiency for a single-pass encode. Simply put, CRF compresses frames by different amounts, and therefore uses different bitrates, depending on frame complexity in order to hold quality constant. The CRF range depends on the encoder. E.g. x264 has a range of 0-51, where 0 is lossless, 23 is the default and 51 is the worst quality. A lower value means higher quality (and hence less compression); the recommended range is 18-28. Similarly, libvpx has a range of 4-63.

Encoding Preset : An option that sets the trade-off between encoding speed and compression ratio. A slower preset provides better compression but takes longer. The default preset is medium.
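The original command is not shown here. A sketch of a CRF encode with x264 using the slow preset and CRF 22 (the helper name `encode_crf` and the file names are assumptions; libx264 support must be enabled in the FFmpeg build):

```shell
# Hypothetical helper: single-pass constant-quality H.264 encode.
# The audio stream is copied unchanged.
# Usage: encode_crf INPUT OUTPUT
encode_crf() {
    ffmpeg -i "$1" -vcodec libx264 -preset slow -crf 22 -acodec copy "$2"
}

# Example: encode_crf input.mp4 output.mkv
```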

Explanation : In the above command, the slow preset is selected to achieve better compression, with CRF set to 22. The CRF option can also be combined with a maximum bitrate by specifying both the -crf and -maxrate settings.
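A sketch of a capped CRF encode. The helper name `encode_crf_capped` and the example values are assumptions. With x264 the -maxrate limit is enforced through a rate-control buffer, so -bufsize is set alongside it.

```shell
# Hypothetical helper: constant-quality encode with an upper bitrate limit.
# Usage: encode_crf_capped INPUT OUTPUT MAXRATE BUFSIZE
encode_crf_capped() {
    ffmpeg -i "$1" -vcodec libx264 -crf 22 -maxrate "$3" -bufsize "$4" -acodec copy "$2"
}

# Example: encode_crf_capped input.mp4 output.mp4 1M 2M
```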

5.2 Rate control using 2 Pass encoding

Two-pass encoding can deliver great results when targeting an output file of a specific size.
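The original command is not shown here. A sketch of a two-pass x264 encode on a Unix-like system (the helper name `two_pass`, the AAC audio settings and the file names are assumptions):

```shell
# Hypothetical helper: two-pass H.264 encode at a target video bitrate.
# Pass 1 only analyses the video: audio is disabled and the encoded output is
# discarded, while the statistics are written to the pass log file.
# Pass 2 reads the log and produces the final file with 128 kbps AAC audio.
# Usage: two_pass INPUT OUTPUT VIDEO_BITRATE
two_pass() {
    ffmpeg -y -i "$1" -vcodec libx264 -b:v "$3" -pass 1 -an -f null /dev/null &&
    ffmpeg -i "$1" -vcodec libx264 -b:v "$3" -pass 2 -acodec aac -b:a 128k "$2"
}

# Example: two_pass input.mp4 output.mp4 1200k
```

Both passes must use the same video bitrate and encoder settings so that the statistics from the first pass apply to the second.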

Explanation : In a single-pass encode without any special rate control option, the encoder will use roughly the same amount of data for all frames. This is inefficient, since frames are not alike and have different requirements, e.g. a blank frame versus a scene change or a complex scene. With two-pass encoding, the first pass analyses the video and writes its findings to a log file (ffmpeg2pass-0.log by default). The second pass then uses the information from the log file to produce a better quality encode. Two-pass encoding lets the encoder determine the required bitrate for each frame.

To get an output file of a desired size, the required bitrate can be found using the formula bitrate = file size / duration. For example, for a 15 minute file (900 seconds) and a desired output size of 150 MB, the bitrate is calculated as

bitrate = file size / duration = 150 * 8000 / 900 = 1333 kbps. Subtracting an audio bitrate of 128 kbps leaves a video bitrate of roughly 1200 kbps.

Hence, with two passes the encoder knows that a given blank frame can be encoded with a lower bitrate and that a more complex frame requires a higher bitrate.
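The size-based bitrate calculation can be sketched in shell arithmetic. The helper name `target_video_kbps` is an assumption, and the sketch follows the article's convention of 1 MB = 8000 kilobits with integer division:

```shell
# Hypothetical helper: video bitrate (kbps) needed to hit a target file size.
# Usage: target_video_kbps SIZE_MB DURATION_SECONDS AUDIO_KBPS
target_video_kbps() {
    total_kbps=$(( $1 * 8000 / $2 ))   # overall bitrate for the whole file
    echo $(( total_kbps - $3 ))        # what remains for video after audio
}

# 150 MB over 900 seconds with 128 kbps audio
target_video_kbps 150 900 128   # prints 1205
```

The result can be rounded down (e.g. to 1200k) to leave a little headroom for container overhead.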
