Friday, July 3, 2009

Quality Control for Web Video

Web video needs to balance quality and size - you can't afford the bandwidth to make your video look perfect, so you have to accept some loss in quality. On the other hand, you don't want that loss to be distracting to the people watching the video.
When checking quality, you have to look closely at the video, and that can lead you to fixate on artifacts your audience will never notice while missing others that don't look as bad up close but are distracting during normal viewing.

My method for identifying problems and knowing when I have fixed them is this:
First, I watch the compressed video straight through without pausing. I remember what distracts my attention from the video, and take notes.
I use these notes when fixing problems, concentrating on the areas that distracted me the most. If I notice other artifacts while working, I only spend time on them after confirming they are actually visible at full playback speed.
When I think I have fixed the problems, I watch the fixed video again without pausing, trying to watch the way the expected audience would. If I don't see any problems watching like this, then the video quality should be acceptable.

Tuesday, May 12, 2009

VP6 Maximum Quantizer - the real meaning

This setting is sometimes called Minimum Quality.

This limits the amount of difference from the original frame that is allowed. The lower the Max Quantizer (or the higher the Minimum Quality) setting, the closer the compressed frame will be to the original. This will often increase the perceived quality of the video by preventing keyframes from being encoded poorly (which would lead to the video quality dropping for a long period of time).

This setting will override other settings in the VP6 codec, and can cause the video size to increase substantially, potentially pushing the bitrate significantly above the target. To keep the bitrate from going too high for good streaming, lower the target bitrate and increase the variability of VBR encoding when raising this setting - that lets the codec save bandwidth elsewhere in the video, so the keyframes look good and the average bitrate stays reasonable.

When encoding HD video with this setting set for higher quality, you will sometimes get timeouts during the encode, leading to every frame being encoded as a keyframe. To avoid this, encode on a fast machine and avoid bottlenecks during encoding (example bottlenecks: USB hard drives, and other programs competing for the CPU or hard drive).


Technical terms - this setting actually controls the maximum number of Discrete Cosine Transform coefficients that can be dropped, and has a range from 0 (keep all coefficients) to 64 (allow the codec to discard all coefficients).
When this setting is called Minimum Quality, the meaning is inverted and the range runs from 0 (equivalent to 64) to 100 (equivalent to 0).
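
Since the same control appears under both names, it helps to be able to convert between them. Here is a minimal Python sketch of that conversion; the post only pins down the two endpoints of each range, so the straight-line mapping in between is my assumption, and the function names are mine.

    # Hypothetical helpers for converting between the two names for this setting.
    # Only the endpoints (0 <-> 64 and 100 <-> 0) come from the post; the linear
    # mapping in between is an assumption, not documented VP6 behavior.

    def quality_to_max_quantizer(min_quality):
        """Convert a Minimum Quality percentage (0-100) to a Max Quantizer value (0-64)."""
        if not 0 <= min_quality <= 100:
            raise ValueError("Minimum Quality must be between 0 and 100")
        return round(64 * (1 - min_quality / 100))

    def max_quantizer_to_quality(max_quantizer):
        """Convert a Max Quantizer value (0-64) to a Minimum Quality percentage (0-100)."""
        if not 0 <= max_quantizer <= 64:
            raise ValueError("Max Quantizer must be between 0 and 64")
        return round(100 * (1 - max_quantizer / 64))

    print(quality_to_max_quantizer(100))  # 0  - keep all coefficients
    print(quality_to_max_quantizer(0))    # 64 - allow every coefficient to be dropped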

Friday, March 20, 2009

CBR and VBR - the real meaning

CBR stands for Constant Bit rate. What this actually means is that the bitrate doesn't vary much over time. Very few codecs can guarantee exact constant bitrates, so there is almost always some variation.

VBR stands for Variable Bit rate. This means that the bitrate is allowed to vary by a larger amount to maintain better quality over the entire clip.


CBR is useful when streaming video over severely bandwidth-constrained channels that maintain fixed speeds, like dialup internet, ISDN, broadcast television, or cable and satellite TV channels.

VBR is useful when providing video over connections that work better with overall lower speed and occasional spikes of high speed, like broadband internet.


CBR tends to waste space on easy-to-compress sections of video and lose quality on hard-to-compress sections.

VBR tends to produce better quality at any given average data rate.


I generally recommend VBR unless CBR is required for technical reasons.
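
To make the difference concrete, here is one way the two modes might be requested from an encoder. The post doesn't name a tool, so this sketch assumes ffmpeg with the libx264 encoder; the bitrate and file names are placeholders, and other encoders expose the same ideas under different option names.

    # A sketch of CBR-style vs. VBR-style rate control, assuming ffmpeg + libx264.
    import subprocess

    TARGET = "1000k"  # placeholder target bitrate

    # CBR-ish: pin minrate/maxrate to the target so the bitrate varies very little.
    cbr_cmd = ["ffmpeg", "-i", "input.mov", "-c:v", "libx264",
               "-b:v", TARGET, "-minrate", TARGET, "-maxrate", TARGET,
               "-bufsize", "2000k", "cbr_output.mp4"]

    # VBR: give only an average target and let the bitrate swing around it.
    vbr_cmd = ["ffmpeg", "-i", "input.mov", "-c:v", "libx264",
               "-b:v", TARGET, "vbr_output.mp4"]

    subprocess.run(cbr_cmd, check=True)
    subprocess.run(vbr_cmd, check=True)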

Tuesday, March 17, 2009

2 Pass Encoding - the real meaning

2 Pass encoding compresses the video twice - the first time solely to determine how large each frame tends to be, the second time to actually compress with optimized frame size and/or quality.

This allows the encoding program to correct for variations in compressibility and maintain a more even Data Rate. 

When combined with Variable Bit Rate, the encoding program can use the information to improve quality in difficult sections of the video.
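
The post doesn't tie this to a particular program, but as an illustration, a two-pass encode usually looks something like the sketch below, here using ffmpeg with libx264; the bitrate, file names, and stats-file name are placeholders.

    # A sketch of two-pass encoding, assuming ffmpeg with libx264.
    # Pass 1 only writes a statistics log; pass 2 reads it to distribute bits.
    import subprocess

    common = ["ffmpeg", "-y", "-i", "input.mov", "-c:v", "libx264",
              "-b:v", "1000k", "-passlogfile", "stats"]

    # Pass 1: analyze only - discard the video output and skip audio
    # (/dev/null is the Unix-style discard target).
    subprocess.run(common + ["-pass", "1", "-an", "-f", "null", "/dev/null"],
                   check=True)

    # Pass 2: encode for real using the per-frame statistics from pass 1.
    subprocess.run(common + ["-pass", "2", "output.mp4"], check=True)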


Friday, March 13, 2009

Display Aspect Ratio - The Real Meaning

This is the width-to-height proportion that the final video should be played back at.

Non-widescreen Standard Definition TV is 4:3, Widescreen and High Definition TV are 16:9.

Movies vary quite a bit: very old movies are 4:3, older movies use the Academy standard of 11:8, others are 1.85:1 or 2.40:1, and a wide variety of other widescreen aspect ratios have been used.

Same as Source means the program will read the original video and use that aspect ratio.


You will sometimes see 1:1 or Square pixels listed. This means that the aspect ratio is proportional to the number of pixels in each direction. This is generally correct for video to be watched on a modern computer (resize the video so it is proportional to the video's actual aspect ratio if needed).
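
As a worked example of the square-pixel case, here is a small Python sketch that computes the size a frame should be resized to for 1:1-pixel playback, given its stored dimensions and display aspect ratio. The function name is mine; the numbers are just the common NTSC and PAL cases.

    # Resizing non-square-pixel video for square-pixel (1:1) playback:
    # keep the stored height and derive the width from the display aspect ratio.
    from fractions import Fraction

    def square_pixel_size(stored_width, stored_height, dar):
        """Return (width, height) for 1:1-pixel playback at the given display aspect ratio."""
        # stored_width is accepted only so the call sites read naturally;
        # the square-pixel width comes from the height and the aspect ratio.
        width = round(stored_height * Fraction(*dar))
        return width, stored_height

    print(square_pixel_size(720, 480, (4, 3)))   # NTSC DV at 4:3  -> (640, 480)
    print(square_pixel_size(720, 576, (4, 3)))   # PAL DV at 4:3   -> (768, 576)
    print(square_pixel_size(720, 480, (16, 9)))  # NTSC DV at 16:9 -> (853, 480)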


Friday, February 20, 2009

Encoding Profile - the real meaning

Many codecs have 'profiles' - these are collections of allowed settings to encode for limited power devices like cellphones, MP3 players, and set-top boxes. For example, iPods with h.264 capability support the Baseline h.264 profile only.

Generally, higher level profiles require more memory and CPU power to decode, and produce better quality.

My general recommendation is to find the lowest common denominator of the devices you are compressing for and use that if they are similar in capability. If some of the devices are much less powerful than others, you might want to make 2 or 3 versions: one for the lower-powered devices, and another, higher-quality version for the more powerful devices.


Saturday, February 14, 2009

Initial Buffer Fullness - the real meaning

When using a VBV buffer, this specifies how full the buffer should be at the start of the video file. 

This has an effect on the quality of the first section of the video, but shouldn't have any effect after one or two buffer lengths into the video.

Tuesday, February 10, 2009

VBV Buffer size - the real meaning

This sets the size (in seconds) of the video buffer. The compression will be done so that the average bitrate over the VBV buffer is the requested bitrate.

This method of handling variable bitrates for streaming is not necessarily optimal: the maximum VBV buffer length is shorter than most hard-to-compress sections of video, and there is a playback delay proportional to the size of the VBV. It also wastes space in easy-to-compress areas, since the video size will be increased to fill the buffer.
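
As a rough illustration of the sizes involved: the buffer the player has to manage works out to about the target bitrate times the buffer length, which is also why the playback delay grows with the buffer. The numbers below are placeholders, not recommendations.

    # Rough VBV arithmetic: buffer capacity is roughly bitrate x buffer length,
    # and the startup delay is roughly the time needed to fill the required
    # initial fullness at the delivery rate. Placeholder numbers throughout.

    target_kbps = 800        # requested average bitrate
    vbv_seconds = 4          # VBV buffer length in seconds
    initial_fullness = 0.9   # fraction of the buffer filled before playback starts

    buffer_kbits = target_kbps * vbv_seconds
    startup_delay = vbv_seconds * initial_fullness  # if data arrives at the target rate

    print(f"VBV buffer: {buffer_kbits} kilobits (~{buffer_kbits / 8:.0f} kilobytes)")
    print(f"Rough startup delay: {startup_delay:.1f} seconds")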

Sunday, February 8, 2009

Force Block Refresh Every x seconds - the real meaning

This makes sure that every part of the video is refreshed at least once during the interval.

This limits the amount of time before the video recovers from a dropped packet.

This is useful for live video or if there is packet loss, but can hurt quality slightly.

If this is set higher than the maximum keyframe interval, it will have no effect, as keyframes always refresh the entire frame.


Wednesday, February 4, 2009

Packet Size Limit - the real meaning

This is also known as Streaming Packet size.

This is used to limit the size of each video packet to reduce fragmentation.

Larger packets have less overhead, but if the packet is larger than the network packet size, the packet will be fragmented - split into multiple pieces, then reassembled on the playing computer.


Fragmented packets will have little effect on playback unless one of the fragments is lost - and with broadband streaming, players may even be able to recover from lost packets.



Tuesday, February 3, 2009

Deinterlacing - The real meaning

This is a complex subject, and there are many aspects. I will try to explain the important parts as simply as I can.


Field order

This is whether the lower (even lines) or upper (odd lines) field comes first in the frame.

DV cameras are almost all Lower field first (there are some exceptions with PAL video), while most other codecs use Upper Field first.

If you get this wrong, the output video may look jerky.


Deinterlace methods

Weave - this is also called None - keep both interlaced fields in the frame. The resulting video may have horizontal 'comb' artifacts.

Discard even field, discard odd field - use only 1/2 the height and resize. This loses some details and some motion from the original footage, but eliminates the comb artifacts.

Resize by duplicate - results in blocky edges.

Resize by linear interpolate - smooths blocky edges.

Resize Bicubic - restores some of the details with a better guess, but takes longer.

Resize Lanczos - does a better guess than Bicubic, but takes even longer.

Edge Detect - Same as discard and interpolate, but tries to detect edges to interpolate along.

Blend - average the 2 fields - this blurs the motion to some extent.

Smooth Blend - does a lowpass filter on the blend to smooth the image, losing fine details.

Bob - Double frame rate, resize each field to full height. This maintains all the motion from the original footage, but loses some details. (A small sketch of this method appears after this list.)

Motion Compensation - analyze motion in the movie and recreate progressive frames based on analysis of the objects - this is the best method, although it takes significantly longer.

Double frame rate - improves quality by maintaining all original information from the interlaced video.
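
Here is a minimal sketch of the Bob method, assuming a frame stored as a NumPy array of shape (height, width); real deinterlacers interpolate the missing lines instead of simply repeating them, and pay attention to field order.

    # Bob deinterlacing sketch: split the frame into its two fields, then
    # stretch each field back to full height by repeating lines (line doubling),
    # producing two output frames per input frame (double frame rate).
    import numpy as np

    def bob_deinterlace(frame):
        """Return two full-height frames (one per field) from one interlaced frame."""
        upper = frame[0::2]  # display lines 1, 3, 5... counting from 1 (upper field)
        lower = frame[1::2]  # display lines 2, 4, 6... (lower field)
        return np.repeat(upper, 2, axis=0), np.repeat(lower, 2, axis=0)

    interlaced = np.arange(8 * 4).reshape(8, 4)  # toy 8-line "frame"
    first, second = bob_deinterlace(interlaced)
    print(first.shape, second.shape)             # both (8, 4)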


Interlace Detection - also known as Deinterlace Type.

Deinterlace all - apply deinterlace to all frames, even those that aren't interlaced.

Deinterlace interlaced - The program will try to detect interlaced frames, and deinterlace only those frames that are interlaced.

Deinterlace moving - The program will try to isolate moving areas of frames, and deinterlace the moving areas of frames only.


Monday, February 2, 2009

Auto key frames - the real meaning

This is also called Auto key frame on scene change.


This allows the encoder to insert a key frame when the scene changes. You may have to set a threshold for how much change must exist before the encoder inserts a key frame.

Allowing this generally improves the quality of the video, and is highly recommended.


Sunday, February 1, 2009

Compression Speed vs. quality - the real meaning

This controls how much time the encoder puts into getting a better file.


The specific tradeoffs between time and quality depend on the codec.

I recommend setting this close to the highest quality setting unless you have to compress video in realtime or are under a tight deadline.


Thursday, January 29, 2009

Minimum Distance to keyframe - the real meaning

This sets the minimum # of frames between key frames.

This prevents the compressor from making every frame a key frame in high action scenes.


I would recommend setting this to about 1/2-1 second, depending on the content.

(for 30 frame/second video, set this to 15-30)


Wednesday, January 28, 2009

Keyframe distance - the real meaning

This is also called Key Frame every x frames.


This sets the maximum # of frames between key frames. If this # of frames without a keyframe occurs, the compressor will insert a keyframe regardless of whether the scene has changed.


The larger this number, the smaller the resulting video file.

The smaller this number, the more control the user has over video playback - when seeking in web video, you might only be able to seek to a key frame.


I would recommend setting this to 5-10 seconds for reasonable file size of longer videos.

(for 30 frame/second video, set this to 150-300)

For 1-3 minute videos where you want users to be able to seek more accurately, set this to 2-5 seconds.
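
The seconds-to-frames conversion behind those numbers is just seconds times frame rate. A tiny sketch (the function name is mine, and the values mirror the recommendations in this post and the minimum-distance post above):

    # Keyframe distance in frames = interval in seconds x frame rate.

    def keyframe_interval(seconds, fps):
        """Return the keyframe distance in frames for an interval given in seconds."""
        return round(seconds * fps)

    fps = 30
    print(keyframe_interval(5, fps), keyframe_interval(10, fps))    # 150 300 - longer videos
    print(keyframe_interval(2, fps), keyframe_interval(5, fps))     # 60 150  - short, seekable videos
    print(keyframe_interval(0.5, fps), keyframe_interval(1, fps))   # 15 30   - minimum distance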


Tuesday, January 27, 2009

B-Frames - the real meaning

The name is short for Bidirectionally Predicted frames. These frames can refer to other frames that occur both before and after the B-frame.

In other words a B-Frame can say "this frame is the same as the last frame except that the football player has moved, and the ball is from the next frame except that it has moved"


Specific differences

In MPEG-1, 2 and 4, B-frames can refer only to the previous and next Key or P-frame.

In h.264, B-Frames can refer to multiple Key, P and B frames.


Monday, January 26, 2009

P-Frames - the real meaning

The name is short for Predicted frames. 

These frames can refer to other frames in order to reduce the frame size.

In other words a P-frame can say "this frame is the same as the last frame except that the football player has moved and the ball is new - this is what the ball looks like"

P-Frames are significantly smaller than Key Frames, but jumping to them is harder, as one or more other frames have to be decoded in order for the P-frame to be decoded.


Specific differences

In MPEG-1, 2 and 4, P-frames can only refer to a single previous Key or P-frame.

In h.264, P-frames can refer to multiple Key, P, or B-frames.


Sunday, January 25, 2009

Key Frames - the real meaning

Key Frames are also known as I-frames - short for Intra-coded frames.

Key Frames are encoded with no other frames used as a reference.

This allows one to jump to a Key frame with minimal decoding effort.


Key Frames are large compared to other types of compressed frames, and having too many of them will hurt video quality. On the other hand, having too few will make it hard to navigate in the video.
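
To tie the three frame-type posts above together (B-frames, P-frames, and Key Frames), here is a small sketch that applies the simpler MPEG-1/2-style rules described in those posts to an example display-order frame pattern. The pattern and function name are mine, and h.264's multiple-reference behavior is deliberately left out.

    # Frame-reference rules per the posts above: I-frames reference nothing,
    # P-frames reference the previous I- or P-frame, and B-frames reference
    # the previous and next I- or P-frame.

    def gop_references(pattern):
        """Map each frame index to the indexes of the frames it references."""
        anchors = [i for i, t in enumerate(pattern) if t in "IP"]
        refs = {}
        for i, frame_type in enumerate(pattern):
            if frame_type == "I":
                refs[i] = []
            elif frame_type == "P":
                refs[i] = [max(a for a in anchors if a < i)]
            else:  # "B"
                refs[i] = [max(a for a in anchors if a < i),
                           min(a for a in anchors if a > i)]
        return refs

    pattern = "IBBPBBP"  # example display-order pattern
    refs = gop_references(pattern)
    for i, frame_type in enumerate(pattern):
        print(f"frame {i} ({frame_type}) references frames {refs[i]}")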

Saturday, January 24, 2009

Frame Rate - the real meaning

This is the frames per second of the final video.


You usually have an option to use the original frame rate or some fraction of the original, plus fixed frame rates.

Some compression programs put standard frame rate conversions here (e.g. inverse telecine/inverse 3:2 pulldown, Film-PAL, PAL-Film)


My recommendation is to use the original frame rate whenever possible. If you don't have that option, use 1/2 the original frame rate.


Frame Size - the real meaning

This is just the final video width and height in pixels (dots).

640x480 is roughly equivalent to NTSC Standard Definition television and 768x576 is roughly equivalent to PAL Standard Definition television. 1280x720 is 720P HD and 1920x1080 is 1080i/1080P HD.


You additionally may have to specify how to adjust the aspect ratio to fit. There are 4 possible options:

1. Distort - also known as Unconstrained. This resizes the video to fill the final size regardless of the original aspect ratio. This is appropriate for converting 720x480 NTSC video or 720x576 PAL video to computer square pixels.

2. Letterbox - this will put black bars in the video to adjust the aspect ratio. This is appropriate for publishing video in a player that resizes video to fill the player screen.

3. Maintain Aspect Ratio - this will shrink one of the output dimensions so that the original aspect ratio is maintained. This is preferable to Letterbox for players that maintain aspect ratio when scaling video. (Options 2 and 3 are sketched in code after this list.)

4. Crop - also known as pan & scan. This will crop off the edges of the video that won't fit in the destination. This is almost never appropriate.
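
Here is a minimal square-pixel sketch of options 2 and 3: fit a source aspect ratio into a target frame either by shrinking one output dimension, or by adding the black padding that letterboxing would fill in. The function names and example sizes are mine.

    # Options 2 and 3, assuming square pixels throughout.

    def maintain_aspect(src_w, src_h, dst_w, dst_h):
        """Largest size that fits inside the destination while keeping the source aspect ratio."""
        scale = min(dst_w / src_w, dst_h / src_h)
        return round(src_w * scale), round(src_h * scale)

    def letterbox(src_w, src_h, dst_w, dst_h):
        """Scaled picture size plus the total black padding needed to fill the destination frame."""
        fit_w, fit_h = maintain_aspect(src_w, src_h, dst_w, dst_h)
        return (fit_w, fit_h), (dst_w - fit_w, dst_h - fit_h)

    # A 16:9 source going into a 640x480 (4:3) frame:
    print(maintain_aspect(1920, 1080, 640, 480))  # (640, 360) - smaller output frame, no bars
    print(letterbox(1920, 1080, 640, 480))        # ((640, 360), (0, 120)) - 60-pixel bars top and bottom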


I recommend you make the final video as large as practical without going larger than the original video. You gain more from the extra pixels than you lose in size.

Thursday, January 22, 2009

Peak Rate - The Real Meaning

The Peak Rate setting is the maximum bitrate that should be allowed by the codec. 

Be aware that some codecs will occasionally peak somewhat higher than the Peak Rate setting.


Tuesday, January 20, 2009

Bitrate in the Real World

The Bitrate setting is also known as Data Rate and Average Rate.

In almost all codecs and programs, it is used to set the target (not actual) bitrate of the file in kilobits/second (kbps). Usually, the actual bitrate comes out larger than the setting (how much larger varies based on the program and codec).


Determining the final actual bitrate.

To determine the approximate actual bitrate of a compressed file, take the file size in Kilobytes, multiply by 8 (there are 8 bits in a byte), then divide by the # of seconds of video in the file. This will give you a total bitrate for audio and video combined.

This is only an approximate bitrate because there is some overhead, but it should be very close.
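
Here is the same arithmetic as a tiny Python sketch; the example file size and duration are made up.

    # Approximate total (audio + video) bitrate from file size and duration,
    # exactly as described above: kilobytes x 8 = kilobits, divided by seconds.

    def actual_bitrate_kbps(file_size_kilobytes, duration_seconds):
        """Approximate combined audio + video bitrate of a compressed file, in kbps."""
        return file_size_kilobytes * 8 / duration_seconds

    # Example: a 4,500 KB file that runs for 60 seconds.
    print(f"{actual_bitrate_kbps(4500, 60):.0f} kbps")  # 600 kbps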


Compression Settings in the Real World

The settings available in most compression programs are confusing. 

The help file is often confusing, and often inaccurate.

Sometimes the actual effect of a setting is the opposite of what you would expect.

With that in mind, I am working on describing most of the settings available, listing some alternate wording used in the programs, and describing what the setting does in the real world.