Android Audio/Video Series: An Introduction to H.264 Video Encoding
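- Spatial redundancy: sample points on the surface of the same object are spatially continuous in color, so neighboring samples are identical or similar.
- Temporal redundancy: consecutive pictures are correlated. For example, when two people chat in a room, the background does not change and the people only change position and posture.
- Structural redundancy: some structures are repetitions of a simple image pattern, such as a honeycomb or a checkered floor.
- Knowledge redundancy: understanding some images depends on prior knowledge. A human face, for instance, has a fixed structure with eyes, nose and mouth arranged in known positions. For image elements with such a fixed structure, a model can be built and combined with an image library, so that a few parameters are enough to represent them.
- Visual redundancy: the human eye's sensitivity to an image is non-uniform and non-linear. It is less sensitive to chroma than to luma, its sensitivity to brightness changes drops in very bright regions, and it is sensitive to object edges but not to interior regions. These visual properties can be used to decide which image information to keep and which to discard.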
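To make the temporal-redundancy point concrete, here is a minimal sketch in C that counts how many pixels actually change between two consecutive 8-bit grayscale frames. The function name `count_changed_pixels` and its `threshold` parameter are assumptions made for this illustration and belong to no codec API. A real encoder relies on motion-compensated prediction rather than plain frame differencing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: count the pixels whose absolute difference between
 * two same-sized 8-bit grayscale frames is larger than `threshold`. */
size_t count_changed_pixels(const uint8_t *prev, const uint8_t *cur,
                            size_t pixel_count, int threshold)
{
    assert(prev != NULL && cur != NULL);
    size_t changed = 0;
    for (size_t i = 0; i < pixel_count; i++) {
        int diff = abs((int)cur[i] - (int)prev[i]);
        if (diff > threshold)
            changed++;
    }
    return changed;
}
```

When only a small region moves, the count is a small fraction of `pixel_count`, which is the redundancy an inter-frame coder exploits.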
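H.264 is a digital video coding standard developed by the Joint Video Team (JVT), which was formed by ITU-T's VCEG and ISO/IEC's MPEG, and it was officially released in March 2003. It uses a network-friendly structure and syntax, which helps in handling bit errors and packet loss. On the coding side, measures such as unified VLC symbol coding, high-precision multi-mode motion estimation, an integer transform on 4x4 blocks and a layered coding syntax give H.264 a very high coding efficiency. The complex algorithms it introduces also slow encoding down, which is a challenge for real-time encoding. Encoding time is reduced mainly by optimizing the implementation of the algorithms and by hardware acceleration.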
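The 4x4 integer transform mentioned above can be illustrated with a short C sketch. It applies the forward core transform Y = C·X·Cᵀ with the integer matrix usually given for H.264. The scaling that the standard folds into quantization is left out, and the function name `forward_transform_4x4` and the flat 16-element block layout are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Core matrix of the H.264 4x4 forward integer transform (scaling omitted). */
static const int CF[4][4] = {
    { 1,  1,  1,  1 },
    { 2,  1, -1, -2 },
    { 1, -1, -1,  1 },
    { 1, -2,  2, -1 },
};

/* Hypothetical helper: out = CF * in * CF^T on a row-major 4x4 block,
 * using integer additions and multiplications only. */
void forward_transform_4x4(const int in[16], int out[16])
{
    int tmp[16];
    assert(in != NULL && out != NULL);

    /* tmp = CF * in */
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 4; j++) {
            tmp[i * 4 + j] = 0;
            for (int k = 0; k < 4; k++)
                tmp[i * 4 + j] += CF[i][k] * in[k * 4 + j];
        }

    /* out = tmp * CF^T */
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 4; j++) {
            out[i * 4 + j] = 0;
            for (int k = 0; k < 4; k++)
                out[i * 4 + j] += tmp[i * 4 + k] * CF[j][k];
        }
}
```

For a flat block where every sample has the same value, all the energy ends up in the single DC coefficient `out[0]` and the other fifteen coefficients are zero, which is why smooth areas compress well.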
x264 is the open-source library from the VideoLAN organization that implements H.264 encoding. Its source can be obtained with git clone.
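Encoder parameters can be initialized with the function int x264_param_default_preset( x264_param_t *param, const char *preset, const char *tune ), which fills param from a named preset and an optional tune: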
x264_param_t param;
x264_param_default_preset(&param, "medium", NULL);
param.i_csp = X264_CSP_I420;
param.i_width = width;
param.i_height = height;
param.b_vfr_input = 0;
param.b_repeat_headers = 1;
param.b_annexb = 1;
x264_param_apply_profile(&param, "high");
x264_t *h = x264_encoder_open(&param);
typedef struct x264_picture_t
{
    /* In: force picture type (if not auto)
     *     If x264 encoding parameters are violated in the forcing of picture types,
     *     x264 will correct the input picture type and log a warning.
     * Out: type of the picture encoded */
    int     i_type;
    /* In: force quantizer for != X264_QP_AUTO */
    int     i_qpplus1;
    /* In: pic_struct, for pulldown/doubling/etc...used only if b_pic_struct=1.
     *     use pic_struct_e for pic_struct inputs
     * Out: pic_struct element associated with frame */
    int     i_pic_struct;
    /* Out: whether this frame is a keyframe. Important when using modes that result in
     * SEI recovery points being used instead of IDR frames. */
    int     b_keyframe;
    /* In: user pts, Out: pts of encoded picture (user) */
    int64_t i_pts;
    /* Out: frame dts. When the pts of the first frame is close to zero,
     *      initial frames may have a negative dts which must be dealt with by any muxer */
    int64_t i_dts;
    /* In: custom encoding parameters to be set from this frame forwards
     *     (in coded order, not display order). If NULL, continue using
     *     parameters from the previous frame. Some parameters, such as
     *     aspect ratio, can only be changed per-GOP due to the limitations
     *     of H.264 itself; in this case, the caller must force an IDR frame
     *     if it needs the changed parameter to apply immediately. */
    x264_param_t *param;
    /* In: raw image data */
    /* Out: reconstructed image data. x264 may skip part of the reconstruction process,
     *      e.g. deblocking, in frames where it isn't necessary. To force complete
     *      reconstruction, at a small speed cost, set b_full_recon. */
    x264_image_t img;
    /* In: optional information to modify encoder decisions for this frame
     * Out: information about the encoded frame */
    x264_image_properties_t prop;
    /* Out: HRD timing information. Output only when i_nal_hrd is set. */
    x264_hrd_t hrd_timing;
    /* In: arbitrary user SEI (e.g subtitles, AFDs) */
    x264_sei_t extra_sei;
    /* private user data. copied from input to output frames. */
    void *opaque;
} x264_picture_t;
typedef struct x264_image_t
{
    int     i_csp;       /* Colorspace */
    int     i_plane;     /* Number of image planes */
    int     i_stride[4]; /* Strides for each plane */
    uint8_t *plane[4];   /* Pointers to each plane */
} x264_image_t;
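i_csp is the colorspace type and i_plane is the number of planes. The common case is yuv420p, which has three planes: y, u and v.
The data is stored in the memory block that plane[0] points to. plane[0], plane[1] and plane[2] point to the start of the y, u and v plane data respectively, and i_stride[0], i_stride[1] and i_stride[2] give the number of bytes that one row of the y, u and v plane occupies.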
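The relationship between the plane pointers and the strides can be sketched in C as follows. The sketch computes where each plane of an I420 (yuv420p) frame starts inside one contiguous buffer, assuming even dimensions and rows with no padding. The `i420_layout` struct and the `i420_layout_for` function are hypothetical names for this illustration and are not part of the x264 API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of a tightly packed I420 frame. */
typedef struct {
    int    stride[3];   /* bytes per row of the Y, U and V planes */
    size_t offset[3];   /* byte offset of each plane from the buffer start */
    size_t total_size;  /* bytes needed for the whole frame */
} i420_layout;

/* Assumes even width/height and no row padding. */
i420_layout i420_layout_for(int width, int height)
{
    assert(width > 0 && height > 0);
    assert(width % 2 == 0 && height % 2 == 0);

    i420_layout l;
    size_t luma_size   = (size_t)width * (size_t)height;
    size_t chroma_size = luma_size / 4;   /* half width, half height */

    l.stride[0] = width;
    l.stride[1] = width / 2;
    l.stride[2] = width / 2;
    l.offset[0] = 0;
    l.offset[1] = luma_size;
    l.offset[2] = luma_size + chroma_size;
    l.total_size = luma_size + 2 * chroma_size;
    return l;
}
```

With such a layout, plane[i] would correspond to the buffer start plus `offset[i]`, and i_stride[i] to `stride[i]`.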
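A frame is encoded with the function int x264_encoder_encode( x264_t *h, x264_nal_t **pp_nal, int *pi_nal, x264_picture_t *pic_in, x264_picture_t *pic_out ). It returns the number of bytes in the returned NALs, a negative value on error, and zero if no NAL units are returned: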
i_frame_size = x264_encoder_encode( h, &nal, &i_nal, &pic, &pic_out );
if( i_frame_size < 0 )
    goto fail;
else if( i_frame_size )
{
    int i;
    /* print the size of every NAL returned for this frame */
    for( i = 0; i < i_nal; i++ )
        printf( "i_nal = %d, i_frame_size = %d, nal i_payload = %d\n", i_nal, i_frame_size, nal[i].i_payload );
}
i_nal = 4, i_frame_size = 13534, nal i_payload = 29
i_nal = 4, i_frame_size = 13534, nal i_payload = 10
i_nal = 4, i_frame_size = 13534, nal i_payload = 690
i_nal = 4, i_frame_size = 13534, nal i_payload = 12805
i_nal = 1, i_frame_size = 67, nal i_payload = 67
i_nal = 1, i_frame_size = 1159, nal i_payload = 1159
/* The data within the payload is already NAL-encapsulated; the ref_idc and type
 * are merely in the struct for easy access by the calling application.
 * All data returned in an x264_nal_t, including the data in p_payload, is no longer
 * valid after the next call to x264_encoder_encode. Thus it must be used or copied
 * before calling x264_encoder_encode or x264_encoder_headers again. */
typedef struct x264_nal_t
{
    int i_ref_idc;  /* nal_priority_e */
    int i_type;     /* nal_unit_type_e */
    int b_long_startcode;
    int i_first_mb; /* If this NAL is a slice, the index of the first MB in the slice. */
    int i_last_mb;  /* If this NAL is a slice, the index of the last MB in the slice. */
    /* Size of payload (including any padding) in bytes. */
    int     i_payload;
    /* If param->b_annexb is set, Annex-B bytestream with startcode.
     * Otherwise, startcode is replaced with a 4-byte size.
     * This size is the size used in mp4/similar muxing; it is equal to i_payload-4 */
    uint8_t *p_payload;
    /* Size of padding in bytes. */
    int i_padding;
} x264_nal_t;
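The encoded data can be accessed in two ways: each NAL can be read separately through its own data pointer and length, or all NAL data can be read at once through the start address of nal[0] (uint8_t *p_payload) and the total length i_frame_size. The first way is used when writing NALs into a video container; the second is used when producing a raw H.264 bitstream.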
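The two access patterns can be sketched in C with a stand-in for x264_nal_t. The `mock_nal` struct and both function names are hypothetical and only mirror the two fields used here. The second function relies on the payloads being laid out back to back in memory, which x264.h states for the NALs returned by one x264_encoder_encode call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the two x264_nal_t fields used in this sketch. */
typedef struct {
    int      i_payload;  /* payload size in bytes */
    uint8_t *p_payload;  /* start of this NAL's bytes */
} mock_nal;

/* Way 1: visit each NAL separately, as a container muxer would. */
size_t copy_per_nal(const mock_nal *nal, int nal_count, uint8_t *dst)
{
    size_t written = 0;
    for (int i = 0; i < nal_count; i++) {
        assert(nal[i].i_payload >= 0);
        memcpy(dst + written, nal[i].p_payload, (size_t)nal[i].i_payload);
        written += (size_t)nal[i].i_payload;
    }
    return written;
}

/* Way 2: copy the whole frame at once, starting at nal[0].p_payload.
 * Only valid when all payloads are contiguous in memory. */
size_t copy_whole_frame(const mock_nal *nal, int frame_size, uint8_t *dst)
{
    assert(frame_size >= 0);
    memcpy(dst, nal[0].p_payload, (size_t)frame_size);
    return (size_t)frame_size;
}
```

Both functions produce the same bytes when the payloads are contiguous. This matches the sample output above, where the four payload sizes of the first frame (29 + 10 + 690 + 12805) add up to its i_frame_size of 13534.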
- NAL_SPS: sequence parameter set
- NAL_PPS: picture parameter set
- NAL_SEI: supplemental enhancement information
- NAL_SLICE_IDR: coded slice of an IDR picture
- NAL_SLICE: coded slice of a non-IDR picture
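The NAL types listed above are carried in the first byte of each NAL unit, which holds a 1-bit forbidden_zero_bit, a 2-bit nal_ref_idc and a 5-bit nal_unit_type. The following C sketch decodes that byte. The struct and function names are assumptions for this example; the numeric values 1, 5, 6, 7 and 8 are the H.264 type codes that the x264 names above correspond to.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the three fields of the NAL header byte. */
typedef struct {
    int forbidden_zero_bit; /* 1 bit, must be 0 in a valid stream */
    int nal_ref_idc;        /* 2 bits, 0 means not used for reference */
    int nal_unit_type;      /* 5 bits */
} nal_header;

nal_header parse_nal_header(uint8_t first_byte)
{
    nal_header h;
    h.forbidden_zero_bit = (first_byte >> 7) & 0x01;
    h.nal_ref_idc        = (first_byte >> 5) & 0x03;
    h.nal_unit_type      =  first_byte       & 0x1F;
    return h;
}

/* Map the type codes discussed in the text to their x264 names. */
const char *nal_type_name(int nal_unit_type)
{
    assert(nal_unit_type >= 0 && nal_unit_type <= 31);
    switch (nal_unit_type) {
    case 1:  return "NAL_SLICE";
    case 5:  return "NAL_SLICE_IDR";
    case 6:  return "NAL_SEI";
    case 7:  return "NAL_SPS";
    case 8:  return "NAL_PPS";
    default: return "other";
    }
}
```

For instance, a header byte of 0x67 decodes to nal_ref_idc 3 and nal_unit_type 7, an SPS.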
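The H.264 video coding standard suits video transmission across different networks mainly because it introduces a layered structure: the compressed image data is split into a Network Abstraction Layer (NAL) and a Video Coding Layer (VCL). This separates compression coding from network transport, so the coding layer can be ported to different network structures. As a result, H.264 is network-friendly to the various networks that exist today and highly adaptable to future ones.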
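As a small illustration of the NAL layer at work, the sketch below splits an Annex-B byte stream back into NAL units by scanning for the 3-byte (00 00 01) and 4-byte (00 00 00 01) start codes. The function names are hypothetical, and the scan is a simplified example, not a complete bitstream parser.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return the index of the next start code at or after
 * `from`, storing its length (3 or 4) in *code_len. Returns `len` and sets
 * *code_len to 0 when no start code is found. */
size_t find_start_code(const uint8_t *buf, size_t len, size_t from,
                       size_t *code_len)
{
    assert(buf != NULL && code_len != NULL);
    for (size_t i = from; i + 3 <= len; i++) {
        if (buf[i] == 0 && buf[i + 1] == 0) {
            if (buf[i + 2] == 1) {
                *code_len = 3;
                return i;
            }
            if (i + 4 <= len && buf[i + 2] == 0 && buf[i + 3] == 1) {
                *code_len = 4;
                return i;
            }
        }
    }
    *code_len = 0;
    return len;
}

/* Count the NAL units in an Annex-B stream by counting start codes. */
int count_nal_units(const uint8_t *buf, size_t len)
{
    int count = 0;
    size_t pos = 0, code_len = 0;
    while ((pos = find_start_code(buf, len, pos, &code_len)) < len) {
        count++;
        pos += code_len;  /* skip past the start code just found */
    }
    return count;
}
```

The bytes between one start code and the next form one NAL unit, which a transport or container layer can then repackage without touching the coded video inside it.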
About the author: taoxiong (熊涛), Android engineer at 天天P图.