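This series explains in detail how to implement the hardware-dependent audio/video capabilities on mobile: capture, rendering, hardware encoding, and hardware decoding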

Author: 老汪软件技巧
Published: 2024-12-15 11:07

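Audio/video capabilities that touch hardware, such as capture, rendering, hardware encoding, and hardware decoding, are usually tightly coupled to the client operating system. Even cross-platform multimedia frameworks have to rely on modules written in the platform's native language to support them

This is the 3rd article in the series. It explains in detail how to implement hardware video decoding on iOS

Earlier articles in the series:

Audio/Video Fundamentals, iOS Video (1): Video Capture

Audio/Video Fundamentals, iOS Video (2): Hardware Video Encoding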

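Introduction

Video decoding is the inverse of video encoding: it restores compressed image data to raw, uncompressed image data, which can then be used for image processing or rendered to the screen. Rendering raw image data to the screen will be covered in detail in a later article in this series

On iOS, the hardware decoding that Apple exposes through VideoToolbox is mainly used for H.264 and H.265 (HEVC), and this article covers how to implement it for these two formats. Before reading on, it helps to be familiar with the bitstream structure of H.264 and H.265, since the rest of the article builds on it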

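Overall flow

The decoding flow described in this article, in outline: create the video format, initialize the decoder, configure it, prepare the data of each frame, decode, and finally release resources

The data changes form along the way: Annex-B NALU, then length-prefixed NALU, then CMBlockBuffer, then CMSampleBuffer, and finally CVImageBuffer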

System framework

The implementation uses VideoToolbox. Import its header

#import <VideoToolbox/VideoToolbox.h>

Key types

VTDecompressionSessionRef

CMVideoFormatDescriptionRef

CMSampleBufferRef

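Creating the video format

The video format is of type CMVideoFormatDescriptionRef. For H.264 it is created from the SPS and PPS; for H.265 it is created from the VPS, SPS, and PPS

// H.264: sps_data and pps_data point to the raw SPS and PPS NALUs (no start code, no length prefix)
CMVideoFormatDescriptionRef video_format;
uint8_t* sps_data;
uint8_t* pps_data;
size_t sps_data_length;
size_t pps_data_length;
const uint8_t* param_set_pointers[2] = {sps_data, pps_data};
size_t param_set_sizes[2] = {sps_data_length, pps_data_length};
// Each NALU fed to the decoder later is prefixed with a 4-byte length
int nalu_header_length = 4;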
OSStatus status = CMVideoFormatDescriptionCreateFromH264ParameterSets(kCFAllocatorDefault,
                                                                      2,
                                                                      param_set_pointers,
                                                                      param_set_sizes,
                                                                      nalu_header_length,
                                                                      &video_format);
if (status != noErr) {
  // error logic
}

// H.265 (HEVC): the VPS is required in addition to the SPS and PPS (available since iOS 11)
uint8_t* vps_data;
uint8_t* sps_data;
uint8_t* pps_data;
size_t vps_data_length;
size_t sps_data_length;
size_t pps_data_length;
const uint8_t* param_set_pointers[3] = {vps_data, sps_data, pps_data};
size_t param_set_sizes[3] = {vps_data_length, sps_data_length, pps_data_length};
int nalu_header_length = 4;
CMVideoFormatDescriptionRef video_format;
OSStatus status = CMVideoFormatDescriptionCreateFromHEVCParameterSets(kCFAllocatorDefault,
                                                                      3,
                                                                      param_set_pointers,
                                                                      param_set_sizes,
                                                                      nalu_header_length,
                                                                      NULL,
                                                                      &video_format);
if (status != noErr) {
  // error logic
}
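Before either call above can run, the parameter sets have to be located in the incoming bitstream. The following is a minimal sketch in C, assuming the input is an Annex-B buffer that uses 3- or 4-byte start codes and has no trailing zero padding. NaluSpan, split_annex_b, h264_nal_type, and hevc_nal_type are hypothetical names used for illustration; they are not part of VideoToolbox.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* H.264: nal_unit_type is the low 5 bits of the first header byte
 * (7 = SPS, 8 = PPS, 5 = IDR slice). */
static int h264_nal_type(const uint8_t *nalu) { return nalu[0] & 0x1F; }

/* H.265: nal_unit_type is bits 1..6 of the first header byte
 * (32 = VPS, 33 = SPS, 34 = PPS). */
static int hevc_nal_type(const uint8_t *nalu) { return (nalu[0] >> 1) & 0x3F; }

/* Length of the start code at p (4 or 3 bytes), or 0 if there is none. */
static size_t start_code_length(const uint8_t *p, size_t remaining) {
    if (remaining >= 4 && p[0] == 0 && p[1] == 0 && p[2] == 0 && p[3] == 1) return 4;
    if (remaining >= 3 && p[0] == 0 && p[1] == 0 && p[2] == 1) return 3;
    return 0;
}

/* Hypothetical helper type: one NALU payload inside the buffer, start code excluded. */
typedef struct {
    size_t offset;
    size_t length;
} NaluSpan;

/* Splits an Annex-B buffer into NALU payload spans.
 * Returns the number of spans written (at most max_spans). */
static size_t split_annex_b(const uint8_t *buf, size_t size,
                            NaluSpan *spans, size_t max_spans) {
    assert(buf != NULL && spans != NULL);
    size_t count = 0, i = 0, payload_start = 0;
    int in_nalu = 0;
    while (i < size) {
        size_t sc = start_code_length(buf + i, size - i);
        if (sc == 0) { i++; continue; }
        /* A new start code closes the NALU that was open, if any. */
        if (in_nalu && count < max_spans) {
            spans[count].offset = payload_start;
            spans[count].length = i - payload_start;
            count++;
        }
        i += sc;
        payload_start = i;
        in_nalu = 1;
    }
    /* The last NALU runs to the end of the buffer. */
    if (in_nalu && payload_start < size && count < max_spans) {
        spans[count].offset = payload_start;
        spans[count].length = size - payload_start;
        count++;
    }
    return count;
}
```

With the spans in hand, the SPS and PPS (and the VPS for H.265) can be picked out by type and their pointers and lengths passed as param_set_pointers and param_set_sizes.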

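Initializing the decoder

Build destination_image_buffer_attributes. As with encoding, it is a set of parameters that describe the final output image

The decoder type does not need to be specified, because video_format already carries the key information; the session can be created directly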

    const size_t attributes_size = 4;
    CFTypeRef keys[attributes_size] = {
        kCVPixelBufferOpenGLESCompatibilityKey,
        kCVPixelBufferMetalCompatibilityKey,
        kCVPixelBufferIOSurfacePropertiesKey,
        kCVPixelBufferPixelFormatTypeKey
    };
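    // An empty IOSurface properties dictionary requests IOSurface-backed buffers with default options
    CFDictionaryRef io_surface_ref = CFDictionaryCreate(kCFAllocatorDefault, nullptr, nullptr, 0, &kCFTypeDictionaryKeyCallBacks, &kCFTypeDictionaryValueCallBacks);
    OSType pixelFormat = kCVPixelFormatType_420YpCbCr8BiPlanarFullRange;
    // OSType is 32 bits wide, so the CFNumber must be created as a 32-bit integer
    CFNumberRef pixel_format_ref = CFNumberCreate(nullptr, kCFNumberSInt32Type, &pixelFormat);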
    CFTypeRef values[attributes_size] = {
        kCFBooleanTrue,
        kCFBooleanTrue,
        io_surface_ref,
        pixel_format_ref
    };
    CFDictionaryRef destination_image_buffer_attributes = CFDictionaryCreate(kCFAllocatorDefault, keys, values, attributes_size, &kCFTypeDictionaryKeyCallBacks, &kCFTypeDictionaryValueCallBacks);
    
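    // Create the decoder. No output callback is passed here because decoded frames
    // are delivered through the block given to the decode call.
    OSStatus status = VTDecompressionSessionCreate(nullptr,
                                                   video_format,
                                                   NULL,
                                                   destination_image_buffer_attributes,
                                                   nullptr,
                                                   &decode_session_);
    CFRelease(io_surface_ref);
    CFRelease(pixel_format_ref);
    CFRelease(destination_image_buffer_attributes);
    if (status != noErr) {
        // error logic
    }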
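The pixel format requested above, kCVPixelFormatType_420YpCbCr8BiPlanarFullRange, stores a frame in two planes: a full-resolution Y plane and a half-resolution plane of interleaved Cb and Cr bytes. The sketch below, in plain C, shows how a single pixel would be addressed in that layout. BiPlanarFrame, YCbCr, and sample_at are hypothetical names; in a real decoder the plane pointers and strides would presumably come from CVPixelBufferGetBaseAddressOfPlane and CVPixelBufferGetBytesPerRowOfPlane after locking the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical read-only view of a bi-planar 4:2:0 frame. */
typedef struct {
    const uint8_t *y_plane;     /* plane 0: one Y byte per pixel */
    size_t y_stride;            /* bytes per row of plane 0, may include padding */
    const uint8_t *cbcr_plane;  /* plane 1: interleaved Cb, Cr, one pair per 2x2 pixels */
    size_t cbcr_stride;         /* bytes per row of plane 1, may include padding */
    size_t width;
    size_t height;
} BiPlanarFrame;

typedef struct {
    uint8_t y;
    uint8_t cb;
    uint8_t cr;
} YCbCr;

/* Returns the luma and chroma samples that apply to pixel (x, y). */
static YCbCr sample_at(const BiPlanarFrame *frame, size_t x, size_t y) {
    assert(frame != NULL && x < frame->width && y < frame->height);
    YCbCr sample;
    sample.y = frame->y_plane[y * frame->y_stride + x];
    /* Chroma is subsampled by 2 in both directions; each pair takes 2 bytes. */
    const uint8_t *pair = frame->cbcr_plane + (y / 2) * frame->cbcr_stride + (x / 2) * 2;
    sample.cb = pair[0];
    sample.cr = pair[1];
    return sample;
}
```

Indexing by stride rather than by width matters because rows are often padded, so the stride can be larger than the visible width.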

Configuring the decoder

Enable real-time decoding

OSStatus status = VTSessionSetProperty(decode_session_,
                                       kVTDecompressionPropertyKey_RealTime,
                                       kCFBooleanTrue);

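Preparing the data before decoding

Before each frame is decoded, the start code of its NALU has to be replaced

uint8_t* nalu_data;    // points at the 4-byte start code of one NALU
uint32_t nalu_length;  // length of the NALU payload, excluding the 4-byte start code
// For keyframes, skip the VPS, SPS, and PPS
// Replace the Annex-B start code 0x00000001 with the NALU length as a 4-byte big-endian integer
uint32_t nalu_length_big_endian = CFSwapInt32HostToBig(nalu_length);
memcpy(nalu_data, &nalu_length_big_endian, 4);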
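The in-place replacement above only works when the start code is exactly 4 bytes long; with a 3-byte start code there is no room for a 4-byte length. A minimal sketch of both cases in portable C follows, assuming the NALU boundaries are already known. write_length_prefixed and replace_start_code_in_place are hypothetical helper names, and the big-endian bytes are written by hand instead of with CFSwapInt32HostToBig so that the sketch has no platform dependency.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stores value as 4 big-endian bytes at out. */
static void store_be32(uint8_t *out, uint32_t value) {
    out[0] = (uint8_t)(value >> 24);
    out[1] = (uint8_t)(value >> 16);
    out[2] = (uint8_t)(value >> 8);
    out[3] = (uint8_t)value;
}

/* Copies a NALU payload (no start code) into out behind a 4-byte length.
 * Works for any start-code length because the payload is copied.
 * Returns the number of bytes written, or 0 if out is too small. */
static size_t write_length_prefixed(const uint8_t *payload, uint32_t payload_length,
                                    uint8_t *out, size_t out_capacity) {
    assert(payload != NULL && out != NULL);
    if (out_capacity < (size_t)payload_length + 4) return 0;
    store_be32(out, payload_length);
    memcpy(out + 4, payload, payload_length);
    return (size_t)payload_length + 4;
}

/* Overwrites a leading 4-byte start code with the payload length.
 * Returns 1 on success, 0 if the buffer does not begin with 00 00 00 01. */
static int replace_start_code_in_place(uint8_t *nalu, size_t total_length) {
    assert(nalu != NULL);
    if (total_length < 5) return 0;
    if (nalu[0] != 0 || nalu[1] != 0 || nalu[2] != 0 || nalu[3] != 1) return 0;
    store_be32(nalu, (uint32_t)(total_length - 4));
    return 1;
}
```

The copying variant costs one memcpy per NALU but handles streams that mix 3- and 4-byte start codes; the in-place variant avoids the copy when the encoder is known to emit 4-byte start codes.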

Decoding

Before decoding, the NALU data, including its leading 4-byte length, has to be wrapped in a CMSampleBuffer, together with the video_format created at the beginning

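    // data holds one length-prefixed NALU. kCFAllocatorNull below means the block buffer only
    // references this memory, so it must stay valid until the frame has been decoded.
    uint8_t* data;
    uint32_t data_length;
    CMBlockBufferRef block_buffer = NULL;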
    OSStatus status = CMBlockBufferCreateWithMemoryBlock(kCFAllocatorDefault,
                                                         data,
                                                         data_length,
                                                         kCFAllocatorNull,
                                                         NULL,
                                                         0,
                                                         data_length,
                                                         0,
                                                         &block_buffer);
    if (status != noErr) {
        // error logic
        return;
    }
    
    CMSampleBufferRef sample_buffer = NULL;
    status = CMSampleBufferCreate(kCFAllocatorDefault,
                                  block_buffer,
                                  true,
                                  nullptr,
                                  nullptr,
                                  video_format,
                                  1,
                                  0,
                                  nullptr,
                                  0,
                                  nullptr,
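                                  &sample_buffer);
    CFRelease(block_buffer);  // on success the sample buffer holds its own reference
    if (status != noErr) {
        // error logic
        return;
    }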

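Now decode. Since iOS 9 the decode callback can be supplied as a block, which is far more convenient than a static function. The decoded frame is delivered as a CVImageBufferRef (CVPixelBufferRef is a typedef of the same type). Once you have it, you can move on to the next stage, whether that is image processing or rendering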

    
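    VTDecodeFrameFlags decode_flags = kVTDecodeFrame_EnableAsynchronousDecompression;
    OSStatus status = VTDecompressionSessionDecodeFrameWithOutputHandler(decode_session_,
                                                                         sample_buffer,
                                                                         decode_flags,
                                                                         nullptr,
                                                                         ^(OSStatus status,
                                                                           VTDecodeInfoFlags infoFlags,
                                                                           CVImageBufferRef imageBuffer,
                                                                           CMTime presentationTimeStamp,
                                                                           CMTime presentationDuration) {
        // infoFlags is a bit mask, so test the dropped-frame bit instead of comparing for equality
        if ((status != noErr) || (infoFlags & kVTDecodeInfo_FrameDropped) || (imageBuffer == NULL)) {
            // The current frame failed to decode or was dropped
            return;
        }
        // Continue with the CVImageBufferRef from here
    });
    CFRelease(sample_buffer);  // the caller no longer needs it once the frame has been submitted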

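Releasing resources

// Wait for frames that are still being decoded asynchronously before tearing down the session
VTDecompressionSessionWaitForAsynchronousFrames(decode_session_);
VTDecompressionSessionInvalidate(decode_session_);
CFRelease(decode_session_);
CFRelease(video_format);  // created with a Create function earlier, so the caller owns it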

Closing notes

That is everything for this article, which explained in detail how to implement hardware video decoding on iOS

More articles in the series are on the way, so stay tuned

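If you found this article helpful, you are welcome to follow our WeChat official account 声知视界, which regularly publishes introductory, fundamentals, and industry-news articles centered on audio/video and mobile technology.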
