FSPlayer supports rendering tile-grid HEIC images

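I have an image close to 4K resolution, but FSPlayer could only show a small part of it, because the decoded size was 512x512. After some digging I learned that HEIC images (especially those shot on iOS) are usually not stored as one huge picture. To make decoding more efficient they are split into multiple 512x512 tiles, so FSPlayer was decoding and rendering just one of those tiles.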
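To make the tiling concrete, here is a minimal sketch of the arithmetic in C, assuming a uniform grid of 512x512 tiles laid out from the top-left corner. The names `TileGridDims`, `tiles_along` and `grid_for` are made up for illustration and are not part of FSPlayer or FFmpeg.

```c
#include <assert.h>

/* Hypothetical helper type, for illustration only. */
typedef struct TileGridDims {
    int cols;      /* tiles per row */
    int rows;      /* tiles per column */
    int canvas_w;  /* cols * tile_w, includes padding on the right */
    int canvas_h;  /* rows * tile_h, includes padding at the bottom */
} TileGridDims;

/* Number of tiles needed to cover `extent` pixels (ceiling division). */
static int tiles_along(int extent, int tile)
{
    assert(extent > 0 && tile > 0);
    return (extent + tile - 1) / tile;
}

/* Grid dimensions for an image of image_w x image_h pixels. */
static TileGridDims grid_for(int image_w, int image_h, int tile_w, int tile_h)
{
    TileGridDims d;
    d.cols = tiles_along(image_w, tile_w);
    d.rows = tiles_along(image_h, tile_h);
    d.canvas_w = d.cols * tile_w;
    d.canvas_h = d.rows * tile_h;
    return d;
}
```

Under that assumption, `grid_for(3464, 2130, 512, 512)` for the 3464x2130 test image used in this post yields a 7x5 grid on a 3584x2560 canvas. The canvas is larger than the image, so the last column and last row of tiles carry padding.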

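FFmpeg's support for HEIC

As early as FFmpeg 4, FSPlayer supported HEIC by applying a patch, and that remained the case through FFmpeg 6. FFmpeg 7 added official support for the format, but it does not composite the tiled HEIC described above into a single image.

This is the error from a pre-7 build without the patch: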

ffmpeg6 -i /Users/matt/Desktop/常用测试资源/图片/heic/image2.heic -map 0 output.png
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x7f92527078c0] moov atom not found
[in#0 @ 0x7f9252707780] Error opening input: Invalid data found when processing input
Error opening input file /Users/matt/Desktop/常用测试资源/图片/heic/image2.heic.

This is how FFmpeg 7 handles the tiled HEIC. Saving to output_%d.png writes out every tile; without %d only the first tile is written:

ffmpeg7 -i /Users/matt/Desktop/常用测试资源/图片/heic/image2.heic -map 0 output_%d.png
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/Users/matt/Desktop/常用测试资源/图片/heic/image2.heic':
  Metadata:
    major_brand     : heic
    minor_version   : 0
    compatible_brands: mif1heic
  Duration: N/A, start: 0.000000, bitrate: N/A
  Stream group #0:0[0x24]: Tile Grid: hevc (Main) (hvc1 / 0x31637668), yuvj420p(pc), 3464x2130 (default)
Stream mapping:
  Stream #0:0 -> #0:0 (hevc (native) -> png (native))
  Stream #0:1 -> #0:1 (hevc (native) -> png (native))
  Stream #0:2 -> #0:2 (hevc (native) -> png (native))
  Stream #0:3 -> #0:3 (hevc (native) -> png (native))
  Stream #0:4 -> #0:4 (hevc (native) -> png (native))
  Stream #0:5 -> #0:5 (hevc (native) -> png (native))
...
Output #0, image2, to 'output_%d.png':
  Metadata:
    major_brand     : heic
    minor_version   : 0
    compatible_brands: mif1heic
    encoder         : Lavf61.7.100
  Stream #0:0: Video: png, rgb24(pc, gbr/unknown/unknown, progressive), 512x512, q=2-31, 200 kb/s, 1 fps, 1 tbn (default) (dependent)
      Metadata:
        encoder         : Lavc61.19.100 png
      Side data:
        ICC Profile
  Stream #0:1: Video: png, rgb24(pc, gbr/unknown/unknown, progressive), 512x512, q=2-31, 200 kb/s, 1 fps, 1 tbn (dependent)
      Metadata:
        encoder         : Lavc61.19.100 png
      Side data:
        ICC Profile

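This shows that FFmpeg 7 can read and decode tiled HEIC but has no logic for compositing the tiles. The next step is to implement the decoding and the tile merging in FSPlayer.

Reworking FSPlayer's packet-reading logic

The first step is to let FSPlayer read all the packets. Since only one packet was being read, tiled HEIC clearly does not follow the normal packet-reading path: the grid tiles are treated as one group and placed in stream_groups. The code below puts all of the packets into the decode queue: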

if (ic->nb_stream_groups > 0) {
    for (unsigned int i = 0; i < ic->nb_stream_groups; i++) {
        AVStreamGroup *group = ic->stream_groups[i];
        // 1. Get the number of streams contained in this group
        unsigned int count = group->nb_streams;

        av_log(NULL, AV_LOG_INFO,
                "Group %u (Type: %d) contains %u streams.\n",
                i, group->type, count);

        // 2. Only run the tile-assembly logic when the group type is Tile Grid
        if (group->type == AV_STREAM_GROUP_PARAMS_TILE_GRID) {
            AVStreamGroupTileGrid *grid = group->params.tile_grid;
            av_log(NULL, AV_LOG_INFO,
                    "  Tile grid: nb_tiles=%u, canvas=%dx%d, roi=(%d,%d %dx%d)\n",
                    grid->nb_tiles, grid->coded_width, grid->coded_height,
                    grid->horizontal_offset, grid->vertical_offset,
                    grid->width, grid->height);

            // 3. Find the position j of pkt->stream_index inside the group
            int group_stream_idx = -1;
            for (unsigned int j = 0; j < group->nb_streams; j++) {
                if (group->streams[j]->index == pkt->stream_index) {
                    group_stream_idx = (int)j;
                    break;
                }
            }
            if (group_stream_idx < 0) {
                // The packet does not belong to this group, skip it
                continue;
            }

            // 4. Find the matching tile through grid->offsets[].idx
            int tile_index = -1;
            int tile_x = 0, tile_y = 0;
            for (unsigned int t = 0; t < grid->nb_tiles; t++) {
                if ((int)grid->offsets[t].idx == group_stream_idx) {
                    tile_index = (int)t;
                    tile_x = grid->offsets[t].horizontal;
                    tile_y = grid->offsets[t].vertical;
                    break;
                }
            }
            if (tile_index < 0) {
                continue;
            }

            AVStream *tile_st = group->streams[group_stream_idx];

            // 5. Attach the tile metadata to the packet (opaque_ref)
            AVBufferRef *meta_buf = av_buffer_alloc(sizeof(FSTileGridMetadata));
            if (meta_buf) {
                FSTileGridMetadata *meta = (FSTileGridMetadata *)meta_buf->data;
                meta->tile_index = tile_index;
                meta->nb_tiles   = (int)grid->nb_tiles;
                meta->canvas_w   = grid->coded_width;
                meta->canvas_h   = grid->coded_height;
                meta->tile_x     = tile_x;
                meta->tile_y     = tile_y;
                meta->tile_w     = tile_st->codecpar ? tile_st->codecpar->width  : 0;
                meta->tile_h     = tile_st->codecpar ? tile_st->codecpar->height : 0;

                // Release any opaque_ref attached earlier so repeated puts do not stack up
                if (pkt->opaque_ref) {
                    av_buffer_unref(&pkt->opaque_ref);
                }
                pkt->opaque_ref = meta_buf;
            }

            av_log(NULL, AV_LOG_DEBUG,
                    "put tile packet: group=%u stream=%d tile_idx=%d pos=(%d,%d)\n",
                    i, pkt->stream_index, tile_index, tile_x, tile_y);

            packet_queue_put(&is->videoq, pkt);
        }
    }
    break;
}
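The code above relies on FSTileGridMetadata, whose definition is not shown in this post. A plausible layout, inferred only from the fields assigned above, together with an FFmpeg-free model of step 4, might look like the sketch below. `TileOffset` is a stand-in for the entries of `grid->offsets`, and `find_tile` is a hypothetical helper. The fill function shown later also reads `tmeta->w` and `tmeta->h`, which are not assigned in the snippet above, so the real struct presumably carries more fields than this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed layout, inferred from the assignments in the snippet above;
 * the real FSPlayer definition may differ. */
typedef struct FSTileGridMetadata {
    int tile_index;          /* index into the grid's offsets array */
    int nb_tiles;            /* total number of tiles in the grid */
    int canvas_w, canvas_h;  /* coded canvas size, padding included */
    int tile_x, tile_y;      /* top-left corner of the tile on the canvas */
    int tile_w, tile_h;      /* decoded size of this tile */
} FSTileGridMetadata;

/* Stand-in for one entry of grid->offsets (idx, horizontal, vertical). */
typedef struct TileOffset {
    unsigned int idx;   /* position of the tile's stream inside the group */
    int horizontal;     /* x offset on the canvas */
    int vertical;       /* y offset on the canvas */
} TileOffset;

/* Returns the tile index whose offset entry refers to group_stream_idx,
 * or -1 if the grid does not use that stream. On success the tile's
 * canvas position is written to tile_x / tile_y when they are non-NULL. */
static int find_tile(const TileOffset *offsets, unsigned int nb_tiles,
                     int group_stream_idx, int *tile_x, int *tile_y)
{
    assert(offsets != NULL || nb_tiles == 0);
    for (unsigned int t = 0; t < nb_tiles; t++) {
        if ((int)offsets[t].idx == group_stream_idx) {
            if (tile_x) *tile_x = offsets[t].horizontal;
            if (tile_y) *tile_y = offsets[t].vertical;
            return (int)t;
        }
    }
    return -1;
}
```

Keeping the lookup separate from the packet handling makes it easy to test without a real HEIC file.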

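Reworking the decode logic: accumulate many times, submit once

Previously, the decode path filled every frame into an SDL_VoutOverlay and enqueued it for the render thread to dequeue and draw. With the packet-reading change above, the result looked like playing a GIF: each tile was shown in turn. So something has to merge the N tiles into one complete image, and there are two options:

  1. The decode thread allocates an image of the target size and, after decoding each tile, overwrites the matching region using the tile's position. Once every tile has been written, the big image is sent to the rendering module. The advantage is that the rendering module needs no changes; the drawbacks are that the filling happens on the CPU and rendering may need a very large texture.
  2. The decode thread only collects the tiles and keeps their position information without compositing. Once all tiles are decoded, the tile array is handed to the rendering module. The drawback is that the rendering module may need substantial changes to support drawing region by region in a loop; the advantages are that the filling happens on the GPU, the merge copy is avoided, and CPU usage stays low.

I chose option 2: the rendering module is well encapsulated and easy to adapt, and I did not want high CPU usage or one huge contiguous block of memory.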
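For comparison, option 1 boils down to a clipped, row-by-row copy per plane. The sketch below assumes a single 8-bit plane and uses a hypothetical helper name, `blit_plane`. It is not FSPlayer code; it only shows the CPU work and the full-size canvas that option 2 avoids.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy one 8-bit tile plane into a larger canvas plane at (x, y),
 * clipping anything that falls outside the canvas.
 * Strides are in bytes; for an 8-bit plane one byte is one pixel. */
static void blit_plane(uint8_t *canvas, int canvas_w, int canvas_h, int canvas_stride,
                       const uint8_t *tile, int tile_w, int tile_h, int tile_stride,
                       int x, int y)
{
    assert(canvas && tile && x >= 0 && y >= 0);
    if (x >= canvas_w || y >= canvas_h)
        return; /* tile lies completely outside the canvas */

    /* Clip the copied area to the canvas bounds. */
    int copy_w = tile_w < canvas_w - x ? tile_w : canvas_w - x;
    int copy_h = tile_h < canvas_h - y ? tile_h : canvas_h - y;

    for (int row = 0; row < copy_h; row++) {
        memcpy(canvas + (size_t)(y + row) * canvas_stride + x,
               tile + (size_t)row * tile_stride,
               (size_t)copy_w);
    }
}
```

Assuming 8-bit 4:2:0 data, a 3584x2560 canvas needs 3584 × 2560 × 1.5 = 13,762,560 bytes, roughly 13 MiB, in addition to one such copy per plane for every tile.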

Recall the core flow by which the decode thread enqueues an AVFrame:

ffplay_video_thread {
    for (;;) {
        get_video_frame(ffp, frame);
        queue_picture(ffp, frame, pts, duration, pos, is->viddec.pkt_serial);
    }
}

queue_picture {
    if (!(vp = frame_queue_peek_writable(&is->pictq)))
            return -1;
    /* get a pointer on the bitmap */
    SDL_VoutLockYUVOverlay(vp->bmp);

    // FIXME: set swscale options
    if (SDL_VoutFillFrameYUVOverlay(vp->bmp, src_frame) < 0) {
        av_log(NULL, AV_LOG_FATAL, "Cannot initialize the conversion context\n");
        return -3;
    }

    /* update the bitmap content */
    SDL_VoutUnlockYUVOverlay(vp->bmp);

    vp->pts = pts;
    vp->duration = duration;
    vp->pos = pos;
    vp->frame_serial = serial;
    vp->sar = src_frame->sample_aspect_ratio;
    vp->bmp->sar_num = vp->sar.num;
    vp->bmp->sar_den = vp->sar.den;
    vp->bmp->fps = ffp->stat.vfps_probe;
    frame_queue_push(&is->pictq);
}

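What we want is for the overlay to accumulate all the frames in the group and be sent once at the end. Given how the frame queue is designed, as long as frame_queue_push is not called, the writable slot, and therefore the overlay we get, stays the same. So it is enough to add a pending check after SDL_VoutFillFrameYUVOverlay and before the push: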

// FIXME: set swscale options
if (SDL_VoutFillFrameYUVOverlay(vp->bmp, src_frame) < 0) {
    av_log(NULL, AV_LOG_FATAL, "Cannot initialize the conversion context\n");
    return -3;
}

/* HEIC tile grid: while the overlay is still accumulating tiles, do not
 * push it to the render queue; keep the writable slot so the next tile
 * can be written into it.
 */
int isPending = SDL_VoutOverlay_IsTilePending(vp->bmp);

/* update the bitmap content */
SDL_VoutUnlockYUVOverlay(vp->bmp);

if (isPending) {
    return 0;
}

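Once all the packets in stream_groups have been decoded, the SDL_VoutOverlay holding every frame is put into the queue, which in turn triggers the render thread to draw it.

Implementing the accumulation inside the overlay is then fairly simple. Note that two paths need adapting, hardware decoding and software decoding; the software path is shown here: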

typedef struct FSTileSlot {
    CVPixelBufferRef pb;   // copied tile CVPixelBuffer (owned)
    int x, y;              // tile position on the canvas
    int w, h;              // tile size
    int filled;            // whether the slot has been filled
} FSTileSlot;

struct SDL_VoutOverlay_Opaque {
    SDL_mutex *mutex;
    Uint16 pitches[AV_NUM_DATA_POINTERS];

    CVPixelBufferRef pixelBuffer;
    CVPixelBufferPoolRef pixelBufferPool;

    /* HEIC tile grid mode */
    int         tile_mode;       // 1 while tiles are being accumulated
    int         tile_expected;   // expected total (grid->nb_tiles)
    int         tile_received;   // tiles received and stored in slots so far
    int         tile_ready;      // 1 once all tiles are gathered and displayable
    int         tile_canvas_w;
    int         tile_canvas_h;
    FSTileSlot *tiles;           // array of length tile_expected
};

static int func_is_tile_pending(SDL_VoutOverlay *overlay)
{
    if (!overlay) return 0;
    SDL_VoutOverlay_Opaque *opaque = overlay->opaque;
    if (!opaque || !opaque->tile_mode) return 0;
    return opaque->tile_ready ? 0 : 1;
}

static int func_get_tile_count(SDL_VoutOverlay *overlay)
{
    if (!overlay) return 0;
    SDL_VoutOverlay_Opaque *opaque = overlay->opaque;
    if (!opaque || !opaque->tile_mode) return 0;
    return opaque->tile_received;
}

static int func_get_tile_buffers(SDL_VoutOverlay *overlay,
                                 CVPixelBufferRef *out_buffers,
                                 int *out_x, int *out_y,
                                 int *out_w, int *out_h,
                                 int max_count)
{
    if (!overlay) return 0;
    SDL_VoutOverlay_Opaque *opaque = overlay->opaque;
    if (!opaque || !opaque->tile_mode || !opaque->tiles) return 0;
    int n = opaque->tile_expected < max_count ? opaque->tile_expected : max_count;
    int k = 0;
    for (int i = 0; i < n; i++) {
        FSTileSlot *slot = &opaque->tiles[i];
        if (!slot->filled || !slot->pb) continue;
        if (out_buffers) out_buffers[k] = slot->pb;
        if (out_x) out_x[k] = slot->x;
        if (out_y) out_y[k] = slot->y;
        if (out_w) out_w[k] = slot->w;
        if (out_h) out_h[k] = slot->h;
        k++;
    }
    return k;
}

static int func_fill_avframe_to_cvpixelbuffer(SDL_VoutOverlay *overlay, const AVFrame *frame)
{
    if (!overlay || !frame)
        return -100;

    SDL_VoutOverlay_Opaque *opaque = overlay->opaque;

    /* ---------- HEIC tile grid branch ---------- */
    FSTileGridMetadata *tmeta = NULL;
    if (frame->opaque_ref && frame->opaque_ref->size >= (int)sizeof(FSTileGridMetadata)) {
        tmeta = (FSTileGridMetadata *)frame->opaque_ref->data;
        if (tmeta->nb_tiles <= 0 || tmeta->canvas_w <= 0 || tmeta->canvas_h <= 0) {
            tmeta = NULL; // invalid metadata, fall back to the single-frame path
        }
    }

    if (tmeta) {
        // First entry into tile mode (or the grid changed): initialize the slots
        if (!opaque->tile_mode ||
            opaque->tile_expected != tmeta->nb_tiles ||
            opaque->tile_canvas_w != tmeta->canvas_w ||
            opaque->tile_canvas_h != tmeta->canvas_h) {

            // Clean up anything left over from a previous grid
            tile_slots_free(opaque);
            if (opaque->pixelBuffer) {
                CVPixelBufferRelease(opaque->pixelBuffer);
                opaque->pixelBuffer = NULL;
            }

            opaque->tile_mode     = 1;
            opaque->tile_expected = tmeta->nb_tiles;
            opaque->tile_received = 0;
            opaque->tile_ready    = 0;
            opaque->tile_canvas_w = tmeta->canvas_w;
            opaque->tile_canvas_h = tmeta->canvas_h;
            opaque->tiles = (FSTileSlot *)calloc((size_t)tmeta->nb_tiles, sizeof(FSTileSlot));
            if (!opaque->tiles) {
                ALOGE("tile_mode: allocate tiles array failed");
                opaque->tile_expected = 0;
                opaque->tile_mode     = 0;
                return -100;
            }

            overlay->is_tile_grid   = 1;
            overlay->tile_canvas_w  = tmeta->canvas_w;
            overlay->tile_canvas_h  = tmeta->canvas_h;
            overlay->w              = tmeta->w;
            overlay->h              = tmeta->h;
        }

        int idx = tmeta->tile_index;
        if (idx < 0 || idx >= opaque->tile_expected) {
            ALOGE("tile_mode: invalid tile_index %d (expected<%d)", idx, opaque->tile_expected);
            return 0; // ignore and keep accumulating
        }

        FSTileSlot *slot = &opaque->tiles[idx];
        // If the slot is already occupied (a repeated put), release the old buffer first
        if (slot->pb) {
            CVPixelBufferRelease(slot->pb);
            slot->pb = NULL;
            slot->filled = 0;
            if (opaque->tile_received > 0) opaque->tile_received--;
        }

        // A tile's resolution may not match the pool, so bypass the pool
        CVPixelBufferRef pb = createCVPixelBufferFromAVFrame(frame, NULL);
        if (!pb) {
            ALOGE("tile_mode: createCVPixelBufferFromAVFrame failed for tile %d", idx);
            return 0;
        }
        slot->pb     = pb;
        slot->x      = tmeta->tile_x;
        slot->y      = tmeta->tile_y;
        slot->w      = tmeta->tile_w > 0 ? tmeta->tile_w : frame->width;
        slot->h      = tmeta->tile_h > 0 ? tmeta->tile_h : frame->height;
        slot->filled = 1;
        opaque->tile_received++;

        ALOGD("tile_mode: received tile %d/%d at (%d,%d) %dx%d",
              opaque->tile_received, opaque->tile_expected,
              slot->x, slot->y, slot->w, slot->h);

        // Keep pitches at a sensible value; the renderer no longer uses overlay->pitches
        overlay->pitches[0] = CVPixelBufferGetWidth(pb);

        if (opaque->tile_received >= opaque->tile_expected) {
            opaque->tile_ready = 1;
            ALOGI("tile_mode: all %d tiles gathered, canvas=%dx%d",
                  opaque->tile_expected, opaque->tile_canvas_w, opaque->tile_canvas_h);
        }
        return 0;
    }

    /* ---------- regular single-frame path (not a tile, or opaque_ref was lost) ---------- */
    ......
    return -100;
}

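The overlay definition was changed: when the first tile arrives, the FSTileSlot array is allocated for the total tile count, and each later tile's CVPixelBufferRef is stored in its own slot. When tile_received equals tile_expected, every frame in the group has been decoded.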
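Stripped of the CVPixelBuffer handling, the accumulation is a small state machine. The following is a simplified model with hypothetical names (`TileAccumulator`, `tile_acc_init`, `tile_acc_put`) and an assumed fixed upper bound on the tile count to keep it allocation-free. It only illustrates the counting and the handling of repeated tiles, not the actual overlay code.

```c
#include <assert.h>
#include <string.h>

#define MAX_TILES 64  /* assumption for this sketch; the real code allocates dynamically */

typedef struct TileAccumulator {
    int expected;            /* total tiles in the grid */
    int received;            /* number of distinct slots filled so far */
    int filled[MAX_TILES];   /* 1 if the slot already holds a tile */
} TileAccumulator;

static void tile_acc_init(TileAccumulator *acc, int expected)
{
    assert(acc && expected > 0 && expected <= MAX_TILES);
    memset(acc, 0, sizeof(*acc));
    acc->expected = expected;
}

/* Record a decoded tile. Returns 1 while the overlay must stay pending
 * (not pushed to the picture queue) and 0 once every slot is filled.
 * Out-of-range indices are ignored, as in the code above. */
static int tile_acc_put(TileAccumulator *acc, int tile_index)
{
    if (tile_index >= 0 && tile_index < acc->expected) {
        if (!acc->filled[tile_index]) {
            acc->filled[tile_index] = 1;
            acc->received++;
        }
        /* A repeated index replaces the old tile without changing the count. */
    }
    return acc->received < acc->expected;
}
```

The return value plays the role of the pending flag: the caller keeps writing into the same overlay while it is 1 and pushes the overlay once it drops to 0.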

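Note that opaque_ref lets data pass through the decoder: attach it to pkt->opaque_ref before decoding and read it back from frame->opaque_ref afterwards. (In recent FFmpeg versions the decoder propagates it when AV_CODEC_FLAG_COPY_OPAQUE is set on the codec context.)

Adapting the rendering logic

The core idea for rendering N tiles as one big image is to draw in a loop, setting the viewport and the region to draw for each tile. Along the way I ran into black borders caused by padding.

Once the decode thread has accumulated every frame, the pending state becomes 0 and the overlay is pushed to the frame queue. The render thread then fetches and draws the overlay as follows: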

IjkMediaPlayer *ijkmp_ios_create(int (*msg_loop)(void*))
{
    IjkMediaPlayer *mp = ijkmp_create(msg_loop);
    if (!mp)
        goto fail;

    mp->ffplayer->vout = SDL_VoutIos_CreateForGLES2();
    if (!mp->ffplayer->vout)
        goto fail;

    mp->ffplayer->pipeline = ffpipeline_create_from_ios(mp->ffplayer);
    if (!mp->ffplayer->pipeline)
        goto fail;

    mp->ffplayer->aout = ffpipeline_open_audio_output(mp->ffplayer->pipeline, mp->ffplayer);
    if (!mp->ffplayer->aout)
        goto fail;
    
    return mp;
    
fail:
    ijkmp_dec_ref_p(&mp);
    return NULL;
}


is->video_refresh_tid = SDL_CreateThreadEx(&is->_video_refresh_tid, video_refresh_thread, ffp, "ff_vout");
-> video_refresh_thread
-> video_refresh
-> video_display2
-> video_image_display2 {
    Frame *vp = frame_queue_peek_last(&is->pictq);
    SDL_VoutDisplayYUVOverlay(ffp->vout, vp->bmp, sub_overlay);
    SDL_TextureOverlay_Release(&sub_overlay);
}

Inside SDL_VoutDisplayYUVOverlay, the SDL_VoutOverlay is mapped to the FSOverlayAttach type defined on the rendering side:

SDL_VoutDisplayYUVOverlay
vout->display_overlay
->vout_display_overlay
->vout_display_overlay_l {
    NSMutableArray<FSTilePiece *> *pieces = [NSMutableArray arrayWithCapacity:got];
    for (int i = 0; i < got; i++) {
        if (!bufs[i]) continue;
        FSTilePiece *p = [[FSTilePiece alloc] init];
        p.pixelBuffer = CVPixelBufferRetain(bufs[i]);
        p.x = xs[i]; p.y = ys[i];
        p.w = ws[i]; p.h = hs[i];
        [pieces addObject:p];
    }
    attach.tilePieces = pieces;
    attach.overlay = SDL_TextureOverlay_Retain(sub_overlay);
    free(bufs); free(xs); free(ys); free(ws); free(hs);
    return [gl_view displayAttach:attach];
}

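After FSMetalView receives the attach through displayAttach, the actual drawing happens in drawRect. The rough idea for multi-tile drawing is to draw once per tile, recomputing the mapping each time, that is, the screen region in which the tile is shown.

// Compute the target rectangle of the canvas inside the drawable after fitting by scalingMode + sar + rotate (viewport coordinates, origin at top-left)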
- (MTLViewport)computeCanvasViewport:(FSOverlayAttach *)attach
                        drawableSize:(CGSize)drawableSize
                               ratio:(CGSize)ratio
{
    // Same idea as encodePicture: the vertices span [-ratio.w,+ratio.w] × [-ratio.h,+ratio.h]
    // and end up mapped to [0,drawable.w]×[0,drawable.h], so the canvas rectangle on screen follows directly from ratio.
    double cw = drawableSize.width  * ratio.width;
    double ch = drawableSize.height * ratio.height;
    double cx = (drawableSize.width  - cw) * 0.5;
    double cy = (drawableSize.height - ch) * 0.5;
    return (MTLViewport){cx, cy, cw, ch, -1.0, 1.0};
}

CGSize ratio = [self computeNormalizedVerticesRatio:currentAttach drawableSize:viewport];

// Tile drawing: full-screen vertices, no cropping (each tile's viewport is its position on the canvas)
self.picturePipeline.vertexRatio = CGSizeMake(1.0, 1.0);
self.picturePipeline.textureCrop = CGSizeZero;

// First compute the target rectangle of the canvas on screen
MTLViewport display_vp = [self computeCanvasViewport:attach drawableSize:drawableSize ratio:ratio];
double display_w = attach.w;
double display_h = attach.h;

CVMetalTextureCacheRef textureCache = NULL;
#if TARGET_CPU_ARM64
textureCache = _pictureTextureCache;
#endif

for (FSTilePiece *piece in attach.tilePieces) {
    if (!piece.pixelBuffer || piece.w <= 0 || piece.h <= 0) continue;
    if (!piece.textures) {
        piece.textures = [[self class] doGenerateTexture:piece.pixelBuffer
                                            textureCache:textureCache
                                                    device:self.device];
    }
    if (!piece.textures) continue;

    // Normalized position of the tile on the canvas
    double nx = (double)piece.x / display_w;
    double ny = (double)piece.y / display_h;
    double nw = (double)piece.w / display_w;
    double nh = (double)piece.h / display_h;

    // Map to the screen
    // Note: the Metal viewport origin is at the top-left (y down), same as drawable coordinates, so compute directly
    MTLViewport tile_vp;
    tile_vp.originX = display_vp.originX + nx * display_vp.width;
    tile_vp.originY = display_vp.originY + ny * display_vp.height;
    tile_vp.width   = nw * display_vp.width;
    tile_vp.height  = nh * display_vp.height;
    tile_vp.znear   = -1.0;
    tile_vp.zfar    =  1.0;

    [renderEncoder setViewport:tile_vp];
    [self.picturePipeline uploadTextureWithEncoder:renderEncoder textures:piece.textures];
}

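One pitfall: every draw call needs its own argument buffer. Reusing a single buffer used to work because the draws were far apart, so the GPU had finished with the buffer before it was rewritten. Now the draws are issued back to back in a for loop, and the GPU has not finished the earlier draws when the buffer is overwritten. With a reused buffer the content comes out wrong, and the same image may appear at several positions.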

- (void)uploadTextureWithEncoder:(id<MTLRenderCommandEncoder>)encoder
                        textures:(NSArray*)textures
{
    [self updateVertexIfNeed];
    // Pass in the parameter data.
    [encoder setVertexBuffer:self.vertexBuffer
                      offset:0
                     atIndex:FSVertexInputIndexVertices]; // set the vertex buffer
 
    [self updateConvertMatrixBufferIfNeed];
    
    // Each draw needs its own argument buffer snapshot. Reusing one mutable buffer for
    // multiple draw calls in the same command encoder can make earlier draws observe
    // later texture bindings when the GPU executes asynchronously.
    id<MTLBuffer> drawArgumentBuffer = [_device newBufferWithLength:self.argumentEncoder.encodedLength options:0];
    [self.argumentEncoder setArgumentBuffer:drawArgumentBuffer offset:0];
    
    for (int i = 0; i < [textures count]; i++) {
        id<MTLTexture>t = textures[i];
        [self.argumentEncoder setTexture:t
                                 atIndex:FSFragmentTextureIndexTextureY + i]; // set the texture
        
        // Indicate to Metal that the GPU accesses these resources, so they need
        // to map to the GPU's address space.
        if (@available(macOS 10.15, ios 13.0, tvOS 13.0, *)) {
            [encoder useResource:t usage:MTLResourceUsageRead stages:MTLRenderStageFragment];
        } else {
            // Fallback on earlier versions
            [encoder useResource:t usage:MTLResourceUsageRead];
        }
    }
    [self.argumentEncoder setBuffer:self.convertMatrixBuff offset:0 atIndex:FSFragmentMatrixIndexConvert];
    
    // Indicate to Metal that the GPU accesses this buffer, so it needs
    // to map to the GPU's address space.
    if (@available(macOS 10.15, ios 13.0, tvOS 13.0, *)) {
        [encoder useResource:self.convertMatrixBuff usage:MTLResourceUsageRead stages:MTLRenderStageFragment];
    } else {
        // Fallback on earlier versions
        [encoder useResource:self.convertMatrixBuff usage:MTLResourceUsageRead];
    }
    
    [encoder setFragmentBuffer:drawArgumentBuffer
                        offset:0
                       atIndex:FSFragmentBufferLocation0];
    
    // Set the render pipeline state so that the vertex and fragment shaders are invoked
    [encoder setRenderPipelineState:self.renderPipeline];
    
    // Draw the triangle.
    [encoder drawPrimitives:MTLPrimitiveTypeTriangleStrip
                vertexStart:0
                vertexCount:4]; // draw
}

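With the code above, black borders appear on the right and bottom because of padding data. For my test image, the canvas formed by all the tiles is 3584x2560, while the display size is 3464x2130. The two are aligned at the top-left corner; everything beyond the display size is padding that must not be shown, otherwise it appears as black borders. The cause is that the attach.w/h used in the calculation was actually the canvas size. Scaling by that size onto the screen inevitably produces borders: the content of a large canvas (padding included) is scaled proportionally into a smaller viewport (the display region), so the right and bottom edges show the gap left by the 3584→3464 and 2560→2130 differences.

Recall how black borders are removed for regular video:

self.picturePipeline.textureCrop = CGSizeMake(1.0 * (attach.pixelW - attach.w) / attach.pixelW, 1.0 * (attach.pixelH - attach.h) / attach.pixelH);

Here pixelW/H is the size including padding and attach.w/h is the displayed image size. Following the same approach, each tile just needs its own texture crop and viewport.

Corrected code: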

// Tile drawing: full-screen vertices; textureCrop is set per tile in the loop below
self.picturePipeline.vertexRatio = CGSizeMake(1.0, 1.0);

// First compute the target rectangle of the merged image on screen
MTLViewport display_vp = [self computeCanvasViewport:attach drawableSize:drawableSize ratio:ratio];

// canvas  = 3584x2560 (pixelW, pixelH)
// display = 3464x2130 (w, h)

double display_w = attach.w;
double display_h = attach.h;

CVMetalTextureCacheRef textureCache = NULL;
#if TARGET_CPU_ARM64
textureCache = _pictureTextureCache;
#endif

for (FSTilePiece *piece in attach.tilePieces) {
    if (!piece.pixelBuffer || piece.w <= 0 || piece.h <= 0) continue;
    if (!piece.textures) {
        piece.textures = [[self class] doGenerateTexture:piece.pixelBuffer
                                            textureCache:textureCache
                                                    device:self.device];
    }
    if (!piece.textures) continue;

    // Edge handling: a tile in the last column or last row may physically include padding.
    // Compute its actually visible area, then derive a viewport and a texture crop from it.
    double valid_w = piece.w;
    if (piece.x + piece.w > display_w) {
        valid_w = display_w - piece.x;
    }
    
    double valid_h = piece.h;
    if (piece.y + piece.h > display_h) {
        valid_h = display_h - piece.y;
    }
    
    // Normalized position of the tile relative to the display size
    double nx = (double)piece.x / display_w;
    double ny = (double)piece.y / display_h;
    double nw = (double)valid_w / display_w;
    double nh = (double)valid_h / display_h;
    
    // Map to the on-screen display region
    // Note: the Metal viewport origin is at the top-left (y down), same as drawable coordinates, so compute directly
    MTLViewport tile_vp;
    tile_vp.originX = display_vp.originX + nx * display_vp.width;
    tile_vp.originY = display_vp.originY + ny * display_vp.height;
    tile_vp.width   = nw * display_vp.width;
    tile_vp.height  = nh * display_vp.height;
    tile_vp.znear   = -1.0;
    tile_vp.zfar    =  1.0;

    // Compute the crop ratio inside this tile's texture.
    // textureCrop is defined as the fraction to cut off.
    // For example, a tile 512 wide with 392 valid pixels needs (512-392)/512 cut off.
    float cropX = (float)(piece.w - valid_w) / piece.w;
    float cropY = (float)(piece.h - valid_h) / piece.h;
    self.picturePipeline.textureCrop = CGSizeMake(cropX, cropY);

    [renderEncoder setViewport:tile_vp];
    [self.picturePipeline uploadTextureWithEncoder:renderEncoder textures:piece.textures];
}
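The per-tile arithmetic in the loop above can be checked in isolation. The sketch below restates it in plain C; `TileRect` and `tile_rect_for` are hypothetical names, and the viewport is reduced to four doubles so that the code does not depend on Metal. It assumes the tile starts inside the visible area, which holds for a grid anchored at the top-left corner.

```c
#include <assert.h>

typedef struct TileRect {
    double vp_x, vp_y, vp_w, vp_h;  /* viewport of this tile, in drawable pixels */
    double crop_x, crop_y;          /* fraction of the texture cut off (right, bottom) */
} TileRect;

/* disp_*: on-screen rectangle of the whole image.
 * display_w/h: visible image size in pixels (without padding).
 * x, y, w, h: tile position and size on the canvas. */
static TileRect tile_rect_for(double disp_x, double disp_y, double disp_w, double disp_h,
                              int display_w, int display_h,
                              int x, int y, int w, int h)
{
    assert(display_w > 0 && display_h > 0 && w > 0 && h > 0);
    assert(x >= 0 && y >= 0 && x < display_w && y < display_h);

    /* Clamp the tile to the visible area; whatever is left over is padding. */
    double valid_w = (x + w > display_w) ? display_w - x : w;
    double valid_h = (y + h > display_h) ? display_h - y : h;

    TileRect r;
    /* Normalize against the display size, then scale into the on-screen rectangle. */
    r.vp_x = disp_x + (double)x / display_w * disp_w;
    r.vp_y = disp_y + (double)y / display_h * disp_h;
    r.vp_w = valid_w / display_w * disp_w;
    r.vp_h = valid_h / display_h * disp_h;
    /* Fraction of the tile texture that must not be sampled. */
    r.crop_x = (w - valid_w) / w;
    r.crop_y = (h - valid_h) / h;
    return r;
}
```

For the test image, a tile at (3072, 2048) of size 512x512 keeps only 392x82 valid pixels, so the crop fractions are 120/512 and 430/512. Interior tiles get a crop of zero and a full 512x512 share of the display.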

Other

  • https://github.com/tigranbs/test-heic-images/blob/master/
  • https://trac.ffmpeg.org/ticket/11170
  • https://trac.ffmpeg.org/ticket/6521
  • https://git.ffmpeg.org/gitweb/ffmpeg.git/blobdiff/433d18a1d99dbfca48ca1b16e38a2a032d140a7a..5ff8395e7806ad27743829b047067098c288782a:/fftools/ffmpeg_demux.c
This post is licensed by the author under CC BY 4.0.