TransNetV2 Video Shot Detection - C++ Implementation

Project Introduction

This is a C++ implementation of the TransNetV2 video shot boundary detection model, featuring dynamic input shapes, multi-threaded inference, and automatic video splitting.

Parts of the code in this project were generated with the assistance of AI tools.

Key Features

🚀 High-Performance Inference

Multi-threaded parallel processing
- 4 threads run inference simultaneously
Dynamic shape support
- Supports variable-length video input
Embedded model
- Model data is embedded directly in the executable
- No external model file is required
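
One way embedded model bytes can be consumed without touching the disk is to wrap them in an in-memory stream. The sketch below is an assumption about how such embedding could work, not this project's actual code: `kModelData` is a hypothetical stand-in for the array in model_data.cpp, and `make_model_stream` is a made-up helper name. LibTorch's `torch::jit::load` has an overload that accepts a `std::istream&`, so a stream like this could be handed to it.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical stand-in for the bytes generated into model_data.cpp.
// These four bytes are just the ZIP magic number, used as dummy data.
static const unsigned char kModelData[] = {0x50, 0x4B, 0x03, 0x04};
static const std::size_t kModelSize = sizeof(kModelData);

// Wrap a byte buffer in an in-memory binary stream (copies the bytes once).
std::istringstream make_model_stream(const unsigned char* data, std::size_t size) {
    assert(data != nullptr);
    std::string bytes(reinterpret_cast<const char*>(data), size);
    return std::istringstream(bytes, std::ios::in | std::ios::binary);
}
```

The copy into a `std::string` keeps the sketch simple; a custom `std::streambuf` over the array would avoid it at the cost of more code.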

🎬 Complete Tool Chain

transnet.exe - video shot detection inference
video_splitter.exe - automatic video splitting
transnet_dll.dll - C ABI interface library

🔧 Technology Stack

PyTorch (LibTorch) - model inference
OpenCV 4.12.0 - video processing
CMake - project build
C++17 - programming language

Project Structure

go-TransNet/
├── archive/
│   ├── src/
│   │   ├── transnet.cpp          # Main inference program
│   │   ├── transnet_dll.cpp      # DLL implementation
│   │   ├── transnet_dll.h        # DLL header
│   │   ├── video_splitter.cpp    # Video splitter
│   │   ├── model_data.cpp        # Embedded model data
│   │   ├── model_data.h          # Model data header
│   │   ├── model_to_header.py    # Model conversion script
│   │   └── test_model.py         # Test script
│   ├── libtorch/                 # PyTorch libraries
│   ├── opencv4120/               # OpenCV libraries
│   ├── install/                  # Build output
│   └── CMakeLists.txt            # CMake configuration
├── example/                      # Example videos
└── build/                        # Build directory

Quick Start

Building the Project

cd archive
mkdir build
cd build
cmake .. -G "Visual Studio 17 2022" -A x64
cmake --build . --config Release

Running Inference

# Using default output directory ./output
transnet.exe video.mp4

# Specify output directory
transnet.exe video.mp4 my_output

Video Splitting

# Using default output directory ./segments
video_splitter.exe video.mp4 output/scenes.txt

# Specify output directory
video_splitter.exe video.mp4 output/scenes.txt my_segments

Usage Examples

1. Shot Detection

// transnet.exe video.mp4
//
// Output files:
// - output/predictions.txt    # Per-frame prediction probabilities
// - output/scenes.txt         # Scene boundaries

predictions.txt format:

0.123456 0.234567  # single_pred many_pred
0.234567 0.345678
...

scenes.txt format:

0 64        # start_frame end_frame
65 154
155 244
...
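
The step from per-frame probabilities to scene ranges can be sketched as a simple threshold pass. The code below is modeled on the thresholding approach used by the reference TransNetV2 Python utilities; whether transnet.cpp does exactly the same is an assumption, as are the function name `predictions_to_scenes` and the choice of feeding it the single_pred column.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Turn per-frame transition probabilities into inclusive [start, end] frame ranges.
std::vector<std::pair<int, int>> predictions_to_scenes(
        const std::vector<float>& predictions, float threshold = 0.5f) {
    assert(threshold >= 0.0f && threshold <= 1.0f);
    std::vector<std::pair<int, int>> scenes;
    if (predictions.empty()) return scenes;

    const int n = static_cast<int>(predictions.size());
    bool prev = false;  // was the previous frame marked as a transition?
    bool cur = false;   // is the current frame marked as a transition?
    int start = 0;      // first frame of the scene being built
    for (int i = 0; i < n; ++i) {
        cur = predictions[i] > threshold;
        if (prev && !cur) start = i;                                // transition ended
        if (!prev && cur && i != 0) scenes.emplace_back(start, i);  // transition began
        prev = cur;
    }
    if (!cur) scenes.emplace_back(start, n - 1);        // close the final scene
    if (scenes.empty()) scenes.emplace_back(0, n - 1);  // every frame was a transition
    return scenes;
}
```

For example, the probabilities {0.1, 0.1, 0.9, 0.1, 0.1} produce the ranges 0-2 and 3-4, which matches the contiguous style of the scenes.txt sample above.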

2. Video Splitting

# Split video into multiple segments
video_splitter.exe video.mp4 output/scenes.txt segments

Output naming convention:

scene_0000_0_64_video.mp4    # scene_ID_startframe_endframe_originalname.mp4
scene_0001_65_154_video.mp4
...
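
The naming convention can be expressed as a small formatting helper. This is a minimal sketch rather than the code in video_splitter.cpp: the function name `scene_file_name` is hypothetical, and the 4-digit zero-padded ID is inferred from the sample names above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Build "scene_<ID>_<start>_<end>_<originalname>.mp4" with a 4-digit ID.
std::string scene_file_name(int id, int start_frame, int end_frame,
                            const std::string& video_path) {
    assert(id >= 0 && start_frame >= 0 && end_frame >= start_frame);
    // Strip directories and the extension to get the original base name.
    const std::size_t slash = video_path.find_last_of("/\\");
    std::string base = (slash == std::string::npos) ? video_path
                                                    : video_path.substr(slash + 1);
    const std::size_t dot = base.find_last_of('.');
    if (dot != std::string::npos && dot != 0) base.erase(dot);

    char prefix[64];
    std::snprintf(prefix, sizeof(prefix), "scene_%04d_%d_%d_", id, start_frame, end_frame);
    return std::string(prefix) + base + ".mp4";
}
```

Calling `scene_file_name(0, 0, 64, "video.mp4")` returns `scene_0000_0_64_video.mp4`.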

Technical Architecture

Multi-threaded Inference

┌─────────────────────────────────────────────────────────┐
│                    Input Video                          │
└─────────────────────────────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────┐
│                   Frame Splitting                        │
│         (Divide video into 4 segments)                  │
└─────────────────────────────────────────────────────────┘
                          │
      ┌──────────────┬────┴─────────┬──────────────┐
      ▼              ▼              ▼              ▼
┌───────────┐  ┌───────────┐  ┌───────────┐  ┌───────────┐
│ Thread 0  │  │ Thread 1  │  │ Thread 2  │  │ Thread 3  │
│  Frames   │  │  Frames   │  │  Frames   │  │  Frames   │
│   0-3129  │  │ 3130-6259 │  │ 6260-9389 │  │9390-12517 │
└─────┬─────┘  └─────┬─────┘  └─────┬─────┘  └─────┬─────┘
      │              │              │              │
      ▼              ▼              ▼              ▼
    ┌─────────────────────────────────────────────────┐
    │              TransNetV2 Inference                │
    │           (Parallel Model Forward)               │
    └─────────────────────────────────────────────────┘
                          │
                          ▼
    ┌─────────────────────────────────────────────────┐
    │               Result Aggregation                 │
    └─────────────────────────────────────────────────┘
                          │
                          ▼
    ┌─────────────────────────────────────────────────┐
    │         predictions.txt + scenes.txt            │
    └─────────────────────────────────────────────────┘
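
The frame-splitting step in the diagram can be sketched with ceiling division. This is an illustration under assumptions, not the project's actual partitioning code: `split_frames` is a hypothetical name, and a real implementation may also need some overlapping context frames at each boundary, which is not shown here.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Split frames [0, total_frames) into up to num_threads contiguous inclusive ranges.
std::vector<std::pair<int, int>> split_frames(int total_frames, int num_threads) {
    assert(total_frames >= 0 && num_threads > 0);
    std::vector<std::pair<int, int>> ranges;
    const int chunk = (total_frames + num_threads - 1) / num_threads;  // ceiling division
    for (int start = 0; start < total_frames; start += chunk) {
        const int end = std::min(start + chunk, total_frames) - 1;  // last range may be shorter
        ranges.emplace_back(start, end);
    }
    return ranges;
}
```

With 12518 frames and 4 threads this yields 0-3129, 3130-6259, 6260-9389, and 9390-12517, the same ranges shown in the diagram.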

Video Splitting Flow

┌─────────────────────────────────────────────────────────┐
│                    Input Video                          │
└─────────────────────────────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────┐
│                    scenes.txt                           │
│              (start_frame end_frame)                    │
└─────────────────────────────────────────────────────────┘
                          │
            ┌─────────────┼─────────────┐
            ▼             ▼             ▼
    ┌───────────┐  ┌───────────┐  ┌───────────┐
    │ Segment 0 │  │ Segment 1 │  │ Segment 2 │  ...
    │  Frame    │  │  Frame    │  │  Frame    │
    │  0-64     │  │  65-154   │  │  155-244  │
    └─────┬─────┘  └─────┬─────┘  └─────┬─────┘
          │              │              │
          ▼              ▼              ▼
    ┌─────────────────────────────────────────────────┐
    │           scene_0000_0_64_video.mp4             │
    │           scene_0001_65_154_video.mp4           │
    │           scene_0002_155_244_video.mp4          │
    └─────────────────────────────────────────────────┘
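
Reading scenes.txt is the first step of this flow. The sketch below assumes the file holds one "start_frame end_frame" pair per line, as in the format sample; `read_scenes` is a hypothetical helper, and skipping malformed lines is a design choice made here, not confirmed behavior of video_splitter.exe.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Read "start_frame end_frame" pairs, one per line; skip blank or malformed lines.
std::vector<std::pair<int, int>> read_scenes(std::istream& in) {
    assert(in.good());  // precondition: the stream is open and readable
    std::vector<std::pair<int, int>> scenes;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        int start = 0, end = 0;
        if (!(fields >> start >> end)) continue;  // line does not begin with two integers
        if (start < 0 || end < start) continue;   // reject impossible ranges
        scenes.emplace_back(start, end);
    }
    return scenes;
}
```

Passing a `std::ifstream` opened on output/scenes.txt would return one pair per segment to cut; anything after the two integers on a line is ignored.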

DLL Interface

Function List

// Create/destroy handle
void* transnet_create();
void transnet_destroy(void* handle);

// Load video or folder
int32_t transnet_load_video(void* handle, const char* video_path);
int32_t transnet_load_folder(void* handle, const char* folder_path);

// Set thread count
void transnet_set_num_threads(void* handle, int32_t num_threads);

// Run inference
int32_t transnet_run_inference(void* handle);

// Get results
TransNetResults transnet_get_results(void* handle);
TransNetScenes transnet_get_scenes(void* handle, float threshold);

// Save results
int32_t transnet_save_results(void* handle, const char* output_dir);

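Usage Example

#include "transnet_dll.h"

int main() {
    // Create a handle; every later call operates on it.
    // Return codes are ignored here for brevity.
    void* handle = transnet_create();
    if (handle == nullptr) {  // assumes a null handle signals failure
        return 1;
    }
    transnet_set_num_threads(handle, 4);        // use 4 inference threads
    transnet_load_video(handle, "video.mp4");   // select the input video
    transnet_run_inference(handle);             // run shot detection
    transnet_save_results(handle, "./output");  // write results to ./output
    transnet_destroy(handle);                   // release the handle
    return 0;
}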
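
Because the handle must always be released with `transnet_destroy`, C++ callers may prefer to tie it to a scope. The sketch below shows one possible RAII wrapper; `TransNetDeleter`, `TransNetHandle`, and `make_transnet` are hypothetical names, and the two `extern "C"` functions are stand-ins defined only so the example compiles on its own. A real build would include transnet_dll.h and link against the DLL instead.

```cpp
#include <cassert>
#include <memory>

// Stand-ins so the sketch compiles alone; NOT the real DLL implementation.
static int g_live_handles = 0;
extern "C" void* transnet_create() { ++g_live_handles; return new int(0); }
extern "C" void transnet_destroy(void* handle) {
    if (handle) { --g_live_handles; delete static_cast<int*>(handle); }
}

// Deleter that forwards to the library's own destroy function.
struct TransNetDeleter {
    void operator()(void* handle) const { transnet_destroy(handle); }
};
using TransNetHandle = std::unique_ptr<void, TransNetDeleter>;

// Create a handle that is destroyed automatically when it goes out of scope.
TransNetHandle make_transnet() {
    TransNetHandle handle(transnet_create());
    assert(handle && "assumes a null handle means creation failed");
    return handle;
}
```

The raw pointer needed by the other functions is available through `handle.get()`, for example `transnet_run_inference(handle.get())`.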

Performance Optimization

Memory Optimization

Video frames are no longer loaded into memory all at once.
Each thread independently opens the video file and reads only the frames it needs.

Parallel Strategy

4 threads process different video segments simultaneously.
Concurrency is managed with std::thread and std::atomic.
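
The combination of std::thread and std::atomic can be illustrated as follows. This is a minimal sketch, assuming one worker per frame range and a shared atomic progress counter; `process_segments` is a hypothetical name, and the per-frame work is a placeholder where real code would decode frames and run the model.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <utility>
#include <vector>

// Process each inclusive frame range on its own thread and
// return the total number of frames handled across all threads.
long process_segments(const std::vector<std::pair<int, int>>& ranges) {
    std::atomic<long> frames_done{0};
    std::vector<std::thread> workers;
    workers.reserve(ranges.size());
    for (const auto& range : ranges) {
        assert(range.first <= range.second);
        workers.emplace_back([range, &frames_done] {
            for (int f = range.first; f <= range.second; ++f) {
                // Placeholder for per-frame work; only the shared counter is updated.
                frames_done.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& w : workers) w.join();  // wait for every segment before aggregating
    return frames_done.load();
}
```

The atomic counter lets all workers report progress without a mutex, and joining every thread before reading it ensures the final value is complete.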

Encoder Selection

The H.264/AVC encoder (avc1) is used for compatibility.
The original video's FPS and resolution are preserved.
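
The avc1 identifier is a FOURCC code, four characters packed into one 32-bit integer. The sketch below reproduces that packing with the standard library only; it mirrors the value that OpenCV's `cv::VideoWriter::fourcc('a', 'v', 'c', '1')` computes, but the standalone `fourcc` function here is written for illustration and is not taken from this project.

```cpp
#include <cstdint>

// Pack four characters into a FOURCC code, first character in the lowest byte.
constexpr std::uint32_t fourcc(char c1, char c2, char c3, char c4) {
    return (static_cast<std::uint32_t>(static_cast<unsigned char>(c1))) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(c2)) << 8) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(c3)) << 16) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(c4)) << 24);
}

// 'a' = 0x61, 'v' = 0x76, 'c' = 0x63, '1' = 0x31
static_assert(fourcc('a', 'v', 'c', '1') == 0x31637661u, "avc1 packs to 0x31637661");
```

Whether a given machine can actually encode avc1 depends on the codecs available to its OpenCV build.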

Dependencies

Required

LibTorch 2.0+ - PyTorch C++ library
OpenCV 4.12.0 - computer vision library
CMake 3.18+ - build tool
Visual Studio 2022 or an equivalent compiler

Model Embedding

model_to_header.py converts the PyTorch model to a C++ header file.
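
The idea behind such a conversion is to render the model file's bytes as a C++ array definition. The actual script is written in Python and its output layout is not shown here, so the sketch below is only an illustration of the technique in C++: `bytes_to_cpp_array`, the `_size` suffix, and the exact formatting are all assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Render bytes as C++ source text defining an array and its size.
std::string bytes_to_cpp_array(const std::string& name,
                               const std::vector<unsigned char>& bytes) {
    assert(!name.empty() && !bytes.empty());  // an empty array would not compile
    std::string out = "const unsigned char " + name + "[] = {";
    char hex[8];
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        std::snprintf(hex, sizeof(hex), "0x%02X", static_cast<unsigned>(bytes[i]));
        out += (i == 0 ? "" : ", ");  // comma-separate every element after the first
        out += hex;
    }
    out += "};\nconst unsigned long long " + name + "_size = " +
           std::to_string(bytes.size()) + ";\n";
    return out;
}
```

Writing the returned text to a source file and compiling it into the program is what makes an external model file unnecessary at run time.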

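File Descriptions

File	Role	Description
`transnet.cpp`	Main inference program	Main shot detection program with multi-threaded inference
`transnet_dll.cpp`	DLL implementation	C ABI-compatible dynamic link library implementation
`video_splitter.cpp`	Video splitter	Reads scenes.txt and splits the video into segments
`model_data.cpp`	Embedded model	Embedded TransNetV2 model data
`model_to_header.py`	Model converter	Script that converts the PyTorch model to a C++ header file
`test_model.py`	Test script	Model testing and validation script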

License

This project is based on the original TransNetV2 project and follows the same license.

Version History

v1.0 (2026-05-12)

✅ Dynamic input shape support
✅ 4-thread parallel inference
✅ Model embedded in executable
✅ Automatic video splitting
✅ C ABI DLL interface
✅ Bilingual documentation (Chinese/English)

Contact

For questions or suggestions, please submit an Issue.
