• Author: 老汪软件技巧
  • Published: 2024-11-16 00:02

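The tool described in this article is packaged as an exe: download it, unzip it, and double-click app.exe to run it. Read on for usage details and how it works.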

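Andrew Ng's "reflective three-step translation method" is very effective: it has the model review its own translation and propose improvements, which further raises translation quality. Applying the method directly to subtitles in SRT format, however, runs into some challenges.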

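Special requirements of the SRT subtitle format

SRT subtitles have strict format requirements:

Each entry consists of a sequence number, a timing line, and the subtitle text. Consecutive entries are separated by a blank line.

Example: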

1
00:00:01,950 --> 00:00:04,430
五老星系中发现了有几分子,

2
00:00:04,720 --> 00:00:06,780
我们离第三类接触还有多元。

3
00:00:07,260 --> 00:00:09,880
微博真是展开拍摄任务已经进来周年,

4
00:00:10,140 --> 00:00:12,920
最近也传过来许多过去难以拍摄到的照片。
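The structure shown in the example can be checked mechanically. The following is a minimal sketch of an SRT parser, assuming entries are separated by a blank line and that each entry has exactly one timing line; the names `Cue` and `parse_srt` are illustrative and do not come from the packaged tool.

```python
import re
from dataclasses import dataclass
from typing import List

# One timing line, e.g. "00:00:01,950 --> 00:00:04,430".
TIMING_RE = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}$")


@dataclass
class Cue:
    index: int   # sequence number
    timing: str  # "start --> end" line, kept verbatim
    text: str    # subtitle text (may span several lines)


def parse_srt(content: str) -> List[Cue]:
    """Split SRT text into cues; raise ValueError on a malformed block."""
    normalized = content.replace("\r\n", "\n").strip()
    if not normalized:
        return []
    cues = []
    # Blocks are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        lines = [line.strip() for line in block.split("\n")]
        if len(lines) < 3 or not lines[0].isdigit() or not TIMING_RE.match(lines[1]):
            raise ValueError(f"malformed SRT block: {block!r}")
        cues.append(Cue(int(lines[0]), lines[1], "\n".join(lines[2:])))
    return cues
```

Parsing the source file this way before sending it to the model gives a reference list of sequence numbers and timing lines that the translated result can later be compared against.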

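Common problems in SRT translation

When an AI model translates SRT subtitles, the following problems can occur:

Translation quality problems:

Common error examples:

[Screenshots: examples of erroneous output]

As these examples show, when two consecutive entries grammatically belong to one sentence, the model is likely to translate them as a single entry, so the result has fewer entries than the source.

Format errors directly block any downstream process that depends on the SRT file. The kinds of errors and how often they occur vary by model. Generally, the more capable the model, the more likely it is to return valid content that meets the requirements, while small locally deployed models are almost unusable.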
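Because a malformed result breaks downstream steps, it can help to compare the translated text against the source before using it. Below is a hedged sketch of such a check. It only compares timing lines, on the assumption that the translation must keep them unchanged. The function names are hypothetical and not taken from the tool.

```python
import re
from typing import List

# Matches a whole timing line; [ \t]*\r? tolerates trailing spaces and CRLF.
TIMING_LINE_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3})[ \t]*\r?$",
    re.MULTILINE,
)


def timing_lines(srt_text: str) -> List[str]:
    """Return every timing line found in the text, in order."""
    return TIMING_LINE_RE.findall(srt_text)


def check_translation(source: str, translated: str) -> List[str]:
    """Return a list of human-readable problems; empty means the check passed."""
    problems = []
    src, dst = timing_lines(source), timing_lines(translated)
    if len(src) != len(dst):
        problems.append(
            f"entry count differs: source has {len(src)}, translation has {len(dst)}"
        )
    # Compare pairwise as far as both lists go.
    for number, (expected, actual) in enumerate(zip(src, dst), start=1):
        if expected != actual:
            problems.append(
                f"entry {number}: timing {actual!r} does not match source {expected!r}"
            )
    return problems
```

A merged pair of entries shows up as a count mismatch, which is the failure described above. What to do next (retry the request, or fall back to another approach) is a separate design decision that this sketch leaves open.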

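Still, given how much the three-step reflection method improves translation quality, I tried it anyway. I settled on gemini-1.5-flash for a small experiment, mainly because it is capable enough and free; apart from frequent rate limiting, it has almost no other restrictions.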

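Approach to writing the prompt

The prompt follows Andrew Ng's three-step reflection workflow.

The difference is a stronger requirement that the returned content must be valid SRT, although the model does not always comply.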
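To make the idea concrete, here is a rough sketch of how such a prompt might be assembled. It is not the prompt used by the tool: the wording, the `build_prompt` function, and the `final_translation` and `source_srt` tag names are all assumptions for illustration.

```python
def build_prompt(srt_text: str, target_language: str, tag: str = "final_translation") -> str:
    """Assemble a three-step (translate, reflect, refine) prompt for SRT text.

    The tag name is a hypothetical placeholder for whatever wrapper the
    real prompt asks the model to use around the final result.
    """
    return f"""You are a professional subtitle translator.
Translate the SRT subtitles below into {target_language} in three steps.

Step 1: Produce an initial translation of every entry.
Step 2: Review the initial translation and list concrete suggestions
for improving accuracy, fluency, and style.
Step 3: Apply the suggestions and output the improved translation
inside <{tag}></{tag}> tags.

Format rules for the translations:
- The output must be valid SRT.
- Keep every sequence number and timing line exactly as in the source.
- Never merge or split entries; the entry count must equal the source.
- Separate entries with a blank line.

<source_srt>
{srt_text}
</source_srt>"""
```

Stating the format rules as explicit bullet points, and asking for the final result inside a dedicated tag, makes the output easier to validate and extract, though as noted the model may still ignore them.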

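Building a simple API

One problem with the three-step reflection mode is that it consumes far more tokens, since both the prompt and the output get longer. In addition, Gemini enforces rate limits and returns a 429 error when they are exceeded, so the tool needs to pause for a while after each request.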
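The pause-and-retry behavior described above could look roughly like the sketch below. `RateLimitError` is a hypothetical stand-in for whatever exception the real client raises on HTTP 429, and the delay values are arbitrary assumptions rather than Gemini's documented limits.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for the error a client raises on HTTP 429."""


def call_with_pause(
    request: Callable[[], T],
    pause_seconds: float = 5.0,
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run request(), pause afterwards, and back off when rate limited."""
    for attempt in range(max_retries + 1):
        try:
            result = request()
        except RateLimitError:
            if attempt == max_retries:
                raise  # retries exhausted; let the caller decide
            # Wait longer after each consecutive 429.
            sleep(pause_seconds * (2 ** attempt))
            continue
        # Pause after every successful request to stay under the limit.
        sleep(pause_seconds)
        return result
    raise RuntimeError("max_retries must be non-negative")
```

The `sleep` parameter exists only so the waiting can be replaced in tests; in normal use the default `time.sleep` applies.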

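The backend API is built with Flask, and the front end is a simple single page made with Bootstrap 5. The overall interface looks like this:

[Screenshot: the tool's web interface]

Note that using Gemini from mainland China requires a proxy.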

Sample output


1
00:00:01,950 --> 00:00:04,430
Several molecules have been discovered in the five-star system,

2
00:00:04,720 --> 00:00:06,780
We are still multiple universes away from third-type contact.

3
00:00:07,260 --> 00:00:09,880
Weibo has been carrying out filming missions for years now,

4
00:00:10,140 --> 00:00:12,920
Many previously difficult-to-capture photos have been transmitted recently.

5
00:00:13,440 --> 00:00:17,500
In early June, astronomers published this photo in Nature,

6
00:00:18,040 --> 00:00:19,180
Outside the blue core,

7
00:00:19,360 --> 00:00:21,380
There's also this circle of orange light,

8
00:00:21,900 --> 00:00:23,740
This is a new drama-scale sweet donut,

9
00:00:24,380 --> 00:00:25,640
This is a portal.

10
00:00:26,280 --> 00:00:28,100
This is the generation ring of an alien civilization,


(omitted...)

* **Line 1:** "Five-star system" is likely a mistranslation. It probably refers to a five-member committee or group, not a star system. Clarify the context.
* **Line 2:** "Multiple universes" seems like an over-exaggeration. Rephrase for clarity and accuracy.
* **Line 3:** "Weibo" should be explained as a Chinese social media platform. "Filming missions" is unclear. Does it mean "posting videos/images"?
* **Line 8:** "Drama-scale sweet donut" is a nonsensical literal translation. Figure out the intended meaning.
* **Line 9:** "Portal" seems out of context. Verify the intended meaning.
* **Line 10:** "Generation ring" is likely a mistranslation. Clarify the context.
* **Line 11:** "Organic polycyclic aromatic hydrocarbons" is overly technical for a general audience. Simplify if possible.
* **Line 12 and 14:** Use the correct formatting for the galaxy's name: SPT0418-47.
* **Line 15:** "It hasn't been shortened" is awkward. Remove or rephrase.
* **Line 28:** The name of the organization and the resource should be translated more naturally and accurately. Consider breaking this long line into two for better readability.
* **Line 29:** "Cute plush dolls" may sound childish. Consider rephrasing as "animated characters" or similar.
* **Line 35:** "James Webb Space Telescope" should be used consistently throughout. Shortening to "Webb Telescope" after the first mention is acceptable.
* **Line 44:** "SPD048" is likely a typo. It should be SPT0418-47 to be consistent.
* **Line 45-46:** "Standard beautiful photo" is redundant. Simplify to "beautiful photo".
* **Line 48:** "Grovitational Lenshin" is a typo. Correct to "Gravitational Lensing".
* **Line 50:** The sentence is incomplete. Finish the thought.

1
00:00:01,950 --> 00:00:04,430
Several molecules have been discovered in the five-member group's area of focus.

2
00:00:04,720 --> 00:00:06,780
We are still far from making contact with extraterrestrial life.

3
00:00:07,260 --> 00:00:09,880
The James Webb Space Telescope has been capturing images for a year now,

4
00:00:10,140 --> 00:00:12,920
and has recently transmitted many previously unseen photos.

5
00:00:13,440 --> 00:00:17,500
In early June, astronomers published this image in Nature.

6
00:00:18,040 --> 00:00:19,180
Outside the blue core,

7
00:00:19,360 --> 00:00:21,380
there's a ring of orange light.

8
00:00:21,900 --> 00:00:23,740
This is a large, ring-shaped structure.

9
00:00:24,380 --> 00:00:25,640
This is being investigated.

10
00:00:26,280 --> 00:00:28,100
This is thought to be a sign of an early galaxy.

(omitted...)

Extracting the text inside the tags from the result gives the final translation.
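The extraction step can be sketched as follows. The article does not preserve the tag name, so the `tag` argument and the `final_translation` value in the usage line are placeholders, not the tool's actual tag.

```python
import re
from typing import Optional


def extract_tagged(text: str, tag: str) -> Optional[str]:
    """Return the text inside the last <tag>...</tag> pair, or None if absent."""
    escaped = re.escape(tag)
    # DOTALL lets "." span the line breaks inside the SRT body.
    pattern = re.compile(rf"<{escaped}>(.*?)</{escaped}>", re.DOTALL)
    matches = pattern.findall(text)
    # Use the last match in case the model repeats the tag earlier in its answer.
    return matches[-1].strip() if matches else None
```

For example, `extract_tagged(response_text, "final_translation")` would return only the refined SRT text and drop the initial translation and the review notes. A `None` result means the model did not follow the requested format and the request needs to be handled as a failure.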

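I have packaged the tool; if you are interested, download it and try it locally.

Download and unzip it, then double-click app.exe; the UI described above opens automatically in your browser. Enter the API key you obtained from Gemini, fill in the proxy address, choose the SRT subtitle file to translate, select the target language, and try it out.

Baidu Netdisk: /s/1hGBjLRND…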


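Q1: How does the reflection workflow differ from traditional machine translation?

A1: The reflection workflow adds a self-evaluation and refinement step that mimics how a human translator thinks, which can produce more accurate and natural translations.

Q2: How long does the reflection workflow take?

A2: Although it requires multiple AI passes, it usually takes only 10 to 20 seconds longer than the traditional approach; given the gain in translation quality, the extra time is worth it.

Q3: Does the reflection workflow guarantee that the translated subtitles are valid SRT?

A3: No. Blank lines and an entry count that differs from the source can still occur. For example, when the second of two consecutive entries contains only 3 to 5 characters and grammatically continues the previous one, the translation will probably merge them into a single entry.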

References

/andrewyng/t…

baoyu.io/blog/prompt…

