YouTube: https://www.youtube.com/c/xugaoxiang; Bilibili: 迷途小书童的Note; WeChat official account: Dev_Club

How to Add Subtitles Quickly

Category: IT Tips. Author: 迷途小书童

Software and Hardware Environment

  • Windows 10 64bit
  • Anaconda3 with python 3.7
  • ffmpeg static
  • autosub 0.5.6-alpha
  • Subtitle Edit

Watch the Video

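The video is hosted on YouTube, so viewers in mainland China will need a way around the firewall to watch it. If you like my videos, please subscribe to my channel, turn on the notification bell, and like and share. Thank you for your support.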

Introduction

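For independent content creators, adding subtitles has always been a headache. It takes a lot of time and effort, and subtitled videos do not necessarily get recommended much more by the platform. As a result, many people who are new to content creation, myself included, don't put much effort into subtitles.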

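Some people write a script before making a video, so the subtitle text already exists. All they have to do is adjust the time at which each subtitle appears, which a video editing tool can handle. Others record the video first and add subtitles afterwards. I belong to the second group, and this article covers that case.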

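The most common approach today is to run the audio through a free or paid AI speech-to-text tool available online to obtain a subtitle file, correct it by hand, and finally either burn the subtitles into the video file or upload the subtitle file directly to the platform. Speech-to-text is the key step. There are plenty of such services online, each with its own trade-offs: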

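  • 网易见外 (a NetEase service)

    You have to extract the audio track yourself and upload it to the NetEase platform

  • 讯飞 (iFLYTEK)

    Paid service

  • gosubtitle

    The free plan can only process 1 minute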

Preparation

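First, the overall workflow: run autosub on the video to generate a subtitle file, correct that file in Subtitle Edit, then either upload the subtitle file to the platform or burn it into the video with ffmpeg.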

Before making subtitles, we need to set up the environment.

Python environment

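We use Anaconda, which can be downloaded from https://www.anaconda.com/distribution. The installer is a simple click-through wizard. The one thing to watch for is the option that adds Anaconda to the PATH environment variable: check it, and you avoid having to set PATH by hand later.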


ffmpeg environment

Go to https://ffmpeg.zeranoe.com/builds/ and download the latest version. Extract it, then add the directory that contains ffmpeg.exe to the system PATH environment variable.


Installing autosub

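The autosub project lives at https://github.com/BingLingGroup/autosub. Download the zip package, extract it, change into the extracted directory, and run the following commands to install it: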

cd autosub-0.5.6-alpha
pip install .

Installing Subtitle Edit

The Subtitle Edit project lives at https://github.com/SubtitleEdit/subtitleedit. The download is an exe installer; just run it.
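
With everything installed, a quick check that the command-line tools are actually reachable through PATH can save some confusion later. The following is only a minimal Python sketch, assuming the executables are named ffmpeg and autosub; the helper name missing_tools is made up for this illustration.

```python
import shutil


def missing_tools(names):
    """Return the tool names that cannot be found through PATH."""
    # shutil.which honors PATHEXT on Windows, so "ffmpeg" also matches ffmpeg.exe
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # Tool names are assumptions based on the setup described in this article
    missing = missing_tools(["ffmpeg", "autosub"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("ffmpeg and autosub are both on PATH")
```

If a tool is reported as missing, reopen the terminal first, since a changed PATH is only picked up by newly started shells.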

Generating Subtitles

Assuming the edited video file test.mp4 is ready, the following command generates the subtitle file:

autosub.exe -i test.mp4 -S cmn-hans-cn -D zh-cn

Here -S is the code of the language spoken in the video. The available values can be listed with autosub.exe -lsc:

$ autosub.exe -lsc
List of all lang codes for speech-to-text:

Lang code         Description
af-za             Afrikaans (South Africa)
am-et             Amharic (Ethiopia)
ar-ae             Arabic (United Arab Emirates)
ar-bh             Arabic (Bahrain)
ar-dz             Arabic (Algeria)
ar-eg             Arabic (Egypt)
ar-il             Arabic (Israel)
ar-iq             Arabic (Iraq)
ar-jo             Arabic (Jordan)
ar-kw             Arabic (Kuwait)
ar-lb             Arabic (Lebanon)
ar-ma             Arabic (Morocco)
ar-om             Arabic (Oman)
ar-ps             Arabic (State of Palestine)
ar-qa             Arabic (Qatar)
ar-sa             Arabic (Saudi Arabia)
ar-tn             Arabic (Tunisia)
az-az             Azerbaijani (Azerbaijan)
bg-bg             Bulgarian (Bulgaria)
bn-bd             Bengali (Bangladesh)
bn-in             Bengali (India)
ca-es             Catalan (Spain)
cmn-hans-cn       Chinese, Mandarin (Simplified, China)
cmn-hans-hk       Chinese, Mandarin (Simplified, Hong Kong)
cmn-hant-tw       Chinese, Mandarin (Traditional, Taiwan)
cs-cz             Czech (Czech Republic)
da-dk             Danish (Denmark)
de-de             German (Germany)
el-gr             Greek (Greece)
en-au             English (Australia)
en-ca             English (Canada)
en-gb             English (United Kingdom)
en-gh             English (Ghana)
en-ie             English (Ireland)
en-in             English (India)
en-ke             English (Kenya)
en-ng             English (Nigeria)
en-nz             English (New Zealand)
en-ph             English (Philippines)
en-sg             English (Singapore)
en-tz             English (Tanzania)
en-us             English (United States)
en-za             English (South Africa)
es-ar             Spanish (Argentina)
es-bo             Spanish (Bolivia)
es-cl             Spanish (Chile)
es-co             Spanish (Colombia)
es-cr             Spanish (Costa Rica)
es-do             Spanish (Dominican Republic)
es-ec             Spanish (Ecuador)
es-es             Spanish (Spain)
es-gt             Spanish (Guatemala)
es-hn             Spanish (Honduras)
es-mx             Spanish (Mexico)
es-ni             Spanish (Nicaragua)
es-pa             Spanish (Panama)
es-pe             Spanish (Peru)
es-pr             Spanish (Puerto Rico)
es-py             Spanish (Paraguay)
es-sv             Spanish (El Salvador)
es-us             Spanish (United States)
es-uy             Spanish (Uruguay)
es-ve             Spanish (Venezuela)
eu-es             Basque (Spain)
fa-ir             Persian (Iran)
fi-fi             Finnish (Finland)
fil-ph            Filipino (Philippines)
fr-ca             French (Canada)
fr-fr             French (France)
gl-es             Galician (Spain)
gu-in             Gujarati (India)
he-il             Hebrew (Israel)
hi-in             Hindi (India)
hr-hr             Croatian (Croatia)
hu-hu             Hungarian (Hungary)
hy-am             Armenian (Armenia)
id-id             Indonesian (Indonesia)
is-is             Icelandic (Iceland)
it-it             Italian (Italy)
ja-jp             Japanese (Japan)
jv-id             Javanese (Indonesia)
ka-ge             Georgian (Georgia)
km-kh             Khmer (Cambodia)
kn-in             Kannada (India)
ko-kr             Korean (South Korea)
lo-la             Lao (Laos)
lt-lt             Lithuanian (Lithuania)
lv-lv             Latvian (Latvia)
ml-in             Malayalam (India)
mr-in             Marathi (India)
ms-my             Malay (Malaysia)
nb-no             Norwegian Bokmal (Norway)
ne-np             Nepali (Nepal)
nl-nl             Dutch (Netherlands)
pl-pl             Polish (Poland)
pt-br             Portuguese (Brazil)
pt-pt             Portuguese (Portugal)
ro-ro             Romanian (Romania)
ru-ru             Russian (Russia)
si-lk             Sinhala (Sri Lanka)
sk-sk             Slovak (Slovakia)
sl-si             Slovenian (Slovenia)
sr-rs             Serbian (Serbia)
su-id             Sundanese (Indonesia)
sv-se             Swedish (Sweden)
sw-ke             Swahili (Kenya)
sw-tz             Swahili (Tanzania)
ta-in             Tamil (India)
ta-lk             Tamil (Sri Lanka)
ta-my             Tamil (Malaysia)
ta-sg             Tamil (Singapore)
te-in             Telugu (India)
th-th             Thai (Thailand)
tr-tr             Turkish (Turkey)
uk-ua             Ukrainian (Ukraine)
ur-in             Urdu (India)
ur-pk             Urdu (Pakistan)
vi-vn             Vietnamese (Vietnam)
yue-hant-hk       Chinese, Cantonese (Traditional, Hong Kong)
zu-za             Zulu (South Africa)

All works done.

-D is the code of the language to translate into. The available values can be listed with autosub.exe -ltc:

$ autosub.exe -ltc
List of all lang codes for translation:

Lang code         Description
af                Afrikaans
am                Amharic
ar                Arabic
az                Azerbaijani
be                Belarusian
bg                Bulgarian
bn                Bengali
bs                Bosnian
ca                Catalan
ceb               Cebuano
co                Corsican
cs                Czech
cy                Welsh
da                Danish
de                German
el                Greek
en                English
eo                Esperanto
es                Spanish
et                Estonian
eu                Basque
fa                Persian
fi                Finnish
fr                French
fy                Frisian
ga                Irish
gd                Scots Gaelic
gl                Galician
gu                Gujarati
ha                Hausa
haw               Hawaiian
he                Hebrew
hi                Hindi
hmn               Hmong
hr                Croatian
ht                Haitian Creole
hu                Hungarian
hy                Armenian
id                Indonesian
ig                Igbo
is                Icelandic
it                Italian
iw                Hebrew
ja                Japanese
jw                Javanese
ka                Georgian
kk                Kazakh
km                Khmer
kn                Kannada
ko                Korean
ku                Kurdish
ky                Kyrgyz
la                Latin
lb                Luxembourgish
lo                Lao
lt                Lithuanian
lv                Latvian
mg                Malagasy
mi                Maori
mk                Macedonian
ml                Malayalam
mn                Mongolian
mr                Marathi
ms                Malay
mt                Maltese
my                Myanmar(Burmese)
ne                Nepali
nl                Dutch
no                Norwegian
ny                Nyanja(Chichewa)
pa                Punjabi
pl                Polish
ps                Pashto
pt                Portuguese(Portugal,Brazil)
ro                Romanian
ru                Russian
sd                Sindhi
si                Sinhala(Sinhalese)
sk                Slovak
sl                Slovenian
sm                Samoan
sn                Shona
so                Somali
sq                Albanian
sr                Serbian
st                Sesotho
su                Sundanese
sv                Swedish
sw                Swahili
ta                Tamil
te                Telugu
tg                Tajik
th                Thai
tl                Tagalog(Filipino)
tr                Turkish
uk                Ukrainian
ur                Urdu
uz                Uzbek
vi                Vietnamese
xh                Xhosa
yi                Yiddish
yo                Yoruba
zh                Chinese (Simplified)
zh-cn             Chinese (Simplified)
zh-tw             Chinese (Traditional)
zu                Zulu

All works done.

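Once autosub.exe finishes, the file test.cmn-hans-cn.srt appears in the same directory as the video. We can now correct it with Subtitle Edit; at the current level of AI, recognition errors are unavoidable.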
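
When several videos need subtitles, the same autosub call can be scripted. The sketch below is a hedged Python illustration that uses only the -i, -S and -D options shown in this article. The function names are hypothetical, and the output name pattern (video stem, then the speech language code, then .srt) is an assumption generalized from the single test.cmn-hans-cn.srt example.

```python
import subprocess
from pathlib import Path


def autosub_command(video, speech_lang="cmn-hans-cn", dst_lang="zh-cn"):
    """Build the autosub command line used in this article as an argument list."""
    return ["autosub.exe", "-i", str(video), "-S", speech_lang, "-D", dst_lang]


def expected_srt(video, speech_lang="cmn-hans-cn"):
    """Guess the subtitle path next to the video: <stem>.<speech_lang>.srt (assumed pattern)."""
    video = Path(video)
    return video.with_name(f"{video.stem}.{speech_lang}.srt")


def generate_subtitles(video):
    """Run autosub on one video and return the path where the .srt is expected."""
    subprocess.run(autosub_command(video), check=True)  # raises if autosub fails
    return expected_srt(video)
```

For example, looping over Path(".").glob("*.mp4") and calling generate_subtitles on each file would process a whole folder, one video at a time.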

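How to use Subtitle Edit is not covered here; just remember to save after editing. The saved subtitle file is the one we want. For platforms that support external subtitles, such as YouTube, it can be uploaded as it is. If the target platform does not support external subtitles, one last step is needed: burning the subtitles into the video.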
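
Before uploading or burning in the edited file, a small script can flag obvious timing slips. This is a rough Python sketch, not part of the original workflow: it assumes a plain SRT file (an index line, a timing line of the form HH:MM:SS,mmm --> HH:MM:SS,mmm, then the text), and the function names are invented for illustration.

```python
import re

TIME = re.compile(r"(\d+):(\d{2}):(\d{2}),(\d{3})")


def to_ms(stamp):
    """Convert an SRT timestamp such as 00:01:02,500 to milliseconds."""
    h, m, s, ms = map(int, TIME.fullmatch(stamp.strip()).groups())
    return ((h * 60 + m) * 60 + s) * 1000 + ms


def parse_srt(text):
    """Parse SRT text into a list of (start_ms, end_ms, text) tuples."""
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        if len(lines) < 2 or "-->" not in lines[1]:
            continue  # skip blocks that do not look like a cue
        start, end = lines[1].split("-->")
        cues.append((to_ms(start), to_ms(end), "\n".join(lines[2:])))
    return cues


def find_problems(cues):
    """Return warnings for empty cues, reversed timings, and overlaps."""
    problems = []
    for i, (start, end, body) in enumerate(cues, 1):
        if not body.strip():
            problems.append(f"cue {i}: empty text")
        if end <= start:
            problems.append(f"cue {i}: end is not after start")
        if i > 1 and start < cues[i - 2][1]:
            problems.append(f"cue {i}: overlaps previous cue")
    return problems
```

A typical call would read the file with open("test.cmn-hans-cn.srt", encoding="utf-8") (the encoding is an assumption) and print whatever find_problems(parse_srt(...)) returns.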

Burning Subtitles into the Video

We use ffmpeg for this, with the following command:

ffmpeg -i test.mp4 -vf subtitles=test.cmn-hans-cn.srt test_with_subtitle.mp4

Once we have the final video file with burned-in subtitles, test_with_subtitle.mp4, we can upload it to the platform.
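
The same ffmpeg call can be wrapped in Python when the burn-in step needs to be repeated. This is a minimal sketch that only reproduces the command from this article; the function names are hypothetical.

```python
import subprocess


def burn_in_command(video, srt, output):
    """Build the ffmpeg command from this article as an argument list."""
    # -vf subtitles=... draws the subtitle text onto the frames, so the video is re-encoded
    return ["ffmpeg", "-i", video, "-vf", f"subtitles={srt}", output]


def burn_in(video, srt, output):
    """Run ffmpeg and raise CalledProcessError if it exits with an error."""
    subprocess.run(burn_in_command(video, srt, output), check=True)
```

Calling burn_in("test.mp4", "test.cmn-hans-cn.srt", "test_with_subtitle.mp4") mirrors the command above. Because the subtitles filter parses its argument, paths containing a colon or backslash (common on Windows) need escaping; the simplest route is to run the script from the folder that holds the .srt file and pass a bare filename.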
