Software and hardware environment
- Windows 10 64bit
- Anaconda3 with Python 3.7
- ffmpeg static
- autosub 0.5.6-alpha
- Subtitle Edit
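Introduction
For independent content creators, subtitling has always been a headache: it takes time and effort, and adding subtitles does not necessarily earn much more promotion from the platform. So many people who are new to content creation, myself included, do not put much effort into subtitles.
Some people write a script before making the video, so the subtitle text already exists; when subtitling, they only need to adjust when each line appears, which a video editor can do. Others record the video first and add subtitles afterwards. I belong to the second group, and this article covers that case.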
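The currently popular approach is to run the audio through one of the free or paid AI speech-to-text tools available online to get a subtitle file, correct it by hand, and finally either burn the subtitles into the video file or upload the subtitle file to the platform directly. Speech-to-text is the key step. There are plenty of such platforms online, each with its own trade-offs:
- 网易见外 (NetEase Jianwai): the audio has to be extracted first and uploaded to the NetEase platform
- 讯飞 (iFLYTEK): paid
- gosubtitle: the free plan can only process 1 minute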
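Preparation
First, the overall workflow: autosub turns the finished video into an .srt subtitle file, subtitleedit is used to correct it, and ffmpeg burns the subtitles into the video if the target platform does not accept separate subtitle files.
Before making subtitles, we need to set up the environment.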
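Python environment
We use anaconda (download link: https://www.anaconda.com/distribution). During the click-through installation, tick the option that adds it to the PATH environment variable, which saves setting it up by hand later.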
ffmpeg environment
Go to https://ffmpeg.zeranoe.com/builds/, download the latest version, unzip it, and add the directory containing ffmpeg.exe to the system PATH environment variable.
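Installing autosub
The autosub project lives at https://github.com/BingLingGroup/autosub. Download the zip archive, unzip it, change into the directory, and run the following commands to install it: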
cd autosub-0.5.6-alpha
pip install .
If this step fails with the error Visual C++ 14.0 is required, download the build tools from Microsoft at https://visualstudio.microsoft.com/zh-hans/visual-cpp-build-tools/, install the C++ components, and then run the pip install command above again.
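A quick way to confirm that both command-line tools ended up on PATH is to look them up from Python. This is only a minimal sketch: the helper name find_missing_tools is made up for illustration, and it assumes the executables are called ffmpeg and autosub, as in the commands used in this article.

```python
import shutil


def find_missing_tools(tools=("ffmpeg", "autosub")):
    """Return the names in `tools` that cannot be found on PATH.

    shutil.which applies the Windows PATHEXT rules, so "ffmpeg"
    also matches ffmpeg.exe.
    """
    return [name for name in tools if shutil.which(name) is None]


missing = find_missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("ffmpeg and autosub are both on PATH")
```

If a tool is reported as missing right after installation, opening a new terminal window usually helps, since an already open window keeps the old PATH.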
Installing subtitleedit
The subtitleedit project lives at https://github.com/SubtitleEdit/subtitleedit. The download is an exe installer; just run it.
Generating subtitles
Suppose the edited video file is test.mp4. The following command produces the subtitle file:
autosub.exe -i test.mp4 -S cmn-hans-cn -D zh-cn
-S is the language code of the speech; the available values can be listed with autosub.exe -lsc:
$ autosub.exe -lsc
List of all lang codes for speech-to-text:
Lang code Description
af-za Afrikaans (South Africa)
am-et Amharic (Ethiopia)
ar-ae Arabic (United Arab Emirates)
ar-bh Arabic (Bahrain)
ar-dz Arabic (Algeria)
ar-eg Arabic (Egypt)
ar-il Arabic (Israel)
ar-iq Arabic (Iraq)
ar-jo Arabic (Jordan)
ar-kw Arabic (Kuwait)
ar-lb Arabic (Lebanon)
ar-ma Arabic (Morocco)
ar-om Arabic (Oman)
ar-ps Arabic (State of Palestine)
ar-qa Arabic (Qatar)
ar-sa Arabic (Saudi Arabia)
ar-tn Arabic (Tunisia)
az-az Azerbaijani (Azerbaijan)
bg-bg Bulgarian (Bulgaria)
bn-bd Bengali (Bangladesh)
bn-in Bengali (India)
ca-es Catalan (Spain)
cmn-hans-cn Chinese, Mandarin (Simplified, China)
cmn-hans-hk Chinese, Mandarin (Simplified, Hong Kong)
cmn-hant-tw Chinese, Mandarin (Traditional, Taiwan)
cs-cz Czech (Czech Republic)
da-dk Danish (Denmark)
de-de German (Germany)
el-gr Greek (Greece)
en-au English (Australia)
en-ca English (Canada)
en-gb English (United Kingdom)
en-gh English (Ghana)
en-ie English (Ireland)
en-in English (India)
en-ke English (Kenya)
en-ng English (Nigeria)
en-nz English (New Zealand)
en-ph English (Philippines)
en-sg English (Singapore)
en-tz English (Tanzania)
en-us English (United States)
en-za English (South Africa)
es-ar Spanish (Argentina)
es-bo Spanish (Bolivia)
es-cl Spanish (Chile)
es-co Spanish (Colombia)
es-cr Spanish (Costa Rica)
es-do Spanish (Dominican Republic)
es-ec Spanish (Ecuador)
es-es Spanish (Spain)
es-gt Spanish (Guatemala)
es-hn Spanish (Honduras)
es-mx Spanish (Mexico)
es-ni Spanish (Nicaragua)
es-pa Spanish (Panama)
es-pe Spanish (Peru)
es-pr Spanish (Puerto Rico)
es-py Spanish (Paraguay)
es-sv Spanish (El Salvador)
es-us Spanish (United States)
es-uy Spanish (Uruguay)
es-ve Spanish (Venezuela)
eu-es Basque (Spain)
fa-ir Persian (Iran)
fi-fi Finnish (Finland)
fil-ph Filipino (Philippines)
fr-ca French (Canada)
fr-fr French (France)
gl-es Galician (Spain)
gu-in Gujarati (India)
he-il Hebrew (Israel)
hi-in Hindi (India)
hr-hr Croatian (Croatia)
hu-hu Hungarian (Hungary)
hy-am Armenian (Armenia)
id-id Indonesian (Indonesia)
is-is Icelandic (Iceland)
it-it Italian (Italy)
ja-jp Japanese (Japan)
jv-id Javanese (Indonesia)
ka-ge Georgian (Georgia)
km-kh Khmer (Cambodia)
kn-in Kannada (India)
ko-kr Korean (South Korea)
lo-la Lao (Laos)
lt-lt Lithuanian (Lithuania)
lv-lv Latvian (Latvia)
ml-in Malayalam (India)
mr-in Marathi (India)
ms-my Malay (Malaysia)
nb-no Norwegian Bokmal (Norway)
ne-np Nepali (Nepal)
nl-nl Dutch (Netherlands)
pl-pl Polish (Poland)
pt-br Portuguese (Brazil)
pt-pt Portuguese (Portugal)
ro-ro Romanian (Romania)
ru-ru Russian (Russia)
si-lk Sinhala (Sri Lanka)
sk-sk Slovak (Slovakia)
sl-si Slovenian (Slovenia)
sr-rs Serbian (Serbia)
su-id Sundanese (Indonesia)
sv-se Swedish (Sweden)
sw-ke Swahili (Kenya)
sw-tz Swahili (Tanzania)
ta-in Tamil (India)
ta-lk Tamil (Sri Lanka)
ta-my Tamil (Malaysia)
ta-sg Tamil (Singapore)
te-in Telugu (India)
th-th Thai (Thailand)
tr-tr Turkish (Turkey)
uk-ua Ukrainian (Ukraine)
ur-in Urdu (India)
ur-pk Urdu (Pakistan)
vi-vn Vietnamese (Vietnam)
yue-hant-hk Chinese, Cantonese (Traditional, Hong Kong)
zu-za Zulu (South Africa)
All works done.
-D is the language code to translate into; the available values can be listed with autosub.exe -ltc:
$ autosub.exe -ltc
List of all lang codes for translation:
Lang code Description
af Afrikaans
am Amharic
ar Arabic
az Azerbaijani
be Belarusian
bg Bulgarian
bn Bengali
bs Bosnian
ca Catalan
ceb Cebuano
co Corsican
cs Czech
cy Welsh
da Danish
de German
el Greek
en English
eo Esperanto
es Spanish
et Estonian
eu Basque
fa Persian
fi Finnish
fr French
fy Frisian
ga Irish
gd Scots Gaelic
gl Galician
gu Gujarati
ha Hausa
haw Hawaiian
he Hebrew
hi Hindi
hmn Hmong
hr Croatian
ht Haitian Creole
hu Hungarian
hy Armenian
id Indonesian
ig Igbo
is Icelandic
it Italian
iw Hebrew
ja Japanese
jw Javanese
ka Georgian
kk Kazakh
km Khmer
kn Kannada
ko Korean
ku Kurdish
ky Kyrgyz
la Latin
lb Luxembourgish
lo Lao
lt Lithuanian
lv Latvian
mg Malagasy
mi Maori
mk Macedonian
ml Malayalam
mn Mongolian
mr Marathi
ms Malay
mt Maltese
my Myanmar(Burmese)
ne Nepali
nl Dutch
no Norwegian
ny Nyanja(Chichewa)
pa Punjabi
pl Polish
ps Pashto
pt Portuguese(Portugal,Brazil)
ro Romanian
ru Russian
sd Sindhi
si Sinhala(Sinhalese)
sk Slovak
sl Slovenian
sm Samoan
sn Shona
so Somali
sq Albanian
sr Serbian
st Sesotho
su Sundanese
sv Swedish
sw Swahili
ta Tamil
te Telugu
tg Tajik
th Thai
tl Tagalog(Filipino)
tr Turkish
uk Ukrainian
ur Urdu
uz Uzbek
vi Vietnamese
xh Xhosa
yi Yiddish
yo Yoruba
zh Chinese (Simplified)
zh-cn Chinese (Simplified)
zh-tw Chinese (Traditional)
zu Zulu
All works done.
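Once autosub.exe finishes, a file named test.cmn-hans-cn.srt appears in the same directory as the video. We can then correct it with subtitleedit; at the current level of AI, recognition errors are unavoidable.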
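For anyone processing several videos, the same autosub call can be wrapped in a small Python script. The sketch below uses only the -i, -S and -D options shown above; the function names are hypothetical, and the output file name simply follows the test.cmn-hans-cn.srt pattern described in this article rather than anything checked against autosub's own code.

```python
import subprocess
from pathlib import Path


def build_autosub_command(video, speech_lang="cmn-hans-cn", dst_lang="zh-cn"):
    """Build the argument list for the autosub call used in this article."""
    return ["autosub.exe", "-i", str(video), "-S", speech_lang, "-D", dst_lang]


def expected_srt_path(video, speech_lang="cmn-hans-cn"):
    """Guess the subtitle path, assuming the <name>.<speech_lang>.srt pattern."""
    video = Path(video)
    return video.with_name(f"{video.stem}.{speech_lang}.srt")


def run_autosub(video, speech_lang="cmn-hans-cn", dst_lang="zh-cn"):
    """Run autosub and return the path where the .srt is expected to appear."""
    subprocess.run(build_autosub_command(video, speech_lang, dst_lang), check=True)
    return expected_srt_path(video, speech_lang)


# Only print the command here; call run_autosub("test.mp4") to actually run it.
print(" ".join(build_autosub_command("test.mp4")))
print(expected_srt_path("test.mp4"))
```

Passing the arguments as a list instead of one string avoids quoting problems when a video file name contains spaces.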
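How to use subtitleedit is not covered here; just remember to save after editing. The resulting subtitle file is what we want. For platforms such as YouTube that accept separate subtitle files, it can be uploaded as is. If the target platform does not support them, one final step is needed: burning the subtitles into the video.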
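Before uploading or burning in the edited file, a rough automated check can catch cues that are empty or whose timings overlap. The following is a minimal sketch of an .srt reader, not a replacement for subtitleedit's own checks; the function names and the sample text are invented for illustration, and it assumes the usual .srt layout of an index line, a time line, and one or more text lines per cue.

```python
import re

# Matches a time line such as "00:00:02,000 --> 00:00:04,000".
TIME_LINE = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def _to_ms(hours, minutes, seconds, millis):
    """Convert the four captured time fields to milliseconds."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def parse_srt(text):
    """Return a list of (start_ms, end_ms, text) tuples, one per cue."""
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = TIME_LINE.search(line)
            if match:
                parts = match.groups()
                body = "\n".join(lines[i + 1:]).strip()
                cues.append((_to_ms(*parts[:4]), _to_ms(*parts[4:]), body))
                break
    return cues


def find_problems(cues):
    """Return (cue_number, description) pairs for suspicious cues."""
    problems = []
    for number, (start, end, text) in enumerate(cues, start=1):
        if not text:
            problems.append((number, "empty text"))
        if end <= start:
            problems.append((number, "end is not after start"))
        if number > 1 and start < cues[number - 2][1]:
            problems.append((number, "overlaps previous cue"))
    return problems


sample = """1
00:00:00,000 --> 00:00:02,500
Hello everyone

2
00:00:02,000 --> 00:00:04,000
Welcome to the channel
"""

cues = parse_srt(sample)
print(find_problems(cues))
```

To check a real file, read it first, for example with Path("test.cmn-hans-cn.srt").read_text(encoding="utf-8-sig"), where utf-8-sig is an assumption about how the file was saved.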
Burning subtitles into the video
We use ffmpeg for this, with the following command:
ffmpeg -i test.mp4 -vf subtitles=test.cmn-hans-cn.srt test_with_subtitle.mp4
With the final file test_with_subtitle.mp4, which has the subtitles burned in, we can upload it to the platform.
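The burn-in step can be scripted in the same way as the ffmpeg command above. This is a sketch under stated assumptions: the function names are hypothetical, and the files are expected to sit together in one folder. Running ffmpeg from inside that folder keeps the subtitles= argument a plain file name, which sidesteps the escaping that characters such as a drive-letter colon need inside an ffmpeg filter string.

```python
import subprocess
from pathlib import Path


def build_burn_command(video_name, srt_name, output_name):
    """Build the ffmpeg call that renders the .srt onto the video frames."""
    return ["ffmpeg", "-i", video_name, "-vf", f"subtitles={srt_name}", output_name]


def burn_subtitles(folder, video_name, srt_name, output_name):
    """Run ffmpeg inside `folder` and return the path of the new video."""
    command = build_burn_command(video_name, srt_name, output_name)
    subprocess.run(command, cwd=folder, check=True)
    return Path(folder) / output_name


# Only print the command here; call burn_subtitles(...) to actually run ffmpeg.
print(" ".join(build_burn_command(
    "test.mp4", "test.cmn-hans-cn.srt", "test_with_subtitle.mp4")))
```

Because the subtitles are drawn onto the picture, ffmpeg has to re-encode the video, so this step takes noticeably longer than a plain copy.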