
Speech Denoising with an RNN Model

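Original Voice Clip: the recorded human speech signal

Office Noise: the recorded office background noise

Mixing: the signal obtained by mixing the two recordings above

After RNN: the waveform of the speech signal after denoising with the RNN model

By Directional Mic in F15K: the noise-reduced waveform recorded by the directional microphone on the laptop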
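
The Mixing signal is simply the voice and noise recordings added together. The post does not say how the mix was produced, so the following is only a minimal sketch of one common way to do it for 16-bit samples: a sample-by-sample sum with clipping. The function name `mix_pcm16` and the `noise_gain` parameter are hypothetical and are not taken from the original experiment.

```python
from array import array

INT16_MIN, INT16_MAX = -32768, 32767


def mix_pcm16(voice, noise, noise_gain=1.0):
    """Add two sequences of 16-bit samples, clipping to the int16 range.

    Assumption: both inputs share the same sample rate and are mono.
    The result is as long as the shorter of the two inputs.
    """
    n = min(len(voice), len(noise))
    mixed = array("h")
    for i in range(n):
        s = int(round(voice[i] + noise_gain * noise[i]))
        # Saturate instead of wrapping around on overflow.
        mixed.append(max(INT16_MIN, min(INT16_MAX, s)))
    return mixed
```

Clipping matters here because two loud int16 signals can sum to a value outside the 16-bit range; without it the overflow would raise an error or wrap around and add distortion that has nothing to do with the noise being studied.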

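Conclusion: Denoising with the RNN model achieves noise reduction close to that of a directional microphone. In other words, software processing can take the place of a physical device.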

As the two lower-right plots show, the voice clip mixed with background noise is restored well: the waveform after RNN denoising compares closely with both the waveform recorded by the directional mic and the original voice clip waveform.
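
The comparison above is visual. If a number is wanted, one option (not something the original experiment reports) is a signal-to-noise ratio of the denoised output against the original voice clip. The sketch below assumes the two signals are time-aligned and at the same sample rate; the function name `snr_db` is hypothetical.

```python
import math


def snr_db(reference, estimate):
    """SNR in dB of `estimate` against `reference`.

    Assumption: both sequences are time-aligned; only the overlapping
    part (the shorter length) is compared.
    """
    n = min(len(reference), len(estimate))
    signal_power = sum(float(reference[i]) ** 2 for i in range(n))
    error_power = sum(
        (float(reference[i]) - float(estimate[i])) ** 2 for i in range(n)
    )
    if error_power == 0.0:
        return math.inf  # identical signals
    if signal_power == 0.0:
        return -math.inf  # silent reference, nothing to compare against
    return 10.0 * math.log10(signal_power / error_power)
```

Computing `snr_db(original, mixed)` and `snr_db(original, denoised)` on the same clip would show whether the RNN output is closer to the original than the noisy mix is. A recording from a different microphone is unlikely to be sample-aligned with the original, so this metric would need an alignment step before it could be applied to the directional-mic recording.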


Ref:
https://people.xiph.org/~jm/demo/rnnoise/
https://hacks.mozilla.org/2017/09/rnnoise-deep-learning-noise-suppression/
https://github.com/xiph/rnnoise

Notes: mic denoising with RNN
        Audacity:
        Import -> Raw Data -> Signed 16-bit / little-endian / 1 channel (mono) / 48000 Hz
        RNN:
        frank@frank-GL753VD:~/1T/back0529/mic
        import format: signed 16-bit mono
        Files:
        ~/1T/back0529/mic/RNNtest/F15K/tsai1.raw  ----> recorded by 2 mics (directional mic)
        ~/1T/back0529/mic/RNNtest/AS3EA/OUTtsai1.pcm  ----> after RNN
        ~/1T/back0529/mic/RNNtest/AS3EA/tsai1.pcm  ----> after mixing
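
The .raw and .pcm files above have no header, which is why Audacity has to be told the format by hand (signed 16-bit, little-endian, one channel, 48000 Hz). As an alternative to importing raw data each time, the sketch below reads such a file with the Python standard library and wraps it in a WAV container that any player can open. The helper names `read_raw_pcm16` and `write_wav` are hypothetical; the format constants are the ones from the notes.

```python
import sys
import wave
from array import array

SAMPLE_RATE = 48000  # Hz, as used in the Audacity import settings
CHANNELS = 1         # mono
SAMPLE_WIDTH = 2     # bytes per sample (signed 16-bit)


def read_raw_pcm16(path):
    """Read headerless signed 16-bit little-endian mono PCM into an array."""
    with open(path, "rb") as f:
        data = f.read()
    # Drop a trailing odd byte, if any, so the data is whole samples.
    data = data[: len(data) - len(data) % SAMPLE_WIDTH]
    samples = array("h")
    samples.frombytes(data)
    if sys.byteorder == "big":
        samples.byteswap()  # file is little-endian; convert to native order
    return samples


def write_wav(path, samples, sample_rate=SAMPLE_RATE):
    """Write native-order 16-bit samples as a mono WAV file."""
    with wave.open(path, "wb") as w:
        w.setnchannels(CHANNELS)
        w.setsampwidth(SAMPLE_WIDTH)
        w.setframerate(sample_rate)
        # The wave module takes care of the on-disk byte order itself.
        w.writeframes(array("h", samples).tobytes())
```

For example, `write_wav("OUTtsai1.wav", read_raw_pcm16("OUTtsai1.pcm"))` would produce a file that opens directly, without the raw-import dialog.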




