Chapter 5
Estimation of Excitation Parameters
1. Excitation source models
2. Pitch period estimation
Ⅰ.Excitation Source Models
1-1 Ideal Excitation Source
[Figure: x(n) → A(z) → e(n) → 1/A(z) → x(n)]
$$E(z) = A(z)\,X(z), \qquad X(z) = \frac{1}{A(z)}\,E(z)$$
If e(n) were encoded directly, a high bit rate would be required.
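The analysis/synthesis round trip can be illustrated with a minimal sketch. It assumes the convention A(z) = 1 - Σ a(k) z^{-k}; the coefficient values and the test signal are hypothetical, chosen only so that 1/A(z) is stable.

```python
import math

def analysis_filter(x, a):
    """Apply A(z) = 1 - sum_k a[k-1] z^-k to x(n), giving the residual e(n)."""
    e = []
    for n in range(len(x)):
        pred = sum(a[k - 1] * x[n - k] for k in range(1, len(a) + 1) if n - k >= 0)
        e.append(x[n] - pred)
    return e

def synthesis_filter(e, a):
    """Apply 1/A(z) to e(n), reconstructing x(n) recursively."""
    x = []
    for n in range(len(e)):
        pred = sum(a[k - 1] * x[n - k] for k in range(1, len(a) + 1) if n - k >= 0)
        x.append(e[n] + pred)
    return x

a = [1.3, -0.6]                               # hypothetical predictor coefficients
x = [math.sin(0.3 * n) for n in range(50)]    # hypothetical input signal
e = analysis_filter(x, a)
x_rec = synthesis_filter(e, a)
max_err = max(abs(u - v) for u, v in zip(x, x_rec))
print(max_err)  # reconstruction error, limited only by floating-point rounding
```

With an unquantized e(n) the reconstruction is exact up to rounding, which is why the cost of this model lies entirely in transmitting e(n).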
1-2 Binary Excitation Model
Digital Processing of Speech Signal Chapter 5 Estimation of Excitation Parameters
[Figure: a pitch-driven impulse train generator and a random noise generator, selected by a V/U switch, excite the synthesis filter 1/A(z)]
The encoded bit rate is low, but the naturalness of the reconstructed speech is poor.
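1-3 Mixed Excitation Models (Itakura & Saito, 1968)
[Figure: the pulse generator output weighted by V and the noise generator output weighted by U are summed and passed through 1/A(z)]
[Figure: V plotted against $R_e(n_p)/R_e(0)$, with U = 1 - V; axis marks at 0.18, 0.25 and 0.50]
Search the range 3-15 ms for the peak position $n_p$ of $R_e(n)/R_e(0)$.
$$\begin{cases} V = 1,\ U = 0 \ (\text{voiced}), & R_e(n_p)/R_e(0) > 0.25 \\ 0 < V < 1,\ U = 1 - V, & 0.18 \le R_e(n_p)/R_e(0) \le 0.25 \\ V = 0,\ U = 1 \ (\text{unvoiced}), & R_e(n_p)/R_e(0) < 0.18 \end{cases}$$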
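The decision rule above can be sketched as follows. Three things are assumptions rather than part of the slides: the sampling rate of 8 kHz, the reading of $R_e$ as the autocorrelation of the residual e(n), and the linear ramp used for V between the two thresholds (the source only states 0 < V < 1 there).

```python
def autocorr(e, k):
    """Short-time autocorrelation of the frame e at lag k."""
    return sum(e[n] * e[n + k] for n in range(len(e) - k))

def voicing_weights(e, fs=8000):
    """Return (V, U, n_p) for one frame of residual e(n)."""
    lo, hi = round(0.003 * fs), round(0.015 * fs)   # 3-15 ms search range
    r0 = autocorr(e, 0)
    if r0 == 0:
        return 0.0, 1.0, None                       # silent frame: treat as unvoiced
    n_p = max(range(lo, hi + 1), key=lambda k: autocorr(e, k))
    rho = autocorr(e, n_p) / r0
    if rho > 0.25:
        v = 1.0
    elif rho < 0.18:
        v = 0.0
    else:
        v = (rho - 0.18) / (0.25 - 0.18)            # assumed linear transition
    return v, 1.0 - v, n_p

# Hypothetical test frame: an impulse train with a period of 50 samples.
frame = [1.0 if n % 50 == 0 else 0.0 for n in range(320)]
v, u, n_p = voicing_weights(frame)
print(v, u, n_p)  # → 1.0 0.0 50
```

The frame must be longer than the largest lag searched (120 samples at 8 kHz).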
1-4 Multiband Excitation Model
The speech band is divided into several subbands, and a voiced/unvoiced decision is made separately for each subband, so the excitation signal can be modeled more accurately.
1-5 Baseband Excitation Model
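[Figure: analysis side: x(n) → A(z) → e(n) → baseband extraction, with A(z) obtained by LP analysis; synthesis side: high-frequency regeneration → 1/A(z)]
Baseband Extraction:
[Figure: e(n), bandwidth W → LPF → b(n), bandwidth B → decimation L:1]
$f_c = 1000$ Hz, $F_s = 8$ kHz, $L = 4$, $B = W/L = 1$ kHz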
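Baseband extraction with these parameters can be sketched as a low-pass filter followed by 4:1 decimation. The filter design (a 63-tap Hamming-windowed sinc) is an assumption made for illustration; the slides do not specify the LPF.

```python
import math

def lowpass_fir(fc, fs, taps=63):
    """Hamming-windowed sinc low-pass FIR with cutoff fc in Hz (assumed design)."""
    mid = (taps - 1) // 2
    h = []
    for n in range(taps):
        t = n - mid
        ideal = 2 * fc / fs if t == 0 else math.sin(2 * math.pi * fc * t / fs) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (taps - 1))
        h.append(ideal * window)
    return h

def extract_baseband(e, fs=8000, fc=1000, L=4):
    """Low-pass e(n) at fc, then keep every L-th sample (decimation L:1)."""
    h = lowpass_fir(fc, fs)
    filtered = [sum(h[j] * e[n - j] for j in range(len(h)) if n - j >= 0)
                for n in range(len(e))]
    return filtered[::L]

# Hypothetical check: a 250 Hz tone lies in the baseband, a 3 kHz tone does not.
tone_lo = [math.sin(2 * math.pi * 250 * n / 8000) for n in range(400)]
tone_hi = [math.sin(2 * math.pi * 3000 * n / 8000) for n in range(400)]
b_lo = extract_baseband(tone_lo)
b_hi = extract_baseband(tone_hi)
print(len(b_lo), max(abs(v) for v in b_lo[20:]), max(abs(v) for v in b_hi[20:]))
```

After decimation the baseband signal b(n) runs at 2 kHz, a quarter of the original sample count.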
HFR:
[Figure: interpolation 1:L → HFR → HPF]
1-6 Multi-pulse Excitation Model
Bishnu S. Atal, 1982, ICASSP
[Figure: e(n) drawn as pulses with amplitudes g_1, g_2, g_3, ..., g_M at positions n_1, n_2, n_3, ..., n_M]
$$e(n) = \sum_{i=1}^{M} g_i\,\delta(n - n_i), \qquad M = 4 \text{ to } 8 \text{ pulses per 5 ms}$$
Regular-pulse excitation is applied in the GSM vocoder.
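As a small illustration of the multi-pulse formula, the sketch below builds e(n) for one 5 ms block (40 samples, assuming 8 kHz sampling). The pulse positions and gains are hypothetical; in a real coder they are chosen by an analysis-by-synthesis search, which is not shown here.

```python
def multipulse_excitation(length, positions, gains):
    """Build e(n) = sum_i g_i * delta(n - n_i) for one block."""
    e = [0.0] * length
    for n_i, g_i in zip(positions, gains):
        e[n_i] += g_i
    return e

# Hypothetical block with M = 5 pulses in 40 samples.
positions = [3, 11, 20, 29, 36]
gains = [1.0, -0.6, 0.8, 0.3, -0.4]
e = multipulse_excitation(40, positions, gains)
print([n for n, v in enumerate(e) if v != 0.0])  # → [3, 11, 20, 29, 36]
```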
1-7 Code Excitation Model
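[Figure: analysis: x(n) passes through the short-term predictor (STP) stage F(z) and the long-term predictor (LTP) stage P(z), leaving the second residual e(n). Synthesis: a codebook of 1024 codewords × 40 samples (10 bits per 40 samples) drives the P(z) and F(z) stages to reconstruct x(n).]
Short-term predictor: $F(z) = \sum_{k=1}^{p} a(k)\,z^{-k}$, 20 ms
Long-term predictor: $P(z) = \sum_{i=-I}^{I} c(i)\,z^{-(T_0+i)}$, 5 ms, with I = 0 or I = 1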
Manfred R. Schroeder and Bishnu S. Atal, 1985, ICASSP
CELP: Code Excited Linear Prediction
G.728 LD-CELP 16 kbps
G.729 CS-ACELP 8 kbps (IP phone)
G.723.1 ACELP 5.3 kbps
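The effect of the long-term predictor P(z) in Section 1-7 can be sketched as below. This assumes the second residual is obtained by filtering the first residual d(n) with 1 - P(z); the tap values, the lag T0 = 40 and the test signal are hypothetical.

```python
def ltp_residual(d, T0, c):
    """Filter d(n) with 1 - P(z), P(z) = sum_{i=-I}^{I} c(i) z^-(T0+i).

    c holds the 2I+1 taps in the order i = -I, ..., I.
    """
    I = (len(c) - 1) // 2
    out = []
    for n in range(len(d)):
        pred = 0.0
        for i in range(-I, I + 1):
            m = n - (T0 + i)
            if m >= 0:
                pred += c[i + I] * d[m]
        out.append(d[n] - pred)
    return out

# Hypothetical first residual: one pulse per pitch period of 40 samples.
d = [1.0 if n % 40 == 0 else 0.0 for n in range(160)]
e2 = ltp_residual(d, T0=40, c=[1.0])   # I = 0: a single tap at lag T0
print(sum(abs(v) for v in e2))         # → 1.0
```

With a perfectly periodic input and a unit tap at the pitch lag, only the first pulse survives, which is the redundancy the LTP stage is meant to remove before the codebook search.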
Ⅱ.Pitch Period Estimation
2-1 Introduction
·Definition: Fundamental frequency (F0) is the rate at which the vocal folds open and close.
The reciprocal of F0 is the pitch period:
$$T_0 = \frac{1}{F_0}$$
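·Range: 25-2.5 ms (40-400 Hz)
·Importance: the fundamental frequency is a basic acoustic parameter of speech. It is used in speech recognition, speaker recognition, speech synthesis and speech coding.
·The four tones of Mandarin Chinese are determined by how F0 changes.
[Figure: F0 contours on a five-level scale (1 to 5) for the syllable "ma": 妈 (mā) 5-5, 麻 (má) 3-5, 马 (mǎ) 2-1-4, 骂 (mà) 5-1]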
·Difficulties in estimating F0:
⑴ Transitional speech segments are hard to classify as voiced or unvoiced (V/U).
⑵ F0 varies over a wide range.
⑶ The vocal tract affects the estimation of the excitation parameters.
·Types of Pitch Detectors
⑴Time-Domain
⑵Frequency-Domain
⑶Hybrid
2-2 Clipping Autocorrelation Pitch Detector
Center clipping:
$$y(n) = \begin{cases} x(n) - C_L, & x(n) > C_L \\ 0, & |x(n)| \le C_L \\ x(n) + C_L, & x(n) < -C_L \end{cases}$$
Three-level clipping:
$$y(n) = \begin{cases} 1, & x(n) > C_L \\ 0, & |x(n)| \le C_L \\ -1, & x(n) < -C_L \end{cases}$$
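Both clipping functions are simple enough to sketch directly. The clipping level C_L is passed in as a parameter; how it is chosen per frame is not covered here.

```python
def center_clip(x, cl):
    """Center clipping: remove the band [-cl, cl] and shift the rest toward zero."""
    out = []
    for v in x:
        if v > cl:
            out.append(v - cl)
        elif v < -cl:
            out.append(v + cl)
        else:
            out.append(0.0)
    return out

def three_level_clip(x, cl):
    """Three-level clipping: keep only the sign of samples beyond the clipping level."""
    return [1 if v > cl else -1 if v < -cl else 0 for v in x]

samples = [0.9, 0.2, -0.5, -0.1]           # hypothetical input samples
print(center_clip(samples, 0.3))
print(three_level_clip(samples, 0.3))       # → [1, 0, -1, 0]
```

Three-level clipping reduces the products in the autocorrelation to values in {-1, 0, 1}, so the autocorrelation can be computed with additions only.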
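$$R(k) = \sum_{n=0}^{N-1-k} x(n)\,x(n+k), \qquad k = 0, k_1, \ldots, k_2$$
$k_1 = 20$ (start point): 2 ms, 500 Hz
$k_2 = 200$ (end point): 20 ms, 50 Hz
$I_{pk}$ is the peak value of R(k) for $k_1 \le k \le k_2$, and $I_{pos}$ is its position.
$$\frac{I_{pk}}{R(0)} < 0.3:\ \text{U}; \qquad \frac{I_{pk}}{R(0)} \ge 0.3:\ \text{V},\ T_0 = I_{pos}$$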
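The autocorrelation search and the 0.3 threshold can be sketched as follows. A sampling rate of 10 kHz is assumed, since it makes lags 20 and 200 correspond to 2 ms and 20 ms; the test signals are hypothetical.

```python
import math
import random

def autocorr_pitch(x, fs=10000, k1=20, k2=200, threshold=0.3):
    """Return (voiced, T0 in seconds) for one frame; T0 is 0.0 when unvoiced."""
    def r(k):
        return sum(x[n] * x[n + k] for n in range(len(x) - k))
    r0 = r(0)
    if r0 <= 0:
        return False, 0.0                    # silent frame
    ipos = max(range(k1, k2 + 1), key=r)     # position of the autocorrelation peak
    ipk = r(ipos)
    if ipk / r0 >= threshold:
        return True, ipos / fs
    return False, 0.0

# Hypothetical frames: a 100 Hz tone (period 100 samples) and white noise.
tone = [math.sin(2 * math.pi * 100 * n / 10000) for n in range(400)]
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(400)]
print(autocorr_pitch(tone))
print(autocorr_pitch(noise))
```

The function works on whatever frame it is given, so a center-clipped or three-level-clipped frame can be passed in place of x(n). The frame must be longer than k2 samples.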
2-3 Simplified Inverse Filtering Tracking (SIFT) Algorithm
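[Figure: SIFT block diagram. s(n) (fs = 10 kHz) → LPF (900 Hz) → decimation 5:1 → x(n) → inverse filter A(z) → e(n) → compute R(k) → peak pick (Ipk, Ipos) → parabolic interpolation 1:5 → decision rule → V/U and T0 (T0 = 0 when unvoiced). A(z) is obtained by LP analysis of order p = 4, using a Hamming window w_H(n).]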
Pitch period range (Ipos):
$k_L = 6$: $(6-1) \times \frac{5}{10} = 2.5$ ms, i.e. 400 Hz
$k_H = 32$: $(32-1) \times \frac{5}{10} = 15.5$ ms, i.e. 64.5 Hz
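The lag-to-period arithmetic above, and a refinement of the peak position, can be sketched as follows. The three-point parabola fit is an assumption about how the "parabolic interpolation" block works; the slides name the block without giving its formula.

```python
def lag_to_ms(k, decimation=5, fs_khz=10):
    """Pitch period in ms for lag index k of the 5:1 decimated signal."""
    return (k - 1) * decimation / fs_khz

def parabolic_peak(r, k):
    """Refine the integer peak index k of r with a 3-point parabola fit (assumed method)."""
    y0, y1, y2 = r[k - 1], r[k], r[k + 1]
    denom = y0 - 2 * y1 + y2
    if denom == 0:
        return float(k)
    return k + 0.5 * (y0 - y2) / denom

print(lag_to_ms(6), 1000 / lag_to_ms(6))    # → 2.5 400.0
print(lag_to_ms(32))                         # → 15.5

# Hypothetical autocorrelation shaped like a parabola peaking at 3.3.
r = [-(i - 3.3) ** 2 for i in range(7)]
print(parabolic_peak(r, 3))                  # close to 3.3
```

At the decimated rate of 2 kHz one lag step is 0.5 ms, so interpolation is what brings the period resolution back toward that of the original 10 kHz signal.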
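Method for judging Ipk: variable threshold
[Figure: decision threshold applied to Ipk as a function of T0, with T0 axis marks at 2.5 ms, 9 ms and 15.5 ms; the threshold labels are garbled in the source and read "Ipk 0.315 R(k)", "Ipk 0.35 R(k)" and "Ipk 0.7 R(k)"]
Delayed decision of the pitch period in the speech stream:
$$T_{02} = \frac{T_{01} + T_{03}}{2}$$
(T01, T02, T03 denote the pitch periods of three consecutive frames)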
Modified autocorrelation:
$$R(k) = \sum_{n=0}^{N-1} x(n)\,x(n+k)$$
N is taken as twice T0; alternatively, the longest T0 or an adaptive method can be used.
2-4 Cepstral Method
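[Figure: cepstral pitch detector. s(n) (Fs = 10 kHz) → framing (51.2 ms/frame) → Hamming window w_H(n) → FFT → log → IFFT → cepstrum C(n) → peak pick (Ipk, Ipos) → decision rule → V/U and T0 (T0 = 0 when unvoiced). The energy E(n) and the zero-crossing measure Z(n) also feed the decision rule (high Z(n): U, low Z(n): V).]
[Figure: decision flowchart. If E(n) indicates silence, move to the next frame. Otherwise compute the cepstrum C(n) and pick its peak. If C(n) > T the frame is voiced (V). If not, Z(n) > t gives unvoiced (U), otherwise voiced (V).]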
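The FFT → log → IFFT → peak pick chain can be sketched as below for a single 512-sample frame (51.2 ms at 10 kHz). The search range of lags 20 to 200 is borrowed from Section 2-2 as an assumption, the energy and zero-crossing tests are left out, and the synthetic frame is hypothetical.

```python
import cmath
import math

def fft(a):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return [complex(a[0])]
    even, odd = fft(a[0::2]), fft(a[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def real_cepstrum(frame):
    """Hamming window, log magnitude spectrum, then inverse transform."""
    n = len(frame)
    windowed = [frame[i] * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i in range(n)]
    log_mag = [math.log(abs(v) + 1e-12) for v in fft(windowed)]
    # The log magnitude is real and even, so its inverse FFT equals forward FFT / n.
    return [v.real / n for v in fft(log_mag)]

def cepstral_peak(frame, lo=20, hi=200):
    """Return (Ipk, Ipos): the largest cepstral value in [lo, hi] and its position."""
    c = real_cepstrum(frame)
    ipos = max(range(lo, hi + 1), key=lambda q: c[q])
    return c[ipos], ipos

# Hypothetical voiced frame: pulses every 80 samples through a one-pole filter.
frame, y = [], 0.0
for n in range(512):
    y = (1.0 if n % 80 == 0 else 0.0) + 0.9 * y
    frame.append(y)
ipk, ipos = cepstral_peak(frame)
print(ipos, ipos / 10000)   # peak position in samples and in seconds
```

The voiced/unvoiced decision would then compare Ipk against the threshold T from the flowchart, whose value the slides do not give.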
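[Figure: cepstrum C(n) versus n, with the peak value Ipk at position Ipos]
2-5 AMDF (Short-Time Average Magnitude Difference Function)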
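A minimal sketch of AMDF-based pitch detection is given below. It assumes one common form of the function, D(k) = (1/(N-k)) Σ |x(n) - x(n+k)|, and picks the lag with the deepest valley; the lag range is borrowed from Section 2-2 and the test signal is hypothetical.

```python
import math

def amdf(x, k):
    """Average magnitude difference of the frame x at lag k (assumed normalization)."""
    n_terms = len(x) - k
    return sum(abs(x[n] - x[n + k]) for n in range(n_terms)) / n_terms

def amdf_pitch_lag(x, k1=20, k2=200):
    """Lag in [k1, k2] where the AMDF is smallest."""
    return min(range(k1, k2 + 1), key=lambda k: amdf(x, k))

# Hypothetical frame: an 80 Hz tone at 10 kHz, i.e. a period of 125 samples.
x = [math.sin(2 * math.pi * n / 125) for n in range(400)]
print(amdf_pitch_lag(x))  # → 125
```

Unlike the autocorrelation, the AMDF needs no multiplications, and the pitch shows up as a valley rather than a peak. Valleys also occur at multiples of the period, so the search range matters.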
2-6 Post-processing of Pitch Detection
Smoothing (median smoothing, linear smoothing, combined smoothing, Gaussian smoothing)
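Median smoothing, the first method listed, can be sketched as follows. The window width of 5 frames and the shrinking window at the track edges are assumptions; the pitch track values are hypothetical.

```python
from statistics import median

def median_smooth(track, width=5):
    """Replace each value by the median of a window centered on it.

    Near the ends of the track the window is truncated (an assumed edge rule).
    """
    half = width // 2
    out = []
    for i in range(len(track)):
        lo, hi = max(0, i - half), min(len(track), i + half + 1)
        out.append(median(track[lo:hi]))
    return out

# Hypothetical pitch track in ms with a doubling error (16.2) and a halving error (4.1).
track = [8.0, 8.1, 16.2, 8.2, 8.3, 8.2, 4.1, 8.4, 8.5]
smoothed = median_smooth(track)
print(smoothed[2], smoothed[6])  # → 8.2 8.3
```

The median removes isolated doubling and halving errors without blurring genuine steps in the contour, which a linear smoother would do.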