Image and Speech recognition

Detection of speaker

(P1) Read and write a .wav file. Acquire learning samples of speech from
different speakers and write them as .wav files. Perform an initial
segmentation into phonemes and silence. Detect voiced and unvoiced phonemes
in the time signal. Present the waveform and the 2-D spectrogram in graphic
windows.
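
The .wav handling and the silence/voiced/unvoiced decision in (P1) could be sketched as follows. This is a minimal illustration, not a prescribed method: it assumes 16-bit mono PCM files, non-overlapping 25 ms frames, and a simple rule based on short-time energy and zero-crossing rate. The function names, the thresholds (energy_thr, zcr_thr) and the synthetic test signal are assumptions made for the example; real speech will need tuned thresholds.

```python
import math
import os
import random
import struct
import tempfile
import wave

def write_wav(path, samples, rate=16000):
    """Write floats in [-1, 1] as 16-bit mono PCM."""
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(rate)
        ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
        wf.writeframes(struct.pack("<%dh" % len(ints), *ints))

def read_wav(path):
    """Read a 16-bit mono PCM file; return (float samples, sample rate)."""
    with wave.open(path, "rb") as wf:
        assert wf.getnchannels() == 1 and wf.getsampwidth() == 2
        rate = wf.getframerate()
        raw = wf.readframes(wf.getnframes())
    ints = struct.unpack("<%dh" % (len(raw) // 2), raw)
    return [i / 32767.0 for i in ints], rate

def classify_frames(samples, frame_len=400, energy_thr=1e-4, zcr_thr=0.25):
    """Label each non-overlapping frame 'silence', 'voiced' or 'unvoiced'.

    Assumed rule: low energy -> silence; otherwise a low zero-crossing
    rate suggests a voiced (periodic) frame, a high one an unvoiced frame.
    """
    labels = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(s * s for s in frame) / frame_len
        crossings = sum(1 for a, b in zip(frame, frame[1:])
                        if (a >= 0) != (b >= 0))
        zcr = crossings / (frame_len - 1)
        if energy < energy_thr:
            labels.append("silence")
        elif zcr < zcr_thr:
            labels.append("voiced")
        else:
            labels.append("unvoiced")
    return labels

# Synthetic stand-in for a recording: silence, a 150 Hz tone, white noise.
random.seed(0)
rate = 16000
n = 1600  # 0.1 s per segment, i.e. four 25 ms frames each
silence = [0.0] * n
tone = [0.5 * math.sin(2 * math.pi * 150 * i / rate) for i in range(n)]
noise = [random.uniform(-0.3, 0.3) for _ in range(n)]

path = os.path.join(tempfile.mkdtemp(), "sample.wav")
write_wav(path, silence + tone + noise, rate)
samples, read_rate = read_wav(path)
labels = classify_frames(samples)
```

Runs of equal labels in `labels` give a first, coarse segmentation; splitting the non-silent runs further into individual phonemes is left to the project.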

(P2) Perform the windowed Fast Fourier Transform (FFT) and present the
resulting spectrogram image. Detect the spectral maxima in every window to
locate the formants, and mark the formants in the spectrogram image.
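
One way to approach (P2) is sketched below, using only the standard library: a Hamming-windowed radix-2 FFT per frame, followed by picking the strongest local maxima of each magnitude spectrum as formant candidates. The frame length, hop size, minimum peak separation (min_sep_hz) and the two-tone test signal are assumptions for illustration. On real speech, raw FFT maxima tend to follow pitch harmonics, so a smoothed spectral envelope may be needed before peak picking.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def spectrogram(samples, frame_len=512, hop=256):
    """Magnitude spectrogram: frame_len // 2 + 1 bins per frame."""
    window = [0.54 - 0.46 * math.cos(2 * math.pi * i / (frame_len - 1))
              for i in range(frame_len)]  # Hamming window
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        seg = [samples[start + i] * window[i] for i in range(frame_len)]
        spec = fft(seg)
        frames.append([abs(c) for c in spec[:frame_len // 2 + 1]])
    return frames

def peak_frequencies(mag, rate, frame_len, n_peaks=2, min_sep_hz=200.0):
    """Strongest local maxima of one frame, in Hz, in ascending order.

    Peaks closer than min_sep_hz to an already chosen peak are skipped.
    """
    bin_hz = rate / frame_len
    cands = [k for k in range(1, len(mag) - 1)
             if mag[k] > mag[k - 1] and mag[k] >= mag[k + 1]]
    cands.sort(key=lambda k: mag[k], reverse=True)
    chosen = []
    for k in cands:
        if all(abs(k - c) * bin_hz >= min_sep_hz for c in chosen):
            chosen.append(k)
        if len(chosen) == n_peaks:
            break
    return sorted(k * bin_hz for k in chosen)

# Test signal: two tones standing in for two formants (assumed values).
rate = 8000
frame_len = 512
signal = [math.sin(2 * math.pi * 750 * i / rate)
          + 0.6 * math.sin(2 * math.pi * 2000 * i / rate)
          for i in range(2048)]

spec = spectrogram(signal, frame_len=frame_len, hop=256)
peaks = [peak_frequencies(frame, rate, frame_len) for frame in spec]
```

Each entry of `spec` is one column of the spectrogram image, and each entry of `peaks` holds the frequencies to overlay on that column.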

(P1, P2) Propose and test a normalization procedure for the MFCC
(mel-frequency cepstral coefficient) feature-detection procedure, based on
the detected border positions of the formants.
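
As one possible starting point for the normalization task, the sketch below scales the frequency axis linearly so that a speaker's detected formants line up with a set of reference formants, and shifts the mel filterbank centres accordingly. This is only an assumed candidate procedure, not the expected solution. The formant values, the reference set, the least-squares choice of the scale factor alpha, and all function names are hypothetical. The mel conversion uses the common 2595 * log10(1 + f / 700) form.

```python
import math

def hz_to_mel(f):
    """Convert frequency in Hz to mel."""
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def warp_factor(detected, reference):
    """Least-squares scale alpha minimising sum((alpha * d - r) ** 2)."""
    num = sum(d * r for d, r in zip(detected, reference))
    den = sum(d * d for d in detected)
    return num / den

def mel_centres(n_filters, f_low, f_high):
    """Filter centre frequencies (Hz), equally spaced on the mel scale."""
    lo, hi = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]

def speaker_centres(n_filters, f_low, f_high, alpha):
    """Centres to apply to the speaker's unwarped spectrum.

    If reference = alpha * detected, sampling the raw spectrum at c / alpha
    is equivalent to sampling the warped spectrum at c.
    """
    return [c / alpha for c in mel_centres(n_filters, f_low, f_high)]

# Hypothetical formant positions (Hz) for one speaker and a reference.
detected = [550.0, 1650.0, 2750.0]
reference = [500.0, 1500.0, 2500.0]

alpha = warp_factor(detected, reference)
base = mel_centres(10, 0.0, 4000.0)
warped = speaker_centres(10, 0.0, 4000.0, alpha)
```

Whether such a warp actually reduces the variation of the MFCC features between speakers is exactly what the test part of the task would have to show.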

