Lui Fungsin writes:
Hi,
Hi, thanks a lot for your interest in cl-wav-synth!
I just finished watching cl-wav-synth demo tutorial, it's way cool!
thanks :)
I see that this is a new project and not much traffic here, so I hope that you guys wouldn't mind a dumb question.
I'm clueless about audio, the wav file format, etc. However, there's a simple task that I want to try my hand at with the cl-wav-synth library.
Here are two sound files for some Chinese words. Some words have more than one pronunciation (like the first file below), while most of the others have only one.
http://209.172.124.170/pub/two_tone.wav http://209.172.124.170/pub/single_tone.wav
Is it possible to programmatically detect whether there's a voice uttered at the beginning of a wav file, then a short period of silence, and then another voice uttered?
If this is the case, I want to split that into two files (break at the silence). Otherwise I can just leave it alone.
If this can be done, I'd greatly appreciate it if someone could briefly describe the procedure or point me in the right direction (URL to read, etc.).
Here is how I would write this (load it from SLIME or the CLIM REPL):
--------------------------------------------------
(in-package :wav)

(defun find-peak (sample &optional (max-level 5000) (min-level 100) (min-index 1000))
  "Count the tones in SAMPLE.
Return two values: the tone count, and a list of the indices at which
each tone was detected to have ended."
  (with-slots (data) sample
    (let ((count 0)
          (find-max nil)   ; true once a loud value has been seen
          (find-min 0)     ; length of the current run of quiet values
          (acc nil))       ; indices collected so far, in reverse order
      (loop for value across data
            for index from 0
            do (cond ((> (abs value) max-level)
                      ;; Loud value: we are inside a tone.
                      (setf find-max t
                            find-min 0))
                     ((< (abs value) min-level)
                      ;; Quiet value: extend the silence run.
                      (incf find-min)
                      (when (and find-max (> find-min min-index))
                        ;; Enough silence after a tone: count it.
                        (incf count)
                        (setf find-max nil)
                        (push index acc)))
                     ;; Neither loud nor quiet: the silence run is broken.
                     (t (setf find-min 0))))
      (values count (nreverse acc)))))
--------------------------------------------------
Then in the clim REPL:
WAV> Load As Sample (pathname) single_tone.wav
WAV> (with-sample (find-peak it))
0 1
1 (17525)

WAV> Load As Sample (pathname) two_tone.wav
WAV> (with-sample (find-peak it))
0 2
1 (23303 60504)

WAV> (set-sample (mix it (delay it 4)))
WAV> (with-sample (find-peak it))
0 4
1 (23303 60504 111503 148704)
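The first value is the number of tones in the file. The second value is a list with one index per tone: the position in the data array at which that tone was detected to have ended (that is, once the silence after it has lasted longer than min-index).
Then you can do what you want with these values.
For example, to isolate the first tone: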
WAV> (set-sample (cut-i it 0 23303))
WAV> (with-sample (write-sample "first-tone.wav" it))
To isolate the second tone:
WAV> (set-sample (cut-i it 23303 60504))
Etc...
And if you want to automate this and save a file per tone:
--------------------------------------------------
(with-sample
  (multiple-value-bind (total-count index) (find-peak it)
    (loop for i in index
          for s = 0 then e
          for e = i
          for count from 0
          do (write-sample (format nil "tone-~A.wav" count)
                           (cut-i it s e)))))
--------------------------------------------------
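The `for s = 0 then e` / `for e = i` pair in the loop above walks the index list as consecutive (start end) ranges. If you want to check those ranges before writing any file, here is a minimal sketch in plain Common Lisp that only builds them. It makes no cl-wav-synth calls, and `index-ranges` is a hypothetical helper name, not part of the library:

```lisp
(defun index-ranges (indices)
  "Turn a list of tone-end indices into a list of (start end) pairs.
INDEX-RANGES is a hypothetical helper, not part of cl-wav-synth."
  (loop for end in indices
        ;; The first range starts at 0; each later one starts
        ;; where the previous range ended.
        for start = 0 then previous-end
        for previous-end = end
        collect (list start end)))

;; With the indices found for two_tone.wav:
;; (index-ranges '(23303 60504))  =>  ((0 23303) (23303 60504))
```

Each pair is what would be passed as the start and end arguments to cut-i in the loop above.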
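Note: a sample is just a wav header (bits per sample...) and a big array of data.
You can adjust the levels:
- Max and min level are the detection thresholds: a value above max-level counts as part of a tone, a value below min-level counts as silence.
- Min index is the minimum length of the silence, counted in samples (array indices).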
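To get a feel for these three parameters without loading a wav file, here is a rough sketch of the same detection logic run on a plain vector of numbers. It is only an illustration: `count-tones` and `*test-signal*` are hypothetical names, the keyword arguments are my choice for readability, and nothing here calls cl-wav-synth.

```lisp
(defun count-tones (data &key (max-level 5000) (min-level 100) (min-index 1000))
  "Same state machine as FIND-PEAK, but on a plain vector of numbers.
COUNT-TONES is a hypothetical name used only for this sketch."
  (let ((count 0) (loud-seen nil) (quiet-run 0) (ends nil))
    (loop for value across data
          for index from 0
          do (cond ((> (abs value) max-level)   ; loud: inside a tone
                    (setf loud-seen t
                          quiet-run 0))
                   ((< (abs value) min-level)   ; quiet: extend the silence run
                    (incf quiet-run)
                    (when (and loud-seen (> quiet-run min-index))
                      (incf count)
                      (setf loud-seen nil)
                      (push index ends)))
                   (t (setf quiet-run 0))))     ; in between: run is broken
    (values count (nreverse ends))))

;; Synthetic signal: 10 loud values, 20 silent, 10 loud, 20 silent.
(defparameter *test-signal*
  (concatenate 'vector
               (make-list 10 :initial-element 8000)
               (make-list 20 :initial-element 0)
               (make-list 10 :initial-element 8000)
               (make-list 20 :initial-element 0)))

;; (count-tones *test-signal* :min-index 5)   =>  2, (15 45)
;; (count-tones *test-signal* :min-index 25)  =>  0, NIL
```

With min-index set to 5, both tones are found. With min-index set to 25, the 20-value silences are too short to count, so nothing is detected. The same thing happens on a real file if min-index is longer than the pause between the two pronunciations.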
Many thanks.
I hope that helps.
fungsin
Philippe