Re: [cl-wav-synth-devel] newbie question

9 Oct 2006

      Lui Fungsin writes:
...
Hi,
Hi, thanks a lot for your interest in cl-wav-synth!
...
I just finished watching cl-wav-synth demo tutorial, it's way cool!
thanks :)
...
I see that this is a new project and not much traffic here, so I hope
that you guys wouldn't mind a dumb question.
I'm clueless with audio and wav file format, etc.
However, there's a simple task that I want to try my hands on with the
cl-wav-synth library.
Here are two sound files for some Chinese words. Some words have more
than one pronunciation (like the first file below), while most of the
others have only one.
http://209.172.124.170/pub/two_tone.wav
http://209.172.124.170/pub/single_tone.wav
Is it possible to programmatically detect whether there's a voice
uttered at the beginning of a wav file, then a short period of silence,
and then another voice uttered?
If this is the case, I want to split that into two files (break at the
silence). Otherwise I can just leave it alone.
If this can be done, I'd greatly appreciate it if someone could briefly
describe the procedure, or point me in the right direction (URL to
read, etc).
Here is how I would write this (load it from SLIME or the CLIM REPL):

--------------------------------------------------
(in-package :wav)

(defun find-peak (sample &optional (max-level 5000) (min-level 100) (min-index 1000))
  "Count the tones in SAMPLE. Return two values: the tone count and
  the list of indices at which each tone was found to have ended."
  (with-slots (data) sample
    (let ((count 0)
          (find-max nil)   ; true once a peak above MAX-LEVEL has been seen
          (find-min 0)     ; length of the current run below MIN-LEVEL
          (acc nil))       ; indices where a tone ended, most recent first
      (loop for value across data
            for index from 0 do
            (cond ((> (abs value) max-level) (setf find-max t
                                                   find-min 0))
                  ((< (abs value) min-level)
                   (incf find-min)
                   ;; A peak followed by more than MIN-INDEX quiet samples
                   ;; closes one tone.
                   (when (and find-max (> find-min min-index))
                     (incf count)
                     (setf find-max nil)
                     (push index acc)))
                  (t (setf find-min 0))))
      (values count (nreverse acc)))))
--------------------------------------------------
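
The detection logic can also be tried without a wav file. What follows
is only a minimal sketch of the same loop on a plain vector of numbers.
The name count-tones is hypothetical (it is not part of cl-wav-synth),
it takes keyword arguments instead of optional ones, and the small
min-index in the usage line is chosen only to suit the tiny synthetic
input.

```lisp
;; Hypothetical stand-alone version of the loop in find-peak: it works on
;; any vector of numbers instead of a cl-wav-synth sample object.
(defun count-tones (data &key (max-level 5000) (min-level 100) (min-index 1000))
  "Return the tone count and the list of indices where each tone was closed."
  (let ((count 0)
        (peak-seen nil)   ; true once a value above MAX-LEVEL has been seen
        (quiet-run 0)     ; number of consecutive values below MIN-LEVEL
        (ends '()))
    (loop for value across data
          for index from 0
          do (let ((level (abs value)))
               (cond ((> level max-level)
                      (setf peak-seen t
                            quiet-run 0))
                     ((< level min-level)
                      (incf quiet-run)
                      ;; A tone ends once enough silence follows a peak.
                      (when (and peak-seen (> quiet-run min-index))
                        (incf count)
                        (setf peak-seen nil)
                        (push index ends)))
                     (t (setf quiet-run 0)))))
    (values count (nreverse ends))))

;; Two loud bursts separated by silence, with a short silence threshold.
(count-tones #(0 0 9000 9000 0 0 0 0 9000 0 0 0) :min-index 2)
;; => 2, (6 11)
```

Each reported index lies just past min-index quiet samples after the
tone, not at the tone's last loud sample.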

Then in the clim REPL:

WAV> Load As Sample (pathname) single_tone.wav
WAV> (with-sample (find-peak it))
0 1
1 (17525)

WAV> Load As Sample (pathname) two_tone.wav
WAV> (with-sample (find-peak it))
0 2
1 (23303 60504)

WAV> (set-sample (mix it (delay it 4)))
WAV> (with-sample (find-peak it))
0 4
1 (23303 60504 111503 148704)

The first value is the number of tones in the file.
The second value is a list with one index per tone: the sample index
at which that tone was detected as finished.

Then you can do what you want with these values.

For example to isolate the first tone:

WAV> (set-sample (cut-i it 0 23303))
WAV> (with-sample (write-sample "first-tone.wav" it))

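To isolate the second tone (reload the original file first, because
set-sample has replaced the current sample with the first tone):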

WAV> (set-sample (cut-i it 23303 60504))

Etc...

And if you want to automate this and save a file per tone:

--------------------------------------------------
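(with-sample
  (multiple-value-bind (total-count indices)
      (find-peak it)
    (declare (ignore total-count))   ; only the indices are needed here
    (loop for i in indices
          for start = 0 then end     ; each tone starts where the last ended
          for end = i
          for count from 0
          do (write-sample (format nil "tone-~A.wav" count)
                           (cut-i it start end)))))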
--------------------------------------------------
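
Before writing any file, it can help to look at the ranges that the
loop above would cut. This is a small sketch in plain Common Lisp; the
name tone-ranges is hypothetical and not part of cl-wav-synth.

```lisp
;; Hypothetical helper: pair each end index with the previous one,
;; the first tone starting at index 0.
(defun tone-ranges (end-indices)
  "Turn a list of end indices into a list of (start end) pairs."
  (loop for start in (cons 0 end-indices)
        for end in end-indices
        collect (list start end)))

(tone-ranges '(23303 60504))
;; => ((0 23303) (23303 60504))
```

Whatever follows the last index (the trailing silence) is not part of
any range, which matches what the file-writing loop does.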

Note: a sample is just a wav header (bits per sample, ...) and a big
array of data.

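You can adjust the levels:
  - max-level and min-level are the detection levels: a sample above
    max-level (in absolute value) counts as voice, one below min-level
    counts as silence.
  - min-index is the minimal length of the silence, counted in samples.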
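
Since min-index is counted in samples, a silence length given in
milliseconds has to be converted first. A minimal sketch, assuming the
sample rate is known: silence-samples is a hypothetical helper, and
22050 Hz is only an example rate (read the real one from the wav
header).

```lisp
;; Hypothetical helper: convert a silence duration into a sample count
;; suitable for the min-index argument of find-peak.
(defun silence-samples (milliseconds sample-rate)
  "Number of samples covering MILLISECONDS at SAMPLE-RATE (in Hz)."
  (round (* milliseconds sample-rate) 1000))

(silence-samples 100 22050)   ; => 2205

;; With cl-wav-synth loaded, the result could then be passed on, e.g.
;; (find-peak it 5000 100 (silence-samples 100 22050))
```

At that example rate, the default min-index of 1000 corresponds to
about 45 ms of silence.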
...
Many thanks.
I hope that helps.
...
fungsin
Philippe

-- 
Philippe Brochard    <hocwp@free.fr>
                      http://hocwp.free.fr

-=-= http://www.gnu.org/home.fr.html =-=-