Replying to a stranger with tips :)

Sometimes I really appreciate the tips I get from experts who don't even know me. In July last year, I was trying to understand one particular feature used in audio analysis, which ended up being very useful in my overall prototype for soccer event detection. I asked a simple YES/NO question of a reputed expert in low-level audio processing and analysis. Instead of just giving a binary reply of 1 or 0 (corresponding to Yes and No respectively, of course), he gave me some tips that turned out to be useful for my problem :) I really, really appreciate the help given by this smart guy.

1st. Question:

Hi :)

My name is Alfian, from Malaysia.
I hope you can help me with a basic question regarding the melfcc.m function.
I just wanted confirmation... is the first of the 13 coefficients returned the energy for the frame? (e.g. I set the frame size to 20 ms)
Sorry for the basic question, because I am just starting out in audio analysis.
Thank you

Regards: Alfian

1st. Reply:

Alfian

No, it's the average log energy (of the Mel spectral bins). It's highly correlated with (log) energy, but not equal. For one thing, averaging is done after taking log, whereas an energy calculation would sum first, then take the log.
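
To make that difference concrete for myself, here is a minimal Python sketch. It is not taken from melfcc.m: the function names and the band energies are made up for illustration, and it ignores whatever scaling the real MFCC computation applies. It only compares "log first, then average" with "sum first, then log".

```python
import math

def average_log_energy(band_energies):
    """Take the log of each Mel-band energy first, then average."""
    return sum(math.log(e) for e in band_energies) / len(band_energies)

def log_total_energy(band_energies):
    """Sum the band energies first, then take the log."""
    return math.log(sum(band_energies))

# Hypothetical Mel-band energies for two frames (illustrative values only).
quiet_frame = [1.0, 2.0, 4.0, 8.0]
loud_frame = [e * 10.0 for e in quiet_frame]  # same spectral shape, 10x the energy

for name, frame in (("quiet", quiet_frame), ("loud", loud_frame)):
    print(name, round(average_log_energy(frame), 3), round(log_total_energy(frame), 3))
```

The two numbers differ for each frame, but when the whole frame gets louder by a constant factor both rise by the same amount (the log of that factor), which fits the remark that the coefficient is highly correlated with log energy without being equal to it.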

2nd. Question:

thanks Dan for the fast reply... really2 appreciate it :)

I have one final question tho 
does this mean that I can use it (the first row of the returned 13-coeffs) as a means to see whether or not
there is increase in audio energy? I am actually trying to detect excited speech in a soccer match (lots of bground noise... because the audiences are cheering, clapping, screaming etc. for the whole duration of the game) 
Along with energy features, I am also looking at pitch (which I believe I have obtained using a function called shrp.m, which btw is not Dan's function... it was developed by a guy named Xuejing Sun).

so my plan is to observe these two features... then try to come up with a model for excited speech.
So in your opinion, can that be used? thanks in advance

Regards: Alfian 


2nd. Reply:

> does this mean that I can use it (the first row of the returned 13-coeffs)
> as a means to see whether or not
> there is increase in audio energy?
Yes.

> I am actually trying to detect excited
> speech in a soccer match (lots of bground noise... because the audiences are
> cheering, clapping, screaming etc. for the whole duration of the game)
> Along with energy features, I am also looking at pitch (which I believe I
> have obtained using a function called shrp.m).
> so my plan is to observe these two features... then try to come up with a
> model for excited speech.
> So in your opinion, can that be used?
It's worth a try.  Along with the pitch, its time derivative, and the pitch strength (subharmonic-to-harmonic ratio, the SHR output of shrp.m) might be useful.
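
As a rough illustration of the "time derivative" suggestion, here is a small Python sketch. Everything in it is an assumption for the sake of the example: the `time_derivative` helper, the 10 ms hop, and the per-frame pitch and first-coefficient values are made up, and it does not call shrp.m or melfcc.m. It estimates the slope of each per-frame track and flags frames where both are rising.

```python
def time_derivative(track, hop_seconds):
    """Central-difference slope of a per-frame track (one-sided at the ends)."""
    n = len(track)
    if n < 2:
        return [0.0] * n
    slopes = []
    for i in range(n):
        lo = max(i - 1, 0)
        hi = min(i + 1, n - 1)
        slopes.append((track[hi] - track[lo]) / ((hi - lo) * hop_seconds))
    return slopes

# Hypothetical per-frame values (illustrative only, not real measurements).
hop = 0.010  # assumed 10 ms hop between frames
pitch_hz = [180.0, 182.0, 190.0, 205.0, 230.0, 228.0]
c0 = [2.1, 2.2, 2.6, 3.1, 3.5, 3.4]  # first coefficient per frame

pitch_slope = time_derivative(pitch_hz, hop)  # Hz per second
energy_slope = time_derivative(c0, hop)       # coefficient units per second

# Flag frames where both the pitch and the energy-like coefficient are rising.
rising = [p > 0 and e > 0 for p, e in zip(pitch_slope, energy_slope)]
print(rising)
```

The pitch strength (SHR) could be run through the same helper as a third track; whether a simple "both rising" rule is enough for excited speech under crowd noise is something that would have to be checked on real match audio.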

This Dan Ellis guy is great :) I know all of this is unimportant to some people, but to me it's one of the most knowledgeable and useful replies I've ever gotten from any senior academic I know of. Hahaha!!! Another such person is Xuejing Sun, who also gives very, very good replies regarding the shrp.m function he developed.

Some academics don't have time to reply to 'stupid' questions. Some don't even reply to higher-level questions :P Most don't reply AT ALL!!! And some of them aren't even that expert :P But these two, although they can be considered otai (veterans) in their respective domains, still make time to answer the silly-willy questions posed by minnows such as myself. I just wanna say I highly appreciate your time... and may you receive success in this life and the hereafter, insyaAllaah :D

4 comments:

noris said...

omg... you contacted Dan Ellis!!!!!!! wa wa wee wah!

Ahmad Javanese said...

I was actually nervous at first too. hehehe. But the nervousness was misplaced, of course.... hehehe. Because if he hadn't answered the email, nothing would have happened anyway, right. But he did answer, Oyiss. hehehehe. BTW, that Xuejing, have you come across his work anywhere? From what I can see he mostly does pitch-related work.... probably a pitch expert. hehehehe.

Azwad said...

Errr... hmmmn.. dowh...:)

Good for you Alfian.
whatever all that means..

By the way, I got a call from my kampung (hometown). My mom told me I received a letter from Google, something about getting 150 by the 30th, or after that 100??? Maybe AdSense. They're resending me the letter by post. I'll ask you more about it later.

Ahmad Javanese said...

Oh. I think I got that one too the other day. But I don't remember what it was..... and I'm not sure whether it had numbers like 150 and 100 in it. hehehehe. Anyway, if it's money in ringgit.... donate 25 perceeeennnnt to me.... Just like Kassim Patalon, eh. hehehehehe
