
Analyze Audio from Video

 
Sara Bellum
Greenhorn
Posts: 4
I am in the process of writing a file management application for film projects. It will parse the video and look for the sound of the clapperboard. Once found, it will provide the user a frame from the video so they can enter the information in some nearby forms. The formats I need to use are mp4 and MTS.
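Once the clap has been located as an audio sample index, mapping it to a video frame is plain arithmetic. A minimal sketch, assuming the audio and video tracks start at the same timestamp and the frame rate is constant; `FrameLocator` and its parameter names are made up for illustration:

```java
final class FrameLocator {
    /**
     * Maps an audio sample index to the index of the video frame shown at that moment.
     * The frame rate is passed as a fraction so rates such as 30000/1001 stay exact.
     * Assumption: audio and video both start at time zero.
     */
    static long frameForSample(long sampleIndex, int sampleRate,
                               int fpsNumerator, int fpsDenominator) {
        // frame = floor(sampleIndex / sampleRate * fps), done in integer arithmetic
        return Math.floorDiv(sampleIndex * fpsNumerator,
                             (long) sampleRate * fpsDenominator);
    }
}
```

For example, `FrameLocator.frameForSample(96000, 48000, 25, 1)` corresponds to two seconds into a 25 fps clip.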

My problem is, I don't understand where to start. As far as mp4 goes, I have tried a combination of JAAD and musicg to split the audio off the video, save it as a .wav, and analyze it with musicg. While this works, it seems rather inefficient and it only works with mp4.

How are mp4 and MTS files saved, and how can I read them in java? Once I get the audio split apart, how do I find the clapperboard?

Thanks in advance.
 
Winston Gutkowski
Bartender
Posts: 10575
Sara Bellum wrote: How are mp4 and MTS files saved, and how can I read them in java?

I'm no expert on this stuff, but I suspect that only the last part of that question has any relevance. If not, you presumably have the data; just write it out to a file. If MTS is a standard file format, then the chances are that you can read (or write) them fairly easily - with the requisite libraries - but as far as Java is concerned, it's just a bunch of bytes.
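One way to get the audio out of both containers without a separate Java library per format is to hand the job to an external tool. The sketch below assumes the `ffmpeg` command-line program is installed and on the PATH (nobody in this thread mentions it, so treat it as a suggestion), and `AudioExtractor` is a hypothetical helper name. It asks for mono 16-bit PCM so the later analysis only has to deal with one channel:

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;

final class AudioExtractor {
    /** Builds an ffmpeg command line that drops the video and writes mono 16-bit PCM WAV. */
    static List<String> buildCommand(Path video, Path wav, int sampleRate) {
        return List.of(
                "ffmpeg",
                "-y",                                   // overwrite the output file if it exists
                "-i", video.toString(),                 // input: .mp4 or .MTS
                "-vn",                                  // ignore the video stream
                "-ac", "1",                             // mix down to one channel
                "-ar", Integer.toString(sampleRate),    // output sample rate in Hz
                "-acodec", "pcm_s16le",                 // uncompressed 16-bit little-endian PCM
                wav.toString());
    }

    /** Runs ffmpeg and fails loudly if it reports an error. Assumes ffmpeg is on the PATH. */
    static void extract(Path video, Path wav, int sampleRate)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(video, wav, sampleRate))
                .inheritIO()                            // show ffmpeg's own messages on the console
                .start();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("ffmpeg exited with code " + exitCode);
        }
    }
}
```

The resulting .wav can then be read the same way regardless of whether the source was mp4 or MTS.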

Sara Bellum wrote: Once I get the audio split apart, how do I find the clapperboard?

Dunno. What's a "clapperboard"? A sound? If so, then you'll have to find it - but how you do it I have absolutely no idea.

Winston
 
Sara Bellum
Greenhorn
Posts: 4
First off, thanks for responding so quickly!

Winston Gutkowski wrote: as far as Java is concerned, it's just a bunch of bytes.

I guess what I was asking for is the file structure of .mp4 and .MTS, so I can make sense of the bytes.

Winston Gutkowski wrote: What's a "clapperboard"?

The clapperboard is what filmmakers use at the beginning of takes so they know what it is during editing. It makes a very distinctive sound when it claps.
 
Winston Gutkowski
Bartender
Posts: 10575
Sara Bellum wrote: I guess what I was asking for is the file structure of .mp4 and .MTS so I can make sense of the bytes

Doubt whether you need to. A byte is a byte, and it makes not one whit of difference to Java whether it's part of a sound or video file, or just a piece of text (oddly, that's where most people run into problems).

And if you really think you need to know the structure, then I hate to say, but you'll need to read the specs for the file format in question. There may also be libraries around to help, but I'm afraid I'm not too "up" on that stuff.
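For a rough idea of what the specs describe: an MP4 file follows the ISO base media file format, which is a sequence of "boxes", each starting with a 4-byte big-endian size (header included) and a 4-byte type code such as `ftyp` or `moov`. A size of 1 means a 64-bit size follows the type, and a size of 0 means the box runs to the end of the file. MTS files are MPEG transport streams, typically stored as fixed 192-byte packets (a 4-byte timestamp followed by a 188-byte packet that begins with the sync byte 0x47). The sketch below only lists the top-level MP4 boxes; `Mp4BoxLister` is a made-up name, and it does not descend into nested boxes or decode any audio:

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class Mp4BoxLister {
    /** Returns one "type:size" entry per top-level box, in file order. */
    static List<String> listTopLevelBoxes(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        List<String> boxes = new ArrayList<>();
        byte[] type = new byte[4];
        while (true) {
            long size;
            try {
                size = data.readInt() & 0xFFFFFFFFL;   // unsigned 32-bit, big-endian
            } catch (EOFException endOfFile) {
                break;                                  // clean end: no more boxes
            }
            data.readFully(type);
            long headerLength = 8;
            if (size == 1) {                            // 64-bit "largesize" follows the type
                size = data.readLong();
                headerLength = 16;
            }
            boxes.add(new String(type, StandardCharsets.US_ASCII) + ":" + size);
            if (size == 0) {
                break;                                  // box extends to the end of the file
            }
            long toSkip = size - headerLength;
            if (toSkip < 0) {
                throw new IOException("Box size smaller than its header");
            }
            while (toSkip > 0) {                        // skip the payload
                long skipped = data.skip(toSkip);
                if (skipped <= 0) {
                    if (data.read() < 0) {
                        return boxes;                   // truncated file: stop quietly
                    }
                    skipped = 1;
                }
                toSkip -= skipped;
            }
        }
        return boxes;
    }
}
```

Wrapping the file in a `BufferedInputStream` before passing it in keeps the many small reads cheap.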

Sara Bellum wrote: The clapperboard is what filmmakers use at the beginning of takes so they know what it is during editing. It makes a very distinctive sound when it claps.

Yuh. Got that (old film buff). The question is: what is that sound? And what makes it distinctive from any other?

That's what you're going to need to know to find it (or something like it). And I hate to say, but I don't think it will be very simple...or 100% accurate.

A first cut might be to download a sample of a clapperboard to compare with, and try some tests.
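Comparing against a sample could look something like the sketch below. It slides the sample clap along the recording and scores each position with a normalized cross-correlation, so a score near 1 means "same shape" regardless of volume. This is one possible approach, not something tested on real footage; `TemplateMatcher` is a hypothetical name, and both arrays are assumed to be mono samples at the same sample rate:

```java
final class TemplateMatcher {
    /**
     * Returns the offset in signal where template matches best,
     * or -1 if the template is empty or longer than the signal.
     */
    static int bestMatchOffset(double[] signal, double[] template) {
        if (template.length == 0 || signal.length < template.length) {
            return -1;
        }
        double templateEnergy = 0;
        for (double t : template) {
            templateEnergy += t * t;
        }
        int bestOffset = -1;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int offset = 0; offset + template.length <= signal.length; offset++) {
            double dot = 0;
            double windowEnergy = 0;
            for (int i = 0; i < template.length; i++) {
                double s = signal[offset + i];
                dot += s * template[i];
                windowEnergy += s * s;
            }
            // Normalizing by both energies keeps the score in [-1, 1].
            double denominator = Math.sqrt(windowEnergy * templateEnergy);
            double score = denominator == 0 ? 0 : dot / denominator;
            if (score > bestScore) {
                bestScore = score;
                bestOffset = offset;
            }
        }
        return bestOffset;
    }
}
```

The cost grows with signal length times template length, so for long takes it probably pays to run this only around the loudest parts of the recording.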

Winston
 
Sara Bellum
Greenhorn
Posts: 4
Thanks again for the fast reply.

I figured it wouldn't be as simple as I had hoped.
At least now I know. Thanks for pointing me in the right direction.

Winston Gutkowski wrote: And I hate to say, but I don't think it will be very simple...or 100% accurate.

I have some sample footage. It's not always perfect, but for the most part, the clap is much louder than the ambient noises in the room. I think an FFT would clearly show the clap.
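If the clap really is much louder than the room, a plain loudness check may already narrow things down before any frequency analysis. A minimal sketch, assuming the samples are already decoded to doubles: it measures the RMS level of short windows and reports the first window that is some factor louder than the median window. `LoudnessDetector`, the window size, and the ratio are all illustrative choices that would need tuning on real footage:

```java
import java.util.Arrays;

final class LoudnessDetector {
    /** RMS level of each consecutive, non-overlapping window; a trailing partial window is ignored. */
    static double[] windowRms(double[] samples, int windowSize) {
        if (windowSize <= 0) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        double[] rms = new double[samples.length / windowSize];
        for (int w = 0; w < rms.length; w++) {
            double sumOfSquares = 0;
            for (int i = 0; i < windowSize; i++) {
                double s = samples[w * windowSize + i];
                sumOfSquares += s * s;
            }
            rms[w] = Math.sqrt(sumOfSquares / windowSize);
        }
        return rms;
    }

    /**
     * Index of the first window whose RMS exceeds ratio times the median RMS,
     * or -1 if none does. The median stands in for the ambient level.
     */
    static int firstLoudWindow(double[] rms, double ratio) {
        if (rms.length == 0) {
            return -1;
        }
        double[] sorted = rms.clone();
        Arrays.sort(sorted);
        double ambient = sorted[sorted.length / 2];
        for (int w = 0; w < rms.length; w++) {
            if (rms[w] > ambient * ratio) {
                return w;
            }
        }
        return -1;
    }
}
```

Multiplying the returned window index by the window size gives the approximate sample position of the clap. A door slam or a shout would trigger it too, so it is a first filter rather than a full detector.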
 
Carey Brown
Saloon Keeper
Posts: 3322
Sara Bellum wrote: I have some sample footage. It's not always perfect, but for the most part, it is much louder than the ambient noises in the room. I think an FFT would clearly show the clap.

I don't think an FFT is what you need; it will pull apart all the frequency components. A linear convolution tuned to the amplitude and duration of a typical clap (it doesn't even have to be an exact match) would give you more usable results with less CPU involved. Sorry, my convolution knowledge is too rusty at this point to be of much further help.
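One way to read the convolution idea, as a rough sketch rather than a worked-out detector: take the absolute value of the samples, convolve that with a short kernel shaped like the loudness envelope of a typical clap (a sharp attack that decays quickly), and look for the peak of the result. Convolving with the time-reversed envelope makes the output peak where the envelope lines up best. `ConvolutionClapFinder` and the envelope values are assumptions for illustration only:

```java
final class ConvolutionClapFinder {
    /** Direct (time-domain) linear convolution; output length is x.length + h.length - 1. */
    static double[] convolve(double[] x, double[] h) {
        if (x.length == 0 || h.length == 0) {
            return new double[0];
        }
        double[] y = new double[x.length + h.length - 1];
        for (int n = 0; n < x.length; n++) {
            for (int k = 0; k < h.length; k++) {
                y[n + k] += x[n] * h[k];
            }
        }
        return y;
    }

    /**
     * Estimates the sample index where the clap starts, or -1 for empty input.
     * clapEnvelope is the expected loudness shape of a clap, earliest sample first.
     */
    static int findClapOnset(double[] samples, double[] clapEnvelope) {
        if (samples.length == 0 || clapEnvelope.length == 0) {
            return -1;
        }
        double[] rectified = new double[samples.length];
        for (int i = 0; i < samples.length; i++) {
            rectified[i] = Math.abs(samples[i]);       // only loudness matters here, not sign
        }
        int length = clapEnvelope.length;
        double[] kernel = new double[length];
        for (int k = 0; k < length; k++) {
            kernel[k] = clapEnvelope[length - 1 - k];  // time-reverse so convolution acts as correlation
        }
        double[] response = convolve(rectified, kernel);
        int peak = 0;
        for (int m = 1; m < response.length; m++) {
            if (response[m] > response[peak]) {
                peak = m;
            }
        }
        // The peak lands where the envelope ends; step back to where it starts.
        return Math.max(0, peak - (length - 1));
    }
}
```

Because the response is not normalized, a long stretch of loud sound can outscore a short clap, so a threshold on the peak height (relative to the ambient level) would still be needed in practice.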
 
Sara Bellum
Greenhorn
Posts: 4
I hadn't thought of using a linear convolution. Thanks!
 
Winston Gutkowski
Bartender
Posts: 10575
Sara Bellum wrote: I think an FFT would clearly show the clap...

Then I'm afraid you're definitely out of my comfort zone. I wouldn't even know how to try and match an FFT - although again, I suspect there are libraries out there that can help.

Good luck though.

Winston
 