sox-users@lists.sourceforge.net unofficial mirror
From: Jan Stary <hans@stare.cz>
To: sox-users@lists.sourceforge.net
Subject: Re: Search and remove audio sections
Date: Fri, 20 Nov 2020 15:18:25 +0100
Message-ID: <X7fQMcsAY8dnwmeR@www.stare.cz>
In-Reply-To: <6ff4d95eaff1bea310dc294ec0903c3d@wingsandbeaks.org.uk> <CAGyjer73XhZxnSkPuHniKi-aCX_0GgpgLDdoKUY9fa3OvNetYQ@mail.gmail.com> <5e992665db5dc96823d9ef6830430718@wingsandbeaks.org.uk> <DB8P195MB0741A75BDDBFA908FCBBD407F3E20@DB8P195MB0741.EURP195.PROD.OUTLOOK.COM>

On Nov 17 15:52:52, Dani@softco.co.il wrote:
> I have a bunch of old MP3 podcasts that have ads in them,
> at the beginning and the end.

Are the ads always at the beginning and at the end,
and never anywhere else?

> The ads are about 30 seconds long and usually
> have a small familiar jingle before they start and after they end. 

Usually: so not all of them have the jingle(s),
or the jingle is not always the same, right?

> I was wondering if there is an ability using SoX (or other tool)
> to do a "search and remove" on these, in a batch format
> - that would apply to hundreds of these files.

General audio search is quite hard.

But if you intend to actually listen to the 10 minutes of podcast,
removing the ads manually from the beginning and end
is a matter of seconds on top of those 10 minutes.


On Nov 17 20:42:51, jn.ml.sxu.88@wingsandbeaks.org.uk wrote:
> If the jingles at the start and end of each ad are binary equal (which
> they might be if an automated system placed copies of their contents
> in the files) then in theory one could use a conventional file search
> utility to locate each one.

A prerequisite of that would be that all the files are in the
very same binary format. We know they are mp3s - are they the same
samplerate, bitrate, etc? Because even if the jingle is one and
the same every time, it won't be the same bytes once it is encoded
into the individual mp3s.
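
To make the byte-search idea concrete, here is a minimal sketch in
Python (standard library only). The names find_all and find_jingle
are made up for illustration, and it assumes you already have the
jingle saved as a separate file. It only finds exact byte-for-byte
copies, so on independently encoded mp3s I would expect it to come
back empty.

```python
from pathlib import Path


def find_all(haystack: bytes, needle: bytes) -> list:
    """Return every byte offset at which needle occurs in haystack."""
    if not needle:
        return []
    offsets = []
    pos = haystack.find(needle)
    while pos != -1:
        offsets.append(pos)
        # Continue one byte further so overlapping matches are found too.
        pos = haystack.find(needle, pos + 1)
    return offsets


def find_jingle(podcast_path: str, jingle_path: str) -> list:
    """Hypothetical helper: exact byte search of one file inside another."""
    podcast = Path(podcast_path).read_bytes()
    jingle = Path(jingle_path).read_bytes()
    return find_all(podcast, jingle)
```

An empty list here says nothing about whether the jingle is audibly
present; it only says the encoded bytes differ.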

> Recognising the jingles might be hard if any aspect of mp3 compression
> of the audio means that successive parts of jingles don't appear in the
> exact same bit- and byte- pattern in each file.

Yes, and they probably won't.

> If the files contain, say, continuous music (or maybe even speech) then
> there's a tiny gap (hopefully of digital silence) then a jingle then a
> second tiny gap then more content, I think you could possibly look for
> the positions of the gaps.

The ads are supposed to be at the beginning and end.
But cutting at silence is what I would go for first,
if there is a telling silence around the ads, of course.
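
A rough sketch of looking for such gaps, again in Python. It assumes
the audio has already been decoded to mono PCM samples (a list of
signed integers) by some other means; find_silences, the threshold
and the minimum duration are illustrative choices of mine, not
anything sox provides under that name.

```python
def find_silences(samples, rate, threshold=0, min_seconds=0.2):
    """Return (start_sec, end_sec) for each run of quiet samples.

    samples:     sequence of signed ints (mono PCM, already decoded)
    rate:        sample rate in Hz
    threshold:   largest absolute amplitude still counted as silence
    min_seconds: runs shorter than this are ignored
    """
    min_len = int(min_seconds * rate)
    gaps = []
    run_start = None
    for i, s in enumerate(samples):
        if abs(s) <= threshold:
            if run_start is None:
                run_start = i          # a quiet run begins here
        else:
            if run_start is not None and i - run_start >= min_len:
                gaps.append((run_start / rate, i / rate))
            run_start = None
    # A quiet run may extend to the very end of the audio.
    if run_start is not None and len(samples) - run_start >= min_len:
        gaps.append((run_start / rate, len(samples) / rate))
    return gaps
```

With threshold=0 this only catches true digital silence; lossy
encoding tends to leave low-level noise, so a small nonzero
threshold is probably needed in practice. The times it reports
could then be used as cut points.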


	Jan



_______________________________________________
Sox-users mailing list
Sox-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/sox-users
