From: Jan Stary <hans@stare.cz>
To: sox-users@lists.sourceforge.net, sox-devel@lists.sourceforge.net
Subject: silence problems
Date: Tue, 18 Mar 2014 12:09:21 +0100
Message-ID: <20140318110920.GA16536@www.stare.cz>

There seem to be problems with the silence effect.
I believe it has been brought up some time ago,
but here is a (longer) complete story with examples.

This is the test file I will be using, generated with SoX 14.4.1:
sox -D -n -c 1 file.wav synth 3 trap 440 sin 480 gain -6 pad 1@0 1@1 1@2 1@3
That makes three seconds of a dial tone, interleaved with four
seconds of silence: "silence TONE silence TONE silence TONE silence",
seven seconds in total.
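
To double-check that layout, here is a small Python sketch of how I read
the pad arguments, assuming each "length@position" inserts silence at a
position measured in the unpadded input. The pad_layout helper is made up
for illustration; it is not part of SoX.

```python
def pad_layout(tone_len, pads):
    """Return (label, seconds) segments for `pad length@position ...`.

    pads is a list of (pad_len, position) pairs; positions are assumed
    to refer to the unpadded input, as in the commands in this mail.
    """
    segments = []
    cursor = 0  # how much of the original tone has been emitted so far
    for pad_len, pos in sorted(pads, key=lambda p: p[1]):
        if pos > cursor:
            segments.append(("TONE", pos - cursor))
            cursor = pos
        segments.append(("silence", pad_len))
    if cursor < tone_len:
        segments.append(("TONE", tone_len - cursor))
    return segments


layout = pad_layout(3, [(1, 0), (1, 1), (1, 2), (1, 3)])
print(" ".join(label for label, _ in layout))
print(sum(seconds for _, seconds in layout))  # 7
```

The same helper, called as pad_layout(60, [(2, 0), (2, 30)]), describes
a file with two songs, each preceded by two seconds of silence.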

First, basic silence trimming at the beginning of the file:

	When above-periods is non-zero, you must also specify a duration
	and threshold. Duration indications the amount of time that non-
	silence must be detected before it stops trimming audio.

I think it would be an improvement if the manpage said explicitly
that the non-silence that is detected (and stops the trimming)
itself remains in the output stream, as opposed to the output
starting only with the samples that come _after_ that non-silence.

At least that's how I understand what the manpage says,
and it is what each of the following commands does,
resulting in six seconds starting with the first tone.

	sox file.wav out.wav silence 1 0.1 10%
	sox file.wav out.wav silence 1 0.5 10%
	sox file.wav out.wav silence 1 1.0 10%

There seems to be some rounding (buffers?) involved, e.g.

	sox file.wav out.wav silence 1 1.01 10%

produces the same, although there is no occurrence
of a non-silence of length 1.01 in the source file.
On the other hand,

	sox file.wav out.wav silence 1 1.02 10%

already does the expected thing, i.e. results in an empty file.
However, SoX does not fill in the zero length in the header:

	 Input File     : 'out.wav'
	 Channels       : 1
	 Sample Rate    : 48000
	 Precision      : 32-bit
	 Sample Encoding: 32-bit Signed Integer PCM


Now, trimming up to the _second_ non-silence
already presents a problem for me:

	sox file.wav out.wav silence 2 0.1 10%

I would expect this to trim the leading "silence TONE silence"
and result in an output file starting with the second TONE
(as the second above-period). That's the intended behaviour, right?

	For example, if you had an audio file with two songs that each
	contained 2 seconds of silence before the song, you could specify
	an above-period of 2 to strip out both silence periods and the first
	song.

That's my situation. But no, the result is a 00:00:05.90 file
where the first silence and the first 0.1 second of the first
tone are removed. If this is the intended behaviour,
the two-songs example is wrong.

It seems that instead of the first TONE counting
as the first above-period (to be trimmed) and the second TONE
counting as the second above-period (to start the output),
only the first 0.1 seconds of the first TONE counts as
the first above-period (trimmed), and after that the output begins.
That's what the above command seems to do.
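
To put numbers on the two readings, here is a rough Python model of
file.wav. It is only a sketch of the expectations described in this mail,
not of SoX's actual algorithm; the function names are made up, and
"reading B" is just my guess at what 14.4.1 does with "silence 2 0.1 10%".

```python
# Layout of file.wav in milliseconds, as (is_tone, duration) pairs:
# silence TONE silence TONE silence TONE silence, one second each.
FILE_WAV = [(False, 1000), (True, 1000)] * 3 + [(False, 1000)]


def from_nth_tone(segments, n):
    """Reading A: output starts at the n-th whole TONE (manpage example)."""
    seen = 0
    for i, (is_tone, _) in enumerate(segments):
        if is_tone:
            seen += 1
            if seen == n:
                return sum(d for _, d in segments[i:])
    return 0  # fewer than n tones: nothing is left


def after_leading_silence(segments, extra_ms):
    """Reading B: drop the leading silence plus extra_ms of what follows."""
    leading = 0
    for is_tone, d in segments:
        if is_tone:
            break
        leading += d
    return sum(d for _, d in segments) - leading - extra_ms


print(from_nth_tone(FILE_WAV, 2))            # 4000
print(after_leading_silence(FILE_WAV, 100))  # 5900
```

Reading A gives the four seconds I expected; reading B gives the
00:00:05.90 that I actually get.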

But is that intended? With the above two-songs example
from the manpage, specifying "silence 2 3 2%" would
just trim the first silence and the first 3 seconds
of the first song, as in my example, right? Let's try:

sox -D -n -c 1 songs.wav synth 60 trap 440 sin 480 gain -6 pad 2@0 2@30
That's 00:02 of silence, 00:30 of song, 00:02 of silence, 00:30 of song,
as in the manpage example. Now running

	sox songs.wav out.wav silence 1 3 10%

does the expected thing: trims the first 00:02 of silence away,
and leaves the rest as 00:30 + 00:02 + 00:30 of output.

Now running "sox songs.wav out.wav silence 2 3 10%" should trim
the first silence, the first song, and the second silence - right?
That's what the example says, but that's not the case:
the result is the same as before, i.e. only the first
00:02 of silence is removed.

That seems wrong, and is also inconsistent with the previous example:
if it were to behave the same way, the first 00:03 above-period (i.e. the
first 00:03 of the first song) would be removed and the rest would
go in the output, right?
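
For the record, these are the three output lengths in question for
"silence 2 3 10%" on songs.wav, as plain arithmetic in Python. The labels
are mine; only the middle one is what 14.4.1 actually produces here.

```python
SILENCE = 2   # seconds of silence before each song
SONG = 30     # seconds per song
DURATION = 3  # the duration argument given to the silence effect

total = 2 * (SILENCE + SONG)  # songs.wav is 64 seconds long

outcomes = {
    # Manpage example: both silences and the first song are stripped.
    "manpage example": SONG,
    # What I get: only the first silence is removed.
    "observed": total - SILENCE,
    # What the file.wav result would suggest: first silence plus the
    # first DURATION seconds of the first song are removed.
    "by analogy with file.wav": total - SILENCE - DURATION,
}

for label, seconds in outcomes.items():
    print(f"{label}: {seconds} s")
```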

Whichever the expected behaviour is, there seems to be a bug.
Or am I missing something in what the manpage says?

There are other problems with the silence effect
(trimming from the end), but let's resolve this first.

	Thank you for your time

		Jan


