sox-users@lists.sourceforge.net unofficial mirror
* how to interpret tell_off, and the right way to use sox_seek
@ 2017-11-04  7:26 Dan Hitt
  2017-11-06 11:10 ` Jan Stary
  0 siblings, 1 reply; 11+ messages in thread
From: Dan Hitt @ 2017-11-04  7:26 UTC (permalink / raw)
  To: sox-users

I'm on a debian stretch box, using what i imagine is version 14, 4, 1
based on sox.h (SOX_LIB_VERSION(14, 4, 1)).

I need to seek back and forth in a file, ultimately reading the same
samples multiple times.

When i seek backwards, via a call to sox_seek(*,*, SOX_SEEK_SET) with
an offset less than the current position, the file pointer is moved,
according to ftell() applied to ->fp in the sox format structure.

But the ->tell_off field doesn't budge.

And then when i've hauled out a count of samples equal to what the
file holds (but nowhere near the end of the file, according to
ftell()) i get this error message about a premature end of the file,
and my reading stops.

Because of the great age of sox this can hardly be an unknown effect
but i can't find any mention of it on google.

So . . . what's the best way to handle it?

Should i go in and manhandle the ->tell_off field to match what i
think it should be?

Or should i reopen the file each time it has gone through its quota of samples?

Or maybe i'm just crazily wrong and i have to do something extra for sox_seek?

Anyhow, would appreciate any advice, especially if it sounds like i'm
just not making the call right.

TIA!!

dan

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Sox-users mailing list
Sox-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/sox-users


* Re: how to interpret tell_off, and the right way to use sox_seek
  2017-11-04  7:26 how to interpret tell_off, and the right way to use sox_seek Dan Hitt
@ 2017-11-06 11:10 ` Jan Stary
  2017-11-06 18:08   ` Dan Hitt
  0 siblings, 1 reply; 11+ messages in thread
From: Jan Stary @ 2017-11-06 11:10 UTC (permalink / raw)
  To: sox-users

On Nov 04 00:26:37, dan.hitt@gmail.com wrote:
> I'm on a debian stretch box, using what i imagine is version 14, 4, 1
> based on sox.h (SOX_LIB_VERSION(14, 4, 1)).
> 
> I need to seek back and forth in a file, ultimately reading the same
> samples multiple times.

Why do you need to reread those same samples multiple times?
Why do you need to do that programmatically, in C?

Why are you using SoX for this?
This might be easier using libsndfile.

	Jan



* Re: how to interpret tell_off, and the right way to use sox_seek
  2017-11-06 11:10 ` Jan Stary
@ 2017-11-06 18:08   ` Dan Hitt
  2017-11-06 19:06     ` Jan Stary
  0 siblings, 1 reply; 11+ messages in thread
From: Dan Hitt @ 2017-11-06 18:08 UTC (permalink / raw)
  To: sox-users

Hi Jan,

Thanks for your mail!  And thanks for your activity in SoX!

I appreciate your help, and i will try to list my answers below.

I do take your message to imply that SoX will read no more samples
than the total number in the file, so that, e.g., if you read the
same 10 blocks of samples and sox_seek() to the beginning of them
repeatedly, eventually your reads will fail.  And in fact, if the
length of the file is 50 blocks, those reads will fail as soon as you
have done this 5 times.  If this is not correct, please let me know!!
:)
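
To make that hypothesis concrete, here is a toy model in plain C. It is
only a sketch of the behavior described above, not libsox code: the
names model_t, model_seek(), model_read() and model_rounds() are made up
for illustration, and the "budget" field is an assumption about what the
library might be doing internally.

```c
#include <stddef.h>

/* Toy model of the hypothesis; NOT how libsox is known to work. */
typedef struct {
    size_t length;     /* total units (blocks or samples) in the file */
    size_t pos;        /* read position, moved by seeks and reads */
    size_t remaining;  /* hypothetical budget, only ever decremented */
} model_t;

void model_open(model_t *m, size_t length)
{
    m->length = length;
    m->pos = 0;
    m->remaining = length;
}

/* Seeking moves the position but leaves the budget untouched. */
int model_seek(model_t *m, size_t offset)
{
    if (offset > m->length)
        return -1;
    m->pos = offset;
    return 0;
}

/* Returns the number of units "read"; 0 once the budget is spent. */
size_t model_read(model_t *m, size_t want)
{
    size_t n = want;

    if (n > m->length - m->pos)
        n = m->length - m->pos;
    if (n > m->remaining)
        n = m->remaining;
    m->pos += n;
    m->remaining -= n;
    return n;
}

/* Count how many read-then-rewind rounds deliver a full chunk. */
size_t model_rounds(size_t length, size_t chunk)
{
    model_t m;
    size_t rounds = 0;

    if (chunk == 0)
        return 0;
    model_open(&m, length);
    while (model_read(&m, chunk) == chunk) {
        rounds++;
        model_seek(&m, 0);
    }
    return rounds;
}
```

Under this model, model_rounds(50, 10) stops after 5 rounds, which is
the 50-block, 10-block case described above.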

Now, in answer to your three questions and suggestion about libsndfile:

(1) In fact, i didn't need to read the same samples multiple times, i
just needed to use them multiple times.  And it was possible for me to
implement my own buffering scheme.   And that's what i ended up doing,
because in this particular case there is a pattern of access that
repeats.  So it was possible to code around not being able to seek
freely.

However it would have been easier not to do any buffering, and just
seek, and use the implicit buffering of the system.

A sort of analogous situation is with the file system and linux: if
you're reading through a file the OS will keep it in memory so you
don't have too much overhead just relying on the OS itself to seek
back and forth in a file --- the caching has already been done.

And it's sort of similar in this case too: the file on the other side
of the SoX interface already has the data buffered, so if unlimited
seeking were possible, then there would presumably not be much file io
overhead.

(2) I'm doing this programmatically because i anticipate running the
program many times, and i want to be as sure of correctness as i can.
(In fact, i'm not actually programming it in C, but i'm using the C
calling conventions, and in principle i could certainly do this in C.)

(3) I'm using SoX to do this because it is so good :) :).  I also have
experience programming in SoX, although obviously i'm still learning.

(4) libsndfile would definitely have been a possibility, but every
library has trade offs.  As i understand it, libsndfile does not yet
support mp3, although that is on the road map.  And it is possible
that there would be some other move that i was making that wouldn't
work in libsndfile.  So i'd have to recode to see --- although as far
as i can tell, libsndfile and SoX are both excellent pieces of
software.

Thanks again for providing me information, and if anything i said is
wrong, please correct me! :)

dan


On Mon, Nov 6, 2017 at 3:10 AM, Jan Stary <hans@stare.cz> wrote:
> On Nov 04 00:26:37, dan.hitt@gmail.com wrote:
>> I'm on a debian stretch box, using what i imagine is version 14, 4, 1
>> based on sox.h (SOX_LIB_VERSION(14, 4, 1)).
>>
>> I need to seek back and forth in a file, ultimately reading the same
>> samples multiple times.
>
> Why do you need to reread those same samples multiple times?
> Why do you need to do that programmatically, in C?
>
> Why are you using SoX for this?
> This might be easier using libsndfile.
>
>         Jan


* Re: how to interpret tell_off, and the right way to use sox_seek
  2017-11-06 18:08   ` Dan Hitt
@ 2017-11-06 19:06     ` Jan Stary
  2017-11-06 20:14       ` Dan Hitt
  0 siblings, 1 reply; 11+ messages in thread
From: Jan Stary @ 2017-11-06 19:06 UTC (permalink / raw)
  To: sox-users

> I do take your message to imply that SoX will read no more samples
> than the total number in the file, so that, e.g., if you read the
> same 10 blocks of samples and sox_seek() to the beginning of them
> repeatedly, eventually your reads will fail.  And in fact, if the
> length of the file is 50 blocks, those reads will fail as soon as you
> have done this 5 times.  If this is not correct, please let me know!!
> :)

There is a difference between SoX the utility and libsox the library.
I am not sure if SoX itself ever reads a block of audio more than once
(I can imagine situations where it would), but that's not a constraint
on what the libsox library can do. If it has a sox_seek(), then I suppose
it does what the name says; in particular, it lets you read the same
block of audio over and over again as long as you keep seeking back.
(Disclaimer: I have never used libsox directly,
I only use SoX the binary.)

> Now, in answer to your three questions and suggestion about libsndfile:
> 
> (1) In fact, i didn't need to read the same samples multiple times, i
> just needed to use them multiple times.
> And it was possible for me to implement my own buffering scheme.
> And that's what i ended up doing,
> because in this particular case there is a pattern of access that
> repeats.  So it was possible to code around not being able to seek
> freely.

Well, that changes the whole premise.
Is there anything left of the original problem then?

> However it would have been easier not to do any buffering, and just
> seek, and use the implicit buffering of the system.

Using the same data again does not imply any buffering by itself.
You sox_read() into an array and then use it as you wish
(such as play it seven times).

> A sort of analogous situation is with the file system and linux: if
> you're reading through a file the OS will keep it in memory so you
> don't have too much overhead just relying on the OS itself to seek
> back and forth in a file --- the caching has already been done.

You already _have_ the data in memory once you have sox_read() them,
so I fail to see the analogy. Why would you seek back and read it
again _from_the_file_ once you have it in an array?

> (2) I'm doing this programmatically because i anticipate running the
> program many times, and i want to be as sure of correctness as i can.
> (In fact, i'm not actually programming it in C, but i'm using the C
> calling conventions, and in principle i could certainly do this in C.)

What makes you think that writing your own code will make
your use of SoX more "correct"?

> (3) I'm using SoX to do this because it is so good :) :).  I also have
> experience programming in SoX, although obviously i'm still learning.

I still have no idea about what you want to do, eventually.
You are asking about technicalities; what are you actually _doing_ ?

> (4) libsndfile would definitely have been a possibility, but every
> library has trade offs.  As i understand it, libsndfile does not yet
> support mp3, although that is on the road map.  And it is possible
> that there would be some other move that i was making that wouldn't
> work in libsndfile.  So i'd have to recode to see --- although as far
> as i can tell, libsndfile and SoX are both excellent pieces of
> software.

I still have no idea about what you want to do.

	Jan



* Re: how to interpret tell_off, and the right way to use sox_seek
  2017-11-06 19:06     ` Jan Stary
@ 2017-11-06 20:14       ` Dan Hitt
  2017-11-06 21:09         ` Jan Stary
  0 siblings, 1 reply; 11+ messages in thread
From: Dan Hitt @ 2017-11-06 20:14 UTC (permalink / raw)
  To: sox-users

Hi Jan!

Thanks for your mail.

Answers interspersed below, hopefully simple to read.

On Mon, Nov 6, 2017 at 11:06 AM, Jan Stary <hans@stare.cz> wrote:
>> I do take your message to imply that SoX will read no more samples
>> than the total number in the file, so that, e.g., if you read the
>> same 10 blocks of samples and sox_seek() to the beginning of them
>> repeatedly, eventually your reads will fail.  And in fact, if the
>> length of the file is 50 blocks, those reads will fail as soon as you
>> have done this 5 times.  If this is not correct, please let me know!!
>> :)
>
> There is a difference between SoX the utility and libsox the library.
> I am not sure if SoX itself ever reads a block of audio more than once
> (I can imagine situations where it would), but that's not a constraint
> on what the libsox library can do. If it has a sox_seek(), then I suppose
> it does what the name says; in particular, it lets you read the same
> block of audio over and over again as long as you keep seeking back.
> (Disclaimer: I have never used libsox directly,
> I only use SoX the binary.)

This is the crux of the matter, and appears not to be the case.

I wrote a little program to test exactly this assertion, that you
cannot repeatedly call sox_seek().

The answer here appears to be that indeed you cannot.

Here's the test program:

    #include <stdio.h>
    #include <stdlib.h>
    #include <sox.h>

    int main( int argc, char** argv ) {
      if ( argc < 2 )  {
        fprintf(stderr,"Call with one arg, the name of a sound file.\n" );
        exit(1);
      }
      int istat = sox_init();
      if (istat != SOX_SUCCESS ) {
        fprintf(stderr, "Failed to initialize sox, error %d.\n", istat);
        exit(1);
      }
      char* infile = argv[1];
      sox_format_t* s = sox_open_read( infile, 0, 0, 0 );
      if ( ! s ) {
        fprintf(stderr,"Failed to open `%s' .\n", infile);
        exit(1);
      }
      int buf[1024];
      int count = 0;
      while ( 1 ) {
        int rcnt = sox_read( s, buf, 1024 );
        if ( rcnt <= 0 ) {
          printf( "Failed on read, attempt %d\n", count );
          exit( 0 );
        }
        int status = sox_seek( s, 0, SOX_SEEK_SET );
        if ( status ) {
          fprintf(stderr,"Failed on seek.\n" );
          exit(1);
        }
        count++;
      }
      return 0; // not reached
    }

It compiles and links without error or warning (link with -lsox).

I ran it on a file a few megabytes long, and it did a few thousand
seeks, then failed on the read.  (It's amazing with modern hardware
how fast something like this runs.)

So it behaves as though there's some kind of internal counter
initialized to the length of the file, and each read decrements that
counter by the amount read; when the counter runs out, then reads are
no longer permitted.

For me, that's the most important point: to try to have an exact
characterization of the behavior of the library under these
circumstances.


>
> ... (cut)
> Is there anything left of the original problem then?

No.

That's been solved.

But it would be useful to have confirmation that this behavior is
intentional, or information on how to avoid the phenomenon, but for
now i've dealt with it, although perhaps unnecessarily given the right
combination of function calls and arguments.

> ... (cut)
>> A sort of analogous situation is with the file system and linux: if
>> you're reading through a file the OS will keep it in memory so you
>> don't have too much overhead just relying on the OS itself to seek
>> back and forth in a file --- the caching has already been done.
>
> You already _have_ the data in memory once you have sox_read() them,
> so I fail to see the analogy. Why would you seek back and read it
> again _from_the_file_ once you have it in an array?

Not necessarily, as you may be reusing the storage you've set aside.

>
>> (2) I'm doing this programmatically because i anticipate running the
>> program many times, and i want to be as sure of correctness as i can.
>> (In fact, i'm not actually programming it in C, but i'm using the C
>> calling conventions, and in principle i could certainly do this in C.)
>
> What makes you think that writing your own code will make
> your use of SoX more "correct"?

Well, i don't think that at all!!! :) :) :)

Or more exactly, correct or not it may very well not be very idiomatic.

I think the best, simplest way to obtain the same samples over and
over would be to seek to them, and read them again.  That may even be
possible, by executing some kind of a reset, however i don't know.

Thanks again for your help, and if anything i've said is wrong, i'd
appreciate any correction, clarification, or references.

dan


* Re: how to interpret tell_off, and the right way to use sox_seek
  2017-11-06 20:14       ` Dan Hitt
@ 2017-11-06 21:09         ` Jan Stary
  2017-11-06 21:47           ` Dan Hitt
  0 siblings, 1 reply; 11+ messages in thread
From: Jan Stary @ 2017-11-06 21:09 UTC (permalink / raw)
  To: sox-users

On Nov 06 12:14:59, dan.hitt@gmail.com wrote:
> On Mon, Nov 6, 2017 at 11:06 AM, Jan Stary <hans@stare.cz> wrote:
> >> I do take your message to imply that SoX will read no more samples
> >> than the total number in the file, so that, e.g., if you read the
> >> same 10 blocks of samples and sox_seek() to the beginning of them
> >> repeatedly, eventually your reads will fail.  And in fact, if the
> >> length of the file is 50 blocks, those reads will fail as soon as you
> >> have done this 5 times.  If this is not correct, please let me know!!
> >> :)
> >
> > There is a difference between SoX the utility and libsox the library.
> > I am not sure if SoX itself ever reads a block of audio more than once
> > (I can imagine situations where it would), but that's not a constraint
> > on what the libsox library can do. If it has a sox_seek(), then I suppose
> > it does what the name says; in particular, it lets you read the same
> > block of audio over and over again as long as you keep seeking back.
> > (Disclaimer: I have never used libsox directly,
> > I only use SoX the binary.)
> 
> This is the crux of the matter, and appears not to be the case.
> 
> I wrote a little program to test exactly this assertion, that you
> cannot repeatedly call sox_seek().
> 
> The answer here appears to be that indeed you cannot.
> 
> Here's the test program:
> 
>     #include <stdio.h>
>     #include <stdlib.h>
>     #include <sox.h>
> 
>     int main( int argc, char** argv ) {
>       if ( argc < 2 )  {
>         fprintf(stderr,"Call with one arg, the name of a sound file.\n" );
>         exit(1);
>       }
>       int istat = sox_init();
>       if (istat != SOX_SUCCESS ) {
>         fprintf(stderr, "Failed to initialize sox, error %d.\n", istat);
>         exit(1);
>       }
>       char* infile = argv[1];
>       sox_format_t* s = sox_open_read( infile, 0, 0, 0 );
>       if ( ! s ) {
>         fprintf(stderr,"Failed to open `%s' .\n", infile);
>         exit(1);
>       }
>       int buf[1024];
>       int count = 0;
>       while ( 1 ) {
>         int rcnt = sox_read( s, buf, 1024 );
>         if ( rcnt <= 0 ) {
>           printf( "Failed on read, attempt %d\n", count );
>           exit( 0 );
>         }
>         int status = sox_seek( s, 0, SOX_SEEK_SET );
>         if ( status ) {
>           fprintf(stderr,"Failed on seek.\n" );
>           exit(1);
>         }
>         count++;
>       }
>       return 0; // not reached
>     }
> 

$ cc -o soxseek soxseek.c -lsox -I/usr/local/include/ -L/usr/local/lib
$ sox -n /tmp/file.wav synth trim 0 $((1024 * 1000))s
$ ./soxseek /tmp/file.wav                             
Failed on read, attempt 1000

That makes me suspect it's just an end of file.
To quote the outdated and incomplete libsox manpage:

  sox_read and sox_write return the number of samples successfully read
  or written. If an error occurs, or the end-of-file is reached, the return
  value is a short item count or SOX_EOF. TODO: sox_read does not
  distiguish between end-of-file and error. Need an feof() and ferror()
  concept to determine which occured.
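
For comparison, this is the stdio idiom that the TODO alludes to: after
a short fread(), feof() and ferror() tell the two cases apart. It is a
plain-C sketch only; read_classified() and the read_status enum are
hypothetical names, and nothing here is a libsox interface.

```c
#include <stdio.h>

typedef enum { READ_OK, READ_EOF, READ_ERROR } read_status;

/* Read up to n bytes from fp into buf. *got receives the byte count;
 * the return value classifies a short read as EOF or error. */
read_status read_classified(FILE *fp, void *buf, size_t n, size_t *got)
{
    *got = fread(buf, 1, n, fp);
    if (*got == n)
        return READ_OK;
    if (ferror(fp))
        return READ_ERROR;
    return READ_EOF;  /* short count with no error flag: end of file */
}
```

A short count with READ_EOF still carries valid data in the first *got
bytes, so the caller should consume them before stopping.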


> I ran it on a file a few megabytes long, and it did a few thousand
> seeks, then failed on the read.

The above does exactly 1000 reads of 1024,
which is exactly how many samples the file contains.
So ( rcnt <= 0 ) does not mean there was an error.

> So it behaves as though there's some kind of internal counter
> initialized to the length of the file, and each read decrements that
> counter by the amount read; when the counter runs out, then reads are
> no longer permitted.

Yes.

If I am reading your code right, it does not seek at all.
It just reads up to EOF. Which might be an error in sox_seek().
As I said, I haven't used libsox before, so I don't know if
this is a correct use of sox_seek(), but beware: not every
input is seekable (a regular file on disk should be, though).

> For me, that's the most important point: to try to have an exact
> characterization of the behavior of the library under these
> circumstances.

There is no good documentation of libsox. The libsox(3) man page
is both incomplete and outdated, and it has been for years.
You might want to look at the src/example*c files.

> >
> > ... (cut)
> > Is there anything left of the original problem then?
> 
> No.
> 
> That's been solved.
> 
> But it would be useful to have confirmation that this behavior is
> intentional, or information on how to avoid the phenomenon, but for
> now i've dealt with it, although perhaps unnecessarily given the right
> combination of function calls and arguments.

I still suspect that it is first of all unnecessary to read the input
again and again while seeking back again and again; but I can't be sure,
because I still have no idea about what you are doing.

> > ... (cut)
> >> A sort of analogous situation is with the file system and linux: if
> >> you're reading through a file the OS will keep it in memory so you
> >> don't have too much overhead just relying on the OS itself to seek
> >> back and forth in a file --- the caching has already been done.
> >
> > You already _have_ the data in memory once you have sox_read() them,
> > so I fail to see the analogy. Why would you seek back and read it
> > again _from_the_file_ once you have it in an array?
> 
> Not necessarily, as you may be reusing the storage you've set aside.

So don't reuse it: keep the copy of what you have read
so that you don't have to read it over again.

> I think the best, simplest way to obtain the same samples over and
> over would be to seek to them, and read them again.

Why do you think that? Just sox_read() it into an array,
then keep reading that array, instead of reading from a file.
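
A minimal sketch of that idea in plain C: copy the samples once, then
"seek" and "read" within the in-memory copy. The names sample_cache,
cache_init(), cache_seek(), cache_read() and cache_free() are made up
for this illustration. int32_t stands in for the sample type here; in a
real program the source buffer would be whatever sox_read() filled.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int32_t *data;  /* private copy of the samples */
    size_t len;     /* number of samples held */
    size_t pos;     /* read cursor, in samples */
} sample_cache;

/* Copy len samples from src; returns 0 on success, -1 on failure. */
int cache_init(sample_cache *c, const int32_t *src, size_t len)
{
    c->data = NULL;
    c->len = 0;
    c->pos = 0;
    if (len == 0)
        return 0;
    c->data = malloc(len * sizeof *c->data);
    if (c->data == NULL)
        return -1;
    memcpy(c->data, src, len * sizeof *c->data);
    c->len = len;
    return 0;
}

/* Move the cursor to an absolute sample offset. */
int cache_seek(sample_cache *c, size_t offset)
{
    if (offset > c->len)
        return -1;
    c->pos = offset;
    return 0;
}

/* Copy up to want samples to dst; returns how many were copied. */
size_t cache_read(sample_cache *c, int32_t *dst, size_t want)
{
    size_t n = c->len - c->pos;

    if (n > want)
        n = want;
    if (n > 0)
        memcpy(dst, c->data + c->pos, n * sizeof *dst);
    c->pos += n;
    return n;
}

void cache_free(sample_cache *c)
{
    free(c->data);
    c->data = NULL;
    c->len = c->pos = 0;
}
```

Since nothing but the cursor changes on a seek, the same samples can be
read back any number of times; the cost is holding the copy in memory.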

> Thanks again for your help, and if anything i've said is wrong, i'd
> appreciate any correction, clarification, or references.

I still have no idea about what you are actually trying to do.

	Jan



* Re: how to interpret tell_off, and the right way to use sox_seek
  2017-11-06 21:09         ` Jan Stary
@ 2017-11-06 21:47           ` Dan Hitt
  2017-11-07  8:50             ` Jan Stary
  0 siblings, 1 reply; 11+ messages in thread
From: Dan Hitt @ 2017-11-06 21:47 UTC (permalink / raw)
  To: sox-users

Hi Jan!

Thanks for your mail.

On Mon, Nov 6, 2017 at 1:09 PM, Jan Stary <hans@stare.cz> wrote:
> On Nov 06 12:14:59, dan.hitt@gmail.com wrote:
....... (cut)
>>
>>     #include <stdio.h>
>>     #include <stdlib.h>
>>     #include <sox.h>
>>
>>     int main( int argc, char** argv ) {
>>       if ( argc < 2 )  {
>>         fprintf(stderr,"Call with one arg, the name of a sound file.\n" );
>>         exit(1);
>>       }
>>       int istat = sox_init();
>>       if (istat != SOX_SUCCESS ) {
>>         fprintf(stderr, "Failed to initialize sox, error %d.\n", istat);
>>         exit(1);
>>       }
>>       char* infile = argv[1];
>>       sox_format_t* s = sox_open_read( infile, 0, 0, 0 );
>>       if ( ! s ) {
>>         fprintf(stderr,"Failed to open `%s' .\n", infile);
>>         exit(1);
>>       }
>>       int buf[1024];
>>       int count = 0;
>>       while ( 1 ) {
>>         int rcnt = sox_read( s, buf, 1024 );
>>         if ( rcnt <= 0 ) {
>>           printf( "Failed on read, attempt %d\n", count );
>>           exit( 0 );
>>         }
>>         int status = sox_seek( s, 0, SOX_SEEK_SET );
>>         if ( status ) {
>>           fprintf(stderr,"Failed on seek.\n" );
>>           exit(1);
>>         }
>>         count++;
>>       }
>>       return 0; // not reached
>>     }
>>
 .... (cut)
> If I am reading your code right, it does not seek at all.

It's in a loop.

First, it reads (which presumably moves the file pointer forwards),
then it seeks back to where it was.

I've verified in gdb that it actually does call sox_seek() (again and
again and again, but not forever).  I have also verified in gdb that
it is reading the same samples.

So sox_seek() is definitely being called, and definitely working.

But nevertheless, the reading done after the seeking eventually fails.

Certainly if you do analogous coding with calls to fread() and
fseek(), it would not terminate.
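
As a sketch of that stdio analogue (bounded, so that it does
terminate): rewind with fseek(), read a chunk with fread(), and repeat.
The function name reread_rounds() and its parameters are invented for
this example.

```c
#include <stdio.h>

/* Seek to the start of fp, read chunk bytes, and repeat up to
 * `rounds` times. Returns the number of rounds that read a full
 * chunk. chunk is capped at the size of the local buffer. */
size_t reread_rounds(FILE *fp, size_t chunk, size_t rounds)
{
    char buf[1024];
    size_t ok = 0;
    size_t i;

    if (chunk > sizeof buf)
        chunk = sizeof buf;
    for (i = 0; i < rounds; i++) {
        if (fseek(fp, 0L, SEEK_SET) != 0)
            break;
        if (fread(buf, 1, chunk, fp) != chunk)
            break;
        ok++;
    }
    return ok;
}
```

With stdio the total number of bytes read this way can exceed the file
size by any factor, as long as each chunk fits within the file; there is
no budget that runs out.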

>
> I still suspect that it is first of all unnecessary to read the input
> again and again while seeking back again and again; but I can't be sure,
> because I still have no idea about what you are doing.

What i'm trying to do now is to determine the correct way to use
sox_seek() if i am using it incorrectly.  If i'm using it correctly,
then i would like confirmation of my characterization (that it has
some internal counter which is decremented until it hits zero).

(I did have a signal processing problem that i was considering
earlier, but that's not relevant now because i avoided the sox_seek
issue by rearranging the computation.  So i can do my dsp, but i do
wonder about correct usage of sox_seek.)

Anyhow, i do appreciate your investigation of this!!! :) :)

dan


* Re: how to interpret tell_off, and the right way to use sox_seek
  2017-11-06 21:47           ` Dan Hitt
@ 2017-11-07  8:50             ` Jan Stary
  2017-11-07  9:01               ` Jan Stary
  0 siblings, 1 reply; 11+ messages in thread
From: Jan Stary @ 2017-11-07  8:50 UTC (permalink / raw)
  To: sox-users

> First, it reads (which presumably moves the file pointer forwards),
> then it seeks back to where it was.
> I've verified in gdb that it actually does call sox_seek() (again and
> again and again, but not forever).  I have also verified in gdb that
> it is reading the same samples.

Ah, I see what your problem is now.

> So sox_seek() is definitely being called, and definitely working.
> But nevertheless, the reading done after the seeking eventually fails.

Here is my slight rewrite of your example:

#include <stdio.h>
#include <stdlib.h>
#include <err.h>
#include <sox.h>

int 
main(int argc, char **argv)
{
	sox_format_t *s;
	int32_t buf[1024];
	ssize_t r;
	int i;

	if (argc < 2)
		errx(1, "usage: ./soxseek input");
	if (sox_init() != SOX_SUCCESS)
		errx(1, "Cannot init libsox");
	if ((s = sox_open_read(*++argv, 0, 0, 0)) == NULL)
		errx(1, "Cannot open `%s'", *argv);
	for (i = 1; (r = sox_read(s, buf, 1024)) > 0; i++) {
		printf("[%04d] %zd samples, starting with %0x\n", i, r, *buf);
		if (sox_seek(s, 0, SOX_SEEK_SET) != SOX_SUCCESS)
			errx(1, "Cannot seek");
	}
	/* No way to test for sox_read() error */
	return 0;
}

(Note how I don't use sox_size_t for the sox_read() return value,
because it does not exist, even though that's what libsox(3) documents.)

$ sox -n /tmp/file.wav synth trim 0 $((1024 * 1000))s  
$ cc -o soxseek soxseek.c -lsox -I/usr/local/include/ -L/usr/local/lib
$ ./soxseek /tmp/file.wav 
[0000] 1024 samples, starting with 0
[0001] 1024 samples, starting with 0
[....]
[0999] 1024 samples, starting with 0
[1000] 1024 samples, starting with 0

So I think you are right. It does seek back to the beginning of the sine wave
(thus reporting 0 as the first sample value in the buffer), but it gets
exhausted at EOF anyway. I suspect now it is a bug in sox_seek()
if we are calling it right.

> Certainly if you do analogous coding with calls to fread() and
> fseek(), it would not terminate.

Yes, the following will read the same file forever:

#include <unistd.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <err.h>

int 
main(int argc, char **argv)
{
	int fd;
	int32_t buf[1024];
	ssize_t r;
	int i;

	if (argc < 2)
		errx(1, "usage: ./seek input");
	if ((fd = open(*++argv, O_RDONLY)) == -1)
		err(1, NULL);
	/* read() counts bytes, not samples; rewind after every read */
	for (i = 1; (r = read(fd, buf, 1024)) > 0; i++) {
		printf("[%04d] %zd bytes, starting with %0x\n", i, r, (unsigned)*buf);
		if (lseek(fd, 0, SEEK_SET) == -1)
			err(1, NULL);
	}
	return (r != 0);
}


> What i'm trying to do now is to determine the correct way to use
> sox_seek() if i am using it incorrectly.  If i'm using it correctly,
> then i would like confirmation of my characterization (that it has
> some internal counter which is decremented until it hits zero).

I share your suspicion now.

> (I did have a signal processing problem that i was considering
> earlier, but that's not relevant now because i avoided the sox_seek
> issue by rearranging the computation.  So i can do my dsp, but i do
> wonder about correct usage of sox_seek.)

Should we move this to sox-devel?

	Jan



* Re: how to interpret tell_off, and the right way to use sox_seek
  2017-11-07  8:50             ` Jan Stary
@ 2017-11-07  9:01               ` Jan Stary
  2017-11-07  9:13                 ` Jan Stary
  0 siblings, 1 reply; 11+ messages in thread
From: Jan Stary @ 2017-11-07  9:01 UTC (permalink / raw)
  To: sox-users

On Nov 07 09:50:36, hans@stare.cz wrote:
> So I think you are right. It does seek back to the beginning of the sine wave
> (thus reporting 0 as the first sample value in the buffer), but it gets
> exhausted at EOF anyway. I suspect now it is a bug in sox_seek()
> if we are calling it right.

Much as I like sox, I don't trust this code very much:

int sox_seek(sox_format_t * ft, sox_uint64_t offset, int whence)
{
    /* FIXME: Implement SOX_SEEK_CUR and SOX_SEEK_END. */
    if (whence != SOX_SEEK_SET)
        return SOX_EOF; /* FIXME: return SOX_EINVAL */

    /* If file is a seekable file and this handler supports seeking,
     * then invoke handler's function.
     */
    if (ft->seekable && ft->handler.seek)
      return (*ft->handler.seek)(ft, offset);
    return SOX_EOF; /* FIXME: return SOX_EBADF */
}
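
Since the function above rejects anything but SOX_SEEK_SET, a caller
that wants a relative seek would have to compute the absolute offset
itself and keep track of its own position. A plain-C sketch of that
arithmetic follows; absolute_offset() and the FROM_* values are
hypothetical names for this example and are not part of libsox.

```c
#include <stdint.h>

/* Hypothetical whence values; NOT the SOX_SEEK_* constants. */
enum { FROM_START, FROM_CURRENT, FROM_END };

/* Turn a relative seek request into an absolute offset in
 * [0, length]. current is the caller's own idea of the position.
 * Returns 0 and stores the result in *out, or -1 if out of range. */
int absolute_offset(uint64_t current, uint64_t length,
                    int64_t delta, int whence, uint64_t *out)
{
    uint64_t base;

    switch (whence) {
    case FROM_START:   base = 0;       break;
    case FROM_CURRENT: base = current; break;
    case FROM_END:     base = length;  break;
    default:           return -1;
    }
    if (base > length)
        return -1;
    if (delta < 0) {
        /* |delta| computed without overflowing at INT64_MIN */
        uint64_t back = (uint64_t)(-(delta + 1)) + 1;

        if (back > base)
            return -1;
        *out = base - back;
    } else {
        if ((uint64_t)delta > length - base)
            return -1;
        *out = base + (uint64_t)delta;
    }
    return 0;
}
```

The resulting value is what one would then pass as the offset with
SOX_SEEK_SET, assuming offsets are counted in the same units throughout.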



* Re: how to interpret tell_off, and the right way to use sox_seek
  2017-11-07  9:01               ` Jan Stary
@ 2017-11-07  9:13                 ` Jan Stary
  2017-11-07  9:22                   ` Jan Stary
  0 siblings, 1 reply; 11+ messages in thread
From: Jan Stary @ 2017-11-07  9:13 UTC (permalink / raw)
  To: sox-users

On Nov 07 10:01:41, hans@stare.cz wrote:
> On Nov 07 09:50:36, hans@stare.cz wrote:
> > So I think you are right. It does seek back to the beginning of the sine wave
> > (thus reporting 0 as the first sample value in the buffer), but it gets
> > exhausted at EOF anyway. I suspect now it is a bug in sox_seek()
> > if we are calling it right.
> 
> Much as I like sox, I don't trust this code very much:
> 
> int sox_seek(sox_format_t * ft, sox_uint64_t offset, int whence)
> {
>     /* FIXME: Implement SOX_SEEK_CUR and SOX_SEEK_END. */
>     if (whence != SOX_SEEK_SET)
>         return SOX_EOF; /* FIXME: return SOX_EINVAL */
> 
>     /* If file is a seekable file and this handler supports seeking,
>      * then invoke handler's function.
>      */
>     if (ft->seekable && ft->handler.seek)
>       return (*ft->handler.seek)(ft, offset);
>     return SOX_EOF; /* FIXME: return SOX_EBADF */
> }

This is the wav handler's seek() from src/wav.c (14.4.2):


static int seek(sox_format_t * ft, uint64_t offset)
{
  priv_t *   wav = (priv_t *) ft->priv;

  if (ft->encoding.bits_per_sample & 7)
    lsx_fail_errno(ft, SOX_ENOTSUP, "seeking not supported with this encoding");
  else if (wav->formatTag == WAVE_FORMAT_GSM610) {
    int alignment;
    size_t gsmoff;

    /* rounding bytes to blockAlign so that we
     * don't have to decode partial block. */
    gsmoff = offset * wav->blockAlign / wav->samplesPerBlock +
             wav->blockAlign * ft->signal.channels / 2;
    gsmoff -= gsmoff % (wav->blockAlign * ft->signal.channels);

    ft->sox_errno = lsx_seeki(ft, (off_t)(gsmoff + wav->dataStart), SEEK_SET);
    if (ft->sox_errno == SOX_SUCCESS) {
      /* offset is in samples */
      uint64_t new_offset = offset;
      alignment = offset % wav->samplesPerBlock;
      if (alignment != 0)
          new_offset += (wav->samplesPerBlock - alignment);
      wav->numSamples = ft->signal.length - (new_offset / ft->signal.channels);
    }
  } else {
    double wide_sample = offset - (offset % ft->signal.channels);
    double to_d = wide_sample * ft->encoding.bits_per_sample / 8;
    off_t to = to_d;
    ft->sox_errno = (to != to_d)? SOX_EOF : lsx_seeki(ft, (off_t)wav->dataStart + (off_t)to, SEEK_SET);
    if (ft->sox_errno == SOX_SUCCESS)
      wav->numSamples -= (size_t)wide_sample / ft->signal.channels;
  }

  return ft->sox_errno;
}


It seems you are right: wav->numSamples gets decremented no matter what.
That seems wrong, but I am not sure if that is exactly our problem.
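
The two branches keep their books differently: the GSM branch recomputes
numSamples from ft->signal.length, while the PCM branch subtracts from
whatever numSamples currently holds. A toy model of that difference is
sketched below. struct counter, toy_read(), seek_relative() and
seek_absolute() are made-up names, and the sketch assumes that the reader
counts the remaining-samples field down as it delivers samples, which the
code quoted in this thread does not show.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for the handler's private state; not the real priv_t. */
struct counter {
    uint64_t total;      /* wide samples in the file */
    uint64_t remaining;  /* wide samples the reader may still deliver */
};

/* Assumed reader behaviour: deliver up to n samples and count down. */
static uint64_t toy_read(struct counter *c, uint64_t n)
{
    if (n > c->remaining)
        n = c->remaining;
    c->remaining -= n;
    return n;
}

/* Relative bookkeeping, as in the PCM branch: subtract the target. */
static void seek_relative(struct counter *c, uint64_t target)
{
    c->remaining -= target;   /* wraps around if target > remaining */
}

/* Absolute bookkeeping, as in the GSM branch: recompute from total. */
static void seek_absolute(struct counter *c, uint64_t target)
{
    assert(target <= c->total);
    c->remaining = c->total - target;
}
```

Under that assumption, a full read followed by a relative-style seek to 0
leaves the counter at zero, so the next read delivers nothing even though
the file position is back at the start. Recomputing from the total restores
the full count.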

This is lsx_seeki() (in src/formats_i.c), which
eventually calls fseeko() on the underlying (FILE*)ft->fp:


/* Implements traditional fseek() behavior.  Meant to abstract out
 * file operations so that they could one day also work on memory
 * buffers.
 *
 * N.B. Can only seek forwards on non-seekable streams!
 */
int lsx_seeki(sox_format_t * ft, off_t offset, int whence)
{
    if (ft->seekable == 0) {
        /* If a stream peel off chars else EPERM */
        if (whence == SEEK_CUR) {
            while (offset > 0 && !feof((FILE*)ft->fp)) {
                getc((FILE*)ft->fp);
                offset--;
                ++ft->tell_off;
            }
            if (offset)
                lsx_fail_errno(ft,SOX_EOF, "offset past EOF");
            else
                ft->sox_errno = SOX_SUCCESS;
        } else
            lsx_fail_errno(ft,SOX_EPERM, "file not seekable");
    } else {
        if (fseeko((FILE*)ft->fp, offset, whence) == -1)
            lsx_fail_errno(ft,errno, "%s", strerror(errno));
        else
            ft->sox_errno = SOX_SUCCESS;
    }
    return ft->sox_errno;
}
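
In the seekable branch above only the FILE position changes; ft->tell_off
is touched only in the non-seekable branch. That would match ftell() moving
while tell_off stays put. A toy model, assuming the read helpers are what
advance tell_off (not shown in this thread): struct toy_format and the
toy_* functions are invented for illustration, and plain fseek()/ftell()
stand in for fseeko().

```c
#include <stdint.h>
#include <stdio.h>

/* Toy stand-in for the two fields discussed; not the real sox_format_t. */
struct toy_format {
    FILE    *fp;
    uint64_t tell_off;   /* bytes consumed so far, kept by hand */
};

/* Assumed read helper: advances tell_off by the bytes actually read. */
static size_t toy_readbuf(struct toy_format *ft, void *buf, size_t len)
{
    size_t got = fread(buf, 1, len, ft->fp);
    ft->tell_off += got;
    return got;
}

/* Mirrors the seekable branch above: moves fp, leaves tell_off alone. */
static int toy_seek(struct toy_format *ft, long offset)
{
    return fseek(ft->fp, offset, SEEK_SET);
}

/* Write 16 bytes to a temp file, read 8, seek to 0, and report both
 * positions. Returns 0 on success, -1 on any I/O failure. */
static int toy_demo(long *stdio_pos, uint64_t *tell_off)
{
    struct toy_format ft = { tmpfile(), 0 };
    char buf[8];
    int ok;

    if (ft.fp == NULL)
        return -1;
    ok = fwrite("0123456789abcdef", 1, 16, ft.fp) == 16
      && fseek(ft.fp, 0L, SEEK_SET) == 0
      && toy_readbuf(&ft, buf, sizeof buf) == sizeof buf
      && toy_seek(&ft, 0L) == 0;
    *stdio_pos = ftell(ft.fp);
    *tell_off = ft.tell_off;
    fclose(ft.fp);
    return ok ? 0 : -1;
}
```

After the seek the stdio position is back at 0 while the hand-kept counter
still holds the 8 bytes read earlier.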


Jan



* Re: how to interpret tell_off, and the right way to use sox_seek
  2017-11-07  9:13                 ` Jan Stary
@ 2017-11-07  9:22                   ` Jan Stary
  0 siblings, 0 replies; 11+ messages in thread
From: Jan Stary @ 2017-11-07  9:22 UTC (permalink / raw)
  To: sox-users

> static int seek(sox_format_t * ft, uint64_t offset)
> {
>   priv_t *   wav = (priv_t *) ft->priv;
> 
>   if (ft->encoding.bits_per_sample & 7)
>     lsx_fail_errno(ft, SOX_ENOTSUP, "seeking not supported with this encoding");
>   else if (wav->formatTag == WAVE_FORMAT_GSM610) {
>     int alignment;
>     size_t gsmoff;
> 
>     /* rounding bytes to blockAlign so that we
>      * don't have to decode partial block. */
>     gsmoff = offset * wav->blockAlign / wav->samplesPerBlock +
>              wav->blockAlign * ft->signal.channels / 2;
>     gsmoff -= gsmoff % (wav->blockAlign * ft->signal.channels);
> 
>     ft->sox_errno = lsx_seeki(ft, (off_t)(gsmoff + wav->dataStart), SEEK_SET);
>     if (ft->sox_errno == SOX_SUCCESS) {
>       /* offset is in samples */
>       uint64_t new_offset = offset;
>       alignment = offset % wav->samplesPerBlock;
>       if (alignment != 0)
>           new_offset += (wav->samplesPerBlock - alignment);
>       wav->numSamples = ft->signal.length - (new_offset / ft->signal.channels);
>     }
>   } else {
>     double wide_sample = offset - (offset % ft->signal.channels);
>     double to_d = wide_sample * ft->encoding.bits_per_sample / 8;
>     off_t to = to_d;
>     ft->sox_errno = (to != to_d)? SOX_EOF : lsx_seeki(ft, (off_t)wav->dataStart + (off_t)to, SEEK_SET);
>     if (ft->sox_errno == SOX_SUCCESS)
>       wav->numSamples -= (size_t)wide_sample / ft->signal.channels;
>   }
> 
>   return ft->sox_errno;
> }
> 
> 
> It seems you are right: wav->numSamples gets decremented no matter what.

Hm, not true; but I don't understand the computation either.
Is the double there so that the integer count does not overflow?
And is that related to the problem you are seeing?
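
One possible reading, which nothing in this thread confirms, is that the
double keeps the multiplication from wrapping and that (to != to_d) is
meant to reject byte offsets that do not survive the conversion to off_t.
Converting an out-of-range double to an integer type is undefined in C,
though, so that check is not airtight. The same offset can be computed in
integer arithmetic with an explicit overflow test. A sketch, with
pcm_byte_offset() and survives_double() as hypothetical helpers and int64_t
standing in for a 64-bit off_t:

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset of a sample offset within PCM data, in integer arithmetic.
 * Rounds down to a whole frame (one sample per channel), as the quoted
 * code does. Returns 0 on success, -1 if the result exceeds INT64_MAX. */
static int pcm_byte_offset(uint64_t offset, unsigned channels,
                           unsigned bits_per_sample, int64_t *bytes)
{
    uint64_t frame_aligned, bytes_per_sample;

    assert(channels > 0);
    assert(bits_per_sample > 0 && bits_per_sample % 8 == 0);

    frame_aligned = offset - offset % channels;
    bytes_per_sample = bits_per_sample / 8;
    if (frame_aligned > (uint64_t)INT64_MAX / bytes_per_sample)
        return -1;               /* product would not fit in int64_t */
    *bytes = (int64_t)(frame_aligned * bytes_per_sample);
    return 0;
}

/* A double has a 53-bit significand, so distinct large offsets can
 * collapse to the same value. Returns 1 if the round trip is exact. */
static int survives_double(uint64_t offset)
{
    /* volatile forces the value to be rounded to double precision */
    volatile double d = (double)offset;

    /* Guard the cast back: 2^64 and above do not fit in uint64_t. */
    return d < 18446744073709551616.0 && (uint64_t)d == offset;
}
```

For realistic file sizes both approaches give the same byte offset; they
only differ for offsets above 2^53, where the double can no longer
represent every integer.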

	Jan



Thread overview: 11+ messages
2017-11-04  7:26 how to interpret tell_off, and the right way to use sox_seek Dan Hitt
2017-11-06 11:10 ` Jan Stary
2017-11-06 18:08   ` Dan Hitt
2017-11-06 19:06     ` Jan Stary
2017-11-06 20:14       ` Dan Hitt
2017-11-06 21:09         ` Jan Stary
2017-11-06 21:47           ` Dan Hitt
2017-11-07  8:50             ` Jan Stary
2017-11-07  9:01               ` Jan Stary
2017-11-07  9:13                 ` Jan Stary
2017-11-07  9:22                   ` Jan Stary
