From: Jan Stary <hans@stare.cz>
To: sox-users@lists.sourceforge.net
Subject: Re: sox to chatscript
Date: Sat, 3 Jan 2015 18:49:41 +0100
Message-ID: <20150103174941.GA8876@www.stare.cz>
In-Reply-To: <trinity-7a92ebd8-e97c-47bb-bfb8-326ed4a86a95-1420300651196@3capp-mailcom-bs10>
On Jan 03 16:57:31, 4-werk@gmx.com wrote:
> The idea as I said before is to use sox as one half of a speech to text system.
> The other half will be written in a program / language called chatscript.
> Chatscript can read an input from a .txt file. Normally the input would be something like "i love you." and the output might be "do you?", normal conversational chat. My intention is to use the .txt file that sox will produce when it hears the sound /p/ to output the letter "p".
SoX cannot do anything like that.
> > 1) the microphone picks up the sound and the hardware converts it to digital.
> > 2) sox then tells the computer to treat the input as 32 bit words, one to describe each 1/8000 of a second.
> > 3) these input samples are then processed by sox to produce the output samples.
>
> Up to here, that's a pretty standard recording scenario.
> rec -b 32 -r 8k
>
> As the intention is to process then output the input, not to keep it, working from the help that you gave me previously and the sox pdf, I think it might start something like:
> -b 23 -r 8k -e filename.raw
>
> > 4) after 54 milliseconds these output samples will contain information from the current input sample plus those from 11, 13, 17, 19, 23, 29, 31, 37, 47 and 53 milliseconds earlier.
> > That is what the echoes are for.
>
> I still have no idea why you want to do this at all,
>
> This is where we have been getting ourselves confused.
> You have been using the word sample in the sense that you might take a 30 second sample of birdsong, and from that point of view I can see why the echoes made absolutely no sense.
No. I have been using the word 'sample'
in the obvious audio-related sense. A number.
> I have been thinking in terms of the last input from the mic, the latest of 8000 samples this second. And I could not understand why you seemed to be saying that this 8000th of a second somehow had information about what happened up to 400 samples previous.
Again, no; that's what you were saying will happen if you superimposed
the echoes of the previous samples onto that last sample.
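To make "superimposing the previous samples onto the last sample" concrete, here is a minimal Python sketch. It is not SoX's echo effect; it is only the plain sum described in the quoted steps (unit gain, no decay), and the function name is made up for illustration. At 8000 samples per second, a delay of d milliseconds is 8*d samples back.

```python
def superimpose_delays(samples, rate_hz, delays_ms):
    """Return y[n] = x[n] + sum of x[n - offset] over all delays.

    Hypothetical helper, not part of SoX: unit gain, no decay, no feedback.
    """
    # Convert each delay from milliseconds to a whole number of samples.
    offsets = [round(rate_hz * ms / 1000) for ms in delays_ms]
    out = []
    for n, x in enumerate(samples):
        total = x
        for off in offsets:
            if n - off >= 0:  # nothing to add before the start of the input
                total += samples[n - off]
        out.append(total)
    return out


RATE_HZ = 8000
DELAYS_MS = [11, 13, 17, 19, 23, 29, 31, 37, 47, 53]

# A single non-zero sample followed by silence shows where the copies land.
impulse = [1] + [0] * 499
response = superimpose_delays(impulse, RATE_HZ, DELAYS_MS)
nonzero = [n for n, v in enumerate(response) if v != 0]
print(nonzero)
```

Each output sample is then one number that mixes eleven input samples together; the individual contributions cannot be read back out of it.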
> The reason for the echoes is to add those earlier inputs to the current one, to make the output. There are 2 reasons for not just recording a twentieth of a second of sound and using that as the input to chatscript.
I am not suggesting anything like that.
> 1] A twentieth at -b 32 -r 8k is 2^12800 possible combinations of 1s and 0s, which is far too many to find the match. I would have a big number there but my calculator lies and says it's infinite. A single 32 bit word has 2^32 = 4294967296 combinations. 2] What about the part of the sound that crosses between samples? It would not be matched.
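The count of possible bit patterns can be checked directly. This is a small Python sketch of the arithmetic only, assuming one channel, 8000 samples per second and 32 bits per sample as in the numbers above; the variable names are mine.

```python
import math

RATE_HZ = 8000          # samples per second (-r 8k)
BITS_PER_SAMPLE = 32    # bits per sample (-b 32)

# A twentieth of a second of audio, assuming a single channel.
samples_in_window = RATE_HZ // 20
bits_in_window = samples_in_window * BITS_PER_SAMPLE

# Number of distinct bit patterns: 2 to the power of the number of bits.
patterns_one_sample = 2 ** BITS_PER_SAMPLE
patterns_window = 2 ** bits_in_window

# Size of the big number, without printing all of it.
decimal_digits = math.floor(bits_in_window * math.log10(2)) + 1

print(samples_in_window, bits_in_window)
print(patterns_one_sample)
print(decimal_digits)
```

So a twentieth of a second is 400 samples, or 12800 bits, and the number of combinations is 2^12800. That is finite, but far too large for a pocket calculator to display.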
>
> let alone why you want it to be precisely 11, 13, etc.
>
> the twentieth of a second is a best to fit with the rate of changes
> of sounds within spoken words, and may well have to change.
I am having a hard time even parsing that sentence.
I have no idea why exactly a twentieth of a second is
"a best to fit the rate of change of sounds" (whatever that means).
Also, I don't think you have any idea either.
> The actual quoted numbers are just prime numbered intervals within that.
What on earth do "prime numbered intervals" have to do with any of that?
Again, you don't have a clue, do you.
> > 5) each output sample is re-branded as *.txt
>
> What do you even mean by a sample being "rebranded as *.txt", exactly?
>
> From browsing this forum, I came across "convert mp3 to .txt file" started by John S Higgs. Ulrich Klauer suggested:
> sox in.mp3 -t dat out.txt
> later on in the conversation Jan Stary says
>
> "You can take any string of integers and make a WAV of them.
> But yes, you can of course convert DAT <-> WAV both ways."
> so the idea of having sox give .txt output is not impossible. If I have to use wav as my input file type, so be it, but I need .txt output to feed chatscript.
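For reference, this is roughly what that kind of output looks like. As I understand the soxformat description of dat, the file has a line giving the sample rate, a line giving the number of channels, and then one line per sample holding the time and the sample value. The Python sketch below only imitates that layout; the function name, the exact header wording and the number formatting are assumptions, not SoX's byte-exact output.

```python
def to_dat_like_text(samples, rate_hz):
    """Render mono samples as text lines of 'time value'.

    Hypothetical helper that imitates the dat layout; header wording and
    number widths are assumed, not copied from SoX.
    """
    lines = [f"; Sample Rate {rate_hz}", "; Channels 1"]
    for n, value in enumerate(samples):
        # Time since the first sample, then the sample value.
        lines.append(f"{n / rate_hz:.6f} {value:.6f}")
    return "\n".join(lines) + "\n"


text = to_dat_like_text([0.0, 0.5, -0.25], 8000)
print(text)
```

The result is a text file in the sense that you can open it in an editor, but it contains columns of numbers describing the waveform, not letters or words.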
Sigh. What do you _mean_ by "txt output"? What _format_?
Example: record yourself with a microphone, saying "bullshit".
That's a recorded sound, right? What exactly would be the desired
"txt output" you are imagining to come out?
> > F) multi thread, to process the tracks in parallel.
>
> What "tracks", and why does that imply "multi-threaded"?
>
> > G) divide this into 2 tracks,
>
> Divide _what_ into 2 tracks?
>
> > if possible one the inverse of the other.
>
> Why?
>
> Actually, following from our discussion I can see that this was me unnecessarily complicating things; putting all of the echoes in a line would work just as well
> echo 11 1 13 1 17 1 19 1 23 1 29 1 31 1 37 1 47 1 53 1
> this will add the selected previous input samples to the current input sample to make the output sample.
No it won't. Apparently, you haven't even looked at the syntax
of the echo effect, and you still haven't explained what you even
expect from it and why.
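The manpage gives the synopsis of the effect as `echo gain-in gain-out <delay decay>`: two gains first, then (delay, decay) pairs. A small Python sketch of reading the quoted argument list in that positional order shows the mismatch. The function is hypothetical and only splits the numbers by position; it is not SoX's parser and makes no claim about which values SoX accepts.

```python
def split_echo_args(args):
    """Split echo arguments as: gain-in, gain-out, then (delay, decay) pairs.

    Hypothetical helper following the manpage synopsis; it does no range
    checking of any kind.
    """
    if len(args) < 4 or len(args) % 2 != 0:
        raise ValueError("expected gain-in gain-out and at least one delay/decay pair")
    gain_in, gain_out = float(args[0]), float(args[1])
    pairs = [(float(args[i]), float(args[i + 1])) for i in range(2, len(args), 2)]
    return gain_in, gain_out, pairs


quoted = "11 1 13 1 17 1 19 1 23 1 29 1 31 1 37 1 47 1 53 1".split()
gain_in, gain_out, pairs = split_echo_args(quoted)
print("gain-in:", gain_in, "gain-out:", gain_out)
print("delays:", [delay for delay, _ in pairs])
```

Read this way, the leading "11 1" is taken as the two gains rather than as an 11 millisecond echo, so only nine delays remain.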
> > H) using the echo function (not echoes) add multiple echoes to each track,
> > these echoes should be at the same volume as the track and have no decay. It is these echoes superimposed on their original track that carry the information about how the sound is changing with time.
>
> The original soundwave already contains precisely that information,
> without superimposing any echo.
>
> As mentioned above, a wav file will contain all of the information that was recorded to make it; the individual samples that make up that wav file do not contain the information about other samples.
Yes. So what?
>
> > I) one track would have delays of 11, 17, 23, 31, 47 milliseconds.
> > And the other would be 13, 19, 29, 37, 53 milliseconds.
>
> Why?
>
> to add the selected previous input samples to the current input sample and make the output sample.
WHY?
> > J) these 2 tracks are then converted into a single binary file
> > by using sox's multiplication option.
>
> What "multiplication option"?
>
> Merging files by multiplication is a built-in option for sox; see the sox pdf. I do not need it now.
The only occurrence of 'multiplication' in the SoX manpage
is in the description of the hilbert phase-shifter. So
what 'multiplication option' (and what 'SoX pdf') are you talking about?
> > K) this is then given the .txt file name.
>
> You can give any file any name you want.
> What does the "txt" have to do with anything?
>
> I need the output to be .txt because chatscript needs .txt for its input file.
The "txt" most probably does not mean what you think it means.
In particular, it does not mean it is text.
> Although there has been much misunderstanding between us, you have helped me a lot.
> -d gain -n -D -b 23 -r 8k -e filename.raw dat root/chatin.txt echo 11 1 13 1 17 1 19 1 23 1 29 1 31 1 37 1 47 1 53 1
> or something like this, thanks
This is, of course, not even valid syntax of a SoX command.
Let's wrap it up: SoX cannot do what you want it to do.
PLEASE take this elsewhere. Don't forget your medicine.