From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net
X-Spam-Level:
X-Spam-ASN:
X-Spam-Status: No, score=-4.0 required=3.0 tests=ALL_TRUSTED,BAYES_00
	shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2
Received: from localhost (dcvr.yhbt.net [127.0.0.1])
	by dcvr.yhbt.net (Postfix) with ESMTP id 343241F4C0;
	Thu, 24 Oct 2019 22:34:51 +0000 (UTC)
Date: Thu, 24 Oct 2019 22:34:51 +0000
From: Eric Wong
To: meta@public-inbox.org
Subject: Re: RFC: monthly epochs for v2
Message-ID: <20191024223451.GA17949@dcvr>
References: <20191024195304.5b7zlx7e3vxfxmtg@chatter.i7.local>
	<20191024203503.GA31522@dcvr>
	<20191024212108.zfbwh7bmfbo3cgu5@chatter.i7.local>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <20191024212108.zfbwh7bmfbo3cgu5@chatter.i7.local>
List-Id:

Konstantin Ryabitsev wrote:
> On Thu, Oct 24, 2019 at 08:35:03PM +0000, Eric Wong wrote:
> > > - if someone is only interested in a few months worth of archives, they
> > > don't have to clone the entire collection
> > > - similarly, someone using public-inbox to feed messages to their inbox
> > > (e.g. using the l2md tool [1]) doesn't need to waste gigabytes storing
> > > archives they aren't interested in
> >
> > NNTP or d:YYYYMMDD..YYYYMMDD mboxrd downloads via HTTP search
> > are better suited for those cases.
>
> I know you really like nntp, but I'm worried that with Big Corp's love of
> deep packet inspection and filtering, NNTP ports aren't going to be usable
> by a large subset of developers. We already have enough problems with port
> 9418 not being reachable (and sometimes not even port 22). Since usenet's
> descent into mostly illegal content, many corporate environments probably
> have ports 119 and 563 blocked off entirely and changing that would be
> futile.

I would consider the possibility of an HTTP API which looks like
NNTP commands, too.
But it wouldn't work with existing NNTP clients...
Maybe websockets can be used *shrug*

NNTP can also run off 80/443 if somebody has an extra IP.
Not sure if supporting HTTP and NNTP off the same port is a
possibility since some HTTP clients pre-connect TCP and NNTP is
server-talks-first whereas HTTP is client-talks-first.

> > If people only want a backup via git (and not host HTTP/NNTP),
> > it's FAR easier for them to run ubiquitous commands such as
> > "git clone --mirror && git fetch" rather than
> > "install $TOOL which may be out-of-date-or-missing-on-your-distro"
>
> I think that anyone who is likely to use public-inbox repositories for more
> than just a copy of archives is likely to be using some kind of tool. I
> mean, SMTP can be used with "telnet" but nobody really does. :) If we
> provide a convenient library that supports things like intelligent selective
> cloning, indexing, fetching messages, etc, then that would avoid everyone
> doing it badly. In fact, libpublicinbox and bindings to most common
> languages is probably something that should happen early on.

I'm not sure about a libpublicinbox... I have been really
hesitant to depend on shared C/C++ libraries whenever I use Perl
or Ruby because of build and install complexity; especially for
stuff that's not-yet-available on distros.

Well-defined and stable protocols + data formats? Yes. 100 times yes.

What would be nice is to have a local server so they could
access everything via HTTP using curl or whatever HTTP library
users want. On shared systems, it could be HTTP over a UNIX
socket. I don't think libcurl supports Unix domain sockets, yet,
but HTTP/1.1 parsers are pretty common.

JSON is a possibility, too; but I'm not sure if JSON is even
necessary if all that's exchanged are git blob OIDs and URLs for
mboxes. Parsing MIME + RFC822(-ish) are already sunk costs.
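
To make the "HTTP over a UNIX socket" idea concrete, here is a
rough sketch in Python using only the standard library. It is an
illustration under assumptions, not anything public-inbox ships:
the /oids endpoint, the pi.sock path, the fake OIDs and all class
and function names are made up for the example. The reply body is
plain text with one OID per line, so no JSON is involved.

```python
import http.client
import os
import socket
import socketserver
import tempfile
import threading
from http.server import BaseHTTPRequestHandler


class UnixHTTPConnection(http.client.HTTPConnection):
    """http.client connection that dials a UNIX socket instead of TCP."""

    def __init__(self, sock_path, timeout=10):
        # "localhost" only ends up in the Host: header; no TCP is used.
        super().__init__("localhost", timeout=timeout)
        self.sock_path = sock_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.sock_path)
        self.sock = sock


class DemoHandler(BaseHTTPRequestHandler):
    """Stand-in for a local server; /oids is a hypothetical endpoint."""

    def do_GET(self):
        if self.path == "/oids":
            body = b"0123abcd\n4567ef01\n"  # fake blob OIDs, one per line
            self.send_response(200)
        else:
            body = b"not found\n"
            self.send_response(404)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # client_address is empty on UNIX sockets, so skip logging


def fetch(sock_path, url_path):
    """GET url_path over the UNIX socket; return (status, body bytes)."""
    conn = UnixHTTPConnection(sock_path)
    try:
        conn.request("GET", url_path)
        resp = conn.getresponse()
        return resp.status, resp.read()
    finally:
        conn.close()


def demo():
    """Start the stand-in server on a temporary socket and query it."""
    sock_path = os.path.join(tempfile.mkdtemp(), "pi.sock")
    srv = socketserver.UnixStreamServer(sock_path, DemoHandler)
    threading.Thread(target=srv.serve_forever, daemon=True).start()
    try:
        return fetch(sock_path, "/oids")
    finally:
        srv.shutdown()
        srv.server_close()


if __name__ == "__main__":
    status, body = demo()
    print(status, body.decode().split())
```

The client side is just an ordinary HTTP library with connect()
swapped out, which is the point: nothing beyond an HTTP/1.1
implementation and a socket path is needed on a shared system.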