user/dev discussion of public-inbox itself
* ssoma_repository.txt on Duplicate Message-IDs.
@ 2017-05-16 17:11 Ralph Corderoy
  2017-05-16 17:31 ` Eric Wong
  0 siblings, 1 reply; 2+ messages in thread
From: Ralph Corderoy @ 2017-05-16 17:11 UTC (permalink / raw)
  To: meta

Hi,

https://ssoma.public-inbox.org/ssoma_repository.txt says

    Thus the blobs for conflicting Message-IDs will be the SHA-1
    hexdigest of the Subject header and raw body (no extra whitespace
    delimiting the two).

        PFX=21/4527ce3741f50bb9afa65e7c5003c8a8ddc4b1

        $PFX/287d8b67bf8ebdb30e34cb4ca9995dbd465f37aa # first copy
        $PFX/287d8b67bf8ebdb30e34cb4ca9995dbd465f37ab # second copy
        $PFX/287d8b67bf8ebdb30e34cb4ca9995dbd465f37ac # third copy

So what happens when a second email with the same (message-ID, subject,
body) arrives, but different Date, CC, etc?

Also, how do the three copies above have near identical digests?

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy

* Re: ssoma_repository.txt on Duplicate Message-IDs.
  2017-05-16 17:11 ssoma_repository.txt on Duplicate Message-IDs Ralph Corderoy
@ 2017-05-16 17:31 ` Eric Wong
  0 siblings, 0 replies; 2+ messages in thread
From: Eric Wong @ 2017-05-16 17:31 UTC (permalink / raw)
  To: Ralph Corderoy; +Cc: meta

Ralph Corderoy <ralph@inputplus.co.uk> wrote:
> Hi,
> 
> https://ssoma.public-inbox.org/ssoma_repository.txt says
> 
>     Thus the blobs for conflicting Message-IDs will be the SHA-1
>     hexdigest of the Subject header and raw body (no extra whitespace
>     delimiting the two).
> 
>         PFX=21/4527ce3741f50bb9afa65e7c5003c8a8ddc4b1
> 
>         $PFX/287d8b67bf8ebdb30e34cb4ca9995dbd465f37aa # first copy
>         $PFX/287d8b67bf8ebdb30e34cb4ca9995dbd465f37ab # second copy
>         $PFX/287d8b67bf8ebdb30e34cb4ca9995dbd465f37ac # third copy
> 
> So what happens when a second email with the same (message-ID, subject,
> body) arrives, but different Date, CC, etc?

They're skipped, unfortunately.  I suppose certain headers
(From/To/Date/???) are more important than others (Received)
and warrant storing the extra copy...  this was probably a design
flaw, but I haven't worked on ssoma much in recent years.
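
A minimal Python sketch of the naming rule quoted above, to show why such a copy is skipped. It is not ssoma's actual code. It assumes "Subject header" means the header value only (without the "Subject: " field name), and that the PFX in the example is a 2/38 split of a 40-character hex digest; the thread does not say what that digest is computed from. The names conflict_blob_name and fanout_path are hypothetical.

```python
import hashlib
from email import message_from_bytes


def conflict_blob_name(raw: bytes) -> str:
    """SHA-1 hex digest of the Subject value followed directly by the raw body.

    Assumption: only the header value is hashed, with no separator
    between it and the body ("no extra whitespace delimiting the two").
    """
    # The raw body is everything after the first blank line.
    _headers, _sep, body = raw.partition(b"\n\n")
    subject = message_from_bytes(raw)["Subject"] or ""
    return hashlib.sha1(subject.encode("utf-8", "surrogateescape") + body).hexdigest()


def fanout_path(hexdigest: str) -> str:
    """Split a 40-character hex digest as 2/38, like PFX in the quoted example."""
    return hexdigest[:2] + "/" + hexdigest[2:]


first = (
    b"Message-ID: <dup@example.invalid>\n"
    b"Date: Tue, 16 May 2017 17:11:00 +0000\n"
    b"Subject: hello\n"
    b"\n"
    b"same body\n"
)
second = (
    b"Message-ID: <dup@example.invalid>\n"
    b"Date: Wed, 17 May 2017 09:00:00 +0000\n"
    b"Cc: someone@example.invalid\n"
    b"Subject: hello\n"
    b"\n"
    b"same body\n"
)

# Date and Cc never enter the hash, so both messages map to the same
# blob name and the second one has nowhere distinct to be stored.
print(conflict_blob_name(first) == conflict_blob_name(second))
```

Under these assumptions, storing the extra copy would require hashing more headers (for example From, To and Date) into the name as well.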

FWIW, current public-inbox hasn't implemented this Message-ID
conflict resolution at all (it no longer uses ssoma); duplicates
just get lost entirely.  (AFAIK, it's not possible to implement
NNTP correctly with duplicate Message-IDs.)

> Also, how do the three copies above have near identical digests?

They're made up, I just wanted :)

Code repositories for project(s) associated with this public inbox

	https://80x24.org/public-inbox.git
