user/dev discussion of public-inbox itself
* How to force stricter threading
@ 2020-03-09 13:15 Konstantin Ryabitsev
  2020-03-11 10:20 ` Eric Wong
  2020-03-19  7:22 ` Eric Wong
  0 siblings, 2 replies; 5+ messages in thread
From: Konstantin Ryabitsev @ 2020-03-09 13:15 UTC (permalink / raw)
  To: meta

Hello:

I think public-inbox currently does some heuristic-based threading, 
which may actually not be that useful. For example:

https://lore.kernel.org/linux-renesas-soc/20200217101741.3758-1-geert+renesas@glider.be/

None of the [PATCH] messages have references or in-reply-to set, but for 
some reason they are threaded together. I can generally see this being 
useful for exact subject matches, but in this case all of the subjects 
are different (despite being similar).

Is there a way to enforce stricter threading rules?

-K

* Re: How to force stricter threading
  2020-03-09 13:15 How to force stricter threading Konstantin Ryabitsev
@ 2020-03-11 10:20 ` Eric Wong
  2020-03-19  7:22 ` Eric Wong
  1 sibling, 0 replies; 5+ messages in thread
From: Eric Wong @ 2020-03-11 10:20 UTC (permalink / raw)
  To: meta

Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
> Hello:
> 
> I think public-inbox currently does some heuristic-based threading, 
> which may actually not be that useful. For example:
> 
> https://lore.kernel.org/linux-renesas-soc/20200217101741.3758-1-geert+renesas@glider.be/
> 
> None of the [PATCH] messages have references or in-reply-to set, but for 
> some reason they are threaded together. I can generally see this being 
> useful for exact subject matches, but in this case all of the subjects 
> are different (despite being similar).

That's a strange bug.  Will have to look at it another time.

> Is there a way to enforce stricter threading rules?

Right now, threading is a combination of strict matching on the
References/In-Reply-To headers and exact subject matching by hash.
It might be possible to disable the subject matching part...

But that case is a strange bug which I'll have to examine more
closely another time (maybe later this week, but more likely
next).
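
To make the two rules above concrete, here is a minimal Python sketch
of threading by References/In-Reply-To first, with a fallback to an
exact subject hash.  It is not public-inbox's code (which is Perl);
the function names, the message dict layout, the "Re:" stripping and
the match_subjects switch are all assumptions made for illustration.

```python
import hashlib
import re


def subject_key(subject):
    """Hash the subject after dropping leading "Re:" prefixes (assumed rule)."""
    base = re.sub(r"^(?:\s*re:\s*)+", "", subject, flags=re.IGNORECASE).strip()
    return hashlib.sha1(base.encode("utf-8")).hexdigest()


def assign_threads(messages, match_subjects=True):
    """Return {message-id: thread-id} for messages given in arrival order.

    Each message is a dict with "id", "refs" (message-ids taken from its
    References/In-Reply-To headers) and "subject".
    """
    thread_of = {}          # message-id -> thread-id
    thread_by_subject = {}  # subject hash -> thread-id of first message seen
    next_tid = 0
    for msg in messages:
        tid = None
        # Strict rule: join the thread of the first referenced message we know.
        for ref in msg["refs"]:
            if ref in thread_of:
                tid = thread_of[ref]
                break
        key = subject_key(msg["subject"])
        # Fallback rule: an exact subject match, only when enabled.
        if tid is None and match_subjects:
            tid = thread_by_subject.get(key)
        if tid is None:
            tid = next_tid
            next_tid += 1
        thread_of[msg["id"]] = tid
        thread_by_subject.setdefault(key, tid)
    return thread_of
```

Because the fallback compares hashes, subjects that are merely similar
never match, so this rule alone would not explain the example above.
Calling assign_threads(messages, match_subjects=False) corresponds to
the "disable the subject matching part" idea.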

* Re: How to force stricter threading
  2020-03-09 13:15 How to force stricter threading Konstantin Ryabitsev
  2020-03-11 10:20 ` Eric Wong
@ 2020-03-19  7:22 ` Eric Wong
  2020-03-19  7:58   ` Eric Wong
  1 sibling, 1 reply; 5+ messages in thread
From: Eric Wong @ 2020-03-19  7:22 UTC (permalink / raw)
  To: meta

Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
> Hello:
> 
> I think public-inbox currently does some heuristic-based threading, 
> which may actually not be that useful. For example:
> 
> https://lore.kernel.org/linux-renesas-soc/20200217101741.3758-1-geert+renesas@glider.be/
> 
> None of the [PATCH] messages have references or in-reply-to set, but for 
> some reason they are threaded together. I can generally see this being 
> useful for exact subject matches, but in this case all of the subjects 
> are different (despite being similar).

So the "Patchwork summary for: linux-renesas-soc" message:

https://lore.kernel.org/linux-renesas-soc/158229483332.12219.5639020605006542672.git-patchwork-summary@kernel.org/raw

has the following header:

References: <20200217101741.3758-1-geert+renesas@glider.be>,
 <20200218112414.5591-1-geert+renesas@glider.be>,
 <20200218112449.5723-1-geert+renesas@glider.be>,
 <20200219153929.11073-1-geert+renesas@glider.be>,
 <20200218132217.21454-1-geert+renesas@glider.be>,
 <20200217103251.5205-1-geert+renesas@glider.be>

Which seems to have tied a bunch of unrelated threads together
as one, similar to how a merge commit works in git but is
unexpected and rare for mail threads.

> Is there a way to enforce stricter threading rules?

So I think the internal indexing database behavior is correct
in tying a bunch of unrelated threads together based on that
References: header.
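
A rough Python sketch of why one References: header can merge several
threads: if every referenced Message-ID is treated as belonging to the
same thread as the referencing message, a union-find structure ends up
joining all of them.  ThreadIndex and parse_references are hypothetical
names, and this is only an illustration of the effect, not the actual
indexing code.

```python
import re

# A message-id is whatever sits between angle brackets; separators such as
# the commas in the header above are simply skipped.
MSGID_RE = re.compile(r"<([^<>\s]+)>")


def parse_references(value):
    """Extract bare message-ids from a References: header value."""
    return MSGID_RE.findall(value)


class ThreadIndex:
    """Union-find over message-ids; each set is one thread."""

    def __init__(self):
        self.parent = {}

    def find(self, mid):
        """Return the representative message-id of mid's thread."""
        self.parent.setdefault(mid, mid)
        while self.parent[mid] != mid:
            # Path halving keeps later lookups short.
            self.parent[mid] = self.parent[self.parent[mid]]
            mid = self.parent[mid]
        return mid

    def add(self, mid, references=()):
        """Index a message and merge it with every thread it references."""
        root = self.find(mid)
        for ref in references:
            ref_root = self.find(ref)
            if ref_root != root:
                self.parent[ref_root] = root
```

Adding each [PATCH] message with no references leaves them in separate
threads; adding the summary message with all six IDs from its
References: header then gives every one of them the same find() result.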

But the thread rendering could be improved.  What mutt does
seems alright, but doesn't convey the "merge" scenario
(I think) your bot was going for...

* Re: How to force stricter threading
  2020-03-19  7:22 ` Eric Wong
@ 2020-03-19  7:58   ` Eric Wong
  2020-03-19 18:45     ` Konstantin Ryabitsev
  0 siblings, 1 reply; 5+ messages in thread
From: Eric Wong @ 2020-03-19  7:58 UTC (permalink / raw)
  To: meta

Eric Wong <e@yhbt.net> wrote:
> So the "Patchwork summary for: linux-renesas-soc" message:
> 
> https://lore.kernel.org/linux-renesas-soc/158229483332.12219.5639020605006542672.git-patchwork-summary@kernel.org/raw
> 
> has the following header:
> 
> References: <20200217101741.3758-1-geert+renesas@glider.be>,
>  <20200218112414.5591-1-geert+renesas@glider.be>,
>  <20200218112449.5723-1-geert+renesas@glider.be>,
>  <20200219153929.11073-1-geert+renesas@glider.be>,
>  <20200218132217.21454-1-geert+renesas@glider.be>,
>  <20200217103251.5205-1-geert+renesas@glider.be>
> 
> Which seems to have tied a bunch of unrelated threads together
> as one, similar to how a merge commit works in git but is
> unexpected and rare for mail threads.

<snip>

> But the thread rendering could be improved.  What mutt does
> seems alright, but doesn't convey the "merge" scenario
> (I think) your bot was going for...

Fwiw, RFC 5322 sec 3.6.4 states:

	Therefore, trying to form a "References:" field for a reply that
	has multiple parents is discouraged; how to do so is not defined
	in this document.

So there's also that...
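
For contrast, the single-parent case that the same section does define
can be sketched in Python as below: the reply's References: is the
parent's References: (or, failing that, a lone In-Reply-To:) followed
by the parent's Message-ID.  The function name and the plain dict with
exact-case header names are assumptions for illustration only.

```python
import re

MSGID_RE = re.compile(r"<[^<>\s]+>")


def reply_threading_headers(parent):
    """Build In-Reply-To and References for a reply to exactly one parent.

    `parent` maps header names ("Message-ID", "References", "In-Reply-To")
    to their raw values; missing headers are treated as empty.
    """
    parent_id = MSGID_RE.findall(parent.get("Message-ID", ""))
    chain = MSGID_RE.findall(parent.get("References", ""))
    if not chain:
        # No usable References: fall back to In-Reply-To, but only when it
        # names a single message.
        in_reply_to = MSGID_RE.findall(parent.get("In-Reply-To", ""))
        if len(in_reply_to) == 1:
            chain = in_reply_to
    headers = {}
    if parent_id:
        headers["In-Reply-To"] = parent_id[0]
    if chain or parent_id:
        # Message-ids are separated by whitespace, not commas.
        headers["References"] = " ".join(chain + parent_id[:1])
    return headers
```

There is deliberately no way to pass a second parent here, which
mirrors the point of the quoted paragraph.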

* Re: How to force stricter threading
  2020-03-19  7:58   ` Eric Wong
@ 2020-03-19 18:45     ` Konstantin Ryabitsev
  0 siblings, 0 replies; 5+ messages in thread
From: Konstantin Ryabitsev @ 2020-03-19 18:45 UTC (permalink / raw)
  To: Eric Wong; +Cc: meta

On Thu, Mar 19, 2020 at 07:58:20AM +0000, Eric Wong wrote:
> > So the "Patchwork summary for: linux-renesas-soc" message:
> > 
> > https://lore.kernel.org/linux-renesas-soc/158229483332.12219.5639020605006542672.git-patchwork-summary@kernel.org/raw
> > 
> > has the following header:
> > 
> > References: <20200217101741.3758-1-geert+renesas@glider.be>,
> >  <20200218112414.5591-1-geert+renesas@glider.be>,
> >  <20200218112449.5723-1-geert+renesas@glider.be>,
> >  <20200219153929.11073-1-geert+renesas@glider.be>,
> >  <20200218132217.21454-1-geert+renesas@glider.be>,
> >  <20200217103251.5205-1-geert+renesas@glider.be>
> > 
> > Which seems to have tied a bunch of unrelated threads together
> > as one, similar to how a merge commit works in git but is
> > unexpected and rare for mail threads.

Thanks for tracking that down.

> > But the thread rendering could be improved.  What mutt does
> > seems alright, but doesn't convey the "merge" scenario
> > (I think) your bot was going for...
> 
> Fwiw, RFC 5322 sec 3.6.4 states:
> 
> 	Therefore, trying to form a "References:" field for a reply that
> 	has multiple parents is discouraged; how to do so is not defined
> 	in this document.
> 
> So there's also that...

I plan to stop adding these to the References: header and just give
lore.kernel.org/r/<msgid> links instead. I don't think anyone will mind,
and it will avoid this problem.
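
Something like the following Python sketch could turn a Message-ID
into such a link.  The lore_link name, the https:// scheme, and the
choice to leave "@" and "+" unescaped while percent-encoding anything
else unusual (such as "/") are assumptions, not a description of how
the bot actually works.

```python
from urllib.parse import quote


def lore_link(msgid):
    """Return a lore.kernel.org/r/ link for a Message-ID (brackets optional)."""
    bare = msgid.strip().lstrip("<").rstrip(">")
    # Keep "@" and "+" readable, as in the lore URLs quoted in this thread;
    # everything else outside the unreserved set gets percent-encoded.
    return "https://lore.kernel.org/r/" + quote(bare, safe="@+")
```

Such links can go in the body of the summary message, so the summary
keeps pointing at each patch without affecting threading.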

Thanks again,
-K
