ruby-core@ruby-lang.org archive (unofficial mirror)
 help / color / mirror / Atom feed
From: finch.parker@gmail.com
To: ruby-core@ruby-lang.org
Subject: [ruby-core:99270] [Ruby master Feature#17016] Enumerable#scan_left
Date: Wed, 22 Jul 2020 17:23:47 +0000 (UTC)	[thread overview]
Message-ID: <redmine.journal-86657.20200722172346.15511@ruby-lang.org> (raw)
In-Reply-To: redmine.issue-17016.20200707174828.15511@ruby-lang.org

Issue #17016 has been updated by parker (Parker Finch).


matz (Yukihiro Matsumoto) wrote in #note-17:
> I don't see any real-world use-case for `scan_left`. Is there any?


I think there are real-world use cases!

Would you consider converting a history of transactions into a history of account balances a valid use-case? That can be done easily with a scan. For example, if you have `transactions = [100, -200, 200]` then you can find the history of account balances with `transactions.scan_left(0, &:+) # => [0, 100, -100, 100]`.

I have described our current use case of `scan_left` [here](https://medium.com/building-panorama-education/scan-left-a-lazy-incremental-alternative-to-inject-f6e946f74c00). We use an event-sourced architecture where we scan over a lazy stream of events, building up the state as we go. It is important for our use case that we are able to access past states in addition to the final state.

[This post](https://medium.com/beauty-date-stories/algorithms-how-prefix-sums-can-help-improving-operations-over-arrays-b1f8e8141668) shows how it can make repeated subarray operations more efficient -- the example there is that if you have an array of values representing changes over time periods, then you can easily aggregate those into changes over different time periods using a `scan`. This post does not use `scan`, but instead has a workaround because `scan` doesn't exist:
```ruby
sums = [0]
(1..gains.length).each do |i|
  sums[i] = sums[i - 1] + gains[i - 1]
end
```
could, if `scan` was introduced, be replaced with:
```ruby
sums = gains.scan_left(0, &:+)
```

Do those use cases seem sufficient?

---

> In addition, the term `scan` does not seem to describe the behavior of keeping the past sequence of accumulation.
> Although I know the name from the Haskell community, I don't agree with the name.

I agree the name is not ideal. It is easy to get it confused with the idea of a string scanner. As you mentioned, `scan` is the name used by Haskell, but it is also used by [Scala](https://www.scala-lang.org/api/current/scala/Array.html), [Rust](https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.scan), and [C++](https://en.cppreference.com/w/cpp/algorithm/inclusive_scan). So it is a widely-used term, even if it's not the best one, which makes me think that it might be good to use it.

Alternatively, what do you think about the name `accumulate`, which [Wolfram uses](https://reference.wolfram.com/language/ref/Accumulate.html)? I think it gets the idea across better than `scan`.

Are there other names you think should be considered?

----------------------------------------
Feature #17016: Enumerable#scan_left
https://bugs.ruby-lang.org/issues/17016#change-86657

* Author: parker (Parker Finch)
* Status: Open
* Priority: Normal
----------------------------------------
## Proposal

Add a `#scan_left` method to `Enumerable`.

(The name "scan_left" is based on Scala's scanLeft and Haskell's scanl. It seems like "scan_left" would be a ruby-ish name for  this concept, but I'm curious if there are other thoughts on naming here!)

## Background

`#scan_left` is similar to `#inject`, but it accumulates the partial results that are computed. As a comparison:
```
[1, 2, 3].inject(0, &:+) => 6
[1, 2, 3].scan_left(0, &:+) => [0, 1, 3, 6]
```

Notably, the `scan_left` operation can be done lazily since it doesn't require processing the entire collection before computing a value.

I recently described `#scan_left`, and its relationship to `#inject`, more thoroughly in [this blog post](https://medium.com/building-panorama-education/scan-left-a-lazy-incremental-alternative-to-inject-f6e946f74c00).

## Reasoning
We heavily rely on the scan operation. We use an [event-sourcing](https://martinfowler.com/eaaDev/EventSourcing.html) pattern, which means that we are scanning over individual "events" and building up the corresponding state. We rely on the history of states and need to do this lazily (we stream events because they cannot fit in memory). Thus the scan operation is much more applicable than the inject operation.

We suspect that there are many applications that could leverage the scan operation. [This question](https://stackoverflow.com/questions/1475808/cumulative-array-sum-in-ruby) would be more easily answered by `#scan_left`. It is a natural fit for any application that needs to store the incrementally-computed values of an `#inject`, and a requirement for an application that needs to use `#inject` while maintaining laziness.

## Implementation
There is a Ruby implementation of this functionality [here](https://github.com/panorama-ed/scan_left/) and an implementation in C [here](https://github.com/ruby/ruby/pull/3078).

## Counterarguments
Introducing a new public method is committing to maintenance going forward and expands the size of the Ruby codebase -- it should not be done lightly. I think that providing the functionality here is worth the tradeoff, but I understand any hesitation to add yet more to Ruby!

---Files--------------------------------
scan_left_example.rb (2.93 KB)


-- 
https://bugs.ruby-lang.org/

  parent reply	other threads:[~2020-07-22 17:24 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-07 17:48 [ruby-core:99078] [Ruby master Feature#17016] Enumerable#scan_left finch.parker
2020-07-08  4:00 ` [ruby-core:99083] " sawadatsuyoshi
2020-07-08 14:51 ` [ruby-core:99089] " finch.parker
2020-07-09 21:33 ` [ruby-core:99102] " eregontp
2020-07-10 20:57 ` [ruby-core:99117] " finch.parker
2020-07-11  5:46 ` [ruby-core:99121] " nobu
2020-07-11 17:47 ` [ruby-core:99129] " eregontp
2020-07-11 17:49 ` [ruby-core:99130] " eregontp
2020-07-12  8:52 ` [ruby-core:99133] " nobu
2020-07-12 11:32 ` [ruby-core:99134] " eregontp
2020-07-12 23:51 ` [ruby-core:99141] " shyouhei
2020-07-14 17:50 ` [ruby-core:99167] " finch.parker
2020-07-14 18:20 ` [ruby-core:99168] " finch.parker
2020-07-15 14:33 ` [ruby-core:99177] " eregontp
2020-07-16  5:17 ` [ruby-core:99189] " mame
2020-07-17 14:29 ` [ruby-core:99203] " finch.parker
2020-07-18  3:23 ` [ruby-core:99212] " nobu
2020-07-20  6:02 ` [ruby-core:99234] " matz
2020-07-22 17:23 ` finch.parker [this message]
2020-07-22 18:12 ` [ruby-core:99274] " msiegel
2020-07-23  3:12 ` [ruby-core:99286] " nobu
2020-07-23 15:39 ` [ruby-core:99301] " finch.parker
2020-07-23 21:44 ` [ruby-core:99307] " annikoff
2020-07-24  2:30 ` [ruby-core:99310] " nobu
2020-07-25  6:20 ` [ruby-core:99327] " duerst
2020-07-25  7:49 ` [ruby-core:99328] " duerst
2020-07-25 10:12 ` [ruby-core:99330] " annikoff
2020-07-28 17:42 ` [ruby-core:99367] " msiegel
2020-07-28 23:34 ` [ruby-core:99378] " duerst
2020-07-30 19:16 ` [ruby-core:99404] " finch.parker
2020-07-31  5:47 ` [ruby-core:99410] " nobu
2020-08-10 16:47 ` [ruby-core:99545] " finch.parker
2020-08-21 14:22 ` [ruby-core:99664] " finch.parker
2020-08-29 11:41 ` [ruby-core:99769] " daniel
2020-09-03 14:57 ` [ruby-core:99880] " finch.parker
2020-11-10 14:06 ` [ruby-core:100762] " annikoff
2020-11-30 14:38 ` [ruby-core:101156] " finch.parker
2021-04-13 14:51 ` [ruby-core:103429] " finch.parker
2021-04-17  8:15 ` [ruby-core:103497] [Ruby master Feature#17016] Enumerable#accumulate mame
2021-04-27  9:05 ` [ruby-core:103614] " duerst
2021-04-27 15:14 ` [ruby-core:103615] " msiegel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-list from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.ruby-lang.org/en/community/mailing-lists/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=redmine.journal-86657.20200722172346.15511@ruby-lang.org \
    --to=ruby-core@ruby-lang.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).