ruby-core@ruby-lang.org archive (unofficial mirror)
From: 6ftdan@gmail.com
To: ruby-core@ruby-lang.org
Subject: [ruby-core:72143] [Ruby trunk - Feature #11815] Proposal for method `Array#difference`
Date: Tue, 15 Dec 2015 12:27:03 +0000
Message-ID: <redmine.journal-55552.20151215122701.09106c2af2e27dc9@ruby-lang.org>
In-Reply-To: redmine.issue-11815.20151214082709@ruby-lang.org

Issue #11815 has been updated by Daniel P. Clark.


I like how your Array#difference method works well with duplicate entries.  So far I've only run into cases where I needed to find which id references differ between two lists.  In my case it looks like this:

~~~ruby
a = [2, 4, 6, 8, 2, 4, 6, 8]
b = [1, 2, 3, 4, 1, 2, 3, 4]

# example
b - a
# => [1, 3, 1, 3]

b - a | a - b
# => [1, 3, 6, 8]
~~~

This is like the example you first gave, with `| b - a` added to get a two-way comparison of the unique values.  Getting the same result with Array#difference looks the same as my example above.

~~~ruby
a = [2, 4, 6, 8, 2, 4, 6, 8]
b = [1, 2, 3, 4, 1, 2, 3, 4]

# example
b.difference(a)
# => [1, 3, 1, 3]

b.difference(a) | a.difference(b)
# => [1, 3, 6, 8]
~~~

To avoid confusion: the two are not the same in general, as I will demonstrate with Cary Swoveland's input.

~~~ruby
a = [1,2,3,4,3,2,2,4]
b = [2,3,4,4,4]

b.difference(a)
# => [4] 
b - a
# => []

a.difference(b)
# => [1, 3, 2, 2] 
a - b
# => [1]
~~~

As for a real-world use case for Array#difference: Service (A) exports all data to CSV files with a background worker. Service (B) exports to a database with a background worker.  Sometimes a background worker crashes.  To figure out what's missing, we compare the difference between the two datasets.  *One flaw in my example is that it does not determine the position at which the new data needs to be inserted to match the other dataset.  In that case we would need to use something like Enumerator#with_index.*

@Cary Swoveland: if I could make one recommendation on the implementation, I think it would be best to have it return an Enumerator so it can be used with lazy evaluation.  That way, while the difference is being computed, we can perform operations along the way and save system resources.
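
To make the Enumerator suggestion a bit more concrete, here is a minimal sketch of what a lazy variant might look like. The module and method names (`MultisetDiff.lazy_difference`) are hypothetical and not part of the proposal. It also assumes it is acceptable to count elements in a Hash, which compares with `eql?`/`hash` rather than the `==` used by `Array#index`, so `1` and `1.0` would not match here.

```ruby
# Hypothetical module and method names, used only for illustration.
module MultisetDiff
  # Returns a lazy enumerator over the elements of +array+ that remain after
  # removing, for each element of +other+, one matching element of +array+
  # (earliest occurrences are removed first).
  def self.lazy_difference(array, other)
    Enumerator.new do |yielder|
      # Count how many occurrences of each value still have to be removed.
      # Built inside the block so every enumeration starts from fresh counts.
      remaining = Hash.new(0)
      other.each { |e| remaining[e] += 1 }

      array.each do |e|
        if remaining[e] > 0
          remaining[e] -= 1   # matched against an element of +other+: skip it
        else
          yielder << e        # unmatched: hand it to the consumer
        end
      end
    end.lazy
  end
end

a = [1, 2, 3, 4, 3, 2, 2, 4]
b = [2, 3, 4, 4, 4]

MultisetDiff.lazy_difference(a, b).first(2) # => [1, 3]
MultisetDiff.lazy_difference(a, b).to_a     # => [1, 3, 2, 2]
```

With `first(2)` only as much of `a` is walked as is needed to produce two results, which is the resource saving I had in mind.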

----------------------------------------
Feature #11815: Proposal for method `Array#difference`
https://bugs.ruby-lang.org/issues/11815#change-55552

* Author: Cary Swoveland
* Status: Open
* Priority: Normal
* Assignee: 
----------------------------------------
I propose that a method `Array#difference` be added to the Ruby core. It is similar to [Array#-](http://ruby-doc.org/core-2.2.0/Array.html#method-i-2D) but for each element of the (array) argument it removes only one matching element from the receiver. For example:

    a = [1,2,3,4,3,2,2,4]
    b = [2,3,4,4,4]

    a - b #=> [1]
    c = a.difference b #=> [1, 3, 2, 2] 

As you see, `a` contains three `2`'s and `b` contains one, so only the first `2` in `a` has been removed from `a` in constructing `c`. When `b` contains at least as many instances of an element as does `a`, `c` contains no instances of that element.

It could be implemented as follows:

    class Array
      def difference(other)
        dup.tap do |cpy|
          other.each do |e|
            ndx = cpy.index(e)
            cpy.delete_at(ndx) if ndx
          end
        end
      end
    end

Here are a few examples of its use:

*Identify an array's unique elements*

      a = [1,3,2,4,3,4]
      u = a.uniq #=> [1, 3, 2, 4]
      u - a.difference(u) #=> [1, 2]

*Determine if two words of the same size are anagrams of each other*

      w1, w2 = "stop", "pots"
      w1.chars.difference(w2.chars).empty?
        #=> true
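
The "same size" condition matters because the difference is one-way. The following is a small sketch of a check that makes the condition explicit. The helper names `multiset_difference` and `anagrams?` are hypothetical; the first simply mirrors the proposed implementation as a plain method so that the example does not reopen `Array`.

```ruby
# Hypothetical standalone helper mirroring the proposed Array#difference,
# written as a plain method so that it does not reopen Array.
def multiset_difference(array, other)
  array.dup.tap do |cpy|
    other.each do |e|
      ndx = cpy.index(e)
      cpy.delete_at(ndx) if ndx
    end
  end
end

# Hypothetical helper: true when the two words use exactly the same letters.
def anagrams?(w1, w2)
  w1.size == w2.size && multiset_difference(w1.chars, w2.chars).empty?
end

anagrams?("stop", "pots")  # => true
anagrams?("stop", "stops") # => false

# Without the size check, the one-way difference is empty even though
# "stops" has an extra letter:
multiset_difference("stop".chars, "stops".chars).empty? # => true
```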

*Identify a maximal number of 1-1 matches between the elements of two arrays and return an array of all elements from both arrays that were not matched*

      a = [1, 2, 4, 2, 1, 7, 4, 2, 9] 
      b = [4, 7, 3, 2, 2, 7] 
      a.difference(b).concat(b.difference(a))
        #=> [1, 1, 4, 2, 9, 3, 7] 
  
To remove elements from `a` starting at the end (rather than the beginning) of `a`:

    a = [1,2,3,4,3,2,2,4]
    b = [2,3,4,4,4]

    a.reverse.difference(b).reverse #=> [1,2,3,2]

`Array#difference!` could be defined in the obvious way.
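
One possible reading of "the obvious way" is sketched below. This is only an assumption about how the in-place variant might behave: it always returns the receiver, whereas some existing bang methods return `nil` when nothing changed, and that choice is left open here.

```ruby
class Array
  # Hypothetical in-place variant: for each element of +other+, deletes the
  # first matching element of the receiver, then returns the receiver.
  def difference!(other)
    # Guard against mutating the array we are iterating over.
    other = other.dup if other.equal?(self)
    other.each do |e|
      ndx = index(e)
      delete_at(ndx) if ndx
    end
    self
  end
end

a = [1, 2, 3, 4, 3, 2, 2, 4]
a.difference!([2, 3, 4, 4, 4])
a # => [1, 3, 2, 2]
```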



-- 
https://bugs.ruby-lang.org/
