ruby-core@ruby-lang.org archive (unofficial mirror)
From: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>
To: Maurice Smulders <maurice.smulders@genevatech.net>
Cc: Ruby developers <ruby-core@ruby-lang.org>
Subject: [ruby-core:99482] Re: Reduction of ENCODER files for embedded systems
Date: Wed, 5 Aug 2020 16:14:21 +0900
Message-ID: <4c0ce664-c176-d637-1175-19158ca1ca2a@it.aoyama.ac.jp>
In-Reply-To: <CACxOKoKQ=UkXE570n2ruijD0ikzCLp+gcx1Fn=dD2Gso1hL3jA@mail.gmail.com>

On 05/08/2020 02:27, Maurice Smulders wrote:
> What is the best way to not build/remove the encoder files in enc and
> trans in the ruby source tree?
> 
> I am building for an embedded system. The code running on it will only
> ever support US-ASCII, and reduction of size is paramount...
> 
> Thanks,
> 

Hello Maurice,

I have been involved in the transcoding part, but that was quite some 
time ago.

First, for embedded systems, I'd definitely also have a look at mruby 
(http://mruby.org/).

Second, I'd have a look at miniruby, which uses only a few encodings.

Third, I'd just start by removing some of the relevant files in enc and 
enc/trans, and see what happens (with the make process, testing, etc.).
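
To decide which files to try removing first, it may help to see which ones are largest. The following is only a rough sketch: the helper name largest_encoding_sources and the file extensions it looks at (.c, .h, .trans) are assumptions made for illustration, and source file size is at best a loose proxy for what ends up in the built binary.

```ruby
# Hypothetical helper (not part of the Ruby source tree): list files under
# enc/ (including enc/trans) in a Ruby checkout, largest first.
# The extensions below are an assumption; adjust them to what the tree contains.
def largest_encoding_sources(source_root, limit = 20)
  pattern = File.join(source_root, "enc", "**", "*.{c,h,trans}")
  prefix = File.join(source_root, "")

  Dir.glob(pattern)
     .select { |path| File.file?(path) }
     .map { |path| [path.delete_prefix(prefix), File.size(path)] }
     .sort_by { |path, size| [-size, path] } # biggest first, ties by name
     .first(limit)
end

if __FILE__ == $PROGRAM_NAME
  # Usage: ruby this_script.rb /path/to/ruby/source
  root = ARGV.fetch(0, ".")
  largest_encoding_sources(root).each do |path, size|
    puts format("%10d  %s", size, path)
  end
end
```

Removing one group of files at a time and rebuilding after each step makes it easier to tell which removal broke the make process or the tests.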

Quite some effort, such as the automatic generation of encdb.h and 
transdb.h, went into making sure (at least in theory) that new 
encodings/transcodings could be added easily. On the other hand, many 
encodings turn up in special situations, and it may be somewhat 
difficult to get rid of them.

In particular, I'd start by removing the encodings labeled as 
Japanese/Korean/Chinese (because they use relatively more data), then 
move on to the various Windows-xxxx and ISO-8859-XX variants, leaving 
UTF-16/32, ISO-8859-1, ASCII-8BIT (aka BINARY), and UTF-8 for later. 
UTF-8 especially may be difficult to remove, because it is used as the 
default source encoding, and because there are many optimizations for 
it, since it is widely used and has a very special structure.
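
The removal order above could be sketched as a small classifier over encoding names. This is an illustration only: the name lists are assumptions and far from a complete inventory, removal_stage is a hypothetical helper, and US-ASCII is put in the "later" group only because it is the one encoding the target system needs.

```ruby
# Illustrative, incomplete name lists (assumptions, not taken from the build).
CJK_NAMES = %w[
  EUC-JP Shift_JIS Windows-31J ISO-2022-JP
  EUC-KR CP949 Big5 GBK GB18030 EUC-TW
].map(&:upcase).freeze

# Encodings to leave for later; US-ASCII is added here as an assumption.
KEEP_FOR_LATER = %w[UTF-8 ASCII-8BIT BINARY ISO-8859-1 US-ASCII].freeze

# Hypothetical helper: map an encoding name to a removal stage.
def removal_stage(name)
  upper = name.upcase
  return :later if KEEP_FOR_LATER.include?(upper)
  return :later if upper.start_with?("UTF-16", "UTF-32")
  return :first if CJK_NAMES.include?(upper)
  return :second if upper.match?(/\A(WINDOWS-\d+|ISO-8859-\d+)\z/)

  :unclassified
end

if __FILE__ == $PROGRAM_NAME
  # Group the encoding names known to the running interpreter by stage.
  Encoding.name_list.group_by { |n| removal_stage(n) }.each do |stage, names|
    puts "#{stage}: #{names.sort.join(', ')}"
  end
end
```

Names that come out as :unclassified need a manual decision; the sketch does not try to guess where they belong.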

Please feel free to ask here again if you run into any issues.

Regards,   Martin.
