Simplified Python caching

From: Collin Funk <collin.funk1@gmail.com>
To: bug-gnulib@gnu.org
Subject: Simplified Python caching
Date: Mon, 15 Apr 2024 10:08:52 -0700
Message-ID: <948b9fbe-30c1-48a0-be22-6c136e40cc15@gmail.com>

I was looking at the documentation for the Python standard library
yesterday and discovered the 'lru_cache()' function [1]. Python 3.9
defines 'cache()' to be 'lru_cache(maxsize=None)' [2]. This is a
simple unbounded cache that doesn't require a list or keeping track of
size for LRU replacements [3].
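
To make the behavior concrete, here is a minimal sketch of the
decorator on a throwaway function. The name 'square' is only for
illustration; 'cache_info()' is the introspection helper that
'lru_cache()' attaches to the wrapped function.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def square(value: int) -> int:
    # The body only runs on a cache miss.
    return value * value


square(4)  # miss, result is stored
square(4)  # hit, body is skipped
square(5)  # miss, result is stored

info = square.cache_info()
print(info)  # CacheInfo(hits=1, misses=2, maxsize=None, currsize=2)
```

With 'maxsize=None' nothing is ever evicted, so the cache grows by one
entry per distinct argument.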

I was curious how this worked since we use a dictionary as a cache in
each GLModule object. The reason for that cache is that the raw data
from the module description needs extra processing before it is used.

Here is a small test program I used:

-----------------------------------------------
#!/usr/bin/env python3

import timeit

function1 = '''\
table = dict()
def fibonacci(value: int) -> int:
    if table.get(value) == None:
        if value <= 2:
            result = 1
        else:
            result = fibonacci(value - 1) + fibonacci(value - 2)
        table[value] = result
    return table[value]'''

function2 = '''\
from functools import lru_cache
@lru_cache(maxsize=None)
def fibonacci(value: int) -> int:
    if value <= 2:
        result = 1
    else:
        result = fibonacci(value - 1) + fibonacci(value - 2)
    return result'''

time1 = timeit.timeit(stmt='fibonacci(300)', setup=function1, number=10000000)
time2 = timeit.timeit(stmt='fibonacci(300)', setup=function2, number=10000000)

print(f'dict:      {time1}')
print(f'lru_cache: {time2}')
-----------------------------------------------

Results with Python 3.7:
$ python3.7 example.py 
dict:      1.5672868309993646
lru_cache: 0.5880454199996166

Results with Python 3.12:
$ python3.12 example.py 
dict:      0.9677453169988439
lru_cache: 0.5958652549998078

Any thoughts on using this feature? The main benefit I see in using
this is that it simplifies code. Right now a lot of the functions have
their entire body wrapped in:

       if 'makefile-unconditional' not in self.cache:
           # do work and then save it
           self.cache['makefile-unconditional'] = result
       return self.cache['makefile-unconditional']

Using '@lru_cache(maxsize=None)' would let Python deal with all of
this caching for us. It seems that this is a pretty standard way to
optimize Python 3 programs [4].
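
A rough sketch of what a converted method could look like. The class
'Module', the method 'get_makefile_unconditional()', and the
'work_done' counter are hypothetical stand-ins, not the real GLModule
code, and the "work" is just whitespace stripping.

```python
from functools import lru_cache


class Module:
    """Hypothetical stand-in for GLModule; not the real gnulib class."""

    def __init__(self, raw: str) -> None:
        self.raw = raw
        self.work_done = 0  # how many times the method body actually ran

    @lru_cache(maxsize=None)
    def get_makefile_unconditional(self) -> str:
        # Do the work; the decorator saves the result, keyed on 'self'.
        self.work_done += 1
        return '\n'.join(line.strip() for line in self.raw.splitlines())


module = Module('  lib_SOURCES += foo.c  \n  EXTRA_DIST += foo.h  ')
first = module.get_makefile_unconditional()
second = module.get_makefile_unconditional()  # served from the cache
print(module.work_done)  # 1
```

One thing to keep in mind: the cache belongs to the function rather
than to the object, so 'self' must be hashable and each object that has
been used as a key stays referenced by the cache until it is cleared.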

[1] https://docs.python.org/3/library/functools.html#functools.lru_cache
[2] https://docs.python.org/3/library/functools.html#functools.cache
[3] https://en.wikipedia.org/wiki/Cache_replacement_policies#Least_recently_used_(LRU)
[4] https://codesearch.debian.net/search?q=%40lru_cache%28maxsize%3DNone%29&literal=1

Collin