Skip to content

Memoization strategy #539

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
19 changes: 19 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -61,6 +61,7 @@ documentation](https://docs.rubocop.org/rubocop/configuration.html#inheriting-co
* [Regular Expressions](#regular-expressions)
* [Percent Literals](#percent-literals)
* [Testing](#testing)
* [Memoization](#memoization)

## General

Expand Down Expand Up @@ -1391,3 +1392,21 @@ documentation](https://docs.rubocop.org/rubocop/configuration.html#inheriting-co
do_something
end
~~~

## Memoization

* Prefer `return @x if defined?(@x)` over a simple `||=`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* Prefer `return @x if defined?(@x)` over a simple `||=`
* Prefer `return @x if defined?(@x)` over `||=` if the statement being memoized can evaluate to `nil` or `false`.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, I didn't specify this nuance on purpose. I think having a single way of doing things simplifies the process of coding, and can protect the code from certain changes.

To illustrate my point, let's say we have the following method:

def merchants
  @merchant ||= Merchant.where(some_condition)
end

Now, even when no merchants are found, the result would be [], which according to your suggestion would not grant a guard statement because it would never be false-y. If for some reason, though, the method changes to just check if merchants exist, it would require the person changing it to remember this distinction in behaviour and then apply the guard.

Without having the guard in the original method, I find it very likely the method would evolve to something like this:

def merchants?
  @merchants_exist ||= Merchant.exists?(some_condition)
end

Which could in fact result in N queries.

Thoughts?

Copy link
Member

@cbothner cbothner May 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Frankly, everything around memoization requires nuance, because memoization is caching and cache invalidation is one of the Two Hard Problems1. I think your addition is a good detail to include in our style guide to reduce the chance someone mindlessly repeats the rose memoization pattern. But it would also be suboptimal for someone to mindlessly repeat return @x if defined?(@x).

I don't support forbidding rose memoization, so I'll advocate for the nuance in our discussion of it.


On a similar note, a section on Memoization might also include a note about memoizing a method that takes arguments.

# Bad. A stale value can be returned when a different argument is passed for subsequent calls
def expensive_result(input)
  @expensive_result ||= ExpensiveResult.calculate(input)
end

# Better. Recalculates when a different argument is passed.
# Consider using an LRU or not memoizing at all if the argument cardinality
# is high or the object is long-lived.
def expensive_result(input)
  @expensive_results ||= {}
  @expensive_results[input] ||= ExpensiveResult.calculate(input)
end

# If the expensive result might be falsey
def expensive_result(input)
  @expensive_results ||= {}
  return @expensive_results[input] if @expensive_results.key?(input)

  @expensive_results[input] = ExpensiveResult.calculate(input)
end

Footnotes

  1. along with naming things and off-by-one errors, famously

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't support forbidding rose memoization, so I'll advocate for the nuance in our discussion of it.

I completely agree we shouldn't forbid anything. The way I think about these style guides are not as rules, but as suggestions that apply to at least 90% of cases. As I see it, in the vast majority of cases, there's no harm to "mindlessly repeat" the guard statement - but I wouldn't advocate for using it in more complex caching situations.

Given my understanding of what these guides are for, I do think adding too much nuance to the instruction makes it sort of void, though.

What is your view on how these guides should be used?


~~~ ruby
# bad - if @merchant is assigned a falsy value (`nil` or `false`) and is called N times, the query will also run N times
def merchant
@merchant ||= Merchant.find_by(id: merchant_id)
end

# good - A single query, even when @merchant is nil
def merchant
return @merchant if defined?(@merchant)

@merchant = Merchant.find_by(id: merchant_id)
end
~~~