Merge pull request #41937 from eileencodes/has_many_through_skipping_joins

Add option to skip joins for associations.
Eileen M. Uchitelle 2021-04-19 12:08:23 -04:00 committed by GitHub
commit 6ebd134a9a
17 changed files with 425 additions and 11 deletions

@ -1,3 +1,27 @@
* Add option to disable joins for associations.
In a multiple database application, associations can't join across
databases. When set, this option instructs Rails to generate 2 or
more queries rather than generating joins for associations.
Set the option on a `has_many :through` association:
```ruby
class Dog
has_many :treats, through: :humans, disable_joins: true
has_many :humans
end
```
Then instead of generating join SQL, two queries are used for `@dog.treats`:
```
SELECT "humans"."id" FROM "humans" WHERE "humans"."dog_id" = ? [["dog_id", 1]]
SELECT "treats".* FROM "treats" WHERE "treats"."human_id" IN (?, ?, ?) [["human_id", 1], ["human_id", 2], ["human_id", 3]]
```
*Eileen M. Uchitelle*, *Aaron Patterson*, *Lee Quarella*
* Add setting for enumerating column names in SELECT statements.
Adding a column to a PostgreSQL database, for example, while the application is running can

@ -93,6 +93,7 @@ module ActiveRecord
autoload :Relation
autoload :AssociationRelation
autoload :DisableJoinsAssociationRelation
autoload :NullRelation
autoload_under "relation" do

@ -293,6 +293,7 @@ module Builder #:nodoc:
autoload :Preloader
autoload :JoinDependency
autoload :AssociationScope
autoload :DisableJoinsAssociationScope
autoload :AliasTracker
end
@ -1396,6 +1397,11 @@ module ClassMethods
# of association, including other <tt>:through</tt> associations. Options for <tt>:class_name</tt>,
# <tt>:primary_key</tt> and <tt>:foreign_key</tt> are ignored, as the association uses the
# source reflection.
# [:disable_joins]
# Specifies whether joins should be skipped for an association. If set to true, two or more queries
# will be generated. Note that in some cases, if order or limit is applied, it will be done in-memory
# due to database limitations. This option is only applicable on <tt>has_many :through</tt> associations, as
# <tt>has_many</tt> alone does not perform a join.
#
# If the association on the join model is a #belongs_to, the collection can be modified
# and the records on the <tt>:through</tt> model will be automatically created and removed
@ -1451,6 +1457,7 @@ module ClassMethods
# has_many :tags, as: :taggable
# has_many :reports, -> { readonly }
# has_many :subscribers, through: :subscriptions, source: :user
# has_many :subscribers, through: :subscriptions, disable_joins: true
# has_many :comments, strict_loading: true
def has_many(name, scope = nil, **options, &extension)
reflection = Builder::HasMany.build(self, name, scope, options, &extension)

@ -33,7 +33,7 @@ module Associations
# <tt>owner</tt>, the collection of its posts as <tt>target</tt>, and
# the <tt>reflection</tt> object represents a <tt>:has_many</tt> macro.
class Association #:nodoc:
attr_reader :owner, :target, :reflection
attr_reader :owner, :target, :reflection, :disable_joins
delegate :options, to: :reflection
@ -41,6 +41,7 @@ def initialize(owner, reflection)
reflection.check_validity!
@owner, @reflection = owner, reflection
@disable_joins = @reflection.options[:disable_joins] || false
reset
reset_scope
@ -97,7 +98,9 @@ def target=(target)
end
def scope
if (scope = klass.current_scope) && scope.try(:proxy_association) == self
if disable_joins
DisableJoinsAssociationScope.create.scope(self)
elsif (scope = klass.current_scope) && scope.try(:proxy_association) == self
scope.spawn
elsif scope = klass.global_current_scope
target_scope.merge!(association_scope).merge!(scope)
@ -250,7 +253,11 @@ def violates_strict_loading?
# actually gets built.
def association_scope
if klass
@association_scope ||= AssociationScope.scope(self)
@association_scope ||= if disable_joins
DisableJoinsAssociationScope.scope(self)
else
AssociationScope.scope(self)
end
end
end

@ -11,6 +11,7 @@ def self.valid_options(options)
valid += [:as, :foreign_type] if options[:as]
valid += [:through, :source, :source_type] if options[:through]
valid += [:ensuring_owner_was] if options[:dependent] == :destroy_async
valid += [:disable_joins] if options[:disable_joins] && options[:through]
valid
end

@ -0,0 +1,52 @@
# frozen_string_literal: true
module ActiveRecord
module Associations
class DisableJoinsAssociationScope < AssociationScope # :nodoc:
def scope(association)
source_reflection = association.reflection
owner = association.owner
unscoped = association.klass.unscoped
reverse_chain = get_chain(source_reflection, association, unscoped.alias_tracker).reverse
last_reflection, last_ordered, last_join_ids = last_scope_chain(reverse_chain, owner)
add_constraints(last_reflection, last_reflection.join_primary_key, last_join_ids, owner, last_ordered)
end
private
def last_scope_chain(reverse_chain, owner)
first_scope = [reverse_chain.shift, false, [owner.id]]
reverse_chain.inject(first_scope) do |(reflection, ordered, join_ids), next_reflection|
key = reflection.join_primary_key
records = add_constraints(reflection, key, join_ids, owner, ordered)
foreign_key = next_reflection.join_foreign_key
record_ids = records.pluck(foreign_key)
records_ordered = records && records.order_values.any?
[next_reflection, records_ordered, record_ids]
end
end
def add_constraints(reflection, key, join_ids, owner, ordered)
scope = reflection.build_scope(reflection.aliased_table).where(key => join_ids)
scope = reflection.constraints.inject(scope) do |memo, scope_chain_item|
item = eval_scope(reflection, scope_chain_item, owner)
scope.unscope!(*item.unscope_values)
scope.where_clause += item.where_clause
scope.order_values = item.order_values | scope.order_values
scope
end
if scope.order_values.empty? && ordered
split_scope = DisableJoinsAssociationRelation.create(scope.klass, key, join_ids)
split_scope.where_clause += scope.where_clause
split_scope
else
scope
end
end
end
end
end
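The chain walk in `last_scope_chain` can be hard to picture from the diff alone. The following is a minimal plain-Ruby sketch of the idea, not the Active Record implementation: `TABLES`, `CHAIN`, `where_in` and `walk_chain` are hypothetical names, the tables are in-memory arrays, and every step is assumed to hand its `id` column to the next step (the real code plucks `next_reflection.join_foreign_key`, which may be a different column).

```ruby
# Hypothetical tables standing in for three separate SELECTs.
TABLES = {
  posts:    [{ id: 1, author_id: 7 }, { id: 2, author_id: 7 }, { id: 3, author_id: 8 }],
  comments: [{ id: 10, post_id: 1 }, { id: 11, post_id: 2 }, { id: 12, post_id: 3 }],
  ratings:  [{ id: 100, comment_id: 10 }, { id: 101, comment_id: 10 }, { id: 102, comment_id: 12 }]
}.freeze

# Each step names a table and the column matched against the incoming IDs.
CHAIN = [
  { table: :posts,    key: :author_id },
  { table: :comments, key: :post_id },
  { table: :ratings,  key: :comment_id }
].freeze

# Stand-in for "SELECT ... WHERE key IN (ids)".
def where_in(table, key, ids)
  TABLES.fetch(table).select { |row| ids.include?(row[key]) }
end

# Run one query per intermediate step, feeding each step's IDs into the
# next, then return the rows of the final step.
def walk_chain(owner_id, chain = CHAIN)
  *intermediate, last = chain
  ids = intermediate.inject([owner_id]) do |join_ids, step|
    where_in(step[:table], step[:key], join_ids).map { |row| row[:id] }
  end
  where_in(last[:table], last[:key], ids)
end

rating_ids = walk_chain(7).map { |row| row[:id] }
puts rating_ids.inspect
```

With three steps in the chain, the sketch issues three lookups, which lines up with the `assert_queries(3)` expectations for the through-a-through associations in the tests.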

@ -214,6 +214,7 @@ def delete_through_records(records)
def find_target
return [] unless target_reflection_has_associated_record?
return scope.to_a if disable_joins
super
end

@ -97,6 +97,8 @@ def through_scope
scope = through_reflection.klass.unscoped
options = reflection.options
return scope if options[:disable_joins]
values = reflection_scope.values
if annotations = values[:annotate]
scope.annotate!(*annotations)

@ -0,0 +1,41 @@
# frozen_string_literal: true
module ActiveRecord
class DisableJoinsAssociationRelation < Relation # :nodoc:
TOO_MANY_RECORDS = 5000
attr_reader :ids, :key
def initialize(klass, key, ids)
@ids = ids.uniq
@key = key
super(klass)
end
def limit(value)
records.take(value)
end
def first(limit = nil)
if limit
records.limit(limit).first
else
records.first
end
end
def load
super
records = @records
records_by_id = records.group_by do |record|
record[key]
end
records = ids.flat_map { |id| records_by_id[id.to_i] }
records.compact!
@records = records
end
end
end

@ -15,7 +15,8 @@ def initialize_relation_delegate_cache
[
ActiveRecord::Relation,
ActiveRecord::Associations::CollectionProxy,
ActiveRecord::AssociationRelation
ActiveRecord::AssociationRelation,
ActiveRecord::DisableJoinsAssociationRelation
].each do |klass|
delegate = Class.new(klass) {
include ClassSpecificRelation

@ -0,0 +1,196 @@
# frozen_string_literal: true
require "cases/helper"
require "models/post"
require "models/author"
require "models/comment"
require "models/rating"
require "models/member"
require "models/member_type"
require "models/pirate"
require "models/treasure"
require "models/hotel"
require "models/department"
class HasManyThroughDisableJoinsAssociationsTest < ActiveRecord::TestCase
fixtures :posts, :authors, :comments, :pirates
def setup
@author = authors(:mary)
@post = @author.posts.create(title: "title", body: "body")
@member_type = MemberType.create(name: "club")
@member = Member.create(member_type: @member_type)
@comment = @post.comments.create(body: "text", origin: @member)
@post2 = @author.posts.create(title: "title", body: "body")
@member2 = Member.create(member_type: @member_type)
@comment2 = @post2.comments.create(body: "text", origin: @member2)
@rating1 = @comment.ratings.create(value: 8)
@rating2 = @comment.ratings.create(value: 9)
end
def test_counting_on_disable_joins_through
assert_equal @author.comments.count, @author.no_joins_comments.count
assert_queries(2) { @author.no_joins_comments.count }
assert_queries(1) { @author.comments.count }
end
def test_counting_on_disable_joins_through_using_custom_foreign_key
assert_equal @author.comments_with_foreign_key.count, @author.no_joins_comments_with_foreign_key.count
assert_queries(2) { @author.no_joins_comments_with_foreign_key.count }
assert_queries(1) { @author.comments_with_foreign_key.count }
end
def test_pluck_on_disable_joins_through
assert_equal @author.comments.pluck(:id), @author.no_joins_comments.pluck(:id)
assert_queries(2) { @author.no_joins_comments.pluck(:id) }
assert_queries(1) { @author.comments.pluck(:id) }
end
def test_pluck_on_disable_joins_through_using_custom_foreign_key
assert_equal @author.comments_with_foreign_key.pluck(:id), @author.no_joins_comments_with_foreign_key.pluck(:id)
assert_queries(2) { @author.no_joins_comments_with_foreign_key.pluck(:id) }
assert_queries(1) { @author.comments_with_foreign_key.pluck(:id) }
end
def test_fetching_on_disable_joins_through
assert_equal @author.comments.first.id, @author.no_joins_comments.first.id
assert_queries(2) { @author.no_joins_comments.first.id }
assert_queries(1) { @author.comments.first.id }
end
def test_fetching_on_disable_joins_through_using_custom_foreign_key
assert_equal @author.comments_with_foreign_key.first.id, @author.no_joins_comments_with_foreign_key.first.id
assert_queries(2) { @author.no_joins_comments_with_foreign_key.first.id }
assert_queries(1) { @author.comments_with_foreign_key.first.id }
end
def test_to_a_on_disable_joins_through
assert_equal @author.comments.to_a, @author.no_joins_comments.to_a
@author.reload
assert_queries(2) { @author.no_joins_comments.to_a }
assert_queries(1) { @author.comments.to_a }
end
def test_appending_on_disable_joins_through
assert_difference(->() { @author.no_joins_comments.reload.size }) do
@post.comments.create(body: "text")
end
assert_queries(2) { @author.no_joins_comments.reload.size }
assert_queries(1) { @author.comments.reload.size }
end
def test_appending_on_disable_joins_through_using_custom_foreign_key
assert_difference(->() { @author.no_joins_comments_with_foreign_key.reload.size }) do
@post.comments.create(body: "text")
end
assert_queries(2) { @author.no_joins_comments_with_foreign_key.reload.size }
assert_queries(1) { @author.comments_with_foreign_key.reload.size }
end
def test_empty_on_disable_joins_through
empty_author = authors(:bob)
assert_equal [], assert_queries(0) { empty_author.comments.all }
assert_equal [], assert_queries(1) { empty_author.no_joins_comments.all }
end
def test_empty_on_disable_joins_through_using_custom_foreign_key
empty_author = authors(:bob)
assert_equal [], assert_queries(0) { empty_author.comments_with_foreign_key.all }
assert_equal [], assert_queries(1) { empty_author.no_joins_comments_with_foreign_key.all }
end
def test_pluck_on_disable_joins_through_a_through
rating_ids = Rating.where(comment: @comment).pluck(:id)
assert_equal rating_ids, assert_queries(1) { @author.ratings.pluck(:id) }
assert_equal rating_ids, assert_queries(3) { @author.no_joins_ratings.pluck(:id) }
end
def test_count_on_disable_joins_through_a_through
ratings_count = Rating.where(comment: @comment).count
assert_equal ratings_count, assert_queries(1) { @author.ratings.count }
assert_equal ratings_count, assert_queries(3) { @author.no_joins_ratings.count }
end
def test_count_on_disable_joins_using_relation_with_scope
assert_equal 2, assert_queries(1) { @author.good_ratings.count }
assert_equal 2, assert_queries(3) { @author.no_joins_good_ratings.count }
end
def test_to_a_on_disable_joins_with_multiple_scopes
assert_equal [@rating1, @rating2], assert_queries(1) { @author.good_ratings.to_a }
assert_equal [@rating1, @rating2], assert_queries(3) { @author.no_joins_good_ratings.to_a }
end
def test_preloading_has_many_through_disable_joins
assert_queries(3) { Author.all.preload(:good_ratings).map(&:good_ratings) }
assert_queries(4) { Author.all.preload(:no_joins_good_ratings).map(&:good_ratings) }
end
def test_polymorphic_disable_joins_through_counting
assert_equal 2, assert_queries(1) { @author.ordered_members.count }
assert_equal 2, assert_queries(3) { @author.no_joins_ordered_members.count }
end
def test_polymorphic_disable_joins_through_ordering
assert_equal [@member2, @member], assert_queries(1) { @author.ordered_members.to_a }
assert_equal [@member2, @member], assert_queries(3) { @author.no_joins_ordered_members.to_a }
end
def test_polymorphic_disable_joins_through_reordering
assert_equal [@member, @member2], assert_queries(1) { @author.ordered_members.reorder(id: :asc).to_a }
assert_equal [@member, @member2], assert_queries(3) { @author.no_joins_ordered_members.reorder(id: :asc).to_a }
end
def test_polymorphic_disable_joins_through_ordered_scopes
assert_equal [@member2, @member], assert_queries(1) { @author.ordered_members.unnamed.to_a }
assert_equal [@member2, @member], assert_queries(3) { @author.no_joins_ordered_members.unnamed.to_a }
end
def test_polymorphic_disable_joins_through_ordered_chained_scopes
member3 = Member.create(member_type: @member_type)
member4 = Member.create(member_type: @member_type, name: "named")
@post2.comments.create(body: "text", origin: member3)
@post2.comments.create(body: "text", origin: member4)
assert_equal [member3, @member2, @member], assert_queries(1) { @author.ordered_members.unnamed.with_member_type_id(@member_type.id).to_a }
assert_equal [member3, @member2, @member], assert_queries(3) { @author.no_joins_ordered_members.unnamed.with_member_type_id(@member_type.id).to_a }
end
def test_polymorphic_disable_joins_through_ordered_scope_limits
assert_equal [@member2], assert_queries(1) { @author.ordered_members.unnamed.limit(1).to_a }
assert_equal [@member2], assert_queries(3) { @author.no_joins_ordered_members.unnamed.limit(1).to_a }
end
def test_polymorphic_disable_joins_through_ordered_scope_first
assert_equal @member2, assert_queries(1) { @author.ordered_members.unnamed.first }
assert_equal @member2, assert_queries(3) { @author.no_joins_ordered_members.unnamed.first }
end
def test_order_applied_in_double_join
assert_equal [@member2, @member], assert_queries(1) { @author.members.to_a }
assert_equal [@member2, @member], assert_queries(3) { @author.no_joins_members.to_a }
end
def test_first_and_scope_applied_in_double_join
assert_equal @member2, assert_queries(1) { @author.members.unnamed.first }
assert_equal @member2, assert_queries(3) { @author.no_joins_members.unnamed.first }
end
def test_first_and_scope_in_double_join_applies_order_in_memory
disable_joins_sql = capture_sql { @author.no_joins_members.unnamed.first }
assert_no_match(/ORDER BY/, disable_joins_sql.last)
end
def test_limit_and_scope_applied_in_double_join
assert_equal [@member2], assert_queries(1) { @author.members.unnamed.limit(1).to_a }
assert_equal [@member2], assert_queries(3) { @author.no_joins_members.unnamed.limit(1) }
end
def test_limit_and_scope_in_double_join_applies_limit_in_memory
disable_joins_sql = capture_sql { @author.no_joins_members.unnamed.first }
assert_no_match(/LIMIT 1/, disable_joins_sql.last)
end
end

@ -20,6 +20,50 @@ def ratings
Rating.joins(:comment).merge(self)
end
end
has_many :comments_with_order, -> { ordered_by_post_id }, through: :posts, source: :comments
has_many :no_joins_comments, through: :posts, disable_joins: true, source: :comments
has_many :comments_with_foreign_key, through: :posts, source: :comments, foreign_key: :post_id
has_many :no_joins_comments_with_foreign_key, through: :posts, disable_joins: true, source: :comments, foreign_key: :post_id
has_many :members,
through: :comments_with_order,
source: :origin,
source_type: "Member"
has_many :no_joins_members,
through: :comments_with_order,
source: :origin,
source_type: "Member",
disable_joins: true
has_many :ordered_members,
-> { order(id: :desc) },
through: :comments_with_order,
source: :origin,
source_type: "Member"
has_many :no_joins_ordered_members,
-> { order(id: :desc) },
through: :comments_with_order,
source: :origin,
source_type: "Member",
disable_joins: true
has_many :ratings, through: :comments
has_many :good_ratings,
-> { where("ratings.value > 5") },
through: :comments,
source: :ratings
has_many :no_joins_ratings, through: :no_joins_comments, disable_joins: true, source: :ratings
has_many :no_joins_good_ratings,
-> { where("ratings.value > 5") },
through: :comments,
source: :ratings,
disable_joins: true
has_many :comments_containing_the_letter_e, through: :posts, source: :comments
has_many :comments_with_order_and_conditions, -> { order("comments.body").where("comments.body like 'Thank%'") }, through: :posts, source: :comments
has_many :comments_with_include, -> { includes(:post).where(posts: { type: "Post" }) }, through: :posts, source: :comments

@ -10,10 +10,12 @@ class Comment < ActiveRecord::Base
scope :for_first_post, -> { where(post_id: 1) }
scope :for_first_author, -> { joins(:post).where("posts.author_id" => 1) }
scope :created, -> { all }
scope :ordered_by_post_id, -> { order("comments.post_id DESC") }
belongs_to :post, counter_cache: true
belongs_to :author, polymorphic: true
belongs_to :resource, polymorphic: true
belongs_to :origin, polymorphic: true
belongs_to :company, foreign_key: "company"
has_many :ratings

@ -37,6 +37,9 @@ class Member < ActiveRecord::Base
belongs_to :admittable, polymorphic: true
has_one :premium_club, through: :admittable
scope :unnamed, -> { where(name: nil) }
scope :with_member_type_id, -> (id) { where(member_type_id: id) }
end
class SelfMember < ActiveRecord::Base

@ -30,6 +30,7 @@ def greeting
scope :containing_the_letter_a, -> { where("body LIKE '%a%'") }
scope :titled_with_an_apostrophe, -> { where("title LIKE '%''%'") }
scope :ranked_by_comments, -> { order(table[:comments_count].desc) }
scope :ordered_by_post_id, -> { order("posts.post_id ASC") }
scope :limit_by, lambda { |l| limit(l) }
scope :locked, -> { lock }

@ -235,6 +235,8 @@
# See #14855.
t.string :resource_id
t.string :resource_type
t.integer :origin_id
t.string :origin_type
t.integer :developer_id
t.datetime :updated_at
t.datetime :deleted_at

@ -32,7 +32,6 @@ databases
The following features are not (yet) supported:
* Automatic swapping for horizontal sharding
* Joining across clusters
* Load balancing replicas
* Dumping schema caches for multiple databases
@ -460,6 +459,42 @@ end
`ActiveRecord::Base.connected_to` maintains the ability to switch
connections globally.
### Handling associations with joins across databases
As of Rails 7.0, Active Record has an option for handling associations that would otherwise perform
a join across multiple databases. If you have a `has_many :through` association for which you want to
disable joins and perform two or more queries instead, pass the `disable_joins: true` option.
For example:
```ruby
class Dog < AnimalsRecord
has_many :treats, through: :humans, disable_joins: true
has_many :humans
end
```
Previously, calling `@dog.treats` without `disable_joins` would raise an error because databases are unable
to handle joins across clusters. With the `disable_joins` option, Rails generates multiple select queries
to avoid attempting to join across clusters. For the above association, `@dog.treats` would generate the
following SQL:
```sql
SELECT "humans"."id" FROM "humans" WHERE "humans"."dog_id" = ? [["dog_id", 1]]
SELECT "treats".* FROM "treats" WHERE "treats"."human_id" IN (?, ?, ?) [["human_id", 1], ["human_id", 2], ["human_id", 3]]
```
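The two statements can be read as "collect the IDs, then filter by them". Here is a minimal plain-Ruby sketch of that strategy, assuming in-memory arrays in place of the two databases; `Human`, `Treat`, `HUMANS`, `TREATS`, `human_ids_for` and `treats_for` are hypothetical names used only for illustration and are not part of Rails.

```ruby
# Hypothetical in-memory "tables"; in a real application these would live
# in two separate databases, so a SQL JOIN between them is not possible.
Human = Struct.new(:id, :dog_id)
Treat = Struct.new(:id, :human_id)

HUMANS = [Human.new(1, 1), Human.new(2, 1), Human.new(3, 1), Human.new(4, 2)].freeze
TREATS = [Treat.new(10, 1), Treat.new(11, 3), Treat.new(12, 4)].freeze

# Stand-in for: SELECT "humans"."id" FROM "humans" WHERE "humans"."dog_id" = ?
def human_ids_for(dog_id)
  HUMANS.select { |human| human.dog_id == dog_id }.map(&:id)
end

# Stand-in for: SELECT "treats".* FROM "treats" WHERE "treats"."human_id" IN (...)
def treats_for(dog_id)
  ids = human_ids_for(dog_id)
  TREATS.select { |treat| ids.include?(treat.human_id) }
end

puts treats_for(1).map(&:id).inspect
```

The second lookup only ever sees a list of IDs, which is why it can run against a different database than the first.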
There are some important things to be aware of with this option:
1) There may be performance implications, since two or more queries are now performed (depending
on the association) rather than a join. If the select for `humans` returns a large number of IDs,
the select for `treats` may send too many IDs.
2) Since joins are no longer performed, a query with an order or limit is now sorted and limited in memory,
because an order from one table cannot be applied to another table.
3) This setting must be added to every association for which you want joins disabled.
Rails can't guess this for you because association loading is lazy: to load `treats` in `@dog.treats`,
Rails already needs to know what SQL should be generated.
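
To illustrate the in-memory ordering and limit from point 2, here is a minimal plain-Ruby sketch loosely modelled on the `group_by` / `flat_map` approach this change uses when loading records. It is an assumption-laden illustration rather than the Rails code: `Member`, `reorder_in_memory` and `limit_in_memory` are hypothetical names.

```ruby
Member = Struct.new(:id, :name)

# Reorder `records` so they follow `ordered_ids`, the order produced by the
# previous query. IDs with no matching record are dropped.
def reorder_in_memory(records, ordered_ids, key = :id)
  records_by_key = records.group_by { |record| record[key] }
  ordered_ids.uniq.flat_map { |id| records_by_key[id] }.compact
end

# Apply the limit only after the in-memory ordering, since no LIMIT can be
# sent with the final query without losing the intended order.
def limit_in_memory(records, ordered_ids, limit, key = :id)
  reorder_in_memory(records, ordered_ids, key).take(limit)
end

# The final query returns rows in whatever order the database chooses.
fetched = [Member.new(1, "a"), Member.new(2, "b"), Member.new(3, "c")]
puts limit_in_memory(fetched, [3, 1, 2], 2).map(&:id).inspect
```

Because the sort and the limit happen in Ruby, every matching row is loaded before any are discarded, which is part of the performance cost mentioned in point 1.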
## Caveats
### Automatic swapping for horizontal sharding
@ -475,12 +510,6 @@ dependent on your infrastructure. We may implement basic, primitive load balanci
in the future, but for an application at scale this should be something your application
handles outside of Rails.
### Joining Across Databases
Applications cannot join across databases. At the moment applications will need to
manually write two selects and split the joins themselves. In a future version Rails
will split the joins for you.
### Schema Cache
If you use a schema cache and multiple databases, you'll need to write an initializer