The model-global `read_attribute_for_validation` is not a good fit for some
validators (specifically the numericality validator).
To allow per-validator customization of the attribute value, this extracts
`read_attribute_for_validation` into `EachValidator`.
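A minimal sketch of the shape of that hook (the name follows the description above; the exact signature and call sites in Rails may differ):
```ruby
# Sketch only — not the literal Rails implementation. Shown as a subclass so
# the snippet is self-contained; the real change lives inside EachValidator.
class SketchEachValidator < ActiveModel::EachValidator
  def validate(record)
    attributes.each do |attribute|
      # Each validator can now decide how the value under validation is read.
      value = read_attribute_for_validation(record, attribute)
      next if (value.nil? && options[:allow_nil]) || (value.blank? && options[:allow_blank])
      validate_each(record, attribute, value)
    end
  end

  private
    # Default behaviour: delegate to the model-global reader, as before.
    def read_attribute_for_validation(record, attr_name)
      record.read_attribute_for_validation(attr_name)
    end
end

# A numericality-style validator could then override the hook, e.g. to
# prefer the raw, before-type-cast value (illustrative):
#
#   def read_attribute_for_validation(record, attr_name)
#     record.public_send(:"#{attr_name}_before_type_cast")
#   end
```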
This makes `datetime.serialize` about 10% faster.
```ruby
type = ActiveRecord::Type.lookup(:datetime)
time = Time.now.utc
Benchmark.ips do |x|
x.report("type.serialize(time)") do
type.serialize(time)
type.serialize(time)
type.serialize(time)
type.serialize(time)
end
end
```
Before:
```
Warming up --------------------------------------
type.serialize(time) 12.899k i/100ms
Calculating -------------------------------------
type.serialize(time) 131.293k (± 1.6%) i/s - 657.849k in 5.011870s
```
After:
```
Warming up --------------------------------------
type.serialize(time) 14.603k i/100ms
Calculating -------------------------------------
type.serialize(time) 145.941k (± 1.1%) i/s - 730.150k in 5.003639s
```
Follow up to c07dff72278fb7f2a3c4c71212a0773a2b25c790.
Actually it is not the cop's fault; we mistakenly used `^`, `$`, and `\Z` in
many places, and the cop, being conservative, does not autocorrect those
since the change is not always safe. I've checked all of those usages and
replaced the ones that are safe.
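For reference, a quick illustration of why swapping these anchors blindly isn't safe:
```ruby
# ^ and $ anchor to line boundaries, \A and \z to string boundaries,
# and \Z still tolerates a trailing newline.
"admin\nother" =~ /^admin$/    # => 0   (matches the first line — often a validation bypass)
"admin\nother" =~ /\Aadmin\z/  # => nil (rejects the multi-line string)
"admin\n"      =~ /\Aadmin\Z/  # => 0   (\Z allows the trailing newline)
"admin\n"      =~ /\Aadmin\z/  # => nil
```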
Instantiating the attributes hash from raw database values is one of the
slower parts of the attributes API.
That instantiation is only needed to detect mutations; in other words, it
isn't necessary until a mutation actually happens.
`LazyAttributeHash`, introduced in 0f29c21, defers instantiating an
attribute until the attribute is first accessed (i.e. `Model.find(1)` isn't
slow yet, but `Model.find(1).attr_name` is still slow).
This introduces `LazyAttributeSet` to instantiate attributes even more
lazily: an attribute isn't instantiated until it is first assigned or
dirty-checked (i.e. `Model.find(1).attr_name` is no longer slow).
This makes attribute access about 35% faster for read-only (non-mutation)
usage.
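A hypothetical sketch of the idea (class and method names here are illustrative, not the actual Rails implementation):
```ruby
# Keep the raw database values around and only build Attribute objects —
# which remember their original value for mutation detection — when an
# attribute is first written. Plain reads only deserialize the raw value.
class SketchLazyAttributeSet
  def initialize(raw_values, types)
    @raw_values    = raw_values # name => raw value from the database
    @types         = types      # name => ActiveModel::Type instance
    @casted_values = {}         # cache of deserialized values for plain reads
    @attributes    = {}         # materialized Attribute objects, built on demand
  end

  # Reading doesn't need an Attribute object unless one already exists.
  def fetch_value(name)
    if (attribute = @attributes[name])
      attribute.value
    else
      @casted_values.fetch(name) do
        @casted_values[name] = @types[name].deserialize(@raw_values[name])
      end
    end
  end

  # Writing materializes the Attribute so the original database value is
  # kept for dirty checking. (Detecting in-place mutation of values handed
  # out by fetch_value is glossed over here.)
  def write_from_user(name, value)
    @attributes[name] = materialize(name).with_value_from_user(value)
  end

  private
    def materialize(name)
      @attributes[name] ||=
        ActiveModel::Attribute.from_database(name, @raw_values[name], @types[name])
    end
end
```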
https://gist.github.com/kamipo/4002c96a02859d8fe6503e26d7be4ad8
Before:
```
IPS
Warming up --------------------------------------
attribute access 1.000 i/100ms
Calculating -------------------------------------
attribute access 3.444 (± 0.0%) i/s - 18.000 in 5.259030s
MEMORY
Calculating -------------------------------------
attribute access 38.902M memsize ( 0.000 retained)
350.044k objects ( 0.000 retained)
15.000 strings ( 0.000 retained)
```
After (with `immutable_strings_by_default = true`):
```
IPS
Warming up --------------------------------------
attribute access 1.000 i/100ms
Calculating -------------------------------------
attribute access 4.652 (±21.5%) i/s - 23.000 in 5.034853s
MEMORY
Calculating -------------------------------------
attribute access 27.782M memsize ( 0.000 retained)
170.044k objects ( 0.000 retained)
15.000 strings ( 0.000 retained)
```
And benchmark with this branch for immutable string type:
```ruby
ActiveRecord::Schema.define do
create_table :users, force: true do |t|
t.string :name
t.string :fast_name
end
end
class User < ActiveRecord::Base
attribute :fast_name, :immutable_string
end
user = User.new
Benchmark.ips do |x|
x.report("user.name") do
user.name = "foo"
user.name_changed?
end
x.report("user.fast_name") do
user.fast_name = "foo"
user.fast_name_changed?
end
end
```
```
Warming up --------------------------------------
user.name 34.811k i/100ms
user.fast_name 39.505k i/100ms
Calculating -------------------------------------
user.name 343.864k (± 3.6%) i/s - 1.741M in 5.068576s
user.fast_name 384.033k (± 2.7%) i/s - 1.936M in 5.044425s
```
In Rails 4.2 we introduced mutation detection, to remove the need to
call `attribute_will_change!` after modifying a field. One side effect
of that change was that we needed to enforce that the
`_before_type_cast` form of a value is a different object from the
post-type-cast value when the value is mutable. That contract is really only
relevant for strings, but it meant we needed to dup them.
In Rails 5 we added the `ImmutableString` type, to allow people to opt
out of this duping in places where the memory usage was causing
problems, and they don't need to mutate that field.
This takes that a step further, and adds a class-level setting to
specify whether strings should be frozen by default or not. The default
value of this setting is `false` (strings are mutable), and I do not
plan on changing that.
While I think that immutable strings by default would be a reasonable
default for new applications, I do not think that we would be able to
document the value of this setting in a place that people will be
looking when they can't figure out why some string is frozen.
Realistically, the number of applications where this setting is relevant
is fairly small, so I don't think it would make sense to ever enable it
by default.
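A sketch of how an application might opt in, based on the setting name shown in the benchmark above (assuming it is exposed as a class-level setting on `ActiveRecord::Base`, set before models are loaded):
```ruby
# Opt-in sketch — setting name taken from the benchmark above.
ActiveRecord::Base.immutable_strings_by_default = true

class User < ActiveRecord::Base
end

user = User.new(name: "foo")
user.name.frozen? # => true; no defensive dup of the before-type-cast value
user.name << "!"  # raises FrozenError instead of silently bypassing dirty tracking

# Per-attribute opt-in still works without the global setting:
class Post < ActiveRecord::Base
  attribute :title, :immutable_string
end
```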
Delegating to a method that is just one line is not worth it.
Avoiding the delegation makes `read_attribute` about 15% faster.
```ruby
ActiveRecord::Schema.define do
create_table :users, force: true do |t|
t.string :name
end
end
class User < ActiveRecord::Base
def fast_read_attribute(attr_name, &block)
name = attr_name.to_s
name = self.class.attribute_aliases[name] || name
name = @primary_key if name == "id" && @primary_key
@attributes.fetch_value(name, &block)
end
end
user = User.create!(name: "user name")
Benchmark.ips do |x|
x.report("read_attribute('id')") { user.read_attribute('id') }
x.report("read_attribute('name')") { user.read_attribute('name') }
x.report("fast_read_attribute('id')") { user.fast_read_attribute('id') }
x.report("fast_read_attribute('name')") { user.fast_read_attribute('name') }
end
```
```
Warming up --------------------------------------
read_attribute('id') 165.744k i/100ms
read_attribute('name')
162.229k i/100ms
fast_read_attribute('id')
192.543k i/100ms
fast_read_attribute('name')
191.209k i/100ms
Calculating -------------------------------------
read_attribute('id') 1.648M (± 1.7%) i/s - 8.287M in 5.030170s
read_attribute('name')
1.636M (± 3.9%) i/s - 8.274M in 5.065356s
fast_read_attribute('id')
1.918M (± 1.8%) i/s - 9.627M in 5.021271s
fast_read_attribute('name')
1.928M (± 0.9%) i/s - 9.752M in 5.058820s
```
A redundant `to_s` adds some overhead. In particular, private methods are
not intended to be passed user input directly, so they should always receive
a string.
Removing the redundant `to_s` makes attribute methods about 10% faster.
```ruby
ActiveRecord::Schema.define do
create_table :users, force: true do |t|
end
end
class User < ActiveRecord::Base
def fast_read_attribute(attr_name, &block)
@attributes.fetch_value(attr_name, &block)
end
end
user = User.create!
Benchmark.ips do |x|
x.report("user._read_attribute('id')") { user._read_attribute("id") }
x.report("user.fast_read_attribute('id')") { user.fast_read_attribute("id") }
end
```
```
Warming up --------------------------------------
user._read_attribute('id')
272.151k i/100ms
user.fast_read_attribute('id')
283.518k i/100ms
Calculating -------------------------------------
user._read_attribute('id')
2.699M (± 1.3%) i/s - 13.608M in 5.042846s
user.fast_read_attribute('id')
2.988M (± 1.2%) i/s - 15.026M in 5.029056s
```
Currently, `increment` with an aliased attribute works, but `increment!`
with an aliased attribute does not, because `clear_attribute_change` is not
aware of attribute aliases.
We sometimes partially update the dirty state of specific attributes,
relying on `clear_attribute_change` to clear the dirty state of the
partially updated attribute. If `clear_attribute_change` is not an attribute
method like the others, we have to resolve attribute aliases manually just
for `clear_attribute_change`, which is a little inconvenient.
From another point of view, we have `restore_attributes`,
`restore_attribute!`, `clear_attribute_changes`, and
`clear_attribute_change`. Despite being nearly identical features,
`restore_attribute!` is an attribute method but `clear_attribute_change`
is not.
Given the above, I'd like to promote `clear_attribute_change` to an
attribute method to fix the issues caused by this inconsistency.
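An illustrative reproduction of the aliased-attribute case (model and column names here are hypothetical):
```ruby
class Post < ActiveRecord::Base
  alias_attribute :hits, :visits_count
end

post = Post.create!
post.increment(:hits)  # works: only mutates the in-memory value
post.increment!(:hits) # previously broke: the partial update relies on
                       # clear_attribute_change for :hits, which did not
                       # resolve the alias down to :visits_count
```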
This reverts 8538dfdc084555673d18cfc3479ebef09f325c9c, which broke the
activemodel-serializers-xml gem.
We can still get most of the benefit by applying the optimisation from
7b3919774252f99e55e6b6ec370aafc42adca2b2 to empty hashes as well as nil.
This has the additional benefit of retaining the optimisation when the
user passes an empty options hash.
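A rough sketch of the shape of that change, with hypothetical helper names (not the literal Rails code):
```ruby
# Hypothetical sketch — helper names are made up, not the Rails internals.
module SketchSerialization
  def serializable_hash(options = nil)
    # nil and {} both take the cheap path, so callers passing an empty
    # options hash keep the optimisation.
    return fast_serializable_hash if options.blank?

    # The full path honours :only, :except, :methods, :include, etc.
    slow_serializable_hash(options)
  end
end
```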
Since we're checking `serializable?` in the new `HomogeneousIn`,
`serialize` will no longer raise an exception. We implemented
`unchecked_serialize` to avoid raising in these cases, but after some
refactoring we no longer need it.
I discovered this while trying to fix a query in our application that was
not properly serializing binary columns: in at least two of our Active Model
types we were not calling the correct serialization. Since `serialize`
wasn't aliased to `unchecked_serialize` in `ActiveModel::Type::Binary` and
`ActiveModel::Type::Boolean` (I didn't check the others, but most of the AM
types are likely affected), the value was being treated as a `String` in the
SQL rather than as the correct type.
This caused Rails to incorrectly query by string values. This is
problematic for columns storing binary data like our emoji columns at
GitHub. The test added here is an example of how the Binary type was
broken previously. The SQL should be using the hex values, not the
string value of "🥦" or other emoji.
We still have the problem `unchecked_serialize` was supposed to fix:
`serialize` shouldn't validate data, just convert it. We'll fix that in a
follow-up PR, so for now we should use `serialize` so we know all values go
through the right serialization for their
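For illustration, the binary type wraps serialized values so that adapters quote them as binary data (e.g. a hex/blob literal) rather than as text:
```ruby
type = ActiveModel::Type::Binary.new
data = type.serialize("🥦")
data.class # => ActiveModel::Type::Binary::Data
data.hex   # => "f09fa5a6" — what should end up in the SQL, not the literal "🥦"
```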
Since #31827, the marshalling format of the attributes hash was changed to
improve performance, because materializing the lazy attribute hash is too
expensive.
At that time, we kept the ability to load the legacy attributes format,
since that performance improvement was backported to 5-1-stable and
5-0-stable.
Now that all supported versions dump attributes in the new format, the
backward compatibility is no longer needed.
Rails has a monkey patch on `Range#cover?` that is slower than Ruby's own
`Range#cover?`. We don't need Range support in this case, because a range
generates a `BETWEEN` statement in SQL rather than an `IN` statement.
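For example (SQL abridged; exact quoting depends on the adapter):
```ruby
Post.where(id: 1..100).to_sql
# => SELECT "posts".* FROM "posts" WHERE "posts"."id" BETWEEN 1 AND 100

Post.where(id: [1, 2, 3]).to_sql
# => SELECT "posts".* FROM "posts" WHERE "posts"."id" IN (1, 2, 3)
```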
A coworker at GitHub found a few months back that using `sanitize_sql`
instead of `where`, when we already knew the values going into the query,
was a lot faster than `where`.
This PR adds a new Arel node type called `HomogeneousIn` that will be
used when Rails knows the values are all homogeneous and can therefore
pick a faster codepath. This new codepath skips some of the processing
`where` normally does, making `where` with homogeneous arrays faster
without requiring the application author to know which query style to use.
Using our benchmark code:
```ruby
ids = (1..1000).each.map do |n|
Post.create!.id
end
Benchmark.ips do |x|
x.report("where with ids") do
Post.where(id: ids).to_a
end
x.report("where with sanitize") do
Post.where(ActiveRecord::Base.sanitize_sql(["id IN (?)", ids])).to_a
end
x.compare!
end
```
Before this PR, comparing `where` with a list of IDs to sanitized SQL:
```
Warming up --------------------------------------
where with ids 11.000 i/100ms
where with sanitize 17.000 i/100ms
Calculating -------------------------------------
where with ids 115.733 (± 4.3%) i/s - 583.000 in 5.045828s
where with sanitize 174.231 (± 4.0%) i/s - 884.000 in 5.081495s
Comparison:
where with sanitize: 174.2 i/s
where with ids: 115.7 i/s - 1.51x slower
```
After this PR, comparing `where` with a list of IDs to sanitized SQL:
```
Warming up --------------------------------------
where with ids 16.000 i/100ms
where with sanitize 19.000 i/100ms
Calculating -------------------------------------
where with ids 158.293 (± 6.3%) i/s - 800.000 in 5.072208s
where with sanitize 169.141 (± 3.5%) i/s - 855.000 in 5.060878s
Comparison:
where with sanitize: 169.1 i/s
where with ids: 158.3 i/s - same-ish: difference falls within error
```
Co-authored-by: Aaron Patterson <aaron.patterson@gmail.com>
This deprecation is useless, since the result is still an Error object and
there is no way to change the calling code to avoid the deprecation.
Let's just accept it as a breaking change.