https://github.com/rails/rails/pull/39935 changed the behavior of
`Path::Pattern#spec` and `Route#ast` to return an `Ast` rather than the
root `Node`. After eager loading, however, we clear out the `Ast` to
limit retained memory and these methods return `nil`.
While these methods are not public and they aren't used internally after
eager loading, having them return `nil` makes it difficult for
applications that had been using these methods to get access to the
root `Node`.
This commit restores the behavior of these two methods to return the
root `Node`. The `Ast` is still available via `Path::Pattern#ast`, and
we still clear it out after eager loading.
Now that spec is a `Node` and not an `Ast` masquerading as one, we can
get rid of the delegate methods on `Ast.
Since `Route#ast` now returns the root `Node`, the newly added
`Route#ast_root` is no longer necessary so I've removed it.
I also removed the unused `@decorated_ast` method, which should have
been removed in https://github.com/rails/rails/pull/39935.
This commit introduces a new `Journey::Ast` class that wraps the root
node of an ast. The purpose of this class is to reduce the number of
walks through the ast by taking a single pass through each node and
caching values for later use.
To avoid retaining additional memory, we clear out these ast objects
after eager loading.
Benefits
---
* Performance improvements (see benchmarks below)
* Keeps various ast manipulations together in a single class, rather
than scattered throughout
* Adding some names to things will hopefully make this code a little
easier to follow for future readers
Benchmarks
---
We benchmarked loading a real routes file with > 3500 routes.
master was at a9336a67b0 when we ran these. Note that these benchmarks
also include a few small changes that we didn't include in this commit,
but that we will follow up with after this gets merged - these
additional changes do not change the benchmarks significantly.
Time:
```
master - 0.798 ± 0.024 (500 runs)
this branch - 0.695 ± 0.029 (500 runs)
```
Allocations:
```
master - 980149 total allocated objects
this branch - 931357 total allocated objects
```
Stackprof:
Seeing `ActionDispatch::Journey::Visitors::Each#visit` more frequently
on the stack is what led us down this path in the first place. These
changes seem to have done the trick.
```
master:
TOTAL (pct) SAMPLES (pct) FRAME
52 (0.5%) 52 (0.5%) ActionDispatch::Journey::Nodes::Node#symbol?
58 (0.5%) 45 (0.4%) ActionDispatch::Journey::Scanner#scan
45 (0.4%) 45 (0.4%) ActionDispatch::Journey::Nodes::Cat#type
43 (0.4%) 43 (0.4%) ActionDispatch::Journey::Visitors::FunctionalVisitor#terminal
303 (2.7%) 43 (0.4%) ActionDispatch::Journey::Visitors::Each#visit
69 (0.6%) 40 (0.4%) ActionDispatch::Routing::Mapper::Scope#each
this commit:
TOTAL (pct) SAMPLES (pct) FRAME
82 (0.6%) 42 (0.3%) ActionDispatch::Journey::Scanner#next_token
31 (0.2%) 31 (0.2%) ActionDispatch::Journey::Nodes::Node#symbol?
30 (0.2%) 30 (0.2%) ActionDispatch::Journey::Nodes::Node#initialize
```
See also the benchmark script in https://github.com/rails/rails/pull/39935#issuecomment-887791294
Co-authored-by: Eric Milford <ericmilford@gmail.com>
This was originally added in
8d26f875f7,
and then updated to something closer to it's current form in
https://github.com/rails/rails/pull/23140
The purpose was to alter the regexp for symbols surrounded by literals
(for path parameters within path segments like "/foo-:bar" or
"/:foo-bar") so that `default_regexp?` would return `false` for those
symbols. `default_regexp?` was then used to partition the routes (see
6ab985da28/actionpack/lib/action_dispatch/journey/routes.rb (L44)).
But after https://github.com/rails/rails/pull/41024, we're no longer
partitioning routes using `default_regexp?` (in fact we're no longer
using the method at all, so I've removed it). So I think it may now be
possible to remove this.
Prior to https://github.com/rails/rails/pull/41024, removing this code
would have broken the tests added in
https://github.com/rails/rails/pull/23140, but those tests now pass
without it. I also tried this patch with the GitHub monolith and didn't
get any test failures.
The FSM used to find matching routes was previously limited to patterns
that contained parameters with the default regexp / no constraints. In
large route sets where many parameters are constrained by custom regexp,
these routes all fall back on a slow linear search over the route list.
These custom regexes were not previously able to be included in the FSM
because it transitioned between nodes using only fragments of the URI,
or path separators [/.?], but a custom regex may cross a path separator
boundary. To work around this, the TransitionTable is improved to
support remembering a point within the matching string that we started,
and continuing to attempt to match from that point up to the end of each
token. Only parameters not on a path separator boundary must still match
with a linear search after this change (e.g. `/foo-:bar/`).
This results in performance for constrainted routes that matches that of
ones using the default regexp.
Benchmark:
https://gist.github.com/theojulienne/e91fc338d180e1710e29c81a5d701fab
Before:
```
Calculating -------------------------------------
without params 6.466k (±12.7%) i/s - 31.648k in 5.009453s
params without constraints
5.867k (±12.9%) i/s - 28.842k in 5.032637s
params with constraints
909.661 (± 7.9%) i/s - 4.536k in 5.023534s
```
After:
```
Calculating -------------------------------------
without params 6.387k (±11.9%) i/s - 31.728k in 5.068939s
params without constraints
5.824k (±13.2%) i/s - 28.650k in 5.043701s
params with constraints
5.406k (±11.7%) i/s - 26.931k in 5.076412s
```
For github.com which has many constrainted parameters, a random sampling
of 10 URL patterns can be matched approximately 2-4x faster than before.
This commit fixes symbols as constrains as tested in
6ab985da28
The FSM used to find matching routes was previously limited to patterns
that contained parameters with the default regexp / no constraints. In
large route sets where many parameters are constrained by custom regexp,
these routes all fall back on a slow linear search over the route list.
These custom regexes were not previously able to be included in the FSM
because it transitioned between nodes using only fragments of the URI,
or path separators [/.?], but a custom regex may cross a path separator
boundary. To work around this, the TransitionTable is improved to
support remembering a point within the matching string that we started,
and continuing to attempt to match from that point up to the end of each
token. Only parameters not on a path separator boundary must still match
with a linear search after this change (e.g. `/foo-:bar/`).
This results in performance for constrainted routes that matches that of
ones using the default regexp.
Benchmark:
https://gist.github.com/theojulienne/e91fc338d180e1710e29c81a5d701fab
Before:
```
Calculating -------------------------------------
without params 6.466k (±12.7%) i/s - 31.648k in 5.009453s
params without constraints
5.867k (±12.9%) i/s - 28.842k in 5.032637s
params with constraints
909.661 (± 7.9%) i/s - 4.536k in 5.023534s
```
After:
```
Calculating -------------------------------------
without params 6.387k (±11.9%) i/s - 31.728k in 5.068939s
params without constraints
5.824k (±13.2%) i/s - 28.650k in 5.043701s
params with constraints
5.406k (±11.7%) i/s - 26.931k in 5.076412s
```
For github.com which has many constrainted parameters, a random sampling
of 10 URL patterns can be matched approximately 2-4x faster than before.
Before this commit we initialized all Symbols with the default regexp,
then later on reassigned any symbols descending from stars with either
their regexp from `@requirements` or the default greedy regexp.
With this commit we initialize all Symbols descending from Stars with
the greedy regexp at parse time. This allows us to get rid of the star
branch in path/pattern, since any regexps from `@requirements` will
already have been set in the symbol branch of this code.
This is essentially an alternate version of #38901. Getting rid of the
extra branch makes some performance work I am doing a bit easier, plus
it saves us a few method calls. Also the constant saves us from creating
the same regexp multiple times, and it is nice to give that regexp a
name.
This first started with moving type method inside
`ActionDispatch::Journey::Nodes::Symbol`.
`AD::Journey::Nodes::Symbol#type` was generated dynamically with an
`each` block. While this is OK for classes like `AD::Journey::Nodes::Slash`
or `AD::Journey::Nodes::Dot` which don't have further implementation, all
other classes containing more logic have this method defined in their class
body. This patch does the same in this case.
On code review process @kamipo suggested to fully expand over
metaprogramming for Slash and Dot classes, a topic on which I agree with him.
Before:
Total allocated: 209050523 bytes (2219202 objects)
Total retained: 36580305 bytes (323462 objects)
After:
Total allocated: 209180253 bytes (2222455 objects)
Total retained: 36515599 bytes (321850 objects)
---
Modest saving of 1612 RVALUEs in the heap on Discourse boot
The larger the route file the better the results.
Saving will only be visible on Ruby 2.5 and up.
This basically reverts e9fca7668b9eba82bcc832cb0061459703368397, d08da958b9ae17d4bbe4c9d7db497ece2450db5f,
d1fe1dcf8ab1c0210a37c2a78c1ee52cf199a66d, and 68eaf7b4d5f2bb56d939f71c5ece2d61cf6680a3
```ruby
require 'action_pack'
require 'action_dispatch'
require 'benchmark/ips'
route_set = ActionDispatch::Routing::RouteSet.new
routes = ActionDispatch::Routing::Mapper.new route_set
ObjectSpace::AllocationTracer.setup(%i{path line type})
result = ObjectSpace::AllocationTracer.trace do
500.times do
routes.resources :foo
end
end
sorted = ObjectSpace::AllocationTracer.allocated_count_table.sort_by(&:last)
sorted.each do |k,v|
next if v == 0
p k => v
end
__END__
Before:
{:T_SYMBOL=>11}
{:T_REGEXP=>17}
{:T_STRUCT=>6500}
{:T_MATCH=>12004}
{:T_OBJECT=>99009}
{:T_DATA=>100088}
{:T_HASH=>122015}
{:T_STRING=>159637}
{:T_IMEMO=>363134}
{:T_ARRAY=>433056}
After:
{:T_SYMBOL=>11}
{:T_REGEXP=>17}
{:T_STRUCT=>6500}
{:T_MATCH=>12004}
{:T_OBJECT=>91009}
{:T_DATA=>100088}
{:T_HASH=>114013}
{:T_STRING=>159637}
{:T_ARRAY=>321056}
{:T_IMEMO=>351133}
```
Eagerly calculate and cache the name of Symbol objects in the path AST.
This drops about 26 string allocations per resource:
```ruby
require 'action_pack'
require 'action_dispatch'
require 'benchmark/ips'
route_set = ActionDispatch::Routing::RouteSet.new
routes = ActionDispatch::Routing::Mapper.new route_set
ObjectSpace::AllocationTracer.setup(%i{path line type})
result = ObjectSpace::AllocationTracer.trace do
500.times do
routes.resources :foo
end
end
sorted = ObjectSpace::AllocationTracer.allocated_count_table.sort_by(&:last)
sorted.each do |k,v|
next if v == 0
p k => v
end
__END__
Before:
{:T_SYMBOL=>11}
{:T_REGEXP=>17}
{:T_STRUCT=>6500}
{:T_MATCH=>12004}
{:T_OBJECT=>99009}
{:T_DATA=>116084}
{:T_HASH=>122015}
{:T_STRING=>172647}
{:T_IMEMO=>371132}
{:T_ARRAY=>433056}
After:
{:T_SYMBOL=>11}
{:T_REGEXP=>17}
{:T_STRUCT=>6500}
{:T_MATCH=>12004}
{:T_OBJECT=>99009}
{:T_DATA=>100088}
{:T_HASH=>122015}
{:T_STRING=>159637}
{:T_IMEMO=>363134}
{:T_ARRAY=>433056}
```
This commit introduces a functional Path AST visitor and implements
`each` on the AST in terms of the functional visitor. The functional
visitor doesn't maintain state, so we only need to allocate one of them.
Given this benchmark route file:
```ruby
require 'action_pack'
require 'action_dispatch'
route_set = ActionDispatch::Routing::RouteSet.new
routes = ActionDispatch::Routing::Mapper.new route_set
ObjectSpace::AllocationTracer.setup(%i{path line type})
result = ObjectSpace::AllocationTracer.trace do
500.times{|i|
routes.resource :omglol
}
end
result.find_all { |k,v| k.first =~ /git\/rails/ }.sort_by { |k,v|
v.first
}.each { |k,v|
p k => v
}
```
node.rb line 17 was in our top 3 allocation spot:
```
{["/Users/aaron/git/rails/actionpack/lib/action_dispatch/journey/nodes/node.rb", 17, :T_OBJECT]=>[31526, 0, 28329, 0, 2, 1123160]}
{["/Users/aaron/git/rails/actionpack/lib/action_dispatch/routing/mapper.rb", 2080, :T_IMEMO]=>[34002, 0, 30563, 0, 2, 1211480]}
{["/Users/aaron/git/rails/actionpack/lib/action_dispatch/routing/mapper.rb", 2071, :T_IMEMO]=>[121934, 1, 109608, 0, 7, 4344400]}
```
This commit eliminates allocations at that place.
I wrote a utility that helps find areas where you could optimize your program using a frozen string instead of a string literal, it's called [let_it_go](https://github.com/schneems/let_it_go). After going through the output and adding `.freeze` I was able to eliminate the creation of 1,114 string objects on EVERY request to [codetriage](codetriage.com). How does this impact execution?
To look at memory:
```ruby
require 'get_process_mem'
mem = GetProcessMem.new
GC.start
GC.disable
1_114.times { " " }
before = mem.mb
after = mem.mb
GC.enable
puts "Diff: #{after - before} mb"
```
Creating 1,114 string objects results in `Diff: 0.03125 mb` of RAM allocated on every request. Or 1mb every 32 requests.
To look at raw speed:
```ruby
require 'benchmark/ips'
number_of_objects_reduced = 1_114
Benchmark.ips do |x|
x.report("freeze") { number_of_objects_reduced.times { " ".freeze } }
x.report("no-freeze") { number_of_objects_reduced.times { " " } }
end
```
We get the results
```
Calculating -------------------------------------
freeze 1.428k i/100ms
no-freeze 609.000 i/100ms
-------------------------------------------------
freeze 14.363k (± 8.5%) i/s - 71.400k
no-freeze 6.084k (± 8.1%) i/s - 30.450k
```
Now we can do some maths:
```ruby
ips = 6_226k # iterations / 1 second
call_time_before = 1.0 / ips # seconds per iteration
ips = 15_254 # iterations / 1 second
call_time_after = 1.0 / ips # seconds per iteration
diff = call_time_before - call_time_after
number_of_objects_reduced * diff * 100
# => 0.4530373333993266 miliseconds saved per request
```
So we're shaving off 1 second of execution time for every 220 requests.
Is this going to be an insane speed boost to any Rails app: nope. Should we merge it: yep.
p.s. If you know of a method call that doesn't modify a string input such as [String#gsub](b0e2da69f0/lib/let_it_go/core_ext/string.rb (L37)) please [give me a pull request to the appropriate file](b0e2da69f0/lib/let_it_go/core_ext/string.rb (L37)), or open an issue in LetItGo so we can track and freeze more strings.
Keep those strings Frozen
![](https://www.dropbox.com/s/z4dj9fdsv213r4v/let-it-go.gif?dl=1)
Move the Journey code underneath the ActionDispatch namespace so
that we don't pollute the global namespace with names that may
be used for models.
Fixesrails/journey#49.