Improve cache performance for bare string values

This commit introduces a performance optimization for cache entries with
bare string values such as view fragments.

A new `7.1` cache format has been added which includes the optimization,
and the `:message_pack` cache format now includes the optimization as
well.  (A new cache format is necessary because, during a rolling
deploy, unupgraded servers must be able to read cache entries from
upgraded servers, which means the optimization cannot be enabled for
existing apps by default.)

New apps will use the `7.1` cache format by default, and existing apps
can enable the format by setting `config.load_defaults 7.1`.  Cache
entries written using the `6.1` or `7.0` cache formats can be read when
using the `7.1` format.
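
As a rough sketch of the rolling-deploy case, an existing app could keep its current defaults and opt into the new format explicitly on a later deploy. The setting name is the one documented in this commit's changelog entry; the `7.0` defaults line is only a placeholder for whatever the app already uses.

```ruby
# config/application.rb (fragment, inside the application class)
# Assumption: the app currently loads 7.0 defaults; adjust to match your app.
config.load_defaults 7.0

# Enable only after every server can already read the 7.1 format.
config.active_support.cache_format_version = 7.1
```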

**Benchmark**

  ```ruby
  # frozen_string_literal: true
  require "benchmark/ips"

  serializer_7_0 = ActiveSupport::Cache::SerializerWithFallback[:marshal_7_0]
  serializer_7_1 = ActiveSupport::Cache::SerializerWithFallback[:marshal_7_1]
  entry = ActiveSupport::Cache::Entry.new(Random.bytes(10_000), version: "123")

  Benchmark.ips do |x|
    x.report("dump 7.0") do
      $dumped_7_0 = serializer_7_0.dump(entry)
    end

    x.report("dump 7.1") do
      $dumped_7_1 = serializer_7_1.dump(entry)
    end

    x.compare!
  end

  Benchmark.ips do |x|
    x.report("load 7.0") do
      serializer_7_0.load($dumped_7_0)
    end

    x.report("load 7.1") do
      serializer_7_1.load($dumped_7_1)
    end

    x.compare!
  end
  ```

  ```
  Warming up --------------------------------------
              dump 7.0     5.482k i/100ms
              dump 7.1    10.987k i/100ms
  Calculating -------------------------------------
              dump 7.0     73.966k (± 6.9%) i/s -    367.294k in   5.005176s
              dump 7.1    127.193k (±17.8%) i/s -    615.272k in   5.081387s

  Comparison:
              dump 7.1:   127192.9 i/s
              dump 7.0:    73966.5 i/s - 1.72x  (± 0.00) slower

  Warming up --------------------------------------
              load 7.0     7.425k i/100ms
              load 7.1    26.237k i/100ms
  Calculating -------------------------------------
              load 7.0     85.574k (± 1.7%) i/s -    430.650k in   5.034065s
              load 7.1    264.877k (± 1.6%) i/s -      1.338M in   5.052976s

  Comparison:
              load 7.1:   264876.7 i/s
              load 7.0:    85573.7 i/s - 3.10x  (± 0.00) slower
  ```

Co-authored-by: Jean Boussier <jean.boussier@gmail.com>
Jonathan Hefner 2023-05-02 16:34:51 -05:00
parent e2524e574b
commit daa0cb80db
8 changed files with 153 additions and 16 deletions

@@ -1,12 +1,29 @@
* A new `7.1` cache format is available which includes an optimization for
bare string values such as view fragments. The `:message_pack` cache format
has also been modified to include this optimization.
The `7.1` cache format is used by default for new apps, and existing apps
can enable the format by setting `config.load_defaults 7.1` or by setting
`config.active_support.cache_format_version = 7.1` in `config/application.rb`
or a `config/environments/*.rb` file.
Cache entries written using the `6.1` or `7.0` cache formats can be read
when using the `7.1` format. To perform a rolling deploy of a Rails 7.1
upgrade, wherein servers that have not yet been upgraded must be able to
read caches from upgraded servers, leave the cache format unchanged on the
first deploy, then enable the `7.1` cache format on a subsequent deploy.
*Jonathan Hefner*
* `config.active_support.cache_format_version` now accepts `:message_pack` as
an option. `:message_pack` can reduce cache entry sizes and improve
performance, but requires the [`msgpack` gem](https://rubygems.org/gems/msgpack)
(>= 1.7.0).
Cache entries written using the `6.1` or `7.0` cache formats can be read
Cache entries written using the `6.1`, `7.0`, or `7.1` cache formats can be read
when using the `:message_pack` cache format. Additionally, cache entries
written using the `:message_pack` cache format can now be read when using
the `6.1` or `7.0` cache formats. These behaviors make it easy to migrate
the `6.1`, `7.0`, or `7.1` cache formats. These behaviors make it easy to migrate
between formats without invalidating the entire cache.
*Jonathan Hefner*

@@ -651,6 +651,8 @@ def default_coder
Cache::SerializerWithFallback[:marshal_6_1]
when 7.0
Cache::SerializerWithFallback[:marshal_7_0]
when 7.1
Cache::SerializerWithFallback[:marshal_7_1]
else
Cache::SerializerWithFallback[Cache.format_version]
end

@@ -11,6 +11,10 @@ def self.[](format)
SERIALIZERS.fetch(format)
end
def dump(entry)
try_dump_bare_string(entry) || _dump(entry)
end
def dump_compressed(entry, threshold)
dumped = dump(entry)
try_compress(dumped, threshold) || dumped
@@ -21,10 +25,12 @@ def load(dumped)
dumped = decompress(dumped) if compressed?(dumped)
case
when loaded = try_load_bare_string(dumped)
loaded
when MessagePackWithFallback.dumped?(dumped)
MessagePackWithFallback._load(dumped)
when Marshal70WithFallback.dumped?(dumped)
Marshal70WithFallback._load(dumped)
when Marshal71WithFallback.dumped?(dumped)
Marshal71WithFallback._load(dumped)
when Marshal61WithFallback.dumped?(dumped)
Marshal61WithFallback._load(dumped)
else
@@ -40,6 +46,45 @@ def load(dumped)
end
private
BARE_STRING_SIGNATURES = {
255 => Encoding::UTF_8,
254 => Encoding::BINARY,
253 => Encoding::US_ASCII,
}
BARE_STRING_TEMPLATE = "CEl<"
BARE_STRING_EXPIRES_AT_TEMPLATE = "@1E"
BARE_STRING_VERSION_LENGTH_TEMPLATE = "@#{[0].pack(BARE_STRING_EXPIRES_AT_TEMPLATE).bytesize}l<"
BARE_STRING_VERSION_INDEX = [0].pack(BARE_STRING_VERSION_LENGTH_TEMPLATE).bytesize
def try_dump_bare_string(entry)
value = entry.value
return if !value.instance_of?(String)
version = entry.version
return if version && version.encoding != Encoding::UTF_8
signature = BARE_STRING_SIGNATURES.key(value.encoding)
return if !signature
packed = [signature, entry.expires_at || -1.0, version&.bytesize || -1].pack(BARE_STRING_TEMPLATE)
packed << version if version
packed << value
end
def try_load_bare_string(dumped)
encoding = BARE_STRING_SIGNATURES[dumped.getbyte(0)]
return if !encoding
expires_at = dumped.unpack1(BARE_STRING_EXPIRES_AT_TEMPLATE)
version_length = dumped.unpack1(BARE_STRING_VERSION_LENGTH_TEMPLATE)
value_index = BARE_STRING_VERSION_INDEX + [version_length, 0].max
Cache::Entry.new(
dumped.byteslice(value_index..-1).force_encoding(encoding),
version: dumped.byteslice(BARE_STRING_VERSION_INDEX, version_length)&.force_encoding(Encoding::UTF_8),
expires_at: (expires_at unless expires_at < 0),
)
end
ZLIB_HEADER = "\x78".b.freeze
def compressed?(dumped)
@@ -105,14 +150,14 @@ def dumped?(dumped)
end
end
module Marshal70WithFallback
module Marshal71WithFallback
include SerializerWithFallback
extend self
MARK_UNCOMPRESSED = "\x00".b.freeze
MARK_COMPRESSED = "\x01".b.freeze
def dump(entry)
def _dump(entry)
MARK_UNCOMPRESSED + Marshal.dump(entry.pack)
end
@@ -136,11 +181,17 @@ def dumped?(dumped)
end
end
module Marshal70WithFallback
include Marshal71WithFallback
extend self
alias :dump :_dump # Prevent dumping bare strings.
end
module MessagePackWithFallback
include SerializerWithFallback
extend self
def dump(entry)
def _dump(entry)
ActiveSupport::MessagePack::CacheSerializer.dump(entry.pack)
end
@@ -167,6 +218,7 @@ def available?
passthrough: PassthroughWithFallback,
marshal_6_1: Marshal61WithFallback,
marshal_7_0: Marshal70WithFallback,
marshal_7_1: Marshal71WithFallback,
message_pack: MessagePackWithFallback,
}
end
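
The bare string layout used in the hunk above (one signature byte, an `expires_at` double, a version length, then the version and value bytes) can be tried out in isolation. The following is a minimal sketch, not the Rails implementation: the constant and method names are hypothetical, the result is a plain hash rather than a `Cache::Entry`, and appending via `String#b` is a choice made here so that concatenating onto the binary header cannot raise an encoding error.

```ruby
# Illustrative sketch only: names are hypothetical, not the Rails API.
# Layout: signature byte, expires_at, version length, version bytes, value bytes.
SIGNATURES = {
  255 => Encoding::UTF_8,
  254 => Encoding::BINARY,
  253 => Encoding::US_ASCII,
}.freeze

HEADER_TEMPLATE = "CEl<" # uint8, little-endian double, little-endian int32
HEADER_SIZE = 1 + 8 + 4  # 13 bytes

def pack_bare_string(value, version: nil, expires_at: nil)
  signature = SIGNATURES.key(value.encoding)
  return nil unless signature # unsupported encoding: caller falls back

  header = [signature, expires_at || -1.0, version ? version.bytesize : -1]
  packed = header.pack(HEADER_TEMPLATE)
  # Append as binary so mixing encodings cannot raise (sketch-specific choice).
  packed << version.b if version
  packed << value.b
end

def unpack_bare_string(dumped)
  encoding = SIGNATURES[dumped.getbyte(0)]
  return nil unless encoding # not a bare string payload

  expires_at = dumped.unpack1("@1E")      # skip the signature byte
  version_length = dumped.unpack1("@9l<") # skip signature and double
  version = nil
  if version_length >= 0
    version = dumped.byteslice(HEADER_SIZE, version_length).force_encoding(Encoding::UTF_8)
  end
  value_index = HEADER_SIZE + [version_length, 0].max

  {
    value: dumped.byteslice(value_index..-1).force_encoding(encoding),
    version: version,
    expires_at: (expires_at unless expires_at < 0),
  }
end

dumped = pack_bare_string("h\u00e9llo", version: "123", expires_at: 10.5)
loaded = unpack_bare_string(dumped)
p loaded
```

Negative sentinels (`-1.0` and `-1`) stand in for a missing expiry and a missing version, which keeps the header at a fixed 13 bytes and lets the value be sliced out without parsing anything else.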

@@ -5,7 +5,7 @@
module CacheStoreFormatVersionBehavior
extend ActiveSupport::Concern
FORMAT_VERSIONS = [6.1, 7.0, :message_pack]
FORMAT_VERSIONS = [6.1, 7.0, 7.1, :message_pack]
included do
test "format version affects default coder" do

@@ -45,6 +45,64 @@ class CacheSerializerWithFallbackTest < ActiveSupport::TestCase
end
end
(FORMATS - [:passthrough, :marshal_6_1, :marshal_7_0]).each do |format|
test "#{format.inspect} serializer preserves version with bare string" do
entry = ActiveSupport::Cache::Entry.new("abc", version: "123")
assert_entry entry, roundtrip(format, entry)
end
test "#{format.inspect} serializer preserves expiration with bare string" do
entry = ActiveSupport::Cache::Entry.new("abc", expires_in: 123)
assert_entry entry, roundtrip(format, entry)
end
test "#{format.inspect} serializer preserves encoding of version with bare string" do
[Encoding::UTF_8, Encoding::BINARY].each do |encoding|
version = "123".encode(encoding)
roundtripped = roundtrip(format, ActiveSupport::Cache::Entry.new("abc", version: version))
assert_equal version.encoding, roundtripped.version.encoding
end
end
test "#{format.inspect} serializer preserves encoding of bare string" do
[Encoding::UTF_8, Encoding::BINARY, Encoding::US_ASCII].each do |encoding|
string = "abc".encode(encoding)
roundtripped = roundtrip(format, ActiveSupport::Cache::Entry.new(string))
assert_equal string.encoding, roundtripped.value.encoding
end
end
test "#{format.inspect} serializer dumps bare string with reduced overhead when possible" do
string = "abc"
options = { version: "123", expires_in: 123 }
unsupported = string.encode(Encoding::WINDOWS_1252)
unoptimized = serializer(format).dump(ActiveSupport::Cache::Entry.new(unsupported, **options))
[Encoding::UTF_8, Encoding::BINARY, Encoding::US_ASCII].each do |encoding|
supported = string.encode(encoding)
optimized = serializer(format).dump(ActiveSupport::Cache::Entry.new(supported, **options))
assert_operator optimized.size, :<, unoptimized.size
end
end
test "#{format.inspect} serializer can compress bare strings" do
entry = ActiveSupport::Cache::Entry.new("abc" * 100, version: "123", expires_in: 123)
compressed = serializer(format).dump_compressed(entry, 1)
uncompressed = serializer(format).dump_compressed(entry, 100_000)
assert_operator compressed.bytesize, :<, uncompressed.bytesize
end
end
[:passthrough, :marshal_6_1, :marshal_7_0].each do |format|
test "#{format.inspect} serializer dumps bare string in a backward compatible way" do
string = +"abc"
string.instance_variable_set(:@baz, true)
roundtripped = roundtrip(format, ActiveSupport::Cache::Entry.new(string))
assert roundtripped.value.instance_variable_get(:@baz)
end
end
test ":message_pack serializer handles missing class gracefully" do
klass = Class.new do
def self.name; "DoesNotActuallyExist"; end
@@ -68,10 +126,14 @@ def serializer(format)
ActiveSupport::Cache::SerializerWithFallback[format]
end
def roundtrip(format, entry)
serializer(format).load(serializer(format).dump(entry))
end
def assert_entry(expected, actual)
assert_equal expected.value, actual.value
assert_equal expected.version, actual.version
assert_equal expected.expires_at, actual.expires_at
assert_equal \
[expected.value, expected.version, expected.expires_at],
[actual.value, actual.version, actual.expires_at]
end
def assert_logs(pattern, &block)

@@ -74,6 +74,7 @@ Below are the default values associated with each target version. In cases of co
- [`config.active_record.run_after_transaction_callbacks_in_order_defined`](#config-active-record-run-after-transaction-callbacks-in-order-defined): `true`
- [`config.active_record.run_commit_callbacks_on_first_saved_instances_in_transaction`](#config-active-record-run-commit-callbacks-on-first-saved-instances-in-transaction): `false`
- [`config.active_record.sqlite3_adapter_strict_strings_by_default`](#config-active-record-sqlite3-adapter-strict-strings-by-default): `true`
- [`config.active_support.cache_format_version`](#config-active-support-cache-format-version): `7.1`
- [`config.active_support.default_message_encryptor_serializer`](#config-active-support-default-message-encryptor-serializer): `:json`
- [`config.active_support.default_message_verifier_serializer`](#config-active-support-default-message-verifier-serializer): `:json`
- [`config.active_support.raise_on_invalid_cache_expiration_time`](#config-active-support-raise-on-invalid-cache-expiration-time): `true`
@@ -2266,10 +2267,11 @@ The default value depends on the `config.load_defaults` target version:
#### `config.active_support.cache_format_version`
Specifies which serialization format to use for the cache. Possible values are
`6.1`, `7.0`, and `:message_pack`.
`6.1`, `7.0`, `7.1`, and `:message_pack`.
The `6.1` and `7.0` formats both use `Marshal`, but the latter uses a more
efficient cache entry representation.
The `6.1`, `7.0`, and `7.1` formats all use `Marshal`, but `7.0` uses a more
efficient representation for cache entries, and `7.1` includes an additional
optimization for bare string values such as view fragments.
The `:message_pack` format uses `ActiveSupport::MessagePack`, and may further
reduce cache entry sizes and improve performance, but requires the
@@ -2285,6 +2287,7 @@ The default value depends on the `config.load_defaults` target version:
| --------------------- | -------------------- |
| (original) | `6.1` |
| 7.0 | `7.0` |
| 7.1 | `7.1` |
#### `config.active_support.deprecation`

@@ -302,6 +302,7 @@ def load_defaults(target_version)
end
if respond_to?(:active_support)
active_support.cache_format_version = 7.1
active_support.default_message_encryptor_serializer = :json
active_support.default_message_verifier_serializer = :json
active_support.use_message_serializer_for_metadata = true

@@ -4125,10 +4125,10 @@ def new(app); self; end
assert_equal :fiber, ActiveSupport::IsolatedExecutionState.isolation_level
end
test "ActiveSupport::Cache.format_version is 7.0 by default for new apps" do
test "ActiveSupport::Cache.format_version is 7.1 by default for new apps" do
app "development"
assert_equal 7.0, ActiveSupport::Cache.format_version
assert_equal 7.1, ActiveSupport::Cache.format_version
end
test "ActiveSupport::Cache.format_version is 6.1 by default for upgraded apps" do