Avoid double serialization of message data

Prior to this commit, messages with metadata were always serialized in
the following way:

  ```ruby
  Base64.strict_encode64(
    ActiveSupport::JSON.encode({
      "_rails" => {
        "message" => Base64.strict_encode64(
          serializer.dump(data)
        ),
        "pur" => "the purpose",
        "exp" => "the expiration"
      },
    })
  )
  ```

in which the message data is serialized and Base64-encoded twice.

This commit changes message serialization such that, when possible, the
data is serialized and Base64-encoded only once:

  ```ruby
  Base64.strict_encode64(
    serializer.dump({
      "_rails" => {
        "data" => data,
        "pur" => "the purpose",
        "exp" => "the expiration"
      },
    })
  )
  ```
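
The size difference between the two layouts can be reproduced without
Rails. The following is a minimal sketch that uses only Ruby's `base64`
and `json` libraries; `legacy_envelope` and `single_pass_envelope` are
hypothetical helper names for illustration, not part of Active Support,
and expiration handling is left out:

```ruby
require "base64"
require "json"

# Hypothetical helper: the old layout. The data is dumped and
# Base64-encoded, then embedded as a string in a JSON envelope that is
# Base64-encoded again.
def legacy_envelope(data, purpose:)
  inner = Base64.strict_encode64(JSON.dump(data))
  envelope = { "_rails" => { "message" => inner, "pur" => purpose } }
  Base64.strict_encode64(JSON.dump(envelope))
end

# Hypothetical helper: the new layout. The data sits directly inside the
# envelope, so it is dumped and Base64-encoded only once.
def single_pass_envelope(data, purpose:)
  envelope = { "_rails" => { "data" => data, "pur" => purpose } }
  Base64.strict_encode64(JSON.dump(envelope))
end

payload = { "content" => "x" * 2000 }
legacy = legacy_envelope(payload, purpose: "x")
single = single_pass_envelope(payload, purpose: "x")
puts "legacy: #{legacy.bytesize} bytes, single pass: #{single.bytesize} bytes"
```

Each Base64 pass grows its input by about a third, so skipping the
inner pass is where the smaller messages come from.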

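This improves performance, and the gain grows with the size of the
data. In the benchmark below, `generate` is roughly 3x faster for ~100
bytes of data and roughly 10x faster for ~1 MB, and messages are about
a quarter smaller: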

**Benchmark**

  ```ruby
  # frozen_string_literal: true
  require "benchmark/ips"
  require "active_support/all"

  verifier = ActiveSupport::MessageVerifier.new("secret", serializer: JSON)

  payloads = [
    { "content" => "x" * 100 },
    { "content" => "x" * 2000 },
    { "content" => "x" * 1_000_000 },
  ]

  if ActiveSupport::Messages::Metadata.respond_to?(:use_message_serializer_for_metadata)
    ActiveSupport::Messages::Metadata.use_message_serializer_for_metadata = true
  end

  Benchmark.ips do |x|
    payloads.each do |payload|
      x.report("generate ~#{payload["content"].size}B") do
        $generated_message = verifier.generate(payload, purpose: "x")
      end

      x.report("verify ~#{payload["content"].size}B") do
        verifier.verify($generated_message, purpose: "x")
      end
    end
  end

  puts

  puts "Message size:"
  payloads.each do |payload|
    puts "  ~#{payload["content"].size} bytes of data => " \
      "#{verifier.generate(payload, purpose: "x").size} byte message"
  end
  ```

**Before**

  ```
  Warming up --------------------------------------
        generate ~100B     1.578k i/100ms
          verify ~100B     2.506k i/100ms
       generate ~2000B   447.000  i/100ms
         verify ~2000B     1.409k i/100ms
    generate ~1000000B     1.000  i/100ms
      verify ~1000000B     6.000  i/100ms
  Calculating -------------------------------------
        generate ~100B     15.807k (± 1.8%) i/s -     80.478k in   5.093161s
          verify ~100B     25.240k (± 2.1%) i/s -    127.806k in   5.066096s
       generate ~2000B      4.530k (± 2.4%) i/s -     22.797k in   5.035398s
         verify ~2000B     14.136k (± 2.3%) i/s -     71.859k in   5.086267s
    generate ~1000000B     11.673  (± 0.0%) i/s -     59.000  in   5.060598s
      verify ~1000000B     64.372  (± 6.2%) i/s -    324.000  in   5.053304s

  Message size:
    ~100 bytes of data => 306 byte message
    ~2000 bytes of data => 3690 byte message
    ~1000000 bytes of data => 1777906 byte message
  ```

**After**

  ```
  Warming up --------------------------------------
        generate ~100B     4.689k i/100ms
          verify ~100B     3.183k i/100ms
       generate ~2000B     2.722k i/100ms
         verify ~2000B     2.066k i/100ms
    generate ~1000000B    12.000  i/100ms
      verify ~1000000B    11.000  i/100ms
  Calculating -------------------------------------
        generate ~100B     46.984k (± 1.2%) i/s -    239.139k in   5.090540s
          verify ~100B     32.043k (± 1.2%) i/s -    162.333k in   5.066903s
       generate ~2000B     27.163k (± 1.2%) i/s -    136.100k in   5.011254s
         verify ~2000B     20.726k (± 1.7%) i/s -    105.366k in   5.085442s
    generate ~1000000B    125.600  (± 1.6%) i/s -    636.000  in   5.064607s
      verify ~1000000B    122.039  (± 4.1%) i/s -    616.000  in   5.058386s

  Message size:
    ~100 bytes of data => 234 byte message
    ~2000 bytes of data => 2770 byte message
    ~1000000 bytes of data => 1333434 byte message
  ```

This optimization is only applied for recognized serializers that are
capable of serializing a `Hash`.
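
The gating idea can be sketched as an allowlist check. This is an
illustration only: `HASH_CAPABLE_SERIALIZERS`, `PassthroughSerializer`,
and `single_pass?` are hypothetical names, and the real list of
recognized serializers is the one defined in the diff.

```ruby
require "json"

# Hypothetical allowlist: serializers known to be able to dump a Hash.
HASH_CAPABLE_SERIALIZERS = [JSON, Marshal].freeze

# Hypothetical serializer that passes strings through untouched. It
# cannot represent a Hash envelope, so it must keep the old format.
module PassthroughSerializer
  def self.dump(value)
    value
  end

  def self.load(value)
    value
  end
end

# The single-pass format is used only when the feature is enabled AND
# the serializer is on the allowlist.
def single_pass?(serializer, enabled: true)
  enabled && HASH_CAPABLE_SERIALIZERS.include?(serializer)
end

puts single_pass?(JSON)
puts single_pass?(PassthroughSerializer)
```

An allowlist is conservative: a custom serializer that could handle a
`Hash` still falls back to the old format unless it is recognized.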

Additionally, because the optimization changes the message format, a
`config.active_support.use_message_serializer_for_metadata` option has
been added to control it. The optimization is disabled by default, but
enabled with `config.load_defaults 7.1`.

Regardless of whether the optimization is enabled, messages using either
format can still be read.
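
Reading both formats comes down to telling them apart before
deserializing. Below is a minimal sketch of that dispatch for a JSON
serializer, assuming the message has already been Base64-decoded and
its signature checked. `read_message` is a hypothetical name, and
purpose and expiration checks are omitted for brevity:

```ruby
require "base64"
require "json"

# The old format always starts with this JSON prefix.
LEGACY_PREFIX = '{"_rails":{"message":'

# Hypothetical reader: returns the original data from either format.
def read_message(decoded)
  if decoded.start_with?(LEGACY_PREFIX)
    # Old format: the data is a Base64 string nested inside the envelope.
    envelope = JSON.parse(decoded)
    JSON.parse(Base64.strict_decode64(envelope["_rails"]["message"]))
  else
    value = JSON.parse(decoded)
    if value.is_a?(Hash) && value.key?("_rails")
      # New format: the data sits directly inside the envelope.
      value["_rails"]["data"]
    else
      # No metadata at all: the message is just the serialized data.
      value
    end
  end
end

old_format = JSON.dump(
  "_rails" => { "message" => Base64.strict_encode64(JSON.dump("a" => 1)), "pur" => "x" }
)
new_format = JSON.dump("_rails" => { "data" => { "a" => 1 }, "pur" => "x" })

puts read_message(old_format).inspect
puts read_message(new_format).inspect
```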

In the case of a rolling deploy of a Rails upgrade, wherein servers that
have not yet been upgraded must be able to read messages from upgraded
servers, the optimization can be disabled on first deploy, then safely
enabled on a subsequent deploy.
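
A sketch of that two-step rollout, assuming the setting is managed in
`config/initializers/new_framework_defaults_7_1.rb` (the file used by
the generated template in this commit); adapt the location to your app:

```ruby
# config/initializers/new_framework_defaults_7_1.rb

# Deploy 1: upgraded servers can read both formats but keep writing the
# old one, so servers that are not yet upgraded can still read them.
Rails.application.config.active_support.use_message_serializer_for_metadata = false

# Deploy 2 (once no pre-upgrade servers remain): switch the value to true.
# Rails.application.config.active_support.use_message_serializer_for_metadata = true
```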
Jonathan Hefner 2022-12-26 14:49:43 -06:00
parent ebc3b660e5
commit 91bb5da5fc
10 changed files with 194 additions and 80 deletions

@ -85,6 +85,7 @@ module ActiveSupport
#
# crypt.rotate old_secret, cipher: "aes-256-cbc"
class MessageEncryptor
include Messages::Metadata
prepend Messages::Rotator::Encryptor
cattr_accessor :use_authenticated_message_encryption, instance_accessor: false, default: false
@@ -221,13 +222,7 @@ def self.key_len(cipher = default_cipher)
  end
  private
- def serialize(value)
- @serializer.dump(value)
- end
- def deserialize(value)
- @serializer.load(value)
- end
+ attr_reader :serializer
  def encode(data)
  @url_safe ? ::Base64.urlsafe_encode64(data, padding: false) : ::Base64.strict_encode64(data)
@@ -246,7 +241,7 @@ def _encrypt(value, **metadata_options)
  iv = cipher.random_iv
  cipher.auth_data = "" if aead_mode?
- encrypted_data = cipher.update(Messages::Metadata.wrap(serialize(value), **metadata_options))
+ encrypted_data = cipher.update(serialize_with_metadata(value, **metadata_options))
  encrypted_data << cipher.final
  parts = [encrypted_data, iv]
@@ -275,8 +270,7 @@ def _decrypt(encrypted_message, purpose)
  decrypted_data = cipher.update(encrypted_data)
  decrypted_data << cipher.final
- message = Messages::Metadata.verify(decrypted_data, purpose)
- deserialize(message) if message
+ deserialize_with_metadata(decrypted_data, purpose: purpose)
  rescue OpenSSLCipherError, TypeError, ArgumentError, ::JSON::ParserError
  raise InvalidMessage
  end

@ -119,6 +119,7 @@ module ActiveSupport
# @verifier = ActiveSupport::MessageVerifier.new("secret", url_safe: true)
# @verifier.generate("signed message") #=> URL-safe string
class MessageVerifier
include Messages::Metadata
prepend Messages::Rotator::Verifier
class InvalidSignature < StandardError; end
@@ -198,8 +199,7 @@ def verified(signed_message, purpose: nil, **)
  data, digest = get_data_and_digest_from(signed_message)
  if digest_matches_data?(digest, data)
  begin
- message = Messages::Metadata.verify(decode(data), purpose)
- @serializer.load(message) if message
+ deserialize_with_metadata(decode(data), purpose: purpose)
  rescue ArgumentError => argument_error
  return if argument_error.message.include?("invalid base64")
  raise
@@ -274,11 +274,14 @@ def verify(*args, **options)
  # specified when verifying the message; otherwise, verification will fail.
  # (See #verified and #verify.)
  def generate(value, expires_at: nil, expires_in: nil, purpose: nil)
- data = encode(Messages::Metadata.wrap(@serializer.dump(value), expires_at: expires_at, expires_in: expires_in, purpose: purpose))
- "#{data}#{SEPARATOR}#{generate_digest(data)}"
+ data = encode(serialize_with_metadata(value, expires_at: expires_at, expires_in: expires_in, purpose: purpose))
+ digest = generate_digest(data)
+ data << SEPARATOR << digest
  end
  private
+ attr_reader :serializer
  def encode(data)
  @url_safe ? Base64.urlsafe_encode64(data, padding: false) : Base64.strict_encode64(data)
  end

@ -1,83 +1,101 @@
# frozen_string_literal: true
require "time"
require "active_support/json"
module ActiveSupport
module Messages # :nodoc:
class Metadata # :nodoc:
def initialize(message, expires_at = nil, purpose = nil)
@message, @purpose = message, purpose
@expires_at = expires_at.is_a?(String) ? parse_expires_at(expires_at) : expires_at
end
module Metadata # :nodoc:
singleton_class.attr_accessor :use_message_serializer_for_metadata
def as_json(options = {})
{ _rails: { message: @message, exp: @expires_at, pur: @purpose } }
end
class << self
def wrap(message, expires_at: nil, expires_in: nil, purpose: nil)
if expires_at || expires_in || purpose
JSON.encode new(encode(message), pick_expiry(expires_at, expires_in), purpose)
else
message
end
end
def verify(message, purpose)
extract_metadata(message).verify(purpose)
end
private
def pick_expiry(expires_at, expires_in)
if expires_at
expires_at.utc.iso8601(3)
elsif expires_in
Time.now.utc.advance(seconds: expires_in).iso8601(3)
end
end
def extract_metadata(message)
begin
data = JSON.decode(message) if message.start_with?('{"_rails":')
rescue ::JSON::JSONError
end
if data
new(decode(data["_rails"]["message"]), data["_rails"]["exp"], data["_rails"]["pur"])
else
new(message)
end
end
def encode(message)
::Base64.strict_encode64(message)
end
def decode(message)
::Base64.strict_decode64(message)
end
end
def verify(purpose)
@message if match?(purpose) && fresh?
end
ENVELOPE_SERIALIZERS = [
::JSON,
ActiveSupport::JSON,
ActiveSupport::JsonWithMarshalFallback,
Marshal,
]
private
def match?(purpose)
@purpose.to_s == purpose.to_s
def serialize_with_metadata(data, **metadata)
has_metadata = metadata.any? { |k, v| v }
if has_metadata && !use_message_serializer_for_metadata?
data_string = serialize_to_json_safe_string(data)
envelope = wrap_in_metadata_envelope({ "message" => data_string }, **metadata)
ActiveSupport::JSON.encode(envelope)
else
data = wrap_in_metadata_envelope({ "data" => data }, **metadata) if has_metadata
serializer.dump(data)
end
end
def fresh?
@expires_at.nil? || Time.now.utc < @expires_at
def deserialize_with_metadata(message, **expected_metadata)
if dual_serialized_metadata_envelope_json?(message)
envelope = ActiveSupport::JSON.decode(message)
extracted = extract_from_metadata_envelope(envelope, **expected_metadata)
deserialize_from_json_safe_string(extracted["message"]) if extracted
else
deserialized = serializer.load(message)
if metadata_envelope?(deserialized)
extracted = extract_from_metadata_envelope(deserialized, **expected_metadata)
extracted["data"] if extracted
else
deserialized if expected_metadata.none? { |k, v| v }
end
end
end
def parse_expires_at(expires_at)
if ActiveSupport.use_standard_json_time_format
def use_message_serializer_for_metadata?
Metadata.use_message_serializer_for_metadata && Metadata::ENVELOPE_SERIALIZERS.include?(serializer)
end
def wrap_in_metadata_envelope(hash, expires_at: nil, expires_in: nil, purpose: nil)
expiry = pick_expiry(expires_at, expires_in)
hash["exp"] = expiry if expiry
hash["pur"] = purpose.to_s if purpose
{ "_rails" => hash }
end
def extract_from_metadata_envelope(envelope, purpose: nil)
hash = envelope["_rails"]
return if hash["exp"] && Time.now.utc >= parse_expiry(hash["exp"])
return if hash["pur"] != purpose&.to_s
hash
end
def metadata_envelope?(object)
object.is_a?(Hash) && object.key?("_rails")
end
def dual_serialized_metadata_envelope_json?(string)
string.start_with?('{"_rails":{"message":')
end
def pick_expiry(expires_at, expires_in)
if expires_at
expires_at.utc.iso8601(3)
elsif expires_in
Time.now.utc.advance(seconds: expires_in).iso8601(3)
end
end
def parse_expiry(expires_at)
if !expires_at.is_a?(String)
expires_at
elsif ActiveSupport.use_standard_json_time_format
Time.iso8601(expires_at)
else
Time.parse(expires_at)
end
end
def serialize_to_json_safe_string(data)
::Base64.strict_encode64(serializer.dump(data))
end
def deserialize_from_json_safe_string(string)
serializer.load(::Base64.strict_decode64(string))
end
end
end
end

@ -192,5 +192,12 @@ class Railtie < Rails::Railtie # :nodoc:
end
end
end
initializer "active_support.set_use_message_serializer_for_metadata" do |app|
config.after_initialize do
ActiveSupport::Messages::Metadata.use_message_serializer_for_metadata =
app.config.active_support.use_message_serializer_for_metadata
end
end
end
end

@ -2,6 +2,7 @@
require "active_support/json"
require "active_support/time"
require "active_support/messages/metadata"
module MessageMetadataTests
extend ActiveSupport::Concern
@ -89,6 +90,17 @@ module MessageMetadataTests
codec = make_codec(serializer: ActiveSupport::MessageEncryptor::NullSerializer)
assert_roundtrip "a string", codec, { purpose: "x", expires_in: 1.year }, { purpose: "x" }
end
test "messages are readable regardless of use_message_serializer_for_metadata" do
each_scenario do |data, codec|
message = encode(data, codec, purpose: "x")
message_setting = ActiveSupport::Messages::Metadata.use_message_serializer_for_metadata
using_message_serializer_for_metadata(!message_setting) do
assert_equal data, decode(message, codec, purpose: "x")
end
end
end
end
private
@ -116,11 +128,23 @@ def self.load(value)
["a string", 123, Time.local(2004), { "key" => "value" }],
]
def using_message_serializer_for_metadata(value = true)
original = ActiveSupport::Messages::Metadata.use_message_serializer_for_metadata
ActiveSupport::Messages::Metadata.use_message_serializer_for_metadata = value
yield
ensure
ActiveSupport::Messages::Metadata.use_message_serializer_for_metadata = original
end
def each_scenario
SERIALIZERS.each do |serializer|
codec = make_codec(serializer: serializer)
DATA.each do |data|
yield data, codec
[false, true].each do |use_message_serializer_for_metadata|
using_message_serializer_for_metadata(use_message_serializer_for_metadata) do
SERIALIZERS.each do |serializer|
codec = make_codec(serializer: serializer)
DATA.each do |data|
yield data, codec
end
end
end
end
end

@ -56,6 +56,21 @@ class MessageVerifierMetadataTest < ActiveSupport::TestCase
end
end
test "messages are readable by legacy versions when use_message_serializer_for_metadata = false" do
# Message generated by Rails 7.0 using:
#
# verifier = ActiveSupport::MessageVerifier.new("secret", serializer: JSON)
# legacy_message = verifier.generate("legacy", purpose: "test", expires_at: Time.utc(3000))
#
legacy_message = "eyJfcmFpbHMiOnsibWVzc2FnZSI6IklteGxaMkZqZVNJPSIsImV4cCI6IjMwMDAtMDEtMDFUMDA6MDA6MDAuMDAwWiIsInB1ciI6InRlc3QifX0=--81b11c317dba91cedd86ab79b7d7e68de8d290b3"
verifier = ActiveSupport::MessageVerifier.new("secret", serializer: JSON)
using_message_serializer_for_metadata(false) do
assert_equal legacy_message, verifier.generate("legacy", purpose: "test", expires_at: Time.utc(3000))
end
end
private
def make_codec(**options)
ActiveSupport::MessageVerifier.new("secret", **options)

@ -2195,6 +2195,21 @@ The default value depends on the `config.load_defaults` target version:
| (original) | `false` |
| 5.2 | `true` |
#### `config.active_support.use_message_serializer_for_metadata`
When `true`, enables a performance optimization that serializes message data and
metadata together. This changes the message format, so messages serialized this
way cannot be read by older (< 7.1) versions of Rails. However, messages that
use the old format can still be read, regardless of whether this optimization is
enabled.
The default value depends on the `config.load_defaults` target version:
| Starting with version | The default value is |
| --------------------- | -------------------- |
| (original) | `false` |
| 7.1 | `true` |
#### `config.active_support.cache_format_version`
Specifies which version of the cache serializer to use. Possible values are `6.1` and `7.0`.

@ -306,6 +306,7 @@ def load_defaults(target_version)
if respond_to?(:active_support)
active_support.default_message_encryptor_serializer = :json
active_support.default_message_verifier_serializer = :json
active_support.use_message_serializer_for_metadata = true
active_support.raise_on_invalid_cache_expiration_time = true
end

@ -99,6 +99,17 @@
#
# For detailed migration steps, check out https://guides.rubyonrails.org/v7.1/upgrading_ruby_on_rails.html#new-activesupport-messageverifier-default-serializer
# Enable a performance optimization that serializes message data and metadata
# together. This changes the message format, so messages serialized this way
# cannot be read by older versions of Rails. However, messages that use the old
# format can still be read, regardless of whether this optimization is enabled.
#
# To perform a rolling deploy of a Rails 7.1 upgrade, wherein servers that have
# not yet been upgraded must be able to read messages from upgraded servers,
# leave this optimization off on the first deploy, then enable it on a
# subsequent deploy.
# Rails.application.config.active_support.use_message_serializer_for_metadata = true
# Set the maximum size for Rails log files.
#
# `config.load_defaults 7.1` does not set this value for environments other than

@ -3793,6 +3793,32 @@ class Post < ActiveRecord::Base
assert_equal :hybrid, ActiveSupport::MessageVerifier.default_message_verifier_serializer
end
test "ActiveSupport::Messages::Metadata.use_message_serializer_for_metadata is true by default for new apps" do
app "development"
assert ActiveSupport::Messages::Metadata.use_message_serializer_for_metadata
end
test "ActiveSupport::Messages::Metadata.use_message_serializer_for_metadata is false by default for upgraded apps" do
remove_from_config '.*config\.load_defaults.*\n'
app "development"
assert_not ActiveSupport::Messages::Metadata.use_message_serializer_for_metadata
end
test "ActiveSupport::Messages::Metadata.use_message_serializer_for_metadata can be configured via config.active_support.use_message_serializer_for_metadata" do
remove_from_config '.*config\.load_defaults.*\n'
app_file "config/initializers/new_framework_defaults_7_1.rb", <<~RUBY
Rails.application.config.active_support.use_message_serializer_for_metadata = true
RUBY
app "development"
assert ActiveSupport::Messages::Metadata.use_message_serializer_for_metadata
end
test "unknown_asset_fallback is false by default" do
app "development"