Add support for HTML5 sanitizers

* new config value: action_view.sanitizer_vendor
* SanitizeHelper defaults to Rails::HTML4::Sanitizer
* 7.1 config defaults to Rails::HTML5::Sanitizer if it's supported
Mike Dalessio 2023-05-19 12:18:11 -04:00
parent 500ccaaeea
commit ce43ac6088
7 changed files with 117 additions and 15 deletions

@ -1,3 +1,16 @@
* Add support for HTML5 standards-compliant sanitizers, and default to `Rails::HTML5::Sanitizer`
in the Rails 7.1 configuration if it is supported.
Action View's HTML sanitizers can be configured by setting
`config.action_view.sanitizer_vendor`. Supported values are `Rails::HTML4::Sanitizer` and
`Rails::HTML5::Sanitizer`.
The Rails 7.1 configuration will set this to `Rails::HTML5::Sanitizer` when it is supported, and
fall back to `Rails::HTML4::Sanitizer` otherwise. Previous configurations default to
`Rails::HTML4::Sanitizer`.
*Mike Dalessio*
* Add support for the HTML picture tag. It supports passing a String, an Array or a Block.
Supports passing properties directly to the img tag via the `:image` key.
Since the picture tag requires an img tag, the last element you provide will be used for the img tag.

@ -9,18 +9,17 @@ module Helpers # :nodoc:
# The SanitizeHelper module provides a set of methods for scrubbing text of undesired HTML elements.
# These helper methods extend Action View making them callable within your template files.
module SanitizeHelper
mattr_accessor :sanitizer_vendor, default: Rails::Html::Sanitizer
mattr_accessor :sanitizer_vendor, default: Rails::HTML4::Sanitizer
extend ActiveSupport::Concern
# Sanitizes HTML input, stripping all but known-safe tags and attributes.
#
# It also strips href/src attributes with unsafe protocols like
# <tt>javascript:</tt>, while also protecting against attempts to use Unicode,
# ASCII, and hex character references to work around these protocol filters.
# All special characters will be escaped.
# It also strips href/src attributes with unsafe protocols like <tt>javascript:</tt>, while
# also protecting against attempts to use Unicode, ASCII, and hex character references to work
# around these protocol filters.
#
# The default sanitizer is Rails::Html::SafeListSanitizer. See {Rails HTML
# The default sanitizer is Rails::HTML5::SafeListSanitizer. See {Rails HTML
# Sanitizers}[https://github.com/rails/rails-html-sanitizer] for more information.
#
# Custom sanitization rules can also be provided.
@ -32,7 +31,7 @@ module SanitizeHelper
#
# * <tt>:tags</tt> - An array of allowed tags.
# * <tt>:attributes</tt> - An array of allowed attributes.
# * <tt>:scrubber</tt> - A {Rails::Html scrubber}[https://github.com/rails/rails-html-sanitizer]
# * <tt>:scrubber</tt> - A {Rails::HTML scrubber}[https://github.com/rails/rails-html-sanitizer]
# or {Loofah::Scrubber}[https://github.com/flavorjones/loofah] object that
# defines custom sanitization rules. A custom scrubber takes precedence over
# custom tags and attributes.
@ -47,9 +46,9 @@ module SanitizeHelper
#
# <%= sanitize @comment.body, tags: %w(strong em a), attributes: %w(href) %>
#
# Providing a custom Rails::Html scrubber:
# Providing a custom Rails::HTML scrubber:
#
# class CommentScrubber < Rails::Html::PermitScrubber
# class CommentScrubber < Rails::HTML::PermitScrubber
# def initialize
# super
# self.tags = %w( form script comment blockquote )
@ -64,7 +63,7 @@ module SanitizeHelper
# <%= sanitize @comment.body, scrubber: CommentScrubber.new %>
#
# See {Rails HTML Sanitizer}[https://github.com/rails/rails-html-sanitizer] for
# documentation about Rails::Html scrubbers.
# documentation about Rails::HTML scrubbers.
#
# Providing a custom Loofah::Scrubber:
#
@ -82,6 +81,22 @@ module SanitizeHelper
# # In config/application.rb
# config.action_view.sanitized_allowed_tags = ['strong', 'em', 'a']
# config.action_view.sanitized_allowed_attributes = ['href', 'title']
#
# The default, starting in \Rails 7.1, is to use an HTML5 parser for sanitization (if it is
# available, see NOTE below). If you wish to revert to the previous HTML4 behavior, you
# can do so by setting the following in your application configuration:
#
# # In config/application.rb
# config.action_view.sanitizer_vendor = Rails::HTML4::Sanitizer
#
# Or, if you're upgrading from a previous version of \Rails and wish to opt into the HTML5
# behavior:
#
# # In config/application.rb
# config.action_view.sanitizer_vendor = Rails::HTML5::Sanitizer
#
# NOTE: Rails::HTML5::Sanitizer is not supported on JRuby, so on JRuby platforms \Rails will
# fall back to Rails::HTML4::Sanitizer.
def sanitize(html, options = {})
self.class.safe_list_sanitizer.sanitize(html, options)&.html_safe
end
@ -140,7 +155,7 @@ def sanitized_allowed_attributes
sanitizer_vendor.safe_list_sanitizer.allowed_attributes
end
# Gets the Rails::Html::FullSanitizer instance used by +strip_tags+. Replace with
# Gets the Rails::HTML::FullSanitizer instance used by +strip_tags+. Replace with
# any object that responds to +sanitize+.
#
# class Application < Rails::Application
@ -150,7 +165,7 @@ def full_sanitizer
@full_sanitizer ||= sanitizer_vendor.full_sanitizer.new
end
# Gets the Rails::Html::LinkSanitizer instance used by +strip_links+.
# Gets the Rails::HTML::LinkSanitizer instance used by +strip_links+.
# Replace with any object that responds to +sanitize+.
#
# class Application < Rails::Application
@ -160,7 +175,7 @@ def link_sanitizer
@link_sanitizer ||= sanitizer_vendor.link_sanitizer.new
end
# Gets the Rails::Html::SafeListSanitizer instance used by sanitize and +sanitize_css+.
# Gets the Rails::HTML::SafeListSanitizer instance used by sanitize and +sanitize_css+.
# Replace with any object that responds to +sanitize+.
#
# class Application < Rails::Application
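
The helper in this hunk never names a sanitizer class directly; it asks the configured vendor for one and memoizes an instance. A minimal sketch of that indirection in plain Ruby follows. Every `Sketch*` name is a hypothetical stand-in, not part of Rails or rails-html-sanitizer, and the tag-stripping regex is for illustration only.

```ruby
# Hypothetical stand-in sanitizer; the real ones live in rails-html-sanitizer.
class SketchFullSanitizer
  # Naively removes anything that looks like a tag. Not safe for real input.
  def sanitize(html, _options = {})
    html.gsub(/<[^>]*>/, "")
  end
end

class SketchHTML4FullSanitizer < SketchFullSanitizer; end
class SketchHTML5FullSanitizer < SketchFullSanitizer; end

# A "vendor" here is any object that hands out sanitizer classes.
module SketchHTML4Vendor
  def self.full_sanitizer
    SketchHTML4FullSanitizer
  end
end

module SketchHTML5Vendor
  def self.full_sanitizer
    SketchHTML5FullSanitizer
  end
end

class SketchHelper
  class << self
    attr_accessor :sanitizer_vendor

    # Memoized, mirroring `@full_sanitizer ||= sanitizer_vendor.full_sanitizer.new`.
    def full_sanitizer
      @full_sanitizer ||= sanitizer_vendor.full_sanitizer.new
    end
  end

  # Default vendor for the sketch, analogous to the mattr_accessor default.
  self.sanitizer_vendor = SketchHTML4Vendor
end
```

In this sketch the instance is memoized on first use, so swapping the vendor afterwards does not change an already-built sanitizer; the vendor has to be assigned before `full_sanitizer` is first called.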

@ -46,6 +46,12 @@ class Railtie < Rails::Engine # :nodoc:
ActionView::Helpers::ContentExfiltrationPreventionHelper.prepend_content_exfiltration_prevention = prepend_content_exfiltration_prevention
end
initializer "action_view.sanitizer_vendor" do |app|
if klass = app.config.action_view.delete(:sanitizer_vendor)
ActionView::Helpers::SanitizeHelper.sanitizer_vendor = klass
end
end
config.after_initialize do |app|
button_to_generates_button_tag = app.config.action_view.delete(:button_to_generates_button_tag)
unless button_to_generates_button_tag.nil?
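
The `action_view.sanitizer_vendor` initializer above uses a delete-and-assign pattern: the option is removed from the config and copied to the helper only if it was set. A minimal sketch in plain Ruby, assuming a plain Hash in place of the framework's config object and a hypothetical `SketchSanitizeHelper` in place of `ActionView::Helpers::SanitizeHelper`:

```ruby
# Plain Hash standing in for app.config.action_view (an assumption for this sketch).
SKETCH_CONFIG = { sanitizer_vendor: :html5_vendor, other_option: true }

# Hypothetical target standing in for ActionView::Helpers::SanitizeHelper.
module SketchSanitizeHelper
  class << self
    attr_accessor :sanitizer_vendor
  end

  self.sanitizer_vendor = :html4_vendor
end

# Hash#delete returns the removed value, or nil when the key is absent,
# so the assignment only runs when the option was actually configured.
if (klass = SKETCH_CONFIG.delete(:sanitizer_vendor))
  SketchSanitizeHelper.sanitizer_vendor = klass
end
```

When the key is absent, `delete` returns `nil` and the helper keeps its existing default.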

@ -179,6 +179,10 @@ def sanitize(html, options = {})
# We don't want to do exhaustive HTML sanitization testing here. Let's assume it's already being
# done upstream by the vendor.
#
# Note that Rails::Html::Sanitizer and Rails::HTML4::Sanitizer are identical vendors (but aren't
# the same class). Eventually we will move away from using Rails::Html (a.k.a. Rails::HTML), but
# for now we should make sure everything works as expected by testing it.
#
module SanitizeHelperVendorTests
def setup
super
@ -311,7 +315,7 @@ def test_sanitize_with_loofah_scrubber_option
end
def test_sanitize_with_custom_scrubber_option
scrubber = Class.new(Rails::Html::PermitScrubber) do
scrubber = Class.new(Rails::HTML::PermitScrubber) do
def initialize
super
self.tags = ["div"]
@ -345,6 +349,26 @@ def test_strip_links
assert_equal("<div>Example of a fragment</div>", result)
end
def test_we_get_the_expected_HTML_parser
# see https://html.spec.whatwg.org/multipage/parsing.html#misnested-tags:-b-i-/b-/i
input = %(<p>1<b>2<i>3</b>4</i>5</p>)
scrubber = Loofah::Scrubber.new { |_| } # no-op, we're checking the underlying parser here
expected = if vendor == Rails::Html::Sanitizer || vendor == Rails::HTML4::Sanitizer
if RUBY_ENGINE == "jruby"
"<p>1<b>2<i>3</i></b><i>4</i>5</p>" # nekohtml parser
else
"<p>1<b>2<i>3</i></b>45</p>" # libxml2 html4 parser
end
elsif vendor == Rails::HTML5::Sanitizer
"<p>1<b>2<i>3</i></b><i>4</i>5</p>" # libgumbo html5 parser
else
flunk "Unknown vendor #{vendor}"
end
assert_equal(expected, @subject.sanitize(input, scrubber: scrubber))
end
end
class SanitizeHelperVendorHtmlTest < ActiveSupport::TestCase
@ -354,3 +378,19 @@ def vendor
Rails::Html::Sanitizer
end
end
class SanitizeHelperVendorHTML4Test < ActiveSupport::TestCase
include SanitizeHelperVendorTests
def vendor
Rails::HTML4::Sanitizer
end
end
class SanitizeHelperVendorHTML5Test < ActiveSupport::TestCase
include SanitizeHelperVendorTests
def vendor
Rails::HTML5::Sanitizer
end
end if Rails::HTML::Sanitizer.html5_support?
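
The test file shares one module of tests across per-vendor classes and defines the HTML5 class only when the platform supports it, using a trailing `if` on the class definition. A minimal sketch of that shape in plain Ruby follows. `SketchPlatform.html5_support?` is a hypothetical probe standing in for `Rails::HTML::Sanitizer.html5_support?`, and its `RUBY_ENGINE` check is an assumption based on the JRuby note in this commit.

```ruby
# Hypothetical feature probe (assumption: only JRuby lacks HTML5 support).
module SketchPlatform
  def self.html5_support?
    RUBY_ENGINE != "jruby"
  end
end

# Shared examples: each including class supplies its own #vendor.
module SketchVendorTests
  def vendor_name
    vendor.to_s
  end
end

class SketchHTML4Test
  include SketchVendorTests

  def vendor
    :html4
  end
end

# The trailing `if` guards the whole class definition, so the constant
# is never defined on platforms without HTML5 support.
class SketchHTML5Test
  include SketchVendorTests

  def vendor
    :html5
  end
end if SketchPlatform.html5_support?
```

Because the guarded class does not exist on unsupported platforms, a test runner has nothing to skip there.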

@ -62,6 +62,7 @@ Below are the default values associated with each target version. In cases of co
- [`config.action_controller.allow_deprecated_parameters_hash_equality`](#config-action-controller-allow-deprecated-parameters-hash-equality): `false`
- [`config.action_dispatch.default_headers`](#config-action-dispatch-default-headers): `{ "X-Frame-Options" => "SAMEORIGIN", "X-XSS-Protection" => "0", "X-Content-Type-Options" => "nosniff", "X-Permitted-Cross-Domain-Policies" => "none", "Referrer-Policy" => "strict-origin-when-cross-origin" }`
- [`config.action_view.sanitizer_vendor`](#config-action-view-sanitizer-vendor): `Rails::HTML::Sanitizer.best_supported_vendor`
- [`config.active_job.use_big_decimal_serializer`](#config-active-job-use-big-decimal-serializer): `true`
- [`config.active_record.allow_deprecated_singular_associations_name`](#config-active-record-allow-deprecated-singular-associations-name): `false`
- [`config.active_record.before_committed_on_all_records`](#config-active-record-before-committed-on-all-records): `true`
@ -1994,6 +1995,17 @@ The default value depends on the `config.load_defaults` target version:
Determines whether or not the `form_tag` and `button_to` helpers will produce HTML tags prepended with browser-safe (but technically invalid) HTML that guarantees their contents cannot be captured by any preceding unclosed tags. The default value is `false`.
#### `config.action_view.sanitizer_vendor`
Configures the set of HTML sanitizers used by Action View by setting `ActionView::Helpers::SanitizeHelper.sanitizer_vendor`. The default value depends on the `config.load_defaults` target version:
| Starting with version | The default value is | Which parses markup as |
|-----------------------|--------------------------------------|------------------------|
| (original) | `Rails::HTML4::Sanitizer` | HTML4 |
| 7.1 | `Rails::HTML5::Sanitizer` (see NOTE) | HTML5 |
NOTE: `Rails::HTML5::Sanitizer` is not supported on JRuby, so on JRuby platforms Rails will fall back to `Rails::HTML4::Sanitizer`.
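
The default resolution described in the table and NOTE can be sketched in plain Ruby. This is an illustration under stated assumptions, not the library's implementation: `SketchSanitizers`, its symbols, and the `html5_support:` keyword are hypothetical stand-ins for the real vendor classes and platform check.

```ruby
module SketchSanitizers
  HTML4 = :html4_vendor # stand-in for Rails::HTML4::Sanitizer
  HTML5 = :html5_vendor # stand-in for Rails::HTML5::Sanitizer

  # Picks the HTML5 vendor when the platform supports it, else HTML4.
  def self.best_supported_vendor(html5_support:)
    html5_support ? HTML5 : HTML4
  end

  # Resolves the effective default for a given load_defaults target version.
  def self.default_vendor(target_version, html5_support:)
    if Gem::Version.new(target_version) >= Gem::Version.new("7.1")
      best_supported_vendor(html5_support: html5_support)
    else
      HTML4
    end
  end
end
```

For example, `SketchSanitizers.default_vendor("7.1", html5_support: false)` models the JRuby case and resolves to the HTML4 stand-in.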
### Configuring Action Mailbox
`config.action_mailbox` provides the following configuration options:

@ -96,7 +96,7 @@ def load_defaults(target_version)
# configure the default value.
# 5. Add a commented out section in the `new_framework_defaults` to
# configure the default value again.
# 6. Update the guide in `configuration.md`.
# 6. Update the guide in `configuring.md`.
# To remove configurable deprecated behavior, follow these steps:
# 1. Update or remove the entry in the guides.
@ -311,6 +311,12 @@ def load_defaults(target_version)
if respond_to?(:action_controller)
action_controller.allow_deprecated_parameters_hash_equality = false
end
if defined?(Rails::HTML::Sanitizer) # nested ifs to avoid linter errors
if respond_to?(:action_view)
action_view.sanitizer_vendor = Rails::HTML::Sanitizer.best_supported_vendor
end
end
else
raise "Unknown version #{target_version.to_s.inspect}"
end

@ -175,3 +175,13 @@
# When you're ready to change format, add this to `config/application.rb` (NOT
# this file):
# config.active_support.cache_format_version = 7.1
# Configure Action View to use HTML5 standards-compliant sanitizers when they are supported on your
# platform.
#
# `Rails::HTML::Sanitizer.best_supported_vendor` will return `Rails::HTML5::Sanitizer` if it's
# supported, else fall back to `Rails::HTML4::Sanitizer`.
#
# In previous versions of Rails, Action View always used `Rails::HTML4::Sanitizer`.
#
# Rails.application.config.action_view.sanitizer_vendor = Rails::HTML::Sanitizer.best_supported_vendor