gitea

Detect truncated utf-8 characters at the end of content as still representing utf-8 (#19773) (#19774)

Backport #19773

Our character detection algorithm can incorrectly detect utf-8 as iso-8859-x
if there is a truncated character at the end of the partially read file.

This PR changes the detection algorithm to tolerate truncated utf-8 characters at the end of the
buffer.
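
As an illustration of the idea described above, here is a minimal sketch and not the code from this PR: before validating a partially read buffer, drop any incomplete multi-byte sequence at its end. The names `trimTruncatedTail` and `looksLikeUTF8` are hypothetical and do not come from `charset.go`; only the standard `unicode/utf8` package is used.

```go
package main

import "unicode/utf8"

// trimTruncatedTail is a hypothetical helper (not from charset.go).
// It returns content without a trailing incomplete UTF-8 sequence, if one exists.
func trimTruncatedTail(content []byte) []byte {
	// A UTF-8 sequence is at most 4 bytes long, so a truncated one
	// occupies at most the last 3 bytes of the buffer.
	for i := 1; i <= 3 && i <= len(content); i++ {
		start := len(content) - i
		if utf8.RuneStart(content[start]) {
			// Found the first byte of the final sequence. If the bytes
			// from here to the end do not form a full rune, cut them off.
			if !utf8.FullRune(content[start:]) {
				return content[:start]
			}
			return content
		}
	}
	// Only continuation bytes were seen; leave the buffer unchanged and
	// let validation decide.
	return content
}

// looksLikeUTF8 is a hypothetical helper that reports whether content is
// valid UTF-8 once a truncated final character is ignored.
func looksLikeUTF8(content []byte) bool {
	return utf8.Valid(trimTruncatedTail(content))
}
```

With this sketch, a buffer cut off in the middle of a multi-byte character still validates, while a stray high byte in the middle of the content is still rejected. A buffer whose only non-ASCII byte is a final lead-like byte stays ambiguous: it is treated as truncated UTF-8, which is the trade-off such an approach accepts.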

Fix #19743

Signed-off-by: Andrew Thornton <art27@cantab.net>

2022-05-21 22:26:08 +08:00

charset_test.go

Detect truncated utf-8 characters at the end of content as still representing utf-8 (#19773) (#19774)

2022-05-21 22:26:08 +08:00

charset.go

Detect truncated utf-8 characters at the end of content as still representing utf-8 (#19773) (#19774)

2022-05-21 22:26:08 +08:00

escape_test.go

Don't treat BOM escape sequence as hidden character. (#18909) (#18910)

2022-02-26 23:15:04 +01:00

escape.go

Don't treat BOM escape sequence as hidden character. (#18909) (#18910)

2022-02-26 23:15:04 +01:00