zeripath be99eb26a2 Detect truncated utf-8 characters at the end of content as still representing utf-8 (#19773) (#19774)
Backport #19773

Our character detection algorithm can misdetect UTF-8 content as ISO-8859-x
if the partially read file ends in a truncated multi-byte character.

This PR changes the detection algorithm to treat truncated UTF-8 characters at the end of the
buffer as still representing UTF-8.
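
The truncation problem can be illustrated with a minimal sketch. This is not the code from this PR; the helper names trimIncompleteTail and looksLikeUTF8 are hypothetical, and the sketch assumes only the standard unicode/utf8 package. The idea is to drop an incomplete multi-byte sequence at the end of the buffer before validating the rest.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// trimIncompleteTail (hypothetical helper) drops a trailing multi-byte
// sequence that was cut off by the end of the buffer. A UTF-8 sequence is
// at most utf8.UTFMax (4) bytes long, so an incomplete one is at most 3.
func trimIncompleteTail(buf []byte) []byte {
	for i := 1; i < utf8.UTFMax && i <= len(buf); i++ {
		start := len(buf) - i
		if !utf8.RuneStart(buf[start]) {
			continue // continuation byte, keep scanning backwards
		}
		// Found the last lead (or ASCII) byte. If the bytes from here on
		// do not form a full rune, the sequence was truncated.
		if !utf8.FullRune(buf[start:]) {
			return buf[:start]
		}
		return buf
	}
	return buf
}

// looksLikeUTF8 (hypothetical helper) reports whether buf is valid UTF-8
// once a possibly truncated final character is ignored.
func looksLikeUTF8(buf []byte) bool {
	return utf8.Valid(trimIncompleteTail(buf))
}

func main() {
	// "10 " followed by the first two bytes of the three-byte euro sign.
	fmt.Println(looksLikeUTF8([]byte("10 \xe2\x82"))) // true
	// ISO-8859-1 text: 0xE9 followed by a space is not valid UTF-8.
	fmt.Println(looksLikeUTF8([]byte("caf\xe9 au lait"))) // false
}
```

One limitation of this sketch: a buffer that ends in a single high byte such as 0xE9 is ambiguous, because that byte could be either an ISO-8859-1 character or the start of a truncated UTF-8 sequence. The sketch resolves that case in favour of UTF-8.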

Fix #19743

Signed-off-by: Andrew Thornton <art27@cantab.net>
2022-05-21 22:26:08 +08:00