V8 5.5 changed how invalid characters are handled and it now appears
to follow the WHATWG Encoding standard, where all of an invalid
character's bytes are replaced by a single replacement character
(\ufffd) instead of replacing each invalid byte with separate
replacement characters.
Example: the byte sequence 0xF0,0xB8,0x41 is decoded as '\ufffdA' in
V8 5.5, but is decoded as '\ufffd\ufffdA' in previous versions of V8.
PR-URL: https://github.com/nodejs/node/pull/9618
Reviewed-By: Ali Ijaz Sheikh <ofrobots@google.com>
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Buffer.isEncoding and string_decoder.normalizeEncoding shared
quite a bit of logic. This moves the primary logic into
internal/util. The userland modules that monkey patch Buffer.isEncoding
should still work.
PR-URL: https://github.com/nodejs/node/pull/7207
Reviewed-By: Brian White <mscdex@mscdex.net>
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
When node began using the OneByte API (f150d56) it also switched to
officially supporting ISO-8859-1. Though at the time no new encoding
string was introduced.
Introduce the new encoding string 'latin1' to be more explicit. The
previous 'binary' and documented as an alias to 'latin1'. While many
tests have switched to use 'latin1', there are still plenty that do both
'binary' and 'latin1' checks side-by-side to ensure there is no
regression.
PR-URL: https://github.com/nodejs/node/pull/7111
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
This commit provides a rewrite of StringDecoder that both improves
performance (for non-single-byte encodings) and understandability.
Additionally, StringDecoder instantiation performance has increased
considerably due to inlinability and more efficient encoding name
checking.
PR-URL: https://github.com/nodejs/node/pull/6777
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Several changes:
* Soft-Deprecate Buffer() constructors
* Add `Buffer.from()`, `Buffer.alloc()`, and `Buffer.allocUnsafe()`
* Add `--zero-fill-buffers` command line option
* Add byteOffset and length to `new Buffer(arrayBuffer)` constructor
* buffer.fill('') previously had no effect, now zero-fills
* Update the docs
PR-URL: https://github.com/nodejs/node/pull/4682
Reviewed-By: Сковорода Никита Андреевич <chalkerx@gmail.com>
Reviewed-By: Stephen Belanger <admin@stephenbelanger.com>
This commit reverts the const usage introduced by 68a6abc
because v8 currently cannot optimize functions that contain
these uses of const (unsupported phi use of const variable).
The performance difference in this case can be up to ~130%
for non-ascii/binary string encodings.
PR-URL: https://github.com/nodejs/node/pull/5134
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: James M Snell <jasnell@gmail.com>
The copyright and license notice is already in the LICENSE file. There
is no justifiable reason to also require that it be included in every
file, since the individual files are not individually distributed except
as part of the entire package.
This patch simplifies the implementation of StringDecoder, fixes the
failures from the new test cases, and also no longer relies on v8's
WriteUtf8 function to encode individual surrogates.