This is a _significant_ perf improvement as we no longer have to think
about tracking state transitions from empty <-> nonempty.
(corresponds to a ~20% perf improvement in LibWasm)
Instead of repeatedly removing elements off the vector, this allows for
specifying all the removed indices at once, and does not perform any
extra reallocations or unnecessary moves.
Copy parse() method from LibCore::DateTime::parse(). Augment the method
to handle parsing from GMT time. Fix incorrect handling of year in '%D'
format specifier. Remove all format specifiers related to time zones.
Copy relevant tests and add additional ones.
Buckets being iterated by pointer instead of reference was causing a
compilation error when calling clear_with_capacity() on a HashTable
containing a non-trivially-destructible type.
When T in HashTable<T> has a potentially slow equality check, it can be
very profitable to check for a matching hash before full equality.
This patch adds may_have_slow_equality_check() to AK::Traits and
defaults it to true. For trivial types (pointers, integers, etc) we
default it to false. This means we skip the hash check when the equality
check would be a single-CPU-word compare anyway.
This synergizes really well with things like HashMap<String, V> where
collisions previously meant we may have to churn through multiple O(n)
equality checks.
We often use Optional<T> in a cache pattern such as:
if (!m_cache.has_value())
m_cache = slow_thing();
return m_cache.value();
The new ::ensure() makes it a bit simpler:
return m_cache.ensure([&] { return slow_thing(); });
This first pass only applies to the following two cases:
- Public functions returning a view type into an object they own
- Public ctors storing a view type
This catches a grand total of one (1) issue, which is fixed in
the previous commit.
We added this for String some time ago, so let's give Utf16String the
same optimization. Note that Utf16String was already handling its data
member potentially being null as of 5af99f4dd0.
For the web, we allow a wobbly UTF-16 encoding (i.e. lonely surrogates
are permitted). Only in a few exceptional cases do we strictly require
valid UTF-16. As such, our `validate(AllowLonelySurrogates::Yes)` calls
will always succeed. It's a wasted effort to ever make such a check.
This patch eliminates such invocations. The validation methods will now
only check for strict UTF-16, and are only invoked when needed.
When we build a UTF-16 string, we currently always switch to the UTF-16
storage mode inside StringBuilder. Then when it comes time to create the
string, we switch the storage to ASCII if possible (by shifting the
underlying bytes up).
Instead, let's start out with ASCII storage and then switch to UTF-16
storage once we see a non-ASCII code point. For most strings, this will
avoid allocating 2x the memory, and avoids many ASCII validation calls.
We now define GenericLexer as a template to allow using it with UTF-16
strings. To keep existing users happy, the template is defined in the
Detail namespace. Then AK::GenericLexer is an alias for a char-based
view, and AK::Utf16GenericLexer is an alias for a char16-based view.
* Remove completely unused methods.
* Deduplicate methods that were overloaded with both StringView and
char const* parameters.
A future commit will templatize GenericLexer by char type. This patch
serves to make that a tiny bit easier.
We were handling the special cases of NaN and Infinity in basically the
same way across both functions - we can reduce code duplication by
moving this to before we branch.
This is also required as we will be moving the logic to encode in
scientific notation above the branch in a later commit and the
`convert_floating_point_to_decimal_exponential_form` method doesn't work
with non-finite values.
In the following synthetic benchmark, the simdutf version is 4x faster:
BENCHMARK_CASE(find)
{
auto string = u"😀Foo😀Bar"sv;
for (size_t i = 0; i < 100'000'000; ++i)
(void)string.find_code_unit_offset('a');
}
...for the first byte.
This function only really needs to read a single byte at that point, so
read_until_filled() is useless and read_value<u8> is functionally
equivalent to just a read.
This showed up hot in a wasm parse benchmark.
These are quite bottlenecky in wasm, the next commit will try to make
use of this by calling them directly instead of doing a vcall, and
having them inlineable helps the compiler a bit.