From: | Serginio1 | https://habrahabr.ru/users/serginio1/topics/ | |
Date: | 13.04.23 14:36 | ||
Rating: | 54 (4) +1 |
Optimized ThreadStatic field access for GC-type
Field accesses that are marked as ThreadStaticLocal are now optimized for primitive types. With PR#85619, we have optimized reference type field access as well. These changes have led to some really good improvements in a number of benchmarks: (133 on windows/arm64, 23 on windows/x64, 16, 13, 11 improvements).
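For readers unfamiliar with thread-static fields, here is a minimal sketch of the kind of access this optimization targets. It is only an illustration, not code from the PR: the field names t_counter and t_name are hypothetical, and the example assumes a standard .NET console project.

```csharp
using System;
using System.Threading;

static class Program
{
    // Primitive thread-static field: each thread gets its own copy.
    [ThreadStatic]
    private static int t_counter;

    // Reference-type (GC-type) thread-static field, the case covered by PR#85619.
    [ThreadStatic]
    private static string? t_name;

    static void Main()
    {
        t_counter = 1;
        t_name = "main";

        var worker = new Thread(() =>
        {
            // A new thread starts with default values, not the main thread's values.
            Console.WriteLine(t_name is null);
            t_counter = 42;
            t_name = "worker";
        });
        worker.Start();
        worker.Join();

        // The worker's writes did not touch the main thread's copies.
        Console.WriteLine($"{t_name} {t_counter}");
    }
}
```

Every read or write of t_counter and t_name has to locate the current thread's storage first, which is why speeding up that lookup shows up in benchmarks.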
Arm64
Preview 5 also brings a few peephole optimizations:
With PR#85032, we enabled a peephole optimization that replaces a pair of str instructions with stp.
With PR#85657, we enabled a peephole optimization that replaces a pair of ldr/str instructions with ldp/stp inside the prolog.
General optimizations
Our team has released a number of general optimizations that include:
x64 instructions such as movzx, movsx and movsxd have been optimized in PR#85780, which slightly improved code-gen by eliminating more redundant mov instructions.
PR#86318 improved constant folding for some frozen objects (non-GC objects). It reduced the size of the generated code by a factor of almost 10 (e.g., from 424 bytes to 41 bytes).
AVX-512
PR#85389 enabled AVX-512 for block unrolling, which widens the range of block sizes handled inline where the code previously fell back to memcpy/memset, and reduced the execution time by half.
PR#85833 enabled various integer intrinsics for AVX512F, AVX512BW and AVX512CD.