commit | 0f54180b512e3ddfffcf0f9ba2968655747b4230 | |
---|---|---|
author | Ömer Sinan Ağacan <omersa@google.com> | Thu Jul 20 09:47:39 2023 +0000 |
committer | Commit Queue <dart-scoped@luci-project-accounts.iam.gserviceaccount.com> | Thu Jul 20 09:47:39 2023 +0000 |
tree | dcc72a4f52f2165a2ca70447ea44cb101ab8da3f | |
parent | 9692a9dfefd7324c4fa9a3c3e86e37cff3486bd9 | |
[dart2wasm] New typed data implementation

New typed data implementation that optimizes the common cases.

This uses the best possible representation for the fast case, with a representation like:

    class _I32List implements Int32List {
      final WasmIntArray<WasmI32> _data;

      int operator [](int index) {
        // range check
        return _data.read(index);
      }

      void operator []=(int index, int value) {
        // range check
        _data.writeSigned(index, value);
      }

      ...
    }

This gives us the best possible runtime performance in the common cases of:

- The list is used directly.

- The list is used via a view of the same Wasm element type (e.g. a `Uint32List` view of an `Int32List`) and with an aligned byte offset.

All other classes (`ByteBuffer`, `ByteData`, and view classes) are implemented to support this representation.

Summary of classes:

- One list class per Dart typed data list, with the matching Wasm array as the buffer (as shown in the example above): `_I8List`, `_U8List`, `_U8ClampedList`, `_I16List`, `_U16List`, ...

- One list class per Dart typed data list, with a mismatching Wasm array as the buffer. These classes are used when a view is created from a list and the original list has a Wasm array with a different element type than the view needs: `_SlowI8List`, `_SlowU8List`, ... These classes use the `ByteData` interface to update the buffer.

- One list class for each of the classes listed above, for immutable views: `_UnmodifiableI32List`, `_UnmodifiableSlowU64List`, ... These classes inherit from their modifiable list classes and override update methods using a mixin.

- One `ByteData` class for each Wasm array type: `_I8ByteData`, `_I16ByteData`, ...

- One immutable `ByteData` view for each `ByteData` class.

- One `ByteBuffer` class for each Wasm array type: `_I8ByteBuffer`, `_I16ByteBuffer`, ...

- A single `ByteBuffer` class for the immutable view of a byte buffer.
  We don't need one immutable `ByteBuffer` view class per Wasm array type, as the `ByteBuffer` API does not provide direct access to the buffer.

Other optimizations:

- `setRange` now uses `array.copy` when possible, which gives a huge performance win in some benchmarks.

- The new implementation is pure Dart and needs no support or special cases from the compiler other than the Wasm array type support and intrinsics like `array.copy`. As a result, this removes a bunch of `entry-point` pragmas and significantly reduces code size in some cases.

Other changes:

- Patch and implementation files for typed data and SIMD types are split into separate files. `typed_data_patch.dart` and `simd_patch.dart` now only contain patched factories. Implementation classes are moved to `typed_data.dart` and `simd.dart` as libraries `dart:_typed_data` and `dart:_simd`.

Benchmark results:

This CL significantly improves common cases. The new implementation is only slower than the current implementation when a view uses a Wasm array with an incompatible element type (for example, a `Uint32List` created from a `Uint64List`).

These cases can still be improved by overriding the relevant `ByteData` methods. For example, for a `Uint32List` view of a `Uint64List`, `_I64ByteData.getUint32` can be overridden to do a single read when the requested bytes don't cross element boundaries in the Wasm array. These optimizations are left as future work.

Some sample benchmarks:

vector_math matrix_bench before:

    Binary size: 133,104 bytes.
    MatrixMultiply(RunTime): 201 us.
    SIMDMatrixMultiply(RunTime): 3,608 us.
    VectorTransform(RunTime): 94 us.
    SIMDVectorTransform(RunTime): 833 us.
    setViewMatrix(RunTime): 506 us.
    aabb2Transform(RunTime): 987 us.
    aabb2Rotate(RunTime): 721 us.
    aabb3Transform(RunTime): 1,710 us.
    aabb3Rotate(RunTime): 1,156 us.
    Matrix3.determinant(RunTime): 171 us.
    Matrix3.transform(Vector3)(RunTime): 8,550 us.
    Matrix3.transform(Vector2)(RunTime): 3,924 us.
    Matrix3.transposeMultiply(RunTime): 201 us.
vector_math matrix_bench after:

    Binary size: 135,198 bytes.
    MatrixMultiply(RunTime): 42 us.
    SIMDMatrixMultiply(RunTime): 2,068 us.
    VectorTransform(RunTime): 12 us.
    SIMDVectorTransform(RunTime): 272 us.
    setViewMatrix(RunTime): 82 us.
    aabb2Transform(RunTime): 167 us.
    aabb2Rotate(RunTime): 147 us.
    aabb3Transform(RunTime): 194 us.
    aabb3Rotate(RunTime): 199 us.
    Matrix3.determinant(RunTime): 70 us.
    Matrix3.transform(Vector3)(RunTime): 726 us.
    Matrix3.transform(Vector2)(RunTime): 504 us.
    Matrix3.transposeMultiply(RunTime): 53 us.

FluidMotion before:

    Binary size: 121,130 bytes.
    FluidMotion(RunTime): 270,625 us.

FluidMotion after:

    Binary size: 110,674 bytes.
    FluidMotion(RunTime): 71,357 us.

With bound checks omitted (not in this CL), FluidMotion becomes competitive with `dart2js -O4`:

FluidMotion dart2js -O4:

    FluidMotion(RunTime): 47,813 us.

FluidMotion this CL + bound checks omitted:

    FluidMotion(RunTime): 51,289 us.

Fixes #52710.

Tested: With existing tests.
Change-Id: I33bf5585c3be5d3919a99af857659cf7d9393df0
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/312907
Reviewed-by: Joshua Litt <joshualitt@google.com>
Commit-Queue: Ömer Ağacan <omersa@google.com>
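
The view cases described in the commit message can be illustrated from user code with the public `dart:typed_data` API. This is a minimal sketch, not code from the CL: `isAlignedSameSizeView` is a hypothetical helper, and it only compares element sizes, which is a simplification of "same Wasm element type" (for example, it would not distinguish `Float32List` from `Int32List`).

```dart
import 'dart:typed_data';

// Hypothetical helper, for illustration only (not SDK code): reports whether
// a view has the same element size as its backing list and starts at a byte
// offset that is a multiple of that size. Assumption: element size stands in
// for the Wasm element type, which is a simplification.
bool isAlignedSameSizeView(
    int backingElementBytes, int viewElementBytes, int offsetInBytes) {
  return backingElementBytes == viewElementBytes &&
      offsetInBytes % backingElementBytes == 0;
}

void main() {
  final ints = Int32List.fromList([-1, 2, 3, 4]);

  // Same element size (4 bytes) and aligned offset: the fast case in the
  // commit message. The view covers ints[1] and ints[2].
  final unsigned = Uint32List.view(ints.buffer, 4, 2);
  print(isAlignedSameSizeView(
      ints.elementSizeInBytes, unsigned.elementSizeInBytes, 4)); // true
  print(unsigned); // [2, 3]

  // Different element size: a Uint32List view over a Uint64List buffer, the
  // slower case named in the commit message. (Uint64List is not available
  // when compiling to JavaScript.)
  final wide = Uint64List(2);
  final halves = Uint32List.view(wide.buffer);
  print(isAlignedSameSizeView(
      wide.elementSizeInBytes, halves.elementSizeInBytes, 0)); // false
  print(halves.length); // 4

  // setRange between lists of the same type; per the commit message, this
  // now uses `array.copy` when possible.
  final copy = Int32List(4)..setRange(0, 4, ints);
  print(copy); // [-1, 2, 3, 4]
}
```

Whether a given view actually takes the fast or the `_Slow*` path is decided inside the SDK implementation; the helper above only mirrors the condition as the commit message states it.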
Dart is:
- Optimized for UI: Develop with a programming language specialized around the needs of user interface creation.
- Productive: Make changes iteratively: use hot reload to see the result instantly in your running app.
- Fast on all platforms: Compile to ARM & x64 machine code for mobile, desktop, and backend. Or compile to JavaScript for the web.
Dart's flexible compiler technology lets you run Dart code in different ways, depending on your target platform and goals:
- Dart Native: For programs targeting devices (mobile, desktop, server, and more), Dart Native includes both a Dart VM with JIT (just-in-time) compilation and an AOT (ahead-of-time) compiler for producing machine code.
- Dart Web: For programs targeting the web, Dart Web includes both a development time compiler (dartdevc) and a production time compiler (dart2js).
Dart is free and open source.
See LICENSE and PATENT_GRANT.
Visit dart.dev to learn more about the language and tools, and to find codelabs.
Browse pub.dev for more packages and libraries contributed by the community and the Dart team.
Our API reference documentation is published at api.dart.dev, based on the stable release. (We also publish docs from our beta and dev channels, as well as from the primary development branch.)
If you want to build Dart yourself, here is a guide to getting the source, preparing your machine to build the SDK, and building.
There are more documents on our wiki.
The easiest way to contribute to Dart is to file issues.
You can also contribute patches, as described in Contributing.