[dart2wasm] New typed data implementation

New typed data implementation that optimizes the common cases.

This uses the best possible representation for the fast case with a
representation like:

    class _I32List implements Int32List {
      final WasmIntArray<WasmI32> _data;

      int operator [](int index) {
        // range check
        return _data.read(index);
      }

      void operator []=(int index, int value) {
        // range check
        _data.writeSigned(index, value);
      }

      ...
    }

This gives us the best possible runtime performance in the common cases
of:

- The list is used directly.
- The list is used via a view of the same Wasm element type (e.g. a
  `Uint32List` view of a `Int32List`) and with aligned byte offset.

All other classes (`ByteBuffer`, `ByteData`, and view classes)
implemented to be able to support this representation.

Summary of classes:

- One list class per Dart typed data list, with the matching Wasm array
  as the buffer (as shown in the example above): `_I8List`, `_U8List`,
  `_U8ClampedList`, `_I16List`, `_U16List`, ...

- One list class per Dart typed data list, with mismatching Wasm array
  as the buffer. These classes are used when a view is created from a
  list, and the original list has a Wasm array with different element
  type than the view needs. `_SlowI8List`, `_SlowU8List`, ...

  These classes use `ByteData` interface to update the buffer.

- One list class for each of the classes listed above, for immutable
  views. `_UnmodifiableI32List`, `_UnmodifiableSlowU64List`, ...

  These classes inherit from their modifiable list classes and override
  update methods using a mixin.

- One `ByteData` class for each Wasm array type: `_I8ByteData`,
  `_I16ByteData`,
  ...

- One immutable `ByteData` view for each `ByteData` class.

- One `ByteBuffer` class for each Wasm array type: `_I8ByteBuffer`,
  `_I16ByteBuffer`, ...

- A single `ByteBuffer` class for the immutable view of a byte buffer.

  We don't need one immutable `ByteBuffer` view class per Wasm array
  type as `ByteBuffer` API does not provide direct access to the buffer.

Other optimizations:

- `setRange` now uses `array.copy` when possible, which causes a huge
  performance win in some benchmarks.

- The new implementation is pure Dart and needs no support or special
  cases from the compiler other than the Wasm array type support and
  intrinsics like `array.copy`. As a result this removes a bunch of
  `entry-point` pragmas and significantly reduces code size in some
  cases.

Other changes:

- Patch and implementation files for typed data and SIMD types are split
  into separate files. `typed_data_patch.dart` and `simd_patch.dart` now
  only contains patched factories. Implementation classes are moved to
  `typed_data.dart` and `simd.dart` as libraries `dart:_typed_data` and
  `dart:_simd`.

Benchmark results:

This CL significantly improves common cases. New implementation is only
slower than the current implementation when a view uses a Wasm array
type with incompatible element type (for example, `Uint32List` created
from a `Uint64List`).

These cases can still be improved by overriding the relevant `ByteData`
methods. For example, in the example of `Uint32List` view of a
`Uint64List`, by overriding `_I64ByteData.getUint32` to do a single read
then requested bytes don't cross element boundaries in the Wasm array.
These optimizations are left as future work.

Some sample benchmarks:

vector_math matrix_bench before:

    Binary size: 133,104 bytes.
    MatrixMultiply(RunTime): 201 us.
    SIMDMatrixMultiply(RunTime): 3,608 us.
    VectorTransform(RunTime): 94 us.
    SIMDVectorTransform(RunTime): 833 us.
    setViewMatrix(RunTime): 506 us.
    aabb2Transform(RunTime): 987 us.
    aabb2Rotate(RunTime): 721 us.
    aabb3Transform(RunTime): 1,710 us.
    aabb3Rotate(RunTime): 1,156 us.
    Matrix3.determinant(RunTime): 171 us.
    Matrix3.transform(Vector3)(RunTime): 8,550 us.
    Matrix3.transform(Vector2)(RunTime): 3924 us.
    Matrix3.transposeMultiply(RunTime): 201 us.

vector_math matrix_bench after:

    Binary size: 135,198 bytes.
    MatrixMultiply(RunTime): 42 us.
    SIMDMatrixMultiply(RunTime): 2,068 us.
    VectorTransform(RunTime): 12 us.
    SIMDVectorTransform(RunTime): 272 us.
    setViewMatrix(RunTime): 82 us.
    aabb2Transform(RunTime): 167 us.
    aabb2Rotate(RunTime): 147 us.
    aabb3Transform(RunTime): 194 us.
    aabb3Rotate(RunTime): 199 us.
    Matrix3.determinant(RunTime): 70 us.
    Matrix3.transform(Vector3)(RunTime): 726 us.
    Matrix3.transform(Vector2)(RunTime): 504 us.
    Matrix3.transposeMultiply(RunTime): 53 us.

FluidMotion before:

    Binary size: 121,130 bytes.
    FluidMotion(RunTime): 270,625 us.

FluidMotion after:

    Binary size: 110,674 bytes.
    FluidMotion(RunTime): 71,357 us.

With bound checks omitted (not in this CL), FluidMotion becomes
competitive with `dart2js -O4`:

FluidMotion dart2js -O4:

    FluidMotion(RunTime): 47,813 us.

FluidMotion this CL + boud checks omitted:

    FluidMotion(RunTime): 51,289 us.

Fixes #52710.

Tested: With existing tests.
Change-Id: I33bf5585c3be5d3919a99af857659cf7d9393df0
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/312907
Reviewed-by: Joshua Litt <joshualitt@google.com>
Commit-Queue: Ömer Ağacan <omersa@google.com>
12 files changed
tree: dcc72a4f52f2165a2ca70447ea44cb101ab8da3f
  1. .dart_tool/
  2. .github/
  3. benchmarks/
  4. build/
  5. docs/
  6. pkg/
  7. runtime/
  8. samples/
  9. sdk/
  10. tests/
  11. third_party/
  12. tools/
  13. utils/
  14. .clang-format
  15. .gitattributes
  16. .gitconfig
  17. .gitignore
  18. .gn
  19. .mailmap
  20. .style.yapf
  21. .vpython
  22. AUTHORS
  23. BUILD.gn
  24. CHANGELOG.md
  25. codereview.settings
  26. CONTRIBUTING.md
  27. DEPS
  28. LICENSE
  29. OWNERS
  30. PATENT_GRANT
  31. PRESUBMIT.py
  32. README.dart-sdk
  33. README.md
  34. sdk.code-workspace
  35. sdk_args.gni
  36. SECURITY.md
  37. WATCHLISTS
README.md

Dart

A client-optimized language for fast apps on any platform

Dart is:

  • Optimized for UI: Develop with a programming language specialized around the needs of user interface creation.

  • Productive: Make changes iteratively: use hot reload to see the result instantly in your running app.

  • Fast on all platforms: Compile to ARM & x64 machine code for mobile, desktop, and backend. Or compile to JavaScript for the web.

Dart's flexible compiler technology lets you run Dart code in different ways, depending on your target platform and goals:

  • Dart Native: For programs targeting devices (mobile, desktop, server, and more), Dart Native includes both a Dart VM with JIT (just-in-time) compilation and an AOT (ahead-of-time) compiler for producing machine code.

  • Dart Web: For programs targeting the web, Dart Web includes both a development time compiler (dartdevc) and a production time compiler (dart2js).

Dart platforms illustration

License & patents

Dart is free and open source.

See LICENSE and PATENT_GRANT.

Using Dart

Visit dart.dev to learn more about the language, tools, and to find codelabs.

Browse pub.dev for more packages and libraries contributed by the community and the Dart team.

Our API reference documentation is published at api.dart.dev, based on the stable release. (We also publish docs from our beta and dev channels, as well as from the primary development branch).

Building Dart

If you want to build Dart yourself, here is a guide to getting the source, preparing your machine to build the SDK, and building.

There are more documents on our wiki.

Contributing to Dart

The easiest way to contribute to Dart is to file issues.

You can also contribute patches, as described in Contributing.