This page describes how to assess the performance of Dart and native code, and how to improve it.

| Tool | Platform | Primary Use Case | Measures (Dart CPU) | Measures (Native CPU) | Measures (Dart Heap) | Measures (Native Heap) |
|---|---|---|---|---|---|---|
| Dart DevTools | All | Profiles Dart VM, UI jank, Dart heap | Yes | Opaque “Native” block | Yes | Tracks “External” VM-aware memory only; misses native-heap leaks |
| Xcode Instruments (Time Profiler) | iOS/macOS | Profiles native CPU call stacks | No | Yes (full symbolication) | No | No |
| Xcode Instruments (Leaks/Allocations) | iOS/macOS | Profiles native heap (malloc, mmap) | No | No | No | Yes |
| Android Studio Profiler (CPU) | Android | Profiles native C/C++ CPU execution | No | Yes (traces C++ calls) | No | No |
| Perfetto (heapprofd) | Android | Advanced native heap profiling | No | No | No | Yes (traces malloc/free call stacks) |
| Linux perf | Linux | Unified Dart AOT + Native CPU profiling | Yes (requires special flags) | Yes | No | No |
| Visual Studio CPU Usage Profiler | Windows | Profiles native C/C++ CPU execution | No | Yes (traces C++ calls) | No | No |
| WPA (Heap Analysis) | Windows | Advanced native heap profiling | No | No | No | Yes (traces malloc/free call stacks) |

To assess only the performance of the Dart code, treating native code as a black box, use the Dart performance tooling.
See the documentation on https://dart.dev/tools/dart-devtools and https://docs.flutter.dev/perf. For FFI specifically, you can use https://docs.flutter.dev/tools/devtools/cpu-profiler and https://docs.flutter.dev/tools/devtools/performance#timeline-events-tab. For synchronous FFI calls you can add synchronous timeline events, and for asynchronous code (using async callbacks or helper isolates) you can use async events.
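
As a rough illustration of the timeline events mentioned above, the following sketch wraps a call in a synchronous event with `Timeline.timeSync` and an asynchronous one with `TimelineTask`, both from `dart:developer`. The functions `nativeCompute` and `nativeComputeAsync` are hypothetical stand-ins written in plain Dart so the example runs without a native library; in a real app they would be FFI bindings.

```dart
import 'dart:developer';

// Hypothetical stand-in for a synchronous FFI call.
int nativeCompute(int x) => x * 2;

// Hypothetical stand-in for native work whose result arrives asynchronously
// (for example through an async callback or a helper isolate).
Future<int> nativeComputeAsync(int x) async => x * 2;

// Records a synchronous timeline event that spans the call.
int tracedCompute(int x) {
  return Timeline.timeSync('nativeCompute', () => nativeCompute(x));
}

// Records an async timeline event that stays open until the future completes.
Future<int> tracedComputeAsync(int x) async {
  final task = TimelineTask()..start('nativeComputeAsync');
  try {
    return await nativeComputeAsync(x);
  } finally {
    task.finish();
  }
}

Future<void> main() async {
  print(tracedCompute(21));
  print(await tracedComputeAsync(21));
}
```

The event names are arbitrary labels; picking one name per FFI entry point makes the calls easy to find in the timeline view.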
perf on Linux

To see both Dart and native symbols in a flame graph, you can use perf on Linux.
To run the FfiCall benchmark in JIT mode with perf:
$ perf record -g dart --generate-perf-events-symbols benchmarks/FfiCall/dart/FfiCall.dart && \
  perf report --hierarchy
Flutter apps are deployed in AOT mode, so prefer profiling in AOT mode.
For AOT, there is currently no single command. You need to use the precompiler2 command from the Dart SDK. See building the Dart SDK for how to build it.
$ pkg/vm/tool/precompiler2 benchmarks/FfiCall/dart/FfiCall.dart benchmarks/FfiCall/dart/FfiCall.dart.bin && \
  perf record -g pkg/vm/tool/dart_precompiled_runtime2 --generate-perf-events-symbols benchmarks/FfiCall/dart/FfiCall.dart.bin && \
  perf report --hierarchy
To analyze a performance issue in Flutter, it is best to reproduce the issue in Dart standalone.
There are some typical patterns to improve performance:
- To avoid dropped frames, move long-running FFI calls to a helper isolate.
- To avoid copying data where possible:
  - Keep data in native memory: operate on Pointers and use asTypedList to convert the pointers into TypedData.
  - For memory that lives in Dart, use leaf calls (isLeaf, isLeaf (2), isLeaf (3)) and address. (Leaf calls prevent the Dart GC from running on all isolates, which makes it possible to give native code a pointer to an object in Dart.)
  - Use Isolate.exit to send large data from a helper isolate to the main isolate after a large computation.
- For many small calls, limit the overhead per call. This makes a significant difference for calls shorter than 1 us (one millionth of a second), and can be considered for calls of up to 10 us.
  - Use leaf calls (isLeaf, isLeaf (2), isLeaf (3)).
  - Prefer Native external functions over DynamicLibrary.lookupFunction and Pointer.asFunction.

For reference, the FfiCall benchmark reports the following run times for 1000 FFI calls in AOT mode on Linux x64:
FfiCall.Uint8x01(RunTime): 234.61104068226345 us.
FfiCall.Uint8x01Leaf(RunTime): 71.9994712538334 us.
FfiCall.Uint8x01Native(RunTime): 216.07292770828917 us.
FfiCall.Uint8x01NativeLeaf(RunTime): 27.64136415181509 us.
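Dividing by 1000, a single Native leaf call takes about 28 ns, while a non-leaf call through asFunction takes about 235 ns. Switching saves roughly 200 ns per call, so for native calls that themselves take ~1000 ns that is about a 20% speedup.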
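
Two of the patterns above, leaf calls and a helper isolate, might be combined as in the following minimal sketch. It assumes a platform where the C library's `abs` symbol is visible through `DynamicLibrary.process()` (typically Linux or macOS). It uses `lookupFunction` rather than a `Native` external function only because that needs no build setup for a short example; `absLeaf` and `sumOfAbs` are hypothetical names chosen for illustration.

```dart
import 'dart:ffi';
import 'dart:isolate';

// Leaf binding to the C library's `int abs(int)`.
// Assumption: the symbol can be resolved in the current process.
// Top-level variables are initialized lazily per isolate, so the helper
// isolate performs its own lookup the first time it calls absLeaf.
final int Function(int) absLeaf = DynamicLibrary.process()
    .lookupFunction<Int32 Function(Int32), int Function(int)>(
  'abs',
  isLeaf: true,
);

// Hypothetical batch of many very short FFI calls.
int sumOfAbs(List<int> values) {
  var total = 0;
  for (final v in values) {
    total += absLeaf(v);
  }
  return total;
}

Future<void> main() async {
  final values = List<int>.generate(1000000, (i) => i.isEven ? i : -i);
  // Run the whole batch on a helper isolate so the calling isolate
  // (the UI isolate in a Flutter app) stays responsive.
  final total = await Isolate.run(() => sumOfAbs(values));
  print(total);
}
```

A leaf call suits `abs` because the function is short, does not block, and never calls back into Dart. Long-running native functions are poor candidates for `isLeaf`, since a leaf call holds up the Dart GC while it runs.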