Simplify subtree merging. (#1500)

A key optimization in the new formatting is formatting entire branches of the piece tree separately and then merging in the results. This lets us reuse the results of solving and formatting that subtree across multiple solutions.

When merging the results back, it's important that the resulting solution has the same cost and other metrics as if the subtree had not been formatted separately. Otherwise, the optimization can change the formatting behavior.

When I first implemented subtree merging, I ran into some issues where it seemed like the resulting cost wasn't right. I wasn't able to figure out exactly why. To fix it, I made sure we didn't double count the cost of any given bound piece by merging each piece state in the subtree back into the parent solution and only counting its cost if the parent didn't already have that piece bound.

After a lot of debugging, I figured out why bound pieces would sometimes get double counted otherwise. The problem goes like this:

1.  We create a solution. When creating it, we format it which traverses the piece tree and ends up separately formatting some subtrees and merging those into this solution.

2.  Later, we expand this solution into a series of child solutions. When we do, we copy the parent solution's bound pieces and its cost.

3.  When we go to format each of these expanded child solutions, we again traverse the piece tree which ends up separately formatting many of the same subtrees. For each one, we add in the cost of the subtree solution. But we've already added in that cost from the parent solution, which the expanded child inherited, so now we've double counted.

This PR fixes that: For each solution, we separately track the cost of the pieces directly bound by the solution and the cost it accumulated from merging subtrees. When expanding a solution, we inherit the former cost, but not the latter. That way, when we go to format the expanded solution and merge the subtrees in again, we only count those subtree costs once, when the child merges them.

This is a small but noticeable speed improvement:

```
Benchmark (tall)                fastest   median  slowest  average  baseline
-----------------------------  --------  -------  -------  -------  --------
block                             0.063    0.068    0.120    0.071    109.4%
chain                             0.627    0.641    0.674    0.643    103.1%
collection                        0.161    0.173    0.197    0.173    108.9%
collection_large                  0.856    0.908    0.975    0.908    107.7%
conditional                       0.064    0.066    0.076    0.067    103.6%
curry                             0.585    0.596    0.652    0.600    101.7%
flutter_popup_menu_test           0.271    0.280    0.294    0.279    108.2%
flutter_scrollbar_test            0.128    0.129    0.146    0.132    124.3%
function_call                     1.307    1.370    1.486    1.369    110.4%
infix_large                       0.659    0.696    0.727    0.694    106.3%
infix_small                       0.168    0.179    0.204    0.179    103.6%
interpolation                     0.091    0.095    0.115    0.097     99.5%
large                             3.431    3.598    3.761    3.610    102.4%
top_level                         0.140    0.142    0.169    0.144    104.0%
```

And when formatting flutter/packages/flutter:

```
Current formatter     8.624 ========================================
Optimized             8.233 ======================================
Old formatter         4.455 ====================
The current formatter is 48.34% slower than the old formatter.
The optimized is 4.75% faster than the current formatter.
The optimized is 45.89% slower than the old formatter.
The optimization gets the formatter 9.38% of the way to the old one.
```

But, maybe more importantly, my hope is that this is a step towards untangling cost calculation from formatting, so that I can move towards calculating cost eagerly and then only formatting a solution once we know it's the winner.
1 file changed
tree: 60906a0e6f87affa7dbfb2028309f632bf01cc98
  1. .github/
  2. benchmark/
  3. bin/
  4. dist/
  5. example/
  6. lib/
  7. test/
  8. tool/
  9. .gitignore
  10. analysis_options.yaml
  11. AUTHORS
  12. CHANGELOG.md
  13. LICENSE
  14. pubspec.yaml
  15. README.md
README.md

The dart_style package defines an automatic, opinionated formatter for Dart code. It replaces the whitespace in your program with what it deems to be the best formatting for it. Resulting code should follow the Dart style guide but, moreso, should look nice to most human readers, most of the time.

The formatter handles indentation, inline whitespace, and (by far the most difficult) intelligent line wrapping. It has no problems with nested collections, function expressions, long argument lists, or otherwise tricky code.

The formatter turns code like this:

// BEFORE formatting
if (tag=='style'||tag=='script'&&(type==null||type == TYPE_JS
      ||type==TYPE_DART)||
  tag=='link'&&(rel=='stylesheet'||rel=='import')) {}

into:

// AFTER formatting
if (tag == 'style' ||
  tag == 'script' &&
      (type == null || type == TYPE_JS || type == TYPE_DART) ||
  tag == 'link' && (rel == 'stylesheet' || rel == 'import')) {}

The formatter will never break your code—you can safely invoke it automatically from build and presubmit scripts.

Style fixes

The formatter can also apply non-whitespace changes to make your code consistently idiomatic. You must opt into these by passing either --fix which applies all style fixes, or any of the --fix--prefixed flags to apply specific fixes.

For example, running with --fix-named-default-separator changes this:

greet(String name, {String title: "Captain"}) {
  print("Greetings, $title $name!");
}

into:

greet(String name, {String title = "Captain"}) {
  print("Greetings, $title $name!");
}

Using the formatter

The formatter is part of the unified dart developer tool included in the Dart SDK, so most users get it directly from there. That has the latest version of the formatter that was available when the SDK was released.

IDEs and editors that support Dart usually provide easy ways to run the formatter. For example, in WebStorm you can right-click a .dart file and then choose Reformat with Dart Style.

Here's a simple example of using the formatter on the command line:

$ dart format test.dart

This command formats the test.dart file and writes the result to the file.

dart format takes a list of paths, which can point to directories or files. If the path is a directory, it processes every .dart file in that directory or any of its subdirectories.

By default, it formats each file and write the formatting changes to the files. If you pass --output show, it prints the formatted code to stdout.

You may pass a -l option to control the width of the page that it wraps lines to fit within, but you're strongly encouraged to keep the default line length of 80 columns.

Validating files

If you want to use the formatter in something like a presubmit script or commit hook, you can pass flags to omit writing formatting changes to disk and to update the exit code to indicate success/failure:

$ dart format --output=none --set-exit-if-changed .

Running other versions of the formatter CLI command

If you need to run a different version of the formatter, you can globally activate the package from the dart_style package on pub.dev:

$ pub global activate dart_style
$ pub global run dart_style:format ...

Using the dart_style API

The package also exposes a single dart_style library containing a programmatic API for formatting code. Simple usage looks like this:

import 'package:dart_style/dart_style.dart';

main() {
  final formatter = DartFormatter();

  try {
    print(formatter.format("""
    library an_entire_compilation_unit;

    class SomeClass {}
    """));

    print(formatter.formatStatement("aSingle(statement);"));
  } on FormatterException catch (ex) {
    print(ex);
  }
}

Other resources

  • Before sending an email, see if you are asking a frequently asked question.

  • Before filing a bug, or if you want to understand how work on the formatter is managed, see how we track issues.