7ae652ec8fa727fbc30c9a2debe8ac3d0f95ab08 - dart_style

commit	7ae652ec8fa727fbc30c9a2debe8ac3d0f95ab08
author	Bob Nystrom <rnystrom@google.com>	Tue Dec 05 16:12:05 2023 -0800
committer	GitHub <noreply@github.com>	Tue Dec 05 16:12:05 2023 -0800
tree	c792f754daf859cd5944b64467cf3e26cc1a7b94
parent	214f28982c325bbaa0b2079eff8e60e534fefd47

Revamp the way tokens and comments are built into pieces. (#1336)

I recently ran into bugs where a line comment after some AST node causes the node to split incorrectly. A simple example is:

```
var x = 1 + 2; // comment
```

Currently, the formatter splits that to:

```
var x =
    1 +
        2; // comment
```

It does that because the Piece tree it creates doesn't line up with the AST node boundaries. In particular, the current design appends tokens and comments to a preceding piece. So in this example, the piece tree looks like:

```
Var(
  `var`
  Assign(
    `x =`
    Infix(
      `1 +`
      `2; // comment`
    )
  )
)
```

Note how the `;` and line comment are attached as part of the RHS of the `+`. That's why the formatter thinks the line comment's newline is inside the `+` expression and forces it to split.

We could fix this specific bug by making ExpressionStatement treat the `;` as a separate piece, but I suspect that we will be playing whack-a-mole if we keep the current design. Instead, this unfortunately giant PR revamps the piece API. It has a couple of intermingled changes:

### Split pieces at all AST boundaries

Whenever a `visit___()` returns, an implicit split is inserted so that no single `TextPiece` contains tokens from a parent and child AST node. This directly fixes the above bug and all similar bugs in that category.

Note that while we split the tokens into separate pieces, that doesn't mean the formatter may insert a line break between them. The TextPieces go into AdjacentPiece objects, which don't insert actual splits between their child pieces. This means this change shouldn't significantly impact the performance of line splitting.

It's just about ensuring that the nesting structure of the piece tree mirrors the nesting structure of the AST. That way, when a newline in a child piece invalidates an outer piece, that invalidation respects the original syntax.
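To make the difference concrete, here is a toy model of the two tree shapes for `var x = 1 + 2; // comment`. The classes below are hypothetical stand-ins written for this sketch, not dart_style's real piece classes, and the "new" tree is only one plausible shape. It assumes that a piece "contains a newline" if any of its descendants ends with a line comment.

```dart
// Toy model only: hypothetical stand-ins, not dart_style's actual classes.
abstract class Piece {
  /// Whether this piece or any of its children forces a newline.
  bool get containsNewline;
}

class TextPiece extends Piece {
  final String text;
  final bool endsWithLineComment;

  TextPiece(this.text, {this.endsWithLineComment = false});

  @override
  bool get containsNewline => endsWithLineComment;
}

/// A piece that simply holds child pieces (stands in for Infix, Assign, etc.).
class ListPiece extends Piece {
  final String name;
  final List<Piece> children;

  ListPiece(this.name, this.children);

  @override
  bool get containsNewline => children.any((child) => child.containsNewline);
}

void main() {
  // Old shape: `;` and the comment are appended to the `+` operand.
  final oldInfix = ListPiece('Infix', [
    TextPiece('1 +'),
    TextPiece('2; // comment', endsWithLineComment: true),
  ]);

  // Assumed new shape: the infix piece holds only the expression's own
  // tokens, and the `;` and comment live in an enclosing piece.
  final newInfix = ListPiece('Infix', [TextPiece('1 +'), TextPiece('2')]);
  final newStatement = ListPiece('Adjacent', [
    ListPiece('Var', [
      TextPiece('var'),
      ListPiece('Assign', [TextPiece('x ='), newInfix]),
    ]),
    TextPiece(';'),
    TextPiece('// comment', endsWithLineComment: true),
  ]);

  print(oldInfix.containsNewline); // true: the `+` is forced to split
  print(newInfix.containsNewline); // false: the `+` can stay on one line
  print(newStatement.containsNewline); // true: only the statement sees it
}
```

In the second shape, the comment's newline still exists in the tree, but it sits outside the infix piece, so it no longer forces the `+` expression to split.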

### Revamp the API for creating pieces

The previous design had a DSL-like "push" API in which the pieces created by PieceWriter were stored internally and exposed through a fairly confusing `give()`/`take()`/`split()` API. That was necessary because any given `visit___()` method might not be *able* to return a Piece for its node if that node just concatenated its tokens into some surrounding piece.

With the previous change, every AST node corresponds to a piece, so every `visit___()` method is now able to return one. This PR therefore makes that change as well: every `visit___()` method is now required to return a piece. Likewise, all of the `create___()` methods in PieceFactory return the pieces they create. This avoids the need for a weird `take()` API.

### Add an AdjacentBuilder and buildPiece() API

Getting rid of the implicit storage and dataflow for pieces is good for being able to easily reason about how the piece tree gets created out of child pieces. But it can come at the cost of making code that creates pieces very verbose with lots of local variables and `List<Piece>` objects to store the intermediate pieces being built.

To make that nicer, I wrote an AdjacentBuilder class with an imperative API for building an AdjacentPiece out of a series of tokens, nodes, and spaces. This API closely mirrors the original DSL-like API, except that now you know exactly which object the nodes and tokens are pushing their pieces into.

To make that even nicer, I added a `buildPiece()` method that takes a callback, invokes it with a new AdjacentBuilder, and returns the built result. This gets most code for building pieces fairly close to the original push-based API, but with hopefully clearer, more explicit dataflow.
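The callback pattern described above can be sketched roughly as follows. This is a simplified, self-contained illustration, not the code from this PR: the method names on `AdjacentBuilder` (`token()`, `space()`, `add()`, `build()`) and the use of plain strings instead of real tokens and AST nodes are assumptions made for the sketch.

```dart
// Simplified sketch: the class and method shapes here are assumptions,
// not dart_style's actual implementation.
abstract class Piece {}

class TextPiece extends Piece {
  final String text;
  TextPiece(this.text);

  @override
  String toString() => text;
}

/// Holds child pieces that are written one after another with no splits.
class AdjacentPiece extends Piece {
  final List<Piece> pieces;
  AdjacentPiece(this.pieces);

  @override
  String toString() => pieces.join();
}

/// Imperative builder that collects pieces into one explicit object.
class AdjacentBuilder {
  final List<Piece> _pieces = [];

  void token(String lexeme) => _pieces.add(TextPiece(lexeme));
  void space() => _pieces.add(TextPiece(' '));
  void add(Piece piece) => _pieces.add(piece);

  /// Returns the single piece directly, or wraps several in an AdjacentPiece.
  Piece build() =>
      _pieces.length == 1 ? _pieces.first : AdjacentPiece(List.of(_pieces));
}

/// Creates a builder, hands it to [buildCallback], and returns the result.
Piece buildPiece(void Function(AdjacentBuilder b) buildCallback) {
  final builder = AdjacentBuilder();
  buildCallback(builder);
  return builder.build();
}

void main() {
  // A child "node" returns its own piece...
  final operand = buildPiece((b) => b.token('2'));

  // ...and the parent explicitly adds that piece to its own builder.
  final statement = buildPiece((b) {
    b.token('return');
    b.space();
    b.add(operand);
    b.token(';');
  });

  print(statement); // return 2;
}
```

The point of the design is visible in `main()`: the code reads almost like a push-style DSL, but each piece is a return value that the caller decides where to place.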

I'm really sorry for the giant size of this PR. If you want, I can try to break it into a series of smaller commits (but likely still one PR), but doing so is pretty challenging given how intertwined these changes are. It's hard to change the return type of the visit methods without also getting rid of the implicit dataflow and at that point, almost all the changes are there.

Also, I added more tests to cover the cases around comments that were broken.

22 files changed

README.md

The dart_style package defines an automatic, opinionated formatter for Dart code. It replaces the whitespace in your program with what it deems to be the best formatting for it. Resulting code should follow the Dart style guide but, more importantly, should look nice to most human readers, most of the time.

The formatter handles indentation, inline whitespace, and (by far the most difficult) intelligent line wrapping. It has no problems with nested collections, function expressions, long argument lists, or otherwise tricky code.

The formatter turns code like this:

// BEFORE formatting
if (tag=='style'||tag=='script'&&(type==null||type == TYPE_JS
      ||type==TYPE_DART)||
  tag=='link'&&(rel=='stylesheet'||rel=='import')) {}

into:

// AFTER formatting
if (tag == 'style' ||
  tag == 'script' &&
      (type == null || type == TYPE_JS || type == TYPE_DART) ||
  tag == 'link' && (rel == 'stylesheet' || rel == 'import')) {}

The formatter will never break your code—you can safely invoke it automatically from build and presubmit scripts.

Style fixes

The formatter can also apply non-whitespace changes to make your code consistently idiomatic. You must opt into these by passing either --fix, which applies all style fixes, or any of the flags prefixed with --fix- to apply specific fixes.

For example, running with --fix-named-default-separator changes this:

greet(String name, {String title: "Captain"}) {
  print("Greetings, $title $name!");
}

into:

greet(String name, {String title = "Captain"}) {
  print("Greetings, $title $name!");
}

Using the formatter

The formatter is part of the unified dart developer tool included in the Dart SDK, so most users get it directly from there. That has the latest version of the formatter that was available when the SDK was released.

IDEs and editors that support Dart usually provide easy ways to run the formatter. For example, in WebStorm you can right-click a .dart file and then choose Reformat with Dart Style.

Here's a simple example of using the formatter on the command line:

$ dart format test.dart

This command formats the test.dart file and writes the result to the file.

dart format takes a list of paths, which can point to directories or files. If the path is a directory, it processes every .dart file in that directory or any of its subdirectories.

By default, it formats each file and writes the formatting changes to the files. If you pass --output show, it prints the formatted code to stdout.

You may pass a -l option to control the width of the page that it wraps lines to fit within, but you're strongly encouraged to keep the default line length of 80 columns.

Validating files

If you want to use the formatter in something like a presubmit script or commit hook, you can pass flags to omit writing formatting changes to disk and to update the exit code to indicate success/failure:

$ dart format --output=none --set-exit-if-changed .
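If your presubmit tooling is itself written in Dart, the same check can be wrapped in a small script. The following is a minimal sketch that assumes the `dart` executable is on the PATH; the helper name `formatCheckArguments` and the failure message are made up for illustration. It passes exactly the flags shown above and forwards the formatter's exit code.

```dart
import 'dart:io';

/// Builds the argument list for a check-only formatter run.
/// (Hypothetical helper; the flags are the ones shown above.)
List<String> formatCheckArguments(List<String> paths) => [
      'format',
      '--output=none',
      '--set-exit-if-changed',
      ...paths,
    ];

Future<void> main(List<String> args) async {
  // Default to the current directory when no paths are given.
  final paths = args.isEmpty ? ['.'] : args;

  // Assumes `dart` can be found on the PATH.
  final result = await Process.run('dart', formatCheckArguments(paths));

  stdout.write(result.stdout);
  stderr.write(result.stderr);

  if (result.exitCode != 0) {
    stderr.writeln('Formatting check did not pass.');
  }

  // Propagate the formatter's exit code to the calling hook.
  exitCode = result.exitCode;
}
```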

Running other versions of the formatter CLI command

If you need to run a different version of the formatter, you can globally activate the dart_style package from pub.dev:

$ dart pub global activate dart_style
$ dart pub global run dart_style:format ...

Using the dart_style API

The package also exposes a single dart_style library containing a programmatic API for formatting code. Simple usage looks like this:

import 'package:dart_style/dart_style.dart';

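void main() {
  final formatter = DartFormatter();

  try {
    // Format a complete compilation unit (an entire file's worth of code).
    print(formatter.format("""
    library an_entire_compilation_unit;

    class SomeClass {}
    """));

    // Format a single statement on its own.
    print(formatter.formatStatement("aSingle(statement);"));
  } on FormatterException catch (ex) {
    // Thrown when the input can't be parsed.
    print(ex);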
  }
}

Other resources

Before sending an email, see if you are asking a frequently asked question.
Before filing a bug, or if you want to understand how work on the formatter is managed, see how we track issues.