blob: 384b2fecb407661dadca57a74ebf1bb1a72873ba [file] [log] [blame] [view]
# Scrape
The scrape package is sort of a micro-framework to make it easier to write
little scripts that parse and traverse Dart code and gather statistics about the
contents.
For example, say you want to find out how many if statements in a body of Dart
code contain else clauses. A script using this package to measure that is:
```dart
import 'package:analyzer/dart/ast/ast.dart';
import 'package:scrape/scrape.dart';
void main(List<String> arguments) {
Scrape()
..addHistogram("If")
..addVisitor(() => IfVisitor())
..runCommandLine(arguments);
}
class IfVisitor extends ScrapeVisitor {
@override
void visitIfStatement(IfStatement node) {
if (node.elseStatement != null) {
record("If", "else");
} else {
record("If", "no else");
}
super.visitIfStatement(node);
}
}
```
Run that script on the package itself:
```
$ dart example/if.dart .
```
And it prints out:
```
-- If (137 total) --
104 ( 75.912%): no else =======================================
33 ( 24.088%): else =============
Took 262ms to scrape 1349 lines in 12 files.
```
So it looks like if statements without elses are about three times more common
than ones with elses. We use data like this to inform how we design and evolve
the language.
## A scrape script
I wanted to make scrape flexible enough to let you write whatever kinds of
logic to analyze code. That meant that instead of scrape being a *tool you run*,
it is more like a *library you consume*. This way, inside your script, you have
access to the full Dart language. At the same time, I didn't want every script
to have to copy/paste the same boring argument parsing and other code.
The compromise between those is the Scrape class. It is a builder for an
analysis over some code. It has a few methods you call on it to set up an
analysis:
* `addHistogram()` registers a new named histogram. This is the main way you
count occurrences of datapoints you care about in the code you are analyzing.
Each histogram is a named collection of datapoints. When the analysis
completes, scrape prints out each histogram, buckets the datapoints, and
shows how many of each datapoint occurred.
In the example above, we have one histogram named "If" and we count two
different datapoints, "no else", and "else".
* `addVisitor()` registers a callback that creates a visitor. This is the
main way you analyze code. When the analysis runs, scrape parses every
Dart file you specify. For each file and each registered visitor callback,
it invokes the callback to create a visitor and then runs that to walk over
the parsed code.
You call this passing in a callback that creates an instance of your own
visitor class, which should extend `ScrapeVisitor`.
* Then at the end call `runCommandLine()`, passing in your script's command
line arguments. This reads the file paths the user wants to analyze and
a few other command line options and flags that scrape automatically
supports. To learn more, call that with `--help`.
## A visitor class
The way your script analyzes code is through one or more custom subclasses of
`ScrapeVisitor`. That base class itself extends the analyzer package's
[`RecursiveAstVisitor`][visitor] class. It will walk over every single syntax
tree node in the parsed Dart file and invoke visit methods specific to each one.
You override the visit methods for the AST nodes you care about and put
whatever logic you want in there to analyze the code.
An important limitation of scrape is that it only *parses* Dart files. It does
not to any static analysis, name resolution, or type checking. This makes it
lightweight and fast to run (for example you can run it on a pub package
without needing to download its dependencies), but significantly limits the
kinds of analysis you can do.
It's good for syntax and tolerable for things like API usage if you're willing
to assume that certain names do refer to the API you think they do. If you look
in the examples directory, you'll get a sense for what kinds of tasks scrape is
well suited for.
[visitor]: https://pub.dev/documentation/analyzer/0.40.0/dart_ast_visitor/RecursiveAstVisitor-class.html