Dart Language and Library Newsletter

2017-11-03 @floitschG

Welcome to the Dart Language and Library Newsletter.

Did You Know?

Chunked Conversions

All converters (implementing Converter from dart:convert) support three modes of operation:

synchronous
chunked
streamed

The synchronous and streamed conversions are the most commonly used ones. Example:

import 'dart:convert';
import 'dart:io';

main() {
  print(JSON.encode({"my": "map"}));  // => {"my":"map"}
  new File("data.txt")
      .openRead()
      .transform(UTF8.decoder)
      .transform(JSON.decoder)
      .listen(print);
}

The JSON.encode is synchronously encoding the provided map. The transform calls to the decoders are asynchronously decoding the provided data.

The chunked conversion is a mixture between the synchronous and streamed versions: it takes data in pieces (like the stream transformers), but works on the data synchronously. This can be interesting to do bigger transformations while still staying in control of how much data is transformed at every step.

For example, one might have a big string that should be decoded from JSON. Doing the transformation at once could take too long and make the program miss a frame. By chunking it into smaller pieces, the whole process probably takes even longer, but at least there is time do other things (like painting frames) in between.

import 'dart:convert';
import 'dart:io';
import 'dart:math';

main() {
  for (int i = 0; i < 5; i++) {
    var sw = new Stopwatch()..start();
    // Here we just read the contents of a file, but the string could come from
    // anywhere.
    var input = new File("big.json").readAsStringSync();
    print("Reading took: ${sw.elapsedMicroseconds}us");

    // Measure synchronous decoding.
    sw.reset();
    var decoded = JSON.decode(input);
    print("Decoding took: ${sw.elapsedMicroseconds}us");

    // Measure chunked decoding.
    sw.reset();
    const chunkCount = 100;  // Actually one more for simplicity.
    var result;
    // This is where the chunked converter will publish its result.
    var outSink = new ChunkedConversionSink.withCallback((List<dynamic> x) {
      result = x.single;
    });

    var inSink = JSON.decoder.startChunkedConversion(outSink);
    var chunkSw = new Stopwatch()..start();
    var maxChunkTime = 0;
    var chunkSize = input.length ~/ chunkCount;
    int i;
    for (i = 0; i < chunkCount; i++) {
      chunkSw.reset();
      var chunk = input.substring(i * chunkSize, (i + 1) * chunkSize);
      inSink.add(chunk);
      maxChunkTime = max(maxChunkTime, chunkSw.elapsedMicroseconds);
    }
    // Now add the last chunk (which could be non-empty because of the rounding
    // division).
    chunkSw.reset();
    inSink.add(input.substring(i * chunkSize));
    inSink.close();
    maxChunkTime = max(maxChunkTime, chunkSw.elapsedMicroseconds);
    assert(result != null);
    print("Decoding took at most ${maxChunkTime}us per chunk,"
        " and ${sw.elapsedMicroseconds} in total");
  }
}

First note that this example only works on the VM (and not in the browser). This is partially on purpose: JSON decoding in the browser falls back to the existing JSON.parse from JavaScript. As such, it doesn't really support chunked decoding, but just accumulates the input and then finally invokes the native parsing function. In the browser there is thus no advantage in decoding JSON in smaller pieces (streamed or chunked).

There is a lot going on in this sample, but most of it is straightforward:

Read in a big JSON string. Here we just read in the file from the disk. It could have come through the network too. We assume that the string is provided as one. Otherwise, a streaming transformation might be more interesting (although users could still consider splitting the input into smaller pieces).
Measure how long it takes to decode it synchronously. Nothing special here.
Measure a chunked decoding.

The first thing to do for a chunked conversion is to create a Sink where the chunked converter can dump results. For simplicity we use the ChunkedConversionSink.withCallback constructor. It invokes the given callback at the very end with all events that were pushed into the sink.

For example:

var sink = new ChunkedConversionSink.withCallback(print);
sink.add("foo");
sink.add("bar");
sink.close();  // Now invokes the callback and prints "[foo, bar]".

With this sink we start a chunked conversion: startChunkedConversion(sink). The converter returns another sink where we can now add new input. In the example we do this in a loop and measure how long it takes for each chunk to be processed. Finally, after the loop we add the last remaining chunk and close the sink. At this point the callback of our sink is invoked, which sets the result variable.

When running the program we have interesting numbers:

$ dart json.dart
Reading took: 20440us
Decoding took: 47553us
Decoding took at most 1270us per chunk, and 19543 in total
Reading took: 5995us
Decoding took: 21936us
Decoding took at most 472us per chunk, and 11554 in total
Reading took: 4480us
Decoding took: 9702us
Decoding took at most 471us per chunk, and 10974 in total
Reading took: 4084us
Decoding took: 9765us
Decoding took at most 472us per chunk, and 10846 in total
Reading took: 4118us
Decoding took: 9789us
Decoding took at most 14600us per chunk, and 25972 in total

Observe that we ran this benchmark in a loop (5 times). This is to give the Dart VM time to warm up. It's striking how much longer the first iteration took. At this time all methods still need to be compiled and everything takes much longer. Reading in the file takes ~20ms in the first iteration, and becomes four to five times faster.

Similarly, decoding the string first takes around 50ms the first time the converter is exercised and improves to less than 5ms. Despite already being (partially) warm, running the chunked conversion is also much slower the first time.

These times can be improved by using the VM's precompilation mode (or the hybrid one). In that case, a snapshot contains already precompiled executable code to avoid the penalty of the first compilation.

We can also observe that the chunked conversion is generally a bit slower in total, but takes much less time for the individual chunks (and these are just the maximum numbers).

Finally notice that garbage collection (GC) can have a big impact. In the last run the GC kicked in for one of the chunks and made it take ~15ms instead of the usual 500us. To be fair: this whole benchmark is allocating a lot: the JSON objects are allocated and discarded immediately (for the first 4 rounds), and the substring calls on the string aren't helping either.

Improving the GC would definitely help. For example, a GC similar to V8's incremental GC has very low pauses. Improving the GC is, however, a difficult task that takes time. We will eventually make the one in the Dart VM better, but in the meantime we should try to reduce the allocations in our benchmark. Instead of creating all the substrings we can take advantage of the fact that JSON.startChunkedConversion returns a StringConversionSink. This class has a addSlice function which processes a part of a string without requiring to cut it into smaller pieces:

void addSlice(String chunk, int start, int end, bool isLast);

We can rewrite our inner loop as follows:

    for (i = 0; i < 100; i++) {
      chunkSw.reset();
      inSink.addSlice(input, i * chunkSize, (i + 1) * chunkSize, false);
      maxChunkTime = max(maxChunkTime, chunkSw.elapsedMicroseconds);
    }
    // Now add the last chunk (which could be non-empty because of the rounding
    // division).
    chunkSw.reset();
    inSink.addSlice(input, i * chunkSize, input.length, true);

When running this program again I got the following times:

$ dart json2.dart
Reading took: 23607us
Decoding took: 48389us
Decoding took at most 1324us per chunk, and 17893 in total
Reading took: 5594us
Decoding took: 22574us
Decoding took at most 554us per chunk, and 10813 in total
Reading took: 4122us
Decoding took: 10123us
Decoding took at most 561us per chunk, and 9939 in total
Reading took: 4425us
Decoding took: 9177us
Decoding took at most 456us per chunk, and 9658 in total
Reading took: 4450us
Decoding took: 9436us
Decoding took at most 14056us per chunk, and 24294 in total

The numbers don‘t really show an improvement, which isn’t too surprising: the chunks themselves are dwarfed by the created objects. It‘s still good to know one doesn’t need to allocate these strings.

Also note that the VM hasn‘t finished optimizing yet. Running the benchmarks for 50 runs gives much better numbers. The numbers aren’t very stable (which explains the differences for the reading and the synchronous decoding), but it shows that the slicing version keeps up with the substring one. Shown below are just the numbers for the 50th run:

// json.dart (substring)
Reading took: 4041us
Decoding took: 8825us
Decoding took at most 121us per chunk, and 9813 in total

// json2.dart (addSlice)
Reading took: 3311us
Decoding took: 7220us
Decoding took at most 97us per chunk, and 7279 in total

It is important to realize that chunking itself isn't enough to let the framework (like Flutter) do its work. It is, however, a good way of controlling when to yield. Here is, a very crude function that would yield as soon as it has used up too much time:

/// Decodes [input] in a chunked way and yields to the event loop
/// as soon as [maxMicroseconds] have elapsed.
Future<dynamic> decodeJsonChunked(String input, int maxMicroseconds) {
  const chunkCount = 100;  // Actually one more.

  var result;
  var outSink = new ChunkedConversionSink.withCallback((x) { result = x[0]; });
  var inSink = JSON.decoder.startChunkedConversion(outSink);
  var chunkSize = input.length ~/ chunkCount;

  int i = 0;

  Future<dynamic> addChunks() {
    var sw  = new Stopwatch()..start();
    while (i < 100) {
      inSink.addSlice(input, i * chunkSize, (i + 1) * chunkSize, false);
      i++;
      if (sw.elapsedMicroseconds > maxMicroseconds) {
        // Usually one has to pay attention not to chain too many futures,
        // but here we know that there are at most chunkCount linked futures.
        return new Future(addChunks);
      }
    }
    inSink.addSlice(input, i * chunkSize, input.length, true);
    return new Future.value(result);
  }

  return addChunks();
}

The important line is return new Future(addChunks);. It stops the current execution and schedules the remaining chunks in a new event-loop slice. This leaves the frameworks (Flutter or DOM) the time to update the screen.

All example code is available here. The JSON file was generated randomly using https://www.json-generator.com/.

Optional Positional and Named Parameters

One of our oldest language improvement requests is to allow positional and named optional parameters for a function at the same time (issue 7056). This restriction is only rarely limiting, but when it comes into play it can be frustrating. It is then often the reason for inconsistent or clunkier APIs. For example, for some time we considered to make onError the default pattern to handle errors in parsing:

int int.parse(String input, {int radix, int onError(String source)});
DateTime DateTime.parse(String input, {DateTime onError(String source)});
double double.parse(String input, {double onError(String source)});
...

This doesn't work nicely with parse functions that already take an optional positional parameter, like Uri.parse:

Uri Uri.parse(String uri, [int start = 0, int end]);

There is no good way to add onError as named parameter without breaking the signature for existing users. (Fwiw, we would have needed to break the signature of double.parse as well, but there it was because onError was a positional optional parameter, which is inconsistent to start out with).

As a different example, we are discussing to make a StreamController support synchronous and asynchronous dispatch at the same time. We would have liked to do this with just a named argument: controller.add(value, sync: true). However, the StreamController.addError function already takes an optional positional parameter: the stacktrace:

void addError(Object error, [StackTrace stackTrace]);

At the moment there is no way to add a named sync parameter to this signature.

The language team thus has discussed allowing named and positional optional parameters at the same time. We would still need to look at all corner cases, but it looks like, typing-wise, there aren't big roadblocks in supporting named and positional parameters at the same time. Conceptually, a function type is a subtype of another function type if the named parameters and positional parameters match independently.

void foo(int requiredPositional, [int optionalPositional], {int named});

foo is Function(int);  // => true.
foo is Function(int, int);  // => true.
foo is Function(int, {int named});  // => true.
foo is Function(int, int, {int named});  // => true.

foo is Function({int named});  // => false.
foo is Function(int, {int named, int otherNamed});  // => false.

While the language specification thus can be evolved to handle this (non-breaking) feature, we also need to implement it.

The Dart VM itself has full control of the assembly instructions it emits. As such, it can be modify its calling convention to support optional positional and named arguments at the same time. It might be a lot of work, but there are no fundamental limitations.

Dart2js is able to handle positional and named parameters at the same time without changing its calling convention too much. In dart2js, a function that takes optional arguments might be compiled to multiple functions where the majority forwards to the main function. We call these additional methods “stubs”.

Take the following (instance) methods in Dart:

void foo([int x]) { ... }
void bar({int named}) { ... }

o.foo();
o.foo(1);
o.bar();
o.bar(named: 0);

They would be compiled to the following functions in JavaScript:

prototype.foo$0 = function() { return this.foo$1(null); };
prototype.foo$1 = function(x) { ... };

prototype.bar$0 = function() { return this.bar$1(null); };
prototype.bar$1$named = function(named) { ... };

o.foo$0();
o.foo$1(1);
o.bar$0();
o.bar$1$named(0);

In theory this means that dart2js would need to provide an exponential number of stubs (for named parameters), but, in practice, that's not the case. Dart2js uses global information (looking at the whole program) to determine which calls can potentially happen for every method and only provides the ones that are relevant.

Supporting positional and named optional parameters for the same function would just require more stubs. That isn‘t to say that implementing the feature wouldn’t require significant work in dart2js, but it would not require any changes to the calling conventions.

For DDC, supporting this feature is much harder. DDC tries to use JavaScript like calling conventions:

foo(x) {
  if (x === void 0) x = null;
  ...
}
bar(opts) {
  let named = opts && 'named' in opts ? opts.named : null;
  ...
}

o.foo();
o.foo(1);
o.bar();
o.bar({named: 0});

This is a problem when there can be both optional and named at the same time:

void gee([int x], {int named}) { ... }

gee();
gee(1);
gee(named: 0);
gee(2, named: 3);

The direct translation doesn't work:

gee(x, opts) {
  if (x === void 0) x = null;
  let named = opts && 'named' in opts ? opts.named : null;
  ...
}

gee();
gee(1);
gee({named: 0});
gee(2, {named: 3});

If the function was invoked with just a named argument (named: 0), then x receives the {named: 0} argument as value, and named would be set to null.

It's not possible to just always pass in dummy values for the optional positional either, since the caller might not always know whether this parameter is needed:

Function({int named}) fun = o.gee;  // Tear-off.
fun(named: 0);  // Doesn't know that it should provide a dummy argument.

Similar issues arise when a class is subtyped, and the subtype adds new optional parameters. Callers to the original type wouldn't know that there are now new optional parameters they should pass in.

Without global knowledge (which DDC doesn‘t have), it’s just not possible to pass in the arguments at the correct place (positional at their positional places, and named arguments in the opts parameter).

A different approach would tag the named argument object. This way, the callee could just look at the given parameters and figure out which one contains the named arguments:

gee(x, opts) {
  // In real code the tag would need to be a symbol so it wouldn't
  // conflict with user-provided names.
  if (x && 'namedTag' in x) {
    opts = x;
    x = null;
  }
  len named = opts && 'named' in opts ? opts.named : null;
  ...
}

gee();
gee(1);
gee({named: 0, namedTag: true});
gee(2, {named: 3, namedTag: true});

Only functions that have both optional positional and named parameters would need to do this check. The other functions could trust (because of the typing system) that the given arguments are in the correct positions.

However, all callers, even when targeting functions that don't have positional optional parameters, have to add the tag. This is for the same reasons as for the dummy values above.

Function({int named}) fun = gee;
fun(named: 0);  // Needs to tag the named object.

This means that every call site with named arguments pays for a feature that few functions use.

We are still exploring other options to see if we can decrease the cost for this feature.