tl;dr: When building a circuit, measuring the RTT a single time could provide better latency and anonymity without affecting throughput. Multiple measurements could be used to run real-time applications like VoIP or to optimize throughput.
Even though the Tor network is currently in an unusual state, I have spent the last few weeks looking into stream-RTT data of circuits. I gathered the data shortly before and at the beginning of the huge botnet usage. This is what I have found out:

As assumed, stream-RTT measurements of a single circuit do not sit at a fixed value but are distributed, since they are subject to multiple influences. After comparing the stream-RTT distributions of multiple circuits, I found many different shapes and realized that no single distribution fits them all.

I used the Time-To-First-Byte (TTFB) for fetching a small website over HTTP to approximate the latency of a given circuit. I used different methods to check the correlation between the RTT of a circuit and its TTFB - all of them indicate a very high correlation. Hence, the stream-RTTs of a circuit make a good estimator for its TTFB and therefore its latency.

In terms of latency, using a single stream-RTT measurement ("First-RTT") performs better than the currently used method, CBT. So far I haven't done any testing or calculations on the other metrics, bandwidth and anonymity. I would assume the former to be unaffected by First-RTT. The latter could probably be slightly increased if the percentage of discarded circuits were reduced from 20% with CBT to 10% or 15% with First-RTT - while still achieving a minor improvement in latency. Nevertheless, I would not recommend using First-RTT as a method for providing low-latency circuits to applications, because it only gives a small hint about the quality of a circuit and cannot guarantee that certain latency properties hold for a given circuit. Still, First-RTT works pretty well considering the minimal effort it takes.
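To make the First-RTT idea a bit more concrete, here is a minimal sketch of how discarding the slowest share of circuits based on one RTT measurement each could look. It is only an illustration: the function names, the empirical-quantile cutoff and the sample values are my assumptions, not how Tor implements CBT or how the numbers above were produced.

```python
import math


def first_rtt_cutoff(first_rtts, discard_fraction=0.20):
    """Return the RTT threshold above which circuits would be discarded.

    first_rtts: one RTT sample (in seconds) per previously built circuit.
    discard_fraction: share of slowest circuits to drop (hypothetical knob,
    e.g. 0.20 to mirror the 20% mentioned for CBT, or 0.10 / 0.15).
    """
    if not first_rtts:
        raise ValueError("need at least one First-RTT measurement")
    if not 0.0 <= discard_fraction < 1.0:
        raise ValueError("discard_fraction must be in [0, 1)")
    ordered = sorted(first_rtts)
    # Number of slowest circuits to drop, rounded down; always keep >= 1.
    n_discard = math.floor(len(ordered) * discard_fraction)
    keep = max(1, len(ordered) - n_discard)
    return ordered[keep - 1]


def keep_circuit(first_rtt, cutoff):
    """Decide whether a freshly built circuit passes the First-RTT check."""
    return first_rtt <= cutoff


if __name__ == "__main__":
    # Made-up example values, not measured data.
    history = [0.20, 0.30, 0.25, 1.50, 0.40, 0.35, 0.90, 0.28, 0.50, 0.45]
    cutoff = first_rtt_cutoff(history, discard_fraction=0.20)
    print("cutoff:", cutoff)
    print("keep 0.45s circuit:", keep_circuit(0.45, cutoff))
    print("keep 1.20s circuit:", keep_circuit(1.20, cutoff))
```

Lowering discard_fraction to 0.10 or 0.15 raises the cutoff, so fewer circuits are thrown away, which is the trade-off between anonymity and latency described above.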
Additionally, I played around a lot with methods that provide a better estimator for the latency properties of a given circuit. But they all need far more than a single measurement and are therefore out of scope for the common case. Besides, they cannot protect against suddenly changing circuit conditions. They could, however, be used to enforce an application-specific maximum RTT for real-time applications like VoIP. With similar techniques it should be possible to detect circuits that include a node that's within its bandwidth limit. This could be used to provide high-bandwidth circuits for applications like BitTorrent.
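As a rough illustration of the multi-measurement case, the sketch below checks whether a high percentile of several stream-RTT samples from one circuit stays below an application-specific maximum. This is not one of the estimators I experimented with; the nearest-rank percentile, the 95% level, the minimum sample count and all names are assumptions chosen for the example.

```python
import math


def nearest_rank_percentile(samples, pct):
    """Empirical percentile using the nearest-rank method."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[rank - 1]


def meets_rtt_budget(samples, max_rtt, pct=95, min_samples=10):
    """Return True if the circuit looks usable for a given RTT budget.

    samples: stream-RTT measurements (seconds) of a single circuit.
    max_rtt: application-specific maximum RTT, e.g. for VoIP.
    pct, min_samples: hypothetical tuning parameters.
    """
    if len(samples) < min_samples:
        # Too few measurements to say anything about the distribution.
        return False
    return nearest_rank_percentile(samples, pct) <= max_rtt


if __name__ == "__main__":
    # Made-up example values, not measured data.
    rtts = [0.21, 0.25, 0.22, 0.30, 0.24, 0.23, 0.26, 0.28, 0.22, 0.27]
    print(meets_rtt_budget(rtts, max_rtt=0.40))
    print(meets_rtt_budget(rtts, max_rtt=0.25))
```

Such a check only describes the measurements seen so far, so it still cannot protect against circuit conditions that change suddenly afterwards.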
Best, Robert