A straightforward improvement to BWauth measurement crossed my mind.
It seems likely that part of the volatile, bipolar measurement problem is the overly fast feedback loop between weighting increases and the increased traffic that results.
For example, a BWauth measures 8 MByte/sec of bandwidth on day one and increases the assigned score to 20k. The relay's new weight attracts a pile-on of traffic, and by day three the relay measures only 2 MByte/sec of available bandwidth under the heavy load, so the BWauth crashes the assigned value back down to perhaps 10k.
Thus the weight of the relay swings wildly between two extremes.
The solution is for BWauths to time-average several days of measurements, probably with decaying weights for older samples. Ten days of samples with the oldest four assigned declining weights comes to mind as a starting point, though of course the number of days and the weighting parameters should be easy to adjust.
This would shift the BW weights assigned to relays gradually toward an equilibrium rather than producing wild swings.
It would also compensate for random sample timing, where a BWauth might test a relay at a busy time one day and under light load the next.
There should probably also be a downside threshold that triggers a reset of the accumulated data points, to handle relays that fail or deteriorate rapidly.
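One way the reset rule could look, as a self-contained sketch (the ratio and the plain mean are hypothetical choices for illustration):

```python
# Sketch of the downside-threshold reset: clear the accumulated
# samples when a relay's measured bandwidth collapses, so the
# averaged score tracks a failing relay quickly.

from collections import deque

RESET_RATIO = 0.25  # illustrative threshold, not a real parameter

def record_sample(history, sample, window=10):
    """Add a daily measurement, resetting the history on collapse."""
    if history:
        avg = sum(history) / len(history)  # plain mean for brevity
        if sample < RESET_RATIO * avg:
            history.clear()                # rapid failure: start over
    history.appendleft(sample)
    while len(history) > window:
        history.pop()
    return history

# A relay steady around 8 MByte/sec that drops to 1 MByte/sec has
# its history reset, so the score reflects the failure within a day
# instead of being propped up by stale high samples.
h = deque([8.0, 8.1, 7.9])
record_sample(h, 1.0)
print(list(h))  # -> [1.0]
```

Without such a reset, the averaging above would keep a dead or crippled relay's weight high for days, which is exactly the wrong direction for that case.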