[metrics-team] collecting onionoo outage reason stats

Karsten Loesing karsten at torproject.org
Tue Apr 3 15:38:03 UTC 2018

On 2018-03-19 11:50, Karsten Loesing wrote:
> On 2018-03-16 19:26, nusenu wrote:
>> Hi,
> Hi nusenu,
>> would be great if you could reply to metrics-alerts notifications with the
>> reason for the outage once it is solved.
>> I'd like to collect the reasons, maybe we can use
>> them to improve and reduce the outages.
> In most cases the reason has been that the machine hosting the CollecTor
> virtual machine had an issue at a time when no sysadmin was around.
> One option to improve the situation is to move to a host that is more
> stable than the one we're currently on. This may be as simple as asking
> to move the virtual machine elsewhere, but only if there's another host
> available.
> Another option is to stop relying as much on a single host. We already
> made a huge step in this direction by syncing from a backup CollecTor
> instance, which is also the reason why we're not losing data. But this
> doesn't solve the issue if the primary CollecTor instance goes down.
> Adding even more redundancy requires writing more code, which is
> something we might not have the time for in the next 6 months.
> Maybe there are other, simpler options.
> I'd say let's discuss this more at this week's team meeting.

Quick update: at last week's team meeting we discussed another option,
which is to have Onionoo fetch descriptors from two CollecTor instances.
If one of them goes down temporarily, Onionoo won't be affected. I wrote
some code to do this, and early results look promising. See ticket
#25700 for details.
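
To illustrate the idea, here is a minimal sketch in Python. It is not the
code from the ticket: Onionoo's real implementation differs, and every
name below (fetch_from_all, fetch_one, Descriptors) is hypothetical. The
sketch assumes each instance can be asked for a mapping of descriptor
file names to contents, and that an unreachable instance raises OSError.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical type: maps a descriptor file name to its raw contents.
Descriptors = Dict[str, bytes]


def fetch_from_all(
    instances: Iterable[str],
    fetch_one: Callable[[str], Descriptors],
) -> Descriptors:
    """Merge descriptors from every reachable instance.

    fetch_one is an assumed helper that downloads from one instance and
    raises OSError (e.g. ConnectionError, TimeoutError) on failure.
    Raises RuntimeError only if no instance could be reached at all.
    """
    merged: Descriptors = {}
    failures: List[str] = []
    reached = 0
    for base_url in instances:
        try:
            fetched = fetch_one(base_url)
        except OSError as exc:
            # One instance being down is tolerated; remember why and move on.
            failures.append(f"{base_url}: {exc}")
            continue
        reached += 1
        for name, content in fetched.items():
            # Keep the first copy seen; later instances only fill gaps.
            merged.setdefault(name, content)
    if reached == 0:
        raise RuntimeError("no instance reachable: " + "; ".join(failures))
    return merged
```

With two instances configured, a temporary outage of either one leaves
the result unchanged as long as the other instance has the same
descriptors.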


All the best,

>> thanks,
>> nusenu
> All the best,
> Karsten
