I wrote a small Python script [1] to catch the event.reason values for ORStatus.CLOSED. The output is something like this:
orstatus.py --ctrlport 9051
DONE FAF3236D37B0B18D8438C46317940F642E296924
IOERROR 917A0A924DA50B46CD740924AB42B237A831E182
DONE DCEA2A6D8034E164A4FFDD8AFF997E2F6FB2ECF8 50.7.179.202 443 v4 0.4.4.5
IOERROR CDE4149F0DC65A7BE1AE440340BE1C7A18135E29 192.250.236.130 80 v4 0.4.4.6
DONE 24E2F139121D4394C54B5BCC368B3B411857C413 204.13.164.118 443 v4 0.4.4.6
DONE 24E2F139121D4394C54B5BCC368B3B411857C413 204.13.164.118 443 v4 0.4.4.6
IOERROR 8A51DC1ACBBC411E3B4D124F44A98F5ADC6A8984
IOERROR 57C799B990D61FB49E22E792670F5B28FE2E7FEA
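For readers who have not looked at [1], here is a rough sketch of how such a listener could be built with the stem library. It is not the code from [1]. The names format_closed and main are made up for this illustration, and it prints only the reason and the fingerprint, without the address, port and version columns that some of the lines above carry. It also assumes the control port accepts authentication without a password argument.

```python
def format_closed(event):
    """Return 'REASON FINGERPRINT' for a closed OR connection, else None."""
    # stem enum values are plain strings, so ORStatus.CLOSED compares equal to "CLOSED"
    if event.status != "CLOSED":
        return None
    # the fingerprint can be missing when Tor only reports address:port
    relay = event.endpoint_fingerprint or event.endpoint
    return f"{event.reason} {relay}"


def main(ctrlport=9051):
    # stem is imported here so that format_closed() stays usable without it
    import time
    from stem.control import Controller, EventType

    def handler(event):
        line = format_closed(event)
        if line is not None:
            print(line, flush=True)

    with Controller.from_port(port=ctrlport) as controller:
        controller.authenticate()
        controller.add_event_listener(handler, EventType.ORCONN)
        # events arrive on a background thread, so just keep the process alive
        while controller.is_alive():
            time.sleep(1)
```

Calling main() starts listening on the given control port. Nothing runs on import.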
Statistics collected over a day or so showed that the majority of systems had 0-10 IOERRORs, whereas a few dozen systems were failing all the time and had 100x more IOERRORs in the same time frame. I do wonder, however, whether such statistics can give a meaningful answer as to whether a relay itself behaves well or not.
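The per-relay tally can be reproduced from the script's output with a few lines of Python. This is only a sketch: tally_closures is a hypothetical helper, and it assumes each output line starts with the reason followed by the fingerprint, as in the sample above. Besides the raw IOERROR count it also keeps the total number of closures per relay, since a busy relay may collect more IOERRORs simply because it sees more connections.

```python
from collections import Counter


def tally_closures(lines):
    """Return (ioerrors, totals): per-fingerprint Counters of IOERROR and all closures."""
    ioerrors, totals = Counter(), Counter()
    for line in lines:
        fields = line.split()
        # expect at least "<REASON> <FINGERPRINT>"; skip anything else
        if len(fields) < 2:
            continue
        reason, fingerprint = fields[0], fields[1]
        totals[fingerprint] += 1
        if reason == "IOERROR":
            ioerrors[fingerprint] += 1
    return ioerrors, totals


sample = [
    "DONE FAF3236D37B0B18D8438C46317940F642E296924",
    "IOERROR 917A0A924DA50B46CD740924AB42B237A831E182",
    "IOERROR CDE4149F0DC65A7BE1AE440340BE1C7A18135E29 192.250.236.130 80 v4 0.4.4.6",
    "DONE 24E2F139121D4394C54B5BCC368B3B411857C413 204.13.164.118 443 v4 0.4.4.6",
    "DONE 24E2F139121D4394C54B5BCC368B3B411857C413 204.13.164.118 443 v4 0.4.4.6",
]

ioerrors, totals = tally_closures(sample)
for fingerprint, total in totals.most_common():
    share = ioerrors[fingerprint] / total
    print(f"{fingerprint} ioerrors={ioerrors[fingerprint]} closures={total} share={share:.2f}")
```

Whether the share says more about the relay than the raw count is still an open question. Failures can also come from the network path or from the observing side.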
[1] https://github.com/toralf/torutils/blob/master/orstatus.py