Here we compute some statistics on the accuracy of the
Askemos network.
Accuracy is approximately the equivalent of "uptime"
for physical computers.
Within the network we have two types of availability: read availability and replica synchrony.
Since we run on the same basic hardware and, for the most part, the same software setup as most web services (PCs loaded with Linux or FreeBSD), average read availability is a fixed value of about 99.999% (roughly 5 minutes a year of downtime).
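
The relation between an availability percentage and yearly downtime can be checked with a few lines of arithmetic. This is only a sketch: the function name `downtime_minutes_per_year` is made up for illustration, and a 365-day year is assumed.

```python
# Assumption: a non-leap year of 365 days.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525600


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Expected yearly downtime in minutes for a given availability in percent."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * MINUTES_PER_YEAR


if __name__ == "__main__":
    # 99.999% availability leaves a little over 5 minutes of downtime a year.
    print(round(downtime_minutes_per_year(99.999), 2))
```
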
To measure replica synchrony, each node in the network is configured to send a local, public write message with its idea of the current time to its replica of the clock over there. The nodes try to agree on that common time; if agreement is reached within 5 seconds, they keep it and notify this agent here.
Here in turn we run the agreement protocol again to count how many minutes passed since the last agreed-on time tick. Furthermore we count the missing minutes as errors.
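
The counting step can be pictured roughly as follows. This is not the agent's actual code, only a minimal sketch: the function name `missed_minutes` is hypothetical, and it assumes that exactly one tick per minute is expected, so a gap of one minute between two agreed ticks means nothing was missed.

```python
from datetime import datetime


def missed_minutes(last_agreed: datetime, now_agreed: datetime) -> int:
    """Number of minute ticks missing between two agreed-on ticks.

    Assumption: one tick per minute is expected, so a gap of n whole
    minutes means n - 1 ticks were missed (never less than zero).
    """
    elapsed = int((now_agreed - last_agreed).total_seconds() // 60)
    return max(elapsed - 1, 0)


if __name__ == "__main__":
    last_tick = datetime(2012, 2, 4, 4, 40)
    this_tick = datetime(2012, 2, 4, 4, 43)
    # Three minutes passed, so the ticks for 04:41 and 04:42 count as errors.
    print(missed_minutes(last_tick, this_tick))
```
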
The whole test amounts to two runs of the protocol, each of which fails with a certain probability for a variety of reasons such as network latency, message drops, delays from database access, software updates, broken hardware etc. The timeout of 5 seconds is actually sometimes hard to meet for the WAN test setup, though on average we need less than 0.3 seconds per agreement over SSL. The logfile confirms that messages arrive as late as 20 seconds past the full minute.
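
To see how two chained runs affect the measured figure, here is a rough sketch. It assumes, purely for illustration, that both runs fail independently and with equal probability; neither assumption is claimed by the measurement itself, and the function names are hypothetical.

```python
import math


def combined_success(p_first: float, p_second: float) -> float:
    """Probability that both protocol runs succeed, assuming independence."""
    return p_first * p_second


def per_run_success(combined: float) -> float:
    """Per-run success probability if both runs were equally reliable."""
    return math.sqrt(combined)


if __name__ == "__main__":
    # Two runs at 90% each leave about 81% for the whole test.
    print(combined_success(0.9, 0.9))
    # Under these assumptions, a combined 86.9466% corresponds to roughly 93% per run.
    print(per_run_success(0.869466))
```

Under these assumptions the test as a whole always scores lower than either single run.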
The reason for the low availability measured here is that we measure the test network, which has downtime due to bugs.
With a_total the total number of agreements (failed and successful) and a_failure the number of failures, the average accuracy is roughly[1]:
| Measure | Value |
|---|---|
| Current Time | Sat, 04 Feb 2012 04:44:07 +0100 |
| Checked Time | Sat, 04 Feb 2012 04:43:59 +0100 |
| Failure Count | 859935 |
| Accuracy | 86.9466% |
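
As a cross-check of the figures above, here is a small sketch that assumes accuracy is simply the share of successful agreements, (a_total - a_failure) / a_total. That reading is an assumption, as are the function names. The total number of agreements is not shown in the table; the second function only infers it from the two published values.

```python
def accuracy_percent(a_total: int, a_failure: int) -> float:
    """Share of successful agreements in percent (assumed definition)."""
    return 100.0 * (a_total - a_failure) / a_total


def implied_total(a_failure: int, accuracy_pct: float) -> float:
    """Total number of agreements implied by a failure count and an accuracy."""
    return a_failure / (1.0 - accuracy_pct / 100.0)


if __name__ == "__main__":
    # 13 failures out of 100 agreements give 87% accuracy.
    print(accuracy_percent(100, 13))
    # The table's values imply a total of roughly 6.59 million agreements.
    print(round(implied_total(859935, 86.9466)))
```
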
You may visit this place in debug mode (read only) or read the source code of the agent.