uptime
The proportion of time a service is reachable and responding correctly, usually expressed as a percentage over a window.
Uptime is a measure of availability: the fraction of time a service is responsive and operating normally, expressed as a percentage of a defined window (a week, a month, a year). 99.9% monthly uptime means the service was down for at most 43 minutes and 49 seconds in a 30-day period.
The number is only as meaningful as the underlying check. A service can post a high uptime percentage by checking from a single region every five minutes against a /healthz endpoint that doesn’t actually exercise the user-facing code path. Pay attention to check frequency, check breadth (regions, endpoints), and what the check is asserting.
uptime monitoring
Periodically probing a URL or endpoint from an external location to detect outages and degraded performance.
downtime
Any period during which a service is unavailable or degraded for end users. The inverse of uptime.
five nines
99.999% availability — about five minutes of downtime per year. The aspirational target for critical infrastructure.
SLA
Service-level agreement — a contractual promise of a target metric, often availability, with consequences for missing it.