Health Check Response Format for HTTP APIs
irakli@gmail.com
http://www.freshblurbs.com/
General
Internet-Draft

This document proposes a “health check response” format for API HTTP clients.

RFC EDITOR: please remove this section before publication

The issues list for this draft can be found at https://github.com/inadarei/rfc-healthcheck/issues.

The most recent draft is at https://inadarei.github.io/rfc-healthcheck/.

Recent changes are listed at https://github.com/inadarei/rfc-healthcheck/commits/master.

See also the draft’s current status in the IETF datatracker, at
https://datatracker.ietf.org/doc/draft-inadarei-api-healthcheck/.

The vast majority of modern APIs that drive data to web and mobile applications use
HTTP as a transport protocol. The health and uptime of these APIs
determine availability of the applications themselves. In distributed systems
built with a number of APIs, understanding the health status of the APIs and
making corresponding decisions, for failover or circuit-breaking, are essential
for providing highly available solutions.

There exists a wide variety of operational software that relies on the ability
to read the health check responses of APIs. There is currently no standard for the
health check output response, however, so most applications either rely on the
basic level of information included in HTTP status codes or use
task-specific formats.

Usage of task-specific or application-specific formats creates significant
challenges, preventing any meaningful interoperability across different
implementations and between different tooling.

Standardizing a format for health checks can provide any of a number of
benefits, including:

*  Flexible deployment - since operational tooling and API clients can rely on
   a rich, uniform format, they can be safely combined and substituted as needed.

*  Evolvability - new APIs, conforming to the standard, can safely be introduced
   in any environment and ecosystem that also conforms to the same standard,
   without costly coordination and testing requirements.

This document defines a “health check” format using the JSON format
for APIs to use as a standard point for the health information they offer.
Having a well-defined format for this purpose promotes good practice and
tooling.

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”,
“SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be
interpreted as described in “Key words for use in RFCs to Indicate Requirement
Levels” (RFC 2119).

An API Health Response Format (or, interchangeably, “health check response”)
uses the JSON format (see “The JavaScript Object Notation (JSON) Data
Interchange Format” in the references) and has the media type
“application/vnd.health+json”.

Note: this media type is not final, and will change before final publication.

Its content consists of a single mandatory root field and several optional
fields:

*  status: (required) indicates whether the service status is acceptable or not.
   API publishers SHOULD use the following values for the field:

   -  “pass”: healthy,
   -  “fail”: unhealthy, and
   -  “warn”: healthy, with some concerns.

   For “pass” and “warn” statuses, an HTTP response code in the 2xx-3xx range
   MUST be used. For “fail” status, an HTTP response code in the 4xx-5xx range
   MUST be used. In case of “warn” status, additional information SHOULD be
   provided, utilizing optional fields of the response.

*  serviceID: (optional) unique identifier of the service, in the application
   scope.

*  description: (optional) human-friendly description of the service.

*  memory: (optional) array of sizes for the currently utilized resident memory
   (in kilobytes) on each of the logical nodes backing the service. A logical
   node can be a physical server, a VM, a container, or any other logical unit
   that makes sense for the service publisher.

*  cpu: (optional) array of CPU utilization percentages on each of the logical
   nodes backing the service. A logical node can be a physical server, a VM, a
   container, or any other logical unit that makes sense for the service
   publisher.

*  uptime: (optional) current uptime in seconds since the last restart.

*  notes: (optional) array of notes relevant to the current state of health.

*  output: (optional) raw error output, in case of “fail” or “warn” states. This
   field SHOULD be omitted for “pass” state.

*  details: (optional) an array of objects optionally providing additional
   information regarding the various sub-components of the service.

*  links: (optional) an array of objects containing link relations and URIs
   for external links that MAY contain more information about the
   health of the endpoint. Per web-linking standards, a link relationship
   SHOULD either be a common/registered one or be indicated as a URI, to avoid
   name clashes.

For example:

Clients need to exercise care when reporting health information. Malicious
actors could use this information for orchestrating attacks. In some cases the
health check endpoints may need to require authentication and enforce role-based
access control.

TODO: application/vnd.health+json will be submitted for registration per
“Media Type Specifications and Registration Procedures” (RFC 6838).
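As an illustration of the response fields and the status-to-HTTP-code rule described earlier in this document, the following is a minimal sketch in Python, not a reference implementation. The function name `build_health_response`, the choice of 200 and 503 from the permitted ranges, and the sample field values are all assumptions made for the example; the draft itself only requires a 2xx-3xx code for “pass” and “warn” and a 4xx-5xx code for “fail”.

```python
import json

# Media type named in this draft (explicitly not final).
MEDIA_TYPE = "application/vnd.health+json"

# Assumption: 200 and 503 are illustrative picks from the required ranges
# (2xx-3xx for "pass"/"warn", 4xx-5xx for "fail").
HTTP_CODE_FOR_STATUS = {"pass": 200, "warn": 200, "fail": 503}

# Optional root fields listed in this draft.
OPTIONAL_FIELDS = ("serviceID", "description", "memory", "cpu", "uptime",
                   "notes", "output", "details", "links")


def build_health_response(status, **fields):
    """Return (http_code, headers, json_body) for a health check response."""
    if status not in HTTP_CODE_FOR_STATUS:
        raise ValueError(f"unknown status: {status!r}")
    unknown = set(fields) - set(OPTIONAL_FIELDS)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    if status == "pass":
        # "output" SHOULD be omitted for the "pass" state.
        fields.pop("output", None)
    body = {"status": status, **fields}
    headers = {"Content-Type": MEDIA_TYPE}
    return HTTP_CODE_FOR_STATUS[status], headers, json.dumps(body)


if __name__ == "__main__":
    code, headers, body = build_health_response(
        "warn",
        serviceID="example-service",      # hypothetical identifier
        uptime=1209600,
        notes=["disk usage above 80%"],   # hypothetical note
    )
    print(code, headers["Content-Type"])
    print(body)
```

A “warn” response built this way carries extra detail in the optional fields, as the draft recommends; a real service would likely also attach a Cache-Control header and restrict who can read the endpoint.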
Key words for use in RFCs to Indicate Requirement Levels
   In many standards track documents several words are used to signify the requirements in the specification. These words are often capitalized. This document defines these words as they should be interpreted in IETF documents. This document specifies an Internet Best Current Practices for the Internet Community, and requests discussion and suggestions for improvements.

Uniform Resource Identifier (URI): Generic Syntax
   A Uniform Resource Identifier (URI) is a compact sequence of characters that identifies an abstract or physical resource. This specification defines the generic URI syntax and a process for resolving URI references that might be in relative form, along with guidelines and security considerations for the use of URIs on the Internet. The URI syntax defines a grammar that is a superset of all valid URIs, allowing an implementation to parse the common components of a URI reference without knowing the scheme-specific requirements of every possible identifier. This specification does not define a generative grammar for URIs; that task is performed by the individual specifications of each URI scheme. [STANDARDS-TRACK]

Guidelines for Writing an IANA Considerations Section in RFCs
   Many protocols make use of identifiers consisting of constants and other well-known values. Even after a protocol has been defined and deployment has begun, new values may need to be assigned (e.g., for a new option type in DHCP, or a new encryption or authentication transform for IPsec). To ensure that such quantities have consistent values and interpretations across all implementations, their assignment must be administered by a central authority. For IETF protocols, that role is provided by the Internet Assigned Numbers Authority (IANA).
   In order for IANA to manage a given namespace prudently, it needs guidelines describing the conditions under which new values can be assigned or when modifications to existing values can be made. If IANA is expected to play a role in the management of a namespace, IANA must be given clear and concise instructions describing that role. This document discusses issues that should be considered in formulating a policy for assigning values to a namespace and provides guidelines for authors on the specific text that must be included in documents that place demands on IANA.
   This document obsoletes RFC 2434. This document specifies an Internet Best Current Practices for the Internet Community, and requests discussion and suggestions for improvements.

Web Linking
   This document specifies relation types for Web links, and defines a registry for them. It also defines the use of such links in HTTP headers with the Link header field. [STANDARDS-TRACK]

The JavaScript Object Notation (JSON) Data Interchange Format
   JavaScript Object Notation (JSON) is a lightweight, text-based, language-independent data interchange format. It was derived from the ECMAScript Programming Language Standard. JSON defines a small set of formatting rules for the portable representation of structured data.
   This document removes inconsistencies with other specifications of JSON, repairs specification errors, and offers experience-based interoperability guidance.

Hypertext Transfer Protocol (HTTP/1.1): Caching
   The Hypertext Transfer Protocol (HTTP) is a stateless application-level protocol for distributed, collaborative, hypertext information systems. This document defines HTTP caches and the associated header fields that control cache behavior or indicate cacheable response messages.

Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing
   The Hypertext Transfer Protocol (HTTP) is a stateless application-level protocol for distributed, collaborative, hypertext information systems. This document provides an overview of HTTP architecture and its associated terminology, defines the "http" and "https" Uniform Resource Identifier (URI) schemes, defines the HTTP/1.1 message syntax and parsing requirements, and describes related security concerns for implementations.

Media Type Specifications and Registration Procedures
   This document defines procedures for the specification and registration of media types for use in HTTP, MIME, and other Internet protocols. This memo documents an Internet Best Current Practice.

Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content
   The Hypertext Transfer Protocol (HTTP) is a stateless application-level protocol for distributed, collaborative, hypertext information systems. This document defines the semantics of HTTP/1.1 messages, as expressed by request methods, request header fields, response status codes, and response header fields, along with the payload of messages (metadata and body content) and mechanisms for content negotiation.

Thanks to Mike Amundsen, Erik Wilde, Justin Bachorik and Randall Randall for
their suggestions and feedback, and to Mark Nottingham for the blueprint for
authoring RFCs easily.

When making a health check endpoint available, there are a few things to keep
in mind:

*  A health response endpoint is best located at a memorable and commonly used
   URI, such as “health”, because it will help self-discoverability by clients.

*  Health check responses can be personalized. For example, you could advertise
   different URIs, and/or different kinds of link relations, to afford different
   clients access to additional health check information.

*  Health check responses must be assigned a freshness lifetime (e.g.,
   “Cache-Control: max-age=3600”) so that clients can determine how long they
   can cache them, to avoid overly frequent fetching and unintended DDoS-ing of
   the service.

*  Custom link relation types, as well as the URIs for variables, should lead to
   documentation for those constructs.

Clients might use health check responses in a variety of ways.

Note that the health check response is a “living” document; links from the
health check response MUST NOT be assumed to be valid beyond the freshness
lifetime of the health check response, as per HTTP’s caching model
(“Hypertext Transfer Protocol (HTTP/1.1): Caching”, RFC 7234).

As a result, clients ought to cache the health check response (as per
RFC 7234), to avoid fetching it before every interaction (which would
otherwise be required).

Likewise, a client encountering a 404 (Not Found) on a link is encouraged to obtain
a fresh copy of the health check response, to ensure that it is up-to-date.

There are a fair number of existing health check formats. However, these formats
have generally been optimized for particular use cases, and are less capable of
fitting into general scenarios optimized for interoperability.

Implementing them would add considerable complexity and the associated
potential for errors (both in the specification and by its users). For the sake
of interoperability and ease of implementation this specification doesn’t
attempt to create the most powerful format possible.
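
Returning to the client guidance earlier in this document (cache the health check response for its freshness lifetime, and obtain a fresh copy when a link returns 404), the following is a minimal sketch of one way a client might do that. The class name `HealthCheckCache`, the `fetch` callable, and its return shape are hypothetical, introduced only for illustration; the sketch honors only the `max-age` directive and ignores the rest of HTTP's caching model.

```python
import re
import time


def parse_max_age(cache_control):
    """Return max-age in seconds from a Cache-Control value, or 0 if absent."""
    match = re.search(r"(?:^|[,\s])max-age=(\d+)", cache_control or "",
                      re.IGNORECASE)
    return int(match.group(1)) if match else 0


class HealthCheckCache:
    """Caches one health check response for its freshness lifetime."""

    def __init__(self, fetch, clock=time.monotonic):
        # fetch: hypothetical callable returning
        # (cache_control_header_value, parsed_response_body).
        self._fetch = fetch
        self._clock = clock
        self._body = None
        self._expires_at = None

    def get(self):
        """Return the cached response, re-fetching once it is stale."""
        if self._expires_at is None or self._clock() >= self._expires_at:
            self.refresh()
        return self._body

    def refresh(self):
        """Fetch a fresh copy, e.g. after a link from the response gave 404."""
        cache_control, body = self._fetch()
        self._body = body
        self._expires_at = self._clock() + parse_max_age(cache_control)
        return body
```

A caller would pass in whatever function performs the actual HTTP request, call `get()` before each interaction, and call `refresh()` when a link taken from the cached response turns out to be dead. A response without `max-age` is treated as immediately stale, so it is fetched on every call.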