This article was first published on the JSON Schema Blog and is canonically located at https://json-schema.org/blog/posts/bundling-json-schema-compound-documents
I’ve been known to say “If you haven’t rewritten your OpenAPI bundling implementation recently, then you don’t support OpenAPI 3.1”. This observation may be true, but perhaps some more detail would be helpful? When implementing support for OAS 3.1 and JSON Schema draft 2020-12 in oas-kit, reading the sections of the JSON Schema spec on bundling compound documents, I still wasn’t totally clear on what was expected of compliant tooling. Thankfully, Ben Hutton is here to set the record straight with a worked example. – Mike Ralphson, OAI TSC
Bundling has renewed importance
OpenAPI has long since put the spotlight on JSON Schema, and the release of OpenAPI 3.1 has huge implications for the future of both projects. I’m truly excited.
Developers of platforms and libraries that use OpenAPI haven’t had such a shake-up before, and my feeling is that it may take more than a few releases to correctly implement all the shiny new features full JSON Schema has to offer.
While the changes from JSON Schema draft-04 to draft 2020-12 are vast, and the subject of more blog posts than are likely interesting, one of the key “features” of draft 2020-12 is a defined bundling process. (draft-04 is the version of JSON Schema that OAS used prior to version 3.1.0; or rather, a subset/superset of it.)
Indeed, bundling, if anything, is going to be more important to get right than ever. OAS 3.1 ushering in full JSON Schema support dramatically increases the likelihood that developers with existing JSON Schema documents will use them by reference in new and updated OpenAPI definitions. Ultimate source of truth matters, and it’s often the JSON Schemas.
Many tools don’t support referencing external resources. Bundling is a convenient way to package up schema resources spread across multiple files in a single file for use elsewhere, such as an OpenAPI document.
Existing solutions? New solutions!
There are several libraries which offer bundling solutions; however, they all have caveats, and I haven’t seen any to date which are fully JSON Schema aware. The most popular of these libraries is json-schema-ref-parser; however, it reports that it was not intended to be JSON Schema aware, and is only intended to cover the JSON Reference specification (which has since been folded back into the JSON Schema specification).
We are hoping to provide you with a canonical implementation (Right, Mike?!) and enough information to get started building your own in your language of choice. (Although, it’s always best to read the full specification when developing implementations.)
Bundling fundamentals
Firstly, let’s visit some key definitions in JSON Schema draft 2020-12. The $id keyword is used to identify a “schema resource”. In the example below, the $id is https://jsonschema.dev/schemas/mixins/integer for the resource.
{
  "$id": "https://jsonschema.dev/schemas/mixins/integer",
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "description": "Must be an integer",
  "type": "integer"
}
A “Compound Schema Document” is a JSON document which has multiple embedded JSON Schema Resources. Below is a simplified example of one we’ll unpack a bit later.
{
  "$id": "https://jsonschema.dev/schemas/examples/non-negative-integer-bundle",
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "description": "Must be a non-negative integer",
  "$comment": "A JSON Schema Compound Document. Aka a bundled schema.",
  "$defs": {
    "https://jsonschema.dev/schemas/mixins/integer": {
      "$schema": "https://json-schema.org/draft/2020-12/schema",
      "$id": "https://jsonschema.dev/schemas/mixins/integer",
      "description": "Must be an integer",
      "type": "integer"
    },
    "https://jsonschema.dev/schemas/mixins/non-negative": {
      "$schema": "https://json-schema.org/draft/2020-12/schema",
      "$id": "https://jsonschema.dev/schemas/mixins/non-negative",
      "description": "Not allowed to be negative",
      "minimum": 0
    },
    "nonNegativeInteger": {
      "allOf": [
        { "$ref": "/schemas/mixins/integer" },
        { "$ref": "/schemas/mixins/non-negative" }
      ]
    }
  },
  "$ref": "#/$defs/nonNegativeInteger"
}
Last, let’s look at the carefully crafted definition of “bundling” according to the JSON Schema specification:
“The bundling process for creating a Compound Schema Document is defined as taking references (such as “$ref”) to an external Schema Resource and embedding the referenced Schema Resources within the referring document. Bundling SHOULD be done in such a way that all URIs (used for referencing) in the base document and any referenced/embedded documents do not require altering.”
With these definitions in mind, now we can look at the defined bundling process for JSON Schema resources! We will only cover the ideal situation in this article. The goal here is to have no external Schema Resources.
Note, this article does NOT cover “total dereferencing”, which is removing all uses of $ref from a schema. This is not advised, and is not always even possible, such as when there are self references.
Bundling Simple External Resources
In our first example, we have an ideal situation for bundling. Each schema has an $id and $schema defined, making the bundling process simple. We’ll cover various other situations and edge cases in further examples, but having each resource define its own identity and dialect is always preferable. Our primary schema resource references two other schema resources using the in-place applicator $ref with the value being a relative URI. The relative URI is resolved against the base URI, which in this instance is found in the primary schema resource’s $id value. By combining “integer” and “non-negative” schemas, we create a “non-negative integer” schema.
{
  "$id": "https://jsonschema.dev/schemas/mixins/integer",
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "description": "Must be an integer",
  "type": "integer"
}
{
  "$id": "https://jsonschema.dev/schemas/mixins/non-negative",
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "description": "Not allowed to be negative",
  "minimum": 0
}
{
  "$id": "https://jsonschema.dev/schemas/examples/non-negative-integer",
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "description": "Must be a non-negative integer",
  "$comment": "A JSON Schema that uses multiple external references",
  "$defs": {
    "nonNegativeInteger": {
      "allOf": [
        { "$ref": "/schemas/mixins/integer" },
        { "$ref": "/schemas/mixins/non-negative" }
      ]
    }
  },
  "$ref": "#/$defs/nonNegativeInteger"
}
Should the “non-negative-integer” schema be used as the primary schema in an implementation, the other schemas would need to be available to the implementation. At this point, exactly how that implementation loads the schemas doesn’t matter, as they have fully qualified URIs as their identity, defined in $id. Any implementation that loads schemas should build an internal local index that maps the schema URIs defined in $id to schema resources.
Remember, any schema which provides a value for $id is considered a Schema Resource.
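As a rough illustration of such an index, here is a minimal Python sketch. The function name `build_index` and the choice of a plain dictionary are assumptions made for this example, not something the specification prescribes, and the sketch only handles the ideal case where every schema declares an absolute $id.

```python
from typing import Any

Schema = dict[str, Any]

INTEGER: Schema = {
    "$id": "https://jsonschema.dev/schemas/mixins/integer",
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "description": "Must be an integer",
    "type": "integer",
}

NON_NEGATIVE: Schema = {
    "$id": "https://jsonschema.dev/schemas/mixins/non-negative",
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "description": "Not allowed to be negative",
    "minimum": 0,
}


def build_index(schemas: list[Schema]) -> dict[str, Schema]:
    """Map each schema's $id to the schema resource it identifies."""
    index: dict[str, Schema] = {}
    for schema in schemas:
        schema_id = schema.get("$id")
        if not isinstance(schema_id, str):
            # Schemas without $id are outside the ideal case covered here.
            raise ValueError("schema has no $id, so it cannot be indexed")
        index[schema_id] = schema
    return index


# How the schemas were loaded (file, network, memory) no longer matters here.
index = build_index([INTEGER, NON_NEGATIVE])
```

Once the index exists, looking up a resource is a dictionary access by its fully qualified URI, regardless of where the schema was originally stored.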
Let’s resolve (dereference) one of the references in our primary schema. "$ref": "/schemas/mixins/integer" resolves to the fully qualified URI https://jsonschema.dev/schemas/mixins/integer by first determining the base URI and then resolving the relative URI against that base URI. The implementation should then check its internal index of schema identifiers and schema resources, find a match, and use the appropriate previously loaded schema resource.
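The resolution step can be tried out in a few lines of Python. This is only a sketch: it assumes the standard library’s `urllib.parse.urljoin` is an acceptable stand-in for RFC 3986 reference resolution, and the helper name `resolve_ref` is made up for this example.

```python
from urllib.parse import urljoin


def resolve_ref(base_uri: str, ref: str) -> str:
    """Resolve a $ref value against the base URI of its schema resource."""
    # urljoin applies the usual relative-reference resolution rules.
    return urljoin(base_uri, ref)


# The base URI comes from the primary schema's $id.
BASE_URI = "https://jsonschema.dev/schemas/examples/non-negative-integer"

# An absolute-path reference keeps the scheme and host of the base URI.
integer_uri = resolve_ref(BASE_URI, "/schemas/mixins/integer")

# A fragment-only reference stays within the same schema resource.
local_uri = resolve_ref(BASE_URI, "#/$defs/nonNegativeInteger")

print(integer_uri)
print(local_uri)
```

The first result is the key to look up in the internal index; the second points back into the primary schema itself, so no external resource is needed for it.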
Now the bundling itself can be done. The previously externally referenced schemas are copied into $defs in our primary schema, as is. The keys for the $defs object are the identifying URIs, but they can be anything, as those values won’t be referenced (they could be UUIDs if you like). Looking at our final bundled schema… I mean “Compound Schema Document”, we now have multiple Schema Resources embedded in a single schema document.
{
  "$id": "https://jsonschema.dev/schemas/examples/non-negative-integer-bundle",
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "description": "Must be a non-negative integer",
  "$comment": "A JSON Schema Compound Document. Aka a bundled schema.",
  "$defs": {
    "https://jsonschema.dev/schemas/mixins/integer": {
      "$schema": "https://json-schema.org/draft/2020-12/schema",
      "$id": "https://jsonschema.dev/schemas/mixins/integer",
      "description": "Must be an integer",
      "type": "integer"
    },
    "https://jsonschema.dev/schemas/mixins/non-negative": {
      "$schema": "https://json-schema.org/draft/2020-12/schema",
      "$id": "https://jsonschema.dev/schemas/mixins/non-negative",
      "description": "Not allowed to be negative",
      "minimum": 0
    },
    "nonNegativeInteger": {
      "allOf": [
        { "$ref": "/schemas/mixins/integer" },
        { "$ref": "/schemas/mixins/non-negative" }
      ]
    }
  },
  "$ref": "#/$defs/nonNegativeInteger"
}
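The copy-into-$defs step can be sketched in Python as follows. This is an illustration under stated assumptions rather than a reference implementation: the names `find_refs` and `bundle` are hypothetical, the walk is naive (it visits every nested value instead of only schema keywords), and unlike the example above it leaves the primary schema’s $id and $comment untouched.

```python
import copy
from typing import Any, Iterator
from urllib.parse import urldefrag, urljoin

Schema = dict[str, Any]
DIALECT = "https://json-schema.org/draft/2020-12/schema"
MIXINS = "https://jsonschema.dev/schemas/mixins/"

INTEGER: Schema = {"$id": MIXINS + "integer", "$schema": DIALECT, "type": "integer"}
NON_NEGATIVE: Schema = {"$id": MIXINS + "non-negative", "$schema": DIALECT, "minimum": 0}
ROOT: Schema = {
    "$id": "https://jsonschema.dev/schemas/examples/non-negative-integer",
    "$schema": DIALECT,
    "$defs": {
        "nonNegativeInteger": {
            "allOf": [
                {"$ref": "/schemas/mixins/integer"},
                {"$ref": "/schemas/mixins/non-negative"},
            ]
        }
    },
    "$ref": "#/$defs/nonNegativeInteger",
}


def find_refs(node: Any, base_uri: str) -> Iterator[str]:
    """Yield the absolute URI (fragment removed) of every $ref under node."""
    if isinstance(node, dict):
        if isinstance(node.get("$id"), str):
            # An $id changes the base URI for everything beneath it.
            base_uri = urljoin(base_uri, node["$id"])
        if isinstance(node.get("$ref"), str):
            yield urldefrag(urljoin(base_uri, node["$ref"])).url
        for value in node.values():
            yield from find_refs(value, base_uri)
    elif isinstance(node, list):
        for item in node:
            yield from find_refs(item, base_uri)


def bundle(root: Schema, index: dict[str, Schema]) -> Schema:
    """Copy every externally referenced resource into the root's $defs, as is."""
    bundled = copy.deepcopy(root)
    defs = bundled.setdefault("$defs", {})
    seen = {bundled["$id"]}
    pending = [bundled]
    while pending:
        current = pending.pop()
        # Collect the URIs first so that $defs is not mutated mid-walk.
        for uri in sorted(set(find_refs(current, bundled["$id"]))):
            if uri in seen:
                continue
            seen.add(uri)
            resource = copy.deepcopy(index[uri])  # KeyError if never loaded
            defs[uri] = resource  # the key is arbitrary; the URI is readable
            pending.append(resource)  # embedded resources may have refs too
    return bundled


index = {schema["$id"]: schema for schema in (INTEGER, NON_NEGATIVE)}
bundled = bundle(ROOT, index)
```

Note that no $ref value is rewritten: the references keep resolving through the $id of each embedded resource.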
When the bundled schema is initially loaded and evaluated, the implementation should create its own internal index of schema identifiers and schema resources, just as before. The relative URIs used to reference those schema resources need not change.
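To make that concrete, here is a small Python sketch of how a consumer of the bundled schema might rebuild its index by walking the document. The function name `index_embedded` is an assumption for this example, descriptions are trimmed from the schema for brevity, and the walk is naive in that it treats any nested object with a string $id as a schema resource.

```python
from typing import Any, Optional
from urllib.parse import urljoin

Schema = dict[str, Any]
DIALECT = "https://json-schema.org/draft/2020-12/schema"

COMPOUND: Schema = {
    "$id": "https://jsonschema.dev/schemas/examples/non-negative-integer-bundle",
    "$schema": DIALECT,
    "$defs": {
        "https://jsonschema.dev/schemas/mixins/integer": {
            "$schema": DIALECT,
            "$id": "https://jsonschema.dev/schemas/mixins/integer",
            "type": "integer",
        },
        "https://jsonschema.dev/schemas/mixins/non-negative": {
            "$schema": DIALECT,
            "$id": "https://jsonschema.dev/schemas/mixins/non-negative",
            "minimum": 0,
        },
        "nonNegativeInteger": {
            "allOf": [
                {"$ref": "/schemas/mixins/integer"},
                {"$ref": "/schemas/mixins/non-negative"},
            ]
        },
    },
    "$ref": "#/$defs/nonNegativeInteger",
}


def index_embedded(
    node: Any, base_uri: str = "", index: Optional[dict[str, Schema]] = None
) -> dict[str, Schema]:
    """Collect every schema resource (anything with $id) in a document."""
    if index is None:
        index = {}
    if isinstance(node, dict):
        if isinstance(node.get("$id"), str):
            # Resolve the $id against the enclosing resource's base URI.
            base_uri = urljoin(base_uri, node["$id"])
            index[base_uri] = node
        for value in node.values():
            index_embedded(value, base_uri, index)
    elif isinstance(node, list):
        for item in node:
            index_embedded(item, base_uri, index)
    return index


index = index_embedded(COMPOUND)

# The unchanged relative reference still finds its target through the index.
target = index[urljoin(COMPOUND["$id"], "/schemas/mixins/integer")]
```

The $defs keys play no part in the lookup; only the $id values do, which is why the keys can be anything.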
The simplest way to see this bundled schema working as expected is to paste it into https://json-schema.hyperjump.io and then try different values for the instance. I’m hopeful to bring several updates to https://jsonschema.dev over the next few months, but times are busy as we continue to elevate JSON Schema as an organisation.
It’s worth remembering that the example in this article shows the ideal situation, when best practices have been followed. The JSON Schema specification does define additional processes for non-ideal situations and edge cases (such as when $id or $schema are not set); however, some solutions may be only indirectly related to Compound JSON Schema Documents. For example, establishing the base URI follows the steps laid out in RFC 3986, which JSON Schema does not redefine.
OpenAPI Specification Example
Let’s look at an example of how this might work with an OpenAPI definition.
openapi: 3.1.0
info:
  title: API
  version: 1.0.0
components:
  schemas:
    non-negative-integer:
      $ref: 'https://jsonschema.dev/schemas/examples/non-negative-integer'
We start with our input OpenAPI 3.1.0 specification document. For brevity, we’re only showing the components section with a single component, but let’s assume some other part of the document uses the component schema “non-negative-integer”.
“non-negative-integer” has a single reference to a JSON Schema resource. The reference URI is an absolute URI, including domain and path, meaning there’s no need to do any “resolve the relative URI against the base URI” dance.
All the schemas required to resolve and bundle the reference are provided to the bundling tooling. After the schemas are loaded into the implementation, their originating physical location no longer matters.
openapi: 3.1.0
info:
  title: API
  version: 1.0.0
components:
  schemas:
    # This name has not changed, or been replaced, as it already existed and is likely to be referenced elsewhere
    non-negative-integer:
      # This Reference URI hasn't changed
      $ref: 'https://jsonschema.dev/schemas/examples/non-negative-integer'
    # The path name already existed. This key doesn't really matter. It could be anything. It's just for human readers. It could be an MD5!
    non-negative-integer-2:
      $schema: 'https://json-schema.org/draft/2020-12/schema'
      $id: 'https://jsonschema.dev/schemas/examples/non-negative-integer'
      description: Must be a non-negative integer
      $comment: A JSON Schema that uses multiple external references
      $defs:
        nonNegativeInteger:
          allOf:
            # These references remain unchanged because they rely on the base URI of this schema resource
            - $ref: /schemas/mixins/integer
            - $ref: /schemas/mixins/non-negative
      $ref: '#/$defs/nonNegativeInteger'
    integer:
      $schema: 'https://json-schema.org/draft/2020-12/schema'
      $id: 'https://jsonschema.dev/schemas/mixins/integer'
      description: Must be an integer
      type: integer
    non-negative:
      $schema: 'https://json-schema.org/draft/2020-12/schema'
      $id: 'https://jsonschema.dev/schemas/mixins/non-negative'
      description: Not allowed to be negative
      minimum: 0
The schemas are inserted into the components/schemas location of the OAS document. The keys used in the schemas object have no importance for reference resolution, although you will want to avoid potential duplications. References need not change, and a processor of the resulting bundled or Compound Document should look for the use of embedded Schema Resources within the OAS document, keeping track of the $id values.
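The insertion step, including the duplicate-key avoidance, might look like the Python sketch below. It works on plain dictionaries (as produced by any YAML or JSON parser), and the helper names `unique_key` and `embed_in_components`, as well as the “last path segment plus numeric suffix” naming scheme, are assumptions chosen to mirror the example above; any collision-free key would do.

```python
import copy
from typing import Any

Schema = dict[str, Any]
DIALECT = "https://json-schema.org/draft/2020-12/schema"

OPENAPI_DOC: dict[str, Any] = {
    "openapi": "3.1.0",
    "info": {"title": "API", "version": "1.0.0"},
    "components": {
        "schemas": {
            "non-negative-integer": {
                "$ref": "https://jsonschema.dev/schemas/examples/non-negative-integer"
            }
        }
    },
}

RESOURCES: list[Schema] = [
    {
        "$schema": DIALECT,
        "$id": "https://jsonschema.dev/schemas/examples/non-negative-integer",
        "$defs": {
            "nonNegativeInteger": {
                "allOf": [
                    {"$ref": "/schemas/mixins/integer"},
                    {"$ref": "/schemas/mixins/non-negative"},
                ]
            }
        },
        "$ref": "#/$defs/nonNegativeInteger",
    },
    {"$schema": DIALECT, "$id": "https://jsonschema.dev/schemas/mixins/integer", "type": "integer"},
    {"$schema": DIALECT, "$id": "https://jsonschema.dev/schemas/mixins/non-negative", "minimum": 0},
]


def unique_key(preferred: str, existing: dict[str, Any]) -> str:
    """Return preferred, or preferred with a numeric suffix if it is taken."""
    if preferred not in existing:
        return preferred
    suffix = 2
    while f"{preferred}-{suffix}" in existing:
        suffix += 1
    return f"{preferred}-{suffix}"


def embed_in_components(doc: dict[str, Any], resources: list[Schema]) -> dict[str, Any]:
    """Copy each schema resource, unchanged, into components/schemas."""
    result = copy.deepcopy(doc)
    schemas = result.setdefault("components", {}).setdefault("schemas", {})
    for resource in resources:
        # The key is only for human readers; the $id is what gets resolved.
        name = resource["$id"].rsplit("/", 1)[-1]
        schemas[unique_key(name, schemas)] = copy.deepcopy(resource)
    return result


bundled_doc = embed_in_components(OPENAPI_DOC, RESOURCES)
print(list(bundled_doc["components"]["schemas"]))
```

The existing `non-negative-integer` component and its $ref are left exactly as they were, so anything else in the document that points at it keeps working.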
But what about…
The astute among you might have noticed that Compound Documents may not be correctly validated using a meta-schema for the dialect defined at the document root. One of our principal contributors distilled a great explanation which he has agreed to let us share with you.
“If an embedded schema has a different $schema than the parent schema, then a Compound Schema Document can’t be validated against a meta-schema without deconstructing it into separate schema resources and applying the appropriate meta-schema to each. That doesn’t mean the Compound Schema Document is not usable without deconstruction, it just means that implementations need to be aware that the $schema can change during evaluation and handle such changes appropriately.” – Jason Desrosiers.
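As a first step towards that kind of deconstruction, a tool could group the embedded resources by their dialect. The Python sketch below is only illustrative: the example.com URIs and the function name `group_by_dialect` are hypothetical, and it assumes that a resource without its own $schema inherits the dialect of the enclosing resource.

```python
from collections import defaultdict
from typing import Any, Optional
from urllib.parse import urljoin

# Hypothetical compound document mixing two dialects.
COMPOUND: dict[str, Any] = {
    "$id": "https://example.com/schemas/root",
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "$defs": {
        "legacy": {
            "$id": "https://example.com/schemas/legacy",
            "$schema": "https://json-schema.org/draft/2019-09/schema",
            "type": "string",
        },
        "inherits": {
            # No $schema here, so the parent's dialect is assumed.
            "$id": "https://example.com/schemas/inherits",
            "type": "number",
        },
    },
}


def group_by_dialect(
    node: Any,
    base_uri: str = "",
    dialect: Optional[str] = None,
    groups: Optional[dict] = None,
) -> dict:
    """Map each dialect ($schema) to the URIs of the resources that use it."""
    if groups is None:
        groups = defaultdict(list)
    if isinstance(node, dict):
        if isinstance(node.get("$id"), str):
            base_uri = urljoin(base_uri, node["$id"])
            # $schema can change at every embedded resource boundary.
            dialect = node.get("$schema", dialect)
            groups[dialect].append(base_uri)
        for value in node.values():
            group_by_dialect(value, base_uri, dialect, groups)
    elif isinstance(node, list):
        for item in node:
            group_by_dialect(item, base_uri, dialect, groups)
    return groups


groups = group_by_dialect(COMPOUND)
for dialect_uri, resource_uris in groups.items():
    print(dialect_uri, resource_uris)
```

Each group could then be checked against the meta-schema for its own dialect, rather than validating the whole document against the one named at the root.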
If you’d like a more in-depth look at edge case situations, please do let us know.
You can reach out to us @jsonschema or our Slack server.
I hope you’ll agree, Ben has clarified the process for us all here, and we can use this example to fully meet JSON Schema’s bundling expectations when writing tools which bundle multiple resources into compound OpenAPI documents. Thanks, Ben! – Mike
OpenAPI developer community and JSON Schema community work together to build upgrade that supports 100% compatibility with the latest draft of JSON Schema
SAN FRANCISCO – February 18, 2021 – The OpenAPI Initiative, the consortium of forward-looking industry experts focused on creating, evolving and promoting the OpenAPI Specification (OAS), a vendor-neutral, open description format for HTTP (including RESTful) APIs, announced today that the OpenAPI Specification 3.1.0 has been released. This new version now supports 100% compatibility with the latest draft (2020-12) of JSON Schema.
Along with this release, the OpenAPI Initiative has sponsored the creation of new documentation to make it easier to understand the structure of the specification and its benefits. It is available here: https://oai.github.io/Documentation/
The OpenAPI Specification is a broadly adopted industry standard for describing modern APIs. It defines a standard, programming language-agnostic interface description for HTTP APIs which allows both humans and computers to discover and understand the capabilities of a service without requiring access to source code, additional documentation, or inspection of network traffic.
The OpenAPI Specification (OAS) is used by organizations worldwide including Atlassian, Bloomberg, eBay, Google, IBM, Microsoft, Oracle, Postman, SAP, SmartBear, Vonage, and many more.
“The benefits of using the OpenAPI Specification are broadly applicable, ranging from API lifecycle management, to documentation, to security, to microservices development and much, much more,” said Marsh Gardiner, Product Manager, Google, and Technical Steering Committee, OpenAPI Initiative. “Great care was taken in evolving to version 3.1.0 to ensure it is an incremental upgrade for existing users, while also making it an excellent candidate for immediate evaluation and adoption in corporate environments. We extend our heartfelt gratitude to the diverse group of contributors for all their exceptional skills and effort on our latest achievement.”
“The mismatch between OpenAPI JSON Schema-like structures and JSON Schema itself has long been a problem for users and implementers. Full alignment of OpenAPI 3.1.0 with JSON Schema draft 2020-12 will not only save users much pain, but also ushers in a new standardised approach to schema extensions,” said Ben Hutton, JSON Schema project lead. “We’ve spent the last few years (and release) making sure we can clearly hear and understand issues the community faces. With our time limited volunteer based effort, not only have we fixed many pain points and added new features, but JSON Schema vocabularies allows for standards to be defined which cater for use cases beyond validation, such as the generation of code, UI, and documentation.
On Day One of JSON Schema draft 2020-12 being released, two implementations were ready. It’s humbling to work with such an experienced and skilled team.”
While JSON Schema is still technically a “draft” specification, draft 2020-12 sets a new stable foundation on which 3rd parties can build standardised extensions. The JSON Schema team do not foresee any major changes to the approach of the extension system, like dialects and vocabularies. However, the utility may be improved as feedback is received.
JSON Schema website: https://json-schema.org
JSON Schema Open Collective: https://opencollective.com/json-schema
JSON Schema Twitter: https://twitter.com/jsonschema
Major Changes in OpenAPI Specification 3.1.0
- JSON Schema vocabularies alignment
- New top-level element for describing Webhooks that are registered and managed out of band
- Support for identifying API licenses using the standard SPDX identifier
- PathItems object is now optional to make it simpler to create reusable libraries of components. Reusable PathItems can be described in the components object. There is also support for describing APIs secured using client certificates.
Full OpenAPI Specification 3.1.0 release notes are available here: https://github.com/OAI/OpenAPI-Specification/releases/tag/3.1.0
A Note on Semantic Versioning
The OpenAPI Initiative had adopted semantic versioning to communicate the significance of changes in software upgrades. Semantic versioning is a popular numbering methodology where minor version updates indicate that changes to software are backward compatible, whereas major updates are not. Technically, using semantic versioning with the new full alignment with JSON Schema would require this change to be denoted as 4.0.0. However, while this update to OpenAPI brings important improvements, specifically the alignment with JSON Schema, forcing it into a major release number would have created a mismatch of expectations.
Special Thanks
A special callout to Henry Andrews, Phil Sturgeon, and Ben Hutton for all their work, support and patient explanations they have provided to better align JSON Schema and the OpenAPI Specification. Many thanks to Lorna Mitchell for driving the Webhooks effort, using our new proposal process. And thanks to the many, many open source developers involved worldwide.
OpenAPI Resources
To learn more about how to participate in the evolution of the OpenAPI Specification: https://www.openapis.org/participate/how-to-contribute
- OpenAPI Specification Twitter
- OpenAPI Specification GitHub – Get started immediately!
- Share your OpenAPI Spec v3 Implementations
About the OpenAPI Initiative
The OpenAPI Initiative (OAI) was created by a consortium of forward-looking industry experts who recognize the immense value of standardizing on how APIs are described. As an open governance structure under the Linux Foundation, the OAI is focused on creating, evolving and promoting a vendor neutral description format. The OpenAPI Specification was originally based on the Swagger Specification, donated by SmartBear Software. To get involved with the OpenAPI Initiative, please visit https://www.openapis.org
About Linux Foundation
Founded in 2000, the Linux Foundation is supported by more than 1,000 members and is the world’s leading home for collaboration on open source software, open standards, open data, and open hardware. Linux Foundation projects like Linux, Kubernetes, Node.js and more are considered critical to the development of the world’s most important infrastructure. Its development methodology leverages established best practices and addresses the needs of contributors, users and solution providers to create sustainable models for open collaboration. For more information, please visit us at linuxfoundation.org.
Request to the community! Please review RC1, implement it, and share with us your feedback by November 8th. The final version should come shortly after that.
Release candidate 1 (RC1) of OpenAPI Specification 3.1, the Implementer’s Draft, is available for testing and evaluation.
The enhancements address some of the most requested features from the OpenAPI developer community. Specifically, the OpenAPI Specification is now fully compatible with the latest draft of JSON Schema. This has been a significant effort between the OpenAPI developer community and the members of the JSON Schema community.
Changes include:
- A new top-level element for describing Webhooks that are registered and managed out of band. Many thanks to Lorna Mitchell for driving this effort, using our new proposal process.
- Improved support for identifying API licenses using the standard SPDX identifier.
- The PathItems object is now optional to make it easier to create indexes of reusable components. Reusable PathItems can be described in the components object. There is also support for describing APIs secured using client certificates.
You can learn more about RC1 here.
Special thanks to Henry Andrews, Phil Sturgeon, and Ben Hutton for all their hard work and support.