Service capabilities / error behaviors #1163

benjie · 2025-04-30T11:06:37Z

This is a rewrite of

Allow clients to disable error propagation via request parameter #1153

It is re-implemented on top of the recent editorial work (e.g. renaming _field error_ to _execution error_) and also makes a significant change: it does not require onError to be included in the response. Instead, an introspection field is used to indicate:

- that onError is supported
- what the behavior will be if onError is not present

Replaces:

GraphQL.js implementation:

Implement onError proposal graphql-js#4364

Please see this 60-second video on the motivation for this PR (the last few seconds of the video also cover "transitional non-null", which is a separate concern).

As agreed at the nullability working group, disabling error propagation is the future of error handling in GraphQL. Error propagation causes a number of issues, but chief among them are:

- It destroys useful data in the response.
- It makes it unsafe to store resulting data in normalized stores.

Clients such as Relay do not want error propagation to be a thing.

This has traditionally resulted in schema design best practices advising the use of nullable types in positions where errors were expected, even if null was never a semantically valid value for that position. And since errors can happen everywhere, this has led to an explosion of nullability and significant pain on the client side, with developers having to do seemingly unnecessary null checks in loads of positions, or worse - unsafely bypassing the type safety.

The reason that GraphQL does error propagation is to keep its "not null" promise in the event that an error occurs in a non-nullable position (the errored value is replaced with null due to the way GraphQL responses are structured and limitations in JSON).
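To make the propagation rule concrete, here is a minimal sketch of where the null ends up under each behavior. The `Position` type and `nullLandsAt` function are hypothetical names used for illustration only; they are not part of the spec or of GraphQL.js. The sketch models just the path from the response root to the errored position.

```typescript
// One step on the path from the response root to the errored position.
// `nullable` says whether the schema allows null at that position.
interface Position {
  key: string | number;
  nullable: boolean;
}

type ErrorBehavior = "PROPAGATE" | "NULL";

// Returns the path at which `null` is written into the response, or []
// when the null reaches the root and `data` itself becomes null.
function nullLandsAt(
  path: Position[],
  onError: ErrorBehavior,
): Array<string | number> {
  const keys = path.map((p) => p.key);
  if (onError === "NULL") {
    // No propagation: the errored position itself is set to null,
    // even if its type is non-nullable.
    return keys;
  }
  // PROPAGATE: walk up from the errored position until a nullable
  // position is found; everything below that position is discarded.
  for (let i = path.length - 1; i >= 0; i--) {
    const position = path[i];
    if (position !== undefined && position.nullable) {
      return keys.slice(0, i + 1);
    }
  }
  return [];
}

// Hypothetical schema shape: user is nullable, the rest is non-null.
const examplePath: Position[] = [
  { key: "user", nullable: true },
  { key: "profile", nullable: false },
  { key: "avatarUrl", nullable: false },
];
const propagated = nullLandsAt(examplePath, "PROPAGATE"); // ["user"]
const localized = nullLandsAt(examplePath, "NULL"); // ["user", "profile", "avatarUrl"]
```

In this example, propagation discards the whole `user` object because of one failed field, whereas with propagation disabled only `avatarUrl` is null and the sibling data survives.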

It doesn't take much code on the client to prevent the client reading a null that relates to an error: graphql-toe can be used with almost any JavaScript or TypeScript based GraphQL client (not Relay, but it has @throwOnFieldError that you can use instead) and achieves this in 512 bytes gzipped - and that's with a focus on performance rather than bundle size.
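As a rough illustration of the idea (this is not graphql-toe's API or implementation), the sketch below reads a value from a response and throws if the position being read, or one of its ancestors, is listed in `errors`. The names `ResponseLike` and `readOrThrow` are hypothetical, and errors without a `path` are ignored for brevity.

```typescript
type PathSegment = string | number;

interface GraphQLErrorLike {
  message: string;
  path?: PathSegment[];
}

interface ResponseLike {
  data?: unknown;
  errors?: GraphQLErrorLike[];
}

// True when `prefix` is the start of (or equal to) `full`.
function isPrefix(prefix: PathSegment[], full: PathSegment[]): boolean {
  return (
    prefix.length <= full.length &&
    prefix.every((segment, i) => segment === full[i])
  );
}

// Reads the value at `path`, throwing instead of returning a null that
// was written because of an error at or above that position.
function readOrThrow(response: ResponseLike, path: PathSegment[]): unknown {
  for (const error of response.errors ?? []) {
    if (error.path !== undefined && isPrefix(error.path, path)) {
      throw new Error(error.message);
    }
  }
  let current: unknown = response.data;
  for (const segment of path) {
    if (current === null || typeof current !== "object") {
      return undefined;
    }
    current = (current as Record<PathSegment, unknown>)[segment];
  }
  return current;
}

// Hypothetical response in which only `user.name` errored.
const sample: ResponseLike = {
  data: { user: { id: "1", name: null } },
  errors: [{ message: "Name lookup failed", path: ["user", "name"] }],
};
const userId = readOrThrow(sample, ["user", "id"]); // "1"
// readOrThrow(sample, ["user", "name"]) would throw "Name lookup failed".
```

With a guard like this on the client, a null in the data can be trusted to be a real null, so the server no longer has to propagate errors to protect the client.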

This PR allows the client to take responsibility for error handling by specifying onError: "NULL" as part of the GraphQL request, and thereby turns off error propagation behavior. This is also set as the recommended default for future schemas.
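A minimal sketch of what such a request might look like from the client side. It assumes the request is sent as a JSON body with `onError` sitting alongside `query` and `variables`; that placement, and the helper name `buildRequestBody`, are assumptions for illustration rather than something this description specifies.

```typescript
type ErrorBehavior = "NULL" | "PROPAGATE" | "HALT";

interface GraphQLRequestBody {
  query: string;
  variables?: Record<string, unknown>;
  onError?: ErrorBehavior;
}

// Builds the JSON body for a request. When `onError` is omitted the key
// is left out entirely, so the service's default behavior applies.
function buildRequestBody(
  query: string,
  variables?: Record<string, unknown>,
  onError?: ErrorBehavior,
): string {
  const body: GraphQLRequestBody = { query };
  if (variables !== undefined) body.variables = variables;
  if (onError !== undefined) body.onError = onError;
  return JSON.stringify(body);
}

// Ask the service not to propagate errors for this request.
const exampleBody = buildRequestBody("{ me { name } }", undefined, "NULL");
```

Typing `onError` as a string union rather than a free-form string keeps a stray JavaScript `null` or a misspelt value from slipping through unnoticed at compile time.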

With clients responsible for error handling, we no longer need to factor the possibility of whether something can error or not into its nullability, meaning we can use the not-null ! to indicate all the positions in the schema for which null is not a semantically valid value - i.e. the underlying resource will never be a legitimate null.

The end result:

- true nullability indicated in schema - no more thinking about where errors are likely
- fewer null checks on clients
- clients can leverage their native error handling capabilities such as try/catch or <ErrorBoundary />
- safe to store errored responses into normalized stores

I've also included onError: "HALT" in this proposal. We've discovered that there's a small but significant class of clients out there, mostly ad-hoc scripts, that throw away the entire response when any error occurs. By codifying this into the spec we make it easier to implement these clients, and we allow the server to stop as soon as an error occurs rather than processing the rest of the request unnecessarily.
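The kind of client described here can be sketched in a few lines. `ResponseLike` and `dataOrThrow` are hypothetical names; the sketch only shows the "any error discards everything" pattern, not how a particular client library implements it.

```typescript
interface ResponseLike<T> {
  data?: T | null;
  errors?: { message: string }[];
}

// Returns `data` only when the response carries no errors at all;
// otherwise the whole response is discarded by throwing.
function dataOrThrow<T>(response: ResponseLike<T>): T {
  if (response.errors !== undefined && response.errors.length > 0) {
    const messages = response.errors.map((e) => e.message).join("; ");
    throw new Error(`GraphQL request failed: ${messages}`);
  }
  if (response.data === undefined || response.data === null) {
    throw new Error("GraphQL response contained no data");
  }
  return response.data;
}

// A clean response passes straight through.
const okData = dataOrThrow({ data: { me: { name: "Ada" } } });
```

Since a client written this way never looks at partial data, any work the server does after the first error is wasted, which is the case onError: "HALT" is meant to address.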

As noted by @revangen in this comment:

I've also included onError: ~~"ABORT"~~ "HALT" in this proposal.

Appreciate this being included. For Shopify's public Admin GraphQL API, we have a mix of scenarios that result in a partial success response and only error response. Having been around for 8+ years, we are reluctant at times to change its behaviour to favour partial responses as we don't control majority of clients. Providing clients a way to specify the server's behaviour provides a migration path should clients care about partial responses.

netlify · 2025-04-30T11:06:55Z

✅ Deploy Preview for graphql-spec-draft ready!

Name	Link
🔨 Latest commit	`cc50991`
🔍 Latest deploy log	https://app.netlify.com/projects/graphql-spec-draft/deploys/686f8da66b5e92000889bc4b
😎 Deploy Preview	https://deploy-preview-1163--graphql-spec-draft.netlify.app

benjie · 2025-04-30T11:43:22Z

I have applied the RFC1 label to this as inherited from the PR it replaces:

Allow clients to disable error propagation via request parameter #1153

benjie · 2025-04-30T13:52:44Z

A full, CI-passing, implementation of this is available in GraphQL.js now; check it out: graphql@canary-pr-4364

martinbonnin · 2025-04-30T14:20:33Z

spec/Section 4 -- Introspection.md

+enum __ErrorBehavior {
+  NO_PROPAGATE
+  PROPAGATE
+  ABORT


Moving my previous comment here, love this proposal overall but I would love it even more with NULL

enum __ErrorBehavior { NULL PROPAGATE ABORT }

tldr: I believe this is going to be read and discussed a lot more than it's going to be written by users and I would optimize for the conciseness, pronounceability and memorability of NULL.

benjie · 2025-04-30 (reply)

To summarize my thoughts on it from the previous thread: NULL is what I originally wrote, I agree it looks and initially feels nicer; but ultimately I discounted it for a number of reasons:

Communication issues

If you tell someone on the phone to set onError: NULL, they're very likely to instead set onError: null.

Ambiguity issues

onError: null and onError: NULL are very similar, but the first means PROPAGATE and the latter means NO_PROPAGATE. This is likely to be the source of many user errors and much frustration (both for newbies, and for the people supporting them including library authors).

Errors already become null in the spec:

[an error] raised [...] is handled as though the response position [...] resolved to null
-- https://spec.graphql.org/draft/#sel-FANVNJCAACGW-qR

onError: NULL doesn't implicitly seem to mean something different from this.
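The ambiguity concern above can be illustrated with a small sketch of how a service might resolve the effective behavior. It assumes that a missing or JSON-null `onError` falls back to the service default, which matches the statement that `onError: null` means PROPAGATE on a service whose default is PROPAGATE; `effectiveErrorBehavior` is a hypothetical name.

```typescript
type ErrorBehavior = "NULL" | "PROPAGATE" | "HALT";

// `requested` is the raw `onError` value from the request, which may be
// missing (undefined) or a JSON null. `serviceDefault` is the behavior
// the service falls back to when nothing is requested.
function effectiveErrorBehavior(
  requested: ErrorBehavior | null | undefined,
  serviceDefault: ErrorBehavior,
): ErrorBehavior {
  return requested ?? serviceDefault;
}

// Two requests that look almost identical behave very differently:
const fromJsonNull = effectiveErrorBehavior(null, "PROPAGATE"); // "PROPAGATE"
const fromEnumNull = effectiveErrorBehavior("NULL", "PROPAGATE"); // "NULL"
```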

Please suggest other alternatives!

I'm very open to changing the word NO_PROPAGATE to something else, I really don't like NO_PROPAGATE, not least because I always want to spell it propOgate rather than propAgate. But I think changing it to NULL would be a mistake.

I've gone through about 30 alternatives when brainstorming it, and none are sufficiently clear whilst also being shorter than NO_PROPAGATE. My favourite alternatives are things like LOCAL/LOCALIZE/ISOLATE/CONTAIN, but I'm not sure they convey the meaning quite as well as NO_PROPAGATE does.

Ooo... Maybe LOCAL_NULL actually is quite compelling. Not thought of that one before.

GraphQLConf requires speakers to be familiar with "inclusive naming", and the inclusive naming guides encourage the avoidance of the word "abort" where possible: https://inclusivenaming.org/word-lists/tier-1/abort/

benjie · 2025-05-22T20:07:04Z

This PR description is currently out of date; I've reworked the capabilities infrastructure based on Lee's feedback, see the changes for details.

martinbonnin · 2025-05-25T18:09:27Z

@benjie @leebyron given the previous discussions on feature discovery have been going on for years, I'm doubtful we can land __Capability soon... Yet onError is crucially needed to move the nullability proposals, is there any chance we could use a new introspection field (__Schema.defaultErrorBehaviour) instead? Or should we expedite __Capability in a separate proposal first? (I'm down with that FWIW)

benjie · 2025-05-25T18:35:19Z

@martinbonnin You don’t like this proposal?

martinbonnin · 2025-05-26T08:21:56Z

@benjie Don't get me wrong, I like __Capability! It's just something that has been discussed before in the community. For example, it's relatively similar to https://github.com/graphql/graphql-wg/blob/main/rfcs/FeatureDiscovery.md. It feels like getting __Capability over the finish line is probably going to require a much wider discussion and longer wait times (the time for everyone potentially affected to review __Capability).

Meanwhile, I think onError is in a much better position to be fast-tracked.

Of course, I'm all for having both of them. If we want to do that, I think it'd make sense to merge __Capability first and then have onError use that as __Capability solves a more general problem.

In a perfect world, we can land both __Capability and onError in the next couple of months.

martinbonnin · 2025-05-27T22:18:23Z

spec/Section 4 -- Introspection.md

+must contain only ASCII letters, digits and hyphens. These constraints are
+inspired by reverse domain name notation to encourage global uniqueness and
+collision-resistance. Unlike the domain name system, capability identifiers are


I like this!

alex-reilly-dd · 2025-05-28T19:45:57Z

spec/Section 4 -- Introspection.md

+
+**Capability identifier**
+
+A capability identifier is a string value composed of two or more segments


Clarify that this is a suggestion

benjie · 2025-05-28 (reply)

Governed by the "preserve option value" guiding principle:

It's hard to know what the future brings; whenever possible, decisions should be made that allow for more options in the future. Sometimes this is unintuitive: spec rules often begin more strict than necessary with a future option to loosen when motivated by a real use case.

I think we should keep this as a requirement rather than a suggestion/recommendation. It's quite simple to comply with, and we can always loosen it up in future if need be.

E.g. maybe we might want to add an SDL syntax for this in future, it would be a shame to have to wrap all the keys in strings:

service {
  capabilities {
    org.graphql.onError
    org.graphql.defaultErrorBehavior: "PROPAGATE"
    org.graphql.operationDescription
    org.graphql.ws.subscriptions.endpoint: "wss://example.com/graphql"
    org.graphql.ws.queries
    org.graphql.ws.mutations
    com.graphile.env: "Production"
    com.graphile.hash: "sdf9009s8dg09sdf809"
    org.graphql.rfc.ccn: "v1"
    org.graphql.rfc.incrementalDelivery: "June2023"
  }
}

That said, I'm not too bothered and happy to remove it if there's a good reason.
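For reference, the constraints quoted in this thread are simple to check. The sketch below covers only what is visible here: two or more segments, each made of ASCII letters, digits and hyphens. The period as the segment separator is an assumption taken from the examples above, and `isCapabilityIdentifier` is a hypothetical name; the full spec text may impose further rules that are not reproduced here.

```typescript
// One segment: one or more ASCII letters, digits or hyphens.
const SEGMENT = /^[A-Za-z0-9-]+$/;

// Checks the constraints quoted in this thread. Assumes segments are
// separated by periods, as in the reverse-domain-style examples.
function isCapabilityIdentifier(identifier: string): boolean {
  const segments = identifier.split(".");
  return segments.length >= 2 && segments.every((s) => SEGMENT.test(s));
}

const validExample = isCapabilityIdentifier("org.graphql.onError"); // true
const singleSegment = isCapabilityIdentifier("onError"); // false
```

Keeping the check this small is part of the argument for making it a requirement: compliance costs almost nothing, and the rule can still be loosened later.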

BoD · 2025-06-05T18:56:41Z

spec/Section 4 -- Introspection.md

@@ -218,6 +219,15 @@ enum __DirectiveLocation {
  INPUT_OBJECT
  INPUT_FIELD_DEFINITION
 }
+
+type __Service {
+  capabilities: [__Capability!]!


Wondering if this should be a more generic "metadata" rather than "capabilities"? It would be essentially the same thing, just with a different name and a different expectation of what the list contains. For example, if a service would like to advertise some form of versioning, it could do {identifier: "com.example.version", value: "4"}, but that's not really a "capability" per se?

BoD · 2025-06-06T09:00:40Z

Could this be split into 2 RFCs?

- onError on requests
- Capabilities

Adoption could look like:

1. Clients and servers implement onError
   - Clients send onError on requests without knowing if the server implements it. For servers that don’t, onError is ignored, and errors are propagated as today.
2. Clients and servers implement capabilities
   - Clients use introspection to know if onError is supported and what the default value is.
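The second step could look roughly like this on the client. The sketch assumes the capabilities have already been fetched as a list of `{ identifier, value }` pairs, and it uses identifiers modelled on the examples in this thread (`org.graphql.onError`, `org.graphql.defaultErrorBehavior`); both the identifiers and the `errorSupport` helper are assumptions, not finalized spec names.

```typescript
interface Capability {
  identifier: string;
  value: string | null;
}

type ErrorBehavior = "NULL" | "PROPAGATE" | "HALT";

interface ErrorSupport {
  onErrorSupported: boolean;
  defaultBehavior: ErrorBehavior;
}

function isErrorBehavior(value: unknown): value is ErrorBehavior {
  return value === "NULL" || value === "PROPAGATE" || value === "HALT";
}

// Derives what the client needs to know from the advertised capabilities.
function errorSupport(capabilities: Capability[]): ErrorSupport {
  const find = (id: string): Capability | undefined =>
    capabilities.find((c) => c.identifier === id);
  const declared = find("org.graphql.defaultErrorBehavior")?.value;
  return {
    onErrorSupported: find("org.graphql.onError") !== undefined,
    // A service that advertises nothing is assumed to propagate errors,
    // as servers do today.
    defaultBehavior: isErrorBehavior(declared) ? declared : "PROPAGATE",
  };
}

// A service that supports onError and defaults to no propagation.
const modern = errorSupport([
  { identifier: "org.graphql.onError", value: null },
  { identifier: "org.graphql.defaultErrorBehavior", value: "NULL" },
]);
// A service that advertises no capabilities at all.
const legacy = errorSupport([]);
```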

benjie · 2025-06-24T09:51:58Z

It could be split, but I would do the capabilities feature first and then onError so we don't need a period of time where you send onError and hope - better to be concrete from the beginning IMO as client capabilities may depend on it being honoured. It makes sense to ship capabilities alongside something that the spec itself needs it for in my opinion - could be error behaviors, could be fragment arguments, but I think it should be something.

michaelstaib · 2025-06-26T18:08:27Z

We discussed the service capabilities proposal last week in the Composite Schema Working Group, and I wanted to share our perspective on this.

We'd strongly advocate for having service capabilities represented in the SDL. This is essential for schema composition tooling, as only SDL exposes type system directives and the introspection format does not carry these yet. Also, most tooling for composing schemas does not dynamically introspect a source schema but rather works against schema files, without needing to start a GraphQL server.

Concretely, we have been discussing for quite some time how we can inspect capabilities like batching support. Depending on whether a server supports batching or not, the schema composition would allow for features like cross-service data requirements.

There are other things that we think would be great to inspect in that process like the schema name and other metadata.

We did consider introducing our own capabilities directives to the specification that would attach to the schema definition node. But to be honest, this always felt like duct-taping these things into GraphQL.

We agree that we should keep it simple with a string-based value for service capabilities. This allows us to start using it immediately without complicating the spec too much. It also leaves room for future enhancements, such as structured metadata or enums, as the ecosystem evolves.

We also think this should be decoupled from the error proposal as this would solve so many other concerns in GraphQL.

benjie · 2025-07-10T09:55:04Z

Have merged the latest main and have changed NO_PROPAGATE to NULL - @martinbonnin will be happy :)

benjie added 2 commits April 30, 2025 11:46

Detail onError request parameter

31c90e7

Detail introspection changes

f4fab96

Define the directive

692d811

benjie marked this pull request as ready for review April 30, 2025 11:41

benjie mentioned this pull request Apr 30, 2025

Allow clients to disable error propagation via request parameter #1153

Closed

benjie added the 💡 Proposal (RFC 1) RFC Stage 1 (See CONTRIBUTING.md) label Apr 30, 2025

This was referenced Apr 30, 2025

Implement onError proposal graphql/graphql-js#4364

Draft

Allow schema @behavior(onError: ...) without explicit root operations #1164

Closed

martinbonnin reviewed Apr 30, 2025

benjie mentioned this pull request Apr 30, 2025

Add Transitional Non-Null appendix (@noPropagate directive) #1165

Open

fotoetienne reviewed Apr 30, 2025

spec/Section 6 -- Execution.md (resolved)

fotoetienne reviewed Apr 30, 2025

spec/Section 3 -- Type System.md (outdated, resolved)

benjie mentioned this pull request May 1, 2025

Enable 'schema' keyword to be provided without root operations #1166

Closed

benjie added 11 commits May 15, 2025 13:27

ABORT -> HALT

94446ab

Start speccing out the capabilities system

3c63355

Add a number of basic capabilities

7056690

Move default error behavior to the service

0fa7a33

Rework capabilities

1f975e4

Use a definition

8c40086

Reorder

641a786

Reword

026982b

Editorial

a7c6ad5

More editorial

fe559ea

More editorial

b5f64ae

benjie changed the title ~~Allow clients to disable error propagation via request parameter (take 2)~~ Service capabilities / error behaviors May 22, 2025

dondonz mentioned this pull request May 27, 2025

Tracking spec change: service capabilities graphql-java/graphql-java#3992

Open

martinbonnin reviewed May 27, 2025

martinbonnin mentioned this pull request May 28, 2025

[Draft] update nullability best practices to account for @experimental_disableErrorPropagation graphql/graphql.github.io#1970

Closed

alex-reilly-dd reviewed May 28, 2025

benjie commented May 28, 2025

spec/Section 4 -- Introspection.md (outdated, resolved)

Update spec/Section 4 -- Introspection.md

1c3f0cd

BoD approved these changes Jun 5, 2025

BoD reviewed Jun 5, 2025

martinbonnin mentioned this pull request Jun 6, 2025

Add descriptions to executable documents | 2025 Update #1170

Merged

benjie added 2 commits July 10, 2025 10:48

Merge branch 'main' into error-behavior2

7ab36b8

Change NO_PROPAGATE to NULL

cc50991

benjie mentioned this pull request Jul 10, 2025

Error propagation and "following" field executions. #1182

Closed
