Skip to content

docs: add guide on file uploads #2017

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 11 commits into
base: source
Choose a base branch
from

Conversation

sarahxsanders
Copy link
Contributor

Description

Adds guide on best practices for file uploads w/ examples. Open to feedback/suggestions for clarity

Copy link

vercel bot commented Jun 10, 2025

@sarahxsanders is attempting to deploy a commit to the The GraphQL Foundation Team on Vercel.

A member of the Team first needs to authorize it.

Copy link
Member

@benjie benjie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

File uploads in GraphQL are a controversial topic. It is very easy to get wrong.

I think the general shape of your example here does not handle arbitrary operations; for example:

mutation UpdateImages($image: Upload!) {
  a: uploadFile(file: $image) { __typename }
  b: uploadFile(file: $image) { __typename }
}

would fail (I think?) because the second time you tried to do for await (const chunk of file) {...} there would be no chunks left since they'd all been read already. This is particularly relevant for a mutation where you might want to use the same upload in multiple places (e.g. setting the same file as both your desktop and mobile background images). One potential solution to this is to add a validation rule requiring that each upload variable is referenced exactly once, but this isn't the nicest for clients as they might not know this rule exists.

There's also serious security consequences of using multipart/form-data w.r.t. CSRF and similar concerns.

Another issue, busboy says that if you use file it's important that you then consume the stream, but there is no guarantee that GraphQL will execute far enough to run the for await (const chunk of file) {...} code in the resolver (e.g. a validation or auth issue could occur); this could result in denial of service through memory exhaustion.

Apollo say:

Apollo recommends handling the file upload itself out-of-band of your GraphQL server for the sake of simplicity and security. This "signed URL" approach allows your client to retrieve a URL from your GraphQL server (via S3 or other storage service) and upload the file to the URL rather than to your GraphQL server.

-- https://www.apollographql.com/docs/apollo-server/v3/data/file-uploads

I think this general dissuasion is a good idea.

My general feeling is that we're better off without a documentation page on this unless we're extremely careful to handle all of the potential pitfalls. And I'm pretty sure I haven't enumerated all of them in this comment.

I'm leaning towards closing this, but I wonder if any of the @graphql/tsc feel different - in particular @andimarek and @PascalSenn?

@martinbonnin
Copy link
Contributor

A general stance about uploads in general and pitfalls/possible solutions would be helpful IMO. This is still something people try to implement in the field and I feel guidance with lots of caveats > no guidance.

Copy link
Member

@saihaj saihaj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should add this in docs. In the context of GraphQL Yoga and any of our customers, we generally advise against uploading files to the server. S3 (or similar storage solutions) are designed specifically for blobs. On the GraphQL side, you simply provide a mutation that returns a signed URL valid for uploading to the storage provider.

@benjie
Copy link
Member

benjie commented Jun 19, 2025

I feel guidance with lots of caveats > no guidance.

Agreed; but I'd prefer that our guidance was "don't do that, and here's why and what you might consider instead". Providing all the caveats is quite a large task, but the ones I pointed out above could be used as a starting point. Even something like:

In general we recommend against using GraphQL for file uploads, as it was not designed with that purpose in mind.
Should you wish to use GraphQL for file uploads anyway, please ensure you're aware of the following topics before selecting a solution:

  • GraphQL bombs: where the same upload is referenced multiple times in a GraphQL operation to attempt to achieve Denial of Service (DoS) through Out Of Memory (OOM) errors. Can be avoided via trusted documents or a validation rule preventing file uploads being referenced more than once in a document.
  • Streamed vs non-streamed data: files tend to be quite large, so often it's more memory efficient to process them in a streaming way. However, passing a stream through GraphQL variables or context is unsafe as there is no guarantee that the stream will be handled - a validation or execution error could cause the resolver to never be called, resulting in memory leaks. Consider streaming the data to temporary storage and passing the filename to GraphQL; ensure the temporary storage is cleaned up once the request completes!
  • Cross-site Request Forgery (CSRF or XSRF) risks: Content-Type: multipart/form-data is treated as a "simple" request, not using the CORS flow, and so can be used for cross-site requests if your server does not guard against them
  • Request size: ensure you implement caps on request size to avoid out of memory errors, especially if you are not streaming data.
  • Excess properties: ensure that all files submitted through the request are adequately handled, an attacker could upload files under variable names that are not used in the document with the aim of leading to OOM issues.
  • File names: file names provided through multipart/form-data are arbitrary and should be treated as untrusted data. Do not use them when generating paths without sanitizing them.
  • File types: file types provided through multipart/form-data are arbitrary and should be treated as untrusted data. Just because the media type says it's an image doesn't mean it is one. Always be careful when processing user-submitted data, consider "sniffing" the file type yourself and rejecting the request if the types do not align.
  • File contents: file contents provided through multipart/form-data are arbitrary and should be treated as untrusted data. Be sure to validate that the data aligns with what you expect. Also be aware of format-specific vulnerabilities such as zip bombs or maliciously created PDFs that exploit vulnerabilities in parsers. Use adequate protections against potentially malicious content.

Ultimately, while it’s possible to use GraphQL for file uploads safely, doing so requires a significant body of knowledge and it’s rarely the best tool for the job. A more robust approach is to upload files via a dedicated REST endpoint or signed URL (e.g., to S3), and then pass metadata or references through GraphQL. This separates concerns cleanly and avoids many (but not all!) of the pitfalls described above.

@martinbonnin
Copy link
Contributor

@benjie agree 100%.

Copy link
Member

@benjie benjie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like a better direction! I've edited the text for accuracy, added a few extra pointers, and softened the language around the community spec.

@sarahxsanders sarahxsanders requested a review from benjie June 20, 2025 11:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants