Bulk import data with the GraphQL Admin API

Importing large volumes of data using traditional and synchronous APIs is slow, complex to run, and difficult to manage. Instead of manually running a GraphQL mutation multiple times and managing a client-side throttle, you can run a bulk mutation operation.

Using the GraphQL Admin API, you can bulk import large volumes of data asynchronously. When the operation is complete, the results are delivered in a JSON Lines (JSONL) file that Shopify makes available at a URL.

This guide introduces the bulkOperationRunMutation mutation and shows you how to use it to bulk import data into Shopify.

Requirements

Limitations

How bulk importing data works

You initiate a bulk operation by supplying a mutation string to the bulkOperationRunMutation mutation. Shopify then executes that mutation string asynchronously as a bulk operation.

Most GraphQL Admin API requests that you make are subject to rate limits, but the bulkOperationRunMutation request isn't. Because you're only making low-cost requests for creating operations, polling their status, or canceling them, bulk mutation operations are an efficient way to create data compared to standard GraphQL API requests.

The following diagram shows the steps involved in bulk importing data into Shopify:

Workflow for bulk importing data

  1. Create a JSONL file and include GraphQL variables: Include the variables for the mutation in a JSONL file format. Each line in the JSONL file represents one input unit. The mutation runs once on each line of the input file.

  2. Upload the file to Shopify: Before you upload the file, you must reserve an upload URL by running the stagedUploadsCreate mutation. After the URL has been reserved, you can upload the file by making a request using the information returned in the stagedUploadsCreate response.

  3. Create a bulk mutation operation: After the file has been uploaded, you can run bulkOperationRunMutation to create a bulk mutation operation. The bulkOperationRunMutation imports data in bulk by running the supplied GraphQL API mutation with the file of variables uploaded in the previous step.

  4. Wait for the operation to finish: To determine when the bulk mutation has finished, you can either:

    1. Subscribe to a webhook topic: You can use the webhookSubscriptionCreate mutation to subscribe to the bulk_operations/finish webhook topic. You then receive a webhook when any operation finishes, that is, when it has completed, failed, or been canceled.
    2. Poll the status of the operation: While the operation is running, you can poll to see its progress using the currentBulkOperation field. The objectCount field on the bulkOperation object increments to indicate the operation's progress, and the status field returns a BulkOperationStatus value that indicates whether the operation is still running, has completed, or has ended in another state.
  5. Retrieve the results: When a bulk mutation operation is completed, a JSONL output file is available for download at the URL specified in the url field.

Create a JSONL file and include GraphQL variables

When adding GraphQL variables to a new JSONL file, you need to format the variables so that they are accepted by the mutation that the bulk operation runs. The format of the input variables needs to match the GraphQL Admin API schema.

For example, you might want to import a large quantity of products. Each attribute of a product must be mapped to an existing field defined in the GraphQL input object ProductInput. In the JSONL file, each line represents one product input. The mutation runs once on each line of the input file. Each input must take up exactly one line, no matter how complex the input object structure is.

The following example shows a sample JSONL file that is used to create 10 products in bulk:
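As a minimal sketch of how such a file could be produced, the following Python snippet builds JSONL text with one variables object per line. The helper name build_product_jsonl and the product values are hypothetical placeholders. The title, productType, and vendor fields are assumed to be valid ProductInput fields, and the top-level input key is assumed to match a mutation variable named $input.

```python
import json


def build_product_jsonl(count):
    """Return JSONL text with one set of mutation variables per line."""
    lines = []
    for i in range(1, count + 1):
        # Each line holds the variables for one run of the mutation.
        # The values are illustrative placeholders, not real store data.
        variables = {
            "input": {
                "title": f"Sample snowboard {i}",
                "productType": "Snowboard",
                "vendor": "Example Vendor",
            }
        }
        # json.dumps writes the whole object on a single line by default,
        # which is what the JSONL format requires.
        lines.append(json.dumps(variables))
    return "\n".join(lines) + "\n"


jsonl_text = build_product_jsonl(10)
```

Writing jsonl_text to disk (for example, as bulk_op_vars.jsonl) yields a file with 10 lines, one per product to create.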

Upload the file to Shopify

After you've created the JSONL file and included the GraphQL variables, you can upload the file to Shopify. Before uploading the file, you first need to generate the upload URL and parameters.

Generate the upload URL and parameters

You can use the stagedUploadsCreate mutation to generate the values that you need to authenticate the upload. The mutation returns an array of stagedMediaUploadTarget instances.

An instance of stagedMediaUploadTarget has the following key properties:

  • parameters: The parameters that you use to authenticate an upload request.
  • url: The signed URL where you can upload the JSONL file that includes GraphQL variables.

The mutation accepts an input of type stagedUploadInput, which has the following fields:

Field | Type | Description
resource | enum | Specifies the resource type to upload. To use bulkOperationRunMutation, the resource type must be BULK_MUTATION_VARIABLES.
filename | string | The name of the file to upload.
mimeType | string | The media type of the file to upload. To use bulkOperationRunMutation, the mimeType must be "text/jsonl".
httpMethod | enum | The HTTP method to be used by the staged upload. To use bulkOperationRunMutation, the httpMethod must be POST.

Example

The following example uses the stagedUploadsCreate mutation to generate the values required to upload a JSONL file that bulkOperationRunMutation can consume. You must first run the stagedUploadsCreate mutation with no variables, and then separately send a POST request to the staged upload URL with the JSONL data:

Request

POST /admin/api/2021-07/graphql.json
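The request could be sketched in Python as follows. The input fields come from the table above. The stagedTargets and userErrors selections, the filename value, and the helper names graphql_body and extract_upload_target are assumptions made for illustration, so check them against the API reference before relying on them.

```python
import json

# Mutation text with the input values inlined (no GraphQL variables).
# "bulk_op_vars" is a placeholder filename.
STAGED_UPLOADS_CREATE = """
mutation {
  stagedUploadsCreate(input: {
    resource: BULK_MUTATION_VARIABLES,
    filename: "bulk_op_vars",
    mimeType: "text/jsonl",
    httpMethod: POST
  }) {
    userErrors { field message }
    stagedTargets {
      url
      parameters { name value }
    }
  }
}
"""


def graphql_body(query):
    """Encode a GraphQL query as the JSON body of the POST request."""
    return json.dumps({"query": query}).encode("utf-8")


def extract_upload_target(response):
    """Return (url, [(name, value), ...]) from a decoded JSON response."""
    result = response["data"]["stagedUploadsCreate"]
    if result["userErrors"]:
        raise ValueError(f"stagedUploadsCreate failed: {result['userErrors']}")
    target = result["stagedTargets"][0]
    # Keep the parameters in the order returned, as (name, value) pairs.
    params = [(p["name"], p["value"]) for p in target["parameters"]]
    return target["url"], params
```

The body returned by graphql_body is what you would send to the endpoint above with your usual Admin API authentication.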

JSON response:

Upload the JSONL file

After you generate the parameters and URL for an upload, you can upload the JSONL file using a POST request. You must use a multipart form, and include all parameters as form inputs in the request body.

To generate the parameters for the multipart form, start with the parameters returned from the stagedUploadsCreate mutation. Then, add the file attachment.

POST request
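A minimal sketch of building that multipart form with only the Python standard library follows. The function name encode_multipart is hypothetical, and the form field name file for the attachment is an assumption. The parameters are written first and the file attachment last, matching the order described above.

```python
import uuid


def encode_multipart(params, filename, file_bytes, boundary=None):
    """Build a multipart/form-data body: all parameters, then the file.

    params is a list of (name, value) pairs as returned by
    stagedUploadsCreate. Returns (content_type_header, body_bytes).
    """
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for name, value in params:
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode("utf-8")
        )
    # The file attachment goes last. The field name "file" is an assumption.
    parts.append(
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
            "Content-Type: text/jsonl\r\n\r\n"
        ).encode("utf-8")
        + file_bytes
        + b"\r\n"
    )
    # Closing boundary marks the end of the form.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return f"multipart/form-data; boundary={boundary}", b"".join(parts)
```

You would then POST the returned body to the staged upload URL, sending the returned string as the Content-Type header.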

GraphQL variables in JSONL file

Create a bulk mutation operation

After you upload the file, you can run bulkOperationRunMutation to import data in bulk. You must supply the mutation that you want to run and the staged upload path that you obtained in the previous step.

The bulkOperationRunMutation mutation takes the following arguments:

Field | Type | Description
mutation | string | Specifies the GraphQL API mutation that you want to run in bulk. Valid values: productCreate, collectionCreate, productUpdate, productUpdateMedia, productVariantUpdate
stagedUploadPath | string | The path to the uploaded file of inputs in JSONL format, as generated by stagedUploadsCreate

Example

In this example, you want to run the following productCreate mutation in bulk:

To run the productCreate mutation in bulk, pass the mutation as a string into bulkOperationRunMutation:

Request

POST /admin/api/2021-07/graphql.json
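One way to sketch this request in Python is shown below. The inner productCreate mutation and its selection set are illustrative, and bulk_run_body is a hypothetical helper. The sketch passes both arguments as GraphQL variables so that the inner mutation string doesn't need manual escaping. It also assumes that the staged upload path is the value of the key parameter returned by stagedUploadsCreate.

```python
import json

# The mutation to run once per line of the uploaded JSONL file.
# The selection set here is an illustrative choice.
PRODUCT_CREATE = (
    "mutation call($input: ProductInput!) { "
    "productCreate(input: $input) { "
    "product { id title } "
    "userErrors { message field } } }"
)

# The outer mutation that starts the bulk operation.
BULK_OPERATION_RUN_MUTATION = """
mutation run($mutation: String!, $stagedUploadPath: String!) {
  bulkOperationRunMutation(mutation: $mutation, stagedUploadPath: $stagedUploadPath) {
    bulkOperation { id url status }
    userErrors { message field }
  }
}
"""


def bulk_run_body(inner_mutation, staged_upload_path):
    """Encode the bulkOperationRunMutation request as a JSON body."""
    return json.dumps(
        {
            "query": BULK_OPERATION_RUN_MUTATION,
            "variables": {
                "mutation": inner_mutation,
                "stagedUploadPath": staged_upload_path,
            },
        }
    ).encode("utf-8")
```

Because the inner mutation travels as a JSON string value, quotes inside it are escaped by json.dumps rather than by hand.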

JSON response:

Wait for the operation to finish

Option A. Subscribe to the bulk_operations/finish webhook topic

You can use the webhookSubscriptionCreate mutation to subscribe to the bulk_operations/finish webhook topic. You then receive a webhook when any operation finishes, that is, when it has completed, failed, or been canceled.

For full setup instructions, refer to Configuring webhooks.

POST /admin/api/2021-10/graphql.json
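A subscription request might be built along the following lines. The BULK_OPERATIONS_FINISH topic value, the webhookSubscription input fields (format, callbackUrl), and the selection set are assumptions based on the topic name above, and webhook_subscription_query is a hypothetical helper. The callback URL is a placeholder.

```python
import json

# Template for the mutation text. %s is replaced with a quoted callback URL.
WEBHOOK_SUBSCRIPTION_TEMPLATE = """
mutation {
  webhookSubscriptionCreate(
    topic: BULK_OPERATIONS_FINISH
    webhookSubscription: { format: JSON, callbackUrl: %s }
  ) {
    userErrors { field message }
    webhookSubscription { id }
  }
}
"""


def webhook_subscription_query(callback_url):
    """Return the mutation text with the callback URL inlined as a string."""
    # json.dumps adds the surrounding double quotes and escapes the value.
    return WEBHOOK_SUBSCRIPTION_TEMPLATE % json.dumps(callback_url)


query = webhook_subscription_query("https://example.com/webhooks/bulk-finish")
```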

JSON response

After you've subscribed to the webhook topic, Shopify sends a POST request to the specified URL any time a bulk operation on the store (both mutations and queries) finishes.

Example webhook response

Next, retrieve the bulk operation's data URL by querying the node field and passing the admin_graphql_api_id value from the webhook payload as its id argument:

POST /admin/api/2021-10/graphql.json
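As a rough sketch, a webhook handler could turn the incoming payload into that node query as follows. Only the admin_graphql_api_id key is taken from the text above. The inline fragment on BulkOperation, the selected fields, and the helper name node_request_from_webhook are assumptions.

```python
import json

# Look up the finished operation by ID and read its result URLs.
NODE_QUERY = """
query operation($id: ID!) {
  node(id: $id) {
    ... on BulkOperation {
      url
      partialDataUrl
    }
  }
}
"""


def node_request_from_webhook(raw_payload):
    """Build the JSON body of the node query from a webhook payload."""
    payload = json.loads(raw_payload)
    # The webhook identifies the finished operation by its GraphQL ID.
    operation_id = payload["admin_graphql_api_id"]
    return json.dumps(
        {"query": NODE_QUERY, "variables": {"id": operation_id}}
    ).encode("utf-8")
```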

JSON response

For more information on how webhooks work, refer to Webhooks.

Option B. Poll the status of the operation

While the operation is running, you can poll to see its progress using the currentBulkOperation field. The objectCount field increments to indicate the operation's progress, and the status field returns whether the operation is completed.

You can adjust your polling intervals based on the amount of data that you import. To learn about other possible operation statuses, refer to the BulkOperationStatus reference documentation.

To poll the status of the operation, use the following example request:

Request

POST /admin/api/2021-07/graphql.json
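A polling loop could be sketched as follows. The fetch argument is a hypothetical callable that sends a query to the endpoint above and returns the decoded JSON response, which keeps the loop independent of any particular HTTP client. The type: MUTATION argument, the selected fields, and the set of finished statuses are assumptions to verify against the BulkOperationStatus reference.

```python
import time

CURRENT_BULK_OPERATION = """
query {
  currentBulkOperation(type: MUTATION) {
    id
    status
    errorCode
    objectCount
    url
    partialDataUrl
  }
}
"""

# Statuses assumed to mean that the operation will make no more progress.
FINISHED_STATUSES = {"COMPLETED", "FAILED", "CANCELED", "EXPIRED"}


def poll_until_finished(fetch, interval_seconds=5.0, max_polls=120, sleep=time.sleep):
    """Poll currentBulkOperation until it reaches a finished status."""
    for _ in range(max_polls):
        response = fetch(CURRENT_BULK_OPERATION)
        operation = response["data"]["currentBulkOperation"]
        if operation is None:
            raise LookupError("no bulk mutation operation found")
        if operation["status"] in FINISHED_STATUSES:
            return operation
        # Not finished yet: wait before asking again.
        sleep(interval_seconds)
    raise TimeoutError("bulk operation did not finish within the polling budget")
```

Raising interval_seconds for larger imports reduces the number of requests, in line with the advice above about adjusting polling intervals.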

JSON response:

Retrieve the results

When a bulk mutation operation is finished, you can download a result data file.

If an operation successfully completes, then the url field contains a URL where you can download the data file. If an operation fails, but some data was retrieved before the failure occurred, then a partially complete data file is available at the URL specified in the partialDataUrl field.

In either case, the returned URLs are authenticated and expire after one week.

After you've downloaded the data, you can parse it according to the JSONL format. Because both the input and result files are in JSONL, each line in the result file represents the response of running the mutation on the corresponding line in the input file.
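The line-by-line correspondence can be sketched in Python as follows. The helper names parse_jsonl and pair_inputs_with_results are hypothetical, and the sketch assumes a complete result file with exactly one result line per input line.

```python
import json


def parse_jsonl(text):
    """Parse JSONL text into a list of objects, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def pair_inputs_with_results(input_jsonl, result_jsonl):
    """Pair each input line with the result line at the same position."""
    inputs = parse_jsonl(input_jsonl)
    results = parse_jsonl(result_jsonl)
    if len(inputs) != len(results):
        # A partial result file would break positional pairing.
        raise ValueError(
            f"expected {len(inputs)} result lines, found {len(results)}"
        )
    return list(zip(inputs, results))
```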

Operation success

The following example shows the response for a product that was successfully created:

Operation failures

Bulk operations can fail for any of the reasons that a regular GraphQL API mutation would fail, such as not having permission to access certain APIs. For this reason, the best approach is to run a single GraphQL mutation first to make sure that it works before running a mutation as part of a bulk operation.

If a bulk operation does fail, then its status field returns FAILED and the errorCode field returns a code such as one of the following:

  • ACCESS_DENIED: There are missing access scopes. Run the mutation normally (outside of a bulk operation) to get more details on which field is causing the issue.
  • INTERNAL_SERVER_ERROR: Something went wrong on Shopify's server and we've been notified of the error. These errors might be intermittent, so you can try making your request again.
  • TIMEOUT: One or more mutation timeouts occurred during execution. Try removing some fields from your mutation so that it can run successfully. These timeouts might be intermittent, so you can try submitting the mutation again.

To learn about the other possible operation error codes, refer to the BulkOperationErrorCode reference documentation.
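The guidance for the three error codes above could be encoded in a small helper like the following sketch. The function name next_action and its return labels are hypothetical, and any other error code falls through to manual inspection.

```python
# Error codes that the text above describes as possibly intermittent.
RETRYABLE_ERROR_CODES = {"INTERNAL_SERVER_ERROR", "TIMEOUT"}


def next_action(status, error_code):
    """Suggest a follow-up action for a finished bulk operation."""
    if status != "FAILED":
        return "none"
    if error_code == "ACCESS_DENIED":
        # Run the mutation outside a bulk operation to find the missing scope.
        return "check_access_scopes"
    if error_code in RETRYABLE_ERROR_CODES:
        return "retry"
    # Unknown code: look it up in the BulkOperationErrorCode reference.
    return "inspect_manually"
```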

Validation error

If the input has the correct format, but one or more values failed the validation of the product creation service, then the response looks like the following:

Unrecognizable field error

If the input has an unrecognizable field, then the response looks like the following:
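When processing a result file, it can help to sort each line into one of the three cases described in this section. The following sketch does that under assumed response shapes: a top-level errors array for request-level problems such as an unrecognizable field, and a userErrors array inside the mutation payload for validation failures. Neither shape is confirmed by the text above, and classify_result is a hypothetical helper.

```python
import json


def classify_result(line, mutation_name="productCreate"):
    """Classify one result line as success or one of two error kinds.

    Returns (kind, messages) where kind is "success",
    "validation_error", or "request_error".
    """
    record = json.loads(line)
    # Assumed shape: request-level errors sit in a top-level "errors" array.
    if record.get("errors"):
        return "request_error", [e.get("message", "") for e in record["errors"]]
    payload = (record.get("data") or {}).get(mutation_name) or {}
    # Assumed shape: validation failures sit in the payload's "userErrors".
    user_errors = payload.get("userErrors") or []
    if user_errors:
        return "validation_error", [e.get("message", "") for e in user_errors]
    return "success", []
```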

Cancel an operation

To cancel an in-progress bulk operation, run the bulkOperationCancel mutation and supply the operation ID as an input variable:

Request

POST /admin/api/2021-07/graphql.json
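A cancel request could be sketched as follows, with the operation ID supplied as a GraphQL variable. The selection set, the helper name cancel_body, and the example ID are illustrative assumptions.

```python
import json

BULK_OPERATION_CANCEL = """
mutation cancel($id: ID!) {
  bulkOperationCancel(id: $id) {
    bulkOperation { status }
    userErrors { field message }
  }
}
"""


def cancel_body(operation_id):
    """Encode the bulkOperationCancel request as a JSON body."""
    return json.dumps(
        {"query": BULK_OPERATION_CANCEL, "variables": {"id": operation_id}}
    ).encode("utf-8")


# Placeholder ID: use the id returned when the operation was created.
body = cancel_body("gid://shopify/BulkOperation/1")
```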

Next steps