Shopify Query Language

The query language of the Analytics API is called ShopifyQL. A valid ShopifyQL query consists of parts similar to a valid SQL request. Let's look at an example:

SHOW sum(pageview_count) FROM visits SINCE -7d UNTIL -1d

This query will fetch the total pageviews for the last 7 days. The result would look something like:

sum_pageview_count 282778

Fetching data (SHOW)

There are a number of ways to select, filter and aggregate data using ShopifyQL.

Columns

Each schema is comprised of many columns.

SHOW utm_campaign_name, utm_campaign_term, utm_campaign_source

Aggregation Functions

A number of functions exist to aid with gathering or aggregating data.

function description
sum returns the summation of rows scanned
min returns the minimum value of rows scanned
max returns the maximum value of rows scanned
count returns the count of objects
DISTINCT returns the unique count of rows
avg returns the average value of rows scanned
median median value of results
any returns one value from the results

Aliases

Commonly used aggregation expressions are provided as named aliases.

SHOW total_pageviews vs SHOW sum(pageview_count)

Named aliases on the visits schema:

Name Type Definition
avg_duration number avg(duration)
total_pageviews number sum(pageview_count)
total_purchases number sum(purchase_count)
total_sessions number count(pageview_count)
total_visitors number count(DISTINCT user_token)

Result columns can be explicitly labeled with expression AS label, otherwise a label derived from the expression will be assigned which can often be a bit unwieldy.

SHOW sum(pageview_count) AS pageviews

Aggregate Conditions

ShopifyQL provides support for conditions within the SHOW statement. These can be used to filter rows from inclusion into aggregation functions. Aggregate conditions make it possible to get results back in a single query that would usually require two or more.

Selecting schemas (FROM)

Schemas are selected by the key word FROM. This works in a similar fashion to SQL. The following query is using the visits schema:

SHOW sum(pageview_count) FROM visits

Shopify provides several schemas that you can fetch data from:

Filtering (WHERE)

The WHERE clause indicates the condition or conditions that rows must satisfy to be selected.

Unlike aggregate conditions, which only filter for one column of the result. WHERE conditions filter for all columns that will be returned. WHERE clauses can simplify queries, such as below:

Using aggregate conditions

Using WHERE clauses:

Valid operators are:

  • !=
  • ==
  • IN

Grouping data

Grouping by results (BY)

The BY statement can be used to group the result-set by one or more columns. In the following example, we use the BY clause to group data based on day. Note that there is no row for 2016-02-01. This is because there is no data for that day. This behaviour differs from that of the OVER clause explained below.

Total pageviews by day (last 7 days)

SHOW sum(pageview_count) BY day(timestamp) FROM visits SINCE -7d UNTIL -1d

page_path sum_pageview_count
"2016-01-30T00:00:00-05:00" 69910
"2016-01-31T00:00:00-05:00" 33611
"2016-02-02T00:00:00-05:00" 18139
"2016-02-03T00:00:00-05:00" 22225
"2016-01-28T00:00:00-05:00" 49897
"2016-01-29T00:00:00-05:00" 59655

Grouping with backfill (OVER)

Similarly to BY the OVER clause can also be used to group result sets. The difference is that missing rows will be backfilled. In the following example, we use the OVER clause to group data based on day. Note that there is a backfilled row for 2016-02-01.

Total pageviews per day (last 7 days)

SHOW sum(pageview_count) OVER day(timestamp) FROM visits SINCE -7d UNTIL -1d

day_timestamp sum_pageview_count
"2016-01-30T00:00:00-05:00" 69910
"2016-01-31T00:00:00-05:00" 33611
"2016-02-01T00:00:00-05:00" 0
"2016-02-02T00:00:00-05:00" 18139
"2016-02-03T00:00:00-05:00" 22225
"2016-01-28T00:00:00-05:00" 49897
"2016-01-29T00:00:00-05:00" 59655

It is also possible to combine BY and OVER queries.

Total pageviews by browser per day (last 7 days)

SHOW sum(pageview_count) OVER day(timestamp) BY ua_browser FROM visits SINCE -7d UNTIL -1d

day_timestamp ua_browser sum_pageview_count
"2016-01-31T00:00:00-05:00" "Opera Mini" 0
"2016-01-31T00:00:00-05:00" "Sogou Explorer" 28
"2016-01-31T00:00:00-05:00" "UC Browser" 4
"2016-01-31T00:00:00-05:00" "Maxthon" 6
"2016-01-31T00:00:00-05:00" "Amazon Silk" 2
"2016-01-31T00:00:00-05:00" "Iceweasel" 0

Time Functions

Time functions can be used in conjuction with BY or OVER to bucket data into a specific period. For example, you can bucket pageview_count by month using the following query.

SHOW sum(pageview_count) OVER month(timestamp) AS month

Function Name Description Output Type Output on "2021-04-22T05:46:05-04:00"
year Rounds the timestamp down to the nearest year Timestamp "2021-01-01T00:00:00-04:00"
quarter Rounds the timestamp down to the nearest quarter Timestamp "2021-03-01T00:00:00-04:00"
month Rounds the timestamp down to the nearest month Timestamp "2021-04-01T00:00:00-04:00"
week Rounds the timestamp down to the nearest week (starting on monday) Timestamp "2021-04-19T00:00:00-04:00"
day Rounds the timestamp down to the nearest day Timestamp "2021-04-22T00:00:00-04:00"
three_hour Rounds the timestamp down to the nearest hour that is a multiple of 3 Timestamp "2021-04-22T03:00:00-04:00"
hour Rounds the timestamp down to the nearest hour Timestamp "2021-04-22T05:00:00-04:00"
minute Rounds the timestamp down to the nearest minute Timestamp "2021-04-22T05:46:00-04:00"
hour_of_day Extracts the numerical hour of the day from the timestamp Number 5
day_of_week Extracts the literal day of the week from the timestamp String "Thursday"
day_of_month Extracts the literal day of the month from the timestamp String 22
day_of_year Extracts the numerical day of the year from the timestamp Number 112
week_of_year Extracts the numerical week of the year from the timestamp Number 17
month_of_year Extracts the literal month of the year from the timestamp String April
weekday Returns a boolean to indicate if the day is a weekday Boolean true

Time Range (SINCE/UNTIL)

When writing ShopifyQL queries, the use of time/date constraints is strongly recommended. Not including the SINCE and UNTIL clauses will result in a query that scans a shops entire dataset, which can include years of data and take quite a long time.

Explicit dates are supported, such as 2016-01-28. There is also support for relative dates.

key unit
d day eg: SINCE -7d
w week eg: SINCE -3w
m month eg: SINCE -2m
y year eg: SINCE -1y

Note that SINCE will be taken from the beginning of the day, while UNTIL will be to the end of the day.

For example. Assume today is 2016-02-04

SINCE -7d UNTIL -1d would be equivalent to SINCE 2016-01-28 UNTIL 2016-02-03

Expression Functions

You can use the following expression functions to manipulate column values. These functions are outlined below:

Function Signature Description
trim(input String) Collapses repeating whitespace into a single space while removing leading and trailing whitespace
titleize(input String) Returns the titleized version of the input
site_name(URL String) Returns the host's domain without the top level domain (e.g. `.com`, `.org`)
domain(URL String) Returns the host's domain and top level domain from the URL-string
host(URL String) Returns the hostname of a URL
strip_url(URL String) Returns a string including the host name and path, without a trailing slash
trim_path(URL String) Returns a [canonical form](https://www.shopify.ca/partners/blog/canonical-urls) of an address field
dedot(input String) Returns a string where all instances of '.' are replaced with ' '
to_percent(input Number) Returns the input number as a percentage
concat(arg1 Any, arg2 Any, ...) Concatenates multiple arguments into a single string
round(input Number, precision Number) Returns the input number rounded to a certain precision
<arg1> + <arg2> Returns the sum of arg1 and arg2
<arg1> - <arg2> Returns the difference of arg1 and arg2
<arg1> * <arg2> Returns the product of arg1 and arg2
<arg1> / <arg2> Returns the quotient of arg1 and arg2