Advanced Analytic Queries
TimescaleDB can be used for a variety of analytical queries, both through its native support for PostgreSQL's full range of SQL functionality, as well as additional functions added to TimescaleDB (both for ease-of-use and for better query optimization).
The following list is just a sample of some of its analytical capabilities.
Median/Percentile
PostgreSQL has inherent methods for determining median values and percentiles namely the function percentile_cont (PostgreSQL docs). An example query for the median temperature is:
SELECT percentile_cont(0.5)
WITHIN GROUP (ORDER BY temperature)
FROM conditions;
Cumulative Sum
One way to determine cumulative sum is using the SQL command sum(sum(column)) OVER(ORDER BY group). For example:
SELECT host, sum(sum(temperature)) OVER(ORDER BY location)
FROM conditions
GROUP BY location;
Moving Average
For a simple moving average, you can use the OVER windowing function over some number of rows, then compute an aggregation function over those rows. The following computes the smoothed temperature of a device by averaging its last 10 readings together:
SELECT time, AVG(temperature) OVER(ORDER BY time
ROWS BETWEEN 9 PRECEDING AND CURRENT ROW)
AS smooth_temp
FROM conditions
WHERE location = 'garage' and time > NOW() - interval '1 day'
ORDER BY time DESC;
First, Last
TimescaleDB defines functions for first and last, which allow you to get the value of one column as ordered by another.
SELECT location, last(temperature, time)
FROM conditions
GROUP BY location;
See our API docs for more details.
Histogram
TimescaleDB also provides a histogram function. The following example defines a histogram with five buckets defined over the range 60..85. The generated histogram has seven bins where the first is for values below the minimun threshold of 60, the middle five bins are for values in the stated range and the last is for values above 85.
SELECT location, COUNT(*),
histogram(temperature, 60.0, 85.0, 5)
FROM conditions
WHERE time > NOW() - interval '7 days'
GROUP BY location;
This query will output data in the following form:
location | count | histogram
------------+-------+-------------------------
office | 10080 | {0,0,3860,6220,0,0,0}
basement | 10080 | {0,6056,4024,0,0,0,0}
garage | 10080 | {0,2679,957,2420,2150,1874,0}
What analytic functions are we missing? Let us know on github.