Filters, Aggregations and Groupings
Let's have a look at some filters which might be not self-explanatory.
- != is a common expression in programming and means "not equal", hence it filters for all rows which do not match the given value.
- in takes multiple values and filters for all rows which match one of the values.
- not in takes multiple values and filters for all rows which NOT match one of the values.
- Timestamp filters are applied to data according to the selected time zone for the dataset. If no time zone has been selected, then it will by default be the time zone of the machine that created it. Also, keep in mind that filters on a timestamp column are always applied on the raw data not the aggregated or grouped result.
- Relative Between acts as a combination of the "Today"/"Yesterday" and the "Between" filter types for timestamps. Data can be filtered to show a specific period over the current and previous day, relative to the time the dataset is executed.
- like, not like behaves as the SQL operator does. So you can use "%" to match any sequence of characters and "_" to match any single character. No need to enclose the string you are looking to match in quotes ("). E.g. the filter "like: %chine_" would match "machine1" or "Location1$machine5" but not "machine20".
- RegExp Match, Not RegExp Match will filter for the entries that match (or don't match) your regular expression. You can find many sources on how to write a regular expression on the internet.
By default, all timestamp fields in datasets will be displayed in your local time. If you wish to see how your data looks in a different timezone, you may use the time zone selector at the top of the screen. Changing the timezone will display all dataset timestamps in their equivalent time for your selected zone. Filtering and grouping will be done on the translated timestamp, which may change the data included in your dataset. For example, a filter showing data from the last 60 minutes will show different data for TZ "Europe/London" then it would for "Europe/Vienna".
(Left) Translated timestamps in "Australia/Sydney" (AEST) time. (Right) Local Time (CET)
Note: the timezone field is for testing purposes only, and the selected value will not persist when saved.
- No data entries get ignored by default in the aggregations.
- Minimum, Maximum applied to a text column will give you the lowest (for Minimum) or the highest (for Maximum) entry alphabetically (with the number being lower than character, hence "0Z" < "A").
- Count will simply count the rows.
- Standard Deviation, Variance uses the sample variation and sample standard deviation if you are on a system using Timescale DB. If you are on a system using Crate DB the population variance/standard deviation is used ("no data" entries are excluded from these calculations).
- grouping creates a group for every distinct element and if you group by multiple columns every distinct combination results in a group. A group results in a "summary" of all entries represented in one single entry.
- grouping timestamps has more options than grouping for text or numbers. Here you can put timestamps which are from the same minute, hour, day, etc. in a group. Grouping by e.g. day, week, or month implicitly needs a time zone. Here we use the local time "Europe/Vienna".
- grouping timestamps by time period is an input that allows you to enter an amount of time to create groups or so called "time buckets" which size is determined by selected timestamp above and entered amount e.g. 7 minutes, 5 hours, 2 weeks etc. Default value is 1, and minimum allowed value is 0.01. Decimal values are supported.
Example of grouping by 5 seconds.
In the following video, we show how to create a dataset which contains the average speed of different machine per day, excluding the speed values below 5. We achieved this through grouping by thing and day, taking the average of one-speed column and filtering the second speed column for ">5" and setting it to not show data. Which leads to the data being filtered instead of the results.