Datasets Query Builder

Overview

Performing a Query typically involves pulling data objects from a table or a series of tables. In Osmos, once you have created your Dataset(s) and have populated at least one Table, you are ready to build a query.

Queries are created in the Query Builder interface in Datasets.

Pre-Requisites to build a Query

  1. Create a Dataset and upload data

How to Build a Query

Step 1: Select a table you would like to Query, and enter the Query builder

  1. On the Datasets panel, click on the name of one of the tables you will be querying.

Step 2: Select Table and Joins

  1. The table that was initially selected will be pre-populated into the query. Additional tables can be added by linking values in the tables via join logic.

Step 3: Select Columns

  1. The columns available from table selection and any join activity will appear in a dropdown. The values here can be selected for display in your output.

Step 4: Aggregations

  1. Rows can be aggregated in order to reformat the data in the table. Any columns created using aggregations will need to be named. Available aggregations are: Average, Count, Count Distinct, Minimum, Maximum, Sum, Concatenate, Concatenate Distinct, Boolean: All, Boolean: Any

  1. Columns can also be inter-aggregated in order to create new outputs. In this example the columns First name and Last name have been combined into a single output

Step 5: Filters

  1. In order to access a specific subset of records, conditions can be set to include or exclude records based on values in specific fields. Complex filters can be designed leveraging AND and OR scenarios.

Step 6: Group By

  1. Group by allows a user to select the key fields to be output, and collapse non-key fields. This step is required for aggregations to be performed. In the context of an excel pivot table, Group by fields are the equivalent of the "Rows" used to construct a pivot. In this example, grouping by State and Employee Type, Our aggregations will tell us how many employees of each type are in a given state, and how much salary is being paid out to that group. This field will auto-populate based on the requirements of any aggregations that are to be performed.

Group By is required for aggregations to be performed

Step 7: Run Query/Download Output

  1. Once the query has been built, a user can select "Run Query" in order to confirm the output. If there are any issues with the query configuration, an error will populate at this point. If a query is successful, a user will receive a notification of the success and the number of rows returned.

  1. If the output created by the query is correct, a user can now download their results as a .csv file. If a query is being built for a pipeline, the query can then be saved, to be run at set intervals.

Step 8: Save query

  1. If a query is likely to be used again in the future, either to create consistent inputs to a pipleine or to generate output ad hoc, the query can be saved. Selecting save query and providing a name will allow the query to be recalled using the dropdown on the top right of the query builder.

Last updated