Introduction to Datasets

Datasets acts as the platform's data warehouse: it stores every dataset that has been pulled into the platform. Each dataset has a name, a creation/refresh date, and a symbol in the bottom-left corner of its box that indicates where the dataset came from.


The Connections symbol marks a dataset that has been pulled directly from the source. The Pipelines symbol marks a dataset that is the output of a transformation carried out in Pipelines, where the option to save the result has been toggled on.


We can also carry out a preliminary analysis at this stage by exploring the dataset. Exploring opens the dataset so that we can view and search individual rows, see summaries for each column, inspect correlations and distributions, and use the Advanced Explorer, which works much like a pivot table.
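
For readers who think in code, the four exploration views described above can be loosely compared to common pandas operations. This is only an analogy, not a description of how the platform works internally: the sample data, the column names, and the choice of pandas are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample data; not taken from the platform.
df = pd.DataFrame({
    "region": ["North", "North", "South", "South"],
    "product": ["A", "B", "A", "B"],
    "units": [10, 20, 30, 40],
    "revenue": [100.0, 220.0, 290.0, 410.0],
})

# View and search individual rows: filter on a column value.
south_rows = df[df["region"] == "South"]

# Column summaries: count, mean, min, max, etc. for numeric columns.
summary = df.describe()

# Correlations between numeric columns.
correlations = df[["units", "revenue"]].corr()

# Distribution of a categorical column: how often each value occurs.
region_counts = df["region"].value_counts()

# Pivot-table style view, comparable in spirit to the Advanced Explorer:
# rows are regions, columns are products, cells are summed units.
pivot = df.pivot_table(
    index="region", columns="product", values="units", aggfunc="sum"
)

print(south_rows)
print(summary)
print(correlations)
print(region_counts)
print(pivot)
```

Each variable corresponds to one of the views: row search, column summaries, correlations, distributions, and the pivot-style Advanced Explorer.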