Modeling your data as described in Correlated's Required Data Model for Data Warehouse Connections will put you in a good place to get started with Correlated. However, as we all know, our data needs evolve over time and there are some last mile calculations that make sense to define in Correlated. Typically, these transformations make sense if they involve adjusting or calculating rows in one or more columns in the provided data models. They aren't a good fit for heavy aggregations that span multiple rows or for joins across multiple tables.
Correlated currently supports last mile transforms via our in-app custom SQL editor for data warehouses.
Our ability to write custom SQL is currently only compatible with the following data warehouse integrations: Snowflake, BigQuery, and Redshift.
- Once you have fully connected your data warehouse, click "Edit" on the desired integration card. For more info on connecting, refer to this document.
- From here, click into the "Connect Table" button.
- Next, you'll see our self-serve data warehouse table connector - Instead of choosing a table from a particular schema, select the "Use a Custom SQL Query" box.
- Give this new view a clear name. Right below is where you can write a SQL query on top of any of the existing tables currently exposed to Correlated.
Tip: If you're unsure what tables are being shared with Correlated, go back and click into the "Connection" tab to see a list of all available tables.
- When you're done writing the query, click "Save Query." The Dimensions you created will populate on the right-hand side of the screen. These Dimensions are the data points you can access to build your Playbook Conditions.
- When you're done writing your query, you still need to tell Correlated how to interpret this new view:
- Select a table type (Accounts or Users).
- The timestamp column is needed for "versioned" data source. Versioned data sources are updated incrementally, whereas un-versioned data sources are only updated daily. Please note that this column MUST be when that row was written into the DW or populated in a view. This cannot be another timestamp from a downstream system.
- Select the User ID that uniquely identifies a User and the Account ID that uniquely identifies an Account as required. You must use the same Account ID and User ID as you do in other data sources so that you have one single, global identifier. Failing to do so will result in duplicate Accounts and Users.
- It's also recommended to add a short description of the table that you can refer to later on.
- Finally, tell Correlated how often we should re-sync this table: Every 24 hours, every 12 hours, every 4 hours, every 1 hour, or every 15 minutes.
*Note: You will need to include all of the above fields in your query (Account ID / User ID and Timestamp if relevant). For more info on our recommended schemas, see this document.
- If your columns include text fields, we can help you create dimensions that check whether or not a substring exists. For example, if your CS team writes in their notes that a customer has a specific use case, you can search for that specific text and create a dimension that is true with the text exists, and false if it doesn't.
Cross Column Calculation
Some examples of this include:
- Calculating a metric: for example in order to calculate license utilization, you divide total licenses sold by total licenses utilized
- Comparing two columns: for example, you check whether or not the date when someone invited another user is greater than when they upgraded their plan
Updated 11 months ago