Correlated's Required Data Model for Data Warehouse Connections

If you are NOT planning on connecting a data warehouse, the following section will not be relevant for you.

Correlated automatically handles the schemas provided by data sources like Salesforce and Segment. However, if you are going to connect a data warehouse, read on. If you are planning on using Reverse ETL, check out our documentation on how to connect Hightouch Integration and Census Integration.

Correlated makes the data you already have even more powerful and actionable, but your data does need to meet several minimum requirements.

Bare Minimum Data Requirements

User ID: You must have a unique ID that identifies your users across all data sources. Some customers choose to use email, but this is not recommended!

Account ID: You must have a unique ID that identifies groups of users across all data sources. This can be an Account ID that maps to Companies, or it could be a Workspace or Organization ID.

📘

Choose your unique ID that identifies groups by mapping that choice to how your organizations sells

Correlated does not care what your ID describes, as long as at least one exists to group users together. Typically, customers will choose to use whichever ID best groups users by how they want their GTM teams to reach out. For example, if your organization focuses on selling to entire Accounts and Companies, choose an Account ID that identifies that. If a single Account could have multiple teams or workspaces, and your GTM teams sell into each team or workspace separately, use that as your unique ID.

If you are choosing to connect your CRM either as a data source to populate Correlated fields, or as a downstream integration, you must have a field on the Account and Contact (or Company and Contact fields in Hubspot) that identifies the unique Account ID and unique User ID. Correlated does offer ID Syncing, which essentially populates custom fields in Salesforce with unique Account and User IDs. However, doing your own ID Syncing in your CRM allows you to own that logic, is more reliable, and is our recommended approach. Correlated does not offer ID Syncing for Hubspot.

If you are not connecting your CRM as a data source, we require that you sync the portions of CRM data that you need to build the Signals you want into your data warehouse. If you do not do this, your GTM teams will not be able to build robust or complex Signals, which will severely reduce the value you will be able to get out of Correlated.

Recommended Data Schema

Now that we've discussed the minimum requirements to get started with Correlated, here's what your tables should look like. Note that you can have as many tables as you want, as long as you have one table that describes all your accounts, and one table that describes all your users. Although having a parent table that lists all accounts and users is not required, it ensures better data integrity.

Describing Accounts
This table should include ALL of your accounts.

Column NameRequired?Purpose
updated_atYESTimestamp of when the row was updated to support incremental, more real-time updates
account_id (or workspace_id)YESPrimary Key
sfdc_account_idRECOMMENDEDEnable Salesforce Tasks
account_nameYESIdentify Accounts in a human-readable way
sfdc_account_owner_emailRECOMMENDEDSend Slack DMs to Account Owners
sfdc_csm_owner_emailOPTIONALSend Slack DMs to CS Owners
sfdc_account_owner_idRECOMMENDEDAssign Salesforce Tasks to Account Owners
sfdc_csm_owner_idOPTIONALAssign Salesforce Tasks to CSM Owners

Describing Users
This table should include ALL of your users.

Column NameRequired?Purpose
updated_atYESTimestamp of when the row was updated to support incremental, more real-time updates
user_idYESUniquely identify users
account_idYESMap users to Accounts
sfdc_contact_idRECOMMENDEDEnable Salesforce Tasks
emailYESIdentify users in a human-readable way, enable downstream email integrations
sfdc_account_owner_emailRECOMMENDEDSend Slack DMs to Account Owners
sfdc_csm_owner_emailOPTIONALSend Slack DMs to CS Owners
sfdc_account_owner_idRECOMMENDEDAssign Salesforce Tasks to Account Owners
sfdc_csm_owner_idOPTIONALAssign Salesforce Tasks to CSM Owners

You can add an arbitrary number of additional columns to describe accounts and users, as long as you only have one unique value per updated_at timestamp. If you have duplicate values, Correlated will only select the most recent value with the latest updated_at timestamp.

Common Data Types and How to Expose Them to Correlated

Data that describes "who" a customer is
Examples of this type of data include employee count, industry, customer type, customer plan type, etc. These fields are typically descriptors of who an account or user is. These data points can be added as columns in the above schema.

Data that describes customer behavior for a fixed period of time (or for all time)
Examples of this type of data include sign-ins over all time, or sign-ins in the last 7 days. These data points can be added as columns in the above schema.

Data that describes customer behavior for a dynamic period of time
Examples of this type of data type include a count of sign-ins over x number of days, or percentage change in sign-ins week over week. This type of data can be modeled dynamically in Correlated, as described in the next section.

Modeling Product Usage in Correlated

Correlated can accept most tables from data warehouses, as long as each table contains a global Account ID and User ID that can be used to uniquely identify accounts and users. For product usage, we typically recommend that you handle any necessary event rollups prior to importing them to Correlated.

Here are the most common tactics to convert this data so that every Account or User has fields that represent this data in different ways:

  1. Rollups: Events are aggregated across time windows (for example, the last day, last 7 days, last 14 days, last 30 days, etc.)
  2. Last Event Timestamp: The last time an event occurred is calculated (for example, minutes since the last time someone signed in, or days since someone connected an integration)

If you'd like to use SQL to create these rollups, reach out to your Correlated rep for assistance and access to our Custom SQL Query modeling, directly in the data warehouse integration!

2838

IMPORTANT
If you only have access to raw event tables via your data warehouse (see the table below as an example) we HIGHLY recommend using a reverse ETL solution like Hightouch or Census to transform this data into actionable metrics.

In this scenario, using a reverse ETL tool is the best and most seamless data onboarding experience for Correlated.

Pro Tip: If you don't already have Hightouch, you get your first destination FREE and can start sending data to Correlated to test out.

Column NameRequired?Purpose
user_idYESUniquely identify users
account_idRECOMMENDEDMap usage to accounts. Without this, you will only be able to query usage on a user basis
event_nameYESIdentifies the event that occurred that represents product usage
propertiesOPTIONALAdditional properties that describe the event

Data Categories That You Should Consider Exposing to Correlated

Customer Lifecycle Stages
Correlated provides visibility into how customers move through the entire customer lifecycle. By providing us with what stage your customer is in, we can track things like: when a customer changes stages, how long a customer spends in a given stage, etc. You can also build "Signals" directly in Correlated to monitor changes in stages, but if you have a field that describes these various stages in your own data warehouse, these are very useful fields to include

Examples:

  • Organization Created Date
  • User Created Date
  • Opportunity Created Date
  • Conversion to Paid Date
  • Trial Started Date
  • Fully Onboarded Date

Feature Usage
SaaS companies have unprecedented visibility into what their customers are doing in their product, which allows GTM teams to provide a better, more relevant, and targeted experience. That's why tracking feature usage is super important! You can either track feature usage events like clicks and page views, or aggregate feature usage like how many workflows an account has created.

Segmentation
With Correlated, you can offer a different GTM experience for customers in different segments. For example, you might want to send Enterprise customers different content compared to self-serve SMB customers.

Examples:

  • Employee count
  • Industry
  • Tier

Monetization
Of course, at the end of the day, we all want to make an impact on top line revenue. By tracking monetization data in Correlated, you'll be able to segment customers and build unique workflows based on how customers are buying your product.

Examples:

  • Plan
  • Usage (if pricing on usage)
  • License Utilization
  • License End Date

As you can see, Correlated is an incredibly flexible platform. We typically recommend customers to start with a core 3 Signals they want to implement and to set up the data they need to enable those Signals prior to boiling the ocean on connecting data with Correlated.