Correlated Data Security Overview

Prior to integrating your company's data with Correlated, it's important to understand exactly what type of data we will be ingesting, and from where.

📘

Questions?

If your data or security team has questions about the data Correlated collects that are not covered in this document, please contact us directly at [email protected] and we'll be happy to provide more information.

Overview

Correlated requires two types of data: CRM data and Product Usage data.

CRM data includes, but is not limited to, fields that are typically stored in Salesforce, such as Account Names, Contact Names, Contact Emails, Account Tier, and Account Industry.

Product Usage data is composed of account-based and user-based behavioral data on how specific users are using your product. This includes data such as how many times a user used a specific feature. We can also leverage Account and User data about subscriptions and trials, such as the date a customer signed up, the plan they signed up for, and their contract size.

Beyond CRM data and Product Usage data, we will also have some visibility into internal users of applications we integrate with, such as Slack, Outreach, or Salesforce. We do not pull data from primary downstream integrations; our data is ingested from data sources like CRMs and Data Warehouses. However, we will be able to check whether a user exists in a downstream integration, such as Salesloft or Outreach.

See below for additional details on data collection and extraction schedules per source:

Segment Data Source

  • Account ID
  • User ID
  • Segment_track_events
    • These are time-series events (think button clicks, features used).
  • Segment_identify_events
    • These are time-series events that identify users (and sometimes accounts). This table is by default a User table.
  • Segment_group_events
    • These are time-series events that identify accounts. By default, this table is an Account table.
  • Segment_page_events
    • We rarely find that page views/events are relevant for our users, but we do pull the data.

Segment Extraction Schedule

  • Segment is the only real-time event stream that Correlated currently supports. When Segment events are triggered, we ingest them in real time, which lets us augment the time-series data with joined metadata.
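To make "augment the time-series data with joined metadata" concrete, here is a minimal sketch of one way such enrichment could work. It is not Correlated's actual implementation. The payload field names (type, userId, groupId, traits, event) follow Segment's public spec; the Enricher class and the shape of its output are hypothetical.

```python
from typing import Any, Dict, Optional

Event = Dict[str, Any]


class Enricher:
    """Hypothetical helper: remembers the latest traits from identify/group
    calls and attaches them to later track/page events."""

    def __init__(self) -> None:
        self.user_traits: Dict[str, Dict[str, Any]] = {}
        self.account_traits: Dict[str, Dict[str, Any]] = {}
        self.user_account: Dict[str, str] = {}  # userId -> groupId

    def handle(self, event: Event) -> Optional[Event]:
        kind = event["type"]
        if kind == "identify":
            # identify events describe users (User table by default)
            traits = self.user_traits.setdefault(event["userId"], {})
            traits.update(event.get("traits", {}))
            return None
        if kind == "group":
            # group events describe accounts (Account table by default)
            traits = self.account_traits.setdefault(event["groupId"], {})
            traits.update(event.get("traits", {}))
            if "userId" in event:
                self.user_account[event["userId"]] = event["groupId"]
            return None
        if kind in ("track", "page"):
            # time-series event: join in whatever metadata is known so far
            user_id = event.get("userId")
            account_id = self.user_account.get(user_id)
            return {
                **event,
                "user": dict(self.user_traits.get(user_id, {})),
                "account_id": account_id,
                "account": dict(self.account_traits.get(account_id, {})),
            }
        return None  # other call types are ignored in this sketch
```

In this sketch, a track event that arrives after the matching identify and group calls comes back with the user's traits and the account's traits attached; an event for an unknown user comes back with empty metadata.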

CRM Data Source

  • We currently pull any objects that have been updated in the last year on our first pull, and only pull newly updated objects after that.
  • If users provide us with fields that store unique IDs, we will use those. Otherwise, we rely on our internal heuristic to build Account dimensions.
  • For Salesforce specifically, Correlated makes API calls on your behalf, and those calls count against your rate limits. We do our best not to use them wastefully, and we stop making API calls once usage reaches 90% of the max. The rate limit refreshes every day, so if we aren’t able to pull data because of rate limiting, we’ll pick up where we left off when the limit is refreshed. For Salesforce Task creation, Correlated makes one request per task.
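The rate-limit behavior described above (stop near the limit, resume after the daily refresh) can be sketched as follows. This is an illustration under stated assumptions, not Correlated's code: ApiBudget, pull, and the fetch callback are hypothetical names, and the sketch reads the 90% rule as "stop once usage reaches 90% of the daily maximum".

```python
from dataclasses import dataclass
from typing import Callable, Sequence

THRESHOLD_PERCENT = 90  # from the text; the exact cutoff semantics are assumed


@dataclass
class ApiBudget:
    """Hypothetical tracker for one day's API usage."""
    daily_max: int
    used: int = 0

    def can_call(self) -> bool:
        # Integer arithmetic avoids floating-point surprises at the boundary.
        return self.used * 100 < self.daily_max * THRESHOLD_PERCENT

    def record_call(self) -> None:
        self.used += 1

    def reset(self) -> None:
        # Called when the daily limit refreshes.
        self.used = 0


def pull(object_ids: Sequence[str], budget: ApiBudget,
         fetch: Callable[[str], None], start: int = 0) -> int:
    """Fetch objects from index `start` until done or the budget runs out.

    Returns the index to resume from on the next run.
    """
    i = start
    while i < len(object_ids) and budget.can_call():
        fetch(object_ids[i])
        budget.record_call()
        i += 1
    return i
```

The returned index acts as a checkpoint: after the limit refreshes, calling pull again with that index continues from the first object that was not fetched, so nothing is fetched twice.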

CDW Data Source

  • Account Tables
  • User Tables
  • Product Usage

CRM and CDW Extraction Schedules

  • If we have no data at all from the source, we pull everything that’s been created or updated in the last year.
  • Otherwise, we pull everything that’s been created or updated in the last 15 minutes (for Cloud Data Warehouses) or in the last 60 minutes (for CRMs).
  • Regardless, every day we pull the last day’s worth of created/updated data.
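The schedule above amounts to choosing a lookback window for each pull. A minimal sketch, assuming "the last year" means 365 days and using hypothetical names (extraction_window_start, the "crm" and "cdw" source kinds, the daily_run flag):

```python
from datetime import datetime, timedelta, timezone

INITIAL_LOOKBACK = timedelta(days=365)   # "the last year" (assumed 365 days)
DAILY_LOOKBACK = timedelta(days=1)       # the once-a-day catch-up pull
INCREMENTAL_LOOKBACK = {
    "cdw": timedelta(minutes=15),        # Cloud Data Warehouses
    "crm": timedelta(minutes=60),        # CRMs
}


def extraction_window_start(source_kind: str, has_data: bool,
                            now: datetime, daily_run: bool = False) -> datetime:
    """Return the earliest created/updated timestamp a pull should cover."""
    if not has_data:
        # First pull: nothing ingested yet from this source.
        return now - INITIAL_LOOKBACK
    if daily_run:
        return now - DAILY_LOOKBACK
    return now - INCREMENTAL_LOOKBACK[source_kind]


now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
first_pull = extraction_window_start("crm", has_data=False, now=now)
cdw_pull = extraction_window_start("cdw", has_data=True, now=now)
```

The daily pull overlaps with the incremental pulls on purpose in this sketch: re-reading the last day's changes is a simple way to catch records that an incremental run missed.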