I was reflecting on the recent acquisitions of CDP companies Boxever and Zaius and thought it would be useful to offer a primer on what a customer data platform is and why it is a critical component in a modern architecture for managing digital customer experiences.
A Customer Data Platform (CDP) is software that collates data about customers and their activities across channels and combines it into unified customer profiles which can then be used to target marketing more precisely and improve return on marketing spend.
Table of Contents
- Single View of the Customer
- CDP Key Functions
- CDP Identity Resolution
- CDP Data Sources
- Predictive Metrics
- Segmentation, Audiences and Experimentation
- Use of CDP for Loyalty Programs
- CDP Build Versus Buy
- CDP and Web Analytics
- CDP Vendors
Single View of the Customer
A CDP's primary goal is to generate a consolidated, comprehensive store of customer profile data called a "single view of the customer". The single view of the customer is a treasured goal in retail businesses because they can use this data to improve the targeting of marketing and other interactions with customers. More precisely targeted experiences encourage each customer to spend more money, more frequently, improving the return on investment (ROI) of marketing spend. Targeting includes presenting the most relevant products, offers and services to each customer. The more data that is collected in the single view of the customer, the more insight can be derived from it, and therefore the better the results can be.
There are several barriers in front of a retailer trying to create a single view of the customer. Retailers interact with their customers across multiple channels, including social, web, App, email, contact center and store. In order to build a complete picture of the customer, a retailer must be able to join together all of this information from these different channels into a unified view. It is not always obvious which interactions in all these channels, over an extended period of time, belong to the same person. The CDP overcomes these challenges.
CDP Key Functions
The key functions of a CDP are:
- A data model in which unified customer profiles can be stored and maintained
- Data matching and identity resolution - determining which customer profile a new piece of information belongs to, and creating new customer profiles as necessary
- Grouping customers into segments (or audiences) and assigning customers to segments fluidly as data about them changes
- Recording a history of changes to a customer profile and segment membership over time
- Open interfaces allowing external systems to access customer profile data, especially marketing execution engines that deliver marketing messages and experiences to customers
- Pre-built connectors for the most popular of these marketing engines, to make integration of the CDP simpler and quicker
- Reporting and analytics functions to understand data, trends and statistics
- Management screens to manage all of the above, putting as much power as possible in the hands of marketing teams rather than needing IT intervention
CDP Identity Resolution
One of the key functions of the CDP is to resolve multiple identities of visitors into a single view of the customer. This is surprisingly hard to do. Customers use multiple devices with multiple browsers and multiple applications on each device. They interact with a retailer across multiple channels, including social, web, App, email and store, over an extended period of time measured in days, weeks, months and years. Their core data (e.g. name, address, age), interests and behaviors also change over that time. In order to build a complete picture of the customer, the CDP keeps track of every interaction it can observe, whether it comes from a customer already known to the business or from an anonymous visitor, and attempts to detect when activities originate from the same person so that it can assemble more complete profile data.
Identity Resolution Example
The retailer in this example is capturing information about visitors to its website, users of its mobile App, and recipients of emails it sends. This is a simple, common scenario.
Without a CDP, the retailer might see some or all of these interactions over time as belonging to three or more customers. In fact, in this example they all belong to one customer. Here is how the CDP detects this and resolves these multiple identities into a single customer view. Each event in the sequence below could be separated from the next by seconds, hours, or days.
1. An anonymous visit is made from a browser on a cellphone. It is anonymous because the visitor is using Incognito mode or is simply a first-time visitor to the site. The website drops a cookie on the browser once permission has been given.
2. Later, someone downloads the retailer's mobile App and registers as a new user. The App does not have access to the cookie from step 1, so the CDP does not know this activity is from the same person.
3. Next, a first-time visit is made to the retailer website from a desktop PC browser. At this point the three activities so far are seen as three different people, because there is nothing to suggest they are the same person.
4. A second visit is made to the website from the same cellphone browser as in step 1. Since the cookie is in place, the CDP knows this is the same person as in step 1 and can add to the single view of that particular visitor. It still does not know the visitor's name, but it does know which products or content were viewed across both visits (steps 1 and 4).
5. The welcome email that was sent when the mobile App registration was performed is opened and clicked from web mail using the same desktop browser as in step 3. Since the cookie was set in step 3 and the clicked email link uniquely identifies the App registration, the CDP now knows the activities in steps 2, 3 and 5 are all from the same customer. It still does not know that the activities in steps 1 and 4 also belong to that person. Any marketing activity conducted now will target this known customer based on their App registration data and their website visits using this desktop browser, but will not be able to take into account any product interest shown during the visits using the cellphone browser.
6. The visitor continues to browse the website on their desktop, and this activity is also recorded against the customer profile identified in the previous step.
7. Finally, the mobile browser used in steps 1 and 4 is used to log in to the retailer website with the email and password set up when the mobile App was first used. The CDP can now detect that all seven activities belong to the same individual. Marketing that takes place from now on will use the rich, unified customer profile that includes all of the interactions this customer has had with the retailer.
The above is a simple example, but shows the principle of exact matching of different streams of activity, so that over time more and more becomes known about a customer, and previously isolated sources of knowledge are combined into a single view. Having this rich view of a customer gives targeting tools like product recommendation engines more power to deliver messages and products that maximize the likelihood of the customer making a purchase.
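One way to picture the exact matching in the example above is as a graph of identifiers that are merged whenever an event links two of them. The following is a minimal sketch using a union-find structure; the class name `IdentityGraph` and the identifier strings are hypothetical, and a real CDP will use its own data model and matching rules.

```python
class IdentityGraph:
    """Union-find over identifiers such as cookies and App user IDs."""

    def __init__(self):
        self.parent = {}

    def find(self, identifier):
        # Register unseen identifiers as their own profile.
        self.parent.setdefault(identifier, identifier)
        root = identifier
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every visited node straight at the root.
        while self.parent[identifier] != root:
            self.parent[identifier], identifier = root, self.parent[identifier]
        return root

    def link(self, a, b):
        # An event that carries both identifiers proves they are one person.
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a

    def same_person(self, a, b):
        return self.find(a) == self.find(b)

    def profile_count(self):
        return len({self.find(i) for i in list(self.parent)})


graph = IdentityGraph()

# Steps 1 to 4: three unrelated identifiers, so three apparent people.
for identifier in ("cookie:phone", "app:user-42", "cookie:desktop"):
    graph.find(identifier)
profiles_before = graph.profile_count()

# Step 5: the welcome email link is clicked in the desktop browser.
graph.link("app:user-42", "cookie:desktop")

# Step 7: the phone browser logs in with the App credentials.
graph.link("cookie:phone", "app:user-42")
profiles_after = graph.profile_count()
```

Each `link` call corresponds to an event that carries two identifiers at once, which is the only evidence exact matching relies on.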
While the streamed data in the above example is exactly matched and consolidated, some data sources require approximate matching to previous records, for example matching on name and address. Fuzzy logic and match scoring are used on each potential match to estimate the probability that a record from a source system belongs to an individual profile already in the CDP. The CDP can be configured to match only records with a high probability of being the same person. Even then errors can be made, which is why all CDP identity resolution is done at a logical level. If further information is revealed that proves a previous assumption incorrect, the CDP can reorganize the data quickly to correct each customer's profile, in essence breaking previously formed links and relationships and reforming them in response to the new information received.
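Approximate matching can be sketched with nothing more than string similarity. The example below uses `difflib.SequenceMatcher` from the Python standard library to score a record against existing profiles; the field names, the 60/40 weighting and the 0.85 threshold are assumptions chosen for illustration, and the score is a similarity measure standing in for a true match probability.

```python
from difflib import SequenceMatcher


def normalize(text):
    # Lowercase, drop simple punctuation and collapse whitespace.
    cleaned = text.lower().replace(",", " ").replace(".", " ")
    return " ".join(cleaned.split())


def similarity(a, b):
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_score(record, profile, name_weight=0.6, address_weight=0.4):
    # Weighted blend of name and address similarity (weights are assumed).
    return (name_weight * similarity(record["name"], profile["name"])
            + address_weight * similarity(record["address"], profile["address"]))


def best_match(record, profiles, threshold=0.85):
    # Return the highest-scoring profile, or None if nothing is close enough.
    scored = [(match_score(record, p), p) for p in profiles]
    score, profile = max(scored, key=lambda pair: pair[0], default=(0.0, None))
    return profile if score >= threshold else None


profiles = [
    {"id": "P1", "name": "Jane Smith", "address": "12 High St."},
    {"id": "P2", "name": "Robert Jones", "address": "99 Ocean Avenue"},
]
incoming = {"name": "jane smith", "address": "12 High St"}
matched = best_match(incoming, profiles)
```

Raising the threshold trades missed matches for fewer false merges, which is the configuration choice described above.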
CDPs also provide the ability to group individual customers into households, which can be used to reduce direct mail spend.
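As a rough illustration of householding, customers can be grouped on a shared key such as surname plus normalized address, so that one mail piece is sent per household rather than per person. The field names and the choice of key below are assumptions; real products typically use more robust address matching.

```python
from collections import defaultdict


def group_households(profiles):
    # Key each profile on surname and a whitespace-normalized address.
    households = defaultdict(list)
    for profile in profiles:
        key = (
            profile["last_name"].strip().lower(),
            " ".join(profile["address"].lower().split()),
        )
        households[key].append(profile["customer_id"])
    return list(households.values())


customers = [
    {"customer_id": "C1", "last_name": "Smith", "address": "12 High St"},
    {"customer_id": "C2", "last_name": "smith", "address": "12  high st"},
    {"customer_id": "C3", "last_name": "Jones", "address": "99 Ocean Avenue"},
]
households = group_households(customers)
mail_pieces_saved = len(customers) - len(households)
```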
CDP Data Sources
The above examples are not the only sources of data that are consumed by a CDP to form a single view of the customer. The CDP is specifically designed to be configured by a retailer to connect multiple, diverse data sources, including structured and unstructured data, batch data and streamed events, through its own APIs and custom connectors.
The following are common data sources that a CDP consumes.
One of the key indicators of a customer's propensity to buy is what they have bought before (and when), so sales transactions are key sources to connect to the CDP. This includes ecommerce sales transactions, add-to-basket and checkout events, in-store purchases, and order returns and exchanges.
Once customers make themselves known to a retailer they reveal or offer up data such as name, billing and shipping addresses, payment types, birthday, gender, interests and any responses to surveys they are sent. Permissions they grant or revoke must also be stored and honored. This data is captured and stored by an App or website and a copy is sent as a batch to the CDP to augment the customer profile.
First party data captured by the retailer can be supplemented with external, third-party data such as demographic data, based either on the individual or their neighborhood, to enrich the data model being assembled by the CDP.
Customers reveal much about themselves as they traverse social networking sites and generate content for others to consume. Customers who like a retailer's Facebook page and who subsequently log in to the retailer website using Facebook Connect are providing valuable social graph data to the business. This can all be consumed by the CDP.
Depending on how the CDP has been configured and the role it will play, it might also need to consume and manage data sets that are not specific to a customer, but that allow the marketing systems consuming its data to operate more effectively. Examples include product data such as brand, price and other attributes, and stock levels by location. This data can be used by the marketing execution system to ensure that an offer being sent to the customer is for a product they do not already own, at a price point they are likely to be attracted to, and in stock for delivery or in the location where they live.
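The offer check described above could look something like the following sketch. All field names (`owned_skus`, `max_price`, `location`) and the shape of the stock lookup are hypothetical; they simply show how product and stock data combine with the customer profile.

```python
def eligible_offers(customer, products, stock_by_location):
    """Return SKUs worth offering: not owned, suitably priced, and in stock."""
    offers = []
    for product in products:
        already_owned = product["sku"] in customer["owned_skus"]
        attractive_price = product["price"] <= customer["max_price"]
        units = stock_by_location.get((product["sku"], customer["location"]), 0)
        if not already_owned and attractive_price and units > 0:
            offers.append(product["sku"])
    return offers


customer = {"owned_skus": {"TV-55"}, "max_price": 300.0, "location": "Columbus"}
products = [
    {"sku": "TV-55", "price": 280.0},      # already owned
    {"sku": "SOUNDBAR", "price": 150.0},   # eligible
    {"sku": "PROJECTOR", "price": 900.0},  # too expensive
    {"sku": "HEADSET", "price": 80.0},     # out of stock locally
]
stock = {
    ("TV-55", "Columbus"): 4,
    ("SOUNDBAR", "Columbus"): 2,
    ("PROJECTOR", "Columbus"): 1,
    ("HEADSET", "Columbus"): 0,
}
offers = eligible_offers(customer, products, stock)
```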
Predictive Metrics
One of the more advanced features of a CDP is the prediction of customer metrics based on the information known about them. The CDP predicts, based on observation, history and machine learning, metrics for each individual customer, such as:
- next purchase probability, date, value and channel
- churn probability by date
- upsell / cross-sell probability, by product category / product
The retailer can intervene based on these predicted metrics, to change the outcome. For example a telco could send an attractive renewal offer to customers predicted to churn, or a retailer could send a discount code limited to a larger basket size to a customer who is predicted to buy, to increase their spend. Here is how these interventions are managed:
- Predictive metrics are generated and maintained automatically by the CDP
- Marketing actions to take (offers, experiences, etc.) are designed by the retailer's marketing and commercial teams
- Marketing communications are enacted by the marketing execution engines, e.g. to store personalized offers, send push notifications and emails
Segmentation, Audiences and Experimentation
The CDP is used to segment the customer profiles it builds. As data is received, a customer's membership of segments can change in real-time. This ensures any marketing system reading data from the CDP is always doing so based on the most up-to-date insight.
The differentiation between the terms segment and audience is subtle and not universally applied. Sometimes segment is used to mean a subset of a broader audience. Sometimes audience refers to a grouping of customers to which marketing communication will be sent whereas segment is any other grouping.
The term audience is widely used among Data Management Platform (DMP) providers. A DMP is a class of software that has some overlap with a CDP, but is designed primarily to identify which customers to target with online campaigns on third-party websites. A DMP consumes mostly third-party data, and does not build persistent customer profiles or support the much broader uses that a CDP does.
The CDP you use will adopt these terms or others, and define them, so get used to whatever terminology it uses.
Rule-based segments are defined through rules designed by the retailer, for example VIP customers aged 18 to 30 living in Ohio. The CDP provides a rule builder to enable these rules to be defined, and the CDP will maintain membership of these segments as information about customers changes.
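A rule builder ultimately produces something like a list of conditions that is re-evaluated whenever a profile changes. Here is a minimal sketch of the Ohio example; the rule format, operator names and profile fields are assumptions, not any vendor's actual schema.

```python
import operator

# Hypothetical operator vocabulary for the rule builder.
OPERATORS = {"eq": operator.eq, "gte": operator.ge, "lte": operator.le}

# "VIP customers aged 18 to 30 living in Ohio" as (field, operator, value) rules.
VIP_YOUNG_OHIO = [
    ("tier", "eq", "VIP"),
    ("age", "gte", 18),
    ("age", "lte", 30),
    ("state", "eq", "OH"),
]


def in_segment(profile, rules):
    # A profile belongs to the segment only if every rule holds.
    return all(
        field in profile and OPERATORS[op](profile[field], value)
        for field, op, value in rules
    )


profile = {"tier": "VIP", "age": 27, "state": "OH"}
member_before_move = in_segment(profile, VIP_YOUNG_OHIO)

# The customer moves to another state, so membership is re-evaluated.
profile["state"] = "MI"
member_after_move = in_segment(profile, VIP_YOUNG_OHIO)
```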
Predictive segments are defined not by the customer's known attributes, but by their expected or predicted behavior. By defining an outcome and a likelihood of that outcome being reached by each customer, the CDP can identify a group of customers that can be targeted. For example, by defining the outcomes "customer cancels subscription" and "customer does not renew subscription", and a likelihood of 80%, the retailer can direct messaging or offers at the customers who meet that threshold to reduce their likelihood of churning, while saving the spend or margin that would be wasted by targeting customers unlikely to churn.
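Assuming the CDP exposes a predicted probability per customer and outcome, selecting a predictive segment reduces to a threshold filter. The outcome keys and the prediction structure below are hypothetical.

```python
def predictive_segment(predictions, outcomes, threshold=0.8):
    """Customers whose predicted probability for any listed outcome meets the threshold."""
    return sorted(
        customer_id
        for customer_id, scores in predictions.items()
        if any(scores.get(outcome, 0.0) >= threshold for outcome in outcomes)
    )


CHURN_OUTCOMES = ("cancels_subscription", "does_not_renew")

predictions = {
    "C1": {"cancels_subscription": 0.91, "does_not_renew": 0.40},
    "C2": {"cancels_subscription": 0.10, "does_not_renew": 0.15},
    "C3": {"cancels_subscription": 0.55, "does_not_renew": 0.80},
}
at_risk = predictive_segment(predictions, CHURN_OUTCOMES)
```

Only the customers returned here would receive the retention offer, so the low-risk customer C2 is not discounted unnecessarily.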
Static segmentation is also normally supported, where a query is run in the CDP at a specific point in time and an audience of customers is extracted for a specific communication or campaign. Customers whose data has been lost in a data breach would be an extreme example of a static segment.
Event-triggered segmentation is evaluated in response to a specific event. For example, a geofencing segment that identifies customers near a specific store might be evaluated the first time a customer visits the website and no more frequently than daily thereafter. Once a customer is in this segment, they remain in it until the next event triggers a re-evaluation of whether they should remain in it or be removed.
Scheduled or recurring segmentation is evaluated on the basis of a fixed schedule, such as daily. Each day the CDP can be set to evaluate which customers have a birthday today, or identify customers who purchased a consumable supply a month ago but have not re-ordered a replenishment since. Another common segment calculated on a scheduled basis is recency, frequency, monetary (RFM) segmentation, where the most recent purchase, the number of purchases in a defined period and total spend in that period are all calculated for every customer and customers can be segmented accordingly.
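The RFM calculation described above is simple enough to sketch directly. This version assumes purchases arrive as `(customer_id, purchase_date, amount)` tuples, which is an illustrative format rather than a real CDP export.

```python
from datetime import date


def rfm_metrics(purchases, period_start, today):
    """Compute recency (days), frequency (count) and monetary (total) per customer."""
    metrics = {}
    for customer_id, purchase_date, amount in purchases:
        if not (period_start <= purchase_date <= today):
            continue  # outside the defined period
        m = metrics.setdefault(
            customer_id, {"recency_days": None, "frequency": 0, "monetary": 0.0}
        )
        days_ago = (today - purchase_date).days
        if m["recency_days"] is None or days_ago < m["recency_days"]:
            m["recency_days"] = days_ago
        m["frequency"] += 1
        m["monetary"] += amount
    return metrics


purchases = [
    ("C1", date(2024, 1, 10), 50.0),
    ("C1", date(2024, 3, 1), 25.0),
    ("C2", date(2024, 2, 15), 200.0),
    ("C2", date(2023, 11, 5), 75.0),  # before the period, ignored
]
rfm = rfm_metrics(purchases, period_start=date(2024, 1, 1), today=date(2024, 3, 31))
```

A scheduled job would run this once per day and then bucket customers by thresholds or quantiles on the three values.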
Experimentation, in the form of A/B testing, split testing, multi-variate testing, or similar techniques, can be a feature within the CDP, or this activity can be delegated to a specific A/B testing solution. Some vendors have both CDP and A/B testing software in their portfolio, e.g. Acquia and Bloomreach.
The idea is to test hypotheses made by the marketing team, to assess whether segment definitions and marketing activities are producing the expected results, or whether changes in definition and execution will produce a greater ROI. By testing a marketing action on a small group from the segment that has been defined, the results can be assessed and only successful activities launched for all customers in the segment.
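Testing on a small group first requires a stable way to decide who is in that group. One common approach, sketched here under assumed names, is to hash the customer ID together with the experiment name so that the same customer always lands in the same group; the 10% test fraction is an arbitrary example.

```python
import hashlib


def assign_variant(customer_id, experiment, test_fraction=0.1):
    """Deterministically place a customer in the small test group or the holdout."""
    digest = hashlib.sha256(f"{experiment}:{customer_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # uniform value in [0, 1)
    return "test" if bucket < test_fraction else "holdout"


def conversion_rate(outcomes):
    """Share of customers who converted, given a list of 0/1 outcomes."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0


segment = [f"customer-{i}" for i in range(10000)]
test_group = [c for c in segment if assign_variant(c, "spring-offer") == "test"]
```

Only if the test group's conversion rate beats the baseline would the action be launched for the rest of the segment. A production experiment would also check statistical significance, which this sketch omits.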
Use of CDP for Loyalty Programs
A CDP can be used as a key component of a customer loyalty program, since it has access to purchase, contact and permission information. It can be used to maintain rules on eligibility and membership and calculate tier levels. It can also be used to assign point values for other events it is aware of, such as mobile App registration or newsletter sign-up.
The CDP can also be used to motivate customers to ascend tiers, by segmenting customers who are close and offering incentives for actions they can take to reach the next level in the program. By automating these types of marketing activities, a retailer can make their program both more successful and cheaper to run.
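Tier calculation and the "close to the next tier" segment can be sketched as follows. The tier names, point thresholds and the 100-point window are invented for illustration only.

```python
# Hypothetical tiers, ordered by the points needed to reach them.
TIERS = [("Bronze", 0), ("Silver", 500), ("Gold", 1500)]


def current_tier(points):
    name = TIERS[0][0]
    for tier_name, threshold in TIERS:
        if points >= threshold:
            name = tier_name
    return name


def points_to_next_tier(points):
    # Distance to the first tier not yet reached, or None at the top tier.
    for _, threshold in TIERS:
        if points < threshold:
            return threshold - points
    return None


def close_to_upgrade(balances, within=100):
    """Customers within `within` points of the next tier."""
    segment = []
    for customer_id, points in balances.items():
        gap = points_to_next_tier(points)
        if gap is not None and gap <= within:
            segment.append(customer_id)
    return sorted(segment)


balances = {"C1": 450, "C2": 100, "C3": 1450, "C4": 2000}
nudge_segment = close_to_upgrade(balances)
```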
CDP Build Versus Buy
The CDP is a relatively new class of software application, and many consumer-facing businesses have already built software that does much of what a commercial off-the-shelf (COTS) product would do. The business case to rip out this bespoke software and replace it with a commercial CDP solution may be marginal, depending on the sophistication already in place or the issues the current solution has.
For businesses without a CDP and no significant CDP-like functionality in their existing systems, the question arises whether to build a CDP or to buy a package. This evaluation is similar to any other build vs buy analysis. The optimum route will depend on factors such as the skills of the IT team, the investment required, time-to-value, strategic importance of this capability and the retailer's view on the strengths of commercial packages available and uniqueness of their business model.
CDP and Web Analytics
CDP offers a view of individual customers and collates these into segments (or audiences) that have particular characteristics or intents. This segment information can be pushed into the retailer's web analytics tool (e.g. Google Analytics) in order to report on web traffic at the segment level. The CDP is far superior to the Web Analytics tool as it can consume and consider far more sources of data. Google Analytics has since 2019 allowed mobile App usage to be combined with web traffic for reporting purposes, but a CDP can consume far more sources and has much better identity resolution tools to give more complete customer profile data and hence derive more accurate, more useful, segments.
The other major difference between web analytics and CDP is that the CDP is designed with open interfaces that allow other systems to access customer profile data, whereas a web analytics tool's output is reports and visualization of aggregated insights. Marketing execution engines such as product recommendations, email marketing or ad serving technologies can more easily be integrated with a CDP, to make their marketing efforts more effective.
CDP Vendors
CDP is a relatively new category of software. Vendors are still emerging, whether bootstrapped or funded by venture or corporate capital, and momentum in the market is shifting rapidly. The following is a list of Customer Data Platform software vendors at various scales and levels of maturity: Adobe Experience Platform, Acquia AgilOne, Bloomreach Exponea, BlueConic, Lytics, mParticle, Salesforce Interaction Studio (Evergage), SAP (Emarsys and Gigya), SALESmanago, SessionM, Simon Data, Tealium AudienceStream & EventStream (part of Universal Data Hub), Twilio Segment, Treasure Data.