Data as a product has become one of the most popular phrases today as data teams become more integrated with overall product strategy. But what do we really mean when we say that? In today’s post, I’ll break down the different types of data as a product and some nuances for data PMs and data teams thinking about them.
Data platform as a product
Your team provides a data platform to help your organization handle data better. In essence, your job is to build a two-sided marketplace that incentivizes data producers to provide reliable, high-quality data and connects that data with data consumers in ways that help them get the most value from it. Traditional consumer use cases include dashboards, business intelligence, and general SQL-based analytics. More advanced use cases apply machine learning, statistics, and big data processing to mix offline (analytic) and online (customer-facing) use cases.
Building this product is not so different from building a ride-sharing product. You have to identify and engage your suppliers (drivers/data producers) so that they will populate your ecosystem with the product (rides/data). You then connect that product with your customers (riders/data consumers) in the various ways that help them get the value they need.
When building a data platform as a product, you are looking to incentivize multiple parties in a two-sided marketplace to proliferate the existence and use of quality data for various use cases.
To do this, you build out the ecosystem, inviting and incentivizing data producers and consumers to participate. You need to work with producers to figure out where their data sources live, how to help them easily get their data into the core platform, and how to transform that data into something that helps data consumers. You connect the dots with the data consumers by understanding how they need to access the data for their various use cases (reporting, data exploration, ML, etc.), and you enable those access patterns accordingly.
As your ecosystem grows, you need to carefully introduce governance and standardization to ensure that the platform is compliant and productive for all. You add in extra policies to protect PII and incorporate things like data observability to ensure that suppliers provide quality data for consumers.
You’ll measure the success of your marketplace with both input and output metrics. On the input side, you should see a growing number of data assets being created, and created with a high level of quality. On the output side, if your platform connects well with data consumers, you should see a proliferation of queries and downstream applications that deliver the value of your data. Savvy platform teams can also tie the data to the number of customers directly impacted by the platform or the amount of revenue associated with it.
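As a rough illustration of the input and output metrics described above, here is a minimal Python sketch. The `DataAsset` fields, the 30-day window, and the metric names are all assumptions made for this example, not a standard; a real platform would pull these numbers from its catalog and query logs.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class DataAsset:
    """One table or dataset on the platform (hypothetical fields)."""
    name: str
    passed_quality_checks: bool  # e.g. freshness and schema checks
    queries_last_30d: int        # how often consumers queried the asset
    downstream_apps: int         # dashboards, models, or services built on it


def marketplace_metrics(assets: list[DataAsset]) -> dict[str, float]:
    """Summarize supply-side (input) and demand-side (output) health."""
    total = len(assets)
    if total == 0:
        return {"assets": 0, "quality_rate": 0.0, "adoption_rate": 0.0,
                "total_queries": 0, "downstream_apps": 0}
    passing = sum(1 for a in assets if a.passed_quality_checks)
    used = sum(1 for a in assets if a.queries_last_30d > 0)
    return {
        # Input side: how much data exists, and how much of it is reliable.
        "assets": total,
        "quality_rate": passing / total,
        # Output side: how much of that data consumers actually use.
        "adoption_rate": used / total,
        "total_queries": sum(a.queries_last_30d for a in assets),
        "downstream_apps": sum(a.downstream_apps for a in assets),
    }


assets = [
    DataAsset("orders", True, 120, 3),
    DataAsset("signups", True, 0, 0),
    DataAsset("legacy_events", False, 15, 1),
    DataAsset("payments", True, 45, 2),
]
metrics = marketplace_metrics(assets)
print(metrics)
```

Tracking the quality rate next to the adoption rate keeps both sides of the marketplace in view: a growing catalog that nobody queries, or heavy usage of assets that fail their checks, each points to a different problem.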
Data insights as a product
Another way data becomes a product is through insights, such as reporting and analytics. The data itself provides non-obvious insights and helps decision makers navigate a fast-moving environment. As the PM for a data insights product, your job is to identify how data can be a reliable compass for decision makers.
This type of data product has traditionally manifested as the Data Warehouse. Teams of data or analytics engineers build a series of tables to represent the business in data format and then work with analysts to deliver reports and insights to stakeholders.
Delivering insights is more than just building tables in a data warehouse, however. In building this product, you have to fully understand how the business works and how data can help identify and measure improvements. Just like in classic product management, you need to work with your customers to understand how they use data in their daily work and how data influences their jobs to be done.
To make data insights successful as a product, you need to understand the data fluency of your end users and how they consume data to run their business. Your analytics approach will vary based on your customers in order to optimize the impact of data insights for the business.
In many cases, your stakeholders might be asking for faster dashboards and more one-off insights, but in reality, what they need is a more structured approach for using data to run the business. Just like how people were asking for faster horses and didn’t realize they needed cars, stakeholders today don’t fully understand how they can use data to make better decisions.
For example, most businesses use data to measure their user acquisition funnel. Your team might create several tables to identify the funnel steps and help your business partners identify areas of improvement where they see increased drop-off. But there are likely more insights to uncover if you can efficiently connect the attributes and actions of unregistered users with the funnel metrics.
By understanding what your business partners are trying to do (proactively identify areas of drop-off and look for levers for growth), your data insights product can go beyond providing a simple funnel report and deliver more nuanced insights about which types of users drop off, where more instrumentation could reveal additional steps in a funnel, or how to predict drop-off before it happens so that the team can update the product flow itself.
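To make the funnel example concrete, here is a small Python sketch that computes drop-off between funnel steps separately for each user segment. The step names, the segments, and the event tuple shape are hypothetical; in practice this would usually be a query against your warehouse tables.

```python
from collections import defaultdict

# Hypothetical funnel definition, in order.
FUNNEL_STEPS = ["visit", "signup_start", "signup_complete", "first_purchase"]


def funnel_by_segment(events):
    """Count distinct users logged at each step, per segment.

    `events` is an iterable of (user_id, segment, step) tuples.
    A step name outside FUNNEL_STEPS raises ValueError.
    """
    reached = defaultdict(lambda: [set() for _ in FUNNEL_STEPS])
    for user_id, segment, step in events:
        reached[segment][FUNNEL_STEPS.index(step)].add(user_id)
    return {seg: [len(users) for users in steps]
            for seg, steps in reached.items()}


def drop_off_rates(counts):
    """Share of users lost between each pair of consecutive steps."""
    rates = []
    for prev, curr in zip(counts, counts[1:]):
        rates.append(0.0 if prev == 0 else 1 - curr / prev)
    return rates


events = [
    ("u1", "organic", "visit"), ("u1", "organic", "signup_start"),
    ("u1", "organic", "signup_complete"),
    ("u2", "organic", "visit"), ("u2", "organic", "signup_start"),
    ("u3", "organic", "visit"), ("u4", "organic", "visit"),
    ("p1", "paid", "visit"), ("p2", "paid", "visit"),
    ("p1", "paid", "signup_start"), ("p1", "paid", "signup_complete"),
    ("p1", "paid", "first_purchase"),
]

counts = funnel_by_segment(events)
for segment, step_counts in counts.items():
    print(segment, step_counts, drop_off_rates(step_counts))
# organic [4, 2, 1, 0] [0.5, 0.5, 1.0]
# paid [2, 1, 1, 1] [0.5, 0.0, 0.0]
```

In this toy data both segments lose half their users after the first visit, but only the organic segment keeps losing users afterward. A single blended funnel report would hide that difference.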
Going beyond what your stakeholders are asking and more deeply understanding the problems they’re trying to solve will help you develop a more robust and impactful data product.
Data activation as a product
A third way that data becomes a product is by powering the product experience. Common examples include personalization, feed algorithms, fraud detection, and user-facing analytics. We will call this ‘data activation’ because it activates the data for a user-facing application.
When the data platform is connected to production environments (usually via reverse ETL), your team can activate the data aggregations or machine learning model outputs and use them directly in the user-facing product. For example, teams often use data to identify cohorts of users who are good candidates for an upsell based on their attributes and past behaviors. In order to accurately identify these cohorts, you need a solid data model to pull from. Enabling cohorts and audiences from your data platform turns your data warehouse into a customer data platform (CDP) and instantly makes data more valuable to the product’s growth story.
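As a sketch of what identifying such a cohort could look like, the Python below selects upsell candidates with simple rules. The `UserProfile` fields, the plan names, and the thresholds are all assumptions for illustration; a real cohort might come from a model score instead, and the resulting list is the kind of output a reverse ETL job would sync to a production system.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class UserProfile:
    """Attributes and behavior drawn from the data model (hypothetical)."""
    user_id: str
    plan: str                 # e.g. "team" or "enterprise"
    seats_used: int
    seat_limit: int
    active_days_last_30: int


def upsell_cohort(users: list[UserProfile],
                  top_plan: str = "enterprise",
                  min_seat_utilization: float = 0.8,
                  min_active_days: int = 15) -> list[str]:
    """Return IDs of engaged users who are close to their plan's limits."""
    cohort = []
    for u in users:
        if u.plan == top_plan or u.seat_limit <= 0:
            continue  # nothing to upsell, or no usable limit to compare with
        utilization = u.seats_used / u.seat_limit
        if (utilization >= min_seat_utilization
                and u.active_days_last_30 >= min_active_days):
            cohort.append(u.user_id)
    return cohort


users = [
    UserProfile("u1", "team", 9, 10, 20),          # near limit and active
    UserProfile("u2", "team", 3, 10, 25),          # active, far from limit
    UserProfile("u3", "enterprise", 95, 100, 28),  # already on the top plan
    UserProfile("u4", "team", 10, 10, 5),          # at limit, rarely active
]
cohort = upsell_cohort(users)
print(cohort)  # ['u1']
```

The rules are only as good as the attributes behind them, which is why the solid data model comes first.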
A great example of data activation is LinkedIn’s Economic Graph. The LinkedIn team was able to create a data asset representing the global economy by aggregating and deriving insights from all the data on their platform. This data asset helps LinkedIn provide better recommendations on who to connect with on the platform, helps recruiters more effectively find candidates, and has created a unique brand.
Data activation connects the aggregations and insights from your data platform to the end user of your product. When done well, this type of data product can create immense value for your organization, similar to Facebook’s Social Graph or LinkedIn’s Economic Graph.
Data PMs in this area need to have a good understanding of the end user and how data can influence their experience, both positively and negatively. Successful PMs for this type of data product understand the nuances of human psychology and how nudges from data (such as personalization) can affect the overall product experience. They work closely with other user-facing teams to identify surfaces of the product where data can greatly improve the user experience. The goal could be to generate deeper engagement, to upsell, or simply to improve user sentiment.
A PM building this type of data product also needs to be acutely aware of unintended secondary effects, particularly bias in the data. When using data at scale, especially with techniques like ML, the outcomes can create pathways that aren’t necessarily healthy for users or that discriminate against certain groups of people. A few examples include Amazon’s hiring algorithm that favored men for technical jobs and racial bias in healthcare. Data PMs who work on these experiences must be extra vigilant to counteract unintended bias in their data.
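One simple way to start watching for this kind of bias is to compare how often each group receives a given outcome. The Python sketch below computes per-group selection rates and the ratio between the lowest and highest rate. The group labels, the record shape, and the 0.8 alert threshold are illustrative assumptions, and a check like this is a first signal to investigate, not a complete fairness audit.

```python
from collections import Counter


def selection_rates(records):
    """Share of each group that received the outcome.

    `records` is an iterable of (group, was_selected) pairs, for example
    whether a user in a given group was shown a recommendation.
    """
    totals, selected = Counter(), Counter()
    for group, was_selected in records:
        totals[group] += 1
        if was_selected:
            selected[group] += 1
    return {group: selected[group] / totals[group] for group in totals}


def disparity_ratio(rates):
    """Lowest group rate divided by the highest; 1.0 means equal rates."""
    highest = max(rates.values())
    return 1.0 if highest == 0 else min(rates.values()) / highest


# Toy data: 6 of 10 selected in group A, 3 of 10 in group B.
records = ([("A", True)] * 6 + [("A", False)] * 4
           + [("B", True)] * 3 + [("B", False)] * 7)

rates = selection_rates(records)
ratio = disparity_ratio(rates)
print(rates, ratio)  # {'A': 0.6, 'B': 0.3} 0.5

ALERT_THRESHOLD = 0.8  # assumed value for this example
if ratio < ALERT_THRESHOLD:
    print("Selection rates differ noticeably across groups; investigate.")
```

Running a check like this on model outputs before they reach users gives the team a chance to catch skewed outcomes early.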
When building for data activation, PMs must connect data with opportunities to improve the end-user experience and work with other teams to enable those opportunities. However, they must be vigilant when using data to personalize the product or impact decision making as unintended biases can create discriminatory effects.
This type of data product is one of the newest, most complex, and most impactful. It’s up to data teams to identify these types of opportunities and connect with other product teams to deliver this kind of value.
Recap
So, data as a product is actually quite complex with many different variations!
A data platform PM focuses on building a two-sided marketplace that provides governance and high-quality data for the organization.
A data insights PM works closely with decision makers to understand how to run the business and identifies new areas where data can provide deeper insights.
A data activation PM works with other product teams to identify areas where data can influence the end-user experience and establishes data as a competency that other organizations can’t easily imitate.
In many ways, describing a data PM generically does a major disservice to the types of PMs out there. Each of these 3 data products has different customers, problems to solve, and unique nuances of what makes them successful. Over time, we will only see the field of data PMs become more robust as data becomes its own competency across companies.
This was possibly the first framing on this that I enjoyed, thank you!
I thought possibly that it might be that you are describing 3 capabilities or features of the same "Data Product"?
It seems that from left to right, the products describe features suitable to various stages of maturity in the "system improvement delivery process" of a data team?