A Guide to Data Contracts with Andrew Jones

Leeroy Hjorten
January 29, 2025

Data contracts are essential for effective data management, defining how organizations use and understand their data. As data volumes and complexity grow, implementing data contracts becomes crucial for maintaining data quality and fostering collaboration between data producers and consumers. This post explores key concepts, challenges, and best practices in data contract implementation, drawing insights from Andrew Jones, an independent consultant and author of the book “Driving Data Quality with Data Contracts”.

This post is part of INNER JOIN, a live show hosted by Select Star. INNER JOIN brings together thought leaders and experts to discuss the latest trends in data governance and analytics. For more details, visit Select Star's LinkedIn page.

What Are Data Contracts?

Data contracts are human-readable and machine-readable documents that codify agreements between data producers and consumers. These contracts typically include schema definitions, ownership details, and service level objectives (SLOs). By establishing clear expectations and guidelines, data contracts help organizations improve data reliability, prevent downstream disruptions, and streamline data governance processes.
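To make "human-readable and machine-readable" concrete, here is a minimal sketch of a contract expressed as Python dataclasses and serialized to JSON. The class names, field names, owner address, and SLO keys are all hypothetical and chosen only for illustration; they are not taken from Andrew Jones's book or from any particular data contract standard.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class ContractField:
    """One column or attribute in the contracted dataset (hypothetical shape)."""
    name: str
    type: str
    required: bool = True


@dataclass
class DataContract:
    """A minimal contract: schema, ownership, SLOs, and a version."""
    name: str
    version: str
    owner: str
    fields: list[ContractField] = field(default_factory=list)
    slos: dict[str, float] = field(default_factory=dict)

    def to_json(self) -> str:
        # asdict() recursively converts nested dataclasses into plain dicts.
        return json.dumps(asdict(self), indent=2)


# Illustrative values only; none of these come from a real system.
contract = DataContract(
    name="payments",
    version="1.0.0",
    owner="payments-team@example.com",
    fields=[
        ContractField("payment_id", "string"),
        ContractField("amount_minor", "integer"),
        ContractField("note", "string", required=False),
    ],
    slos={"max_staleness_hours": 24, "min_completeness": 0.99},
)

print(contract.to_json())
```

The same JSON document can be reviewed by people in a pull request and parsed by tooling, which is the dual role the definition above describes.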

The concept of data contracts emerged from the need to address challenges in data quality and reliability, particularly when building revenue-generating products based on data. Andrew Jones, during his time at GoCardless, recognized that upstream changes often occurred without warning, causing issues for downstream processes. This realization led to the development of data contracts as a solution to improve data reliability and quality.

Data contracts can enable a data-driven organization by building in data governance from the start (Source: Andrew Jones)

Why Do We Need Data Contracts?

Data contracts address several critical needs in modern data management. They improve data reliability and quality by setting clear standards and expectations for data producers and consumers. This proactive approach helps prevent downstream disruptions caused by unexpected schema changes, so data keeps flowing smoothly through the organization's systems. Data contracts also serve as a communication tool: by clearly defining data requirements and responsibilities, they foster better understanding and collaboration between teams.

Data contracts also streamline data governance and compliance by providing a structured framework for managing data assets and meeting regulatory requirements. The result is a more stable and predictable data environment, with fewer data-related issues and higher overall data quality. That, in turn, lets businesses make better-informed decisions, operate more efficiently, and derive more value from their data assets.

Key Components of Data Contracts

Effective data contracts typically encompass several key components that form the foundation for robust data management and governance. These components include schema definitions, ownership details, service level objectives, and version information.

  • Schema definitions: Detailed descriptions of data structures, including field names, data types, and relationships between different data elements. These definitions provide a clear blueprint for how data should be organized and structured, ensuring consistency across various systems and applications.
  • Ownership details: Clear identification of the teams or individuals responsible for maintaining and updating specific data assets. This assignment of ownership promotes accountability and helps streamline communication when issues arise or changes need to be made.
  • Service level objectives (SLOs): Agreed-upon metrics for data quality, availability, and performance. SLOs set clear expectations for how data should behave, including factors like update frequency, acceptable error rates, and response times for queries.
  • Version information: Documentation of changes and updates to the data contract over time. This historical record helps track the evolution of data structures and agreements, facilitating easier troubleshooting and ensuring all stakeholders are aware of current and past configurations.

These components work together to provide a comprehensive framework for managing data assets, ensuring that all stakeholders have a shared understanding of data structures, responsibilities, and expectations. By clearly defining these aspects, organizations can reduce misunderstandings, improve data quality, and foster better collaboration between data producers and consumers. Additionally, well-structured data contracts can serve as valuable reference points for onboarding new team members, conducting audits, and making informed decisions about future data initiatives.
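As a rough illustration of how schema definitions and SLOs can be checked by a machine, the sketch below validates a record against a small schema and tests a freshness objective. The schema, the 24-hour staleness limit, and the function names are assumptions made for this example, not part of any specific tool.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical schema: field name -> (expected Python type, required?)
SCHEMA = {
    "payment_id": (str, True),
    "amount_minor": (int, True),
    "note": (str, False),
}

# Hypothetical freshness SLO: data must be no older than 24 hours.
MAX_STALENESS = timedelta(hours=24)


def validate_record(record: dict) -> list[str]:
    """Return a list of contract violations for one record (empty if valid)."""
    errors = []
    for name, (expected_type, required) in SCHEMA.items():
        value = record.get(name)
        if value is None:
            if required:
                errors.append(f"missing required field: {name}")
            continue
        if not isinstance(value, expected_type):
            errors.append(
                f"{name}: expected {expected_type.__name__}, "
                f"got {type(value).__name__}"
            )
    # Fields the contract does not know about are reported as well.
    for name in sorted(record.keys() - SCHEMA.keys()):
        errors.append(f"unexpected field: {name}")
    return errors


def is_fresh(last_updated: datetime, now: datetime) -> bool:
    """Check the freshness SLO against an explicit 'now' for testability."""
    return now - last_updated <= MAX_STALENESS


now = datetime(2025, 1, 29, 12, 0, tzinfo=timezone.utc)
print(validate_record({"payment_id": "p1", "amount_minor": "100"}))
print(is_fresh(now - timedelta(hours=30), now))
```

A check like this could run in the producer's pipeline before data is published, so violations surface at the source rather than in downstream systems.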

6 Steps to Implement Data Contracts

Implementing data contracts requires a strategic approach. Here are key steps to consider:

  1. Begin with critical datasets: Focus on the most important and frequently used data assets
  2. Engage stakeholders early: Involve data producers, consumers, and other relevant parties in the process
  3. Align with business goals: Ensure data contracts support overall organizational objectives
  4. Design for adoption by data producers: Make the process of creating and maintaining contracts as seamless as possible
  5. Define ownership and responsibilities: Clearly outline who is responsible for each aspect of the data contract
  6. Establish change management processes: Develop procedures for updating and evolving data contracts over time

By following these steps, organizations can create a solid foundation for successfully implementing data contracts and realizing their benefits.

Data contracts can drive what you do with your data, and implementing change management is critical for the contracts to evolve as needed (Source: Andrew Jones)

Overcoming Misconceptions and Challenges in Data Contracts Implementation

Implementing data contracts comes with its own set of challenges. Here we discuss common obstacles and approaches to overcome them.

Misconception: Enforcing contracts is key

Many organizations initially focus on strict enforcement of data contracts, which can create resistance and hinder adoption. Instead, the emphasis should be on collaboration and communication between data producers and consumers. By fostering a culture of shared responsibility and mutual understanding, teams can work together more effectively to maintain data quality and reliability.

Challenge: Balancing flexibility with stability

Data contracts need to strike a balance between providing consistent, reliable data structures and allowing for necessary changes and updates. To address this challenge, organizations can integrate data contracts into existing workflows and automate processes where possible. This approach helps maintain stability while allowing for controlled evolution of data structures and definitions.
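One way to automate this balance is a check, run in a workflow teams already use such as CI, that compares the proposed schema with the current one and flags breaking changes. The sketch below assumes a schema is a plain mapping of field name to type name, treats removed fields and type changes as breaking, and treats added fields as safe. That policy and the version-bump rule are illustrative assumptions, not a standard.

```python
def find_breaking_changes(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """List changes that would break existing consumers of the old schema."""
    problems = []
    for name, old_type in old.items():
        if name not in new:
            problems.append(f"removed field: {name}")
        elif new[name] != old_type:
            problems.append(f"type change on {name}: {old_type} -> {new[name]}")
    return problems


def added_fields(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """Fields present only in the new schema (assumed non-breaking here)."""
    return sorted(new.keys() - old.keys())


def suggest_version_bump(old: dict[str, str], new: dict[str, str]) -> str:
    """Hypothetical policy: breaking -> major, additive -> minor, else patch."""
    if find_breaking_changes(old, new):
        return "major"
    if added_fields(old, new):
        return "minor"
    return "patch"


# Illustrative schemas only.
current = {"payment_id": "string", "amount_minor": "integer"}
proposed = {"payment_id": "string", "amount_minor": "string", "currency": "string"}

print(find_breaking_changes(current, proposed))
print(suggest_version_bump(current, proposed))
```

A pipeline could fail the build when `find_breaking_changes` returns anything, prompting the producer to talk to consumers or publish a new major version instead of changing the data silently.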

Challenge: Managing expectations across teams

Different teams may have varying needs and expectations when it comes to data. To overcome this challenge, organizations should regularly review and update data contracts, ensuring they remain relevant and valuable to all stakeholders. This process should involve open communication and collaboration between data producers and consumers.

Challenge: Cultural shifts in data governance

Implementing data contracts often requires a shift in how organizations approach data management and governance. To facilitate this change, it's important to emphasize communication over rigid contract adherence and create incentives for teams to participate in the data contract process. By highlighting the benefits of improved data quality and reliability, organizations can encourage buy-in from all levels of the company.

Case Study: GoCardless' Journey with Data Contracts

GoCardless, a payments company, provides valuable insights into the implementation of data contracts. As the first data engineer at GoCardless, Andrew faced the challenge of building a data platform from scratch. The company's objective to use data for revenue-generating products through machine learning models highlighted the need for reliable and high-quality data.

Initially, GoCardless struggled with data reliability issues, particularly with schema changes that occurred without warning. These changes often disrupted downstream processes, impacting the company's ability to build robust data products.

To address these challenges, GoCardless implemented data contracts. This approach involved:

  1. Defining clear agreements between data producers and consumers
  2. Establishing schema definitions and ownership details
  3. Setting service level objectives (SLOs) for data quality and reliability
  4. Implementing version control for data contracts

The results of implementing data contracts at GoCardless were significant. The company saw improvements in data reliability, reduced incidents of breaking changes, and enhanced collaboration between teams. Data contracts provided a framework for managing expectations and ensuring consistent data quality across the organization.

Future of Data Contracts

The future of data contracts looks promising with several emerging trends and developments. Efforts to create open standards for data contracts are underway, which could simplify integration and adoption across various systems and organizations. Data contracts are expected to become more closely integrated with other data management tools, offering a comprehensive approach to data governance. As technology progresses, we may witness increased use of artificial intelligence and machine learning in creating, maintaining, and enforcing data contracts. Additionally, as the advantages of data contracts become more widely recognized, their adoption is likely to expand beyond tech-focused companies to a broader range of industries.

Data contracts continue to evolve with the changing needs of modern data management. As organizations strive to derive more value from their data assets, implementing contracts effectively will remain crucial, with a continued focus on balancing performance, flexibility, and governance.

By embracing data contracts, organizations can create a more reliable, efficient, and collaborative data ecosystem. As the field continues to mature, data contracts will play an increasingly important role in helping businesses harness the full potential of their data assets.
