Snowflake Data Lineage Guide: From Metadata to Data Governance

An Nguyen, Marketing & Operations
April 8, 2025

Data lineage tracks how data flows from source to target objects, revealing relationships from Snowflake objects through to your BI dashboards. As regulatory demands increase and data ecosystems grow more complex, organizations require robust and automated data lineage capabilities to keep up with scale and compliance. 

Lineage provides the map for navigating that complexity: it helps teams maintain transparency and compliance and troubleshoot issues efficiently. At Select Star, we've spent years developing best-in-class data lineage, and we've created this guide to share how to implement effective data lineage in Snowflake, covering essential concepts, setup procedures, and proven best practices.

Understanding Data Lineage in Snowflake

Data lineage features provide critical visibility into data flows throughout your organization. At its core, data lineage tracks how information moves from source to target objects, revealing relationships between Snowflake assets. 

Snowflake offers both table-level and column-level lineage tracking. Table-level shows relationships between entire tables and views, while column-level provides more granular insights into how individual fields are related. 

There are three ways to tap into Snowflake’s existing data lineage tracking capabilities:

  1. Directly query in Snowflake
  2. Lineage API
  3. Snowsight Lineage (Snowflake Lineage UI)

1. Directly query in Snowflake

The OBJECT_DEPENDENCIES view tracks basic relationships between objects. It is limited, however, to object-level dependencies such as those between tables and views, and it lacks column-level granularity.

SELECT * 
FROM SNOWFLAKE.ACCOUNT_USAGE.OBJECT_DEPENDENCIES 
WHERE REFERENCED_OBJECT_NAME = 'CUSTOMER_TABLE';
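
The query above returns only the objects that reference CUSTOMER_TABLE directly. To see everything further downstream, the dependency rows can be walked transitively. Below is a minimal Python sketch of that idea, assuming the view has been fetched without the WHERE filter into a list of dicts keyed by the REFERENCED_OBJECT_NAME and REFERENCING_OBJECT_NAME columns. The sample object names are hypothetical, and matching on bare names ignores database and schema, which a real implementation would need to include.

```python
from collections import defaultdict, deque


def downstream_objects(rows, root):
    """Return every object that directly or indirectly depends on `root`."""
    # Map each referenced object to the objects that reference it.
    children = defaultdict(set)
    for row in rows:
        children[row["REFERENCED_OBJECT_NAME"]].add(row["REFERENCING_OBJECT_NAME"])

    # Breadth-first walk; `seen` also protects against circular references.
    seen = set()
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for child in sorted(children[current]):
            if child not in seen and child != root:
                seen.add(child)
                queue.append(child)
    return sorted(seen)


# Hypothetical rows, shaped like the result of the query above.
sample_rows = [
    {"REFERENCED_OBJECT_NAME": "CUSTOMER_TABLE", "REFERENCING_OBJECT_NAME": "CUSTOMER_VIEW"},
    {"REFERENCED_OBJECT_NAME": "CUSTOMER_VIEW", "REFERENCING_OBJECT_NAME": "CUSTOMER_SUMMARY"},
    {"REFERENCED_OBJECT_NAME": "ORDERS_TABLE", "REFERENCING_OBJECT_NAME": "ORDERS_VIEW"},
]

print(downstream_objects(sample_rows, "CUSTOMER_TABLE"))
```

In this sample, CUSTOMER_SUMMARY is reported as downstream of CUSTOMER_TABLE even though it only references CUSTOMER_VIEW, which is the kind of indirect impact a single filtered query would miss.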

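The ACCESS_HISTORY view provides column-level lineage between Snowflake objects. Its OBJECTS_MODIFIED column records, for each column that was written, the source columns it was derived from.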

select
  directSources.value: "objectId" as source_object_id,
  directSources.value: "objectName" as source_object_name,
  directSources.value: "columnName" as source_column_name,
  'DIRECT' as source_column_type,
  om.value: "objectName" as target_object_name,
  columns_modified.value: "columnName" as target_column_name
from
  (
    select
      *
    from
      snowflake.account_usage.access_history
  ) t,
  lateral flatten(input => t.OBJECTS_MODIFIED) om,
  lateral flatten(input => om.value: "columns", outer => true) columns_modified,
  lateral flatten(
    input => columns_modified.value: "directSources",
    outer => true
  ) directSources

An example query to get the mapping between directSources field and a column (Source: Snowflake)
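
To make the nesting that the query flattens easier to see, here is a small Python sketch that performs the same three-level unnesting on a single OBJECTS_MODIFIED value. It assumes the value has already been parsed from JSON into Python lists and dicts. The sample payload is hypothetical and trimmed to the fields used here, and the sketch omits the object ID and source type columns that the query also returns.

```python
def column_edges(objects_modified):
    """Flatten one OBJECTS_MODIFIED value into source/target column pairs."""
    edges = []
    for obj in objects_modified:
        # `or [None]` mimics `outer => true`: keep a row even when the array is empty.
        for column in obj.get("columns") or [None]:
            sources = (column or {}).get("directSources") or [None]
            for source in sources:
                edges.append({
                    "source_object_name": source and source.get("objectName"),
                    "source_column_name": source and source.get("columnName"),
                    "target_object_name": obj.get("objectName"),
                    "target_column_name": column and column.get("columnName"),
                })
    return edges


# Hypothetical payload: one modified table with two written columns.
sample = [
    {
        "objectName": "DB.SCHEMA.CUSTOMER_SUMMARY",
        "columns": [
            {
                "columnName": "FULL_NAME",
                "directSources": [
                    {"objectName": "DB.SCHEMA.CUSTOMER_TABLE", "columnName": "FIRST_NAME"},
                    {"objectName": "DB.SCHEMA.CUSTOMER_TABLE", "columnName": "LAST_NAME"},
                ],
            },
            {"columnName": "LOADED_AT", "directSources": []},
        ],
    }
]

for edge in column_edges(sample):
    print(edge)
```

The LOADED_AT column has no direct sources, so it still produces one row with empty source fields, just as the outer flatten keeps such columns in the SQL result.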

2. Lineage API

The Snowpark Lineage API enables granular tracking of machine learning workflows within Snowflake, allowing users to programmatically trace dependencies between ML objects like Feature Views, Datasets, and Model Versions. 

Using Python methods such as lineage() on these objects, you can map upstream sources (e.g., raw tables) and downstream impacts (e.g., deployed models) while filtering by domain (tables, views, models) for focused analysis. This API focuses on ML lineage rather than traditional data lineage, and its domain filters exclude non-ML objects such as stages. If you already use Snowpark ML, this API provides foundational lineage tracking, but you will need to supplement it with other tools for full-stack visibility.

from snowflake.snowpark.lineage import LineageDirection

# Trace MYVIEW dependencies (assumes `session` is an existing Snowpark Session)
lineage_df = session.lineage.trace(
    object_name='YOUR_DB.SCHEMA.MYVIEW',
    object_domain='VIEW',  # the traced object is a view, not a table
    direction=LineageDirection.BOTH,
    distance=2
)

# Display lineage relationships (source object, target object, direction, distance)
lineage_df.show()

3. Snowsight Lineage (Snowflake Lineage UI)

If you have Snowflake Enterprise Edition or higher, you can visually track how data flows from source to target Snowflake objects within Snowsight (Snowflake’s UI). This offers a more comprehensive view of upstream and downstream object relationships, but it remains limited to Snowflake objects.

The basic lineage actions available within Snowsight include: A) show additional columns within an object, B) show or hide objects that are upstream and downstream, C) show how the downstream object was created (e.g. SQL statement), and D) view a new lineage diagram focused on the selected object (Source: Snowflake).

Snowsight's lineage view is useful for quickly understanding dependencies between tables, views, and other Snowflake-native objects, but it does not include external sources or downstream consumers such as cloud storage stages, third-party tools, or BI platforms. As of March 2025, the visual graph itself is object-level only; column-level lineage is not rendered on the graph but is accessible via a side panel for supported columns.

Implementing Data Lineage in Snowflake at Your Organization

While Snowflake provides built-in features to track lineage at various levels of granularity for its own objects, most organizations need comprehensive, cross-platform data lineage. To meet that requirement, organizations are increasingly turning to automated solutions like Select Star. By continuously parsing queries and capturing metadata changes, these tools maintain accurate, up-to-date lineage with minimal manual work. This allows teams to focus on using lineage insights rather than maintaining the lineage itself.
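
To give a feel for what parsing queries for lineage involves, here is a deliberately simplified Python sketch that extracts table-level lineage from a single statement using regular expressions. It is an illustration only and not how Select Star or any production tool works: real lineage tools rely on full SQL parsers to handle CTEs, subqueries, quoted identifiers, and column-level mapping. The table names in the sample statement are hypothetical.

```python
import re

# Matches the object written by an INSERT INTO or CREATE [OR REPLACE] TABLE/VIEW.
TARGET_RE = re.compile(
    r"\b(?:insert\s+into|create\s+(?:or\s+replace\s+)?(?:table|view))\s+([\w.]+)",
    re.IGNORECASE,
)
# Matches any identifier that follows FROM or JOIN.
SOURCE_RE = re.compile(r"\b(?:from|join)\s+([\w.]+)", re.IGNORECASE)


def table_lineage(sql):
    """Return (target, sorted sources) for a simple INSERT/CREATE ... SELECT statement."""
    target_match = TARGET_RE.search(sql)
    if target_match is None:
        return None  # statement does not write to a table or view
    target = target_match.group(1).upper()
    sources = {name.upper() for name in SOURCE_RE.findall(sql)}
    sources.discard(target)
    return target, sorted(sources)


# Hypothetical statement for illustration.
sql = """
create or replace table analytics.customer_summary as
select c.id, count(o.id) as order_count
from raw.customers c
left join raw.orders o on o.customer_id = c.id
group by c.id
"""

print(table_lineage(sql))
```

Running a routine like this over every statement in the query history, and keeping the results current as objects change, is the part that automated tools take off a team's hands.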

Select Star's interactive graph exploration at the column level allows for granular investigation of data lineage, empowering users to drill down into specific data elements and understand their journey through the data stack including Snowflake.

Using Select Star to Automate Snowflake Data Lineage

Select Star offers powerful and comprehensive Snowflake lineage capabilities that provide full visibility into your data ecosystem. Through automated lineage generation, Select Star parses SQL statements to create detailed column-level lineage, allowing users to trace data flows from source systems all the way through to BI dashboards.

Robust lineage underpins many advanced use cases, including the following:

  • Automatic propagation of PII classifications and other tags to facilitate data access and masking policies
  • User notifications for impactful data changes, so you can quickly reach the affected users, whether individual stakeholders, entire teams, or downstream asset users, based on data lineage
  • Cross-platform usage statistics that show how data is being used across your organization and which data could be deprecated
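
The first use case, tag propagation, can be pictured as a graph walk over column-level lineage edges. The following Python sketch is one possible way to express it, under the assumption that lineage is available as (source column, target column) pairs and that tags are simple strings. The column names and the "PII" tag are hypothetical, and this is not a description of Select Star's implementation.

```python
from collections import defaultdict, deque


def propagate_tags(edges, tagged):
    """Copy each tag from its tagged columns to every downstream column."""
    downstream = defaultdict(set)
    for source, target in edges:
        downstream[source].add(target)

    result = defaultdict(set)
    for column, tags in tagged.items():
        result[column] |= set(tags)

    queue = deque(tagged)
    while queue:
        column = queue.popleft()
        for target in downstream.get(column, ()):
            missing = result[column] - result[target]
            if missing:
                result[target] |= missing
                queue.append(target)  # revisit so new tags keep flowing
    return {column: sorted(tags) for column, tags in result.items()}


# Hypothetical column-level lineage edges and one manually tagged column.
edges = [
    ("RAW.CUSTOMERS.EMAIL", "STAGING.CUSTOMERS.EMAIL"),
    ("STAGING.CUSTOMERS.EMAIL", "ANALYTICS.CUSTOMER_SUMMARY.CONTACT"),
    ("RAW.CUSTOMERS.ID", "STAGING.CUSTOMERS.ID"),
]
tagged = {"RAW.CUSTOMERS.EMAIL": {"PII"}}

for column, tags in propagate_tags(edges, tagged).items():
    print(column, tags)
```

Because a column is re-queued only when it gains a tag it did not already have, the walk terminates even if the lineage graph contains cycles. The same traversal, run from a changed column instead of a tagged one, yields the set of downstream assets whose users would need to be notified.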

Leading companies across industries are leveraging Select Star's powerful data lineage capabilities to transform their data management practices and drive business value. 

HDC Hyundai, a major South Korean construction and real estate development company, achieved complete visibility into its data landscape, streamlining audit compliance and reducing internal data request handling time by 75%. The company's migration to Snowflake was significantly smoothed by Select Star's comprehensive metadata visibility and lineage tracking. 

Wallbox, a global leader in EV charging solutions, leveraged Select Star to gain clarity and control over their rapidly expanding data ecosystem. By utilizing data lineage, Wallbox was able to deprecate over 200 dashboards, streamline their data pipelines, and implement a tiered governance system for their critical metrics.

Nib, a major health insurance company in Australia and New Zealand, tackled the complexity of managing over 10,000 data tables in Snowflake by using Select Star's column-level lineage to trace data origins and transformations quickly. This enabled nib to enhance data discovery and governance across its multinational operations.

Faire, a B2B e-commerce platform, used Select Star in conjunction with Snowflake to slash their data pipeline costs by 70% and reduce debugging hours for analytics engineering by 80%. Select Star's intuitive interface and automatic popularity rankings allowed Faire to quickly gain insights into their data usage, identify essential columns, and deprecate unnecessary ones, leading to significant cost savings and improved data quality. 

These success stories demonstrate how lineage can enable companies to navigate complex data landscapes, ensure data quality, and make more informed, data-driven decisions while optimizing their data infrastructure costs.

The Future of Snowflake Data Lineage

By providing a comprehensive view of data flows and transformations, lineage underpins effective data governance, enabling companies to maintain compliance, ensure data integrity, and streamline auditing processes. Data lineage empowers proactive data quality management by allowing teams to quickly identify and address issues at their source, reducing the propagation of errors throughout the data ecosystem. Most importantly, robust lineage capabilities support faster, more confident decision-making by providing context and clarity around data origins and transformations, thus enhancing trust in data-driven insights. 

As organizations continue to grapple with increasingly complex data landscapes, the value of automated, granular data lineage cannot be overstated. We invite you to explore Select Star's automated lineage features and we'd love to help accelerate your journey towards more effective management, governance, and utilization of your data.
