Take Control of Snowflake Costs with Select Star

Pratyush Walvekar
June 29, 2023

Cloud data warehousing has become increasingly prevalent, offering businesses the ability to scale and manage their data efficiently. The benefits are numerous, ranging from improved performance and reliability to ease of use. However, one critical aspect that often goes overlooked is the cost of operating in the cloud and how it scales over time. As organizations scale their operations, the expenses associated with cloud infrastructure can quickly spiral out of control, leaving businesses struggling to understand why their costs have skyrocketed.

Cloud data warehouses, such as Snowflake, have revolutionized the way businesses store, process, and analyze their data. The elastic scalability these platforms offer, achieved by separating storage from compute, allows organizations to adapt seamlessly to changing data demands while maintaining performance. However, this flexibility comes with a price tag. While data democratization allows various departments to make data-driven decisions, the growing need for data analysis and computation makes the operational cost of cloud infrastructure unpredictable.

Today, as organizations prioritize data and AI enablement, the ability to estimate and model the compute workload on the data warehouse holds immense significance. However, comprehending and effectively managing the associated costs may not be so straightforward…

Let’s take the example of a large food delivery company I used to work at. Growing rapidly over the pandemic, we were reaching an order value of over $30 billion a year. As a result, the marketing, business operations, and engineering functions that support delivery ordering and logistics were very much data-driven operations.

Teams across the company consumed data from Snowflake through ad hoc queries and dashboards in four different BI tools at the time. As the company experienced fivefold growth, the number of requests to the data team for one-off dashboards or ad hoc queries surged by 400%. These dashboards would then circulate within each team and become the trusted Source of Truth. Over time, hundreds of users began relying on these dashboards for their everyday operations.

As our spending on Snowflake reached millions of dollars, we embarked on an internal analysis of our cloud costs, which uncovered several insights along the way. One of the most critical pain points was directly tied to our BI tools: certain dashboards, each accumulating hundreds of views per day, were single-handedly incurring costs exceeding $100K annually.

Snowflake’s Usage-based Pricing Model

Understanding the cost of operating in a cloud data warehouse environment, such as Snowflake, is far from straightforward. Snowflake measures compute usage in credits, which are consumed whenever compute resources are doing work: executing a query, performing a data load, or running any other operation in Snowflake. The number of credits used depends on factors like the size and complexity of your queries, the sorting and clustering of tables required during analysis, the volume of data processed, and the compute resources you utilize.

Snowflake's cost allocation is based on a combination of compute resources and storage.

  • Compute resources include virtual warehouses, which are clusters of compute nodes dedicated to processing queries. The size and concurrency level of the virtual warehouses directly impact the cost.
  • Storage costs are determined by the amount of data stored in Snowflake, including both structured and semi-structured data.
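As a rough illustration of how these two components add up, here is a minimal sketch in Python. The credits-per-hour table follows Snowflake's standard warehouse sizes, where each size up doubles the hourly rate. The function name, the price per credit, and the storage rate per TB are assumptions for illustration only; actual rates depend on your edition, region, and contract.

```python
# Credits consumed per hour by one cluster of each standard warehouse size.
# Each size up doubles the hourly rate.
CREDITS_PER_HOUR = {
    "XS": 1, "S": 2, "M": 4, "L": 8,
    "XL": 16, "2XL": 32, "3XL": 64, "4XL": 128,
}


def estimate_monthly_cost(size, clusters, running_hours, storage_tb,
                          price_per_credit=3.0, price_per_tb_month=23.0):
    """Estimate one month of Snowflake spend in dollars.

    price_per_credit and price_per_tb_month are assumed placeholder
    rates, not quoted prices.
    """
    credits = CREDITS_PER_HOUR[size] * clusters * running_hours
    compute = credits * price_per_credit
    storage = storage_tb * price_per_tb_month
    return {"compute": compute, "storage": storage, "total": compute + storage}


# A Medium warehouse on one cluster, running 8 hours a day for 30 days,
# with 10 TB of stored data.
estimate = estimate_monthly_cost("M", clusters=1, running_hours=8 * 30, storage_tb=10)
print(estimate)
```

Under these assumed rates, compute dominates the bill, which is why warehouse size, cluster count, and running time are the first things to look at.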

For large organizations, the main reasons Snowflake costs can climb sharply over time are the following:

  • Elasticity: Snowflake gives you the flexibility to scale warehouse size and the number of clusters up or down based on demand. However, this scalability can result in unpredictable cost fluctuations, because compute costs are driven by both the size and the number of clusters. Without careful monitoring of usage, businesses may struggle to manage and forecast their costs effectively.
  • Unpredictable Pricing Model: Snowflake's pricing model involves multiple components, such as storage costs, compute costs, and data transfer costs, making it difficult to track and optimize expenses. In addition, Snowflake’s UI today offers limited visibility into where costs are actually being incurred.
  • Poorly Written Queries: Inefficient queries or poorly designed data models can lead to unnecessary resource consumption, especially when used frequently and widely across the organization or as core models.
  • Third-Party Integrations: Snowflake offers seamless integration with various BI tools, third-party tools and services. However, some integrations may incur additional costs, such as data ingestion from external sources (ETL jobs), frequent refresh schedules for resource-intensive dashboards or data exports to other platforms. Considering the cost implications of these integrations is important when evaluating overall expenses.
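To see how a frequently loaded, resource-intensive dashboard can add up, here is a back-of-the-envelope sketch in Python. All of the inputs (loads per day, seconds of warehouse time per load, warehouse size, and price per credit) are hypothetical and are not figures from the company described above. The sketch attributes warehouse time to the dashboard in proportion to its query runtime, which is only an approximation: it ignores result caching, idle time before auto-suspend, and other queries sharing the same warehouse.

```python
# Credits per hour for one cluster of each standard warehouse size.
CREDITS_PER_HOUR = {"XS": 1, "S": 2, "M": 4, "L": 8, "XL": 16}


def annual_dashboard_cost(loads_per_day, seconds_per_load, size,
                          price_per_credit=3.0):
    """Approximate yearly dollars attributable to one dashboard.

    price_per_credit is an assumed placeholder rate.
    """
    hours_per_year = loads_per_day * seconds_per_load * 365 / 3600
    return hours_per_year * CREDITS_PER_HOUR[size] * price_per_credit


# Hypothetical dashboard: 300 loads a day, each keeping a Large
# warehouse busy for about two minutes.
cost = annual_dashboard_cost(loads_per_day=300, seconds_per_load=120, size="L")
print(f"${cost:,.2f} per year")
```

Even with modest per-load runtimes, the multiplication by daily views and warehouse size is what pushes a single dashboard into five or six figures a year.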

Snowflake Cost Analysis with Select Star

At Select Star, we believe the first step towards effective cost management is understanding where costs originate and in what context, and then figuring out how they can be optimized.

We wanted to tackle this problem by helping organizations understand their cloud data warehousing costs comprehensively and identify the various factors impacting their expenses. With clear attribution of cost, we can provide insights into where further optimization is necessary.

Today I’m excited to announce Select Star’s Snowflake Cost Analysis, which provides detailed insights into Snowflake costs at multiple levels of data assets including SQL queries, database tables, BI dashboards, compute warehouses, and users and teams.

Select Star’s Snowflake Cost Analysis includes the following:

Overview: This section provides a concise and informative snapshot of KPIs related to Snowflake cost, offering a summary of the expenditure.

Overview tab of Select Star’s Cost Analysis feature

Total Cost & Query Count by Day: This chart visualizes the daily trends and relationship between the total cost incurred and the number of queries executed.

Chart showing Total Cost & Query Count by Day
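For readers who want to reproduce a similar daily view from their own data, here is a minimal sketch of the underlying aggregation in Python. It is not Select Star's implementation. The query log rows, user names, warehouse names, and per-query costs are all hypothetical, and it assumes each query already has a dollar cost attributed to it.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical query log: (start time, user, warehouse, attributed cost in dollars).
QUERY_LOG = [
    ("2023-06-26 09:15:00", "ana", "BI_WH", 1.20),
    ("2023-06-26 14:40:00", "ben", "ETL_WH", 4.50),
    ("2023-06-27 09:05:00", "ana", "BI_WH", 0.80),
    ("2023-06-27 09:30:00", "cy", "BI_WH", 2.00),
    ("2023-06-27 22:10:00", "ben", "ETL_WH", 4.50),
]


def cost_and_count_by_day(rows):
    """Sum attributed cost and count queries for each calendar day."""
    totals = defaultdict(lambda: {"cost": 0.0, "queries": 0})
    for started_at, _user, _warehouse, cost in rows:
        day = datetime.strptime(started_at, "%Y-%m-%d %H:%M:%S").date().isoformat()
        totals[day]["cost"] += cost
        totals[day]["queries"] += 1
    return dict(totals)


for day, stats in sorted(cost_and_count_by_day(QUERY_LOG).items()):
    print(f"{day}: ${stats['cost']:.2f} across {stats['queries']} queries")
```

Plotting the two series side by side shows whether a cost spike comes from more queries or from a few expensive ones.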

Average Cost per Hour and Day of the Week: This interactive chart provides deeper insights by allowing users to explore query-level cost trends by hour and by day of the week.

Chart showing Average Cost per Hour and Day of the Week

Granular Insights: The other tabs offer granular insights at the query, dashboard, warehouse, table and user level, providing detailed cost and query count for comprehensive analysis and optimization of resources within the Snowflake environment.

Select Star’s Cost Analysis feature showing Top 100 queries by cost

Cost per Team and User in Select Star’s Cost Analysis feature
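The same idea extends to any dimension you can attach to a query. Below is a minimal sketch in Python, not Select Star's implementation, that ranks a chosen dimension (user, team, or warehouse) by attributed cost. The rows, field names, and costs are hypothetical.

```python
from collections import Counter

# Hypothetical rows: one per query, with its attributed cost in dollars.
QUERIES = [
    {"user": "ana", "team": "marketing", "warehouse": "BI_WH", "cost": 1.25},
    {"user": "ben", "team": "data-eng", "warehouse": "ETL_WH", "cost": 4.50},
    {"user": "ana", "team": "marketing", "warehouse": "BI_WH", "cost": 0.75},
    {"user": "cy", "team": "ops", "warehouse": "BI_WH", "cost": 1.00},
    {"user": "ben", "team": "data-eng", "warehouse": "ETL_WH", "cost": 4.50},
]


def top_spenders(rows, dimension, n=100):
    """Return up to n (value, total_cost, query_count) tuples, highest cost first."""
    cost = Counter()
    count = Counter()
    for row in rows:
        cost[row[dimension]] += row["cost"]
        count[row[dimension]] += 1
    return [(key, total, count[key]) for key, total in cost.most_common(n)]


for team, total, n_queries in top_spenders(QUERIES, "team"):
    print(f"{team}: ${total:.2f} across {n_queries} queries")
```

Swapping `"team"` for `"user"` or `"warehouse"` gives the other breakdowns without changing the function.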

Conclusion

With Select Star’s Snowflake Cost Analysis feature, you gain the power to:

  • Unlock insights into usage patterns across the development lifecycle and project anticipated costs over time.
  • Stay on top of performance and cost by regularly monitoring and conducting check-ins for both the trusted Source of Truth and ad-hoc dashboards.

In the realm of cloud data warehousing, cost analysis is a crucial practice that enables businesses to comprehend, manage, and optimize their expenditure effectively. Understanding where costs originate and how they can be optimized is vital for long-term sustainability and growth. Our Cost Analysis feature is designed to provide businesses with the insights and capabilities they need to navigate the complexities of cost attribution within the Snowflake environment. By leveraging this powerful tool, companies can gain a competitive edge by optimizing their Snowflake spend, unlocking significant cost savings, and maximizing the value of their data warehousing investment.

To learn more about Select Star’s Cost Analysis, visit our documentation or reach out to support@getselectstar.com to get access to this feature.
