Hello from Snowflake Summit 2022!
Today, we are releasing Auto-generated ERD (Entity-Relationship Diagram) to help data analysts and citizen data scientists understand and use their data more effectively.
What is an Entity Relationship Diagram (ERD)?
An Entity Relationship Diagram (ERD) is the starting point for any data analysis. According to TechTarget, Entity Relationship Diagrams provide a visual starting point for database design that can also be used to help determine information system requirements throughout an organization. After a relational database is rolled out, an ERD can still serve as a reference point, should any debugging or business process re-engineering be needed later.
This makes ERDs one of the most essential parts of data modeling and architecture, especially for relational databases that are queried with SQL.
Today, most ERDs are created up front, as a blueprint that maps the business process to the data. However, as a company grows and its data is democratized, keeping ERDs up to date manually becomes almost impossible.
At Select Star, we think ERDs are a core part of working with data, because they let anyone understand how each dataset can be joined with another.
While data lineage maps out the flow of data (where it came from and where it is going), an ERD maps out how different datasets can be used together.
Automatically Generating ERDs
So how does this work?
First, Select Star looks at any primary key and foreign key labels in the database. Each relationship it finds there is added to our ERD model.
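To make the first step concrete, here is a minimal sketch of reading declared foreign keys from a database catalog. It is not Select Star's implementation, which this post does not describe. It uses an in-memory SQLite database purely so the example is self-contained, and the `declared_relationships` helper, the table layout, and the column names are all hypothetical.

```python
import sqlite3


def declared_relationships(conn):
    """Return (child_table, child_column, parent_table, parent_column)
    tuples for every foreign key declared in a SQLite database."""
    edges = []
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    for table in tables:
        # PRAGMA foreign_key_list returns the columns:
        # id, seq, table, from, to, on_update, on_delete, match
        for fk in conn.execute(f'PRAGMA foreign_key_list("{table}")'):
            edges.append((table, fk[3], fk[2], fk[4]))
    return edges


# Hypothetical two-table schema, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id)
    );
    """
)
relationships = declared_relationships(conn)
print(relationships)
```

Each tuple is one edge in the diagram: the child table and column, and the parent table and column they reference. Other databases expose the same information through their own catalog views.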
Second, Select Star also looks at all the joins each column takes part in, along with their join conditions. In SQL, a join condition is the expression in a JOIN's ON clause that states which columns must match.
Looking up the ORDERS table in Select Star. You can see that three other tables are related to this table: ORDER_ITEMS, CUSTOMERS, and ORDER_PAYMENTS.
From these join conditions, we can infer which tables are related to each other and how.
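As a rough illustration of that inference, the sketch below pulls qualified equality predicates out of a query and resolves table aliases to table names. It is a simplified, regex-based approximation, not Select Star's actual method, and a production system would use a real SQL parser. The table names come from the ORDERS example above, while the query, the column names, and the `infer_relationships` helper are assumptions made for the example.

```python
import re

# Keywords that may follow a table name and must not be read as an alias.
_KEYWORDS = r"(?:ON|JOIN|LEFT|RIGHT|INNER|OUTER|FULL|CROSS|WHERE|GROUP|ORDER|USING)\b"
# Matches "FROM orders o", "JOIN customers AS c", or "JOIN customers".
TABLE_REF = re.compile(
    rf"\b(?:FROM|JOIN)\s+(\w+)(?:\s+(?:AS\s+)?(?!{_KEYWORDS})(\w+))?",
    re.IGNORECASE,
)
# Matches qualified equality predicates such as "o.customer_id = c.customer_id".
EQUALITY = re.compile(r"(\w+)\.(\w+)\s*=\s*(\w+)\.(\w+)")


def infer_relationships(sql):
    """Return a sorted list of undirected edges between (TABLE, COLUMN) pairs."""
    # Map each alias (or bare table name) to its upper-cased table name.
    aliases = {}
    for table, alias in TABLE_REF.findall(sql):
        aliases[(alias or table).upper()] = table.upper()

    edges = set()
    for left, left_col, right, right_col in EQUALITY.findall(sql):
        left_table = aliases.get(left.upper())
        right_table = aliases.get(right.upper())
        # Skip predicates whose qualifiers are unknown or refer to one table.
        if left_table and right_table and left_table != right_table:
            # Sort the two ends so "a = b" and "b = a" give the same edge.
            ends = sorted(
                [(left_table, left_col.upper()), (right_table, right_col.upper())]
            )
            edges.add(tuple(ends))
    return sorted(edges)


# Hypothetical query; the column names are invented for illustration.
query = """
SELECT o.order_id, c.customer_name
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
JOIN order_items oi ON oi.order_id = o.order_id
JOIN order_payments op ON op.order_id = o.order_id
"""

edges = infer_relationships(query)
related_to_orders = {table for edge in edges for table, _ in edge} - {"ORDERS"}
print(sorted(related_to_orders))
```

Running this kind of extraction over many logged queries and counting how often each edge appears would give a rough measure of which joins are actually used. That is one possible design, not a description of the product.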
“The automatic ERD feature allows data analysts to more easily discover what table and field to join for a query without having to talk to another analyst who has tribal knowledge.”
Veronica Zhai, Director of Analytics Product and Operation @ Fivetran
Since most of our users are data analysts, we’re excited to release this feature to production today, and to continue bringing you more insights about your data. Please check it out and let us know what you think!