
Tacos and Data Teams: Data Council 2023

Sean Anderson
May 1, 2023

The University of Texas at Austin was the quiet setting for one of the most distinctive and fastest-growing data conferences in North America. Aimed at data practitioners, Data Council is a 3-day conference covering data management, data integration, machine learning, and exploratory data science. The breadth of the subject matter allows for cross-discipline conversations that capture the range of data work modern teams have to tackle. The speakers are often authentic and purposeful, focusing on building solutions rather than adding confusion with hype and rhetoric. In addition, conference attendees get to mingle directly with solution providers while snacking on some of the tastier breakfast tacos the Lone Star State has to offer. Overall, the mix of practitioners, luminaries, and solution providers creates a unique climate for problem-solving, one well appreciated by the attendees. This year's program included keynotes and roundtables, lightning talks, and breakout sessions. We will dive into some of the highlights.

A special message from former President Obama on the importance of encouraging data-driven culture
U.S. Chief Data Scientist DJ Patil's mission

Keynotes and Breakout Sessions

Day 1 started with much anticipation and said delicious tacos. Keynotes covered controversial topics like “Big Data is Dead” and breakouts featured key data leaders like data contracts evangelist Chad Sanderson, machine learning garbage fire chief Emily Curtin, and Uber’s Nan and Yupeng, who detailed the team's journey to hyper-growth. After a full day of talks, attendees were eager to mingle. The team at Databand delivered a perfect opportunity for people to connect at the conference community event (Thanks, Databand!).

Day 2 keynotes ventured into topics of machine learning and AI, with the opening panel on the state of AI featuring some of machine learning’s core orchestrators. This cascaded into breakout conversations about development for generative AI, leveraging OpenAI for crisis response, and building metric tree assessments for team productivity. The Data Council team also assembled a great panel of investor-leaders who discussed how investors think about data and its growing potential.

Day 3 delivered a powerful and important message from former US Chief Data Scientist DJ Patil on the government's core mission to enable data workers everywhere. DJ reflected on his time in the role and what he might have changed looking back. The breakouts that preceded it focused on the heroics of data teams in building out processes and new capabilities. Data leaders from GitHub, Riot Games, and Freewheel chronicled their team stories to help guide future data teams toward success. The day ended with another expert panel dead set on dispelling the hype and confusion in the data ecosystem, with leaders from Hex, dbt, and West Marin Data. Throughout the third day, there was another lively gathering happening: a lightning talk track brought to you by the organizers of Data Council and Select Star.

A full day of talks from data leaders and solution builders.

Lightning Talks: Practical Tidbits from Fast-growing Companies 

The lightning track was featured during the last day of the Data Council conference. Even so, we were met with a ton of energy and interest. The track was curated by Select Star founder and CEO Shinji Kim in partnership with Data Council. It aimed to feature innovative data product and solution leaders in direct, abbreviated conversations explaining how they did it and what lessons they learned along the way. The response to the call for speakers was inspiring, and we knew we had some great topics and enigmatic speakers on our hands. The reality of the day did not disappoint.

The lightning talks covered topics across the data ecosystem: data management, machine learning, data reliability engineering, building data products, and more. Here’s a play-by-play and a little taste of what you can expect from this crowd in the future.

Pardis Noorzad from General Folders kicked things off by setting the state of cross-company data exchange. Pardis described a variety of methods enterprises use to share and transfer data, detailing their pros and cons.
Data discovery has an important role in modern data management. Alec Bialosky from Select Star shared tips for how to set your data teams up for success using automated data discovery.
Alice Leech from Whatnot introduced us to the concept of extreme self-service, a practice that is not for the faint of heart. Whatnot could only achieve this by building a strong data community, encouraging SQL literacy, and constructing robust guardrails in shared development environments.
Timothy Chan talked about experimentation: how to start small, then scale your experiments. Tim explained how his team rolled out systematic experiments that increased user adoption and optimized for speed and responsiveness.
Katie Hindson shifted the topic to data products. She explained that when data stakeholders become more literate about leveraging data, the impact of data products becomes tangible to everyone, not just data teams.
Rick Saporta continued on the theme of data products but got a bit more existential. Rick dove into data’s core purpose, stating that once you align your teams to the purpose, you can increase adoption and user love.
Moving up the data value chain, Ivan Aguilar talked about how his team built real-time model training. Deploying the model is really only an interim step, explained Ivan; good models require monitoring for accuracy and collecting statistical metrics.
But what even is production? DagsHub founder and CEO Dean Pleban posed the critical questions you must ask your data team. Since over 87% of data science projects never make it to production, data teams need to agree on what production actually means.
"Models in production" is a null metric when your models are unpredictable and deliver erroneous results. A key culprit of these downstream issues is label errors in machine learning models. Curtis Northcutt approached the problem from an academic worldview and talked about the open-source project he built.
Running data reliably at SaaS giants like Uber is no easy undertaking, often involving searching for root causes in uncharted waters. Bigeye founder and CEO Kyle Kirwan talked about instituting incident management for his data team and broke down the role of data reliability engineering.
During the conference, one topic in heavy rotation was data contracts, a concept well established in theory but still forming in practice. Zackary Klein explained how his team at Whatnot went from inception to processing tens of millions of events across hundreds of different event types each day through the use of data contracts and an interface definition language (Protobuf).
Complexity is often commonplace in data teams, but some posit that data work doesn’t have to be that complex. Enter the concept of the activity schema, an open-source modeling framework for data warehouses. Its creator, Ahmed Elsamadisi, explained how the activity schema uses time-series data to simplify data modeling for general-purpose reporting and analytics.
We completed the day by talking about the understated importance of data correctness. Emma Tang detailed how Stripe’s low tolerance for data inaccuracy poses unique constraints on how infrastructure is designed.

We can’t thank everyone enough, from the speakers to the participants, for their energy on Day 3. The level of engagement we experienced helped us finish Data Council on a strong note.

Select Star team ready to talk data discovery and data governance.

Wrapping Up

We would be remiss if we didn’t mention the great conversations we had at the Select Star booth at Data Council. We spoke with engineers trying to better document their integrations and curated data sets, analysts who wanted to better understand the origin and dependencies of their workbooks and dashboards, and data scientists who wanted to better understand their data model to build features. In each case, we talked through how Select Star can help solve these real-world problems in their organizations.

That’s a wrap. We look forward to next year’s Data Council event. If you chatted with us at the show or simply want to learn more about data discovery, schedule time with our experts today.
