Inside facilities the size of a big-box store, Bowery Farming grows leafy greens and strawberries for over 1,000 retailers throughout the Northeast and Mid-Atlantic. These crops are traditionally grown outdoors and shipped long distances. Bowery Farming brings production closer to the end customer by growing indoors, which eliminates the need for pesticides, allows for up to 100 times more efficient use of space and gets fresher produce on store shelves faster.
Data operations that need to scale with business growth
Bowery Farming currently operates four indoor growing facilities, with a fifth coming online later this month. Each facility contains thousands of sensors and cameras that precisely track variables like temperature, humidity, and soil chemistry. Data from these technologies informs how growers manage the environmental conditions in each facility, supplying them with critical insights to mitigate issues and hit yield targets. In this sense, data matters just as much as soil and light for plant production, which creates four distinct challenges:
Given the vital importance of data in the operation, Bowery Farming currently employs 11 data scientists among a total staff of over 500. That small team manages a data warehouse with 900 tables containing billions of individual data points. It’s a massive and growing repository of information, scaling not just in the number of farms and ‘rows’ of data points, but also in the types of sensors and systems, or ‘columns’ of data.
With such a large data warehouse, it was challenging for the data team to identify what was present and what was missing, and to document those findings in a systematic way. What exactly the data warehouse contained (and what was extraneous) was unclear, which made the whole thing harder to manage and leverage.
Lack of visibility into intricate dependencies between data models made it difficult to see the connections between data tables and data dashboards located downstream. Changes to tables regularly resulted in broken dashboards, leaving end users like growers without the time-sensitive data necessary to maintain optimal growing conditions.
Sometimes it takes an hour to fix dashboard issues, but complex problems can take a full day for a data scientist. When a dashboard is broken, the on-call data scientist needs to put their priorities on pause. The business stakeholders also need to wait for the dashboard to be fixed before they can make their decisions. As the business grows and more stakeholders need support, this challenge could pose a significant bottleneck.
Problems with dashboards and data weren’t isolated; issues appeared once or twice a week on average. “As our team and number of tables expanded it became increasingly expensive to not have a more comprehensive data discovery tool,” says Travis Dunlop, Sr. Data Scientist. “More and more often, we would accidentally break dashboards when updating our dbt models.”
For an operation so dependent on maintaining optimal growing conditions for leafy greens, anything that impeded data was a problem. It was clear something needed to be done. But with so much data under management, finding the right solution seemed as difficult as the problem itself.
The ability of Select Star to offer column level lineage has been incredibly helpful in determining dependency flows within our dbt project. Also with Select Star's integration of Mode Analytics, we can be confident where metadata changes will be surfaced in BI reports.
Travis Dunlop
Senior Data Scientist and Engineer at Bowery Farming
A data discovery platform for the modern enterprise
The data team first heard about Select Star from Bowery Farming’s investors and decided to give it a try. After their trial, it became clear that Select Star’s unique build and data discovery-focused functionality supported exactly what Bowery Farming needed and wanted to achieve.
The primary use case for Bowery Farming was to easily understand the origin and flow of data in a complex environment across multiple data tools. Delivered on AWS Cloud, Select Star works automatically to catalog and classify the entire contents of Bowery Farming’s databases, BI dashboards, and the dependencies that exist in between. The team at Bowery Farming was able to replace problems with solutions:
Before, they could only see data lineage at the table level with dbt docs. After implementing Select Star, they can now see lineage at the column level to ascertain dependencies between data tables and dashboards — quickly, clearly, and comprehensively. Data scientists are able to check how changes to the dbt projects would affect downstream dashboards. And by anticipating issues early, they can prevent dashboards from breaking instead of fixing them after the fact.
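The impact check described above can be pictured as a graph traversal over column-level lineage. The following is a minimal sketch, not Select Star's implementation: the lineage data is a hand-written dictionary, and every table, column, and dashboard name in it is hypothetical. It only illustrates how following column-to-column edges reveals which dashboards depend on a column before that column is changed.

```python
from collections import deque

# Hypothetical column-level lineage: each key feeds the nodes in its list.
# All names are illustrative; none come from Bowery Farming's warehouse.
LINEAGE = {
    "raw.sensor_readings.temp_c": ["stg.readings.temp_c"],
    "raw.sensor_readings.humidity": ["stg.readings.humidity"],
    "stg.readings.temp_c": ["mart.room_climate.avg_temp_c"],
    "stg.readings.humidity": ["mart.room_climate.avg_humidity"],
    "mart.room_climate.avg_temp_c": ["dashboard:Climate Overview"],
    "mart.room_climate.avg_humidity": [
        "dashboard:Climate Overview",
        "dashboard:Humidity Alerts",
    ],
}

DASHBOARD_PREFIX = "dashboard:"


def downstream_dashboards(column, lineage):
    """Return the sorted names of dashboards reachable from `column`."""
    seen = {column}
    queue = deque([column])
    dashboards = set()
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child in seen:
                continue
            seen.add(child)
            if child.startswith(DASHBOARD_PREFIX):
                # Dashboards are leaves: record the name, do not traverse further.
                dashboards.add(child[len(DASHBOARD_PREFIX):])
            else:
                queue.append(child)
    return sorted(dashboards)


if __name__ == "__main__":
    for col in ("raw.sensor_readings.temp_c", "raw.sensor_readings.humidity"):
        print(col, "->", downstream_dashboards(col, LINEAGE))
```

With table-level lineage alone, both raw columns would appear to affect every dashboard built on the same tables. Tracking edges per column narrows the result to the dashboards that actually read the changed field.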
Internal team communications improved once they stopped being dominated by questions about data provenance. “The built-in integration with dashboarding tools (Mode Analytics in our case) means that we can be confident where data changes will be surfaced,” said Travis. Select Star helped optimize how the data team functioned internally and within the larger organization, especially as the company scales and visibility across all data points becomes more mission-critical.
Data scientists can now make sense of the data warehouse on day one instead of weeks or months after starting. And by improving the efficiency and efficacy of the data team, Bowery Farming can scale that team upwards to support a business (and data warehouse) growing by leaps and bounds.
The addition of Select Star has been a game-changer for the data team. “The ability of Select Star to offer column level lineage has been incredibly helpful in determining dependency flows within our dbt project during development,” according to Travis.
Select Star is the first stop whenever someone on the data team has questions about the provenance or availability of data.
Travis Dunlop
Senior Data Scientist and Engineer at Bowery Farming
Integrated data operations & management supporting business growth
The data team works closest with Select Star, but the results and the beneficiaries extend across the entire organization:
The data team deals with significantly fewer support tickets because dashboards rarely go down. Even when they need to fix broken dashboards, they spend 1-2 minutes looking at the lineage instead of 30 minutes to an hour reading through the SQL queries and dbt models underneath.
The data team has plans to utilize Select Star’s Client API, among others, in their continuous integration (CI) system. Once complete, the team can automate end-to-end testing throughout the entire data stack to ensure data integrity and accessibility while investing minimal time and resources into the process.
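As a rough illustration of what such a CI step might look like, the sketch below fails a build when a changed column feeds a dashboard that nobody has explicitly acknowledged. It rests on several assumptions: the lineage edges are passed in as plain (upstream, downstream) pairs rather than fetched from any real API, the list of changed columns is supplied by hand, and the function names and the `dashboard:` prefix are hypothetical conventions invented for this example.

```python
from collections import defaultdict


def build_graph(edges):
    """Turn (upstream, downstream) pairs into an adjacency mapping."""
    graph = defaultdict(list)
    for upstream, downstream in edges:
        graph[upstream].append(downstream)
    return graph


def impacted_dashboards(changed_columns, edges, dashboard_prefix="dashboard:"):
    """Map each changed column to the sorted dashboards downstream of it."""
    graph = build_graph(edges)
    report = {}
    for column in changed_columns:
        stack, seen, hits = [column], {column}, set()
        while stack:
            node = stack.pop()
            for child in graph.get(node, []):
                if child in seen:
                    continue
                seen.add(child)
                if child.startswith(dashboard_prefix):
                    hits.add(child)
                else:
                    stack.append(child)
        if hits:
            report[column] = sorted(hits)
    return report


def ci_exit_code(report, acknowledged=frozenset()):
    """Return 1 if any impacted dashboard is unacknowledged, else 0."""
    impacted = {dash for hits in report.values() for dash in hits}
    return 1 if impacted - set(acknowledged) else 0


if __name__ == "__main__":
    # Hypothetical inputs: in a real pipeline these would come from the
    # version-control diff and from a lineage export, respectively.
    edges = [
        ("stg.readings.temp_c", "mart.room_climate.avg_temp_c"),
        ("mart.room_climate.avg_temp_c", "dashboard:Climate Overview"),
        ("stg.readings.battery_pct", "mart.device_health.battery_pct"),
    ]
    report = impacted_dashboards(
        ["stg.readings.temp_c", "stg.readings.battery_pct"], edges
    )
    for column, dashboards in report.items():
        print(f"{column} affects: {', '.join(dashboards)}")
    raise SystemExit(ci_exit_code(report))
```

The `acknowledged` set is one possible design choice: it lets an author mark an intentional dashboard change as reviewed, so the check blocks only unexpected breakage.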
Supplying the business stakeholders with the exact numbers they need when they need them — and keeping that pipeline flowing without fail — helps Bowery Farming turn their ambitious agricultural concept into a sustainable and scalable business.
Bowery Farming has recently announced plans to open its sixth and seventh growing facilities in the coming year, adding thousands of new sensors and millions of new data points to its ballooning data warehouse. Select Star ensures that data stays organized and accessible, keeping growth on track and streamlining business operations.
“Select Star is the first stop whenever someone on the data team has questions about the provenance or availability of data,” says Travis.