Can a system help us build better products?
You can’t talk to your stakeholders if your pipelines are down. You can't get good quality sleep if your data products break in the middle of the night. You can't earn the trust of your stakeholders if it takes you weeks to make even the smallest change without breaking things.
Today I learn about DataOps from Christopher Bergh, CEO and Head Chef at DataKitchen. Chris has more than 25 years of research, software engineering, data analytics, and executive management experience.
Chris is a recognized expert on DataOps. He is the co-author of the “DataOps Cookbook” and the “DataOps Manifesto,” and a speaker on DataOps at many industry conferences. At various points in his career, he has been a COO, CTO, VP, and Director of Engineering. Chris has an M.S. from Columbia University and a B.S. from the University of Wisconsin-Madison.
Events coming up - Data Teams Summit
The Data Teams Summit is coming up on January 25. It's an online live event of peer-to-peer sessions. I'll be hosting a panel discussion, "What habits do successful data engineers have?" at 11AM PST.
Visit https://datateamssummit.com/ and register for free.
Join the Discovering Data community!
Do you want to turn data into business outcomes and get promoted? Discovering Data just launched a new Discord server to connect you with people like you. Discover new ideas, frameworks, jobs and strategies to maximise the impact of your work. Data can be a lonely and challenging career, don’t do it alone!
Request access now: https://bit.ly/discovering-data-discord
Do you want to showcase your thought leadership with great content and build trust with a global audience of data leaders? We publish conversations with industry leaders to help practitioners create more business outcomes. Explore all the ways to tell your data story here https://www.discoveringdata.com/brands.
Want to help educate the next generation of data leaders? As a sponsor, you get to hang out with the very best in the industry. Want to see if you are a match? Apply now: https://www.discoveringdata.com/sponsors
Do you enjoy educating an audience? Do you want to help data leaders build indispensable data products? That's awesome! Great episodes start with a clear transformation. Pitch your idea at https://www.discoveringdata.com/guest.
💬 Feedback, ideas, and reviews
Want to help me steer the direction of this show? Want to see this show grow? Get in touch privately or leave me a review with one of the forms at discoveringdata.com/review.
Your ideas help us create useful and relevant content. Send a private message or rate the show on Apple Podcast or Spotify!
Loris Marini: So today we're here to talk about a culture of learning and systems of learning, and I'm here with Christopher Bergh. Christopher is the CEO and Head Chef at DataKitchen. Chris has more than 25 years of research, software engineering, data analytics, and executive management experience.
At various points in his career, Chris has been a Chief Operating Officer, Chief Technical Officer, VP, and Director of Engineering. Chris has a Master of Science from Columbia University and a Bachelor of Science from the University of Wisconsin-Madison. Chris is a recognized expert in DataOps. He's the co-author of the DataOps Cookbook and the DataOps Manifesto, and a speaker on DataOps at many industry conferences.
He began his career at the Massachusetts Institute of Technology Lincoln Laboratory and the NASA Ames Research Center. There he created software algorithms that provided aircraft arrival optimization at several major airports in the United States. And amongst the many things he's done, one that stuck with me is that Chris served as a Peace Corps volunteer math teacher in Botswana, in Africa.
So it's an absolute pleasure to have you on the podcast, Chris. Welcome, and thanks for taking the time.
Christopher Bergh: Oh, thank you. Thank you. Looking forward to the discussion.
Loris Marini: Alrighty. So, DataOps, the whole realm of creating systems for learning. Maybe this is a bit of an obvious question, but I want to hear your take. Why do we need to worry about DataOps, and how does it help us do a better job?
Christopher Bergh: We do it by default. We have systems that enable us to put things into production, monitor things in production, and iterate to get value from our customers. So we do it by default. We just don't do it very well. It's as if we've built an assembly line, but it's really crappy and it's making sort of crappy cars.
And so how do we get better at that? The other thing, I think, is that the data and analytics field has grown quite considerably over the years. There are a lot of people and a lot of tools, and from my perspective they're missing the point in a lot of cases. We're obsessed with tools, we're obsessed with tech, but we're not obsessed with delivering customer value.
And I'm a technologist. I've coded my whole life. I like this stuff, but in some ways it's not about the technology. It's about how you actually answer a business question and effect change in the world, and let's work backward from that presupposition. Let's not spend time building architectures that then will do it, or following project management methodologies that then will do it.
Let's just go right to the problem. Let's find the problem and solve it. And the hard part is knowing what problem it is to solve, right? What is the insight that you're gonna generate?
Loris Marini: Yeah, definitely. You touched on so many things there. Technologists run the world of data and analytics, and I'm one of those, and I've been learning, or trying to learn, to unlearn all the things that I
Christopher Bergh: I,
Loris Marini: that I picked up along the way. It's incredible that there are super smart people who lead the efforts of turning data into, you know, insights and informing the business strategy.
But it's almost as if we have the incentive system wrong. We think that success is being able to understand complex technologies in a fast-moving environment, and the quicker we can put these pieces together and build something that works, and works reliably, then we are done.
That is the definition of success. But sometimes you gotta go boring to be reliable. Would you agree with that in general, as a principle?
Christopher Bergh: Yeah. I've made many technical mistakes in my career by showing off my technical prowess, building some really complicated thing when boring and simple would've been better. And I think the other principle is you just don't know what your customer wants. You took three semesters of calculus and physics and computer science, and still you're talking to a business person who majored in drinking beer in college,
and yet you don't know what they want, because their model of the world, their conception of the organization and the business, is naturally richer than yours.
And so it takes a long time to really understand the model of the business in their head. To do that as a technologist, it's not about spending a year building something and then throwing it over the fence. It's much better to build a little, oftentimes build simple, get it into the hands of your customer, and then learn what they really want. Because there is an ego gratification in building complicated multi-box analytic systems with the latest cool tech that people are blogging about. And that's great. However, if you spend six months building something, was it really right? Did you overestimate the value that you're creating? A lot of times what data and analytic teams do is think that the customer needs 10 things.
They'll spend a long period of time on them. The customer will say, "Great, these four I really like. The other six, meh, not important. But you know what, there's two or three more."
Loris Marini: so you're less than 50%
Christopher Bergh: You just wasted all that time on the things that they didn't find valuable. So there's the search, and enabling the search, and the other side of that is enabling your personal learning and your team's learning on what really matters, and trying to translate the business person's rich theory of the business and your sort of theory of data and tech together.
It's a very hard problem, and it's best done in short bits to maximize learning.
Loris Marini: I love that. I love that this is full on, super aligned with the spirit of data. I created a podcast with the idea that we're all going through a discovery process here. We're trying to understand what the best practices are, what the best technology is, but also what we're trying to do overall: what is the mission of the data team, and how can we support our internal customers?
I love that. We talk a lot about data modeling and this versus that, the star schema, the snowflake schema. There's a billion ways you can tackle the problem of mapping concepts into structures in databases. But we don't talk about the business model that already exists, the one that's in the heads of business leaders.
And our job is to uncover that, understand how the business operates, and then map the data systems to the business operations, to try and help the business be effective.
Christopher Bergh: It only matters if you make a difference, right? It doesn't matter if your company's got a 30-page project management methodology and you followed it to a T. That's not success if no one uses it and it doesn't have any effect on the world. Likewise, if you build a cool technology system and no one uses it, it doesn't matter.
And I've done both of those things in my career, and it's not really great to work really hard for a long time and then have it just not be used, sitting there, and you're like, what did I just spend a year of my life doing?
Loris Marini: So bring me back to that day eight years ago when you decided, "Okay, that's it. I need to build a company here, I need to create DataKitchen, and it has to be around operationalizing and creating systems that make life easier for data engineers and data people." Walk me through the process. How did you mature that idea?
Christopher Bergh: I guess I'm a working-class kid from Wisconsin, and it took me time to get enough money in the bank to take the risk of starting a company, right? I had student loans, and when starting a company you often have to spend a year not making any income. I got to a point in my forties where that was financially possible for the first time.
So that's one thing. The second is that the company I had worked for before had been acquired. I was there for a year, and I decided halfway through that I wanted to realize the thing I wanted to do: I wanted to start a company. And then there was what I had learned in that previous company, which basically did outsourced healthcare analytics.
We got data from big companies like Allergan and J&J. We had thousands of users. We integrated lots of data. I had data scientists and data engineers and people doing data viz all working for me. And in that, I realized my life sucked for many years. It was very painful, right?
Loris Marini: Yeah.
Christopher Bergh: There were two things I just never could do at the same time.
One was produce perfect insight, because lo and behold, if you have an extroverted head of sales and you broadcast wrong data to their entire sales force, they will call you up and say mean things to you. Or they won't say mean things to you. They'll be passive-aggressive, and that's even worse.
And on the other hand, you had a basket of things you wanted to do. They always had more ideas. So those were the two things: you wanted to go fast, throw a lot of ideas out, and learn, but also run a very perfect factory. And so the principles that we learned during that time were about how to adopt lean manufacturing techniques and, because I'd spent a long time in the software field, the ideas of agile and DevOps.
Those ideas were all there. But the entrepreneurial intuition was that, because at the time, eight or nine years ago, it was all about big data and data lakes and data prep, eventually people would see the same problem I had seen: you get enough running, and your life is crappy enough, that you've gotta think about the systems that people work in.
And so that's where we could solve this problem with software.
Loris Marini: Yeah. Interesting. And so in the early days, what did you focus on? What was your customer segment, and how did that evolve over the
Christopher Bergh: Yeah,
Loris Marini: last
Christopher Bergh: There were my co-founders, three of us, and actually we had a different idea for about six months. That was way too far ahead, and so we couldn't get anyone to buy it. And then we started to say, maybe this operational analytics thing, and we got a customer. We are all technical, so we all do technical work.
I built the first version of the software. I acted as a data engineer. My co-founders did similar. We sold, we made the website, and eventually we got enough money that we could hire an employee. We hired another employee. So we've always been a profitable company, and we've never put any money in ourselves or taken any outside money.
But we've always had this north star that the real problem, the real pain in data and analytics, isn't in what you do, it's in how you do it. It's how teams work together, how you work with your customers. And we were just always unwilling to work in any other way because of our past experience.
And so we built some software. We started to get customers on the software, and then about three years into the company, we said, "Oh wow, we really need to market this," and there wasn't a term for it. Internally, we were calling it agile analytic operations, which is way too long.
And then we talked about, is it DevOps for data science? And we settled on the term DataOps because it was short. And then we went to our first conference, and I must have talked to 300 people, and no one really got what I was talking about. And I'm a reasonable communicator.
Loris Marini: Yeah.
Christopher Bergh: Finally I had someone I was talking to for five or 10 minutes, and then I got an inkling. I said, "Hey, do you have a software background?" He goes, "Oh yeah, I was a developer for 10 years." And I said, "Oh, we're just applying the ideas of DevOps to data and analytics systems."
And he goes, "Oh, why don't you say that right away?"
Loris Marini: Yeah. Ha.
Christopher Bergh: And so that's where we wrote the manifesto. Actually, I wrote the manifesto on the flight home, because I had to communicate exactly what we were saying, simply. We wrote the Wikipedia article, and then we hired a content writer and all that stuff went in a book. And so we really realized that the ideas come first.
Because we're a closed-source company, we have to make money to pay salaries. Our gift to the community is our books and the ideas. And honestly, this year we have a three-hour video training on DataOps and certification. We've had 3000 people do it.
Loris Marini: I should do it. I didn't even know that existed.
Christopher Bergh: Yeah. If you google DataOps, like half the first page is stuff that we've written. So our search engine optimization's good.
Loris Marini: So fast forward to today, the 16th of November 2022. What's the customer segment now? Who is the ideal customer for DataKitchen?
Christopher Bergh: We tend to sell to large enterprises who can afford it, and we sell a single piece of software for a 50 or a hundred thousand dollar yearly subscription. And they tend to be teams. Sometimes they're very small teams with five or six people who are super aligned to their customer in delivering value.
Other times they're 300- or 500-person data and analytic teams, and they're trying to actually get more aligned with their customer, because they've gotten off on "we have six-month projects and that's how we deliver." And recently, just in the last year, we've really been trying to find out how people approach DataOps, their own steps to get to this more iterative, more customer-focused way of development. A lot of people have a bunch of as-built systems already. And so we built a new product that's really easy to implement, really simple, that gathers data about what's happening with your data in your production data and analytic systems and tells you, before your customer sees it, if something's right or something's wrong, if something's gonna be on time or not.
And so after all these years, we've come to really believe that sort of observability-led
Loris Marini: Observability.
Christopher Bergh: and data people get convinced by data. Over the past years I've had literally hundreds of conversations where people are really into our software, but somewhere something else comes along and they're not going to implement the ideas.
They're like, "Oh, we just had a crisis," or "we gotta go to the cloud," or "we're doing a Snowflake conversion." But I think that happened in software too. The DevOps idea took a long time to become mainstream.
Loris Marini: Yeah. And for those listening who may be new to the idea of data observability, you might wanna start from episode 42. We covered some of that stuff in a conversation titled Avoiding Data Catastrophes with some of our cook. We shared some of the insights and life experiences we both had that led us to really believe that we need systems to alert us when things are wrong,
before that stuff hits the customer. Large enterprises, a piece of software. So, the way that DataKitchen integrates: I'm thinking about what's probably on everybody's mind at the moment, the modern data stack, the idea of just going for services that extract data.
They aggregate it, they allow you to transform it, the ELT stuff, and then serve whatever it is that you're serving downstream. Everybody has that modern data stack roughly in their heads. In terms of the four boxes, how does DataKitchen fit in? Is it an extra layer on top of it? Is it on the side?
Do we need to spend time to integrate it, or is it a turnkey solution?
Christopher Bergh: So think of it in two parts. Typically the modern data stack, I wish it was just four boxes. You look at the pictures and it's 16 boxes of databases and servers and tools. So one part is, if you think about the journey that data takes: it goes from whatever source system in your organization, and maybe it lands in a bucket in the cloud, maybe it goes through Snowflake in multiple zones, and then maybe you have a model that applies to it.
Maybe you have a visualization of it, and then finally it ends up in a catalog, right? And along the way there's some security. So that journey that data takes can be problematic, because that's the factory in some ways. And it's a windy path. If you think of the sort of food factory you're working in, it's probably pretty linear, right?
But data journeys go all over the place, and then they've got sequencing issues, and some are streaming and some are batch. The customer sees the end result. They're gonna see the report, and maybe they're gonna see your data governance tool with the catalog updated, right? And they're gonna tell you if something's wrong, or complain.
And so what I've learned in my career is that somewhere along that journey, things break. You could have very perfect raw data, and something in the factory gets mucked up and
Loris Marini: This is so interesting. So the topology of the path, like how
Christopher Bergh: Yeah. And we call it a data journey. So you could say raw data's the problem. That's definitely a good chunk of the time.
You could say, I've put the raw data together, and that's the problem. Or the data could be perfect, but somebody configured Tableau with a calculated field that's wrong. Or everything's perfect, but the model is somehow off because the training data set needs to be set and you really need to redo the model.
And so every part of that journey needs to be checked and tested and observed. Once you find a problem, you need to home in and say it's here versus there, because along that journey there's this sort of software rule called Conway's Law, which means that the structure of that journey often reflects how teams are organized rather than the natural flow.
Bigger companies have hub-and-spoke models. Smaller companies put them in one team. You've got self-service users, you've got data scientists. And that leads to the natural organizational finger-pointing when it's Friday at 4:00 PM and the VP says something's wrong, and it's "whose fault is it, so I can go home?"
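A minimal sketch of the "it's here versus there" idea from the data journey discussion above, in Python. The stage names, owners, and check functions are hypothetical placeholders, not DataKitchen's product or API. The point is only that each step of the journey carries its own check, so a failure can be pinned to one stage and one owning team.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Stage:
    """One step in a data journey, with the team that owns it."""
    name: str
    owner: str
    check: Callable[[], bool]  # returns True when the stage looks healthy


def first_failing_stage(journey: list[Stage]) -> Optional[Stage]:
    """Walk the journey in order and return the first stage whose check fails."""
    for stage in journey:
        if not stage.check():
            return stage
    return None


# Hypothetical journey: raw landing -> warehouse -> model -> dashboard.
# The lambdas stand in for real checks (row counts, schema tests, and so on).
journey = [
    Stage("raw_landing", "ingestion team", lambda: True),
    Stage("warehouse_integration", "data engineering", lambda: True),
    Stage("model_scoring", "data science", lambda: False),
    Stage("dashboard", "self-service analysts", lambda: True),
]

failed = first_failing_stage(journey)
if failed is None:
    print("all stages passed")
else:
    print(f"first failure: {failed.name} (owner: {failed.owner})")
```

With a per-stage check like this, the Friday 4:00 PM question becomes which stage went red, rather than whose fault it is.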
Loris Marini: Yeah, the data journey. I'm thinking about what I see at work, right? The company I worked for recently built a factory from scratch. They had to expand, and they realized that the only way to expand was to build from the ground up.
It was a seven-year project, lots and lots of money to build.
But the way you approach a project like that is, first of all, you know, the design part. You wanna know exactly what you're building, why you're building it, what is going to be the impact on the business. You wanna source the capital, yes, but also the expertise. And some of these things are complicated.
They're not off the shelf. You need to do custom builds. There's this entire section in our factory that is many floors tall, and it had to be built while they were building the concrete around it. You can't just put the piece of equipment in.
So it can be extremely complicated. Yeah.
And if you trace the journey of the product from the input, the raw ingredients, to the final thing that goes on the pallet, on a truck, it's convoluted. It's not a linear path. It goes up, it goes down, it comes back, there are many steps. Yet the factory runs really well, and we have a whole system of analytics that allows us to track it constantly and see when there is a fault, and operators that know how to fix the problem when the problem arises.
We get that in manufacturing. For some reason, in data teams, we don't think of systems like a factory. Rather, we just hook stuff together. We completely skip the design process. Someone asks us for some insights and we try to scramble together those insights. And so in the kitchen, when you think about a restaurant, my uncle used to be a chef, then ran a kitchen.
So I know the mess that goes into a fast-moving kitchen, with tickets coming in, and now you need to make a carbonara for 20 people, right? That information came in just now, and everybody is expecting to eat in 20 or 30 minutes. So
Christopher Bergh: Yeah, yeah.
Loris Marini: that's what it feels like to be a data engineer these days.
Christopher Bergh: It does. It does. Yeah, it's hard. We did a survey of 700 data engineers with data.world, and 78% of them were so stressed they wanted their job to come with a therapist, and 60% of them wanted to quit. And honestly, I wasn't surprised at all. It's a hard job, right? Because you've gotta build a factory that produces food, or pet food, without any problems. And consistently, right? Your carbonara can't taste vastly different from the last time you had it.
But on the other hand, restaurants will have a new meal every week, and in data and analytics you're judged by your last new meal. You're judged by the amount of new insight you create. Everyone forgets that, okay, here's the reports they're running, they're good, I trust them.
But most data and analytic teams are in the insight generation business, the idea generation
Loris Marini: right.
Christopher Bergh: And that's what I think a lot of people love. We got into data because it's a creative field. But when you have your factory and it's breaking down, and you have these convoluted data journeys with complex relationships, and you don't have that mission control over the whole thing, it makes a lot of stress for people. And then what happens is you end up being a repository for blame, because things go wrong, you've gotta look at it, you've gotta figure it out, and then it takes weeks to fix it.
And so what I've learned, honestly, is to observe the whole system, checking every piece along the way. You check the data, you check the integrated data, you check the models, you check the visualizations, you even check the governance and security. With all those things, you prove to yourself that everything's right.
And then, in the odd case when things do go wrong, you admit it and say, "Yeah, it's wrong. We're gonna put in a test so it never happens again." And over time, our customers have thousands of data checks running, and thousands of model checks and visualization checks, all running against their production process.
Because you want to manage by exception, not by watching things.
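To illustrate managing by exception, here is a small sketch in plain Python. The check names, sample rows, and helper functions are assumptions made up for this illustration; real teams would run checks like these inside whatever testing tool they already use. Every check runs on every pass, but only the failures are surfaced.

```python
from typing import Callable

Row = dict
Check = Callable[[list[Row]], bool]


def not_empty(rows: list[Row]) -> bool:
    """Pass when the batch contains at least one row."""
    return len(rows) > 0


def no_nulls(column: str) -> Check:
    """Build a check that passes when no row has None in the given column."""
    def check(rows: list[Row]) -> bool:
        return all(row.get(column) is not None for row in rows)
    return check


def run_checks(rows: list[Row], checks: dict[str, Check]) -> list[str]:
    """Run every check and return only the names of the ones that failed."""
    return [name for name, check in checks.items() if not check(rows)]


# Hypothetical production batch with one bad value.
rows = [
    {"region": "east", "units": 120},
    {"region": "west", "units": None},
]

# After an incident, a new entry is added here so the same problem is caught next time.
checks = {
    "has_rows": not_empty,
    "region_not_null": no_nulls("region"),
    "units_not_null": no_nulls("units"),
}

failures = run_checks(rows, checks)
print(failures if failures else "all checks passed")
```

Nobody has to watch the passing checks; the list of failures is the only thing that needs attention.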
Loris Marini: Yeah, definitely. That is not scalable. Extremely boring, poor use of money. Interesting. So the speed of change is, I think, what differentiates a physical, real-world factory from an intangible factory that creates intangible products like data, and I never thought about it that way. In a real factory, you're trying to optimize for efficiency: you wanna produce a high-quality product at the lowest possible cost so that you increase margin. That's the whole point. But in a data factory, you're trying to do that too, obviously, 'cause you gotta manage your cloud bills and you wanna be effective and efficient. But nobody's happy with you for just looking at the data product that you created a month ago and having it reliably show up in their inbox. That is taken for granted. It's your next thing, it's your next insight, it's your next
Christopher Bergh: And then that becomes... So the first problem is more of an observation problem. The second problem, how do you actually change things quickly, is more of an automation problem. How do you automate the ability to make a change and judge its impact? And that has elements of testing and automation and managing environments.
Orchestration, a lot of pieces are involved in that. And likewise, there's a lot of diversity in how different teams operationalize things; their path to production can vary. You'll have a centralized IT team with a formal process. You'll have self-service teams just pressing a button on their desktop.
But that ability to iterate quickly and change things quickly, pick up your factory, take a piece, change it, has a couple of aspects that you really want. One is that when you pick up that factory virtually and change it, you need to judge the impact of your change on everything downstream, the steps beyond where you've made the change.
As an example, I change the name of a table
Loris Marini: Mm-hmm.
Christopher Bergh: Innocuous, right? Did that break the report? There's some other SQL that's building some kind of aggregation table; did that break? What about the governance? How does that all work? And it's really easy to do for a data engineer, just ALTER TABLE, blah, blah, blah.
It's one line of SQL. But it has this ripple. I have a 24-year-old son now who joined Amazon as a software developer, just like his dad. I didn't join Amazon, but I started as a software developer. Very proud of him. But his room was covered in Legos when he was 10. It was a mess.
And then when he was 16, I was scared to go in his room 'cause it smelled. And he got his job, and within two weeks he had put code into production in some Amazon backend financial thing. And like, how did Amazon trust my son? I don't trust him. Not that I don't love him, but I don't trust him.
Loris Marini: Yeah.
Christopher Bergh: And so what they did is they built a system around my son. He made a little change and the system said red light or green light: it'll work, or it won't work. And you don't have to understand the entirety of the data journey. You don't have to understand the downstream impact of one change on a bigger system.
You just have to look at the red light, green light, and then everything else. The transportation, the movement, the version control is all taken care of. And we need that in data and analytics. We need the red light, green light.
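The table-rename example can be sketched as a walk over a dependency graph, again in Python. The graph, asset names, and test functions below are hypothetical, and this is only one simple way to approximate a red light, green light signal, not how Amazon or DataKitchen implement it.

```python
from collections import deque
from typing import Callable

# Hypothetical dependency graph: asset -> assets that read directly from it.
DOWNSTREAM = {
    "orders_table": ["daily_sales_agg", "governance_catalog"],
    "daily_sales_agg": ["sales_report"],
    "sales_report": [],
    "governance_catalog": [],
}


def downstream_of(asset: str, graph: dict[str, list[str]]) -> list[str]:
    """Breadth-first walk returning every asset affected by a change to `asset`."""
    seen: list[str] = []
    queue = deque(graph.get(asset, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.append(node)
        queue.extend(graph.get(node, []))
    return seen


def light(changed: str,
          graph: dict[str, list[str]],
          tests: dict[str, Callable[[], bool]]) -> tuple[str, list[str]]:
    """Green only if the test for every downstream asset passes; otherwise red."""
    broken = [a for a in downstream_of(changed, graph) if not tests[a]()]
    return ("green" if not broken else "red", broken)


# Stand-in tests: pretend the rename broke the report's query.
tests = {
    "daily_sales_agg": lambda: True,
    "sales_report": lambda: False,
    "governance_catalog": lambda: True,
}

status, broken = light("orders_table", DOWNSTREAM, tests)
print(status, broken)
```

The person making the change only reads the status; the graph walk and the tests carry the knowledge of what sits downstream.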
Loris Marini: The peace of mind that comes with a system like that. I'm so jealous, to be honest.
Christopher Bergh: It sucks. When you don't have it, it sucks so badly.
It's like your life is hell. 'Cause then you're like, I spent years and years doing this. You hope when you go into work that something is not gonna break. You're hoping for individual heroism on your team. And honestly, as a leader, I made the mistakes. I fired people because I thought they were incompetent.
And if you actually read Deming, the guy who's all about the factory, the Toyota Production System, lean, he says that when there are problems, 94% of the time it's the system that people work in, not the person. It's a process cause, not a special cause. And when I fired that person, I was thinking, oh, they're incompetent.
But really I was the incompetent one. I had not built a proper work system for that person to make a change and see the impact
Loris Marini: Safely and quickly. Yeah. So there are many things here around change. I'm writing in my notes: impact of change, speed of change, and cost of change. We talked about the speed of change, and it's a must. Otherwise we're all without jobs very quickly. We need to innovate, right? That's the whole point of data
teams. Then we have the cost of change. That's the difference with a conventional manufacturing plant. Putting in a new system, a new vessel, a new cyclone, those are incredibly bulky pieces of equipment. You don't just push a button and create a new system or a new pipe. You have your reliability teams, your engineering
Christopher Bergh: Oh yeah. I mean, it's a big
Loris Marini: It's a whole thing that has to be planned and managed. It takes days, if not weeks, depending on how big it is.
Sometimes even months. In data, you literally just hop on a text prompt and change one line of SQL, and you've made a change. You potentially cloned or created a new view. It takes that little. So it's extremely cheap to make a change, but tracking the impact of that change downstream and upstream is incredibly hard if you don't have systems.
So it almost feels like there's a conservation law of complexity at play here. You get it cheap, you can change it fast, but you gotta have the systems, otherwise it turns into a mess really quickly.
Christopher Bergh: Yeah. That's the key word here. One way to view the insight of DataOps is that we're building complex systems, right? And the complexity isn't in the data; the complexity is in the systems that are acting upon data. Data people tend to think about data, and all that stuff acting upon data, servers, tools, code,
man, it's not
Loris Marini: Ask your DevOps engineer
Christopher Bergh: Yeah. But I have a very inverse view. I don't think data's that important. I think if you build good systems that can observe the data journey through production and tell you if something's wrong before your customer sees it, and if you can change things, judge the impact, and not get any regressions, those two things, then you start to be able to make really fast changes, really small changes.
And your customers are happier. You can go to work and not dread it. For years, I had the morning dread going into the office. When I first did data analytics, like 2005, I used to sit in my car and look at my BlackBerry and just take a deep breath, going, "Oh, it's gonna go wrong."
And I just hate that. I just hate it.
Loris Marini: Definitely. We are very stressed folks. We have another maybe 20, 25 minutes together. I wanted to explore a couple of things. First, what can we do, or what are some of the suggestions that you'd give to that person, the person who is looking at their smartphone and dreading going to the office, or opening Slack?
Christopher Bergh: Well, one of the things I did was convene a quality circle, where I got a small part of my team together and I took the shame out of problems. I said, let's just write all the problems we've had on a list, and let's pick one every month to fix, and to fix in a way that it would never happen again.
And so that ended up meaning that we started to observe and test production all the time. So we had a huge number of data tests, and we added more, of different types, in different locations. And this was back about 15 years ago. And then the other part is we were always trying to do things faster.
So first it was about observation and testing. Then it was about automation. Could we help automate deployment? Could we help automate the pieces? Could we give the developer, the data engineer, the data scientist, the ability to make a change and judge exactly what happened from his or her desktop?
And those things can be started quickly. Like you can add data tests in your data however you want. Like
Loris Marini: Pretty much. Yeah. With whatever technology you're using now, there's definitely, there's this test suite of
Christopher Bergh: There's a test suite, and testing's not hard. Just try to get your system in production to prove, to tell you, that it's working. And every time it's not working, fall forward and say, "Oh, we had a problem. Great." That's an opportu... I've got this tired phrase: opportunities to improve instead of opportunities to shame.
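As a rough illustration of how small a first set of data tests can be, here is a sketch in plain Python. The `orders` rows and the three checks are made up for the example; they are not DataKitchen's tests or any particular library's API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class DataTest:
    """A named check that takes all rows of a table and returns pass/fail."""
    name: str
    check: Callable[[list], bool]


def run_data_tests(rows, tests):
    """Run every test against the rows and return the names of the failures."""
    return [t.name for t in tests if not t.check(rows)]


# Hypothetical production rows, including two deliberate problems.
orders = [
    {"order_id": 1, "amount": 120.0},
    {"order_id": 2, "amount": 35.5},
    {"order_id": 2, "amount": -4.0},  # duplicate id and a negative amount
]

tests = [
    DataTest("table is not empty", lambda rows: len(rows) > 0),
    DataTest(
        "order_id is unique",
        lambda rows: len({r["order_id"] for r in rows}) == len(rows),
    ),
    DataTest(
        "amount is non-negative",
        lambda rows: all(r["amount"] >= 0 for r in rows),
    ),
]

failures = run_data_tests(orders, tests)
print(failures)
```

Each problem picked from the quality-circle list can become one more entry in `tests`, which is one way to fix it "in a way that it would never happen again".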
And every problem's an opportunity to improve. I had an organization back 15 years ago that was so beat up. I took a job, and within three months I had a data... he's a really smart guy, went to an Ivy League school, and he cried in my office because he was so upset. I had to remove the shame in the organization in order to make it successful.
Loris Marini: Yeah. Yeah. It's a really good point. Opportunities to improve is gonna definitely be a
Christopher Bergh: Like I, I love your errors, opportunities to improve.
Loris Marini: Yeah. Because mistakes and failures are always painful. If they're not painful, it means nobody's using the systems and they don't matter. If they matter, they're gonna cause some pain. So you want that pain to come in as early as possible so you do something
Christopher Bergh: yeah. It's, it gets you learning. The only problem is making the same mistake three times in a row. If you make a new mistake, that's great. That means you're trying
Loris Marini: That's madness. Chris, one thing that happens a lot, and I'm sure you've felt that as well, is when you are doing the stuff firsthand: you're doing your modeling, you're doing your piping, you're doing your infrastructure. And this is true in software as well; it's been true for 40 years.
Testing has always been seen as, yeah, we need to do it, but it's a nice-to-have, from the perspective of the project manager or the inexperienced software team manager. Those that have experience know how quickly tech debt builds up and how quickly software becomes unmaintainable.
They know the importance of testing and they will try to enforce it. But we can't assume that in data, especially given that, one, we are 10 years behind, and two, there seem to be so many stakeholders involved, many with backgrounds that don't involve any type of software or coding. I think it's safe to assume that they don't really have an appreciation for what tech debt means, how it feels, and how quickly it can build up.
Christopher Bergh: Yeah.
Loris Marini: So how do we get these people to understand that testing is not about building perfect systems? It's quite the opposite. We wanna be scrappy, we wanna move quickly, we want to... But for us to do that, we need to have some sort of understanding of what we are dealing with, what the current state of the system is. The system is complicated, and if we don't automate it, we're never gonna get that view. In your view, do we have to bypass the project managers and go directly to the chiefs that talk to the business and make them understand: hey, testing is not something I can expect my people to do on Saturday and Sunday because they literally fear Monday morning. Testing should be baked in as part of the team's work, and we need to allocate resources to do it, otherwise we're not gonna move forward.
Christopher Bergh: Yeah. I think that's a really good point. And I argue that everyone who's doing data work has, as one of their goals, to work themselves out of a job. You don't wanna work yourself into being the only person who knows this pipeline or this model, because then you can't take vacation.
And I've had this idea that heroism breeds quitting, and quitting breeds chaos. So from a management standpoint, it's really bad to have a culture of heroes, because they're gonna walk away on you, and then you have to try to get another hero to take up a big hairball of complexity.
So that's one thing. And then second is testing itself, and tech debt, and all the other things that go into building good complex systems, like libraries and modules. Those are discussions that you should have with your business customer, saying, "Look, we have some tech debt, we have some governance debt.
In this week's sprint, we'd like to cover it." And then they'll say, "No, no, I gotta get this out. It's really important. End of quarter." And then three weeks later they'll say, "Yeah, okay, you guys have brought this up a bunch of times." There is a relationship between you and your business customers, and they do understand after a while that what you're trying to do is reduce the chance of failure in the insight you're giving them, and that you're trying to maximize your own team's productivity.
And they can understand that from a business perspective. Paying down tech debt and being able to refactor things: those are demonstrable business benefits that you can explain to them, in addition to just cramming. But unfortunately, organizations get in a corner where they just wanna stuff the data channel with their requirements, because they feel like they're never gonna get it again.
So they ask for everything. And so, as a leader, it helps to develop a relationship with your counterparts. And that can also be helped if you're delivering something not every three months, but every three days or every week.
Loris Marini: That's right. Yeah. Because then you have way more data points and way more stories that you can tell. And thinking about stories, one of the things that data people hate, well, not hate, but are less strong on, is communication. We are way stronger in software and coding and machines and
Christopher Bergh: Yeah. We all are. That's why we got into the field. My fifth-grade aptitude test said: work with things, and maybe people.
Loris Marini: When you say work yourself out of the job, I remember working with this engineer, and he was one of those stereotypical software developers that you imagine, like in a comic book: hood on, big headphones, 5 billion screens in front of him, mechanical keyboard. And he would just lose the sense of time for hours and hours when he was doing stuff.
And I'm talking about the low-level stuff that I don't enjoy doing. Open a shell and write a bash script and a bunch of Collectional and create an automated system to extract data, all in Bash. And I'm like, "What are you doing?" Anyway, next level, right? Way beyond my abilities. But the point is, at meeting time you could see his face. He did not enjoy a single minute of that.
And there's a lot of people like that
Christopher Bergh: And yeah, I'm like that. My job as a CEO is to communicate, right? So I gotta go off and do podcasts and talk to people, but there are plenty of days I wish I had eight hours where I could just mess around with code. And getting on the plane of abstraction is so much fun.
And staying there and really getting your head into the problem, you can be very productive. I think the challenge as a technical person is that sometimes we want to get lost in our abstraction and spend several months on it and not get any impact from the real world. So we've gotta force those people on our teams who would rather have a hoodie and a mechanical keyboard to get some feedback from their customer quickly.
And so whether it's a product owner or the customer directly, don't let them go off for three months to build the greatest thing. Have 'em ship small bits of value every week because, as we talked about in the beginning, the problem with us and our view of the world as technical people is that we spend a lot of time on how things work and not why
Loris Marini: Yeah.
Christopher Bergh: at all.
And the business people are all about why do it at all. Why do this versus that? They're all trying to optimize the why. And so these how people and why people, we're like apples and oranges. We just don't mix very well. Yeah.
Loris Marini: Yeah. And for the listeners of this show, I'm sure there's a bunch of recent episodes where we talked about the process of understanding what people want, and especially the connection between data and meaning. Some of the recent episodes are with Ashley Faith.
We talked about what a data therapist is and why you need one. That was episode 45. With Chad Sanderson, we took more of a data-warehouse-focused approach to semantics, and that was 46. With Jesse Alman, we talked about taxonomies, ontologies and true data mesh in episode 40. The reason why I'm mentioning these is that I have a question for you, Chris.
So with all of that experience building systems to make change quick, fast, and safe, do you see any of those principles being applicable to the world of concepts, meaning and knowledge management?
Christopher Bergh: Yeah, I guess I think of it this way: when you build a data system, you're building an ontology of the world, right? And the ontology can be reflected in the schema or the data definitions, the catalog. And so I think those are deployable chunks of code. So let's just take a really simple example.
You've got a database full of tables, and you're gonna add a new table in, right? You're gonna join it to another table. You're gonna have it in a report, right? And then your data catalog's gotta have that new table and that new join and the new attribute of the dimension that you've put in. So you want to have all those things: the table definition, the SQL that's doing the transformation, the visualization.
You wanna deploy all that stuff at once, and before you deploy it, you wanna put a whole bunch of really good test data in it and run a whole bunch of data tests against it to make sure it's right. However, the last mile is the knowledge, the metadata. So what is the table?
What are the columns? Where did it come from? And I guess I see governance as code as a really important part, whether it's code or configuration. I just think you should deploy the metadata: the meaning of that table, the meaning of the join, and the new attribute you added to the dimension. Those things should be deployable along with your other stuff, or else it lags, and there's too much lag between the metadata and the data.
And if you see it that way, you're trying to maximize your ability to deploy everything, and that includes catalogs, the semantic part of data, as well as even things like security credentials. You just make a deployable unit of it all.
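One way to picture "a deployable unit of it all" is a single object that carries the table definition, the SQL, and the metadata together, and that refuses to deploy when the metadata is incomplete. This is a hedged sketch only: `DeployableUnit`, `deploy`, and the table names are hypothetical, and a real deployment would do far more than return a string.

```python
from dataclasses import dataclass, field


@dataclass
class DeployableUnit:
    """Table definition, transformation SQL and catalog metadata, shipped together."""
    table: str
    columns: list
    transform_sql: str
    column_docs: dict = field(default_factory=dict)  # column name -> meaning
    source: str = ""  # where the data came from

    def missing_metadata(self):
        """List the columns (and the source) that still lack a description."""
        missing = [c for c in self.columns if not self.column_docs.get(c)]
        if not self.source:
            missing.append("<source>")
        return missing


def deploy(unit):
    """Pretend deployment: block the release if any metadata is missing."""
    gaps = unit.missing_metadata()
    if gaps:
        raise ValueError(f"refusing to deploy {unit.table}: undocumented {gaps}")
    return f"deployed {unit.table} with {len(unit.columns)} documented columns"


unit = DeployableUnit(
    table="mart.revenue",
    columns=["region", "revenue"],
    transform_sql=(
        "SELECT region, SUM(amount) AS revenue "
        "FROM staging.orders GROUP BY region"
    ),
    column_docs={"region": "Sales region of the order"},
    source="staging.orders",
)

print(unit.missing_metadata())  # the new attribute has no description yet
unit.column_docs["revenue"] = "Sum of order amounts per region"
print(deploy(unit))
```

Because the check runs at deploy time, the metadata cannot lag behind the data: either both go out or neither does.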
Loris Marini: Exactly. Yeah, I totally agree. So the combination of data and metadata should fall within the same sort of systems. The system should be able to handle both, because what happens if you don't have metadata is that you are able to push data very quickly through the system, but what does that data mean?
From a user perspective, do you know where it comes from? Do you know
Christopher Bergh: Do you know where it comes from? And sometimes it's okay to cheat too. Sometimes it's okay to say, I've got a thousand customers on my data system and I've got one customer over here asking these really crazy questions. So I'm gonna give him a version of production for him or her only.
And I'm not gonna have all the tests in it. I'm gonna do the data transformation, I'm gonna do the visualization. I'm gonna get it to them tomorrow afternoon, but it's their version and they can play with it. And I think that's fine, because that maximizes your learning and their learning.
And sometimes they'll go, "Ah, that's not what I wanted." Or they'll say, "I want this, and three other things." And occasionally they'll say, "This is perfect. I want this to go out to the other thousand people." And then you do more work. And so the value of spikes and trials and management of variations of systems, I think, is part of what we should do in automation.
We should be able to have A/B tests. The cloud especially enables that: having lots of variations of what's in production, either in production or for different groups of people or even developers. So I think the management of environments and variations of environments is a really important task in data and analytic systems.
Because it helps you learn.
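A small sketch of what managing variations of an environment might look like in code. It assumes an environment can be described as a plain dictionary; the keys, the `make_variant` helper, and the variant name are all invented for illustration.

```python
# Hypothetical description of the main production environment.
PRODUCTION = {
    "name": "production",
    "audience": "all customers",
    "run_tests": True,
    "transforms": ["clean", "join", "aggregate"],
}


def make_variant(base, name, **overrides):
    """Derive a new environment from `base` without modifying the base itself."""
    variant = {**base, "name": name, **overrides}
    variant["derived_from"] = base["name"]
    # Copy the list so the variant can diverge without touching production.
    variant["transforms"] = list(variant["transforms"])
    return variant


# A quick spike for one customer: same pipeline, fewer guarantees.
spike = make_variant(
    PRODUCTION,
    "spike-one-customer",
    audience="one customer",
    run_tests=False,
)
spike["transforms"].append("experimental_metric")

print(spike["derived_from"], spike["transforms"])
print(PRODUCTION["transforms"])  # unchanged by the spike
```

Recording `derived_from` keeps the "cheat" visible: if the customer says "this is perfect", you know which variant has to be brought up to production standards, tests included.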
Loris Marini: And that, in an industrial setting, is what the R&D team does, right? They come up with a new recipe, with a variation on a product, and they wanna run it small scale. They wanna be free to just mess around with it, right? They don't necessarily wanna run it through the factory yet.
They wanna be lean, they wanna be fast, they wanna learn. But once they land on a recipe that might work, then they need the support of whoever runs the operations in the factory to trial the thing at scale. And when you do things at scale, things are always different. That's true, I think, for physical products, but it's true for intangible
Christopher Bergh: But I think there's a relationship too, in that one of the challenges in software is that the management of variations ends up being a lot in the user interface or user experience, right? I'm gonna tweak the UI and new things come up. You have that in data, maybe with the report, but you're also varying the data itself.
Like, I put these two data sets together. Do they predict? Oh, this is how I show the prediction. Oh, that didn't actually work. I need to put another data set together. Now I get the prediction. So you've got these two cycles. And developing solutions that may not be perfect, that's where the lack of perfection is important for people to embrace.
And it's important to say to customers: this is not really perfect, the data may not be right, but I'm doing it to get your feedback and then learn from that.
Loris Marini: Yeah. What I find fascinating, Chris, is that the learning process almost extends over a much wider range in data and knowledge, let's call them intangible assets, compared to physical ones. Because if you go back to the factory, say, one producing candies, they have maybe a brand name, and they've been on the market for 30 years.
When you buy that brand of candies, you know what you get. The value is there and there's really no ambiguity around it. You just look at the candy. There's some marketing, there's some artwork, there are some claims, right? It's relatively slow moving. Maybe they make a change, a new flavor, but how many times does it happen?
And if you ask five different people in the room, they might have different preferences and opinions around that candy, but they all have the same system to appreciate it, roughly: a level of acidity, sweetness. These are things we can agree on; we can have a conversation.
That's why we can become experts in wine tasting, because the hardware that we are provided with to appreciate physical stuff, when it comes to food, is pretty much the same. But in data it's so different, because the interpretation of a piece of information never just comes from the information itself, from the data set, the table, the number. It's that, plus our own personal experience, plus prior beliefs, what we believe is true and what we learn in the business.
So there are so many other layers that are invisible. And so unless you go through that process of hitting the wall and asking: Is this working? Are we on the same page? Is this even making
Christopher Bergh: Yeah.
Loris Marini: being able to adapt it, You're never gonna get
Christopher Bergh: You're trying to affect a person who's not a data person. And so sometimes that means it's a visualization problem. Sometimes that means it's a story problem. Sometimes that means giving 'em the right data set so they can pivot it around in Excel, because they're a closet data person.
Other times they're just gonna ask you a question and you gotta answer it, and it's a trust problem. And other times, no matter how well you do, they're just not gonna use it. They're going to cherry-pick the data to confirm their already existing intuitions.
Loris Marini: Yeah, exactly. There are all these cognitive biases that we have to deal
Christopher Bergh: Yeah. The cognitive biases. And I think what's really been good is that we're getting a lot better as a world at looking at data and trying to understand where data fits and where it doesn't, where it's predictive and where it's not. But it's still hard for a lot of people. A lot of people aren't good statistical thinkers.
And so, how do we convince? Because that's the other big problem in data and analytics: when the data's perfect, the result's perfect, the operations are perfect, how do you actually convince people to do something with the data? That's a huge problem. It's a literacy problem, a democracy problem, a user interface problem. It's really quite complicated.
Loris Marini: I think it's a beautiful mess. If we go 30,000 miles up and try to look at the whole picture, what's happening? I'm trying to summarize in less than 60 seconds. We've got agents that make decisions and act in environments, namely human beings that try to make businesses, which means providing value in exchange for cash.
We have, as part of those organizations, other agents, other human beings, that are very technical. All of them make decisions 90% of the time based on their emotions, not the facts. And yet we're trying to build systems to support decision making based on facts, which is already in itself a huge challenge.
The engineers that produce these insights, that manipulate data, often don't get to spend time building systems for them to work more effectively. Sometimes they don't have funding. Sometimes there is simply no time. So they end up living these very miserable lives, because what they do doesn't work and impacts production.
People get upset, and they just hope that a therapist would materialize in front of them as quickly as possible. What else did we touch on? We touched on the intrinsic difficulties of understanding the intangible, an intangible factory, as opposed to how easy it is to get a real factory.
We talked about metadata management and meaning, and being able to manage that as part of the data itself, as part of one single package. And all the cognitive biases that go with the
Christopher Bergh: Well, We've touched everything today. Wow. It's like there's nothing left of interest.
Loris Marini: It is a beautiful mess. I'm loving my job.
Christopher Bergh: It's a very human mess, right? Because we are trying to help people, and in some ways we're monks. We're a cult of data, and we're trying to reach people who don't believe in the cult.
We're trying to go out and convince people that this is right. And I think that's okay. We all tend to believe that data-driven insights are right. We all know in the back of our minds that they're not always right. There are biases, and sometimes intuition is better. But in the battle between data and intuition, everyone who's listened to this podcast would say data's gotta win.
But in reality, as you say, intuition, social proof, and analogical arguments all play a role in how people actually end up making decisions. And that beautiful mess is interesting, right?
Loris Marini: Yeah. Yeah. It's a reason why I'm here. I'm still talking about this stuff, and I think I'm gonna be here for a long time, because definitely
Christopher Bergh: And it's.
Loris Marini: Yeah. Yeah. The more I unpack, I peel off the layers of the onion and the more layers I find, I'm like, Wow, this is incredibly
Christopher Bergh: But to me, that's what's good, why it's a good career. Because you don't wanna have a career where you peel off two layers and you hit something solid and, okay,
Loris Marini: done.
Christopher Bergh: with data, you can cut across industries, you can cut across psychologies. And you can cut across technologies.
And a lot of that's really to me as a learning person, I find that mix very interesting.
Loris Marini: Amazing. Chris, just for the listeners that stuck with us for 50 minutes plus, there's one question I wanted to ask you that we didn't have time to cover, but maybe we can go briefly through it. I really wanna know your take on this. What would you say are the pillars, the core concepts, that you baked into DataKitchen when you developed the first version, and then all the way to all the versions
Christopher Bergh: The first pillar was: run your production data and analytics systems like a factory. So that's the pillar. And so the first step is to observe and test that factory without changing it. Okay? Then the second step is, once your errors go down, you're not getting yelled at, and your morning dread goes down, start trying to maximize your ability to change any piece, that is, its cycle time.
And so it becomes: if you can minimize your error rate and shorten your cycle time to get things into production with low risk, it turns out both those things drive huge increases in your team's productivity, like 5 or 10x, literally.
And for all the dumb reasons: you're not wasting time in meetings. You are maximizing the amount of work you don't have to do. You're learning what your customer really wants. So error rates in production down, speed of deployment into production up, with no risk: that drives big productivity enhancements.
And we do that through software.
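The two numbers in this answer, error rate and cycle time, are simple to compute once each change is logged. The sketch below assumes a hypothetical log of (committed, live in production, caused an error) records; the dates are invented and this is not how DataKitchen's software measures them.

```python
from datetime import datetime
from statistics import median

# Hypothetical log: (change committed, change live in production, caused an error?)
DEPLOYMENTS = [
    (datetime(2023, 1, 2, 9), datetime(2023, 1, 2, 13), False),   # 4 hours
    (datetime(2023, 1, 3, 9), datetime(2023, 1, 4, 9), True),     # 24 hours
    (datetime(2023, 1, 5, 10), datetime(2023, 1, 5, 12), False),  # 2 hours
    (datetime(2023, 1, 6, 8), datetime(2023, 1, 6, 14), False),   # 6 hours
]


def error_rate(deployments):
    """Fraction of deployments that caused a production error."""
    failures = sum(1 for _, _, failed in deployments if failed)
    return failures / len(deployments)


def median_cycle_time_hours(deployments):
    """Median time from commit to production, in hours."""
    return median(
        (live - committed).total_seconds() / 3600
        for committed, live, _ in deployments
    )


print(error_rate(DEPLOYMENTS))               # 0.25
print(median_cycle_time_hours(DEPLOYMENTS))  # 5.0
```

Tracking both together matters: pushing cycle time down while the error rate climbs is not the improvement being described here.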
Loris Marini: There you go. Running like a factory and minimize your cycle time. And
Christopher Bergh: And you're gonna do a lot more fun stuff, basically. And your life's gonna suck less.
Loris Marini: Yeah. And maybe even, be recognized for the work we do. Which is
Christopher Bergh: Yeah. That's a
Loris Marini: a lot of hard work. Maybe in 10 years. In 10 years from now, we're gonna get there and get a therapist. Get a therapist. Cool, Chris. So if I fit the brief, I'm in an enterprise, I have a small, medium, or large team, seems to team size is now really a big leading the kiddo.
Who's gonna be the perfect customer for you? How do I get started? What's my next step?
Christopher Bergh: Just go to datakitchen.io. Fill out a form. There are two paths. One is if you're interested in the concept of running like factories, and agile and DevOps and lean and DataOps. We have a whole bunch of resources. So we've got two books that you can download for free on our website.
Lots of pictures, not too boring. We have a three hour certification program that 3000 people have done this year online. And that's a good start on the concept. And then if you wanna start actually implementing the concept with our software, just fill out a form and we'll follow up and we can get you going really quickly.
Loris Marini: Right. To get a demo
Christopher Bergh: You get a demo, you gotta start using it.
Loris Marini: Yeah. The three-hour certification program I'm super interested in. I definitely wanna dive into that. Is it on demand
Christopher Bergh: Yeah, it's just on demand. There's some questions and you get a certificate that you can put on your LinkedIn profile. Very important.
Loris Marini: That certification program goes through the fundamentals of
Christopher Bergh: Yeah. It's the fundamentals of the ideas of data ops.
Loris Marini: So you expect to be able to talk about ops to different stakeholders and at least
Christopher Bergh: Yeah. And a lot of it is not what our software does. A lot of it is the why, the benefits, the how. It's a lot of the stuff that we talked about. Unfortunately, while there are a lot of good academic programs now in data and analytics, master's degrees, bachelor's degrees, there's still not a... If you get a degree in computer science, you have a software engineering class where they touch on these ideas.
There isn't an "engineering the data and analytics system" class where they touch on these ideas. So,
Loris Marini: Yeah, definitely. That's a gap for sure. Fantastic. Chris, I'm gonna add those links to the show notes. Chris, this has been such an interesting conversation. From the world of factories, we talked about keyboards, food and data, and introspection and psychology and teams and politics. I'm pretty impressed.
Awesome, Chris. Is the best way to get in touch with you LinkedIn
Christopher Bergh: Yeah, like the email, LinkedIn, whatever. You've got my email address, so
Loris Marini: Okay. Yeah. Not for our listeners. Okay, perfect. LinkedIn. And yeah, I'll talk to you soon. But thanks again for being on the show.
Christopher Bergh: Oh yeah. Yeah. I appreciate the opportunity.