In this short solo episode Loris speaks about the lessons learned so far skating at the boundary between business and technical leaders.
We publish conversations with industry leaders to help data practitioners maximise the impact of their work.
[00:00:00] I can't believe that I'm sitting here, and this is episode eleven!
This just blows my mind. I started The Data Project... what was it? It was September; I had the idea in June last year. It took me a couple of weeks to record the first five minutes and then two months to listen back to my own voice. And for the first episode, there was no plan, we didn't do any brainstorming, we literally just turned on the mics, because you've got to start from somewhere, right? Otherwise, you'll never start.
And I remember I was wearing a blanket, I was actually surrounded, wrapped like a roll inside a blanket in the bathroom of our one-bedroom unit. And the reason was very simple, the bathroom was the only room with a door that wasn't the bedroom. Terrible place to record a podcast. I can tell you what the echo in that room was, hence the blankets to try to control that. It was a crazy time, my daughter was six months old, I was sleeping next to nothing, and I was trying to launch a podcast. And now if I look back, wow, I mean, today I'm sitting in an office!
[00:01:30] We have a much bigger place, my daughter is one and a half now, she sleeps through the night, so a big shout out to her. I know she doesn't understand yet, but hopefully, in 10 years from now, she'll re-listen to this episode, and be like "thank you dad for the shout out", so yeah, you're welcome, but it is true, you are sleeping a lot and that makes my life a lot easier! So there you go.
So why am I here today? Why are you listening to this? I want to take a little bit of a step back, look at what The Data Project achieved by starting from the beginning.
So what was the intention and how did that change if at all? The motivation for this podcast was to structure the conversations that I used to have during coffee breaks or over a beer on a Friday night at work with colleagues from different departments. There were folks from marketing, from sales, from customer support, people that typically don't hang out with engineers or with scientists but they should, a lot more and more often.
[00:03:00] And what I learned in those moments is that the perception that they had of data and what it means for the business and what it means for their own productivity in the day-to-day was very different from the one I had as an engineer. I was more concerned with quality, availability, and efficiency.
And then when I wore the other hat, the data science hat, I was more concerned with finding answers, you know, and testing my hypotheses and making sure that what I was telling the CEO was the right thing, backed up by the evidence available in that moment. So a very, very different mindset.
People in customer support were more focused on "Oh, I don't need insights, okay? Tell me the number I need, four or five basic columns because when I'm on a call or when I'm on a chat with a customer I need to know what the context is around that ticket or request". So it was very, very basic. It's not even BI, it was just making data available, making it easily accessible.
And that was very insightful to me. Because I was like, okay, there is low-hanging fruit that we could easily grab if we had a strategy, a data management strategy, in place. That was the first hint that data management was actually something really, really important.
[00:04:30] I didn't know the term yet. I had an intuition for it, but then I discovered the field of data management. And it's been going on for so many years, it's probably older than me. And yeah, there's the Data Management Association, or DAMA, with chapters all over the world.
And they do talks on the topic, they publish books. And so all this knowledge that I was exposed to in the last six months would not have been possible if I didn't start the podcast. Ultimately, knowing that there's an episode coming is what stimulates me to read books and reach out to people, and all that. So that's really, really exciting and I'm just loving it.
So I want to keep doing that and I've been working to find the most sustainable and efficient way of doing it. Because it is a lot of work, way more work than you can possibly imagine. If you're not a podcaster, you don't know, right? When you start doing it, you're like, whoa, okay, this is huge. So little by little, step by step, I will get there. And my hope is to continue with the current release cycle. Every three weeks is neither too fast nor too slow, and it's something I can manage, so I'm hoping to keep doing it for a very long time.
Back to data management, why am I so passionate about this topic? It's because I think that data management is an obvious thing, but somehow it's not. [00:06:00] And what I mean by that is that if you think about it, information is an asset. Information comes from data and therefore data is an asset, but data by itself is not really useful. You've got to manipulate it, you've got to put it in a format, in a structure, that makes it easy to consume. And you've got to link it back to the real business case and, to the real world, and to people ultimately.
And so caring about data is really caring about people. It's about caring about the relationships that make up a business. There is no business without relationships. And you could run a successful business without any data management if it's just you, on a small scale. But if you're talking about a large business and you operate across multiple regions, and you have lots and lots of customers and lots and lots of people, then there are asymmetries between what people in the organization know and what the systems know, and there is fragmentation everywhere.
And so establishing something super simple, like how many units of that product did we sell to this customer, becomes something like a week-long project. And then at the end, nobody knows whether you can actually trust those figures or not.
[00:07:30] This is incredibly frustrating for anyone in the leadership team, but imagine being a CEO asking a very simple question like that, and struggling to get an answer. It's not fun at all. This is from the CEO perspective, but from the data team perspective, we lose a ton of time to massage, manipulate, clean, all the data preparation necessary to do even the basic analysis, like an aggregation or a distinct count...
You start wondering, does it even make sense? It doesn't make sense to keep doing this preparation work over and over every single time. Shouldn't we think about incorporating this in a Data Management program that we run and fund within the organization, so anyone that wants access knows where the data is, they know that they can trust it because there are people paid to make sure that the data is trusted?!
And yeah, it makes sense. And if you dive into Data Management and start peeling back the layers, you'll realize that there's Engineering, Architecture, Design and there's Governance.
And data governance, in particular, aligns really well with The Data Project because it is about behaviour and accountability and the people-side of the management of this intangible asset. So you'll hear me talk about this a lot more in the future I think.
[00:09:00] Another aspect that I find especially interesting is the economics of information and how organizations understand the monetary value of data as their fourth intangible asset. My interest in this area started a couple of months ago and I was really stimulated by the book by Douglas Laney.
I'm sure you heard me mention his name a couple of times on this podcast. Doug has been working for Gartner for almost 20 years and wrote this wonderful book, which I am still in the process of reading and absorbing. I think it's really, really valuable for anyone working in data, whether you're a business leader or a data scientist. He takes this focused approach on the economics of information and the lack of standards when it comes to managing it as an asset. So you'll hear me talk a lot more about Data Governance in the future.
Something I would like to do in future episodes is to explore the connection between information flow and the rate of evolution of a species, any species, whether it's a biological organism or an organization made of human beings, which in a sense is a kind of a biological organism...
[00:10:30] But you know what I mean, it's not one biological entity it's more like an ensemble, a collection, of biological entities. And it would just be interesting. I'm sure there's a ton of research and bright minds out there that have studied this problem. I can see connections with complex systems, with people that study the brain and how different cells in the brain talk to each other and specialize and evolve while staying connected.
And I feel like there's an interesting parallel here between the topology of an information processing system, like the human brain, and the topology of an actual organization, which is made of many different domains, each one with different priorities, but ultimately all constrained by the availability of energy, which in this case is money, and in the case of an organism is more like chemical energy.
But I can see that there's something interesting there to discover, I'm just looking for the right guests. So if you yourself want to take a stab at it, or if you know someone that would be the perfect guest, I'd love for you to recommend them.
Another aspect I wanted to cover is communication and engagement. The first episodes were just published on LinkedIn very shyly, once per episode. And then as I started gaining a little bit of confidence, I went out there a couple of times and then three times. So as you know, the publishing schedule for this podcast is every three weeks.
[00:12:00] And a dear friend of mine gave me the idea of extracting the best quotes from an episode and publishing regularly once a week on LinkedIn. I started doing that, I added Instagram, so thank you to the Instagram followers, if you're one of those. Even though the channel is very new, it's a baby channel, but it's growing which is nice.
There's a page on Facebook as well, and the same content is on Twitter. I think this trend is going to continue because it's a good way to reach a different type of audience, and so that's incredibly useful.
I must say that LinkedIn remains the number one channel to engage with the community. I received a number of messages and there is a growing conversation around the teasers for the episodes, which I am glad that people are appreciating, so those will continue. A podcast is a great medium to take a topic and dive into it, but apart from the interaction between the host and the guest, it is a one-way type of communication.
You're listening to the stream now, and I thank you for that, but there is no direct way for you to get in touch with me other than writing me an email, and for that you gotta remember my email address: email@example.com. Also, there are resources that I come across that are really useful and I'd love to share them.
[00:13:30] I'm going to tackle the problem in two ways. One is I'll share more resources on LinkedIn, but again, not everyone is on LinkedIn - to my surprise. The second is through a mailing list. As I experiment I add new things, the YouTube channel is new. Actually, that's worth talking about. I started posting on one of my three YouTube accounts and I realized that I had three, which was absolutely nuts, so I had to close down the others.
Now there's only one YouTube channel. I will be there as "Loris Marini | Data Foundations" and I will keep publishing all the episode teasers as well as the full episodes in video form. I hope to have the time to keep doing that in the future. But that's going to be only one part of the channel, and this is what really thrills me.
There's going to be a section on the channel where I take a topic and I summarize the way that I understand it in like three or four minutes. I will use these videos to stimulate a conversation on YouTube directly. As you've seen over the first 10 episodes, I followed my gut feeling, my instinct. So the podcast evolved organically depending on the people that I came across and the ideas that I'm exposed to.
[00:15:00] I still encourage you to get in touch. There's a form on the website or you can shoot me an email and let me know your thoughts!
I also noticed that there's a ton of people that have been asking how I manage the podcast, what my workflow is. And again it has evolved a lot, but I think now I reached a point where it's predictable, I know what steps are involved. And it's incredibly long, it's way longer than I thought, but I improved my efficiency a lot, especially in the last three episodes. I think there's value here and I'm planning to share these lessons with the community.
So keep an eye particularly on LinkedIn. I will be posting videos basically every week with short tips and tricks, the behind the scenes of the podcast. And this is it for the future. Enough with the planning, because ultimately you gotta keep it agile and lean and go with the flow. So I'm not going to plan too much, but that's kind of the intention, and it tells you where we are and what you can expect out of the future episodes of the podcast.
[00:16:30] There's one more thing, and it's that the need I felt to talk about these topics is shared with a lot of people. And that's what excites me. I saw that there is a resonance there, so I keep striking this chord to see what happens. And the more I play this tune, the more I see people that go like, "Oh yeah, you know, we should have a chat about that". So that's really, really good. I'm dying to tell you the names of the future podcast guests, but I'm not going to do that because, although they confirmed, it's still kind of in the making, we're not entirely sure what we're going to talk about.
So I'm going to just shut up for now and I'm going to dedicate the next 10-12 minutes to take a look at what we discussed in the last 10 episodes and kind of create a summary. So if you really listened to all of them I think you can skip this part, but if you want to follow me through this overarching larger story of The Data Project, then let's jump to it.
The first episode was just me setting the context for this project, the "Start Here" piece. But the real first episode, episode two, is data science challenges with Stephen Pollack. We talked about the knowledge asymmetry in data and the role that it plays in the success and failure of data projects, the differences between the scientific mindset and the business mindset, and ultimately what can be done to make sure that data products are more useful for the end-users.
[00:18:00] Episode three was with Humberto Stein Shiromoto on doing science with data. Humberto is a senior data scientist, currently working at Telstra. We talked about a number of interesting things, including confirmation bias, the process of testing hypotheses, the difference between data and software engineering, and so DataOps versus DevOps, the Dunning-Kruger effect and many other topics.
The next one (Episode #4) was with Francesco Scoto. We took a bit more of a business focus there, at the time we recorded he was working as a business analyst, and he shared a number of ins and outs of the actual life of a business analyst. Particularly the challenge of dealing with a huge pressure from the business, you've got to come up with an answer really quickly but you rarely have access to reliable data or complete datasets. So you're gonna make a lot of assumptions, and the only way you can do that without hurting yourself or the business is by knowing how the business works. And so knowing the actual link in the real world, between a column in a database and the actual customer that's paying. So that was really insightful.
[00:19:30] Episode #5 was with Luiza Menezes. Luiza was at the time working as a product analyst at B2W Digital in Brazil. And that episode was more about what it means to work with fragmented data, the frustration she felt and how she coped with that. And she reacted by basically spearheading a business glossary project that was part of her vision for a data management program. Then she changed her role and moved to a more information-mature organization, and I'm really glad for her.
Episode #6 was a bit of an experiment with Rebecca Lodin. We talked about manufacturing, because she's been working in manufacturing for a long time, and data in manufacturing. The first two-thirds of the episode is really about the advantages of using data as a way to improve the efficiency of any manufacturing plant, but the last third is really the one that excited me, and it's about the importance of a psychologically safe environment for innovation and getting ideas to flow freely and progressing by combining different perspectives, as opposed to sticking our head inside of an echo chamber and convincing ourselves that we're doing the right thing. [00:21:00] So really interesting. There's a lot going on there in terms of ego. So I titled the episode "data in manufacturing and the courage to be data-driven".
Then I took a kind of detour into the topic of data modelling and database design with Nadia Jury and with Jonathan Brooks. I met both of them through the dbt Slack. It's a Slack channel created by the folks at Fishtown Analytics, who are also the maintainers of the dbt package, a modern way, let's say, of managing data models in a modern data warehouse.
What does it even mean? It means, to summarize it, that you've got data scattered across a number of different systems, some are SaaS apps, some are in-house, or legacy systems. You want to bring it all together, standardize it and structure it so that you can identify relationships between these different parts of the business and create a single source of truth, a single view on the problem, and actually act on it and you can trust it at scale.
[00:22:30] And so how do you do it? There are many different ways to tackle the problem. The most modern of the approaches is to combine the data into a modern data warehouse, which is optimized for analytics and not for operations, so it's very flexible, very unstructured in the way you interact with it. You can query data in all possible ways, you can join in any way you have in mind, there are no rules. Whereas in traditional operational databases, the transactional databases, there's a main index in the table and there are strict rules for how you should query the dataset. And there's a reason for that, right? Transactional databases are used to power a website or an app, so the response time has to be the shortest possible. Whereas in an analytical database you don't care if it takes a couple of seconds to get an answer, what you want is the ability to ask whatever question you have in mind. So it's a very different type of technology.
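To make the idea a bit more concrete, here is a minimal sketch in Python, using the standard-library sqlite3 module as a stand-in for a real data warehouse. The table names (crm_customers, legacy_orders), the orders_clean view and the sample rows are all made up for illustration; they are assumptions, not something taken from the episodes or from dbt. It shows two sources being brought together, standardized, and then queried to answer the kind of simple question mentioned earlier: how many units of a product did we sell to a customer?

```python
import sqlite3


def build_warehouse() -> sqlite3.Connection:
    """Load two hypothetical sources into one in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Source 1: customers exported from a (hypothetical) SaaS CRM.
        CREATE TABLE crm_customers (email TEXT, name TEXT);
        INSERT INTO crm_customers VALUES
            ('ana@example.com', 'Ana'),
            ('bob@example.com', 'Bob');

        -- Source 2: orders from a (hypothetical) legacy system,
        -- where emails were typed in inconsistently.
        CREATE TABLE legacy_orders (customer_email TEXT, product TEXT, units INTEGER);
        INSERT INTO legacy_orders VALUES
            ('ANA@EXAMPLE.COM ', 'widget', 3),
            ('ana@example.com',  'widget', 2),
            ('bob@example.com',  'gadget', 1);

        -- Standardization step: one cleaned view that everyone queries.
        CREATE VIEW orders_clean AS
            SELECT lower(trim(customer_email)) AS email, product, units
            FROM legacy_orders;
        """
    )
    return conn


def units_sold(conn: sqlite3.Connection, name: str, product: str) -> int:
    """Total units of `product` sold to the customer called `name`."""
    row = conn.execute(
        """
        SELECT COALESCE(SUM(o.units), 0)
        FROM crm_customers AS c
        JOIN orders_clean AS o ON o.email = c.email
        WHERE c.name = ? AND o.product = ?
        """,
        (name, product),
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    warehouse = build_warehouse()
    print(units_sold(warehouse, "Ana", "widget"))  # prints 5
```

The point of the sketch is that the cleaning (lower-casing and trimming the email) happens once, in the view, so that every later question is a plain SQL join rather than a fresh round of data preparation.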
In both episodes with Nadia and Jonathan, we looked at what is working in the space of data warehouse design and how dbt in particular is making the life of engineers a lot easier. And how it actually allows the whole exercise to be way more accessible for the business.
[00:24:00] Everything is in SQL, there are no complex pipelines hidden away or behind some strange code. It's just SQL. We're moving towards a future where SQL and Excel will both be part of the same minimum knowledge required by any business leader if they truly want to be data-driven. So that was really interesting.
Then episode #9 with Gilbert Eijkelenboom. You can hear me smile because I had a wonderful conversation with Gilbert: "People skills for analytical thinkers", titled after his book. And that was just an incredible adventure, some of my life experiences, his life experiences, his journey with the cold showers and ice. And it's a tale of our attempts to introspect on the way that our brains work and try to trick ourselves into exhibiting the best, most productive behaviour that we can possibly get out of our system. So that was really good.
And episode 10, which is the last of the decade, was with Scott Taylor, The Data Whisperer. We took a completely different approach, we forgot about Engineering and Design and Architecture, and we just looked at the why, you know, why do we need data management in the [00:25:30] first place? That was specially tailored to business leaders, again, titled after Scott's book "Telling your data story". And that's been an absolute pleasure.
I learned so much from Scott and you've probably heard my language change over time, because every time I read a book and I absorb it I go like, huh, this is interesting. So this is part of the effort of bringing data closer to the business, which is where data really should be. And so that's it, that's the first decade.
You're listening now to episode 11, and episode 12 is ready and will be released one week from now. This is breaking the pattern but it's okay. And that will be with Ravit Jain. Ravit is a community manager at Packt Publishing, and we tried to explore together what we can learn from the mindset of a community manager and use it to build and nurture data teams that are effective, that actually help organizations drive better decisions using information.
[00:27:00] Hopefully I'll see you there, and I hope you enjoy the rest of your day!