Bill Inmon: The immense business value of textual data

Loris Marini - Podcast Host Discovering Data

Learn how to extract business value from text from the father of the data warehouse: Bill Inmon.

Join the list

Join hundreds of practitioners and leaders like you with episode insights straight in your inbox.

Want to tell your data story?

Check out our brands or sponsors page to see if you are a match. We publish conversations with industry leaders to help data practitioners maximise the impact of their work.

Why this episode

There is enormous and largely untapped business value in text. But managing and modelling this type of data is riddled with challenges. Today I learn all about textual data from the father of the data warehouse, Bill Inmon.

Bill has written 65 books published in 9 languages. He has won many awards and was named by Computerworld as one of the most influential people in the history of computing.

His latest book “Text Analytics Simplified” is a guide to extracting business value from text.  It’s a must-read and you can get your electronic copy on Bill’s website at forestrimtech.com.

Join the Discovering Data community!

Do you want to turn data into business outcomes and get promoted? Discovering Data just launched a new Discord server to connect you with people like you. Discover new ideas, frameworks, jobs and strategies to maximise the impact of your work. Data can be a lonely and challenging career. Don’t do it alone!

Request access now: https://bit.ly/discovering-data-discord

For Brands

Do you want to showcase your thought leadership with great content and build trust with a global audience of data leaders? We publish conversations with industry leaders to help practitioners create more business outcomes. Explore all the ways to tell your data story here https://www.discoveringdata.com/brands.

For sponsors

Want to help educate the next generation of data leaders? As a sponsor, you get to hang out with the very best in the industry. Want to see if you are a match? Apply now: https://www.discoveringdata.com/sponsors

For Guests

Do you enjoy educating an audience? Do you want to help data leaders build indispensable data products? That's awesome! Great episodes start with a clear transformation. Pitch your idea at https://www.discoveringdata.com/guest.

💬 Feedback, ideas, and reviews

Want to help me steer the direction of this show? Want to see this show grow? Get in touch privately or leave me a review with one of the forms at discoveringdata.com/review.

What I learned

Want to see the show grow?

Your ideas help us create useful and relevant content. Send a private message or rate the show on Apple Podcast or Spotify!

Episode transcripts

Loris Marini: Textual and unstructured data has enormous business value, but managing and modeling this type of data is riddled with challenges. So today I learn all about textual data from the father of the data warehouse, Bill Inmon. Bill has written 65 books published in nine languages. He has won many awards and was named by Computerworld as one of the most influential people in the history of computing. His latest book, Text Analytics Simplified, is a guide to extracting business value from text, and it's a must-read.

So if you can get your electronic copy now, I would strongly recommend doing that at forestrimtech.com. As usual, links are gonna be in the show notes, but let's jump into this conversation. Bill, it's an absolute honor to have you on the show, so thank you. Thank you for making the time to be with me this morning.

Bill Inmon: Thank you, Loris. It's my real pleasure.

Loris Marini: All righty. So Bill, we would probably need 50 hours of podcasting to cover a hundredth of the stuff that you've published and created in even the last decade. But we only have 50 minutes. So I wanna focus today on text in particular, and I'm interested in your story. Can you give me a little bit of background?

Why text? You wrote the book on data warehousing. Everybody's studying that book. I've read part of it and definitely need to do more catching up. Why text, and why unstructured data?

Bill Inmon: Surely, Loris. A long time ago I was sitting at my desk and I was wondering about something, and that is the fact that, as important as the data that we in our profession call data today is, it's not the majority of the data in the corporation. You go into corporations today and the vast majority of the data is found in the form of text.

And what happens is that information, because it's in the form of text, isn't used in the corporation today. It would be one thing if that data weren't important, but I'm gonna tell you, there's very valuable information that's found in the form of text that isn't being used today.

So I began to ask myself the question: why is it that we as professionals are only looking at 10% of the data in the corporation? And so that's the question that started all of this off.

Loris Marini: Wow. Yeah, tell me more, because I remember a few weeks ago, when we caught up briefly for a bit of an alignment on this one, you mentioned a difference between unstructured textual data and unstructured non-textual data, and there's a big difference there, isn't there?

Bill Inmon: Oh, there's a big difference there, because in the corporation today there are really, when it comes to unstructured data, two kinds of unstructured data. There's textual data, which is unstructured. But there's also non-textual unstructured data, such as the data that comes from an analog machine or an IoT machine.

In order to understand analog, you need to think of, or I think of, a surveillance camera in a parking lot. If you think about a surveillance camera in a parking lot, that surveillance camera is taking a snapshot every one-thirtieth of a second, and quite frankly, throughout the day and the week that the surveillance camera is taking pictures, the vast majority of the pictures it takes are very uninteresting.

It's either an empty parking lot, or someone driving into the parking lot or leaving the parking lot. The only time it gets to be interesting is when there is an accident or when somebody's breaking into a car. Then the information captured on the surveillance camera becomes very interesting. So when you think of analog data, you can think of vast amounts of data that have little or no business value whatsoever.

And a little bit of data which has great amounts of business value to it. So that's the other kind of data that you find in the corporation as well.

Loris Marini: So with text, then, it's different. Textual data is not like video, so you don't have 99% of the actual data being useless. I guess the percentage is flipped. Am I correct in thinking that?

Bill Inmon: Somewhat. There's a lot more useful information in text, but there's also non-useful information in text. If I were a young man asking a young lady for a date on Saturday night, and I were to send her an email saying, Let's go out at seven o'clock on Saturday night, that information doesn't have any business value to it.

So that's a form of text that has very limited or no business value whatsoever. But there's a lot of other text that's got great amounts of business value to it. And I could go through my list of some of the forms of text. One of them is medical records. Medical records are found in the form of text.

And when you look at medical records, you find a wealth of information. And it's interesting: medical records are designed for one doctor and one patient, for the doctor to read what's going on with the patient and find out what the problem might be.

And as far as that's concerned, medical records work quite well. But in this day and age of Covid, we don't need to look at one medical record. We need to look at a hundred thousand medical records, and we need to look at factors such as a person's weight, a person's age, a person's gender, a person's race, the medication a person takes, whether the person's had cancer. There's a million conditions that the researcher needs to be able to look at to understand Covid. And those are all locked up in the form of medical records. And because they're in the form of text, it's very difficult for the analyst to discern what's going on. But medical records are only one form of text that's important. And that's the tip of the iceberg.

Loris Marini: Yeah. It's fascinating to me, because text is our natural way of expressing ideas and communicating facts and things. We don't wake up in the morning and send a table with numbers to a manager or to our colleagues. We send an email, right? We have grammar, we have language.

When I think about bridging the gap between what's in databases, in systems, and what's in people's heads, text is the obvious channel. What's really hard about text is that I personally haven't seen a ton of material to educate engineers and data modelers, and anyone that deals with text on the front line, on how to do that.

And I know that you have an interesting story there. Tell me a little bit about that, because you've done that journey. You've gone from tables and mastering the modeling of the data there to understanding: geez, there's a lot in text, and text is different. So what was your story?

Bill Inmon: Text is fundamentally different from the data that we model in the corporation. I came from a background of data modeling and understanding structured data, and when I got over into the world of text, I found it to be very awkward. The reason why is that text is fundamentally different from the data that most of us are dealing with. Let me tell you some of the ways that it's different. Number one, when you look at text, you've got to have context. And understanding the context of text is, quite frankly, much more difficult than understanding the text itself.

That's one important difference. Another important difference is the confusion that comes with text. You can take something simple, like the word fire. Now, what does the word fire mean? It can mean that you set something on fire: you set a building on fire. It can mean that you've lost your job involuntarily: you've been fired. It could mean that you have a gun and you've fired the gun. Our language is filled with words that have all sorts of different meanings, again depending on the context.

When you take a look at the difference between data modeling and dealing with text, it's the difference between an internal look at something and an external look at something. Data modelers look at things internally. They look at the corporation, they look at the internals of how the world works within the corporation. That's typically what data modelers do. But when you deal with text, you have an external vision.

You have the way that anybody walking down the street can use text. With that, you have a fundamental difference between an external view of the world and an internal view of the world. And so if you try to apply data modeling techniques to text, I'm gonna tell you right now, it doesn't work.

I've tried, I've failed, and I've learned from my failure, and it doesn't work. What we have instead of a data model is something called a taxonomy or ontology. And we have something else called inline contextualization. Now, between a taxonomy or ontology and inline contextualization, you get a very nice viewpoint of the world of text.

But these two things, a taxonomy and inline contextualization, are very different from anything that you find in a data model. So for all practical purposes, a data model is to structured data what a taxonomy is to textual data. But a data model is not the same thing as a taxonomy or inline contextualization.
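[Editor's sketch, not part of the conversation. Bill's "fire" example can be made concrete with a toy program that picks a meaning from the surrounding words. This is only a minimal illustration of context-dependent meaning: the cue-word lists and the function name `classify_fire` are assumptions made up for this note, and nothing here describes how Bill's taxonomies or inline contextualization actually work.]

```python
import re

# Hypothetical cue words for three senses of "fire". These lists are
# illustrative assumptions, not taken from any real taxonomy.
SENSES = {
    "combustion": {"building", "smoke", "burn", "flames", "alarm"},
    "dismissal": {"job", "boss", "manager", "employee", "severance"},
    "weapon": {"gun", "shot", "rifle", "bullet"},
}


def classify_fire(sentence: str) -> str:
    """Return the sense whose cue words overlap most with the sentence."""
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    scores = {sense: len(words & cues) for sense, cues in SENSES.items()}
    best = max(scores, key=scores.get)
    # With no supporting context at all, admit the meaning is unknown.
    return best if scores[best] > 0 else "unknown"


print(classify_fire("The building caught fire and smoke filled the street"))
print(classify_fire("My boss decided to fire the employee"))
print(classify_fire("Fire"))
```

[The last call returns "unknown" because the word alone carries no context, which is the point of the example. A sentence that matches cues from two senses equally is resolved arbitrarily here, one small reason why getting text right 100% of the time is hard.]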

Loris Marini: I'm thinking that with tabular, structured data, we are a lot more in control. I feel like we know what the columns are.

We know the data type. We know the name of the column. If we have lineage and we have observability, maybe we even know how we got there, right? But fundamentally, we are trying to make things clean and remove any unknowns from the data set. That's why we talk about data quality and cleaning the data set and ensuring that there are no missing values.

And with text, you can't really do that, can you? So it's almost a different mindset.

Bill Inmon: When you're dealing with the corporation, you're dealing with how the corporation understands and perceives its data. But when you're dealing with text, you're dealing with anybody in the world. You're dealing with the outback. You're dealing with people on the ocean.

You're dealing with people that are old, people that are young, people that have recently learned our language. It can be anybody and it can be anything. And so there is no preconceived notion of structure when it comes to text, and that presents a real challenge.

Loris Marini: There is no preconceived structure when it comes to text. I love that. Fundamentally, that is the reason why text is so hard. There's no alignment we can do, because sometimes I don't even know who wrote that piece of text, and even if I did, I can't really call a million people and sync up: hey, okay, with this word we mean this.

So I'm left with whatever is the state of the comprehension of the language, of the grammar, at that particular point in time in the world, across nations, across borders. It sounds like an impossible task. How the heck do you go from there to the meaning?

Bill Inmon: When you try to apply an engineering mindset to this, you go crazy. I'm gonna tell you right now, and I have nothing against engineers. In fact, I'm probably an engineer myself. But the engineering mindset of having everything finite, everything neat and orderly, it doesn't work.

I'm gonna tell you right now, and in fact, let me tell you a little story. I work with a wonderful lady who works for my company. But she's a perfectionist, and it drives her crazy. When you deal with text, you have to deal with imperfection. Perfectionism in the world of text gets you nowhere.

And she and I have long discussions. On many occasions her perfectionism is not right, because when you're dealing with text, if you get it right 90% of the time, that's what you have to learn to live with. And engineers don't like that. Engineers want to get it right a hundred percent of the time, and that last 10%, I'm of the opinion it can't be done.

And I don't use the words can't be done lightly, but if it can be done, I sure don't know how you do it. And so there's a different mindset when you're dealing with text. Perfectionism is an enemy when it comes to dealing with text.

Loris Marini: Yeah. And Bill, another attitude that engineers always have is this tendency of looking for opportunities to automate stuff, right? So we write code, and we hope that the code does a bit of work for us so that we can sleep, or maybe write other code to automate other things, in the hope, obviously, that one day we'll stop working or maybe we'll relax, which is never gonna happen.

We know that. But that's the dream that we all live as engineers. The question to you is, what's the role of automation in all this? I'm thinking of NLP, obviously, but can we automate context extraction, and up to what point?

Bill Inmon: You've mentioned NLP, so let me talk with you about NLP, natural language processing. When we started, we took a good, long, hard look at NLP, and here are the conclusions we made. NLP is great for studying language. If what you want to do is study language, then NLP is something you're gonna really enjoy and love. But NLP is not a commercial product. The world is full of people that have tried to automate NLP and have failed. The largest single failure that I'm aware of was IBM Corporation and something called Watson. IBM spent a number of years, and it's estimated they spent over $2 billion, not million, $2 billion, on Watson before they gave up on Watson.

And all Watson was, was an attempt to take NLP and commercialize it. And that was not the way to go if you want to have an automated commercialization of the world. So we started off looking at NLP. We decided that NLP was not the right solution if you wanted a commercial product.

Without going into a lot of detail and making this sound like a sales pitch: yes, we have automated the process of looking at text. We call it textual ETL. And it is a commercial product. It is everything that NLP is not. Whereas NLP is expensive, we are inexpensive. Whereas NLP is complex, we are simple. Whereas NLP takes a long time, we are very fast. We don't require an army of data scientists either. And so now you say, gee, Bill, this is interesting. How long did this take you? We've been working on this, and it embarrasses me to admit it, but this is my 23rd year of working on this.

Loris Marini: Wow. 

Bill Inmon: If you think this is a simple problem, you are wrong. This is not a simple problem, but we've reached the point now where we have automated the ability to read text and turn text into a database. And we do work for a lot of large corporations in the world.

I have to tell you, over and over, the reaction we get from corporations is: gee, this is as close to magic as I've ever seen.

Loris Marini: Hearing it from you... I tend to always shy away from anything that sounds magic-ish, but hearing it from you, that word has a bit of weight. I'm very curious to know more. Tell me more about the product and the use cases. Is this for large corporations, for startups?

Who are you targeting? Who's the ideal buyer? Who's the ideal user?

Bill Inmon: Let me tell you some of the people that we are working with today. One of them is oil companies. Let me describe the problem. Oil companies have got oil wells. That goes without saying. And for each oil well there are 20 to 30 thousand documents.

There are things about leases. There are things about the government regulations. There are things about the length, the size, the type, where the well was drilled, when the well was drilled. There is this mountain of documents for each well.

And you take a large oil company that's got a lot of wells out there, and what happens is they say, gee, how in the world can I manage this mountain of textual information that I need to manage for a well? How can I look across the 5,000 wells that I have and find out such things as: what kind of pumps do I have?

What kind of rigs do I have, and when do I need to do maintenance? It's a huge problem. I'll tell you what it's like. It's like walking into a large public library without a card catalog. If you walked into a large public library and wanted to find something, and you didn't have a card catalog...

Loris Marini: You'd spend a week.

Bill Inmon: A week, a month or a year just going through, and even then, if you weren't careful, you might miss it.

In essence, what we do is we build a card catalog for these oil companies. Now, with a card catalog, they can quickly and easily go in and find what information it is they need out of the thousands of documents that they have. That's one use case.
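[Editor's sketch, not part of the conversation. The closest textbook analogue to the "card catalog" analogy is an inverted index: a map from each word to the documents that contain it. The example below is a minimal sketch under that assumption. The document ids, the sample text, and the function names `build_catalog` and `lookup` are invented for illustration and say nothing about how textual ETL is actually built.]

```python
import re
from collections import defaultdict

# Hypothetical well documents, invented for this illustration.
documents = {
    "well-001-lease": "Lease agreement for well 001, signed with the landowner.",
    "well-001-maint": "Pump maintenance scheduled for well 001.",
    "well-002-maint": "Rig maintenance completed for well 002.",
}


def build_catalog(docs):
    """Map each lowercase word to the ids of the documents containing it."""
    catalog = defaultdict(set)
    for doc_id, text in docs.items():
        for word in set(re.findall(r"[a-z0-9]+", text.lower())):
            catalog[word].add(doc_id)
    return catalog


def lookup(catalog, *terms):
    """Return the ids of documents that contain every one of the terms."""
    sets = [catalog.get(term.lower(), set()) for term in terms]
    return set.intersection(*sets) if sets else set()


catalog = build_catalog(documents)
print(lookup(catalog, "pump", "maintenance"))  # {'well-001-maint'}
```

[With the index built once, each question becomes a set lookup instead of a read through every document, which is the week-versus-seconds difference the library analogy describes.]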

Loris Marini: Yeah. 

Bill Inmon: Another use case is in medicine, in the arena of medical records.

As I mentioned, the EHR, the electronic healthcare record, is something that's designed for one doctor and one patient. But if you want to do research, if you want to look at a hundred thousand patients all at the same time, you can't do that. And so what we do is we go in there, we read the medical records, we turn them into a database, and now we can say, for a hundred thousand patients, or for any large number of patients, what are the common factors? Who's had heart problems, who's had cancer, and things like that.

Loris Marini: Plenty of applications for this technology.

Bill Inmon: Plenty of applications.

Loris Marini: Plenty of applications. Obviously I'm careful about and cognizant of intellectual property. I know that this stuff is proprietary and there's only so much you can talk about it, but I'm wondering, in terms of an intuition to give the listener, just a hint as to why NLP doesn't work and what is the right way of thinking about the problem. Not so much of solving it, but how you should think about the problem of extracting context from text. Because sometimes it's hard for me. When I talk to my wife, misunderstandings happen every day, right?

There's a lot of context where we assume the other person knows what we're talking about. And then every once in a while you get that look that goes, What the hell are you talking about? And I'm like, Oh yeah, sure, I didn't explain this, and this. And those are shortcuts that every human being takes, right?

At work and in our private life, but especially at work, when we are all busy, we're looking for shortcuts. We're trying to save time, save energy, save cognitive power. So with all that in mind, knowing that we are lazy creatures for very reasonable evolutionary reasons, how do you extract context, and how confident can an algorithm be that the context that it extracted from the text is actually accurate?

Bill Inmon: That, sir, is a very good question. Let me tell you how we typically answer that question. When we go to talk with people about what we do, number one, what we do has never been done before. Number two, they say, This sounds too good to be true. I can't believe you can do it.

What we do, and we've done quite a few of these, is what we call a proof of concept or a pilot program. With a proof of concept, we take a small amount of their data, we process it for them, and we show them, so that they can sit there and look at it and see that, number one, it can be done, it can be done in a short amount of time, and it can be done accurately.

And we always tell them: are we gonna be perfect in doing this? We never tell anybody we're gonna do this perfectly, because I don't think text can be done perfectly. It's like the conversation with your wife. On occasion, your wife understands something that you say incorrectly. So then you are gonna say, no, I really didn't mean that.

I really meant this and that and the other. And we know that, and that's the nature of dealing with text. So we do these things called proofs of concept. We take the actual data, we process the data, we show them the results, and we show them how quickly it can be done and how well it can be done. Even then, people sometimes still don't believe it. They think that we are like the Wizard of Oz and have some kind of magic man in the background that's pulling levers or something, and that's not what we're doing. It's taken me nearly a quarter of a century to figure out how to do this.

Loris Marini: Wow. And the sort of inputs that you need, are they specific documents or any digital document?

Bill Inmon: The best kind of document that we get is one that's representative of your business. Now, in terms of the form of the document, we can take voice and transcribe the voice. We can take paper and read from the paper and pencil. We can take text off the internet. We can take it off email.

When people ask us what form the document has to be in, I tell them, I'm sure in the world somewhere there's a form that we can't handle, but we haven't found it yet. Actually, there is one form of document that we can't handle, and that is handwriting.

We've not figured out how to manage people's handwriting. But outside of handwriting, we have yet to find one. Now, along that line, another thing that we can handle is multiple languages. Right now we operate in English, Spanish, Portuguese, Dutch, French, Italian, German and Arabic. We know that we can operate in Mandarin, Kanji, Korean, Cyrillic and other base languages.

We just haven't put them into the system yet, for a variety of reasons.

Loris Marini: Yeah. 

Bill Inmon: Different languages are something that we can do, coming from any source of data. Again, I'm sure there's some source of data that we can't handle. We just haven't found it yet.

Loris Marini: And the output? When we say text is done, the definition of doing text well, what's the output? Is it like a map of entities and their relationships in the business? Or is it an actionable insight? What can I expect after I press run?

Bill Inmon: There's two things that you can expect. Number one, you can expect a standard database. You can expect your text to come out looking like a standard relational database. That's the whole process of creating structured data from unstructured data. But the second thing that most people look for is the visualizations that can be done.

Once your text is in the form of a standard database, then you can do such things as a knowledge graph, which is very popular. You can do dashboards, you can do charts. There's a whole wide variety of things. We found that most people, when they want results, expect to see something like a knowledge graph, which we can produce. Not a problem.

Now, when I say we can produce it, we take our output, translate it into either Neo4j, TigerGraph or one of the popular knowledge graphing technologies, and then let them do it. We don't replicate what Neo4j and TigerGraph do.

Loris Marini: Yeah. Okay. So you just feed the data in the right format so that these technologies can then take it, visualize it, and provide the user interface.

Bill Inmon: That's right. And so when people ask us to do what we do, we ask them: what do you want? Do you want a database or do you want the visualization? Most people want the visualization, but some people want the database.
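[Editor's sketch, not part of the conversation. The two outputs Bill describes, a relational database and a graph-style view built on top of it, can be illustrated with Python's standard `sqlite3` module. The table name `mention`, its columns, and the sample rows are assumptions invented for this note. The real textual ETL schema is not described in the episode, and no Neo4j or TigerGraph API is used here.]

```python
import sqlite3

# Hypothetical rows that a text-to-database step might emit: one row per
# term found in a document, tagged with an assumed taxonomy category.
rows = [
    ("rec-1", "aspirin", "medication"),
    ("rec-1", "hypertension", "condition"),
    ("rec-2", "aspirin", "medication"),
    ("rec-2", "diabetes", "condition"),
    ("rec-3", "hypertension", "condition"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mention (doc_id TEXT, term TEXT, category TEXT)")
conn.executemany("INSERT INTO mention VALUES (?, ?, ?)", rows)

# Database view: how many records mention each condition?
counts = conn.execute(
    "SELECT term, COUNT(DISTINCT doc_id) FROM mention "
    "WHERE category = 'condition' GROUP BY term ORDER BY term"
).fetchall()

# Graph view: two terms that appear in the same record form an edge,
# weighted by the number of records they share.
edges = conn.execute(
    "SELECT a.term, b.term, COUNT(*) FROM mention a "
    "JOIN mention b ON a.doc_id = b.doc_id AND a.term < b.term "
    "GROUP BY a.term, b.term ORDER BY a.term, b.term"
).fetchall()

print(counts)  # [('diabetes', 1), ('hypertension', 2)]
print(edges)   # [('aspirin', 'diabetes', 1), ('aspirin', 'hypertension', 1)]
```

[An edge list like `edges` is the kind of flat export a graph tool could load for visualization, while the table itself stays queryable with ordinary SQL, which matches the "database or visualization" choice described above.]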

Loris Marini: I guess the database is more flexible, so it allows you to then connect it to all the other systems and do more custom stuff if you have the capability in house. So maybe a data team or data scientists that wanna look into that text. In the past I was involved in a small project. It was a three-month sprint.

We were looking to extract meaning from the chats of our support team. It was a SaaS product, and we interfaced with a whole bunch of people raising questions about the product. And we asked the question: hey, can we find a pattern in all these questions? We know what the top three are, but can we go further than that?

Can we analyze this text and find similarities? And Bill, I must say it failed. Failed in the sense that we had to do it manually. I was hoping to write some code to do it, and all the code that I wrote was absolutely useless. I found some similarities. At some point I thought maybe I should start from the verbs.

And again, too many verbs. People use different verbs to mean the same action. So there are synonym problems there. Maybe I should have created a small vocabulary, a dictionary, internally, and translated words or verbs that sound similar, and assigned them a code or a number. I didn't have the time to do that, but the overall feeling that I got after that exercise was like: man, I'm so incapable of solving this simple problem.

So I definitely relate to what you described before. What did you call it? ETL?

Bill Inmon: Textual ETL.

Loris Marini: Yeah.

Bill Inmon: Because it is the same thing as ETL as we know it today, except that it operates on text. And so that's the name we applied to it internally.
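[Editor's sketch, not part of the conversation. The small internal dictionary Loris describes, where different verbs that mean the same action are translated to one code, might look like the toy example below. The synonym table, the action codes, and the sample support messages are all invented for illustration. This is a sketch of the idea he says he never had time to build, not a description of textual ETL.]

```python
import re
from collections import Counter

# Hypothetical synonym table: each surface verb maps to one canonical code.
CANONICAL = {
    "cancel": "CANCEL", "terminate": "CANCEL", "close": "CANCEL",
    "reset": "RESET", "recover": "RESET", "restore": "RESET",
    "export": "EXPORT", "download": "EXPORT",
}


def actions(message: str) -> set:
    """Return the canonical action codes mentioned in one message."""
    words = re.findall(r"[a-z]+", message.lower())
    return {CANONICAL[word] for word in words if word in CANONICAL}


def top_actions(messages):
    """Count in how many messages each canonical action appears."""
    counts = Counter()
    for message in messages:
        counts.update(actions(message))
    return counts.most_common()


support_chats = [
    "How do I cancel my plan?",
    "I want to terminate my subscription",
    "Can I download my invoices?",
    "Please help me reset my password",
]
print(top_actions(support_chats))
```

[Here "cancel" and "terminate" collapse into the same CANCEL code, so the two messages are counted together. The exact-match lookup misses inflected forms such as "cancelled" and any verb that is not in the table, which hints at why the manual approach felt so hard.]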

Loris Marini: Is there any hope, do you think, that we're gonna solve the ultimate challenge of ensuring that the systems we have know the same things that our people know, and vice versa? Or is it one of those goals that are too hard to attain in practice? Theoretically you could, but it would take too much effort, too much energy to achieve. Or maybe we should think about an 80/20 rule.

I don't know. How are you thinking about this more generally? I'm expanding out of text and more into the overall...

Bill Inmon: You are thinking two to three years ahead of me. I'm just a lowly technician trying to do day-to-day practical things, and we'll get to be able to answer your question five years from now. But right now we're learning to crawl, and before we can run, we have to learn how to crawl.

So once I learn how to crawl and walk then we can worry about running.

Loris Marini: Yeah, definitely. It's interesting to me because the sort of outputs you're describing, whether it's a graph or the actual tabular data, can easily be fed into a feedback loop system with real humans in the loop, so that the whole process of knowledge creation and knowledge sharing can be faster, perhaps even more accurate, because there are some parts of the whole process that are really boring, like sifting through emails or Slack messages or Teams messages.

No one wants to spend hours doing that. So if we've got a piece of tech that can do that for us and can say: hey, look at this graph, look at the concepts and their relationships. Context, meaning, and relationships, right? The whole ontology automatically extracted for you while you were asleep.

Now take that and add to it your own experience, your own understanding of how the business operates, and build on that. That would be incredibly useful.

Bill Inmon: Along that line, what I find to be interesting is that we are probably the best friend and best supporter of data science of anybody. And yet data scientists look the other way. They run the other way when we show up at the door. And I find that to be really odd, because what we're doing is setting the stage for data science to be successful, honest to God.

And I'm not gonna name names, but I know any number of data scientists, and I don't know whether they don't want help. I don't know what the problem is, but data scientists have this peculiar attitude towards opening the door for them to be successful.

Loris Marini: Interesting. I wonder what that is. I definitely wore their hat in the past, and I know that there is this huge chip on the shoulder of most data scientists that comes fundamentally from the gap between what the business expects from them and what they can actually deliver. They're scrambling oftentimes to brush up on the latest technologies or algorithms or solutions.

And they go deep. They go into the math. So these are people that are extremely brilliant. Most of them, I would say, don't have the business context they need to work well. Those that do still lack the data they need to do the sort of cool stuff that the business is dreaming about.

And yeah, so it's gotta be hard to show up every day at work knowing that you are one of the employees that are paid the most, knowing that the business expects a lot from you, and realizing month by month that frankly you're not capable of delivering to those high expectations.

You're just scratching the surface, right? And always saying, The data's not there, the quality is not there, I'm missing this. It doesn't feel good. So they wanna prove themselves, I think.

Bill Inmon: people like myself that are reaching out a helping hand, They don't want to hear about a helping hand. At least that's been my experience. A along that line that, and not moving away from the data scientist. I'll tell you another phenomenon that I find to be really odd.

One of the things we do is something called voice of the customer. Listening to what customers are saying on the internet about a product or something like, I had this most amazing conversation with a professor at a university. This professor at the university, a well known institution said Bill, We've been studying customer feedback for years and years.

And he says, I can guarantee you, I can prove this to you with studies that we've done, that 90% of businesses never listen to their customer. They don't even know who their customer is. And he says 10% of businesses do; they do know who their customer is.

And in fact, I have a quick little story to tell you. I was in a telecommunications company the other day, talking with the people there, the technicians, and I'm not making this up. I mentioned the word customer and this technician looked at me and says, Customer? Do we have customers?

I'm not kidding. Yes, your company actually has customers. That's why you're in business, you idiot. But unfortunately, that is a true story. I wish that weren't true, but we've got these technicians out there that have been divorced from the business. They don't even know they have customers, or who they are, or what they're saying.

Anyway, so data scientists: we try to reach out a hand to help data scientists. They don't want any help. Fortunately, there are businesses out there that do care about their customer. There are people out there, and companies, that do want to solve their problems.

So fortunately for us, there is a live audience out there. But man, reaching the audience. There's an old analogy we use. There is a man in the ocean that is drowning, and there's no one around. All of a sudden a boat comes along and sees the man and drives over to him.

And they throw a rope to him to rescue him. And he looks at the rope and he says, Sorry, I only accept red rope. So here's a man drowning in the ocean and you're gonna save his life. And he says, Sorry, I don't like the color of the rope you threw me.

Okay whatever.

Loris Marini: Enjoy the ocean then.

Bill Inmon: Yeah. Have a good time.

Loris Marini: Have a good time. I definitely see that as well. I'm trying to put myself in the shoes of these people, and I think, for better or for worse, the level of noise in the industry is really high. I keep saying that, and maybe I should explain what noise really means to me.

It's just that the space is crowded and the terms are misused all the time. We're at a point where you say a technical word, data mesh, data fabric, data warehouse, data product, and it literally means different things to different people. Just yesterday someone approached me on LinkedIn, you know, in a private message, and said, Do you think a self-driving car is a data product?

I think it's a brilliant question, right? It opened up a new way of thinking, cuz to me, data products are things that allow you to learn something, that are useful, and that are usable. So if they have the potential to teach you something about your business but they're not usable, then they don't have a lot of value.

You have to spend more time learning how to use them than the advantage you're gonna get back. But at the same time, if they teach you something about your business and they're usable, easy to use, but they're not useful because that's knowledge that you already have, the value is very low.

So I was just thinking about these three pillars of data products, and I found that ambiguity that you mentioned before. A data product: what is it? What do you see as a data product? I don't think we have reached a state in the industry where the common terms we use every day on LinkedIn are generally accepted.

Everybody has a spin on them.

Bill Inmon: I agree

Loris Marini: And I'm the first, right? So part of this podcast is really to try and talk to a whole bunch of different people with different understandings. And that helps me gain a bit of a map that says, Hey Loris, you're here. Your understanding might not be perfect, but there's a bunch of reference points you can use to navigate this mess.

So we're gonna get there, Bill. We're gonna get there. We just have to be patient, I think. So, barriers to connecting systems to people: we talked about perfectionism. We talked about the need to prove themselves, particularly for the data scientists in the room. The engineering mindset, too, doesn't really help with extracting context or, in general, just managing text.

And then the usual human barriers, like having a growth mindset, relearning and unlearning what you already know, and going through that exercise, as you've done, of jumping in head first and realizing that you don't have the necessary skills, or that your experience with structured data wasn't enough to navigate and gain new insights, new understandings, and make progress.

And so how do you then make progress? You just go through it, right? You can't jump over it, you can't get around it. You gotta get muddy.

Bill Inmon: The proof is in the pudding. Again, we don't try to tell people what we do. We don't try to tell people how great our technology is. We simply say, Give us your data. Let us show what we can do with it, and let us show you how inexpensive and how simple and how fast it is.

We've tried talking with people, we've tried explaining technology. Believe me, that approach doesn't work. The way that we're going about changing the world is simply to show the results and say, Okay, you want business value? We'll show you how to get business value.

We'll show a doctor how we can provide him information he can't get any other way. We will take an oil company and show them how to create a card catalog and what that does for their business. And so rather than talk about how we do all these things, we simply say, Let us show you. Even then a lot of people are still skeptical. But that's life.

Loris Marini: Yeah, skepticism is a healthy thing to have. Definitely. You don't wanna go in and believe the first one. But just one final question that just popped into my mind, Bill. Explainability is a big topic in AI. We don't wanna have black boxes in business. We wanna have boxes we can open and that can tell us clearly what's going on, so we can ask questions and worry about what-if scenarios. I think maybe the answer is obvious, but because the output is a table or numbers or a graph, explainability is right there and then, right? There's not even a need to explain it.

Bill Inmon: The results are explainable and obvious. As for how the results got that way, you talk about a black box: yes, we have a black box, but I'm gonna tell you, trying to open up that black box, you don't want to do that. It's taken me 23 years to figure out how to make that black box.

And I don't have, and you don't have 23 years to sit and figure out everything that goes on.

Loris Marini: Definitely. Awesome. Bill Inmon, father of the data warehouse. Bill, it has been an absolute pleasure. The book is Text Analytics Simplified, right? It's on your website at forestrimtech.com.

Bill Inmon: The magic word: it's free. We don't charge anything for it. All we ask of you is to tell us where to send it. And there's no cost to the book.

Loris Marini: Fantastic. So there you go. If you're looking for free resources that are high quality, look no further, people. Get that PDF, and maybe we can even kick off the conversation on LinkedIn and start reading it together. If you wanna do a bit of a study group, let me know, cuz I'm always looking for new books to read.

That's it, Bill. Fantastic pleasure. Thank you again for being on the podcast, and I look forward to our next chat.

Bill Inmon: My pleasure, Loris. Thank you.
