Episode Transcript
[00:00:05] Speaker A: ID the Future, a podcast about evolution and intelligent design.
[00:00:10] Speaker B: How does AI stack up when it comes to accurately representing the theory of intelligent design?
Today, I conclude my conversation with mathematician and philosopher Dr. William Dembski about AI and ID.
In part one, we discussed some of Dr. Dembski's explorations into the reliability and accuracy of large language models on the topic of intelligent design.
In discussing his back and forth exchanges with chatbots like ChatGPT and Bard, he shows us that this new technology has promise and definitely goes beyond what Wikipedia can offer for multiple reasons, but it will take work from us as well.
We have to hone our skills as investigators, which means asking good questions and collecting other viewpoints and sources so that we can get a full picture.
In short, verify and then trust.
In this concluding segment, the discussion turns to truth and trust with AI. We conclude by looking at whether AI really will level the playing field for intelligent design into the future.
Let's get right back to it now. You've also written about truth and trust when it comes to AI and large language models. You've written that although LLM-powered chatbots can be helpful assistants, they can also be misleading to the point of being deceptive. We've heard stories, of course, of how AI hallucinates, simply makes stuff up in order to offer an intelligible-sounding answer.
When it comes to the old Russian proverb, trust but verify, something Ronald Reagan fell in love with, you suggest actually approaching AI by doing the opposite. Verify first, then trust. Tell us why independent verification is essential when it comes to utilizing AI these days.
[00:02:03] Speaker A: Well, yeah, it's interesting when you mention that Russian proverb, because that never made sense to me. Once you verify enough, you know, take a sample and find that somebody has consistently told the truth, that's what verification does: it demonstrates the truth. Then I think you have a basis for trust. To start off with trust, you are maybe giving the benefit of the doubt but still being cautious, and then having to verify. So Reagan knew what he was doing there.
But yeah, in terms of the need to verify what these large language models are doing, I think it's just in the nature of the beast, because of how they complete text.
They keep adding tokens according to how these neural networks have been trained, so they're going to be completing sentences, and they don't care about truth. Truth comes out because, in the training, if you get these models spewing too much nonsense, too much stuff that is clearly untrue, they're not going to be viable models.
So the training has to get to a point where you're getting something that really does make sense and is on the whole true and reliable. But "on the whole" doesn't mean that there are no exceptions. And I think this is what you find, and there are places where you have, it seems, more exceptions than others. Now, I haven't been monitoring this that closely lately, but about two years ago I was looking for some quotes by biologists talking about how great biological systems are, in terms of biomimetics and just the wonder of living systems, but wanting to get quotes from secular biologists. And so I prompted ChatGPT: give me the quotes, tell me who said them, and give me a reference. And I was getting all these what seemed like really nice quotes, and then I checked the references, and the references were nonexistent.
I mean, it just made them up. And I think that's it: you just never know.
So I'm not sure even confirming one LLM with another LLM is the way to go, though that still would probably be some sort of check. I think you want some independent confirmation if they're factual matters, especially if there's a lot riding on it. And I think we had something like that just recently. There was a big report, this Make America Healthy Again report, and, don't quote me on it, one of the criticisms I heard, which I haven't confirmed, though I actually downloaded the report, is that of the 500-plus footnotes, endnotes, apparently at least one of the references had been made up, suggesting that a large language model had been used in the construction of that document.
So, you know, credibility is hard won and easily lost. And I had a particularly embarrassing moment where I was speeding to get something published on one of my websites, got it out, and it was egregiously wrong, because I just trusted the large language model. Fortunately, it got corrected quickly. But this was early on, when I was still feeling my way around these tools. I don't want to minimize my own culpability there. But you do want to watch it with these models.
[00:06:31] Speaker B: I'm glad you share those stories, and you do mention this in your writing, that it is easy to just buy into it and run with it. But you do have to confirm.
It's a good reminder to all of us, I think, that we have a responsibility if we're going to use this AI technology in any fashion. We do still have that responsibility to be an investigator, you know, and, and part of that is asking questions, you know, being critical of what you receive until you have verification that you can put stock in it. And it doesn't really matter what field you work in, even if you're a young student, you know, you've got to be skilled at investigation. And the heart of that is collecting evidence. You know, when I teach young people about fallacies and how to make sound arguments, one of the steps we discuss is the collecting of viewpoints, the collecting of different people's perspectives. Because not only does that help you refine your own beliefs and ideas, but it also helps you evaluate what you hear from other people.
So we can do the same with AI: collect these viewpoints from ChatGPT, Grok, Bard, whatever you want to play around with, but consider each one a single source. And before you take it as truth, you've got to verify it. Now, that does take work, though.
Do you think the whole ease of using these chat bots and AI is going to make us less willing to put in that work?
[00:08:01] Speaker A: Yeah, I think there is the temptation, because I think this is one of the challenges with education, that we get people to think on their own feet, because it's so easy to go to these systems and let them prop us up. I remember when I was studying Greek in high school.
Classical Greek. I was told, with Latin too, but especially with the Greek: don't rely on interlinear texts. The idea with interlinears is you have the English and the Latin or the Greek right over each other, and it becomes this kind of perpetual crutch. And I think this is where a lot of education, the better education in this country, is going to go: you've got a teacher and a student, no artificial intelligence, and basically you're writing with a pencil in a blue book, having to compose things and do things right then and there, or with a keyboard that is air-gapped, so you don't have access to any of these large language models or anything. But it's interesting, to your point about getting multiple perspectives, there's something, I think, in the book of Proverbs: in the multitude of counselors there is wisdom. I think it would be an interesting thing, actually.
You know, is it the case that, let's say, you take three of the most popular large language models: ChatGPT, Grok, and Claude.
Is it possible to use them to fact-check each other and to get some sort of consensus? Let's say you've got an error rate of 0.1 percent, or whatever, with a given model; by using them with each other, can you reduce that error rate? So you prompt with a bunch of different factual questions and see: do you get a reduced error rate, or are they basically reinforcing each other, all committing the same sorts of errors? I'm not sure if that sort of research has been done. I wouldn't be surprised if it has. But for me, if I have to check something out, often what I'll do is a Google search and then just try to go to some sources.
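The cross-checking experiment Dr. Dembski wonders about here could be set up along these lines. This is a minimal sketch, not a tested protocol: the query_model function is a hypothetical stand-in for whatever chatbot APIs you have access to, and the model names are just labels. Agreement between models is not proof of truth, since they may share the same training-data errors, but disagreement flags claims that need independent verification.

```python
from collections import Counter

def query_model(model_name: str, question: str) -> str:
    """Hypothetical wrapper: send a factual question to the named chatbot
    and return its short answer. Replace with real API calls."""
    raise NotImplementedError("wire up your own chatbot clients here")

def cross_check(question: str, models=("chatgpt", "grok", "claude")) -> dict:
    """Ask the same question of several models and tally their answers.
    Unanimity is weak evidence; disagreement is a flag for human follow-up."""
    answers = {m: query_model(m, question) for m in models}
    tally = Counter(answers.values())
    majority_answer, votes = tally.most_common(1)[0]
    return {
        "answers": answers,
        "majority": majority_answer,
        "unanimous": votes == len(models),
    }

# Usage idea: run a batch of factual prompts, keep only the questions where
# the models disagree, and verify those by hand against primary sources.
```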
I can do a Google Scholar search to find the articles in question, if articles are cited, and make sure that those articles actually exist.
If there are quotes supposedly from books, find the books where the quote appears. I'm no longer in the academy officially, so I don't have access to good libraries.
But there are ways. I can go on the Tor browser and find a PDF of just about any book I need if I want to check whether something is actually in there. And we find this elsewhere too; there are urban legends that appear. I see quotes ascribed to Augustine or Aristotle or Plato which are just made up. I remember one quote from Plato, and I thought, yeah, this sounds really good, but that sure doesn't sound like him. I read Plato in the original Greek in high school, I'm a trained philosopher, and I love his dialogues, but it's 1,400 pages of dense text. So I got PDFs, two or three separate volumes of translations of his works, and I did a search for that term and related terms and could not find it. So I confirmed that it doesn't exist in his writings. But sometimes that can be hard to do. I think there's also just asking further questions of the chatbot: okay, you say that this appeared here, but give me more details, where exactly, what's the page number? That can help with confirmation. But yeah, I think it is a challenge to confirm what these systems are giving you.
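The "search the full text yourself" step described above, checking whether a supposed Plato quote actually occurs anywhere in the translations you have on hand, can be automated. The sketch below assumes you already have plain-text copies (or extracted PDF text) of the works in a local folder; the folder name and the example quote are illustrative only, and a miss is grounds for suspicion rather than proof of fabrication, since wording varies across translations.

```python
import re
from pathlib import Path

def normalize(text: str) -> str:
    """Lowercase, strip punctuation, and collapse whitespace so minor
    formatting differences don't hide a genuine match."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", text.lower()).split())

def quote_appears_in(quote: str, library_dir: str = "plato_translations") -> list:
    """Return the filenames in which the quoted phrase occurs after
    normalization. An empty list means the quote was not found verbatim."""
    needle = normalize(quote)
    hits = []
    for path in Path(library_dir).glob("*.txt"):
        if needle in normalize(path.read_text(errors="ignore")):
            hits.append(path.name)
    return hits

# Example with a line commonly misattributed to Plato:
# print(quote_appears_in("Be kind, for everyone you meet is fighting a hard battle"))
```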
[00:12:30] Speaker B: And honestly, I mean I study technology and just its impact on humanity. And I'm just so concerned that we're, you know, gonna erode our own cognitive function the easier it gets.
[00:12:42] Speaker A: I'm not as pessimistic as you are, or seem to be, because these technologies are always two-edged swords. I remember watching for years as chess computers got stronger, but for a long time I doubted they would ever match the very best human players. And then they matched them, and then blew them away. So the best human world champion cannot compete with the best machine. But as a consequence of these machines, humans have gotten much, much stronger as chess players, and competition among humans has become very interesting. It might be argued that this is the golden age of chess, in terms of the number of grandmasters, the number of people playing, and also the different types of play. It's not just the tournament level where you give yourself a lot of time, but 10-minute chess, 5-minute chess, whatever.
So I think the computer has actually helped make chess better and make chess players stronger. And I think a similar thing can happen with ChatGPT, Bard, Claude, Grok, if we use them judiciously. It's one thing if you ask it simply to define a word, but what if you say, okay, test my vocabulary knowledge, give me 10 words that you think are pretty difficult, and let me see how I do. If I only get three of them, that was probably too hard; okay, make it a little simpler, test where I am. Okay, now give me 10 words a day which will stretch me so that I can become better. You can pose all this, and it'll give you multiple-choice tests, and you can make yourself stronger. You can do math flashcards, presumably, with multiplication and division and whatnot. Don't quote me on that, I haven't tried it. But I have tried this little exercise with vocabulary.
So it's one thing if you're given an essay composition to do and you just let ChatGPT write it for you, but it's another thing if you use it to improve your skill set.
If you say, okay, I'm supposed to write something on this, okay, here's a paragraph, okay, how could I have written this better?
And I think it can give you feedback, useful feedback. It's not always good feedback. I mean, I did one experiment where I gave it one of the most sublime passages in the English language that I know, from "The Dead" in James Joyce's early work, Dubliners.
And I said, okay, improve this, write this better. And so it rewrote it. Then I said, why do you think it's better?
And then basically I said, okay, but this is James Joyce. How come you didn't recognize this as James Joyce? How could you improve on it? And it backtracked and said, okay, yeah, you're right, I messed up there.
But I think with certain basic stylistic matters, or just "evaluate this in light of Strunk and White's The Elements of Style," it probably could do that. So I think there are lots of ways we could get these systems to help us be better. I've been told, and I'm not a golfer, but I'm told that after the first, I don't know, 20, 30, 50 hours of golf play, most golfers don't improve, and I suspect it's because they don't do the sort of hard practice that makes you improve. It's a well-known fact of performance psychology that the type of practice that makes you better is not where you play in your comfort zone, but where you play in a challenge zone, where you're challenged, not in a panic zone, which demands so much of you that you just totally fall apart. And I think we can get ChatGPT to challenge us, but it needs to be used that way. So that's the good side of the two-edged sword. The bad side is that you can just get lazy and use it to get by. And I think this is where education is going to have to change. Part of it is going to mean getting rid of a lot of the administrators and having people who are actually with you in the classroom, monitoring what you're doing in real time, evaluating what you're doing, and making sure that you're actually doing real work. I think that's possible. So I think there's an opportunity here for improving education in a huge way, but there's also an opportunity for making education pathetic, and I think we'll probably see both.
[00:18:07] Speaker B: Yeah, well, I sure appreciate that optimism. You make some great points. Well, one key difference that you discuss between humans and AI is that we engage with a physical world as well as a world of abstractions, and our knowledge of the world comes from both.
Now, AI only exists in that world of abstraction, that world of words and concepts. And because of that you say LLMs have no substantive connection to truth.
That's a powerful insight. Now, you illustrate this with a lighthearted hypothetical story. You say the sentence "Alan stole Betty's purse" is true if the people referred to here, Alan and Betty, exist, if Betty had a purse, and if Alan actually stole it. Those are things we can confirm as true or not. But if a large language model is trained on data that says Alan stole Betty's purse and also that Alan was framed for the theft, how does it decide what to tell people? What kinds of questions does this raise about trusting AI?
So you came up with some of these interesting questions that I thought were useful to ponder. You know, who's training the AI? Who decides what data to train it on? Who trains the trainers?
Is it important to be thinking about those things?
[00:19:25] Speaker A: Well, yeah, I think it is.
You know, it's interesting listening to you, because I use these models a lot, but where I use them these days tends to be more as a kind of universal synonym finder. Whereas I used to go to a thesaurus to substitute words, now I'll use them to rewrite text and try to get ideas for making my text smoother and better.
So to say they don't have a substantive connection to the world, I think that's right. It's the sort of argument I've made about materialism: what is it about the collocation of atoms in your brain that tells you that something in the world is true? Well, I think there's a closer connection with truth with these large language models, because they are going to be trained on a lot of knowledge that we have. Knowledge is what some philosophers will call justified true belief, so a lot of the data is going to be true, and so there is going to be a connection with truth in a lot of the things that are said. So I think I'd probably back off on that a bit.
But that said, if you're trying to verify some particular claim about the world, you'd better hope that the things that would make that claim true, the sorts of subsidiary claims, are baked into that large language model and are getting proper attention, and you never have a guarantee about that. So that's the thing. Now, I don't know if we're ever going to come to the point where you might say, well, humans themselves are fallible, so do we get to the point where the fallibility of these large language models actually ends up being less than ours, that they get more things right than we do? I don't think they do, or at least I think there are these sorts of systemic places where they fail, in my experience.
But I think then there's also the question, are we going to keep the human in the driver's seat?
Are we going to put the large language model in the driver's seat? And this then also gets to your point about interacting with the world. Large language models aren't so much interacting with the world as with the entire colloquy of language that they've been trained on. But think, for instance, of different self-driving programs. They're going to be trained on a lot of data about how these systems should drive, what sorts of obstacles they're facing and whatnot. So there does seem to be more of a world connection there that artificial intelligence, not large language models but artificial intelligence more broadly, is trying to grapple with.
So I'm not sure where it's all going to go, or how good these systems will get. I don't believe that they're going to achieve artificial general intelligence. I don't see these large language models getting there; I see fundamental limitations in what they're doing and what they can do. I think there are theoretical reasons. You brought up me being a philosopher and mathematician; one of my main results is this law of conservation of information, where basically the idea is that if you get information output from a system capable of search, and these are really, in the end, search algorithms, then the prior input needs to be at least as great. So there's going to be a fundamental limitation on the creativity of these systems.
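For readers who want the formal shape of that claim, here is how the idea is usually expressed in Dembski and Marks' published work on "active information"; the notation below summarizes that framework rather than anything derived in this conversation. Let p be the probability that a blind, unassisted search finds the target and q the probability that the assisted search finds it:

```latex
I_{\Omega} = -\log_2 p, \qquad
I_{S} = -\log_2 q, \qquad
I_{+} = I_{\Omega} - I_{S} = \log_2 \frac{q}{p}
```

Here I_Omega is the endogenous information (the intrinsic difficulty of the problem), I_S the exogenous information (the difficulty remaining for the assisted search), and I_+ the active information the assistance supplies. The conservation claim is that obtaining a search that good itself costs at least I_+ bits of prior input, which is the sense in which the informational output cannot exceed what was put in.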
And we've found that these systems tend to fall apart if they're trained on themselves.
So the view of Aristotle about God was that God is the perfect being. God thinks about nothing other than God because for God to think about anything other than God is to be thinking about something lesser. And so God thinks about the highest things.
And so with these large language models, the idea is that if they're going to achieve artificial general intelligence, they're going to become self-contained, they're going to generate all this knowledge, and they're not going to need us anymore. What you find instead is that as they generate new text and then get trained on that text, the models themselves degenerate. They end up becoming worse at responding to things, worse over time. And it actually makes sense, because they're not getting anything original. They're, in a sense, regurgitating things and then having to find their sustenance in that regurgitation, if you want an analogy. There have been a number of articles written on that, and I think I've cited some of them in my work. I haven't looked at this lately, but I could easily get you several references. And George Montanez would be somebody to get on your program if you wanted to look at that in any depth.
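The degradation described here, often called "model collapse" in the literature Dr. Dembski alludes to, can be illustrated without any language model at all. The sketch below is the standard one-dimensional toy picture: each "generation" fits a simple distribution to samples drawn from the previous generation's fit, no fresh data ever enters, and the fitted spread drifts toward zero. It is an analogy for the real phenomenon, not a simulation of it.

```python
import random
import statistics

def next_generation(mu: float, sigma: float, n_samples: int = 100):
    """Sample from the current 'model', then refit the model to those
    samples. No new information enters, and sampling noise compounds."""
    samples = [random.gauss(mu, sigma) for _ in range(n_samples)]
    return statistics.mean(samples), statistics.pstdev(samples)

mu, sigma = 0.0, 1.0           # the original, data-fitted model
for generation in range(200):  # each generation trains on the last one's output
    mu, sigma = next_generation(mu, sigma)

print(f"fitted spread after 200 self-trained generations: {sigma:.3f}")
# Typically well below the starting value of 1.0: the model has narrowed onto
# a caricature of the data it began with, analogous to a language model growing
# blander and less accurate when trained on its own generated text.
```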
[00:24:47] Speaker B: Yeah, I should, I should pick his brain on, on these topics as well.
[00:24:51] Speaker A: Yeah.
[00:24:51] Speaker B: Well, just as we close, I'd like to bring it back to intelligent design.
Just yesterday, in fact, in preparation for our discussion, I asked ChatGPT if ID is a valid scientific hypothesis. I just wanted to see if things were a little different than when you were looking at this a year and a half ago, and it summed up its answer this way: Is ID scientific? It depends on your definition of science. If science includes inference to the best explanation and the detection of patterns that suggest intelligence, then ID could be considered a scientific hypothesis. If science requires naturalistic mechanisms, testability, and falsifiability in a strict sense, then ID struggles to meet those criteria.
Now, you wrote about AI possibly leveling the playing field, but again, it's been about a year now. Do you still think AI has the potential to do that for ID? Have you seen any evidence of that playing out?
[00:25:46] Speaker A: Well, it's funny, because what you just read is actually quite encouraging, since your prompt really didn't try to skew it in any one direction. I was dismayed that it said that if testability is important, that's going to be a problem for intelligent design, because the sorts of patterns and inference to the best explanation, to my mind, imply testability. But still, that it drew this sort of distinction between how you view science, I thought that was instructive and actually very healthy. I mean, the very term "science" as we use it these days, for the systematic study of nature, is a usage that's only come about in the last 150 years or so. Before that it was natural philosophy.
People spoke about natural philosophy back in antiquity, so people have been doing natural philosophy, or science, for a long time. And the metaphysical underpinnings of science, what science is ultimately about, what exists that science can tell us something about, that's been disputed, and at least that came out in the answer. So I don't think it was a perfect answer, but I think it's way ahead of what we see in Wikipedia. That to me is encouraging. And then in terms of leveling the playing field, it's funny that I used that phrase; I think I was just waxing expansive then, and I'm not sure I'd put it quite in those terms. But I think, properly prompted, these systems can give us a lot of useful information, and we can limit the bias. At least that's the sense I have, and that may change. If the people behind these systems decide, hey, we're going to blacklist intelligent design, we're going to make sure that it's discredited, then I think we may see a change. And we're always reverse engineers with these systems. We don't get to look under the hood. We don't get to see what the actual engineers are doing. So that's why we have to be prompt engineers; we're reverse engineers. We try to see what's going on there, what they're doing with the data, or potentially doing with the data, what it's telling us, and whether it makes sense. Is it bias-free, or at least more bias-free than we might expect otherwise? I think these are the sorts of questions we need to ask.
[00:28:27] Speaker B: Well, and I think that was another benefit of your exercises and your writing on this, telling about your conversations, as you showed, blow by blow, sometimes 12 or 13 attempts to go back and forth. That shows that, hey, we've got to work at this. We've got to be those investigators, be those questioners who can push and prompt and massage and really get to the bottom of things.
Even when we're using AI, don't just take the first thing it says and run with it, you know?
[00:29:01] Speaker A: Right.
For me, I would have to say it's a lot more fun dealing with ChatGPT and figuring out ways to prompt it and elicit information than getting on Wikipedia as an editor punching in a correction and then within minutes seeing it removed.
[00:29:23] Speaker B: Yeah, you're right there.
Well, Dr. Dembski, I really appreciate your time today. This has been a great conversation.
We'll have you back on soon. I want to talk about Being as Communion. I think that's an awesome book. And you've also got a monograph coming out soon along the same lines. So we'll definitely have you back soon to talk more.
[00:29:44] Speaker A: Well, please, please call me Bill. It sounds so formal. We've known each other I don't know how many years now, and I just know you as Andrew, so.
[00:29:51] Speaker B: Yeah, yeah. Well, I guess I do it for the benefit of those tuning in. But Bill it is now. If you, listeners and viewers, want to dive into more of Bill's work, I'd recommend a couple of places. First, the recently released, fully updated and revised second edition of The Design Inference.
He co-authored the second edition with Winston Ewert, and that's going to give you a firm grasp of the statistical model for detecting design that Dr. Dembski, Bill, developed back in the late 90s. And for other articles on a range of topics, including education, intelligent design, and technology,
you can find them at billdembski.substack.com. I believe that's a great place to go to get his regular writing: billdembski.substack.com. Well, for ID the Future, I'm Andrew McDermott. Thanks for joining us.
[00:30:48] Speaker A: ID the Future, a podcast about evolution and intelligent design.