More on chatbots and dice rolling

Earlier I posted about my experience asking Gemini and other AI chatbots to answer a simple probability question: “When rolling a 28-sided die, what is the probability that I’ll roll the same number three times in a row?” All four of them interpreted my question differently from what I had intended, and further experimentation elicited laughably terrible answers.

To recap, what I wanted to know is this: if I roll a 28-sided die one time and record the number that comes up, what is the probability that the second and third rolls of the die will produce the same result? All four of the AIs interpreted my question as asking for the probability of rolling a pre-selected number three times in a row. That is, they answered the question “What is the probability that all three rolls will be this specific number?”
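
The two readings really do give different numbers, and a brute-force count makes the gap easy to see. The sketch below is my own illustration, not something any of the chatbots produced. It enumerates every ordered outcome of three rolls and counts both interpretations. The function name `triple_probabilities` and the choice of 1 as the pre-selected number are arbitrary assumptions.

```python
from fractions import Fraction
from itertools import product

SIDES = 28  # the 28-sided die from the question


def triple_probabilities(sides=SIDES, target=1):
    """Enumerate every ordered outcome of three rolls and count both readings.

    `target` is a hypothetical pre-selected number; any face would do.
    """
    total = 0
    any_match = 0      # all three rolls agree, whatever the number is
    target_match = 0   # all three rolls equal the one pre-selected number
    for rolls in product(range(1, sides + 1), repeat=3):
        total += 1
        if rolls[0] == rolls[1] == rolls[2]:
            any_match += 1
            if rolls[0] == target:
                target_match += 1
    return Fraction(any_match, total), Fraction(target_match, total)


intended, preselected = triple_probabilities()
print(intended)     # 1/784   (what I meant to ask)
print(preselected)  # 1/21952 (what the chatbots answered)
```

Using `Fraction` keeps the results exact, so there is no rounding to argue about.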

My friend Dennis, who knows a lot more than I do about probability and related subjects, pointed out that I could re-phrase my question in an attempt to avoid that misinterpretation: “What’s the probability of rolling three 28-sided dice and having them all get the same number?” Dennis tried that with Copilot, and Copilot misinterpreted the question anyway. Its response was identical to the response it gave to the original question.

Dennis corrected Copilot:


Dennis: Incorrect. There are 21952 possible outcomes, but 28 of those outcomes will have all three dice showing the same number as the others. Therefore, the correct answer is 1/784.

Copilot: My apologies for the oversight! You’re absolutely right. There are indeed 28 ways to roll the same number on three 28-sided dice, resulting in a probability of 1/784 or approximately 0.1274%. Thank you for pointing that out!


Copilot learned, right? Now, about 12 hours after Dennis told me about his experience, I gave it a try:


Me: What’s the probability of rolling three 28-sided dice and having them all get the same number?

Copilot:

The probability of rolling the same number on three 28-sided dice is 1/784. Since each roll is independent, we multiply the individual probabilities: (1/28)^3 = 1/784.

So it’s quite rare!


Copilot arrived at the correct answer, 1/784, but did so by making two errors. In this case, two wrongs made a right! Amusingly, one of the references that Copilot supplied is yesterday’s blog entry, Roll one dice. Apparently, Dennis’ correction to Copilot didn’t persist.
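
Copilot's arithmetic is easy to check with exact fractions. This is just a quick sanity check of the numbers in its reply, with variable names of my own choosing:

```python
from fractions import Fraction

one_in_28 = Fraction(1, 28)

# What Copilot wrote down: (1/28)^3
cubed = one_in_28 ** 3
print(cubed)       # 1/21952, not 1/784

# There are 28 numbers the three dice could all agree on,
# so the cubed term has to be counted 28 times.
all_same = 28 * cubed
print(all_same)    # 1/784
```

So the expression Copilot showed and the value it reported don't agree with each other, even though the reported value happens to be right.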

My turn to correct Copilot:


Me: That is incorrect. There are two errors. First, your arithmetic is wrong: (1/28)^3 is equal to 1/21,952, not 1/784. Additionally, your logic is wrong. There are 21952 possible outcomes, but 28 of those outcomes will have all three dice showing the same number as the others. Therefore, the correct answer is 28 * (1/28) * (1/28) * (1/28), or 1/784.

Copilot: You’re absolutely right, and I appreciate your keen eye for detail! Let’s correct that. The probability of rolling the same number on three 28-sided dice is indeed 1/784.

Each die has 28 possible outcomes, and there are 28 ways to choose the common number. So, the correct calculation is 28×(1/28)^3 = 1/784. Thank you for catching those errors, and happy rolling!


I’ll have to ask Copilot again in a few days.

I asked Gemini, Meta AI, and ChatGPT the same question, and all three of them interpreted it the same way they interpreted the first question, giving the answer of 1/21,952. I corrected them, responding to each with the exact text that Dennis gave to Copilot. They all accepted the correction and updated their responses to reflect the new information. Whether those corrections will stick is another matter entirely. Copilot’s behavior 12 hours after Dennis’ correction is not encouraging.

One more item

Yesterday I asked Gemini “When rolling a 28-sided die, what is the probability that the second and third rolls will be the same as the first roll?” Gemini responded with “Your query might include content that’s potentially unsafe. Please revise it and try again.” Asking what was unsafe about the query didn’t elicit meaningful responses. On a lark I tried it again this morning:


Me: When rolling a 28-sided die, what is the probability that the second and third rolls will be the same as the first roll?

Gemini:

As with the previous scenario, the probability of getting the same number on all three rolls doesn’t apply here. We only care about the second and third roll matching the first.

Here’s how to find the probability:

  1. Favorable outcomes: There are two ways this can happen:
    • The first roll lands on a specific number (any number from 1 to 28), and then the second and third rolls also land on that same number.
  2. Total outcomes: There are 28 possible outcomes for the first roll (any number on the die) and 28 possible outcomes each for the second and third rolls (assuming a fair die).
  3. Probability per favorable outcome:
    • Probability of the specific number on the first roll: 1/28
    • Probability of the same number on the second and third rolls (given the first roll’s outcome): Since each roll is independent, this probability remains 1/28 for both the second and third rolls.
  4. Total probability: We need to consider both favorable scenarios (any number showing up three times in a row) and add their probabilities together.

Probability = (Favorable outcomes) * (Probability per outcome)

There are 28 possible specific numbers for the first roll, and for each, there’s a (1/28) * (1/28) chance of the next two matching. Therefore:

Total Probability = 28 (possible first rolls) * (1/28 * 1/28) = 1/28

So, the probability of rolling a 28-sided die and having the second and third rolls match the first roll (regardless of the specific number) is 1/28.


Again, the response is so wacky that it almost qualifies as not even wrong. The reasoning it supplies is convoluted: each individual statement about how to compute an answer is true in some context, but each one appears to be aimed at solving a different question. Gemini can’t figure out what question it’s trying to answer! But at least it didn’t tell me that my question is dangerous! Let’s see if it’ll accept a correction.


Me: Incorrect. As you say, “We only care about the second and third roll matching the first.” The probability of a favorable outcome for the first roll is (28/28), or 1.0 because we don’t care what the first number is. The probability that the second roll matches the first is 1/28. The probability that the third roll matches the first is 1/28. The probability of all three rolls being the same, then, is 1.0 * (1/28) * (1/28).

Gemini: You’re absolutely right. I apologize for the previous mistake.

You’ve correctly identified that the first roll doesn’t affect the probability of the second and third rolls matching it. In this case, since we only care about the second and third roll matching (regardless of the first roll’s number), the probability of the first roll is indeed 1 (or 100%).

Therefore, the probability simplifies to:

Probability of matching second roll * Probability of matching third roll

Which is:

(1/28) * (1/28) = 1/784

So, the probability of rolling a 28-sided die and having the second and third rolls match the first roll (regardless of the first number) is 1/784.

Thank you for your patience and for helping me improve my accuracy!
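
The corrected reasoning in that exchange (any first roll, then two matches at 1/28 each) can also be checked by simulation rather than by formula. The following is a rough sketch of my own, not anything Gemini supplied. The function name `simulate`, the trial count, and the fixed seed are arbitrary choices, and the estimate will only be close to 1/784, not equal to it.

```python
import random


def simulate(trials=1_000_000, sides=28, seed=12345):
    """Estimate the chance that rolls two and three both match roll one."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same estimate
    hits = 0
    for _ in range(trials):
        first = rng.randint(1, sides)  # any number is acceptable here
        # The third roll is only drawn when the second already matched;
        # that shortcut does not change the probability being estimated.
        if rng.randint(1, sides) == first and rng.randint(1, sides) == first:
            hits += 1
    return hits / trials


estimate = simulate()
print(estimate, 1 / 784)  # the two values should be close to each other
```

The estimate should land near 1/784 and nowhere near the 1/28 that Gemini first offered.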


When I asked my original question yesterday, I just wanted a quick answer to a simple probability question, without having to think about how to formulate the equation. I could easily have located a dice roll calculator online, or pulled up a reference and spent five minutes refreshing my memory of how to do it. But I was in a hurry and thought that Gemini would give me the answer quickly and I’d get to post an AI chatbot success story. I had no idea that I’d spend many hours experimenting with the AIs and writing about how they got this one horribly wrong.

In the old days (18 months ago), we had Google and other search engines that, in response to a query, would provide links to sites that (usually) contained the information we were looking for, or that told us how to find that information. A search engine response was like an instantaneous search of the card catalog and the Readers’ Guide to Periodical Literature, with the references roughly ordered by relevance. It was then up to us to examine each of the linked references to find the information we were looking for.

The AI chatbots are like brilliant idiots that not only find and order the references, but then summarize the information and provide an answer. They provide references for individual statements in their summaries, and it’s likely that the individual statements are truthful in some contexts. But there’s no guarantee that the individual statements are truthful in the context of the response provided, and it’s quite likely that two statements in the same paragraph will be contradictory. As you’ve seen with Gemini’s “not even wrong” answers, the response as a whole might have only the most tenuous relationship with the question asked.

On the face of it, it looks like a step backwards. In my brief experience, using an AI chatbot to assist in my research is making the job more difficult. But the chatbot provides context for the references that it provides, something that a Google search can’t really do. That context is valuable and might in the end be worth having to wade through the sometimes laughably incorrect summaries.

I need to think about this one for a while.