Randall Ingermanson's Entropy Tests And A Low-Density Bible Code
Author: Keith York
This article is property of the author and may not be reprinted or distributed without permission
Posted September 9, 2000
Dr. Randall Ingermanson, a Ph.D.-level physicist, has written a book entitled "Who Wrote The Bible Code?" (published by WaterBrook Press, a division of Random House Inc., 1999). In it he uses a series of tests (to be described shortly) to conclude that "There is no Bible code" (ibid., p. 137). However, in a "Note Added In Proof" on page 177, he modifies this claim somewhat. He writes that his results still allow for the possibility of a "sparse Bible code", although my impression is that he does not believe that a sparse Bible code exists either. What he calls a "sparse Bible code" I will call a "low-density Bible Code". While it is my belief that Randall has made some important discoveries that impact our understanding of the Bible Code phenomenon, I believe that a "sparse" or "low-density" Bible Code can still contain enormous amounts of deliberately encoded information. This paper explains and demonstrates that statement.
A digram is two letters, one of which follows the other. For example, "example" contains the following digrams -- ex, xa, am, mp, pl, and le. Any alphabetic language (be it Hebrew, English, or another) has characteristic digram frequencies for lengthy texts. For example, in English 'th' is the most commonly occurring digram, occurring in words such as then, think, monthly, and south. (In a similar way, trigrams are groupings of three letters. The trigrams in "example" are exa, xam, amp, mpl, and ple.) If a lengthy text is thoroughly and randomly scrambled, however, the frequency of each digram in the text will approach the product of its two letters' frequencies, rather than the frequency characteristic of the language. For each digram, there will thus be a difference between the natural digram frequency and the scrambled digram frequency. (Likewise, for each trigram there will be a difference between the natural trigram frequency and the scrambled trigram frequency.)
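The digram and trigram definitions above can be illustrated with a short Python sketch. The function names are hypothetical and chosen only for this illustration, and the "scrambled" frequency is the simple product-of-letter-frequencies approximation described in the text, which holds only for lengthy texts.

```python
from collections import Counter

def ngrams(text, n):
    """Return the overlapping n-letter groups in text, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def digram_frequencies(text):
    """Observed relative frequency of each digram in text."""
    counts = Counter(ngrams(text, 2))
    total = sum(counts.values())
    return {digram: count / total for digram, count in counts.items()}

def scrambled_digram_frequency(text, digram):
    """Approximate frequency of a digram after thorough scrambling:
    the product of the two letters' individual frequencies."""
    letters = Counter(text)
    n = len(text)
    return (letters[digram[0]] / n) * (letters[digram[1]] / n)

print(ngrams("example", 2))  # ['ex', 'xa', 'am', 'mp', 'pl', 'le']
print(ngrams("example", 3))  # ['exa', 'xam', 'amp', 'mpl', 'ple']
```

Comparing digram_frequencies(text)[d] with scrambled_digram_frequency(text, d) for a long text gives the per-digram difference that the paragraph describes.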
When a given text is examined at a particular skip distance, a skip-text is formed. This skip-text's digram and trigram frequencies can be statistically analyzed to determine whether it is more like a meaningful text or more like a scrambled text. This can be done for each skip distance in a particular range (for example, the graphs in "Who Wrote The Bible Code?" show analyses for skip distances 2 to 150).
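As a rough illustration of what examining a text "at a particular skip distance" can mean, the sketch below pairs each letter with the letter a fixed number of positions later. This is one plausible construction, stated here as an assumption; it is not necessarily the exact procedure used in "Who Wrote The Bible Code?", and the function names are hypothetical.

```python
from collections import Counter

def skip_strands(text, skip):
    """Split text into `skip` strands; strand k holds the letters at
    positions k, k + skip, k + 2*skip, and so on."""
    return [text[k::skip] for k in range(skip)]

def skip_digrams(text, skip):
    """Count the letter pairs that sit exactly `skip` positions apart.
    These are the digrams found inside the strands of skip_strands()."""
    return Counter(text[i] + text[i + skip] for i in range(len(text) - skip))

print(skip_strands("abcdefgh", 3))  # ['adg', 'beh', 'cf']
print(sorted(skip_digrams("abcdefgh", 3)))  # ['ad', 'be', 'cf', 'dg', 'eh']
```

Running skip_digrams for every skip distance in a range (say 2 to 150) would give one digram table per skip distance, each of which could then be compared against the scrambled expectation.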
The hypothesis that Dr. Ingermanson was testing can be called a "high-density Bible code". This is the idea that the Bible is chock full of equidistant letter sequences (ELS's) about different people, places, dates, and events. If this is the case, then whatever skip distance one examines, one should see more meaningful words as ELS's than one would expect by chance. To illustrate with English, say that one has a list of 2000 "words". Half are meaningful, such as "large" and "things", whereas the other half consist of the same 1000 words with their letters scrambled in such a way as to make them meaningless, such as "rgela" and "sgitnh". If one examines a lengthy text over all possible skip distances (2 and up), one should expect the meaningful word "large" and the meaningless word "rgela" to occur a similar number of times because they consist of the same letters. However, if the "high-density Bible code" hypothesis is correct, then meaningful words should occur significantly more often as ELS's in the Hebrew Bible than meaningless "words" consisting of the same letters. If this is true, then the digram and trigram frequencies of the range of skip-texts examined should be statistically distinguishable from those of an average scrambled text. This is what Dr. Ingermanson tested.
In chapter 13 of "Who Wrote The Bible Code?" Dr. Ingermanson explains some sensitivity tests which he performed. He concludes that if as little as 1 percent of the skip-texts' letters consisted of deliberately encoded ELS's, then one would be able to detect the phenomenon with the digram and trigram entropy tests. In later work (available at his web site www.rsingermanson.com), he claims that the tests would detect the phenomenon if as little as 0.3 percent of the skip-texts' letters consisted of deliberately encoded ELS's.
Does this prove that there is no Bible Code? No. But it does put an upper bound on the "density" of the Bible Code. Let me explain. Assume that God has placed deliberately encoded ELS's all throughout the Hebrew Bible (the Tanach or the Christian Old Testament) at skip distances from 2 to 10,000. Assume furthermore that at each skip distance an average of 0.1% of the letters are part of deliberately encoded ELS's. This is only one-third of the 0.3% minimum detection threshold that Dr. Ingermanson describes. There are just short of 1.2 million letters in the Tanach. Performing the calculation, 1.2 million x 10,000 x 0.1/100, shows that there would be approximately 12 million letters as part of deliberately encoded ELS's in the Tanach given the above assumptions. This is ten times the amount of data in the surface text of the Tanach itself. Conceivably every letter in the surface text of the Hebrew Bible could be part of several deliberately encoded ELS's and this would still be undetectable by digram and trigram entropy tests.
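The calculation above can be written out as a small Python sketch. The constants are the article's rounded figures (about 1.2 million letters and about 10,000 skip distances); the function and constant names are hypothetical.

```python
TANACH_LETTERS = 1_200_000   # "just short of 1.2 million", rounded up
SKIP_DISTANCES = 10_000      # skip distances 2 to 10,000, rounded

def encoded_letters(density_percent,
                    text_letters=TANACH_LETTERS,
                    skip_distances=SKIP_DISTANCES):
    """Letters belonging to deliberately encoded ELS's, assuming the same
    average density (in percent) at every skip distance."""
    return round(text_letters * skip_distances * density_percent / 100)

letters = encoded_letters(0.1)
print(f"{letters:,}")            # 12,000,000
print(letters / TANACH_LETTERS)  # 10.0
```

The second figure is the ratio to the surface text: ten times the number of letters in the Tanach itself.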
If these assumptions are true, what conclusions would follow about the nature of the Bible Code? First of all, the claim made by some that the Bible has encoded details about every person who has ever lived would have to be false. There are currently over 5 billion people living on Earth. Billions more have lived in the past but are no longer alive. Deliberately encoded ELS's totaling 12 million letters cannot encode details about billions of individuals. Even using the 0.3% threshold found by Dr. Ingermanson means that there are at most approximately 36 million letters as part of deliberately encoded ELS's. If each individual had 12 letters devoted to him (name plus one identifying detail), then at most there could be 36 million divided by 12 = 3 million individuals encoded. This is less than 0.1% of Earth's current population. Likewise, one would not expect to be able to pick any story from the newspaper or nightly news program and find a valid Bible code array about that subject.
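The population comparison can be checked with a few lines of Python. The 5 billion figure is the article's conservative "over 5 billion", and the variable names are hypothetical.

```python
MAX_ENCODED_LETTERS = 36_000_000   # upper bound at the 0.3% threshold
LETTERS_PER_PERSON = 12            # name plus one identifying detail
WORLD_POPULATION = 5_000_000_000   # "over 5 billion", taken as a floor

max_individuals = MAX_ENCODED_LETTERS // LETTERS_PER_PERSON
share_percent = max_individuals / WORLD_POPULATION * 100

print(f"{max_individuals:,}")   # 3,000,000
print(f"{share_percent:.2f}%")  # 0.06%
```

Because the population figure is a floor, the true share is smaller still, which is consistent with the "less than 0.1%" stated above.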
Even so, returning to the 0.1% "density" assumption, 12 million letters is a lot of information. As noted, it is ten times as much information as is found in the entire Tanach. If one assumes that the average array or cluster consists of 30 letters of deliberately encoded ELS's (say five 6-letter terms or six 5-letter terms), then there could be 400,000 such arrays deliberately encoded by God in the Tanach. Of course, array sizes will vary. A few arrays or clusters may be very large, with 100 letters or more, while some may be quite small, at 20 letters or fewer. It is my experience and belief that the array or cluster with a few terms is much more the norm in the Bible codes than is the array or cluster with dozens of terms.
The above calculations are of course based upon an assumption of a 0.1% average "density" of deliberately encoded ELS's throughout the Tanach in the 2 to 10,000 skip distance range, an assumption that was made for the purposes of discussion and may not be true. Still, even if the average "density" is significantly less, there would still be enormous amounts of encoded information. A 0.01% average "density" still means 1,200,000 letters' worth of deliberately encoded ELS's, which is equal to the amount of data in the Tanach. A 0.001% average "density" means 120,000 letters' worth of deliberately encoded ELS's, which is 50% more data than contained in the surface text of Genesis. This average density could still produce 4000 30-letter arrays.
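The three density scenarios can be laid side by side with a short sketch. It reuses the article's rounded figures and its 30-letter average array; the names are hypothetical, and the densities themselves remain assumptions made for the purposes of discussion.

```python
TANACH_LETTERS = 1_200_000
SKIP_DISTANCES = 10_000
LETTERS_PER_ARRAY = 30  # e.g. five 6-letter terms or six 5-letter terms

def density_row(density_percent):
    """Return (encoded letters, ratio to the Tanach, 30-letter arrays)
    for an assumed average density given in percent."""
    letters = round(TANACH_LETTERS * SKIP_DISTANCES * density_percent / 100)
    return letters, letters / TANACH_LETTERS, letters // LETTERS_PER_ARRAY

for density in (0.1, 0.01, 0.001):
    letters, ratio, arrays = density_row(density)
    print(f"{density}%: {letters:,} letters, "
          f"{ratio:g}x the Tanach, {arrays:,} arrays")
# 0.1%: 12,000,000 letters, 10x the Tanach, 400,000 arrays
# 0.01%: 1,200,000 letters, 1x the Tanach, 40,000 arrays
# 0.001%: 120,000 letters, 0.1x the Tanach, 4,000 arrays
```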
While there are some additional technical considerations that have been omitted for the sake of brevity, my conclusions can now be listed. (1) Dr. Ingermanson's digram and trigram entropy tests do not disprove the existence of the Bible Code. (2) Dr. Ingermanson's digram and trigram entropy tests do place an upper bound on the amount of information that could have been deliberately encoded in the Bible. (3) This upper bound means that the "high-density" Bible Code hypothesis is false. (Dr. Ingermanson has informed me that it was this hypothesis that he tested, as this was the conception of the Bible codes that was being publicly advanced by many. Also, it is this hypothesis he refers to when he concludes in his book that "There is no Bible code".) Specifically, God has not encoded information (much less detailed information) about everyone who has ever lived. Nor has God encoded information (much less detailed information) about every newsworthy event that has ever occurred. (4) The conception of the Bible Code that fits the evidence is what I term a "low-density" Bible Code. God has encoded information about select individuals rather than every individual and about select historical events rather than every historical event. Even so, the upper bound empirically discovered by Dr. Ingermanson allows for the possibility that large amounts of information have been encoded, distributed throughout possibly thousands of different clusters or arrays. That is why I prefer the term "low-density" to "sparse". "Sparse" may give the impression that the Bible contains only a few arrays, an idea that I think is false. "Low-density" simply refers to the density of meaningful ELS's at any given skip distance. (5) Based upon my own experience as well as the idea of economy of use of available resources, it is my belief that the typical Bible code array consists of a few to several terms rather than dozens of terms.
There is the possibility of small numbers of clusters or arrays consisting of dozens of terms. However, I would strongly expect these to be the exception rather than the rule, and they should be expected only for individuals or events that are historically very important.
[I wish to publicly thank Randall Ingermanson of www.rsingermanson.com and Ed Sherman of www.biblecodedigest.com for reviewing the initial draft of this paper.]