The Minimum Threshold Of Plausibility
For An N-Term Array
(Abridged Non-Technical Version)
Author: Keith York
This article is property of the author and may not be reprinted or distributed without permission
Posted June 18, 2000
On April 22, 2000 I posted A Protocol For The Statistical Analysis Of Bible Code Arrays, a 3-part article describing a protocol for statistically analyzing Bible code arrays and illustrating the method on two example arrays. Since then I have refrained from using the protocol while it underwent a period of review and critique. While some of the individuals whom I asked to review the method never responded, some did, leading to a vigorous discussion. (It should be noted, however, that any errors or problems that remain are my responsibility, not theirs.) Now that the review period is finished, it is time to present the results. This is an abridged, non-technical version of the article, in which I have tried to keep the math and other technical discussions to a minimum. Readers who want the full treatment should consult the unabridged technical version.
No Cluster Analysis
One point on which the reviewers agreed was that the method rightly does not use cluster analysis as illustrated on Roy Reinhold's website. When I first posted the review of his statistical method, Roy had just given me the calculations for the cluster analysis of the Sid Roth life array. Even though I was not prepared to accept cluster analysis without question, I gave him the public benefit of the doubt since I did not see any errors in the arithmetic. Then, when I published my own protocol, I simply stated that my method would not be using cluster analysis.
Shortly after I became aware of Roy's cluster analysis, I discovered what I believe to be serious flaws in the method. Rather than publicly post these concerns, I first engaged him in e-mail discussion on the subject. Though I have made him aware of these concerns, I have not changed his mind and the cluster analysis is still used on his website. Thus I have decided to detail my concerns in the article Why Cluster Analysis Is Flawed.
No Negative R-Values
A Protocol For The Statistical Analysis Of Bible Code Arrays Part 3 contained the following quote. "If a report for an array includes matrix R-values or R(A') values that are negative numbers, this is the same as saying that those ELS's are expected to occur at least once in the matrix simply by mere chance. (In this case, you should examine the list of words whose R > 0 and ask yourself if only these words were considered with the central term, would the word list be judged to be strongly and definitely related?)" There is strong agreement on this statement and a belief that it should be more strongly emphasized. To state it more plainly, say that an array has fifty terms (in addition to the central term) but only six of these fifty terms have a matrix R-value or R(A') value (depending upon the method of analysis used) that is greater than zero. In this case, one DOES NOT have a 51-term array, but rather a 7-term array. Since a matrix R-value < 0 means by definition that the term is expected to appear at least once in the matrix in its skip distance range -d to +d, such an ELS cannot be considered as a candidate for being a valid element of that Bible code array. A 51-term array may look impressive, but if all but 7 terms are expected to appear by chance anyway, then the impressiveness of the array is much diminished.
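The counting rule just described can be sketched in a few lines of Python. This is only an illustration: the term names and R-values are hypothetical, and the helper `effective_terms` is not part of CodeFinder or of the protocol itself.

```python
# Hypothetical R-values for the non-central terms of an array (illustrative only).
term_r_values = {
    "term_a": 1.8,
    "term_b": -0.4,
    "term_c": 0.9,
    "term_d": -1.2,
    "term_e": 0.0,
}

def effective_terms(r_values):
    """Keep only the terms whose R-value is strictly greater than zero."""
    return {name: r for name, r in r_values.items() if r > 0}

kept = effective_terms(term_r_values)

# The central term is counted in addition to the surviving terms.
effective_n = len(kept) + 1

print(sorted(kept), effective_n)
```

Here five candidate terms shrink to two, so the array is treated as a 3-term array rather than a 6-term array.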
Matrix R-Values Or R(A')?
In my protocol I stated why I liked the R(A') approach. However, use of matrix R-values is also legitimate, and quicker to calculate since CodeFinder automatically gives the matrix R-values for each term. One need only state in one's article which approach is being used for a particular array.
Concerns About The Overall Matrix Odds
One area of concern among the reviewers is that the calculation for overall matrix odds may give results that are much too high. There are two reasons behind this concern.
The first reason is fundamental: a reported overall matrix odds for a particular array explicitly refers to ONLY the actual terms found in that array. There are two related concerns here. First, it may be that there are other terms (however few these may be) or alternate spellings that would convey the same or a similar message as the array actually in hand. If these other terms or alternate spellings were not actually searched for, then the overall matrix odds would still be valid. On the other hand, if these alternatives were searched for, but only the "successes" reported, then the calculated overall matrix odds are invalid. The overall odds for obtaining an array that is "equivalent" to the one actually seen would be lower than what has been calculated. Secondly, and just as relevant, other terms describing other aspects of one's chosen subject may have been searched for and likewise only the "successes" reported. This reporting of only "successes" and not "failures" is referred to as winnowing or "cherry-picking".
Saying this does not necessarily mean that arrays that individuals report have undergone such a winnowing process. It simply means that the potential for such winnowing to have occurred exists. The only way to be sure that such winnowing has not occurred is to be presented with a large set of rule-specified words that govern the search. (An example of a rule-specified search list is the "Great Sages" experiment. Here the names and appellations of rabbis whose biographical entries exceeded a set length in a particular Jewish reference work were searched for in Genesis, along with their Hebrew dates of birth and death in three standard formats.) To present a calculated overall odds for an array without such a rule-specified search list is to invite the suspicion (whether justified or not) that such an analysis is invalid because of the potential winnowing that might have occurred.
The second reason is that, since many people have the impression that the more terms there are in an array, the better the array must be, there may exist the temptation for the searcher to meet this expectation by packing an array with several marginal terms. These terms with positive but low R-values would increase the array's calculated overall matrix odds as each was added. However, since some of these terms might have a relatively high probability of occurring by mere chance, one may have an array of astronomically high calculated overall odds even though many of the terms are very questionable.
Is there a way to usefully analyze Bible code arrays given these particular concerns? Yes, as I will show below.
A Discussion Of Standard Deviations
Before introducing the concept I have developed to meet these concerns, I must first discuss the idea of standard deviations. The standard deviation is a term widely used by statisticians. In a normal or bell-shaped curve, 84.13% of all values lie below the +1 standard deviation level and 15.87% lie above it. An R-value corresponding to the +1 standard deviation level is 0.763. (For the full discussion, see the technical version of this article.)
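This abridged version does not derive the figure 0.763, so the following Python sketch is only one possible reading. It assumes that an R-value is the negative base-10 logarithm of a term's expected number of chance occurrences (consistent with R = 0 meaning one expected occurrence), and, as a further assumption not stated here, that chance occurrences follow a Poisson distribution. The technical version remains the authority on the actual derivation.

```python
import math
from statistics import NormalDist

# Fraction of a normal distribution lying above the +1 standard deviation level.
p_above = 1.0 - NormalDist().cdf(1.0)

# ASSUMPTION: chance occurrences are Poisson-distributed, so that
# P(at least one occurrence) = 1 - exp(-expected).
# Solve for the expected count that gives P(at least one) = p_above.
expected = -math.log(1.0 - p_above)

# ASSUMPTION: R = -log10(expected number of chance occurrences).
r_one_sd = -math.log10(expected)

print(round(p_above, 4), round(r_one_sd, 3))
```

Under these assumptions the result rounds to 0.763, matching the value quoted above.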
Assume that one has an array with n terms, where the central term is at a skip distance d and that the central term is only expected to occur once in the search text in the skip distance range of -d to +d. In this case, the text R-value of the central term (denoted R0) is 0. Assume also that the row split is one (thus Sdif X Smax = 1), and that each of the other terms has an R-value [whether R(A') or matrix R-value] corresponding to the +1 standard deviation level. Using Equation 1.2 from A Protocol For The Statistical Analysis Of Bible Code Arrays Part 1 the overall matrix odds for this array would be calculated as Antilog [R0 + R(sum)]/1 = Antilog [0 + 0.763(n-1)] = Antilog [0.763(n-1)]. If the minimum level of plausibility for an array to be considered a valid Bible code array is for the average term to have a +1 standard deviation level, then this would be the calculated overall matrix odds of an n-term array of minimum plausibility.
The Minimum Threshold Of Plausibility For An N-Term Array Introduced
This leads to the central concept of this article, the minimum threshold of plausibility. Simply stated, if an array is found to not meet this threshold, it is not considered plausible to call it a valid Bible code array. If an array is found to meet this threshold, it is considered plausible to call it a valid Bible code array. (Note that the array MUST ALSO meet the conditions laid out in the section "The Word List Revisited: Independently Verifiable Information".)
Recall Equation 1.2: Overall matrix odds = Antilog [R(sum) + R0]/(Sdif X Smax).
The minimum threshold of plausibility modifies Equation 1.2 by comparing an n-term array being tested (the test array) with an n-term array of minimum plausibility and poses a question. It does this by dividing the calculated overall odds for the test array by the calculated overall odds of an n-term array of minimum plausibility, Antilog [0.763(n-1)]. This is written as follows.
Minimum Threshold Of Plausibility Is Met (for n > 2) When:
Antilog [R(sum) + R0 - 0.763(n-1)]/S > 1
Definitions are as follows. R0 is the text R-value of the central term in the given search text (Torah or Tanach). R(sum) is the sum of all R(A') or all R-matrix values for every term in the array except for the central term. n is the number of terms in the array including the central term, and thus n-1 is the number of terms in the array excluding the central term. S is the row skip correction. (When only one array is under consideration, S is the row split of the central term. When two or more arrays sharing a common central term are under consideration, then S = Sdif X Smax, where Sdif is the number of different row splits seen in the arrays and Smax is the maximum row split seen in the arrays.)
Note the exception in the above formula: n must be greater than 2. A widely used criterion in scientific circles is that a result is not considered statistically significant unless the probability of it occurring by chance is 5% or less (or, in other words, 20:1 odds or more). The logarithm of 20 is approximately 1.3. When an array contains only two words, n-1 = 1. In this case, subtracting 0.763 X 1 from R(sum) + R0 does not satisfy the 20:1 rule. Therefore, for word pairs (n-1 = 1), 1.3 is subtracted from R(sum) + R0 instead. Thus,
When n = 2, the Threshold is met when Antilog [R(sum) + R0 - 1.3]/S > 1.
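Both cases of the threshold can be combined in one short Python sketch. The function names `threshold_ratio` and `meets_threshold` are illustrative, and the word-pair inputs in the usage lines are hypothetical values chosen only to show one pass and one failure.

```python
def threshold_ratio(r_sum, r0, n, s):
    """Ratio of a test array's odds to those of an n-term array of minimum plausibility.

    r_sum: sum of the R-values of all terms except the central term
    r0:    text R-value of the central term
    n:     number of terms, central term included
    s:     row split correction
    """
    if n < 2:
        raise ValueError("an array needs a central term and at least one other term")
    # Word pairs use 1.3 (about log10 of 20); larger arrays use 0.763 per non-central term.
    penalty = 1.3 if n == 2 else 0.763 * (n - 1)
    return 10 ** (r_sum + r0 - penalty) / s

def meets_threshold(r_sum, r0, n, s):
    """True when the ratio exceeds one, i.e. the threshold is met."""
    return threshold_ratio(r_sum, r0, n, s) > 1

# Hypothetical word pairs: the first clears the 20:1 rule, the second does not.
print(meets_threshold(1.5, 0.0, 2, 1))
print(meets_threshold(1.0, 0.0, 2, 1))
```

Keeping the ratio and the pass/fail test as separate functions makes it easy to report "X times an array of minimum plausibility" as well as a yes-or-no answer.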
Conceptually, this threshold only counts that portion of the R values which are above the +1 standard deviation level, and even then corrects for near-minimality and row split of the central term (with the additional provision that the overall calculated matrix odds would have to be at least 20:1). Note some important features of this threshold. First of all, one cannot simply add more terms of marginal quality to an array in an attempt to increase the overall calculated odds. As the number of terms increases, so does the value 0.763(n-1) to be subtracted from R(sum) + R0. Secondly, a term with an R-value of less than 0.763 will not contribute positively to the end result. If included, there must be one or more other terms with sufficiently high R-values to compensate. This is a deterrent to including low-quality terms (R < 0.763) in an array, even though such terms are formally allowed. Thirdly, as the central term becomes farther from near-minimality and/or its row split increases, the other terms' R-values must correspondingly increase to allow the threshold to be met.
This is important enough to warrant restating from a different angle. What features of arrays does this threshold deter? (1) It deters increasing the number of terms in an array to increase its perceived significance. As the number of terms increases, the bar which a test array must exceed is correspondingly raised. (2) It deters low-quality terms (i.e., ELS's with positive but low R-values). For a low-quality term to be included, other terms must be of high enough R-value to compensate. (3) It deters central terms which are far from near-minimality. Again, for such a central term to be included, other terms must be of high enough R-value to compensate.
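The deterrent effect on padding can be seen numerically in the following Python sketch. All R-values here are hypothetical, and the two helper functions are illustrative restatements of Equation 1.2 and of the threshold formula, with R0 = 0 and S = 1 assumed for simplicity.

```python
def overall_odds(r_values, r0=0.0, s=1):
    """Equation 1.2: Antilog[R(sum) + R0] / S."""
    return 10 ** (sum(r_values) + r0) / s

def threshold_ratio(r_values, r0=0.0, s=1):
    """Threshold formula: the same odds divided by the minimum-plausibility baseline."""
    n = len(r_values) + 1  # the central term is counted in n
    penalty = 1.3 if n == 2 else 0.763 * (n - 1)
    return 10 ** (sum(r_values) + r0 - penalty) / s

strong = [2.0, 1.5, 1.2]       # hypothetical high-quality terms
padded = strong + [0.3, 0.4]   # the same array padded with two marginal terms

print(overall_odds(strong), overall_odds(padded))
print(threshold_ratio(strong), threshold_ratio(padded))
```

Padding raises the overall matrix odds but lowers the threshold ratio, because each added term with R below 0.763 costs more than it contributes.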
Is this threshold simply a new way of stating overall matrix odds? No. Even though in the final analysis, arrays will be reported as being X times that of an n-term array of minimum plausibility, it is recognized that this is NOT an overall matrix odds but rather a comparison of an array with a predetermined standard. It is also recognized that the calculation is for that specific word list only, and that (as mentioned above) there may be other terms (however few these may be) or alternate spellings that would convey the same or a similar message as the array actually in hand. The only thing the threshold considers is whether or not this particular array under consideration can plausibly be called a valid Bible code array.
Thresholds Of Greater Than Minimum Significance
As shown above, 0.763 corresponds to an R-value at the +1 standard deviation level. Using the same basic concept, it is possible to devise even stricter thresholds corresponding to higher numbers of standard deviations (S.D.). These are covered in the technical version of this article.
The Threshold Illustrated
The minimum threshold of plausibility is now illustrated on the two arrays used as examples in A Protocol For The Statistical Analysis Of Bible Code Arrays Part 2.
First examine the Clinton impeachment array using the R(A') method, remembering that impeachment is a process and not a result. (President Clinton did undergo an impeachment trial in the U.S. Senate in the Hebrew year 5759, although he was not removed from office as a result.) For the Clinton impeachment array, R(sum) = 2.586 + 1.329 + 1.093 + 3.727 = 8.735. 'Clinton' at -9877 skip distance has an R0 value for the Torah of 0.108. Since 'Clinton' as the central term has a row split of 4 in the array, S = 4. Since there are 5 terms in the array, n = 5 and thus n-1 = 4. Therefore, Antilog [8.735 + 0.108 - 4(0.763)]/4 = Antilog (5.791)/4 = 618,016/4 = 155,000 to three significant digits. (All results in this article are reported to three significant digits.) Since this is much, much greater than one, the Clinton impeachment array easily passes the minimum threshold of plausibility for a 5-term array.
Does it pass any of the higher thresholds? Yes, as is shown in the technical version of this article.
For the Germany/Hitler array, R(sum) = 2.974 + 2.108 + 1.003 + 0.963 = 7.048 using the R(A') method. 'Germany' at -156 skip distance has an R0 value for the Tanach of -0.824. Since 'Germany' as the central term has a row split of 1 in the array, S = 1. Since there are 5 terms in the array, n = 5 and thus n-1 = 4. Therefore, Antilog [7.048 - 0.824 - 4(0.763)]/1 = Antilog (3.172) = 1490. Since this is much, much greater than one, the Germany/Hitler array easily passes the minimum threshold of plausibility for a 5-term array.
Does it pass any of the higher thresholds? Yes, as is shown in the technical version of this article.
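The two worked calculations above can be reproduced with a short Python sketch. The R-values, R0 values and row splits are those quoted in the text; the function names and the rounding helper are illustrative additions.

```python
import math

def threshold_ratio(r_values, r0, s):
    """Antilog[R(sum) + R0 - 0.763(n-1)] / S, with 1.3 substituted for word pairs."""
    n = len(r_values) + 1  # the central term is counted in n
    penalty = 1.3 if n == 2 else 0.763 * (n - 1)
    return 10 ** (sum(r_values) + r0 - penalty) / s

def round_sig(x, digits=3):
    """Round a positive number to the given number of significant digits."""
    return round(x, digits - 1 - math.floor(math.log10(abs(x))))

# Clinton impeachment array: R(A') values, R0 for the Torah, row split 4.
clinton = threshold_ratio([2.586, 1.329, 1.093, 3.727], r0=0.108, s=4)

# Germany/Hitler array: R(A') values, R0 for the Tanach, row split 1.
germany = threshold_ratio([2.974, 2.108, 1.003, 0.963], r0=-0.824, s=1)

print(round_sig(clinton), round_sig(germany))
```

Rounded to three significant digits, the results agree with the figures of 155,000 and 1490 reported above.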
The Word List Revisited: Independently Verifiable Information
In A Protocol For The Statistical Analysis Of Bible Code Arrays Part 3 I stated the following. "Thus guideline #1 for critical judgment in examining Bible code findings is: Use common sense and critical judgment when examining the word list for a Bible code array. Are the words strongly and definitely related to the subject of the array, or are they weakly and tenuously related? Do the words seem to be natural choices in describing the subject or do they seem unnatural, forced, vague, or of only minor relevance?"
This point needs re-emphasis. In the discussions which led to this article, one reviewer asked the following question. What if one has several high R-value terms (say in one's personal matrix) and then happens to find and includes a low R-value ELS such as "king of the world" that together with the other terms passes the minimum threshold of plausibility? Without an R-value cutoff, would the protocol presumably endorse such a silly conclusion as me or someone else being "king of the world"? Definitely not.
As the guideline above states, the words in an array must be definitely and strongly related to the subject of the array. Since I am not the king of the world, the hypothetical ELS would be meaningless in an array about myself even if it had the highest R-value of any of the ELS's.
A corollary to this guideline is that the information presented in a Bible code array must be independently verifiable. For example, since it refers to a historically documented past event, the array containing the terms 'Clinton', 'impeachment', '5759', 'USA', and 'Senate' presents information that is independently verifiable. The same can be said about the array containing the terms 'Germany', 'Nazi', 'Hitler', 'death', and 'in 5705'.
If one is presented with a personal matrix, the matrix must contain independently verifiable information. Though the names of an individual's wife, child, and father (for example) may not be public knowledge, they are terms that can be independently verified given the proper sources. Say though that an individual has the term 'Elijah' in his personal matrix and claims that he is the incarnation of Elijah promised in the latter days. The reader has the right to ignore this term. If someone truly is Elijah, it would have to be independently verified outside a Bible code array. Has he been reliably documented as having called down fire from heaven? Has he been reliably documented as saying on a particular date that he would cause the rains to cease in a given area for x number of days, and in fact it was reliably documented that the rains did cease for x number of days? While this is an extreme example, the point is clear.
For the same reasons, the future cannot be predicted using the Bible codes. Why? Simply stated, an event is not independently verifiable until after it has happened. To give an example, Michael Drosnin states on p. 73 of The Bible Code (Simon & Schuster, 1997) that he found an array containing the terms 'Prime Minister Netanyahu', 'elected', and 'Bibi' a week before the May 29, 1996 election. The calculation shows that this array passes the minimum threshold of plausibility. Does that mean that when Mr. Drosnin first found it that this was a valid Bible code array? Whether it was a valid Bible code array depended upon whether Netanyahu was or was not elected prime minister of Israel. Since he was elected, it was a valid Bible code array. However, here is the important point. Before the event occurred there was no way to know if the array was valid or not because an event cannot be independently verified until after it has occurred.
Unresolved Issues
The unresolved issues are discussed in depth in the technical version of this article.
Summary
The points made in this article are now summarized.
(1) Cluster analysis was not included in the original protocol, nor is it included in the revised protocol.
(2) Terms with negative matrix R-values or R(A') values should not be included in an array.
(3) Matrix R-values and R(A') are both legitimate ways of calculating term R-values, requiring only that one state which method is being used.
(4) The reporting of overall matrix odds will no longer be practiced due to the concerns described above.
(5) An array is not plausibly considered a valid Bible code array unless it passes the minimum threshold of plausibility.
(6) Even if an array passes the minimum threshold of plausibility (the +1 S.D. level), the thresholds corresponding to higher standard deviation levels should be checked as well. If an array does pass a higher threshold, it should be reported as such.
(7) Each array must be judged according to the number of terms present. Having more terms in an array does not always mean a higher significance for the array.
(8) Even if an array does pass the minimum threshold of plausibility, it is not plausibly considered a valid Bible code array unless (a) the terms are strongly and definitely related to the subject of the array, and (b) the information presented in the array is independently verifiable.
(9) A high matrix R-value does not necessarily mean that a term is a valid element of a valid Bible code array. It remains important that the terms be in close proximity.
The above is the result of the review process of the last several weeks. Since the field of Bible codes research is progressing, this threshold approach may need certain modifications in the future if the data warrant. However, I believe it to be rigorous enough to begin putting it into practice now.
Future Plans
Now that the protocol of statistical analysis has been reviewed and revised, resulting in this minimum threshold of plausibility, I hope to publish some more arrays in the near future using this criterion.