The Minimum Threshold Of Plausibility
For An N-Term Array
(Unabridged Technical Version)

Author: Keith York

This article is property of the author and may not be reprinted or distributed without permission

Posted June 18, 2000

 

On April 22, 2000 I posted A Protocol For The Statistical Analysis Of Bible Code Arrays, a 3-part article describing a protocol for statistically analyzing Bible code arrays and illustrating the method on two example arrays.  Since that time I have refrained from using it while it underwent a period of review and critique.  While some of the individuals whom I asked to review the method never responded, others did, leading to a vigorous discussion.  (It should be noted, however, that any errors or problems that remain are my responsibility, not theirs.)  Now that the review period is finished, it is time to present the results.  A less technical, abridged version of this article is also available.

No Cluster Analysis

One area that the reviewers agreed on was that it was good that the method does not use cluster analysis as illustrated on Roy Reinhold's website.  When I first posted the review of his statistical method, Roy had just given me the calculations for the cluster analysis of the Sid Roth life array.  Even though I was not prepared to accept cluster analysis without question, I gave him the public benefit of the doubt since I did not see any errors in the arithmetic.  Then, when I published my own protocol, I simply stated that my method would not be using cluster analysis.

Shortly after I became aware of Roy's cluster analysis, I discovered what I believe to be serious flaws in the method.  Rather than publicly post these concerns, I first engaged him in e-mail discussion on the subject.  Though I have made him aware of these concerns, I have not changed his mind and the cluster analysis is still used on his website.  Thus I have decided to detail my concerns in the article Why Cluster Analysis Is Flawed.

No Negative R-Values

A Protocol For The Statistical Analysis Of Bible Code Arrays Part 3 contained the following quote.  "If a report for an array includes matrix R-values or R(A') values that are negative numbers, this is the same as saying that those ELS's are expected to occur at least once in the matrix simply by mere chance.  (In this case, you should examine the list of words whose R > 0 and ask yourself if only these words were considered with the central term, would the word list be judged to be strongly and definitely related?)"  There is strong agreement on this statement and a belief that it should be more strongly emphasized.  To state it more plainly, say that an array has fifty terms (in addition to the central term) but only six of these fifty terms have a matrix R-value or R(A') value (depending upon the method of analysis used) that is greater than zero.  In this case, one DOES NOT have a 51-term array, but rather a 7-term array.  Since a matrix R-value < 0 means by definition that this term is expected to appear at least once in the matrix in its skip distance range -d to +d, that ELS cannot be considered as a candidate for being a valid element of that Bible code array.  A 51-term array may look impressive, but if all but 7 terms are expected to appear by chance anyway, then the impressiveness of the array is much diminished.
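The counting rule just described can be sketched in a few lines of Python.  This is only an illustration: the term names and R-values are hypothetical placeholders rather than data from any real array, and the helper name effective_terms is an assumption made for this example.

```python
def effective_terms(central_term, r_values):
    """Return the central term plus every term whose R-value is positive.

    r_values maps each non-central term to its matrix R-value or R(A').
    """
    kept = [term for term, r in r_values.items() if r > 0]
    return [central_term] + kept


# Hypothetical array: 50 terms besides the central term, only 6 with R > 0.
r_values = {f"weak{i}": -0.5 for i in range(44)}
r_values.update({f"strong{i}": 1.2 for i in range(6)})

array = effective_terms("central", r_values)
print(len(r_values) + 1, "terms reported;", len(array), "terms count")
```

Only the terms that survive this filter should be counted toward n in any later calculation.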

Matrix R-Values Or R(A')?

In my protocol I stated why I liked the R(A') approach.  However, use of matrix R-values is also legitimate, and quicker to calculate since CodeFinder automatically gives the matrix R-values for each term.  One should only state in one's article which approach is being used for a particular array. 

Concerns About The Overall Matrix Odds

One area of concern among the reviewers is that the calculation for overall matrix odds may give results that are much too high.  There are two reasons behind this concern.

Fundamentally, a reported overall matrix odds for a particular array explicitly refers to ONLY the actual terms found in that array.  There are two related concerns here.  First, it may be that there are other terms (however few these may be) or alternate spellings that would convey the same or a similar message as the array actually in hand.  If these other terms or alternate spellings were not actually searched for, then the overall matrix odds would still be valid.  On the other hand, if these alternatives were searched for, but only the "successes" reported, then the calculated overall matrix odds is invalid.  The overall odds for obtaining an array that is "equivalent" to the one actually seen would be lower than what has been calculated.  Secondly, and just as relevant, other terms describing other aspects of one's chosen subject may have been searched for and likewise only the "successes" reported.  This reporting of only "successes" and not "failures" is referred to as winnowing or "cherry-picking".

Saying this does not necessarily mean that arrays that individuals report have undergone such a winnowing process.  It simply means that the potential for such winnowing to have occurred exists.  The only way to be sure that such winnowing has not occurred is to be presented with a large set of rule-specified words that govern the search.  (An example of a rule-specified search list is the "Great Sages" experiment.   Here the names and appellations of rabbis whose biographical entries exceeded a set length in a particular Jewish reference work were searched for in Genesis, along with their Hebrew dates of birth and death in three standard formats.)  To present a calculated overall odds for an array without such a rule-specified search list is to invite the suspicion (whether justified or not) that such an analysis is invalid because of the potential winnowing that might have occurred.  

Secondly, since many people have the impression that the more terms that are in an array, the better the array must be, there may exist the temptation for the searcher to meet this expectation by packing an array with several marginal terms.  These terms with positive but low R-values would increase the array's calculated overall matrix odds as each was added.  However, since some of these terms might have a relatively high probability of occurring by mere chance, one may have an array of astronomically high calculated overall odds even though many of the terms are very questionable.

Is there a way to usefully analyze Bible code arrays given these particular concerns?  Yes, as I will show below.

A Discussion Of Standard Deviations

Before introducing the concept I have developed to meet these concerns, I must first discuss the idea of standard deviations.  The standard deviation is a term widely used by statisticians.  In a normal or bell-shaped curve, 84.13% of all values lie below the +1 standard deviation level and 15.87% lie above it.

The expected number of occurrences of an ELS in an array in a given skip distance range of -d to +d does not follow a normal distribution but rather a Poisson distribution.  Still, one can use the concept to calculate the value of R corresponding to the +1 standard deviation level.  Any R-value is log (1/Expected), or if E = expected number of occurrences, R-value = log (1/E).  To solve for E, calculate 10^-R (10 to the -R power), or equivalently, 1/(10^R) (1 divided by 10 to the R power).  Now for any value E, one can calculate from the Poisson distribution that the probability that there will be zero occurrences (call this P0) is e^-E, where e is Euler's number (approx. 2.7182818) and "^-E" means to the -E power.  The probability that there will be one or more occurrences (call this P1+) is simply 1 - P0.  The question now is what would E need to be to produce a P0 = 0.8413 probability of zero occurrences of an ELS in the -d to +d skip distance range in a matrix.  To solve for E in the equation e^-E = 0.8413, take the natural logarithm of both sides.  Thus -E = ln 0.8413 = -0.1728069, so E = 0.1728069.  The R-value corresponding to this is log (1/0.1728069) = 0.7624.  Rounding up to the next thousandths' place produces the solution of R = 0.763.  Thus the R-value which corresponds to the +1 standard deviation level is 0.763.
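The derivation above can be checked numerically.  The following is a minimal sketch using only Python's standard math module; the function names are chosen for this illustration and are not part of the original protocol.

```python
import math


def expected_from_p0(p0):
    """Solve e^-E = p0 for E, the expected number of occurrences."""
    return -math.log(p0)


def r_value(expected):
    """R-value = log10(1 / E)."""
    return math.log10(1.0 / expected)


def round_up_thousandths(x):
    """Round up to the next thousandths' place, as the article does."""
    return math.ceil(x * 1000) / 1000


p0 = 0.8413                      # probability of zero occurrences at +1 S.D.
e_val = expected_from_p0(p0)     # about 0.1728
r = r_value(e_val)               # about 0.7624
p1_plus = 1 - math.exp(-e_val)   # probability of one or more occurrences

print(f"E = {e_val:.7f}, R = {r:.4f}, rounded up = {round_up_thousandths(r)}")
print(f"P0 = {1 - p1_plus:.4f}, P1+ = {p1_plus:.4f}")
```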

An array that is not governed by a rule-specified word list cannot have a reported overall matrix odds without suspicion that "cherry-picking" has occurred.  However, we can set a threshold criterion for an array such that a matrix that meets this criterion is considered to be a plausible Bible code array.  Conversely, a matrix that does not meet this criterion is not considered to be a plausible Bible code array.

Assume that one has an array with n terms, where the central term is at a skip distance d and that the central term is only expected to occur once in the search text in the skip distance range of -d to +d.  In this case, the text R-value of the central term (denoted R0) is 0.  Assume also that the row split is one (thus Sdif X Smax = 1), and that each of the other terms has an R-value [whether R(A') or matrix R-value] corresponding to the +1 standard deviation level.  Using Equation 1.2 from A Protocol For The Statistical Analysis Of Bible Code Arrays Part 1 the overall matrix odds for this array would be calculated as Antilog [R0 + R(sum)]/1 = Antilog [0 + 0.763(n-1)] = Antilog [0.763(n-1)].  If the minimum level of plausibility for an array to be considered a valid Bible code array is for the average term to have a +1 standard deviation level, then this would be the calculated overall matrix odds of an n-term array of minimum plausibility.

The Minimum Threshold Of Plausibility For An N-Term Array Introduced

This leads to the central concept of this article, the minimum threshold of plausibility.  Simply stated, if an array is found to not meet this threshold, it is not considered plausible to call it a valid Bible code array.  If an array is found to meet this threshold, it is considered plausible to call it a valid Bible code array.  (Note that the array MUST ALSO meet the conditions laid out in the section "The Word List Revisited: Independently Verifiable Information".)

Recall Equation 1.2: Overall matrix odds = Antilog [R(sum) + R0]/(Sdif X Smax).

The minimum threshold of plausibility modifies Equation 1.2 by comparing an n-term array being tested (the test array) with an n-term array of minimum plausibility and poses a question.  It does this by dividing the calculated overall odds for the test array by the calculated overall odds of an n-term array of minimum plausibility, Antilog [0.763(n-1)].  This is written as follows.

Minimum Threshold Of Plausibility Is Met (for n > 2) When:  Antilog [R(sum) + R0 - 0.763(n-1)]/S  >  1

Definitions are as follows.  R0 is the text R-value of the central term in the given search text (Torah or Tanach).  R(sum) is the sum of all R(A') or all matrix R-values for every term in the array except for the central term.  n is the number of terms in the array including the central term, and thus n-1 is the number of terms in the array excluding the central term.  S is the row split correction.  (When only one array is under consideration, S is the row split of the central term.  When two or more arrays sharing a common central term are under consideration, then S = Sdif X Smax, where Sdif is the number of different row splits seen in the arrays and Smax is the maximum row split seen in the arrays.)

Note the exception to the above formula that n must be greater than 2.  A widely used criterion in scientific circles is that a result is not considered statistically significant unless the probability of it occurring is 5% or less (or, in other words, 20:1 odds or more).  The logarithm of 20 is approximately 1.3.  When an array contains only two words, n-1 = 1.  When this is the case, subtracting 0.763 X 1 from R(sum) + R0 is not enough to guarantee that the 20:1 rule is satisfied.  Therefore, for word pairs (n-1 = 1), 1.3 is subtracted from R(sum) + R0.  Thus,

When n = 2, the Threshold is met when Antilog [R(sum) + R0 - 1.3]/S > 1. 
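As a rough sketch, the two forms of the threshold can be combined into one Python function.  The function and parameter names are assumptions made for this illustration, and the sample R-values are hypothetical.

```python
def threshold_ratio(r_sum, r0, n, s, k=0.763):
    """Antilog[R(sum) + R0 - penalty] / S for an n-term array.

    r_sum: sum of R(A') or matrix R-values of the non-central terms
    r0:    text R-value of the central term
    n:     number of terms, including the central term
    s:     row split correction
    """
    if n < 2:
        raise ValueError("need a central term and at least one other term")
    # Word pairs must also satisfy the 20:1 rule, so 1.3 (about log 20)
    # is subtracted instead of k whenever k is below 1.3.
    penalty = max(k, 1.3) if n == 2 else k * (n - 1)
    return 10 ** (r_sum + r0 - penalty) / s


def meets_threshold(r_sum, r0, n, s, k=0.763):
    """The threshold is met when the ratio is greater than one."""
    return threshold_ratio(r_sum, r0, n, s, k) > 1


# Hypothetical word pair: R = 1.0 exceeds 0.763 but not 1.3, so it fails.
print(meets_threshold(r_sum=1.0, r0=0.0, n=2, s=1))
# Hypothetical 3-term array: 2.0 - 2(0.763) = 0.474, so it passes.
print(meets_threshold(r_sum=2.0, r0=0.0, n=3, s=1))
```

The word pair in the first call shows why the exception matters: under the general formula it would have passed with odds well below 20:1.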

Conceptually, this threshold only counts that portion of the R values which are above the +1 standard deviation level, and even then corrects for near-minimality and row split of the central term (with the additional provision that the overall calculated matrix odds would have to be at least 20:1).  Note some important features of this threshold.  First of all, one cannot simply add more terms of marginal quality to an array in an attempt to increase the overall calculated odds.  As the number of terms increases, so does the value 0.763(n-1) to be subtracted from R(sum) + R0.  Secondly, a term with an R-value of less than 0.763 will not contribute positively to the end result.  If included, there must be one or more other terms with sufficiently high R-values to compensate.  This is a deterrent to including low-quality terms (R < 0.763) in an array, even though such terms are formally allowed.  Thirdly, as the central term becomes farther from near-minimality and/or its row split increases, the other terms' R-values must correspondingly increase to allow the threshold to be met.

This is important enough to warrant restating from a different angle.  What features of arrays does this threshold deter?  (1) It deters increasing the number of terms in an array to increase its perceived significance.  As the number of terms increases, the bar which a test array must exceed is correspondingly raised.  (2) It deters low-quality terms (i.e., ELS's with positive but low R-values).  For a low-quality term to be included, other terms must be of high enough R-value to compensate.  (3) It deters central terms which are far from near-minimality.  Again, for such a central term to be included, other terms must be of high enough R-value to compensate. 

Is this threshold simply a new way of stating overall matrix odds?  No.  Even though in the final analysis, arrays will be reported as being X times that of an n-term array of minimum plausibility, it is recognized that this is NOT an overall matrix odds but rather a comparison of an array with a predetermined standard.  It is also recognized that the calculation is for that specific word list only, and that (as mentioned above) there may be other terms (however few these may be) or alternate spellings that would convey the same or a similar message as the array actually in hand.  The only thing the threshold considers is whether or not this particular array under consideration can plausibly be called a valid Bible code array.

Thresholds Of Greater Than Minimum Significance

As shown above, 0.763 corresponds to an R-value at the +1 standard deviation level.  Though the +1 standard deviation level is important, some may feel uneasy denoting as a valid Bible code array a matrix where the average term is only at the +1 standard deviation level.  (This is why I have termed the +1 standard deviation level the minimum threshold of plausibility.)  Using the same basic concept, it is possible to devise even stricter thresholds corresponding to higher numbers of standard deviations (S.D.).  The strictness of these higher standards is summed up here.  Note that as the standard deviation level increases, the statistical significance increases to a much greater extent.  Thus 2 S.D. does not mean twice as significant as 1 S.D. but rather about 7 times more significant per term (15.87/2.28 is approximately 7).  Likewise, 3 S.D. does not mean 3 times as significant as 1 S.D., but rather is about 122 times more significant per term (15.87/0.13 is approximately 122).

84.13% of all values are below +1.0 S.D., while 15.87% are above.
93.32% of all values are below +1.5 S.D., while 6.68% are above.
97.72% of all values are below +2.0 S.D., while 2.28% are above.
99.38% of all values are below +2.5 S.D., while 0.62% are above.
99.87% of all values are below +3.0 S.D., while 0.13% are above.
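The percentages listed above are the standard tail areas of the normal curve and can be reproduced with the complementary error function in Python's math module.  This is only a check on the table; the helper name is an assumption made for this illustration.

```python
import math


def upper_tail_percent(z):
    """Percent of a normal distribution lying above +z standard deviations."""
    return 100 * 0.5 * math.erfc(z / math.sqrt(2))


for z in (1.0, 1.5, 2.0, 2.5, 3.0):
    above = upper_tail_percent(z)
    print(f"+{z} S.D.: {100 - above:.2f}% below, {above:.2f}% above")
```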

What would be the number K which would be multiplied by n-1 to be subtracted from R(sum) + R0?  One can calculate based upon the Poisson distribution what K would be for these cases, as is tabulated below.  In each case, K is rounded up to the next thousandths' place.

For the +1 standard deviation level, P0 = 0.8413; ln 0.8413 = -E; E = 0.1728069; -log E = 0.7624389; and thus K = 0.763.

For the +1.5 standard deviation level, P0 = 0.9332; ln 0.9332 = -E; E = 0.0691357; -log E = 1.1602974; and thus K = 1.161.

For the +2 standard deviation level, P0 = 0.9772; ln 0.9772 = -E; E = 0.0230639; -log E = 1.6370665; and thus K = 1.638.

For the +2.5 standard deviation level, P0 = 0.9938; ln 0.9938 = -E; E = 0.0062193; -log E = 2.2062585; and thus K = 2.207.

For the +3 standard deviation level, P0 = 0.9987; ln 0.9987 = -E; E = 0.0013008; -log E = 2.8857742; and thus K = 2.886.

Since K must be at least 1.3 to satisfy the 20:1 rule when n = 2, the +1.5 S.D. threshold (K = 1.161) is treated like the +1 S.D. threshold for word pairs: 1.3 is subtracted in place of K.  The +2 S.D. and higher thresholds use their given K values for all values of n.
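The K values for all five levels, together with the word-pair rule, can be tabulated with a short Python sketch.  The P0 figures are the ones given in the article; the dictionary and function names are assumptions made for this illustration.

```python
import math

# P0 (probability of zero occurrences) for each standard deviation level.
P0_BY_LEVEL = {1.0: 0.8413, 1.5: 0.9332, 2.0: 0.9772, 2.5: 0.9938, 3.0: 0.9987}


def k_for_p0(p0):
    """K = -log10(E) with E = -ln(P0), rounded up to the next thousandth."""
    expected = -math.log(p0)
    return math.ceil(-math.log10(expected) * 1000) / 1000


def word_pair_penalty(k):
    """For n = 2, subtract at least 1.3 so the 20:1 rule is respected."""
    return max(k, 1.3)


K_BY_LEVEL = {level: k_for_p0(p0) for level, p0 in P0_BY_LEVEL.items()}

for level, k in K_BY_LEVEL.items():
    print(f"+{level} S.D.: K = {k}, subtracted when n = 2: {word_pair_penalty(k)}")
```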

Whenever statistical analysis is performed on an array, one should report the highest of these standard deviation thresholds that the array meets.  Why report an array as meeting the +1 S.D. threshold, for example, if a further calculation shows that it meets the +2 S.D. threshold?

The Threshold Illustrated

The minimum threshold of plausibility is now illustrated on the two arrays used as examples in A Protocol For The Statistical Analysis Of Bible Code Arrays Part 2.

First examine the Clinton impeachment array using the R(A') method, remembering that impeachment is a process and not a result.  (President Clinton did undergo an impeachment trial in the U.S. Senate in the Hebrew year 5759, although he was not removed from office as a result.)  For the Clinton impeachment array, R(sum) = 2.586 + 1.329 + 1.093 + 3.727 = 8.735.  'Clinton' at -9877 skip distance has an R0 value for the Torah of 0.108.  Since 'Clinton' as the central term has a row split of 4 in the array, S = 4.  Since there are 5 terms in the array, n = 5 and thus n-1 = 4.  Therefore, Antilog [8.735 + 0.108 - 4(0.763)]/4 = Antilog (5.791)/4 = 618,016/4 = 155,000 to three significant digits.  (All results in this article are reported to three significant digits.)  Since this is much, much greater than one, the Clinton impeachment array easily passes the minimum threshold of plausibility for a 5-term array.

Does it pass any of the higher thresholds?  Is the average term at least +1.5 S.D.?  Antilog [8.735 + 0.108 - 4(1.161)]/4 = 3950.  Yes, the average term in this array is at least +1.5 S.D.

Is the average term at least +2 S.D.?  Antilog [8.735 + 0.108 - 4(1.638)]/4 = 48.9.  Yes, the average term in this array is at least +2 S.D.

Is the average term at least +2.5 S.D.?  Antilog [8.735 + 0.108 - 4(2.207)]/4 = 0.259.  Since this is lower than one, the average term in this array is NOT at least +2.5 S.D.  

What if we drop some of the lower R-value terms?  Can we have an array with fewer terms which meets the +2.5 S.D. threshold?  If we use only 'Clinton', '5759', and 'Senate', then n = 3 and thus n-1 = 2.  R(sum) of '5759' and 'Senate' is 2.586 + 3.727 = 6.313.  Antilog [6.313 + 0.108 - 2(2.207)]/4 = 25.4, which is greater than one.  Does this mean that this 3-term array would pass the +2.5 S.D. threshold?  No, because the three terms 'Clinton', '5759', and 'Senate' do not form a meaningful message without at least the added term 'impeachment'.  Let us add 'impeachment' back in.  Now R(sum) = 7.642 and n-1 = 3.  Antilog [7.642 + 0.108 - 3(2.207)]/4 = 3.36, which is slightly greater than one.  Therefore, the array of the five terms 'Clinton', '5759', 'impeachment', 'USA', and 'Senate' is NOT significant at the +2.5 S.D. threshold level, but the array of the four terms 'Clinton', '5759', 'impeachment', and 'Senate' IS.
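The Clinton impeachment calculations above can be reproduced with a short Python sketch that reports the highest S.D. threshold an array meets.  The function names are assumptions made for this illustration.  The pairing of 1.093 with 'USA' is inferred by elimination from the sums quoted above, not stated directly.

```python
K_BY_LEVEL = {1.0: 0.763, 1.5: 1.161, 2.0: 1.638, 2.5: 2.207, 3.0: 2.886}


def threshold_ratio(r_sum, r0, n, s, k):
    """Antilog[R(sum) + R0 - penalty] / S, with the word-pair exception."""
    penalty = max(k, 1.3) if n == 2 else k * (n - 1)
    return 10 ** (r_sum + r0 - penalty) / s


def highest_level_met(r_values, r0, s):
    """Return the highest S.D. level whose ratio exceeds one, or None."""
    n = len(r_values) + 1          # non-central terms plus the central term
    r_sum = sum(r_values)
    met = None
    for level, k in sorted(K_BY_LEVEL.items()):
        if threshold_ratio(r_sum, r0, n, s, k) > 1:
            met = level
    return met


R0_CLINTON, S_CLINTON = 0.108, 4
# R(A') values: '5759', 'impeachment', 'USA' (inferred), 'Senate'
five_term = [2.586, 1.329, 1.093, 3.727]
four_term = [2.586, 1.329, 3.727]          # 'USA' dropped

print(highest_level_met(five_term, R0_CLINTON, S_CLINTON))
print(highest_level_met(four_term, R0_CLINTON, S_CLINTON))
```

Note that the sketch checks only the arithmetic.  Whether a reduced word list still forms a meaningful message, as with the rejected 3-term array, remains a judgment the code cannot make.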

A whole matrix approach for the full 5-term array gives similar results.  Using matrix R-values, R(sum) = 7.451.  For the +1 S.D. threshold, Antilog [7.451 + 0.108 - 4(0.763)]/4 = 8030.  For the +1.5 S.D. threshold, the calculation yields 206.  For the +2 S.D. threshold, the calculation yields 2.54.  For the +2.5 S.D. threshold, the calculation yields 0.0135, which is less than one.  Therefore, whether the R(A') method or matrix R-values are used, the 5-term array meets the +2 S.D. threshold, but not the +2.5 S.D. threshold.  The main difference between the results of the two methods would be for the 4-term array: 'Clinton', 'impeachment', '5759', and 'Senate'.  The 4-term array passes the +2.5 S.D. threshold for the R(A') method, but only the +2 S.D. threshold for the whole matrix approach.

For the Germany/Hitler array, R(sum) = 2.974 + 2.108 + 1.003 + 0.963 = 7.048 using the R(A') method.  'Germany' at -156 skip distance has an R0 value for the Tanach of -0.824.  Since 'Germany' as the central term has a row split of 1 in the array, S = 1.  Since there are 5 terms in the array, n = 5 and thus n-1 = 4.  Therefore, Antilog [7.048 - 0.824 - 4(0.763)]/1 = Antilog (3.172) = 1490.  Since this is much, much greater than one, the Germany/Hitler array easily passes the minimum threshold of plausibility for a 5-term array.

Does it pass any of the higher thresholds?  Is the average term at least +1.5 S.D.?  Antilog [7.048 - 0.824 - 4(1.161)]/1 = 38.0.  Yes, the average term in this array is at least +1.5 S.D.

Is the average term at least +2 S.D.?  Antilog [7.048 - 0.824 - 4(1.638)]/1 = 0.470.  Since this is lower than one, the average term in this array is NOT at least +2 S.D.

What if we drop some of the lower R-value terms?  Can we have an array with fewer terms which meets the +2 S.D. threshold?  If we use only 'Germany', 'Hitler', and 'Nazi' (a meaningful trio of words), then R(sum) = 5.082 and n-1 = 2.  Antilog [5.082 - 0.824 - 2(1.638)]/1 = 9.59.  Yes, the average term in this smaller array is at least +2 S.D.

Does this 3-term array meet the +2.5 S.D. threshold?  Antilog [5.082 - 0.824 - 2(2.207)]/1 = 0.698.  Since this is lower than one, the average term in this array is NOT at least +2.5 S.D.

Let us then consider only the two terms 'Germany' and 'Hitler'.  This is a meaningful word pair.  Now we simply use the R(A') value for 'Hitler', 2.974, and n-1 = 1.  Antilog [2.974 - 0.824 - 1(2.207)]/1 = 0.877.  Since this is lower than one, the +2.5 S.D. threshold is still NOT met.
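The effect of dropping terms from the Germany/Hitler array can be laid out side by side.  The sketch below recomputes the ratio for each of the three word lists at every level.  The names are assumptions made for this illustration, and 1.003 and 0.963 are left unassigned between 'death' and 'in 5705' because the article does not say which is which.

```python
K_BY_LEVEL = {1.0: 0.763, 1.5: 1.161, 2.0: 1.638, 2.5: 2.207, 3.0: 2.886}
R0_GERMANY, S_GERMANY = -0.824, 1

# R(A') values of the non-central terms in each candidate word list.
ARRAYS = {
    "5-term": [2.974, 2.108, 1.003, 0.963],
    "Germany/Hitler/Nazi": [2.974, 2.108],
    "Germany/Hitler": [2.974],
}


def threshold_ratio(r_sum, r0, n, s, k):
    """Antilog[R(sum) + R0 - penalty] / S, with the word-pair exception."""
    penalty = max(k, 1.3) if n == 2 else k * (n - 1)
    return 10 ** (r_sum + r0 - penalty) / s


results = {}
for name, r_values in ARRAYS.items():
    n = len(r_values) + 1
    results[name] = {
        level: threshold_ratio(sum(r_values), R0_GERMANY, n, S_GERMANY, k)
        for level, k in K_BY_LEVEL.items()
    }

for name, by_level in results.items():
    passed = [level for level, ratio in by_level.items() if ratio > 1]
    print(f"{name}: passes {passed}")
```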

A whole matrix approach for the full 5-term array gives similar results.  Using matrix R-values, R(sum) = 6.853.  For the +1 S.D. threshold, Antilog [6.853 - 0.824 - 4(0.763)]/1 = 948.  For the +1.5 S.D. threshold, the calculation yields 24.3.  For the +2 S.D. threshold, the calculation yields 0.300, which is less than one.  Therefore, whether the R(A') method or matrix R-values are used, the 5-term array meets the +1.5 S.D. threshold, but not the +2 S.D. threshold.  If we use only the three terms 'Germany', 'Hitler', and 'Nazi' (a meaningful trio of words), then the calculation yields 9.86.  Yes, the average term in this smaller array, as with the R(A') method, is at least +2 S.D. but not at least +2.5 S.D.

The Word List Revisited: Independently Verifiable Information

In A Protocol For The Statistical Analysis Of Bible Code Arrays Part 3 I stated the following.  "Thus guideline #1 for critical judgment in examining Bible code findings is: Use common sense and critical judgment when examining the word list for a Bible code array.  Are the words strongly and definitely related to the subject of the array, or are they weakly and tenuously related?  Do the words seem to be natural choices in describing the subject or do they seem unnatural, forced, vague, or of only minor relevance?"

This point needs re-emphasis.  In the discussions which led to this article, one reviewer asked the following question.  What if one has several high R-value terms (say in one's personal matrix) and then happens to find and include a low R-value ELS such as "king of the world" that together with the other terms passes the minimum threshold of plausibility?  Without an R-value cutoff, would the protocol presumably endorse such a silly conclusion as me or someone else being "king of the world"?  Definitely not.

As the guideline above states, the words in an array must be definitely and strongly related to the subject of the array.  Since I am not the king of the world, the hypothetical ELS would be meaningless in an array about myself even if it had the highest R-value of any of the ELS's.

A corollary to this guideline is that the information presented in a Bible code array must be independently verifiable.  For example, since it refers to a historically documented past event, 'Clinton', 'impeachment', '5759', 'USA', and 'Senate' presents information that is independently verifiable.  The same can be said about the array containing the terms 'Germany', 'Nazi', 'Hitler', 'death', and 'in 5705'.

If one is presented with a personal matrix, the matrix must contain independently verifiable information.  Though the names of an individual's wife, child, and father (for example) may not be public knowledge, they are terms that can be independently verified given the proper sources.  Say though that an individual has the term 'Elijah' in his personal matrix and claims that he is the incarnation of Elijah promised in the latter days.  The reader has the right to ignore this term.  If someone truly is Elijah, it would have to be independently verified outside a Bible code array.  Has he been reliably documented as having called down fire from heaven?  Has he been reliably documented as saying on a particular date that he would cause the rains to cease in a given area for x number of days, and in fact it was reliably documented that the rains did cease for x number of days?  While this is an extreme example, the point is clear.

For the same reasons, the future cannot be predicted using the Bible codes.  Why?  Simply stated, an event is not independently verifiable until after it has happened.  To give an example, Michael Drosnin states on p. 73 of The Bible Code (Simon & Schuster, 1997) that he found an array containing the terms 'Prime Minister Netanyahu', 'elected', and 'Bibi' a week before the May 29, 1996 election.  The calculation shows that this array passes the minimum threshold of plausibility.  Does that mean that when Mr. Drosnin first found it that this was a valid Bible code array?  Whether it was a valid Bible code array depended upon whether Netanyahu was or was not elected prime minister of Israel.  Since he was elected, it was a valid Bible code array.  However, here is the important point.  Before the event occurred there was no way to know if the array was valid or not because an event cannot be independently verified until after it has occurred.

Unresolved Issues

There are two main unresolved issues as of this date.  First is the question of whether there should be a minimum cutoff level for R-values.  I do not believe that there should be (as long as the R-value is positive), but one of my reviewers believes strongly that there should be.  Given my respect for him, I include his dissent in this section.  His basic argument is as follows.  An array is only as strong as its weakest link.  Therefore, one should ensure that the weakest link is not too weak by setting a minimum R-value below which a term would not be included in a matrix.  While he has a good argument, my own belief is that an array's significance is set by the average R-value as modified by this threshold protocol.  In other words, a low R-value term will not show up in an array which passes the threshold unless other terms' R-values are sufficiently high to compensate.  Thus widespread use of low R-value terms is deterred.  Even when they are used, though, all terms must pass the word list guidelines of being (a) strongly and definitely related to each other and (b) independently verifiable information.  Therefore, for the time being, unless I come to see the need to change, there will not be a minimum cutoff level for R-values (as long as they are positive numbers).  However, one should keep in mind when examining a matrix which is the lowest R-value term in the array and ask oneself how the message of that array would be affected if that term were not included.

The second unresolved issue is in my mind even more important.  The paradigm for analysis of the statistical significance of ELS pairs is the Witztum, Rips, and Rosenberg (WRR) paper in Statistical Science.  WRR use a distance measure called delta.  Delta (e,e') is the sum of three squared distances in a two-dimensional array: (1) the square of the distance between consecutive letters in ELS e; (2) the square of the distance between consecutive letters in ELS e'; and (3) the square of the distance between the closest letters of ELS e and e' in the array.  (1) and (2) measure the compactness of each of the two ELS's.  (3) measures the proximity of the two ELS's to each other.  Therefore, delta measures the proximity and compactness of both terms.  A function of delta is then modified by the fraction of text for which e and e' are the minimal skip distance occurrences of those words (i.e., their domains of minimality).  The final result thus measures the proximity, compactness, and near-minimality of both terms.

Since WRR analysis requires comparisons to pairings generated in randomized control texts, an R-values method was developed as a surrogate, a method to enable statistical analysis without the need for comparisons with randomized control texts.  R-values measure the same parameters in a different way.  The text R-value of a term is a measure of a term's expected near-minimality.  The size of the rectangle encompassing two ELS's is a function of their compactness and proximity to each other, and when combined in a formula with the text R-value determines the matrix R-value of a term.  Thus in many cases an array that is good in a WRR analysis should be good in an R-value analysis.  Likewise, an array that is good in an R-value analysis should be good in a WRR analysis.  However, this may not always be the case.  The reason comes from the differing ways that the two methods measure proximity between two ELS's.  WRR measure the squared distance between the closest letters of two ELS's in a two-dimensional array.  An R-value analysis draws a rectangle around both ELS's.  This in effect is a measure of the distance between the farthest letters of two ELS's in a two-dimensional array.

As a result, the further two ELS's are away from each other, the more divergent the analyses of the significance of that pairing may become.  If the reader will be patient through some more math, I will demonstrate this with a hypothetical example.  Let us say that an array has three terms A, B, and C.  A is the main term and has a row split of two.  B and C are ELS's of the same word (five letters in length), but B has a skip distance of 2 and C has a skip distance of 6.  Say that B is minimal for the entire search text and C is minimal for one-third of the search text.  On one line B ranges from the 18th letter to the right of A to the 26th letter to the right of A.  On another line C ranges from the 2nd letter to the right of A to the 26th letter to the right of A.

What would an R-value analysis tell us?  The farthest letter of each from A is 26 letters away.  Thus the rectangle drawn around A and B is the same size as the rectangle drawn around A and C.  However, since B is one-third the skip distance of C, it will be considered to be three times as significant.  Therefore the matrix R-value of B will be 0.477 higher than that of C.  (Logarithm 3 = 0.477.)  The R-value analysis tells us that B is better than C in the array.

What would a WRR analysis tell us?  The squared distance between consecutive letters of A is 4.  The squared distance between consecutive letters of B is 4.  The squared distance between the closest letters of A and B is 324.  Thus delta (A,B) = 4 + 4 + 324 = 332.  Delta divided by the domain of minimality for B is 332/1 = 332.  The squared distance between consecutive letters of C is 36.  The squared distance between the closest letters of A and C is 4.  Thus delta (A,C) = 4 + 36 + 4 = 44.  Delta divided by the domain of minimality for C is 44/(1/3) = 132.  Therefore the WRR analysis tells us that C is better than B in the array.
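The two analyses of this hypothetical example can be compared in a short Python sketch.  The delta function below follows only the simplified three-term description given in this article; it is not the full WRR distance measure, and all names are assumptions made for this illustration.

```python
import math
from fractions import Fraction


def simplified_delta(step_e, step_e2, closest):
    """Sum of three squared distances, as described in this article."""
    return step_e ** 2 + step_e2 ** 2 + closest ** 2


# Distances in the hypothetical array: A's consecutive letters are 2 apart.
STEP_A = 2
terms = {
    # name: (step between consecutive letters, closest distance to A,
    #        domain of minimality as a fraction of the search text)
    "B": (2, 18, Fraction(1)),
    "C": (6, 2, Fraction(1, 3)),
}

adjusted = {}
for name, (step, closest, domain) in terms.items():
    delta = simplified_delta(STEP_A, step, closest)
    adjusted[name] = delta / domain      # lower means a better pairing
    print(f"delta(A,{name}) = {delta}, divided by domain = {adjusted[name]}")

# R-value view: same rectangle for both, so only the skip distances differ.
r_gap = math.log10(terms["C"][0] / terms["B"][0])
print(f"matrix R-value of B exceeds that of C by {r_gap:.3f}")
```

The R-value gap favors B while the adjusted delta favors C, which is the divergence the example is meant to show.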

Given the divergence between the R-value analysis and WRR analysis when terms are not in close proximity, one should use the protocol in this article only when the terms in an array are all in close proximity to the central term.  What is the limit of "close proximity"?  I do not know, and I do not think that there is a precise boundary.  However, based on personal experience, I would venture that if two terms' closest letters are 10 or fewer letters apart that they are in close proximity, but that if two terms' closest letters are 40 or more letters apart that they are not in close proximity to each other.  Values in-between constitute a "gray area", where individual judgment is required.  While this is not a hard and fast rule, it should prevent a searcher from mistakenly assigning a high R-value term (which is not in close proximity to other terms) to an array which may actually belong to a different array (if to one at all). 

Summary

The points made in this article are now summarized.  (1) Cluster analysis was not included in the original protocol, nor is it included in the revised protocol.  (2) Terms with negative matrix R-values or R(A') should not be included in an array.  (3) Matrix R-values and R(A') are both legitimate ways of calculating term R-values, requiring only that one state which method is being used.  (4) The reporting of overall matrix odds will no longer be practiced due to the concerns described above.  (5) An array is not plausibly considered as a valid Bible code array unless it passes the minimum threshold of plausibility.  (6) Even if an array passes the minimum threshold of plausibility (the +1 S.D. level), the thresholds corresponding to higher standard deviation levels should be checked as well.  If an array does pass a higher threshold, it should be reported as such.  (7) Each array must be judged according to the number of terms present.  Having more terms in an array does not always mean a higher significance for the array.  (8) Even if an array does pass the minimum threshold of plausibility, it is not plausibly considered a valid Bible code array unless (a) the terms are strongly and definitely related to the subject of the array, and (b) the information presented in the array is independently verifiable.  (9) A high matrix R-value does not necessarily mean that a term is a valid element of a valid Bible code array.  The necessity of terms being in close proximity is still important.

The above is the result of the review process of the last several weeks.  Since the field of Bible codes research is progressing, this threshold approach may need certain modifications in the future if the data warrant.  However, I believe it to be rigorous enough to begin putting it into practice now.

Future Plans

Now that the protocol of statistical analysis has been reviewed and revised, resulting in this minimum threshold of plausibility, I hope to publish some more arrays in the near future using this criterion.