Corresponding author: William J. Montelpare, Ph.D.; School of Kinesiology, Lakehead University; Thunder Bay, Ontario, Canada; phone: 807.343.8481; fax: 807.343.8944; email: WMONTELP@FLASH.LAKEHEADU.CA.
Abstract
This paper demonstrates the use of "client-side" scripting on the internet to produce a 2 x 2 calculator that computes the McNemar test of symmetry and the kappa statistic. These two estimates are often combined in analyses that evaluate paired response data with a dichotomous outcome. The application is demonstrated using a comparison between a laboratory test and a field test. The web-based calculator, also referred to as a "webulator", computes the McNemar test of differences as a "z" score and the kappa measure of agreement. The tabular output includes the two statistics and the probabilities associated with them, which can then be used to test the null hypothesis. The McNemar test and kappa statistic are extremely valuable in quantitative methods for health research where two or more tests are evaluated to establish differences and/or associations. The availability of a valid and reliable tool on the internet, such as that demonstrated here, can reduce data processing time and assist researchers in the interpretation of outcomes.
Introduction
In a previous paper, we described a design model that uses the internet to compute outcomes on the "user's computer", transfer the data across the internet to a specific "server", process the information on the server, and deliver feedback to the user who submitted the initial data (Montelpare and McPherson, 1999). In the present paper we describe the approach to developing a "web-based" calculator, also referred to as a "webulator", written in client-side javascript to compute McNemar's test of symmetry and the kappa statistic for paired response data having a dichotomous outcome.
As discussed previously by Montelpare and McPherson (1999), client-side javascript is a relatively simple "interpreted language" that health educators and researchers can embed within html documents to present information and incorporate simple visual calculators for a variety of applications. Client-side javascript does not use common gateway interface or "CGI" processing, and therefore the information entered by the user remains on their computer and is not passed across the internet. As such, client-side javascript is an efficient approach to developing simple single-user tools that can simplify tedious or cryptic computations. To this end, the internet is the medium by which researchers can share tools such as basic statistical calculators and project-specific applications with all health educators and researchers, without regard for the operating system or the type of computer that will receive the information. Such approaches enable individuals to compute outcomes in a reliable and valid manner.
The purpose of the present application was therefore to develop a reliable "client-side javascripted" McNemar-kappa webulator that would allow researchers to compute measures of difference (computed as a "z statistic"), measures of association (kappa), and the estimated probability of such measures. The application was built with standard and advanced hypertext markup language (HTML), combined with the client-side interpreted scripting commands of "javascript" (Gesing and Schneider, 1997; Kent and Kent, 1996; Purcell and Mara, 1997).
Application of McNemar's test of symmetry and kappa statistic in health research
In many health applications, individuals are evaluated on more than one test to establish the validity of information against existing criteria or against a gold standard. For example, in determining an individual's maximal ability to deliver oxygen to the working muscles (Vo2 max) an individual may be tested on an accepted field test, such as the "one mile walk test" and then again on the "gold standard" maximal treadmill test. An appropriate determination of validity for the field test is a "chi-square" to test differences between individuals' responses on the two tests.
When computing the chi-square for independent samples, a portion of any difference in response could be attributed to intrinsic differences between participants. McNemar's test of symmetry determines the equivalence in responses to two independent factors, either by a single subject over conditions, or between subjects. The approach is therefore considered to be pair-wise. The assumption of the pair-wise comparison is that by using the same group of individuals, tested at two different times, the researcher will reduce the heterogeneity of variance which may occur when comparing data from independent samples. The approach is thus intended to reduce differences in the response distributions attributable to intrinsic differences between participant groups. Further, in a 2 by 2 model as illustrated in Figure 1, the proportion of pairs of individuals arranged for each of the possible outcomes is expected to be equal across the four cells (i.e. 25% of pairs in cell "a", cell "b", cell "c", cell "d").
Figure 1. Structure of the 2 x 2 design used in McNemar and kappa calculations
The McNemar equivalence estimate (expressed as either a chi-square or z test) focuses on "discordant pairs", also referred to as the "off-diagonal" elements (i.e. the paired data in upper right cell --cell "b", and lower left cell --cell "c"). The McNemar procedure tests the equality of frequencies in pairs of cells that are symmetric around the diagonal of a 2 by 2 design (the diagonal elements are the paired data in upper left cell --cell "a", and lower right cell --cell "d"). In the computation of the McNemar equivalence estimates, the frequencies in the major diagonal (upper left cell to lower right cell) are ignored. The null hypothesis (Ho: p1. = p.1) implies that the proportion of individuals who score high on the field test and low on the lab test will match the proportion of individuals who score low on the field test and high on the lab test.
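Because only the discordant cells enter the calculation, the McNemar statistic can be expressed in a few lines of client-side javascript. The following is a minimal sketch (our illustration, not the published webulator source; the function name is arbitrary) of the usual large-sample form, z = (b - c) / sqrt(b + c), whose square is the one-degree-of-freedom McNemar chi-square:

// McNemar z computed from the discordant cells "b" and "c" only;
// z squared is the McNemar chi-square with one degree of freedom.
function mcnemarZ(b, c) {
  return (b - c) / Math.sqrt(b + c);
}
// For the Example 1 data described later (b = 12, c = 11), mcnemarZ(12, 11)
// returns approximately 0.21, the value reported for that example.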
The kappa statistic, in contrast, is a measure of agreement. Kappa computations focus on the data in the major diagonal from upper left to lower right, examining whether counts along this diagonal differ significantly from those expected by chance (Streiner, Norman, and Munroe-Blum, 1989). If there were no agreement between the responses on the two tests, then we would expect an identical proportion of individuals to score high versus score low on the field test among individuals who scored high on the lab test.
The application of McNemar's test and the kappa statistic to matched pairs of data was demonstrated by Suchower and Copenhaver (1996) and earlier by Fleiss (1981) who indicated that, especially in research studies that compare pairs of outcomes, such as comparisons of ratings between independent judges, some researchers felt the need to express the extent to which the measure of agreement is beyond that which is expected by chance. As such, while the McNemar test provides an appropriate estimate of differences in responses, Fleiss (1981) described the kappa statistic as a useful technique to correct for chance agreement.
Building the webulator
The McNemar-kappa webulator is a relatively simple tool to create for the internet because it is merely an extension of a two by two data input table. The interpreted computer language of javascript was used to compose the "commands" or "scripts" that were embedded in the HTML document file (Figure 2 below). Scripts are more complex computer language statements, which invoke computer actions. javascript uses a presentation style and structure similar to that of the "C" language, but javascript programs are less cryptic than the "C" or "java" languages. Further, using interpreted computer languages, scripts are written to the user's computer (the client) as part of the published web page. In this way, the scripts and HTML code pages are passed directly to the user during an internet session (Montelpare and McPherson, 1999).
The commands to produce the webulator shown in Figure 2 are available by requesting to view the source code. The statistical formulas for the McNemar test of symmetry and the kappa statistic are presented in Appendix 1 below, using the notation of Suchower and Copenhaver (1996), after Fleiss (1981). Likewise, a SAS program and corresponding output for these data are presented in Appendix 2 below.
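For readers who prefer not to view the source, the following is a minimal sketch of how such a calculator can be wired together inside an HTML page. It is not the published webulator code; the form name, field names, and function name are illustrative only, and the sketch reports only the point estimates (the full webulator also reports the standard errors, confidence limits, and probabilities shown in the examples that follow):

<form name="mk">
a: <input type="text" name="a" size="4">
b: <input type="text" name="b" size="4">
c: <input type="text" name="c" size="4">
d: <input type="text" name="d" size="4">
<input type="button" value="Calculate" onclick="calc()">
McNemar z: <input type="text" name="zout" size="10">
kappa: <input type="text" name="kout" size="10">
</form>
<script>
// Read the four cell counts, compute the McNemar z and the kappa point
// estimate, and write the results back into the form fields.
function calc() {
  var f = document.mk;
  var a = Number(f.a.value), b = Number(f.b.value),
      c = Number(f.c.value), d = Number(f.d.value);
  var n = a + b + c + d;
  var z = (b - c) / Math.sqrt(b + c);                          // discordant pairs only
  var po = (a + d) / n;                                        // observed agreement
  var pe = ((a + b) * (a + c) + (c + d) * (b + d)) / (n * n);  // chance agreement
  var kappa = (po - pe) / (1 - pe);
  f.zout.value = z.toFixed(3);
  f.kout.value = kappa.toFixed(3);
}
</script>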
The McNemar Kappa Webulator
Figure 2. A webulator for McNemar and kappa statistics

Enter values for the "a", "b", "c", and "d" cells (the coloured cells) in the spaces provided, then click the "Calculate" button to compute the remainder of the table values.
Sample Application
McNemar's test assumes that response data are binary. Therefore, the first step is to arrange your data so that they represent a set of binary outcomes. A simple approach to preparing the data is to list the pair-wise response data from the scores on two measures in column format and then separate the columns at the median scores for each variable as shown in Example 1, below.
Here the data are presented as individuals' responses to a field test and to a lab test. The median score for the field test was 27, while the median score for the lab test was 54. Notice, in the table below, that the chi-square cell assignment is given based on the pair-wise scores relative to the two median scores. For example, the first observation scored "45" on the field test, which is above the median. Likewise, this individual scored "54" on the lab test, which is at the median for this variable. Therefore, the corresponding cell assignment in a two by two chi-square for paired response data is (+,+) = cell "a".
Example 1. Creating a median split for response data

Compare each subject's score to the median score for the variable. Use "+" or "-" to indicate the score's position relative to the median score for the entire variable.

Subject Code | Variable 1 (field test scores) | Variable 2 (lab test scores) | Cell assignments V1,V2 |
---|---|---|---|
001 | 45 | 54 | + + (cell a) |
002 | 34 | 23 | + - (cell c) |
003 | 76 | 33 | + - (cell c) |
004 | 55 | 44 | + - (cell c) |
005 | 64 | 53 | + + (cell a) |
006 | 26 | 37 | - - (cell d) |
007 | 47 | 55 | + + (cell a) |
008 | 37 | 62 | + + (cell a) |
009 | 70 | 38 | + - (cell c) |
010 | 71 | 37 | + - (cell c) |
011 | 15 | 64 | - + (cell b) |
012 | 14 | 63 | - + (cell b) |
013 | 16 | 63 | - + (cell b) |
014 | 15 | 64 | - + (cell b) |
015 | 14 | 63 | - + (cell b) |
016 | 16 | 67 | - + (cell b) |
017 | 17 | 65 | - + (cell b) |
018 | 17 | 62 | - + (cell b) |
019 | 10 | 68 | - + (cell b) |
020 | 11 | 67 | - + (cell b) |
021 | 25 | 54 | - + (cell b) |
022 | 34 | 13 | + - (cell c) |
023 | 26 | 53 | - - (cell d) |
024 | 35 | 14 | + - (cell c) |
025 | 24 | 53 | - - (cell d) |
026 | 36 | 17 | + - (cell c) |
027 | 27 | 55 | + + (cell a) |
028 | 37 | 12 | + - (cell c) |
029 | 20 | 58 | - + (cell b) |
030 | 31 | 17 | + - (cell c) |
031 | 15 | 14 | - - (cell d) |
032 | 14 | 13 | - - (cell d) |
033 | 16 | 12 | - - (cell d) |
034 | 15 | 11 | - - (cell d) |
035 | 14 | 13 | - - (cell d) |
036 | 76 | 77 | + + (cell a) |
037 | 77 | 75 | + + (cell a) |
038 | 77 | 72 | + + (cell a) |
039 | 70 | 78 | + + (cell a) |
040 | 71 | 77 | + + (cell a) |
Next, determine the cell assignment in the 2 x 2 table for each subject based on whether they scored above or below the median score on both variables (this is indicated with "+" for scores at or above the median score, and "-" for scores below the median score). The simplest way to locate the median score is to arrange the set of scores from lowest to highest and use the formula of Freund and Simon (1991), median position = (n+1)/2, which gives the position of the median within the data set. In the example given here the median is found at position 20.5, since there are 40 scores in the data set for each variable; therefore (40+1)/2 = 20.5. After organizing the scores in rank order from lowest to highest we see that the median score for the first variable is 27, while the median score for the second variable is 54.
McNemar's test can be applied to larger data tables, but when data are arranged in a 2 by 2 design there can be any of the following outcomes:

"positive" on field test with "positive" on lab test (cell a)
"positive" on lab test with "negative" on field test (cell b)
"negative" on lab test with "positive" on field test (cell c)
"negative" on field test with "negative" on lab test (cell d)
The following is the rank order presentation of the data for variable 1 in the sample table above:

10, 11, 14, 14, 14, 14, 15, 15, 15, 15, 16, 16, 16, 17, 17, 20, 24, 25, 26, 26, 27, 31, 34, 34, 35, 36, 37, 37, 45, 47, 55, 64, 70, 70, 71, 71, 76, 76, 77, 77

The median score for this data set is 27.

The following is the rank order presentation of the data for variable 2 in the sample table above:

11, 12, 12, 13, 13, 13, 14, 14, 17, 17, 23, 33, 37, 37, 38, 44, 53, 53, 53, 54, 54, 55, 55, 58, 62, 62, 63, 63, 63, 64, 64, 65, 67, 67, 68, 72, 75, 77, 77, 78

The median score for this data set is 54.
By organizing the data according to the cells -- a through d, we observe the following frequencies: cell a = 9, cell b = 12, cell c = 11, cell d = 8.
A calculation of chi-square or z then determines the extent to which the proportions of discordant pairs are equivalent. The data are entered into the webulator above using the following arrangement:
 | Variable 1 | |
Variable 2 | >=27 | <27 |
>=54 | "a"=9 | "b"=12 |
<54 | "c"=11 | "d"=8 |
Clicking the button labelled "calculate" then produces the following results.
The results of the webulator computation for the first data set are as follows: the McNemar z score is 0.21, and the kappa statistic is -0.149 with a corresponding z for kappa of -0.949. The standard error of the kappa statistic is 0.15613, which gives an upper 95% confidence limit of 0.15601 and a lower 95% confidence limit of -0.45601. Therefore, we would interpret these results to suggest that there is no significant difference between the group of individuals that scored high on variable 1 and low on variable 2 versus those individuals that scored low on variable 1 and high on variable 2. Likewise, the non-significant kappa statistic (z for kappa = -0.949) indicates that there is no association between the responses on variables 1 and 2 used in this example. In order to verify that the webulator was indeed computing the values accurately, we ran a SAS program using the data from Example 1. The code used in the SAS program is included in Appendix 2 below. The results of the SAS analysis are presented below. Notice that all values are equivalent between the webulator and the SAS output; any slight differences can be attributed to rounding.
SAS output for computation of McNemar z and kappa in the Sample 1 data set
StdErrMcNemar z= 0.15792
Po=0.425 Pe=0.5 kappa=-0.15
Kappaz=-0.94987 StdErrkappa z=0.15613
Kappa CI95LLA= -0.45601 Kappa CI95ULA= 0.15601
Computation of kappa
Unlike the McNemar test, the kappa statistic uses the "elements of the diagonal", namely the data in cell "a" and cell "d" from the 2 x 2 design. The steps used to compute the kappa statistic, the 95% confidence interval for kappa, and the test of the null hypothesis (H0: k = 0) are presented in Appendix 1 below using the notation of Suchower and Copenhaver (1996), after Fleiss (1981). Likewise, a SAS program and corresponding output for these data are presented in Appendix 2.
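The same computation can be expressed in client-side javascript. The sketch below is our illustration (the function name is arbitrary) and is a direct port of the %kappatst macro listed in Appendix 3: it returns the observed and chance-expected agreement, kappa, the standard error and z test under H0: k = 0, and the 95% confidence limits:

// Port of the %kappatst SAS macro (notation of Suchower and Copenhaver, 1996,
// after Fleiss, 1981). Cells follow the Example 1 assignments:
// a = (+,+), b = (-,+), c = (+,-), d = (-,-).
function kappaStats(a, b, c, d) {
  var n = a + b + c + d;
  var p11 = a / n, p12 = b / n, p21 = c / n, p22 = d / n;
  var p1_ = p11 + p12, p2_ = p21 + p22;    // row proportions
  var p_1 = p11 + p21, p_2 = p12 + p22;    // column proportions
  var po = p11 + p22;                      // observed agreement
  var pe = p1_ * p_1 + p2_ * p_2;          // chance-expected agreement
  var kappa = (po - pe) / (1 - pe);
  // standard error of kappa under Ho: kappa = 0, and the corresponding z test
  var sumprob = p1_ * p_1 * (p1_ + p_1) + p2_ * p_2 * (p2_ + p_2);
  var se0 = Math.sqrt(pe + pe * pe - sumprob) / ((1 - pe) * Math.sqrt(n));
  var z = kappa / se0;
  // standard error used for the 95% confidence limits
  var A = p11 * Math.pow(1 - (p1_ + p_1) * (1 - kappa), 2)
        + p22 * Math.pow(1 - (p2_ + p_2) * (1 - kappa), 2);
  var B = (p12 * Math.pow(p_1 + p2_, 2) + p21 * Math.pow(p_2 + p1_, 2))
        * Math.pow(1 - kappa, 2);
  var C = Math.pow(kappa - pe * (1 - kappa), 2);
  var se = Math.sqrt((A + B - C) / (n * Math.pow(1 - pe, 2)));
  return { po: po, pe: pe, kappa: kappa, z: z,
           lower95: kappa - 1.96 * se, upper95: kappa + 1.96 * se };
}
// Example 1: kappaStats(9, 12, 11, 8) returns kappa = -0.15, z = -0.95, and a
// 95% confidence interval of (-0.456, 0.156), matching the SAS output above.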
Your Turn
Use the following data set and the webulator above to compute the McNemar test and kappa statistic for a 2 x 2 arrangement of subjects. The scenario is as follows: you are presenting a course in which you wish to demonstrate the accuracy of a field test to measure an individual's maximal ability to deliver oxygen to the working muscles (Vo2 max) against the commonly accepted gold standard, the laboratory maximal treadmill test. The data are listed below in three variables for 60 subjects (subject's ID, the field test response, and the lab test response).
ID | Field Test Score | Lab Test Score | ID | Field Test Score | Lab Test Score | ID | Field Test Score | Lab Test Score |
---|---|---|---|---|---|---|---|---|
001 | 45 | 49 | 021 | 32 | 38 | 041 | 55 | 49 |
002 | 49 | 49 | 022 | 63 | 44 | 042 | 25 | 29 |
003 | 33 | 36 | 023 | 70 | 68 | 043 | 29 | 31 |
004 | 40 | 36 | 024 | 60 | 56 | 044 | 31 | 27 |
005 | 33 | 50 | 025 | 22 | 30 | 045 | 25 | 52 |
006 | 41 | 48 | 026 | 52 | 54 | 046 | 45 | 39 |
007 | 53 | 56 | 027 | 30 | 28 | 047 | 24 | 42 |
008 | 56 | 46 | 028 | 51 | 47 | 048 | 60 | 59 |
009 | 73 | 66 | 029 | 40 | 38 | 049 | 44 | 32 |
010 | 36 | 36 | 030 | 41 | 45 | 050 | 36 | 39 |
011 | 44 | 48 | 031 | 31 | 37 | 051 | 54 | 48 |
012 | 48 | 48 | 032 | 62 | 42 | 052 | 29 | 31 |
013 | 35 | 35 | 033 | 67 | 62 | 053 | 37 | 30 |
014 | 48 | 36 | 034 | 56 | 59 | 054 | 37 | 33 |
015 | 38 | 45 | 035 | 29 | 27 | 055 | 43 | 45 |
016 | 40 | 47 | 036 | 51 | 53 | 056 | 44 | 38 |
017 | 54 | 57 | 037 | 31 | 29 | 057 | 25 | 43 |
018 | 55 | 45 | 038 | 50 | 46 | 058 | 59 | 58 |
019 | 74 | 67 | 039 | 41 | 39 | 059 | 45 | 33 |
020 | 35 | 35 | 040 | 40 | 44 | 060 | 35 | 38 |
Split the two outcome variables, field test and lab test, at their respective median scores. Arrange the data so that each subject is assigned to one of the four cells ("a", "b", "c", or "d") based on their scores on the two outcome variables with respect to the median splits. Enter the frequencies for each cell (i.e. "n" cases were assigned "++", cell "a") into the appropriate cells of the webulator and click the button labelled "calculate" to compute the McNemar test, the kappa measure of association, and the probabilities associated with the computed values.
Compare your results

In this sample data set the median split for the field test is 42 and the median split for the lab test is 44. Therefore, in the 2 x 2 table the results are as follows: there are 24 cases in the "a" cell, 7 cases in the "b" cell, 6 cases in the "c" cell, and 23 cases in the "d" cell. Entering these values into the webulator above will produce a McNemar z score of 0.28 with a standard error of 0.129. The kappa statistic is 0.566 with a z for kappa of 4.391. The standard error for kappa is 0.1063, with a corresponding 95% lower confidence limit of 0.3582 and an upper 95% confidence limit of 0.775. A SAS program to compute these data and compare responses is presented in Appendix 3 below. The results of the SAS program are identical to the output for the webulator.
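As a cross-check, and under the same assumptions as the javascript sketches given earlier, these cell counts reproduce the reported values:

// "Your Turn" data set: a = 24, b = 7, c = 6, d = 23
console.log(mcnemarZ(7, 6).toFixed(2));     // 0.28
console.log(kappaStats(24, 7, 6, 23));      // kappa = 0.567, z = 4.39,
                                            // 95% CI (0.358, 0.775)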
Decision Rules
When the McNemar test is used in a 2 by 2 research design, there is only one degree of freedom. The corresponding chi-square critical value for testing the null hypothesis (Ho: "cell b" = "cell c") at p<0.05 is 3.84, which corresponds to a "z" score of 1.96. Recall that, with one degree of freedom, the z score is the square root of the chi-square value; therefore the observed McNemar chi-square must exceed the critical value of 3.84, or the observed z must exceed the critical value of 1.96, in order to reject the null hypothesis (Ho: p1. = p.1) for any comparison.
If the result of the McNemar test is less than the critical values then the researcher must accept the null hypothesis of no difference between the proportion of individuals who scored below the median score on the field test but above the median score on the lab test versus the proportion of individuals that scored above the median score on the field test but below the median score on the lab test.
The results from the computation of the kappa statistic can lead to different conclusions than the results from the McNemar test of symmetry. Taking an alpha level of p=0.05 as the acceptable critical value, or point at which to establish statistical importance, the researcher should observe whether the z score for kappa is greater than 1.96, since z=1.96 is the critical value associated with the commonly accepted p=0.05 alpha level and is therefore used to determine the statistical significance of the estimated association.
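One way to obtain a probability for a computed z score is a standard polynomial approximation to the normal distribution function. The following javascript sketch (our illustration, not taken from the webulator source) returns the two-tailed p-value, which can then be compared with the 0.05 criterion described above:

// Two-tailed p-value for a z score, using a common polynomial approximation
// to the standard normal distribution function.
function twoTailedP(z) {
  var x = Math.abs(z);
  var t = 1 / (1 + 0.2316419 * x);
  var poly = t * (0.319381530 + t * (-0.356563782 + t * (1.781477937
           + t * (-1.821255978 + t * 1.330274429))));
  var upperTail = Math.exp(-x * x / 2) / Math.sqrt(2 * Math.PI) * poly;
  return 2 * upperTail;
}
// twoTailedP(1.96) is approximately 0.05; any |z| above 1.96 gives p below 0.05.
// For Example 1, twoTailedP(0.21) is about 0.83 and twoTailedP(-0.95) about 0.34.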
The kappa statistic is translated as the percent agreement in responses. The webulator provides the value of the kappa statistic as well as the corresponding z score for kappa, and the 95% confidence interval [C.I. lower limit = measured percent; C.I. upper limit = measured percent].
Although no significant difference may be computed for the McNemar test, a significant z for kappa may still be observed. The interpretation in that case is as follows: a statistically significant kappa, tested under the null hypothesis (H0: k = 0), indicates that the agreement between the test responses was unlikely to occur merely by chance.
Further, in such examples, where no significant difference is computed for the McNemar test but a significant measure of association is computed for kappa, the results suggest that a statistically significant proportion of the sample (the kappa value written as a percent) agreed in the way that they responded on each test.
Such results demonstrate the effectiveness of combining measures of difference with measures of agreement in research designs, where outcomes may not be significantly different, but should not necessarily be construed as similar.
Conclusion
This paper demonstrates the utility of the internet to provide tools that are useful to researchers. In the present example, a "webulator" was developed that could provide the computations of McNemar's z test as well as the kappa estimate of association. The use of the webulator reduces the computational errors which may occur for moderate to large sample sizes. Likewise, by having such tools posted to the internet, researchers can reduce the time required to work through such tedious and cryptic computations. Webulators of the type presented here increase the accuracy and reliability of reported results, and eliminate errors in computations which may have either a direct or indirect effect on research decisions or tests of hypotheses.
References
Agresti, A. (1990). Categorical Data Analysis. Toronto: John Wiley & Sons.

Collis, B. (1996). The Internet as an Educational Innovation: Lessons from Experience with Computer Implementation. Educational Technology, November-December, 21-30.

Fleiss, J.L. (1981). Statistical Methods for Rates and Proportions (2nd edition). Toronto: John Wiley and Sons.

Gesing, T., & Schneider, J. (1997). Javascript for the World Wide Web. Berkeley: Peachpit Press.

Kent, P., & Kent, J. (1996). Official Netscape Javascript Book. Research Triangle Park: Ventana Communications Group, Inc.

Lehman, E. (1975). Nonparametrics: Statistical Methods Based on Ranks. Toronto: McGraw-Hill.

Montelpare, W.J., & McPherson, M.N. (1999). Data processing across the internet: A model for design. I.E.J.H.E., 2(3), 127-137.

Musciano, C., & Kennedy, B. (1996). HTML, The Definitive Guide. Bonn: O'Reilly & Associates, Inc.

Purcell, L., & Mara, M.J. (1997). The ABCs of Javascript. San Francisco: Sybex Inc.

Richer, M., & Richer, J. (1997). Official Netscape Livewire Book. Research Triangle Park: Ventana Communications Group, Inc.

Starr, R. (1997). Delivering Instruction on the World Wide Web: Overview and Basic Design Principles. Educational Technology, May-June, 7-15.

Stemler, L.K. (1997). Educational Characteristics of Multimedia: A Literature Review. Journal of Educational Multimedia and Hypermedia, 6(3/4), 339-359.

Streiner, D., Norman, G., & Munroe-Blum, H. (1989). PDQ Epidemiology. Toronto: B.C. Decker.

Suchower, L.J., & Copenhaver, M.D. (1996). Using the SAS System to perform McNemar's test and calculate the kappa statistic for matched pairs of data. Proceedings of the NorthEast SAS Users Group Conference, Boston, MA, 686-693.
Appendix 1 -- Computation of kappa
Appendix 2 -- A SAS program to compute McNemar z and kappa in Sample 1.

options pagesize=90 linesize=64;
SAS output for computation of McNemar z and kappa in the Sample 1 data set
StdErrMcNemar z= 0.15792
Po=0.425 Pe=0.5 kappa=-0.15
Kappaz=-0.94987 StdErrkappa z=0.15613
Kappa CI95LLA= -0.45601 Kappa CI95ULA= 0.15601
Appendix 3 -- A SAS program to compute McNemar z and kappa in Sample 2.
options pagesize=90 linesize=64;
/* this analysis created by Wm. Montelpare and M.N. McPherson 06_00 */
data mkexm2;
input @1 id field lab;
cards;
1 45 49
21 32 38
41 55 49
2 49 49
22 63 44
42 25 29
3 33 36
23 70 68
43 29 31
4 40 36
24 60 56
44 31 27
5 33 50
25 22 30
45 25 52
6 41 48
26 52 54
46 45 39
7 53 56
27 30 28
47 24 42
8 56 46
28 51 47
48 60 59
9 73 66
29 40 38
49 44 32
10 36 36
30 41 45
50 36 39
11 44 48
31 31 37
51 54 48
12 48 48
32 62 42
52 29 31
13 35 35
33 67 62
53 37 30
14 48 36
34 56 59
54 37 33
15 38 45
35 29 27
55 43 45
16 40 47
36 51 53
56 44 38
17 54 57
37 31 29
57 25 43
18 55 45
38 50 46
58 59 58
19 74 67
39 41 39
59 45 33
20 35 35
40 40 44
60 35 38
;
%macro kappatst;
n= ncell_11+ncell_12+ncell_21+ncell_22;
p11 = (ncell_11/n);
p12 = (ncell_12/n);
p21 = (ncell_21/n);
p22 = (ncell_22/n);
p1_=(p11+p12); ** row 1 probabilities **;
p2_=(p21+p22); ** row 2 probabilities **;
p_1 = (p11+p21); ** column 1 probabilities **;
p_2 = (p12+p22); ** column 2 probabilities **;
pie_obs=(p11+p22); ** observed agreement Po **;
pie_exp=((p1_*p_1)+(p2_*p_2)); ** chance-expected agreement Pe **;
kappa= (pie_obs-pie_exp)/(1-pie_exp); **kappa statistic **;
sumprob=((p1_*p_1*(p1_ + p_1)) + (p2_*p_2*(p2_ + p_2)));
stderr1=1/((1-pie_exp)*(sqrt(n)) )*sqrt(pie_exp+pie_exp**2-sumprob); ** std error of kappa under Ho: kappa=0 **;
z = (kappa/ stderr1); ** z test of Ho: kappa=0 **;
Aterm =( p11*(1-(p1_ + p_1)*(1-kappa))**2 + p22*(1-(p2_ + p_2)*(1-kappa))**2);
Bterm=((p12*(p_1 + p2_)**2 + p21*(p_2 + p1_)**2)*(1-kappa)**2);
Cterm=((kappa - pie_exp*(1-kappa))**2);
numABC= (Aterm + Bterm - Cterm);
stderr2 = sqrt(numABC /(n * ((1-pie_exp)**2))); ** std error used for the 95% confidence limits **;
ci95LLa = (kappa - 1.96*(stderr2));
ci95LL=(ci95LLa);
ci95ULa = (kappa + 1.96*(stderr2));
ci95UL=(ci95ULa);
run;
%mend;
data one;
set mkexm2;
cell_11=0;
cell_12=0;
cell_21=0;
cell_22=0;
field1=2;
if field>=42 then field1=1; ** field1: 1 = at or above the field median (42), 2 = below **;
lab2=2;
if lab>=44 then lab2=1; ** lab2: 1 = at or above the lab median (44), 2 = below **;
if field1=1 and lab2=1 then cell_11=1; ** cell "a" **;
if field1=2 and lab2=1 then cell_12=1; ** cell "b" **;
if field1=1 and lab2=2 then cell_21=1; ** cell "c" **;
if field1=2 and lab2=2 then cell_22=1; ** cell "d" **;
diff=(field1-lab2);
proc univariate; var field lab;
proc freq; table lab2*field1;
proc means noprint;
var cell_11 cell_12 cell_21 cell_22;
output out=results sum= ncell_11 ncell_12 ncell_21 ncell_22;
proc print data=results;
data results;
set results;
%kappatst
proc print data=results;
var pie_obs pie_exp kappa stderr1 z stderr2 ci95LLa ci95ULa Aterm Bterm Cterm;
run;
SAS output for computation of McNemar z and kappa
StdErrMcNemar z=0.12903
Po=0.78333 Pe=0.5 kappa= 0.56667
Kappaz=4.39182 StdErrkappa z=0.10631
Kappa CI95LLA= 0.35830 Kappa CI95ULA= 0.77504
StdErrkappa ATERM = 0.25144 StdErrkappa BTERM = 0.040592 StdErrkappa CTERM = 0.122