The Chi-square is a significance statistic and should be followed with a strength statistic. Advantages of the Chi-square include its robustness with respect to the distribution of the data, its ease of computation, the detailed information that can be derived from the test, its use in studies for which parametric assumptions cannot be met, and its flexibility in handling data from both two-group and multiple-group studies.
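The text does not name which strength statistic should follow the test. One common choice is Cramér's V, so the minimal sketch below assumes it; the function name `cramers_v` and its parameters are illustrative only.

```python
import math


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramér's V = sqrt(chi2 / (n * (min(n_rows, n_cols) - 1))).

    chi2   : an already computed Chi-square statistic
    n      : total sample size
    n_rows : number of row categories
    n_cols : number of column categories
    """
    k = min(n_rows, n_cols) - 1
    if n <= 0 or k <= 0:
        raise ValueError("need n > 0 and at least a 2 x 2 table")
    return math.sqrt(chi2 / (n * k))


# Hypothetical numbers, chosen only to illustrate the call.
print(cramers_v(50.0, 200, 2, 3))
```

V ranges from 0 (no association) to 1 (perfect association), which is why it complements a significance test that only says whether an association is likely to be real.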
The Chi-square test of independence, also known as the Pearson Chi-square test or simply the Chi-square, is one of the most useful statistics for testing hypotheses when the variables are nominal, as often happens in clinical research. The Chi-square test is a non-parametric statistic, also called a distribution-free test.
Non-parametric tests should be used when any one of the following conditions pertains to the data. The original data were measured at an interval or ratio level but violate one of the following assumptions of a parametric test: the distribution of the data was seriously skewed or kurtotic (parametric tests assume an approximately normal distribution of the dependent variable), and thus the researcher must use a distribution-free statistic rather than a parametric statistic.
For any of a number of reasons, the continuous data were collapsed into a small number of categories, and thus the data are no longer interval or ratio. Inferential statistics assume random sampling. However, it is not uncommon to find inferential statistics used when data are from convenience samples rather than random samples.
To have confidence in the results when the random sampling assumption is violated, several replication studies should be performed with essentially the same result obtained. Each non-parametric test has its own specific assumptions as well.
The assumptions of the Chi-square include the following. The data in the cells should be frequencies, or counts of cases, rather than percentages or some other transformation of the data.
The levels (or categories) of the variables are mutually exclusive. That is, a particular subject fits into one and only one level of each of the variables. If, for example, the same subjects are tested over time such that the comparisons are of the same subjects at Time 1, Time 2, Time 3, etc., then the Chi-square may not be used. The study groups must be independent. This means that a different test must be used if the two groups are related.
There are 2 variables, and both are measured as categories, usually at the nominal level. However, the data may be ordinal.
Interval or ratio data that have been collapsed into ordinal categories may also be used. While Chi-square has no rule about limiting the number of cells (by limiting the number of categories for each variable), a very large number of cells (over 20) can make it difficult to meet assumption 6 below and to interpret the meaning of the results. This assumption is most likely to be met if the sample size equals at least the number of cells multiplied by 5.
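The sample-size rule of thumb above can be sketched as a quick check. The function name `meets_sample_size_rule` and the `per_cell` parameter are illustrative assumptions, and passing the check only makes it more likely that the assumption is met; it does not guarantee it.

```python
def meets_sample_size_rule(n_subjects, n_rows, n_cols, per_cell=5):
    """Return True if the sample has at least per_cell subjects per cell.

    n_rows and n_cols are the numbers of categories of the two variables,
    so the table has n_rows * n_cols cells.
    """
    n_cells = n_rows * n_cols
    return n_subjects >= per_cell * n_cells


# A 2 x 3 table has 6 cells, so the rule asks for at least 30 subjects.
print(meets_sample_size_rule(30, 2, 3))
print(meets_sample_size_rule(29, 2, 3))
```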
This requirement will be fully explained in the calculation of the statistic in the case study example.
The owner of a laboratory wants to keep sick leave as low as possible by keeping employees healthy through disease prevention programs.
Many employees have contracted pneumonia, leading to productivity problems due to sick leave from the disease. There is a vaccine for pneumococcal pneumonia, and the owner believes that it is important to get as many employees vaccinated as possible. Due to a production problem at the company that produces the vaccine, there is only enough vaccine for half the employees.
In effect, there are two groups: employees who received the vaccine and employees who did not receive the vaccine. The company sent a nurse to every employee who contracted pneumonia to provide home health care and to take a sputum sample for culture to determine the causative agent.
They kept track of the number of employees who contracted pneumonia and which type of pneumonia each had. The data were organized as follows. In this case, the independent variable is vaccination status (vaccinated versus unvaccinated).
The dependent variable is health outcome, which has three levels. The company wanted to know if providing the vaccine made a difference. To answer this question, they must choose a statistic that can test for differences when all the variables are nominal.
The formula for calculating a Chi-square is χ² = Σ (O − E)² / E, where O is the observed count in a cell, E is the expected count in that cell, and the sum runs over all cells of the table. The first step is to calculate the row and column totals, known as the marginals. The marginal values for the case study data are presented in Table 2. The second step is to calculate the expected values for each cell. Expected values must reflect both the incidence of cases in each category and the unbiased distribution of cases if there is no vaccine effect. This means the statistic cannot just take the total N and divide by 6 for the expected number in each cell. That would not take account of the fact that more subjects stayed healthy regardless of whether they were vaccinated or not.
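As a minimal sketch of the marginals step, the snippet below uses a hypothetical 2 x 3 table. The counts are invented for illustration and are not the case study's data, and the column order (healthy, then the two pneumonia types) is an assumption.

```python
# Hypothetical counts, NOT the case study's data.
# Rows: vaccinated, unvaccinated. Columns: three health outcomes.
observed = [
    [80, 10, 10],
    [60, 30, 10],
]

# Row marginals: total subjects in each vaccination group.
row_totals = [sum(row) for row in observed]

# Column marginals: total subjects with each health outcome.
col_totals = [sum(col) for col in zip(*observed)]

# Grand total N.
n = sum(row_totals)

# The naive "N divided by 6" expectation ignores how unevenly the
# outcomes are distributed across the columns.
naive_expected = n / 6

print(row_totals, col_totals, n, naive_expected)
```

In this made-up table most subjects fall in the first column, so a flat expectation of N/6 per cell would misrepresent every column; the marginals are what carry that information into the expected values.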
Chi-square expected values are calculated as E = (M_R × M_C) / n, where M_R is the cell's row marginal, M_C is its column marginal, and n is the total sample size. Specifically, for each cell, its row marginal is multiplied by its column marginal, and that product is divided by the sample size. Table 3 provides the results of this calculation for each cell. A Chi-square table of significances is available in many elementary statistics texts and on many Internet sites. This is a result of the observed value being 23 while fewer cases were expected. Therefore, this cell has a much larger number of observed cases than would be expected by chance.
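The expected-value rule and the per-cell comparison described above can be sketched end to end. The counts below are hypothetical and are not the case study's data, and the function name `chi_square_independence` is illustrative. The closed-form p-value used here holds only for 2 degrees of freedom, which is what a 2 x 3 table has; other table sizes need a Chi-square table or a statistics library.

```python
import math

# Hypothetical 2 x 3 counts, NOT the case study's data.
observed = [
    [80, 10, 10],
    [60, 30, 10],
]


def chi_square_independence(table):
    """Return expected counts, per-cell contributions, chi2 and df."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Expected count = row marginal * column marginal / sample size.
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    # Each cell contributes (O - E)^2 / E to the statistic.
    contributions = [
        [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
        for obs_row, exp_row in zip(table, expected)
    ]

    chi2 = sum(sum(row) for row in contributions)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, contributions, chi2, df


expected, contributions, chi2, df = chi_square_independence(observed)

# With exactly 2 degrees of freedom the upper-tail probability is
# exp(-chi2 / 2); this shortcut does not apply to other df values.
p_value = math.exp(-chi2 / 2) if df == 2 else None

print(expected)
print(round(chi2, 2), df, p_value)
```

Looking at `contributions` cell by cell shows which cells drive the result: a large value means the observed count in that cell is far from what chance alone would predict.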
Because Chi-square accepts weaker, less precise data as input, it is sometimes given less status than other statistical tests. It can be applied in surveys, business, decision making, quality control, and medical and biological studies. It can be used in a wide variety of research contexts since it is not restrictive about the data it accepts. Chi-square is easy to calculate and interpret. It can also be used in the analysis of nominal data.
How Chi-square Works
Chi-square always tests the null hypothesis that there is no significant difference between the expected and observed results.
There are several points to note when testing your hypothesis and calculating Chi-square. Determining the expected numbers for each observational class is crucial. Complete all the calculations using the Chi-square formula, and consider rounding the answer to two significant digits. Use a Chi-square table to determine the significance of the value.
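Using a Chi-square table amounts to comparing the computed statistic with a critical value for the chosen significance level and degrees of freedom. The sketch below hard-codes a small excerpt of standard critical values (degrees of freedom 1 to 5, levels 0.05 and 0.01); the names `CRITICAL_VALUES` and `is_significant` are illustrative assumptions, and a fuller table is needed for other cases.

```python
# Excerpt of standard Chi-square critical values, keyed by
# significance level and then by degrees of freedom.
CRITICAL_VALUES = {
    0.05: {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070},
    0.01: {1: 6.635, 2: 9.210, 3: 11.345, 4: 13.277, 5: 15.086},
}


def is_significant(chi2, df, alpha=0.05):
    """Return True if chi2 exceeds the tabulated critical value."""
    try:
        critical = CRITICAL_VALUES[alpha][df]
    except KeyError:
        raise ValueError("alpha/df not in this excerpt; consult a full table")
    return chi2 > critical


# Hypothetical statistics, chosen only to illustrate the lookup.
print(is_significant(12.86, 2))
print(is_significant(3.0, 1))
```

If the statistic exceeds the critical value, the null hypothesis of no difference between observed and expected results is rejected at that significance level.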