
The OCS statistics and research questions are consistently described as the most challenging items on the exam. You need to have a good handle on research methods and the associated terminology. Use this guide to assist your studying, and you’ll be on your way to passing the OCS!
Want to see where you stand? Try this free quiz focusing on research and statistics to identify any knowledge gaps.
Statistics and Research Terminology
Reliability
Reliability is the reproducibility of a measurement. It is a determination of how precise a measurement is.
Intrarater Reliability
Intrarater reliability is how reproducible or consistent a test is when the SAME examiner performs the test on multiple occasions.
Interrater Reliability
Interrater reliability is how reproducible a test is when DIFFERENT examiners perform the measurement.
Kappa statistic
The kappa statistic is a measure of reliability for nominal data. It quantifies the percent agreement between examiners beyond what would be expected by chance. A worked example follows the interpretation scale below.
- <0.00: Poor
- 0.00 to 0.20: Slight
- 0.21 to 0.40: Fair
- 0.41 to 0.60: Moderate
- 0.61 to 0.80: Substantial
- 0.81 to 1.00: Almost Perfect
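As a minimal sketch (using made-up agreement counts for two examiners, not data from any particular study), kappa can be computed directly from a 2x2 agreement table:

```python
import numpy as np

# Hypothetical agreement counts for two examiners classifying the same
# 100 patients as test-positive or test-negative (assumed data).
#                     Examiner B: +  Examiner B: -
table = np.array([[40, 10],   # Examiner A: +
                  [5, 45]])   # Examiner A: -

n = table.sum()
p_observed = np.trace(table) / n                   # raw percent agreement (0.85)
row_marginals = table.sum(axis=1) / n
col_marginals = table.sum(axis=0) / n
p_chance = np.sum(row_marginals * col_marginals)   # agreement expected by chance (0.50)

kappa = (p_observed - p_chance) / (1 - p_chance)
print(round(kappa, 2))   # 0.7 -> "substantial" on the scale above
```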
Intraclass Correlation Coefficient
The intraclass correlation coefficient (ICC) measures both correlation and agreement for ordinal, interval, and ratio data. A small computational sketch follows the interpretation scale below.
- <0.50: Poor
- 0.50 to 0.75: Moderate
- 0.75 to 0.90: Good
- >0.90: Excellent
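As an illustrative sketch only (hypothetical ratings, and just one of several ICC forms), the single-rater consistency ICC, often written ICC(3,1), can be computed from two-way ANOVA mean squares:

```python
import numpy as np

# Hypothetical ratings: rows = 6 subjects, columns = 4 raters (assumed data).
ratings = np.array([
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
])
n, k = ratings.shape
grand_mean = ratings.mean()

# Two-way ANOVA sums of squares (no interaction term).
ss_subjects = k * np.sum((ratings.mean(axis=1) - grand_mean) ** 2)
ss_raters = n * np.sum((ratings.mean(axis=0) - grand_mean) ** 2)
ss_error = np.sum((ratings - grand_mean) ** 2) - ss_subjects - ss_raters

ms_subjects = ss_subjects / (n - 1)
ms_error = ss_error / ((n - 1) * (k - 1))

# ICC(3,1): consistency of a single rating with a fixed set of raters.
icc_3_1 = (ms_subjects - ms_error) / (ms_subjects + (k - 1) * ms_error)
print(round(icc_3_1, 2))   # about 0.71 for these hypothetical ratings
```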
Minimum Detectable Change (MDC)
The MDC is the smallest change in a score that exceeds measurement error; in other words, a change large enough that it is unlikely to be due to measurement variability alone.
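One common formulation is MDC95 = 1.96 × SEM × √2, where the SEM is derived from the baseline standard deviation and a reliability coefficient. A rough sketch with assumed values:

```python
import math

# Assumed values: baseline standard deviation and test-retest reliability (e.g., an ICC).
baseline_sd = 8.0
reliability = 0.90

sem = baseline_sd * math.sqrt(1 - reliability)   # standard error of measurement
mdc_95 = 1.96 * sem * math.sqrt(2)               # 95% confidence MDC
print(round(mdc_95, 1))   # about 7.0 -> a change must exceed ~7 points to outstrip error
```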
Minimum Clinically Important Difference (MCID)
The MCID represents the smallest improvement considered worthwhile by a patient.
Validity
Validity is the degree to which an instrument measures what it is intended to measure. It is usually reported in terms of sensitivity, specificity, and likelihood ratios.
Criterion-Related Validity
In criterion-related validity, the clinical test is compared to a known gold or reference standard, and a determination is made as to how closely the test matches that gold standard.
True Positive
A true positive is a result that accurately indicates a condition is present.
False Positive
A false positive is a test result that incorrectly indicates that a particular condition is present (Type 1 Error).
True Negative
A true negative is a test result that accurately indicates a condition is absent.
False Negative
A false negative is a test result that incorrectly indicates that a particular condition or attribute is absent (Type 2 Error).
Sensitivity
Sensitivity is the percentage of patients who have the disease of interest who register a positive test finding. A test with high sensitivity is valuable for ruling out a condition. A high sensitivity indicates a low number of false negatives.
Specificity
Specificity is the percentage of patients who do not have the disease of interest who register a negative test finding. A test with high specificity indicates a low number of false positives, so it is useful for ruling in a condition.
Positive Predictive Value
Positive predictive value is the probability that subjects with a positive screening test truly have the disease.
Negative Predictive Value
Negative predictive value is the probability that subjects with a negative screening test truly don’t have the disease.
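Tying the last several definitions together, here is a minimal sketch using hypothetical counts from a 2x2 table comparing a clinical test with a reference standard:

```python
# Hypothetical 2x2 table: clinical test result vs. reference ("gold") standard.
tp, fp = 45, 10   # test positive: disease present / disease absent
fn, tn = 5, 40    # test negative: disease present / disease absent

sensitivity = tp / (tp + fn)   # 0.90 -> few false negatives, helps rule out
specificity = tn / (tn + fp)   # 0.80 -> few false positives, helps rule in
ppv = tp / (tp + fp)           # ~0.82 -> probability of disease given a positive test
npv = tn / (tn + fn)           # ~0.89 -> probability of no disease given a negative test
print(sensitivity, specificity, round(ppv, 2), round(npv, 2))
```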
Likelihood Ratio (LR)
The likelihood ratio is the likelihood that a given test result would be expected in a patient with the target disorder compared with the likelihood that the same result would be expected in a patient without the target disorder. It takes into account both sensitivity and specificity.
Positive Likelihood Ratio (+LR)
A positive likelihood ratio (+LR) is the probability that a patient with the disease tests positive divided by the probability that a patient without the disease tests positive. The higher the +LR, the greater the value of a positive test result. It is commonly used to rule in a condition.
- 1 to 2: Alters post-test probability minimally
- 2 to 5: Alters post-test probability to a small degree
- 5 to 10: Alters post-test probability to a moderate degree
- >10: Significantly alters post-test probability
Negative Likelihood Ratio (-LR)
A negative likelihood ratio (-LR) is the probability that a person with the disease tests negative divided by the probability that a person without the disease tests negative. The lower the -LR, the greater the value of a negative test result. It is commonly used to rule out a condition; a small computation follows the scale below.
- 0.5 to 1: Alters post-test probability minimally
- 0.2 to 0.5: Alters post-test probability to a small degree
- 0.1 to 0.2: Alters post-test probability to a moderate degree
- <0.1: Significantly alters post-test probability
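Continuing the assumed sensitivity and specificity from the 2x2 sketch above, both likelihood ratios follow directly from those two numbers:

```python
# Assumed sensitivity (0.90) and specificity (0.80), carried over from the example above.
sensitivity = 0.90
specificity = 0.80

positive_lr = sensitivity / (1 - specificity)    # 4.5 -> small shift in post-test probability
negative_lr = (1 - sensitivity) / specificity    # 0.125 -> moderate shift toward ruling out
print(round(positive_lr, 1), round(negative_lr, 3))
```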
Types of Research Studies and Important Statistics
Case Report or Case-Control Study
A case report is a detailed report of the symptoms, signs, diagnosis, treatment, and follow-up of an individual patient. A case-control study examines a group of patients with a particular disease or disorder and compares them with a control group of persons who have not had that medical problem. There is often a review of medical records, or the patients are interviewed to gather the necessary information. These designs may show a relationship between two factors (e.g., the presence of neck pain and headaches) but cannot determine causation. Because of this, they are usually considered less reliable than cohort studies and randomized controlled trials.
Cohort Studies
A cohort study is an analytical study in which a group having one or more similar characteristics (such as high blood pressure) is closely monitored over time simultaneously with another group that does not have the disorder or disease. These studies are observational and not as reliable as randomized controlled trials because there may be other differences between the groups besides the presence or absence of the disease or condition.
Randomized Controlled Trials
In randomized controlled trials, an investigator tests the effectiveness of a new therapeutic procedure using an experimental design. Individuals are randomly assigned to a treatment group (experimental therapy) or a control group (placebo or standard therapy), and the outcomes are then compared. Random assignment and blinding reduce bias, and the results can provide information about cause and effect. Because of this, the design provides a higher level of evidence than the previously mentioned designs.
Systematic Reviews
During a systematic review, the researcher performs an extensive literature search to identify studies with sound methodology. The investigator reviews the studies and assesses their quality. They are then able to summarize the results according to the predetermined criteria of the review question. A systematic review is particularly useful for bringing together a number of separately conducted studies, sometimes with conflicting findings, and synthesizing their results. This summary allows us to take account of the whole range of relevant findings on a particular topic, not just the results of one or two studies. We can use the results to establish whether scientific findings are reliable and generalizable across populations, settings, and treatment variations or whether findings vary significantly by particular subgroups.
Meta-Analysis
A meta-analysis thoroughly examines a number of valid studies on a topic and mathematically combines the results using accepted statistical methodology, reporting them as if they came from one large study. In effect, the investigator increases the overall sample size, thereby improving the statistical power of the analysis as well as the precision of the estimates of treatment effects.
Levels of Evidence in Research
I. Evidence obtained from high-quality randomized controlled trials, prospective studies, or diagnostic studies.
II. Evidence obtained from lesser quality randomized controlled trials, prospective studies, or diagnostic studies (improper randomization, no blinding, < 80% follow-up)
III. Case-controlled studies or retrospective studies
IV. Case series
V. Expert Opinion
Research Design Terminology
Dependent Variable
The dependent variable is the outcome that is measured in an experiment; it is expected to change in response to the independent variable.
Independent Variable
The independent variable is what the researcher varies or manipulates during an experiment.
Control Group
The control group does not receive the experimental treatment or is treated as usual.
Experimental Group
The experimental group receives the intervention tested in the experiment.
Types of Quantitative Research Design
Descriptive Research Design
Descriptive research aims to observe and measure a phenomenon. The researcher describes the patient’s characteristics, categories, or individual journeys, as in a case study.
Survey Research Designs
Surveys allow us to gain insights into opinions and practices in large samples. They can be descriptive and/or test associations.
Cross-sectional survey research measures exposures and outcomes at a single point in time. This design is commonly used for population-based studies and investigations of disease prevalence.
Longitudinal survey research measures variables at different points in time (continuously or repeatedly). This design is very useful in identifying the relationship between risk factors and disease onset or treatment outcomes over time.
Correlational Research Designs
Correlational research designs identify relationships between variables without implying causation. The researcher describes the strength of relationships without manipulating the variables. The three primary types of correlational designs are cohort, case-control, and cross-sectional studies.
In cohort studies, a sample of participants is observed over time. For example, those exposed to a condition and those not exposed are compared for differences in one or more predefined outcomes, such as adverse events. The studies can be either prospective or retrospective in nature.
Case-control studies are retrospective. The researcher selects participants who have already experienced the outcome of interest (cases) and matches them with participants who have not (controls), using historical records to ensure the groups have similar characteristics.
Cross-sectional studies can be thought of as cohort studies in which exposure and outcome are measured at a single point in time, so only one comparison is made between exposed and unexposed subjects.
Quasi-Experimental Research Designs
Quasi-experiments are studies that aim to evaluate interventions without using randomization. Similar to randomized trials, quasi-experiments aim to demonstrate causality between an intervention and an outcome. Quasi-experimental studies can use both pre-intervention and post-intervention measurements as well as nonrandomly selected control groups. Medical researchers use quasi-experiments when it is not ethical or feasible to withhold treatment in a control group.
Experimental Research Designs
In experimental research designs, the researcher can manipulate one or more independent variables. In this way, we can study the effect on a dependent variable. The randomized controlled trial is the most important type of experimental design. It is the top method in the hierarchy of evidence to test cause and effect in clinical interventions.
Types of Statistical Data
In order to answer research questions or test hypotheses, research studies aim to collect data. The main types of data collected are as follows:
Nominal Data
Nominal data is “labeled” or “named” data which can be divided into various groups that do not overlap. Examples would be ethnicity, gender, or state of residence.
Ordinal Data
Ordinal data is a categorical, statistical data type where the variables have natural, ordered categories, but the distances between the categories are not known. An example would be a rating of perceived exertion or a verbal pain scale. These data indicate the order of values but not the degree of difference between them.
Interval Data
We measure interval data along a scale whose points are spaced at equal intervals. Examples are temperature, pH, and scores on an IQ test. There is no absolute zero, however, so we are unable to say that 50 degrees is twice as warm as 25 degrees.
Ratio Data
We measure ratio data on a scale with a true zero point. Examples would be body weight, vertical jump distance, or strength measured with a dynamometer. Because there is an absolute zero point, we can say that 10 pounds is twice as heavy as 5 pounds or that a 12-inch jump is twice as far as a 6-inch jump.
Descriptive Statistics
Descriptive statistics summarize the characteristics of a data set. They are simple techniques for describing, displaying, and summarizing data in a meaningful way. During this process, the researcher chooses a group of interest, records data about the group, and then uses summary statistics and graphs to describe the group’s properties. You are not aiming to infer properties of a larger population; you are just describing the data at hand. A short example follows the definitions below.
Mean
The average of the scores, which is appropriate to use with either ratio or interval data.
Median
The midpoint of the distribution, where 50% of the scores fall above and 50% fall below; it is appropriate to use with ordinal data.
Mode
The most frequently occurring score, which is appropriate to use with nominal data.
Range
The difference between the highest and lowest scores.
Standard Deviation
A measure of the variability (spread) of scores around the mean.
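As a quick sketch (hypothetical knee flexion measurements, in degrees), Python’s standard library computes each of these summary statistics:

```python
import statistics

# Hypothetical knee flexion measurements (degrees) for a small sample.
scores = [110, 115, 120, 120, 125, 130, 135]

print(statistics.mean(scores))    # mean: the average (~122.1)
print(statistics.median(scores))  # median: the middle score (120)
print(statistics.mode(scores))    # mode: the most frequent score (120)
print(max(scores) - min(scores))  # range: highest minus lowest (25)
print(statistics.stdev(scores))   # standard deviation: spread around the mean
```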
Inferential Statistics
Inferential statistics allow us to make inferences and draw conclusions about a population based on sample data. They enable hypothesis testing and evaluation of how well the findings apply to a larger population. The accuracy of inferential statistics depends largely on the quality of the sample data and how well the sample represents the larger population. Random sampling is extremely important for carrying out inferential techniques. The tools below are commonly used in inferential statistics.
Analysis of Variance (ANOVA)
An ANOVA is a statistical test that compares the means of three or more groups by analyzing the variance within and between them. It helps you find out whether the differences between group means are statistically significant. A one-way ANOVA tests the effect of a single independent variable (with three or more levels or groups) on one dependent variable. The end result includes a p-value, which is used to judge whether the differences between your groups are statistically significant. Usually, a p-value below 0.05 indicates that we can reject the null hypothesis.
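A minimal sketch of a one-way ANOVA, assuming three groups of hypothetical outcome scores and using SciPy:

```python
from scipy import stats

# Hypothetical outcome scores for three treatment groups (assumed data).
group_a = [24, 27, 21, 30, 26]
group_b = [31, 33, 29, 35, 32]
group_c = [22, 25, 23, 27, 24]

f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)
print(f_stat, p_value)   # a p-value below 0.05 suggests at least one group mean differs
```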
T-test
Similarly, the t-test compares the means of two groups. The resulting p-value is used to decide whether to reject the null hypothesis that the two means are equal.
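A similar sketch for an independent-samples t-test, again with hypothetical scores:

```python
from scipy import stats

# Hypothetical pain scores for an experimental and a control group (assumed data).
experimental = [3, 4, 2, 5, 3, 4]
control = [6, 5, 7, 6, 5, 6]

t_stat, p_value = stats.ttest_ind(experimental, control)
print(t_stat, p_value)   # a small p-value argues against the null hypothesis of equal means
```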
Chi-square
A chi-square test is a statistical test that compares observed results with expected results. The purpose of this test is to determine whether a difference between observed and expected counts is due to chance or to a relationship between the variables you are studying. The chi-square test is therefore a good choice for categorical (count) data, whereas the ANOVA and t-test address hypotheses about continuous data.
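A minimal sketch of a chi-square test of independence on a hypothetical 2x2 table of counts:

```python
from scipy import stats

# Hypothetical counts: treatment vs. control, improved vs. not improved.
observed = [[30, 10],   # treatment group
            [18, 22]]   # control group

chi2, p_value, dof, expected = stats.chi2_contingency(observed)
print(chi2, p_value)   # a p-value below 0.05 suggests the variables are related
```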
Regression Analysis
Regression analysis is a statistical technique for determining the relationship between a single dependent (criterion) variable and one or more independent (predictor) variables. The analysis yields a predicted value for the criterion resulting from a linear combination of the predictors.
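A minimal sketch of a simple linear regression with one hypothetical predictor:

```python
from scipy import stats

# Hypothetical data: weekly exercise sessions (predictor) and an outcome score (criterion).
sessions = [1, 2, 3, 4, 5, 6]
outcome = [52, 55, 61, 64, 70, 73]

result = stats.linregress(sessions, outcome)
# Predicted outcome = intercept + slope * sessions
print(result.slope, result.intercept, result.rvalue ** 2)
```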
Summary
It cannot be emphasized enough how important a working knowledge of research and statistics is for passing the OCS. You should be able to understand both new and historical research and how to use the results to enhance clinical decision-making. Refer to this post often!





Carol Grgic, PT, OCS, CSCS
Carol Grgic, PT, OCS, CSCS is a Physical Therapist practicing in Milwaukee, WI. She is the owner and treating Physical Therapist at Elite Bodyworks, a self-pay Physical Therapy clinic. She is the creator of OCStestprep.com and enjoys blogging about Physical Therapy related topics.