Integrity Tool Web App

Repeating Patterns Across Baseline Variables

ⓘ This test counts the frequency of values in the selected field to detect repeated patterns that might indicate data entry errors (e.g., copy-paste mistakes). Recommended fields include Parity or Education, where unusual repetition may suggest issues.
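
A minimal sketch of the frequency count described above, using only the Python standard library. The function name `value_frequencies` and the sample Parity values are hypothetical and are not taken from the tool itself.

```python
from collections import Counter

def value_frequencies(values):
    """Return (value, count) pairs, most frequent first."""
    return Counter(values).most_common()

# Hypothetical Parity column; a value that dominates the list may deserve a closer look.
parity = [0, 1, 1, 2, 1, 1, 1, 3]
for value, count in value_frequencies(parity):
    print(value, count)
```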

Terminal Digit Bias

ⓘ This test extracts the final digit from numeric values and examines its distribution. For genuinely measured values, the final digits are expected to be approximately uniformly distributed across 0 to 9. Recommended fields include Hemoglobin_Baseline or SerumFerritin_Baseline. Anomalies might signal rounding issues or data fabrication.
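
One way to sketch this check is a chi-square goodness-of-fit statistic against a uniform distribution over the ten digits. The choice of chi-square, the function names, and reading values as strings are assumptions for illustration; the tool may compare the distribution differently.

```python
from collections import Counter

# Chi-square critical value for 9 degrees of freedom at alpha = 0.05.
CHI2_CRIT_DF9_05 = 16.919

def terminal_digits(values):
    """Final digit of each value as recorded (values are read as strings)."""
    digits = []
    for v in values:
        s = str(v).strip()
        if s and s[-1] in "0123456789":
            digits.append(int(s[-1]))
    return digits

def chi_square_uniform(digits):
    """Chi-square statistic comparing digit counts with a uniform expectation."""
    counts = Counter(digits)
    expected = len(digits) / 10
    return sum((counts.get(d, 0) - expected) ** 2 / expected for d in range(10))

# Hypothetical Hemoglobin_Baseline readings
stat = chi_square_uniform(terminal_digits(["12.3", "11.0", "13.5", "12.0", "10.0"]))
print(stat, stat > CHI2_CRIT_DF9_05)
```

With so few values the statistic is not reliable; the example only shows the mechanics.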

Runs Test

ⓘ This test evaluates the randomness of a sequence of binary data (0s and 1s) by comparing the observed number of runs to the expected number. A low p-value (< 0.05) suggests non-randomness, while a high p-value gives no evidence against randomness. Suitable fields include Hysterectomy_2y, Smoking, or Diabetes.
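
The comparison of observed and expected runs can be sketched with the Wald-Wolfowitz runs test under a normal approximation. Using the normal approximation for the p-value is an assumption here, and `runs_test` is a hypothetical name.

```python
import math

def runs_test(seq):
    """Return (runs, expected_runs, z, two_sided_p) for a sequence of 0s and 1s."""
    n1 = sum(1 for x in seq if x == 1)
    n2 = len(seq) - n1
    n = n1 + n2
    if n1 == 0 or n2 == 0:
        raise ValueError("sequence must contain both 0s and 1s")
    # A new run starts wherever a value differs from the one before it.
    runs = 1 + sum(1 for a, b in zip(seq, seq[1:]) if a != b)
    expected = 2 * n1 * n2 / n + 1
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    if variance == 0:
        raise ValueError("sequence too short for the normal approximation")
    z = (runs - expected) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value from the normal distribution
    return runs, expected, z, p

# A strictly alternating sequence has far more runs than expected by chance.
print(runs_test([0, 1] * 10))
```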

Excessive Imbalances in Continuous Baseline Variables

ⓘ This test calculates the mean and standard deviation for a continuous variable, grouping the data by a selected field. It checks for imbalances in central tendency across groups. Recommended continuous fields are Age or Weight, with a grouping field such as Treatment.
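
The grouped summary can be sketched as follows. `summarize_by_group` is a hypothetical helper, and the use of the sample standard deviation (n - 1 denominator) is an assumption.

```python
from collections import defaultdict
from statistics import mean, stdev

def summarize_by_group(values, groups):
    """Return {group: (mean, sample_sd)} for a continuous variable."""
    by_group = defaultdict(list)
    for v, g in zip(values, groups):
        by_group[g].append(v)
    return {
        g: (mean(vs), stdev(vs) if len(vs) > 1 else float("nan"))
        for g, vs in by_group.items()
    }

# Hypothetical Age values with a Treatment grouping field
age = [30, 32, 34, 40, 42, 44]
treatment = ["A", "A", "A", "B", "B", "B"]
print(summarize_by_group(age, treatment))
```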

Excessive Imbalances in Baseline Categorical Variables

ⓘ This test compares frequency counts for a categorical variable across different groups. It evaluates whether the distribution of categories (e.g., Yes/No) significantly differs between groups. Recommended fields include Smoking or Diabetes, with a grouping field like Treatment.
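
The description does not name the statistic used. As a sketch, assuming a Pearson chi-square test of independence, the cross-tabulation and test statistic could be computed like this (function names are hypothetical):

```python
from collections import Counter

def chi_square_statistic(categories, groups):
    """Pearson chi-square statistic and degrees of freedom for a group x category table."""
    counts = Counter(zip(groups, categories))  # observed cell counts
    row_totals = Counter(groups)
    col_totals = Counter(categories)
    n = len(groups)
    stat = 0.0
    for g, row_n in row_totals.items():
        for c, col_n in col_totals.items():
            expected = row_n * col_n / n
            stat += (counts.get((g, c), 0) - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof

# Hypothetical Smoking field that is perfectly balanced across Treatment groups
smoking = (["Yes"] * 10 + ["No"] * 10) * 2
treatment = ["A"] * 20 + ["B"] * 20
print(chi_square_statistic(smoking, treatment))
```

The p-value would come from the chi-square distribution with the returned degrees of freedom.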

Correlation Check

ⓘ This test plots a scatter plot for two continuous variables and calculates the Pearson correlation coefficient, assessing the strength and direction of their linear relationship. For example, use SBP for X and DBP for Y. It helps verify if the expected association is present.
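
The Pearson coefficient itself can be sketched without any plotting library. `pearson_r` and the blood pressure values below are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical SBP (X) and DBP (Y) readings
sbp = [110, 120, 130, 140, 150]
dbp = [70, 78, 82, 90, 95]
print(round(pearson_r(sbp, dbp), 3))
```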

Variability Difference Test (Levene’s Test)

ⓘ This test determines whether the variability (spread) of a continuous variable differs significantly between groups. It computes absolute deviations from each group’s median and performs an ANOVA to produce an F-statistic and p-value. A low p-value (< 0.05) indicates significant differences in spread. Recommended fields include Age, SBP, or DBP with a grouping field like Treatment.
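
The two steps described above (absolute deviations from each group's median, then a one-way ANOVA on those deviations) can be sketched as follows. `levene_f` is a hypothetical name, and the sketch returns only the F-statistic and its degrees of freedom; the p-value would come from the F distribution.

```python
from statistics import mean, median

def levene_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    # Step 1: absolute deviations from each group's median.
    devs = []
    for g in groups:
        med = median(g)
        devs.append([abs(x - med) for x in g])
    # Step 2: one-way ANOVA on the deviations.
    k = len(devs)
    n_total = sum(len(d) for d in devs)
    grand = mean([x for d in devs for x in d])
    group_means = [mean(d) for d in devs]
    ss_between = sum(len(d) * (m - grand) ** 2 for d, m in zip(devs, group_means))
    ss_within = sum((x - m) ** 2 for d, m in zip(devs, group_means) for x in d)
    if ss_within == 0:
        raise ValueError("no within-group variation in the deviations")
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Two hypothetical groups with the same median but different spread
print(levene_f([[1, 2, 3, 4, 5], [-1, 1, 3, 5, 7]]))
```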

Date Violations Test

ⓘ This test checks if the dates in the selected field (e.g., RandomisationDate in dd/mm/yyyy format) fall within a defined study period. Use the provided datepickers to set the study start and end dates. It flags any records with dates outside this range.

Study Start Date

Study End Date
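
A sketch of the range check, assuming dd/mm/yyyy strings and inclusive start and end dates. Flagging unparseable dates is also an assumption, and `date_violations` is a hypothetical name.

```python
from datetime import date, datetime

def date_violations(date_strings, start, end):
    """Return (row_index, value, reason) for dates outside [start, end]."""
    flagged = []
    for i, s in enumerate(date_strings):
        try:
            d = datetime.strptime(s, "%d/%m/%Y").date()
        except ValueError:
            flagged.append((i, s, "unparseable"))  # assumption: report bad dates too
            continue
        if d < start or d > end:
            flagged.append((i, s, "out of range"))
    return flagged

# Hypothetical RandomisationDate values and study period
dates = ["01/01/2020", "15/06/2021", "31/12/2019", "31/02/2020"]
print(date_violations(dates, date(2020, 1, 1), date(2020, 12, 31)))
```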

Cumulative Allocation Plot

ⓘ This test visualizes the cumulative allocation of participants over time for each group, plotting a dynamic line chart that shows how randomisation evolved. Recommended fields are a date field (e.g., RandomisationDate in dd/mm/yyyy format) and a grouping field such as Treatment. This helps identify if allocations remain balanced over time.
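
The data behind such a chart can be sketched as running totals per group after sorting by date. `cumulative_allocation` is a hypothetical helper and the plotting step is omitted.

```python
from collections import defaultdict
from datetime import datetime

def cumulative_allocation(date_strings, groups):
    """Return {group: [(date, cumulative_count), ...]} in date order."""
    records = sorted(
        (datetime.strptime(s, "%d/%m/%Y").date(), g)
        for s, g in zip(date_strings, groups)
    )
    totals = defaultdict(int)
    series = defaultdict(list)
    for d, g in records:
        totals[g] += 1
        series[g].append((d, totals[g]))
    return dict(series)

# Hypothetical RandomisationDate and Treatment values
dates = ["02/01/2020", "01/01/2020", "03/01/2020"]
treatment = ["A", "B", "A"]
print(cumulative_allocation(dates, treatment))
```

Each group's list of points can then be drawn as one line on the chart.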

Day-of-Week Imbalance Test

ⓘ This test counts the number of participants randomised on each day of the week for each group, helping to detect imbalances (e.g., fewer randomisations on weekends). Recommended fields include a date field (e.g., RandomisationDate in dd/mm/yyyy format) and a grouping field like Treatment.
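
A sketch of the per-group weekday count, assuming dd/mm/yyyy strings. `day_of_week_counts` is a hypothetical name; weekday names are taken from a fixed list so the result does not depend on the system locale.

```python
from collections import Counter
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

def day_of_week_counts(date_strings, groups):
    """Return a Counter keyed by (group, weekday name)."""
    counts = Counter()
    for s, g in zip(date_strings, groups):
        d = datetime.strptime(s, "%d/%m/%Y")
        counts[(g, DAY_NAMES[d.weekday()])] += 1  # weekday(): Monday is 0
    return counts

# Hypothetical RandomisationDate and Treatment values
dates = ["01/01/2020", "04/01/2020", "08/01/2020"]
treatment = ["A", "A", "B"]
print(day_of_week_counts(dates, treatment))
```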

Missing Data Assessment

ⓘ This test assesses the number and proportion of missing values in a selected outcome field across groups. It identifies whether missing data are evenly distributed, which is important for data quality. Recommended fields include mat_death or hospital_days, with a grouping field such as Treatment.
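
A sketch of the count and proportion per group. What counts as missing (here `None`, blank strings, and NaN) is an assumption, as are the function names.

```python
import math
from collections import defaultdict

def is_missing(v):
    """Assumed definition of missing: None, blank string, or NaN."""
    if v is None:
        return True
    if isinstance(v, str) and v.strip() == "":
        return True
    return isinstance(v, float) and math.isnan(v)

def missing_by_group(values, groups):
    """Return {group: (n_missing, proportion_missing)}."""
    totals = defaultdict(int)
    missing = defaultdict(int)
    for v, g in zip(values, groups):
        totals[g] += 1
        if is_missing(v):
            missing[g] += 1
    return {g: (missing[g], missing[g] / totals[g]) for g in totals}

# Hypothetical hospital_days values with a Treatment grouping field
hospital_days = [1, None, "", 3, 4, None]
treatment = ["A", "A", "A", "B", "B", "B"]
print(missing_by_group(hospital_days, treatment))
```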

Implausible Event Rates Test

ⓘ This test compares the observed event rate (e.g., an outcome value of 1 or "Yes" indicating an event) in a selected binary field with an expected event rate entered by the user. It helps identify if there are unexpected discrepancies that might indicate data issues. Recommended outcome fields include mat_death or neo_death, with a grouping field such as Treatment. Enter the expected rate as a percentage (e.g., 5%).

Expected Event Rate (%)
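
A sketch of the comparison between observed and expected event rates, reported per group as percentages. The set of values treated as an event and the name `event_rates` are assumptions; the description does not say whether the tool also applies a formal statistical test, so none is shown here.

```python
from collections import defaultdict

# Assumed event coding, based on the "1 or Yes" convention described above.
EVENT_CODES = {1, "1", "Yes", "yes"}

def event_rates(outcomes, groups, expected_pct):
    """Return {group: (observed_pct, observed_pct - expected_pct)}."""
    n = defaultdict(int)
    events = defaultdict(int)
    for o, g in zip(outcomes, groups):
        n[g] += 1
        if o in EVENT_CODES:
            events[g] += 1
    result = {}
    for g in n:
        observed = 100 * events[g] / n[g]
        result[g] = (observed, observed - expected_pct)
    return result

# Hypothetical mat_death outcomes with an expected rate of 5%
outcomes = [1, 0, 0, 0, "Yes", "No", "No", "No", "No", "No"]
treatment = ["A"] * 4 + ["B"] * 6
print(event_rates(outcomes, treatment, 5))
```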

Data Preview (First 10 Records)

Map Fields and Select Tests

Results