University of Sydney IPD Integrity Tool
About

Data Preview (First 10 Records)

rows x columns

Map Fields and Select Tests

ⓘ This test counts the frequency of values in the selected field to detect repeated patterns that might indicate data entry errors (e.g., copy-paste mistakes). Recommended fields include Parity or Education, where unusual repetition may suggest issues.
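A minimal sketch of the kind of frequency count described above. The function name `repeated_values`, the `min_count` threshold, and the sample data are illustrative assumptions, not the tool's own code.

```python
from collections import Counter

def repeated_values(values, min_count=3):
    """Return (value, count) pairs seen at least min_count times, most frequent first."""
    counts = Counter(values)
    return [(v, c) for v, c in counts.most_common() if c >= min_count]

# Made-up Parity values for illustration only.
parity = [0, 1, 1, 2, 1, 3, 1, 0, 2, 1]
print(repeated_values(parity))
```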
ⓘ This test extracts the final digit from numeric values and examines its distribution. For continuous measurements recorded without rounding, the final digits should be approximately uniformly distributed. Recommended fields include Hemoglobin_Baseline or SerumFerritin_Baseline. Anomalies might signal rounding issues or data fabrication.
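One way to examine the final-digit distribution is a chi-square goodness-of-fit statistic against a uniform expectation. The sketch below assumes values are available as the strings originally recorded (so trailing zeros are preserved). The name `last_digit_chi2` is hypothetical, and whether the tool uses this particular statistic is an assumption.

```python
from collections import Counter

# Chi-square critical value for 9 degrees of freedom at alpha = 0.05.
CHI2_CRIT_DF9_05 = 16.919

def last_digit_chi2(raw_values):
    """Chi-square statistic of the last recorded digit versus a uniform distribution."""
    digits = []
    for s in raw_values:
        s = s.strip()
        if s and s[-1] in "0123456789":
            digits.append(s[-1])
    counts = Counter(digits)
    expected = len(digits) / 10  # uniform expectation per digit
    stat = sum((counts.get(str(d), 0) - expected) ** 2 / expected for d in range(10))
    return stat, counts

# Made-up haemoglobin readings that all end in 0 (heavy rounding).
stat, counts = last_digit_chi2(["12.0", "11.0", "13.0", "10.0"] * 5)
print(stat > CHI2_CRIT_DF9_05)
```

A statistic above the critical value suggests the final digits are not uniform. With few records the chi-square approximation is rough, so treat the result as a prompt for inspection rather than a verdict.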
ⓘ This test evaluates the randomness in a sequence of binary data (0s and 1s) by comparing the observed number of runs to the expected number. A low p-value (< 0.05) suggests non-randomness, while a high p-value gives no evidence against randomness. Suitable fields include Hysterectomy_2y, Smoking, or Diabetes.
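The comparison of observed and expected runs can be sketched with the Wald-Wolfowitz runs test using a normal approximation. This is an assumption about the method; the tool may use an exact version. The function name `runs_test` is hypothetical.

```python
from math import erfc, sqrt

def runs_test(bits):
    """Wald-Wolfowitz runs test (normal approximation) for a sequence of 0s and 1s."""
    n1 = sum(1 for b in bits if b == 1)
    n0 = len(bits) - n1
    if n1 == 0 or n0 == 0:
        raise ValueError("sequence must contain both 0s and 1s")
    # A new run starts wherever a value differs from its predecessor.
    runs = 1 + sum(1 for a, b in zip(bits, bits[1:]) if a != b)
    n = n1 + n0
    mu = 2 * n1 * n0 / n + 1
    var = 2 * n1 * n0 * (2 * n1 * n0 - n) / (n ** 2 * (n - 1))
    if var == 0:
        raise ValueError("sequence too short for the normal approximation")
    z = (runs - mu) / sqrt(var)
    p_value = erfc(abs(z) / sqrt(2))  # two-sided p-value
    return runs, mu, z, p_value

# A perfectly alternating sequence has far more runs than expected by chance.
print(runs_test([0, 1] * 10))
```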
ⓘ This test calculates the mean and standard deviation for a continuous variable, grouping the data by a selected field. It checks for imbalances in central tendency across groups. Recommended continuous fields are Age or Weight, with a grouping field such as Treatment.
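A minimal sketch of the grouped summary described above, using only the standard library. The function name `summarise_by_group` and the sample data are illustrative assumptions.

```python
from collections import defaultdict
from statistics import mean, stdev

def summarise_by_group(values, groups):
    """Return {group: (n, mean, sample SD)} for a continuous variable."""
    by_group = defaultdict(list)
    for v, g in zip(values, groups):
        by_group[g].append(v)
    return {
        g: (len(x), mean(x), stdev(x) if len(x) > 1 else float("nan"))
        for g, x in by_group.items()
    }

# Made-up ages for two treatment arms.
age = [30, 32, 34, 28, 30, 32]
arm = ["A", "A", "A", "B", "B", "B"]
print(summarise_by_group(age, arm))
```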
ⓘ This test compares frequency counts for a categorical variable across different groups. It evaluates whether the distribution of categories (e.g., Yes/No) significantly differs between groups. Recommended fields include Smoking or Diabetes, with a grouping field like Treatment.
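A comparison of category frequencies across groups is commonly done with a chi-square test of independence; the sketch below computes that statistic and its degrees of freedom. Treating this as the tool's method is an assumption, and `chi2_independence` is a hypothetical name.

```python
from collections import Counter

def chi2_independence(categories, groups):
    """Chi-square statistic and degrees of freedom for a group-by-category table."""
    observed = Counter(zip(groups, categories))
    row_totals = Counter(groups)
    col_totals = Counter(categories)
    n = len(categories)
    stat = 0.0
    for g, row_total in row_totals.items():
        for c, col_total in col_totals.items():
            expected = row_total * col_total / n
            stat += (observed[(g, c)] - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df

# Made-up smoking status for two arms of 20 participants each.
arm = ["A"] * 20 + ["B"] * 20
smoking = ["Yes"] * 15 + ["No"] * 5 + ["Yes"] * 5 + ["No"] * 15
print(chi2_independence(smoking, arm))
```

The p-value follows from the chi-square distribution with `df` degrees of freedom, for example via `scipy.stats.chi2.sf(stat, df)` if SciPy is available.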
ⓘ This test draws a scatter plot of two continuous variables and calculates the Pearson correlation coefficient, assessing the strength and direction of their linear relationship. For example, use SBP for X and DBP for Y. It helps verify whether the expected association is present.
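The Pearson coefficient itself can be sketched in a few lines (plotting omitted). The function name `pearson_r` and the sample blood pressure values are illustrative assumptions.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Made-up systolic and diastolic readings with a perfect linear relationship.
sbp = [110, 120, 130, 140]
dbp = [70, 75, 80, 85]
print(pearson_r(sbp, dbp))
```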
ⓘ This test determines whether the variability (spread) of a continuous variable differs significantly between groups. It computes absolute deviations from each group’s median and performs a one-way ANOVA on those deviations to produce an F-statistic and p-value (the median-based form of Levene’s test, also known as the Brown-Forsythe test). A low p-value (< 0.05) indicates significant differences in spread. Recommended fields include Age, SBP, or DBP with a grouping field like Treatment.
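The procedure described above (absolute deviations from each group's median, then a one-way ANOVA on those deviations) can be sketched as follows. The function name `median_deviation_f` is hypothetical, and the sketch returns only the F-statistic and its degrees of freedom.

```python
from collections import defaultdict
from statistics import mean, median

def median_deviation_f(values, groups):
    """F-statistic from a one-way ANOVA on absolute deviations from group medians."""
    by_group = defaultdict(list)
    for v, g in zip(values, groups):
        by_group[g].append(v)
    devs = {}
    for g, x in by_group.items():
        centre = median(x)
        devs[g] = [abs(v - centre) for v in x]
    all_devs = [d for z in devs.values() for d in z]
    k, n = len(devs), len(all_devs)
    grand_mean = mean(all_devs)
    between = sum(len(z) * (mean(z) - grand_mean) ** 2 for z in devs.values())
    within = sum((d - mean(z)) ** 2 for z in devs.values() for d in z)
    f_stat = (between / (k - 1)) / (within / (n - k))
    return f_stat, k - 1, n - k

# Made-up values: group B is twice as spread out as group A.
values = [1, 2, 3, 4, 5, -1, 1, 3, 5, 7]
arm = ["A"] * 5 + ["B"] * 5
print(median_deviation_f(values, arm))
```

If SciPy is available, `scipy.stats.levene(a, b, center="median")` computes the same statistic together with a p-value.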
ⓘ This test checks if the dates in the selected field (e.g., RandomisationDate in dd/mm/yyyy format) fall within a defined study period. Use the provided datepickers to set the study start and end dates. It flags any records with dates outside this range.
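A minimal sketch of the range check, assuming dates arrive as dd/mm/yyyy strings. The function name `out_of_range` is hypothetical, and flagging unparseable dates alongside out-of-range ones is an added assumption rather than documented behaviour.

```python
from datetime import datetime

def out_of_range(date_strings, start, end, fmt="%d/%m/%Y"):
    """Return (row index, value, reason) for dates outside [start, end]."""
    first = datetime.strptime(start, fmt)
    last = datetime.strptime(end, fmt)
    flagged = []
    for i, s in enumerate(date_strings):
        try:
            d = datetime.strptime(s, fmt)
        except ValueError:
            # Assumption: invalid dates are reported rather than silently skipped.
            flagged.append((i, s, "unparseable"))
            continue
        if not first <= d <= last:
            flagged.append((i, s, "outside study period"))
    return flagged

# Made-up randomisation dates checked against a 2020 study period.
dates = ["01/02/2020", "31/12/2019", "15/06/2021", "31/02/2020"]
print(out_of_range(dates, "01/01/2020", "31/12/2020"))
```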
ⓘ This test visualizes the cumulative allocation of participants over time for each group, plotting a dynamic line chart that shows how randomisation evolved. Recommended fields are a date field (e.g., RandomisationDate in dd/mm/yyyy format) and a grouping field such as Treatment. This helps identify if allocations remain balanced over time.
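The data behind such a chart can be sketched as a running count per group after sorting by date (plotting omitted). The function name `cumulative_allocation` and the sample records are illustrative assumptions.

```python
from collections import defaultdict
from datetime import datetime

def cumulative_allocation(date_strings, groups, fmt="%d/%m/%Y"):
    """Return {group: [(date, cumulative count), ...]} in chronological order."""
    records = sorted(
        (datetime.strptime(s, fmt).date(), g) for s, g in zip(date_strings, groups)
    )
    totals = defaultdict(int)
    series = defaultdict(list)
    for day, g in records:
        totals[g] += 1
        series[g].append((day, totals[g]))
    return dict(series)

# Made-up randomisation dates for two arms.
dates = ["03/01/2020", "01/01/2020", "02/01/2020", "04/01/2020"]
arm = ["A", "A", "B", "B"]
for group, points in cumulative_allocation(dates, arm).items():
    print(group, points)
```

In a balanced trial the lines for each group should track one another closely; a line that pulls away or stalls for long stretches is worth a closer look.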
ⓘ This test counts the number of participants randomised on each day of the week for each group, helping to detect imbalances (e.g., fewer randomisations on weekends). Recommended fields include a date field (e.g., RandomisationDate in dd/mm/yyyy format) and a grouping field like Treatment.
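A minimal sketch of the weekday tally, assuming dd/mm/yyyy date strings. The function name `weekday_counts` is hypothetical.

```python
from collections import Counter
from datetime import datetime

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

def weekday_counts(date_strings, groups, fmt="%d/%m/%Y"):
    """Count randomisations per (group, weekday name)."""
    counts = Counter()
    for s, g in zip(date_strings, groups):
        # datetime.weekday() returns 0 for Monday through 6 for Sunday.
        day_name = DAYS[datetime.strptime(s, fmt).weekday()]
        counts[(g, day_name)] += 1
    return counts

# Made-up records: two on Monday 6 January 2020, one on Saturday 11 January 2020.
dates = ["06/01/2020", "06/01/2020", "11/01/2020"]
arm = ["A", "B", "A"]
print(weekday_counts(dates, arm))
```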
ⓘ This test assesses the number and proportion of missing values in a selected outcome field across groups. It identifies whether missing data are evenly distributed, which is important for data quality. Recommended fields include mat_death or hospital_days, with a grouping field such as Treatment.
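A minimal sketch of the missingness summary. Which values count as missing is an assumption here (`None`, NaN, empty strings, and the string "NA"); the function name `missing_by_group` is hypothetical.

```python
from collections import defaultdict

def is_missing(v):
    """Assumed definition of missing: None, NaN, empty string, or 'NA'."""
    if v is None:
        return True
    if isinstance(v, float) and v != v:  # NaN is the only float not equal to itself
        return True
    return isinstance(v, str) and v.strip() in ("", "NA")

def missing_by_group(values, groups):
    """Return {group: (missing count, total, proportion missing)}."""
    total = defaultdict(int)
    missing = defaultdict(int)
    for v, g in zip(values, groups):
        total[g] += 1
        if is_missing(v):
            missing[g] += 1
    return {g: (missing[g], total[g], missing[g] / total[g]) for g in total}

# Made-up hospital_days values for two arms.
hospital_days = [3, None, "", 5, "NA", 2]
arm = ["A", "A", "A", "B", "B", "B"]
print(missing_by_group(hospital_days, arm))
```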
ⓘ This test compares the observed event rate (e.g., an outcome value of 1 or "Yes" indicating an event) in a selected binary field with an expected event rate entered by the user. It helps identify if there are unexpected discrepancies that might indicate data issues. Recommended outcome fields include mat_death or neo_death, with a grouping field such as Treatment. Enter the expected rate as a percentage (e.g., 5%).
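The comparison of observed and expected event rates could be sketched with an exact two-sided binomial test. The tooltip does not say which test the tool applies, so this choice is an assumption, as are the function name `event_rate_check` and the set of values treated as events. To compare by Treatment, call it once per group.

```python
from math import comb

def event_rate_check(outcomes, expected_pct, event_values=(1, "1", "Yes")):
    """Return (observed rate, exact two-sided binomial p-value) against expected_pct."""
    n = len(outcomes)
    k = sum(1 for v in outcomes if v in event_values)
    p0 = expected_pct / 100  # expected rate is entered as a percentage
    pmf = [comb(n, i) * p0 ** i * (1 - p0) ** (n - i) for i in range(n + 1)]
    # Two-sided p-value: total probability of outcomes no more likely than the observed one.
    p_value = sum(p for p in pmf if p <= pmf[k] * (1 + 1e-7))
    return k / n, min(1.0, p_value)

# Made-up outcome field: 30 events in 100 participants against an expected 5%.
outcomes = [1] * 30 + [0] * 70
print(event_rate_check(outcomes, 5))
```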

Results