Definition

The observed shape of a distribution differs from the expected shape.

Explanation

Continuous measurement values commonly follow a probability distribution; body weight, for example, may be approximately normally distributed. This indicator targets discrepancies between the expected and the observed shape of a distribution. Either the expected probability distribution does not apply, or the observed shape parameter of the assumed distribution differs from its expected value.
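As a rough illustration of comparing an observed shape with an expected one, the following sketch computes moment-based skewness and excess kurtosis and reports their differences from expected values. The function names, the default expectations of 0 (which hold for a normal distribution), and the simulated data are assumptions made for this example and are not part of the indicator definition.

```python
import random


def central_moment(values, order):
    """Return the central moment of the given order (divisor n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** order for v in values) / n


def sample_skewness(values):
    """Moment-based skewness; 0 for a perfectly symmetric sample."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def sample_excess_kurtosis(values):
    """Moment-based excess kurtosis; close to 0 for normal data."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


def shape_deviation(values, expected_skewness=0.0, expected_excess_kurtosis=0.0):
    """Differences between observed and expected shape summaries.

    The defaults describe a normal distribution (assumption of this sketch).
    """
    return {
        "skewness_diff": sample_skewness(values) - expected_skewness,
        "kurtosis_diff": sample_excess_kurtosis(values) - expected_excess_kurtosis,
    }


# Simulated, hypothetical data: one roughly normal variable, one right-skewed.
rng = random.Random(42)
normal_like = [rng.gauss(70.0, 10.0) for _ in range(2000)]
right_skewed = [rng.lognormvariate(0.0, 0.75) for _ in range(2000)]

print(shape_deviation(normal_like))
print(shape_deviation(right_skewed))
```

Summary statistics such as these only capture part of a distribution's shape; a bimodal distribution, for instance, can be symmetric and still deviate clearly from the expected form, so a visual inspection remains useful.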

Example

Example 1

The peak flow measurement in a lung function examination shows a bimodal distribution instead of the expected unimodal normal distribution. Further inspection reveals that one of the three examiners misused the device. The affected values cannot be corrected and must be excluded.

Example 2

A normal distribution is expected for a creatinine measurement. However, the observed distribution is positively skewed. Inspecting the distribution and comparing it with other samples shows that assuming a normal distribution was an oversimplification.

Guidance

Deviations of observed from expected distributional shapes may indicate a wide range of issues, such as examiner effects, device effects, and sampling issues.

In a designed study, study design factors such as devices or examiners should have little effect on the shape of a distribution. Relevant associations between these factors and the measurements commonly indicate measurement error.
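One simple way to screen for such associations is to compare the values of each examiner with the pooled values of all other examiners. The sketch below does this with a standardized mean difference. The function name, the flagging threshold of 2, and the simulated peak flow values are hypothetical choices for illustration, not a prescribed method.

```python
import random
import statistics
from collections import defaultdict


def examiner_mean_shifts(records):
    """Standardized mean shift of each examiner versus all other examiners.

    records: iterable of (examiner_id, value) pairs.
    Returns {examiner_id: (own mean - mean of others) / SD of others}.
    """
    by_examiner = defaultdict(list)
    for examiner, value in records:
        by_examiner[examiner].append(value)

    shifts = {}
    for examiner, own_values in by_examiner.items():
        other_values = [
            v
            for other, values in by_examiner.items()
            if other != examiner
            for v in values
        ]
        shifts[examiner] = (
            statistics.mean(own_values) - statistics.mean(other_values)
        ) / statistics.stdev(other_values)
    return shifts


# Simulated, hypothetical peak flow values: examiner "C" measures too low.
rng = random.Random(7)
records = (
    [("A", rng.gauss(450.0, 40.0)) for _ in range(200)]
    + [("B", rng.gauss(450.0, 40.0)) for _ in range(200)]
    + [("C", rng.gauss(300.0, 40.0)) for _ in range(200)]
)

shifts = examiner_mean_shifts(records)
# The threshold of 2 standard deviations is an arbitrary choice for this sketch.
flagged = sorted(e for e, s in shifts.items() if abs(s) > 2.0)
print(shifts, flagged)
```

A shift in the mean is only one possible examiner effect; differences in spread or in the form of the distribution would need additional comparisons.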

For any interpretation, it is important to take the number of cases into account. Small numbers of cases introduce considerable uncertainty.
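To give a sense of how strongly the number of cases matters, the sketch below uses the common large-sample approximation for the standard error of the sample skewness under normality, sqrt(6 / n). The function name is hypothetical, and the approximation is an assumption of this example; it becomes less reliable for very small samples and for non-normal data.

```python
import math


def approx_skewness_standard_error(n):
    """Large-sample approximation sqrt(6 / n), valid under normality."""
    if n <= 0:
        raise ValueError("n must be positive")
    return math.sqrt(6.0 / n)


# The uncertainty of an observed skewness shrinks with the number of cases.
for n in (24, 150, 600):
    print(n, round(approx_skewness_standard_error(n), 3))
```

With only a few dozen cases, an observed skewness of several tenths is therefore still compatible with a symmetric distribution.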

Interpretation

Within variables:

The larger the deviation between the expected and observed shape/scale parameters, the higher the probability of low data quality.

Across variables:

The higher the number or percentage of variables affected by issues related to unexpected shape/scale parameters, the higher the probability of low data quality.
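The across-variable summary can be sketched as a simple count and percentage of variables whose deviation exceeds a threshold. The function name, the variable names, the deviation values, and the threshold of 0.5 are all hypothetical and serve only to illustrate the aggregation.

```python
def percent_affected(deviations, threshold):
    """Return the affected variable names and their percentage.

    deviations: {variable_name: observed minus expected shape parameter}.
    A variable counts as affected if its absolute deviation exceeds threshold.
    """
    if not deviations:
        raise ValueError("at least one variable is required")
    affected = sorted(
        name for name, deviation in deviations.items() if abs(deviation) > threshold
    )
    return affected, 100.0 * len(affected) / len(deviations)


# Hypothetical skewness deviations for four study variables.
skewness_deviations = {
    "body_weight": 0.1,
    "creatinine": 1.4,
    "peak_flow": -0.9,
    "height": 0.05,
}

affected, percent = percent_affected(skewness_deviations, threshold=0.5)
print(affected, percent)
```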

Descriptors

Literature

Nonnemacher M, Nasseh D, Stausberg J. Datenqualität in der medizinischen Forschung: Leitlinie zum Adaptiven Datenmanagement in Kohortenstudien und Registern. Berlin: TMF e.V.; 2014.
Stausberg J, Bauer U, Nasseh D, et al. Indicators of data quality: review and requirements from the perspective of networked medical research. MIBE 2019;15(1):1-8.

Indicator “Unexpected shape”