The Data type mismatch indicator can be calculated using int_datatype_matrix in the following way:

# Load dataquieR
library(dataquieR)

# Load data
sd1 <- prep_get_data_frame("ship")
# sd1 <- as.data.frame(sd1) # "untibble" it 

# Load metadata
file_name <- system.file("extdata", "ship_meta_v2.xlsx", package = "dataquieR")
prep_load_workbook_like_file(file_name)
meta_data_item <- prep_get_data_frame("item_level") # item_level is a sheet in ship_meta_v2.xlsx

# Apply indicator function
datatype_res <- int_datatype_matrix(
  study_data = sd1, 
  meta_data = meta_data_item, 
  label_col = "LONG_LABEL"
)

A plot and a table are provided to view the results:

datatype_res$SummaryPlot

datatype_res$SummaryData

	Variables	MATCH	STUDY_SEGMENT
22	Participant ID	Matching datatype	INTRO
14	Examination date and time	Matching datatype	INTRO
23	Sex	Matching datatype	INTRO
1	Age	Matching datatype	INTRO
4	Blood pressure examiner	Matching datatype	SOMATOMETRY
3	Blood pressure device ID	Matching datatype	SOMATOMETRY
26	Systolic blood pressure 1	Matching datatype	SOMATOMETRY
27	Systolic blood pressure 2	Matching datatype	SOMATOMETRY
9	Diastolic blood pressure 1	Matching datatype	SOMATOMETRY
10	Diastolic blood pressure 2	Matching datatype	SOMATOMETRY
25	Somatometry examiner	Matching datatype	SOMATOMETRY
5	Body height	Matching datatype	SOMATOMETRY
6	Body height scale ID	Matching datatype	SOMATOMETRY
7	Body weight	Matching datatype	SOMATOMETRY
8	Body weight scale ID	Matching datatype	SOMATOMETRY
29	Waist circumference	Non-matching datatype	SOMATOMETRY
17	Interview examiner	Matching datatype	INTERVIEW
16	Highest educational level	Matching datatype	INTERVIEW
20	Marital status	Matching datatype	INTERVIEW
24	Smoking status	Matching datatype	INTERVIEW
12	Ever had stroke	Matching datatype	INTERVIEW
11	Ever had myocardial infarction	Matching datatype	INTERVIEW
18	Known diabetes	Matching datatype	INTERVIEW
2	Age of diabetes onset	Matching datatype	INTERVIEW
13	Ever taken birth control pills	Matching datatype	INTERVIEW
21	Monthly household income	Matching datatype	INTERVIEW
15	HDL-cholesterol	Matching datatype	LABORATORY
19	LDL-cholesterol	Matching datatype	LABORATORY
28	Total cholesterol	Matching datatype	LABORATORY
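
With many variables, it can help to reduce the table to the rows that need attention. The following is a minimal sketch in base R, not part of dataquieR. It uses a small toy data frame as a stand-in for datatype_res$SummaryData and assumes the column names Variables, MATCH and STUDY_SEGMENT as well as the value "Matching datatype" exactly as shown in the table above; these may differ between package versions.

```r
# Toy stand-in for datatype_res$SummaryData
# (assumption: column names and values as printed in the table above)
summary_data <- data.frame(
  Variables = c("Body weight", "Waist circumference", "Interview examiner"),
  MATCH = c("Matching datatype", "Non-matching datatype", "Matching datatype"),
  STUDY_SEGMENT = c("SOMATOMETRY", "SOMATOMETRY", "INTERVIEW"),
  stringsAsFactors = FALSE
)

# Keep only the rows that are not flagged as matching
mismatches <- summary_data[summary_data$MATCH != "Matching datatype", ]
print(mismatches)
```

Applied to the real result, the same subsetting expression would use datatype_res$SummaryData in place of the toy data frame.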

All datatype issues found by int_datatype_matrix should be checked data element by data element. For instance, a major issue was found in the variable WAIST_CIRC_0. It is stored in the study data as datatype character, which differs from the expected datatype float defined in the metadata. A basic check of the characters it contains shows that commas were misused as the decimal delimiter in some values.

int_inspect_char(sd1$waist)

Character	Count
,	3
.	2144
0	933
1	1443
2	908
3	884
4	889
5	898
6	1018
7	1279
8	1355
9	1409
NA	3
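
A comparable character tabulation can be sketched in base R. This is only an illustration on a toy vector and makes no claim about how int_inspect_char is implemented; the vector x and its values are invented for the example.

```r
# Toy character vector standing in for a numeric column that was read as text
x <- c("85.4", "92,1", "78.0", NA, "101,5")

# Split each non-missing value into single characters and count them
chars <- unlist(strsplit(x[!is.na(x)], split = ""))
char_counts <- table(chars)
print(char_counts)

# List the values that contain a comma (missing values never match)
with_comma <- grep(",", x, fixed = TRUE, value = TRUE)
print(with_comma)
```

Listing the affected values before changing anything makes it easier to confirm that the comma is really a decimal delimiter and not, for example, a thousands separator.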

Converting WAIST_CIRC_0 directly to datatype numeric would coerce the values containing a comma to NA, which should be avoided. Instead, we first replace the comma with the correct delimiter and then convert the datatype, so that no data values are lost. The resulting applicability plot shows no more issues.

# replace comma with the correct delimiter
sd1$waist <- as.numeric(gsub(",", ".", sd1$waist))

int_datatype_matrix(
  study_data = sd1, 
  meta_data = meta_data_item, 
  label_col = "LONG_LABEL"
)$SummaryPlot
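
The difference between a direct conversion and a conversion after replacing the delimiter can be seen on a small toy vector. This is a sketch in base R; the vector x is invented for illustration and is not taken from the SHIP data.

```r
# Toy values: one with a decimal point, one with a decimal comma, one missing
x <- c("85.4", "92,1", NA)

# Direct conversion: "92,1" cannot be parsed and becomes NA (with a warning)
direct <- suppressWarnings(as.numeric(x))

# Replace the comma first, then convert: the value is preserved
fixed <- as.numeric(gsub(",", ".", x, fixed = TRUE))

# Compare the number of missing values before and after each conversion
n_missing <- c(
  original = sum(is.na(x)),
  direct = sum(is.na(direct)),
  fixed = sum(is.na(fixed))
)
print(n_missing)
```

If the number of missing values after the conversion equals the number in the original character vector, no data values were lost.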
