Overview

Dataset statistics

Number of variables: 15
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 12.9 KiB
Average record size in memory: 132.3 B

Variable types

Text: 1
DateTime: 3
Numeric: 11

Dataset

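Description: Blood test data from patients with diabetes. The test items are HbA1c, BUN, creatinine, AST (GOT), ALT (GPT), TC, TG, HDL, and LDL.
- HbA1c (glycated hemoglobin): hemoglobin inside red blood cells to which glucose has bound. An ordinary blood glucose test shows only the glucose level at the time of testing, whereas HbA1c reflects the average blood glucose over about three months.
- BUN (blood urea nitrogen): an item used to evaluate liver cell damage or kidney function.
- Creatinine: produced in muscle from creatine. Because it is little affected by factors other than kidney function, it is useful for evaluating renal function.
- AST (aspartate aminotransferase; GOT, glutamic oxaloacetic transaminase) and ALT (alanine aminotransferase; GPT, glutamic pyruvic transaminase): aminotransferases that reflect liver cell damage and are basic liver function test items.
- MDRD eGFR (estimated glomerular filtration rate): a value that indicates how well the kidneys are functioning, calculated by applying the MDRD formula to the measured blood creatinine level.
- Total cholesterol (TC): all cholesterol present in the blood.
- Triglyceride (TG): the amount of triglycerides in the blood. The reason blood triglycerides increase is not clear, but an increase is associated with a higher risk of progression to cardiovascular disease.
- HDL (high-density lipoprotein) cholesterol: also called good cholesterol. It absorbs cholesterol and carries it back to the liver. A high HDL cholesterol level may lower the risk of heart disease and stroke.
- LDL (low-density lipoprotein) cholesterol: also called bad cholesterol. It accounts for most of the body's cholesterol, and a high level raises the risk of heart disease and stroke.
Author: Seoul St. Mary's Hospital, The Catholic University of Korea (가톨릭대학교 서울성모병원)
URL: http://cmcdata.net/data/dataset/diabetes_lab2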
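
The description states that the eGFR value is calculated from blood creatinine with the MDRD formula. As a rough sketch of that calculation, the snippet below uses the commonly cited 4-variable MDRD equation with the IDMS-traceable coefficient 175. The report does not say which variant of the formula produced MDRD_VAL, and the dataset shown here has no age or sex columns, so the function name `mdrd_egfr`, its parameters, and the example inputs are assumptions for illustration only.

```python
def mdrd_egfr(creatinine_mg_dl: float, age_years: float,
              female: bool, black: bool = False) -> float:
    """Estimate GFR in mL/min/1.73 m^2 with the 4-variable MDRD equation.

    Assumption: IDMS-traceable coefficient 175. The dataset may have used
    another variant (for example the original coefficient 186).
    """
    if creatinine_mg_dl <= 0 or age_years <= 0:
        raise ValueError("creatinine and age must be positive")
    egfr = 175.0 * creatinine_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742   # sex adjustment factor
    if black:
        egfr *= 1.212   # race adjustment factor of the original equation
    return egfr


# Hypothetical patient: creatinine equal to the reported Cr_VAL median (0.87),
# with an assumed age of 60 and male sex (neither is in the dataset).
print(round(mdrd_egfr(0.87, 60, female=False), 1))
```

Because creatinine enters with a negative exponent, a higher Cr_VAL gives a lower estimate, which is consistent with the Cr_VAL and MDRD_VAL correlation alert in this report.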

Alerts

A1C_VAL is highly overall correlated with A1C_VAL_C (High correlation)
A1C_VAL_C is highly overall correlated with A1C_VAL (High correlation)
Cr_VAL is highly overall correlated with MDRD_VAL (High correlation)
AST_VAL is highly overall correlated with ALT_VAL (High correlation)
ALT_VAL is highly overall correlated with AST_VAL (High correlation)
MDRD_VAL is highly overall correlated with Cr_VAL (High correlation)
TC_VAL is highly overall correlated with LDL_VAL (High correlation)
LDL_VAL is highly overall correlated with TC_VAL (High correlation)
RID has unique values (Unique)

Reproduction

Analysis started: 2023-10-08 18:57:18.296791
Analysis finished: 2023-10-08 18:57:47.868584
Duration: 29.57 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
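
The reported duration can be cross-checked from the two timestamps above. This is only a small sketch using the Python standard library; the variable names are illustrative.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the Reproduction block of this report
started = datetime.strptime("2023-10-08 18:57:18.296791", FMT)
finished = datetime.strptime("2023-10-08 18:57:47.868584", FMT)

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```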

Variables

RID
Text

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 800
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
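
As a minimal sketch of how such character statistics can be derived, the snippet below counts characters and Unicode general categories with the standard library. It assumes the identifiers run sequentially from R0000001 to R0000100; the report only shows the first five rows and a few frequency entries, so that range is an assumption, although it reproduces the totals listed in this section. The standard `unicodedata` module returns category codes (`Nd`, `Lu`) rather than the long names used in the report, and it has no script lookup, so scripts and blocks are not covered.

```python
import unicodedata
from collections import Counter

# Assumption: RID values are R0000001 ... R0000100 (100 eight-character IDs)
rids = [f"R{i:07d}" for i in range(1, 101)]
text = "".join(rids)

# Per-character frequencies, as in the "Most occurring characters" table
char_counts = Counter(text)

# General category of each character: "Nd" = Decimal Number, "Lu" = Uppercase Letter
category_counts = Counter(unicodedata.category(ch) for ch in text)

print("total characters:", len(text))
print("distinct characters:", len(char_counts))
print("top characters:", char_counts.most_common(3))
print("categories:", dict(category_counts))
```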

Unique

Unique: 100
Unique (%): 100.0%

Sample

1st row: R0000001
2nd row: R0000002
3rd row: R0000003
4th row: R0000004
5th row: R0000005
Value    Count    Frequency (%)
r0000001    1    1.0%
r0000063    1    1.0%
r0000074    1    1.0%
r0000073    1    1.0%
r0000072    1    1.0%
r0000071    1    1.0%
r0000070    1    1.0%
r0000069    1    1.0%
r0000068    1    1.0%
r0000067    1    1.0%
Other values (90)    90    90.0%

Most occurring characters

Value    Count    Frequency (%)
0    519    64.9%
R    100    12.5%
1    21    2.6%
3    20    2.5%
4    20    2.5%
5    20    2.5%
6    20    2.5%
7    20    2.5%
8    20    2.5%
9    20    2.5%

Most occurring categories

Value    Count    Frequency (%)
Decimal Number    700    87.5%
Uppercase Letter    100    12.5%

Most frequent character per category

Decimal Number
Value    Count    Frequency (%)
0    519    74.1%
1    21    3.0%
3    20    2.9%
4    20    2.9%
5    20    2.9%
6    20    2.9%
7    20    2.9%
8    20    2.9%
9    20    2.9%
2    20    2.9%

Uppercase Letter
Value    Count    Frequency (%)
R    100    100.0%

Most occurring scripts

Value    Count    Frequency (%)
Common    700    87.5%
Latin    100    12.5%

Most frequent character per script

Common
Value    Count    Frequency (%)
0    519    74.1%
1    21    3.0%
3    20    2.9%
4    20    2.9%
5    20    2.9%
6    20    2.9%
7    20    2.9%
8    20    2.9%
9    20    2.9%
2    20    2.9%

Latin
Value    Count    Frequency (%)
R    100    100.0%

Most occurring blocks

Value    Count    Frequency (%)
ASCII    800    100.0%

Most frequent character per block

ASCII
Value    Count    Frequency (%)
0    519    64.9%
R    100    12.5%
1    21    2.6%
3    20    2.5%
4    20    2.5%
5    20    2.5%
6    20    2.5%
7    20    2.5%
8    20    2.5%
9    20    2.5%

Distinct: 63
Distinct (%): 63.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Minimum: 2009-06-01 00:00:00
Maximum: 2019-05-01 00:00:00
Histogram with fixed size bins (bins=50)

A1C_VAL
Real number (ℝ)

HIGH CORRELATION 

Distinct: 50
Distinct (%): 50.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.963
Minimum: 5.5
Maximum: 17.6
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 5.5
5-th percentile: 5.795
Q1: 6.475
median: 7.35
Q3: 8.925
95-th percentile: 12.105
Maximum: 17.6
Range: 12.1
Interquartile range (IQR): 2.45

Descriptive statistics

Standard deviation: 2.1123808
Coefficient of variation (CV): 0.26527449
Kurtosis: 3.4973665
Mean: 7.963
Median Absolute Deviation (MAD): 1.1
Skewness: 1.5920068
Sum: 796.3
Variance: 4.4621525
Monotonicity: Not monotonic
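
Several of the A1C_VAL figures above are derived from others: the range and IQR come from the quantiles, and the coefficient of variation and variance come from the mean and standard deviation. The following sketch recomputes them from the reported values as a consistency check; the variable names are illustrative, and the inputs are copied from this report rather than from the raw data.

```python
import math

# Values copied from the A1C_VAL statistics in this report
minimum, maximum = 5.5, 17.6
q1, q3 = 6.475, 8.925
mean, std = 7.963, 2.1123808

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance is the squared standard deviation

# Compare against the reported figures, allowing for rounding in the report
print(math.isclose(value_range, 12.1, abs_tol=1e-9))
print(math.isclose(iqr, 2.45, abs_tol=1e-9))
print(math.isclose(cv, 0.26527449, abs_tol=1e-6))
print(math.isclose(variance, 4.4621525, abs_tol=1e-5))
```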
Histogram with fixed size bins (bins=50)
Value    Count    Frequency (%)
7.0    6    6.0%
6.1    5    5.0%
6.2    4    4.0%
8.0    4    4.0%
6.8    4    4.0%
6.5    4    4.0%
7.4    3    3.0%
5.9    3    3.0%
7.7    3    3.0%
7.6    3    3.0%
Other values (40)    61    61.0%

Value    Count    Frequency (%)
5.5    2    2.0%
5.6    2    2.0%
5.7    1    1.0%
5.8    1    1.0%
5.9    3    3.0%
6.0    2    2.0%
6.1    5    5.0%
6.2    4    4.0%
6.3    2    2.0%
6.4    3    3.0%

Value    Count    Frequency (%)
17.6    1    1.0%
13.0    1    1.0%
12.8    1    1.0%
12.3    1    1.0%
12.2    1    1.0%
12.1    1    1.0%
11.9    1    1.0%
11.6    1    1.0%
11.3    1    1.0%
10.9    3    3.0%

A1C_VAL_C
Real number (ℝ)

HIGH CORRELATION 

Distinct: 9
Distinct (%): 9.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.02
Minimum: 6
Maximum: 18
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 6
5-th percentile: 6
Q1: 6.75
median: 7
Q3: 9
95-th percentile: 12
Maximum: 18
Range: 12
Interquartile range (IQR): 2.25

Descriptive statistics

Standard deviation: 2.1224152
Coefficient of variation (CV): 0.2646403
Kurtosis: 4.0078624
Mean: 8.02
Median Absolute Deviation (MAD): 1
Skewness: 1.6551006
Sum: 802
Variance: 4.5046465
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
Value    Count    Frequency (%)
7    28    28.0%
6    25    25.0%
8    17    17.0%
9    10    10.0%
10    6    6.0%
11    6    6.0%
12    5    5.0%
13    2    2.0%
18    1    1.0%

Value    Count    Frequency (%)
6    25    25.0%
7    28    28.0%
8    17    17.0%
9    10    10.0%
10    6    6.0%
11    6    6.0%
12    5    5.0%
13    2    2.0%
18    1    1.0%

Value    Count    Frequency (%)
18    1    1.0%
13    2    2.0%
12    5    5.0%
11    6    6.0%
10    6    6.0%
9    10    10.0%
8    17    17.0%
7    28    28.0%
6    25    25.0%
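
Because A1C_VAL_C has only nine distinct values, its frequency table fully determines the column, so the summary statistics can be rebuilt from it. The sketch below does this with the standard library. It assumes linearly interpolated quantiles (`method="inclusive"`) and the sample variance with an n - 1 denominator; the report does not state these conventions, but they reproduce the reported Q1, median, Q3, and variance.

```python
import statistics

# Frequency table for A1C_VAL_C copied from this report: value -> count
freq = {6: 25, 7: 28, 8: 17, 9: 10, 10: 6, 11: 6, 12: 5, 13: 2, 18: 1}

# Expand the table into the 100 individual observations, sorted ascending
values = sorted(float(v) for v, n in freq.items() for _ in range(n))

mean = statistics.fmean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)

# Quartiles with linear interpolation between neighbouring observations
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")

print(len(values), mean, round(variance, 7))
print(q1, median, q3, q3 - q1)
```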

Distinct: 62
Distinct (%): 62.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Minimum: 2009-06-01 00:00:00
Maximum: 2019-05-01 00:00:00
Histogram with fixed size bins (bins=50)

BUN_VAL
Real number (ℝ)

Distinct: 80
Distinct (%): 80.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.849
Minimum: 5.7
Maximum: 84
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 5.7
5-th percentile: 9.18
Q1: 11.8
median: 15.25
Q3: 19.375
95-th percentile: 26.12
Maximum: 84
Range: 78.3
Interquartile range (IQR): 7.575

Descriptive statistics

Standard deviation: 9.129954
Coefficient of variation (CV): 0.54186919
Kurtosis: 29.572725
Mean: 16.849
Median Absolute Deviation (MAD): 3.7
Skewness: 4.4307295
Sum: 1684.9
Variance: 83.35606
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value    Count    Frequency (%)
17.9    3    3.0%
11.0    3    3.0%
12.1    3    3.0%
11.9    3    3.0%
17.7    2    2.0%
14.3    2    2.0%
14.7    2    2.0%
10.8    2    2.0%
21.1    2    2.0%
11.8    2    2.0%
Other values (70)    76    76.0%

Value    Count    Frequency (%)
5.7    1    1.0%
7.2    1    1.0%
7.6    1    1.0%
7.8    1    1.0%
8.8    1    1.0%
9.2    1    1.0%
9.4    1    1.0%
9.5    2    2.0%
9.8    2    2.0%
9.9    2    2.0%

Value    Count    Frequency (%)
84.0    1    1.0%
42.8    1    1.0%
37.3    1    1.0%
31.1    1    1.0%
30.3    1    1.0%
25.9    1    1.0%
24.9    1    1.0%
24.8    1    1.0%
24.7    1    1.0%
23.6    1    1.0%

Cr_VAL
Real number (ℝ)

HIGH CORRELATION 

Distinct: 58
Distinct (%): 58.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.9646
Minimum: 0.48
Maximum: 5.91
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0.48
5-th percentile: 0.58
Q1: 0.72
median: 0.87
Q3: 1.0125
95-th percentile: 1.4415
Maximum: 5.91
Range: 5.43
Interquartile range (IQR): 0.2925

Descriptive statistics

Standard deviation: 0.58045162
Coefficient of variation (CV): 0.6017537
Kurtosis: 54.033165
Mean: 0.9646
Median Absolute Deviation (MAD): 0.15
Skewness: 6.587708
Sum: 96.46
Variance: 0.33692408
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value    Count    Frequency (%)
0.69    5    5.0%
0.79    4    4.0%
1.05    3    3.0%
0.92    3    3.0%
0.87    3    3.0%
1.01    3    3.0%
0.78    3    3.0%
0.84    3    3.0%
0.9    3    3.0%
0.83    3    3.0%
Other values (48)    67    67.0%

Value    Count    Frequency (%)
0.48    1    1.0%
0.5    1    1.0%
0.51    1    1.0%
0.56    1    1.0%
0.58    2    2.0%
0.59    1    1.0%
0.6    1    1.0%
0.61    2    2.0%
0.62    1    1.0%
0.65    1    1.0%

Value    Count    Frequency (%)
5.91    1    1.0%
2.22    1    1.0%
2.14    1    1.0%
1.98    1    1.0%
1.47    1    1.0%
1.44    1    1.0%
1.43    1    1.0%
1.41    1    1.0%
1.31    1    1.0%
1.27    1    1.0%

AST_VAL
Real number (ℝ)

HIGH CORRELATION 

Distinct: 34
Distinct (%): 34.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 28.59
Minimum: 12
Maximum: 196
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 12
5-th percentile: 15
Q1: 18
median: 22
Q3: 29
95-th percentile: 64.5
Maximum: 196
Range: 184
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 23.3403
Coefficient of variation (CV): 0.81637985
Kurtosis: 28.546768
Mean: 28.59
Median Absolute Deviation (MAD): 5
Skewness: 4.7253139
Sum: 2859
Variance: 544.7696
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=34)
Value    Count    Frequency (%)
20    7    7.0%
18    7    7.0%
15    7    7.0%
19    7    7.0%
17    6    6.0%
21    5    5.0%
28    5    5.0%
22    5    5.0%
24    4    4.0%
36    4    4.0%
Other values (24)    43    43.0%

Value    Count    Frequency (%)
12    1    1.0%
14    3    3.0%
15    7    7.0%
16    4    4.0%
17    6    6.0%
18    7    7.0%
19    7    7.0%
20    7    7.0%
21    5    5.0%
22    5    5.0%

Value    Count    Frequency (%)
196    1    1.0%
119    1    1.0%
85    1    1.0%
75    1    1.0%
74    1    1.0%
64    1    1.0%
55    1    1.0%
53    1    1.0%
50    1    1.0%
45    1    1.0%

ALT_VAL
Real number (ℝ)

HIGH CORRELATION 

Distinct: 51
Distinct (%): 51.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 34.83
Minimum: 7
Maximum: 218
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 7
5-th percentile: 13
Q1: 18
median: 24
Q3: 37.25
95-th percentile: 91.25
Maximum: 218
Range: 211
Interquartile range (IQR): 19.25

Descriptive statistics

Standard deviation: 31.136407
Coefficient of variation (CV): 0.8939537
Kurtosis: 15.066775
Mean: 34.83
Median Absolute Deviation (MAD): 8
Skewness: 3.3896001
Sum: 3483
Variance: 969.47586
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value    Count    Frequency (%)
18    7    7.0%
20    6    6.0%
19    6    6.0%
16    5    5.0%
13    4    4.0%
21    4    4.0%
29    4    4.0%
28    3    3.0%
24    3    3.0%
14    3    3.0%
Other values (41)    55    55.0%

Value    Count    Frequency (%)
7    1    1.0%
9    1    1.0%
11    1    1.0%
12    1    1.0%
13    4    4.0%
14    3    3.0%
15    2    2.0%
16    5    5.0%
17    2    2.0%
18    7    7.0%

Value    Count    Frequency (%)
218    1    1.0%
175    1    1.0%
100    1    1.0%
96    2    2.0%
91    1    1.0%
90    1    1.0%
72    1    1.0%
71    1    1.0%
67    1    1.0%
66    1    1.0%

MDRD_VAL
Real number (ℝ)

HIGH CORRELATION 

Distinct: 97
Distinct (%): 97.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 81.9759
Minimum: 9.62
Maximum: 146.34
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 9.62
5-th percentile: 36.9555
Q1: 70.7375
median: 82.88
Q3: 94.6825
95-th percentile: 121.599
Maximum: 146.34
Range: 136.72
Interquartile range (IQR): 23.945

Descriptive statistics

Standard deviation: 22.999817
Coefficient of variation (CV): 0.28056803
Kurtosis: 1.215002
Mean: 81.9759
Median Absolute Deviation (MAD): 12.165
Skewness: -0.34966167
Sum: 8197.59
Variance: 528.99157
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value    Count    Frequency (%)
88.8    2    2.0%
85.66    2    2.0%
78.84    2    2.0%
98.58    1    1.0%
80.17    1    1.0%
90.22    1    1.0%
64.94    1    1.0%
9.62    1    1.0%
62.76    1    1.0%
52.28    1    1.0%
Other values (87)    87    87.0%

Value    Count    Frequency (%)
9.62    1    1.0%
22.53    1    1.0%
24.78    1    1.0%
34.76    1    1.0%
36.49    1    1.0%
36.98    1    1.0%
46.45    1    1.0%
47.08    1    1.0%
52.28    1    1.0%
53.48    1    1.0%

Value    Count    Frequency (%)
146.34    1    1.0%
131.06    1    1.0%
126.28    1    1.0%
125.67    1    1.0%
122.91    1    1.0%
121.53    1    1.0%
116.96    1    1.0%
114.62    1    1.0%
109.69    1    1.0%
108.71    1    1.0%

Distinct: 64
Distinct (%): 64.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Minimum: 2009-06-01 00:00:00
Maximum: 2019-05-01 00:00:00
Histogram with fixed size bins (bins=50)

TC_VAL
Real number (ℝ)

HIGH CORRELATION 

Distinct: 75
Distinct (%): 75.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 173.99
Minimum: 74
Maximum: 286
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB