Overview

Dataset statistics

Number of variables: 10
Number of observations: 479
Missing cells: 58
Missing cells (%): 1.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 39.0 KiB
Average record size in memory: 83.3 B
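
The derived figures in this table follow from the raw counts. The sketch below recomputes them using only the numbers reported above; the variable names are illustrative, and because the memory total is already rounded to 39.0 KiB, the per-record size is only approximate.

```python
# Values copied from the Dataset statistics table.
n_variables = 10
n_observations = 479
missing_cells = 58
total_memory_kib = 39.0  # already rounded in the report

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Approximate: the input is rounded to one decimal place.
avg_record_bytes = total_memory_kib * 1024 / n_observations

print(f"Total cells: {total_cells}")
print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: about {avg_record_bytes:.1f} B")
```

The small gap between the recomputed record size and the reported 83.3 B is consistent with rounding of the memory total.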

Variable types

Numeric: 3
Categorical: 4
Text: 3

Dataset

Description: Status of buildings subject to asbestos surveys in Asan City (아산시), including building use, address, and surveyed area.
Author: Chungcheongnam-do (충청남도)
URL: https://alldam.chungnam.go.kr/index.chungnam?menuCd=DOM_000000201001001001&st=&cds=&orgCd=&apiType=&isOpen=Y&pageIndex=450&beforeMenuCd=DOM_000000201001001000&publicdatapk=3073034

Alerts

제외현황 is highly overall correlated with 번호 and 5 other fields [High correlation]
석면건축물여부 is highly overall correlated with 석면(자재)면적 and 2 other fields [High correlation]
구분 is highly overall correlated with 번호 and 1 other field [High correlation]
안전관리인지정여부 is highly overall correlated with 석면(자재)면적 and 2 other fields [High correlation]
번호 is highly overall correlated with 구분 and 1 other field [High correlation]
연면적 is highly overall correlated with 제외현황 [High correlation]
석면(자재)면적 is highly overall correlated with 석면건축물여부 and 2 other fields [High correlation]
제외현황 is highly imbalanced (61.5%) [Imbalance]
동명 has 58 (12.1%) missing values [Missing]
번호 has unique values [Unique]
석면(자재)면적 has 317 (66.2%) zeros [Zeros]
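
The percentages in the Missing and Zeros alerts are shares of the 479 observations, and the Unique alert corresponds to a distinct count equal to the row count. A minimal sketch of that arithmetic follows; the helper name `share` is hypothetical and is not part of ydata-profiling.

```python
n_rows = 479  # number of observations in the dataset


def share(count: int, total: int = n_rows) -> str:
    """Format count/total as a percentage with one decimal place."""
    return f"{100 * count / total:.1f}%"


missing_dongmyeong = share(58)    # 동명: 58 missing values
zeros_asbestos_area = share(317)  # 석면(자재)면적: 317 zeros

# 번호: the reported distinct count (479) equals the row count.
beonho_is_unique = (479 == n_rows)

print(missing_dongmyeong, zeros_asbestos_area, beonho_is_unique)
```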

Reproduction

Analysis started: 2024-01-09 22:14:05.670603
Analysis finished: 2024-01-09 22:14:07.137758
Duration: 1.47 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
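
The reported duration is the difference between the two timestamps. As a quick check, assuming both timestamps are in the same time zone, it can be recomputed with the standard library:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the Reproduction section.
started = datetime.strptime("2024-01-09 22:14:05.670603", FMT)
finished = datetime.strptime("2024-01-09 22:14:07.137758", FMT)

duration = (finished - started).total_seconds()
print(f"Duration: {duration:.2f} seconds")
```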

Variables

번호 (Number)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 479
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 240
Minimum: 1
Maximum: 479
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 24.9
Q1: 120.5
Median: 240
Q3: 359.5
95-th percentile: 455.1
Maximum: 479
Range: 478
Interquartile range (IQR): 239

Descriptive statistics

Standard deviation: 138.41965
Coefficient of variation (CV): 0.57674855
Kurtosis: -1.2
Mean: 240
Median Absolute Deviation (MAD): 120
Skewness: 0
Sum: 114960
Variance: 19160
Monotonicity: Strictly increasing
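
These statistics are what a plain running index would produce. The sketch below assumes 번호 is exactly the sequence 1, 2, ..., 479 (consistent with the reported minimum, maximum, distinct count, sum, and monotonicity, but not confirmed by the raw data) and recomputes several figures. The `percentile` helper is an illustrative linear-interpolation implementation, not the library's own code.

```python
import math
import statistics

# Assumption: 번호 is the running index 1..479.
values = list(range(1, 480))


def percentile(sorted_values, q):
    """Linear-interpolation percentile for q in [0, 1] on sorted data."""
    pos = q * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = math.sqrt(variance)
cv = std / mean                         # coefficient of variation
median = statistics.median(values)
mad = statistics.median([abs(v - median) for v in values])
q1 = percentile(values, 0.25)
q3 = percentile(values, 0.75)
iqr = q3 - q1

print(total, mean, variance, round(std, 5), round(cv, 8), mad, q1, q3, iqr)
```

The reported variance of 19160 matches the sample variance (n - 1 denominator) under this assumption.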
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 1 | 0.2%
2 | 1 | 0.2%
329 | 1 | 0.2%
328 | 1 | 0.2%
327 | 1 | 0.2%
326 | 1 | 0.2%
325 | 1 | 0.2%
324 | 1 | 0.2%
323 | 1 | 0.2%
322 | 1 | 0.2%
Other values (469) | 469 | 97.9%

Value | Count | Frequency (%)
1 | 1 | 0.2%
2 | 1 | 0.2%
3 | 1 | 0.2%
4 | 1 | 0.2%
5 | 1 | 0.2%
6 | 1 | 0.2%
7 | 1 | 0.2%
8 | 1 | 0.2%
9 | 1 | 0.2%
10 | 1 | 0.2%

Value | Count | Frequency (%)
479 | 1 | 0.2%
478 | 1 | 0.2%
477 | 1 | 0.2%
476 | 1 | 0.2%
475 | 1 | 0.2%
474 | 1 | 0.2%
473 | 1 | 0.2%
472 | 1 | 0.2%
471 | 1 | 0.2%
470 | 1 | 0.2%

구분 (Category)
Categorical

HIGH CORRELATION 

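Distinct: 5
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.9 KiB

어린이집 (daycare center): 196
공공건축물 (public building): 119
대학교 (university): 102
불특정다수이용 (facility used by the general public): 53
기타 (other): 9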

Length

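Max length: 7
Median length: 5
Mean length: 4.3298539
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 공공건축물 (public building)
2nd row: 대학교 (university)
3rd row: 공공건축물 (public building)
4th row: 공공건축물 (public building)
5th row: 공공건축물 (public building)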

Common Values

Value | Count | Frequency (%)
어린이집 | 196 | 40.9%
공공건축물 | 119 | 24.8%
대학교 | 102 | 21.3%
불특정다수이용 | 53 | 11.1%
기타 | 9 | 1.9%
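
The frequency column and the mean category length can both be derived from the five counts. The following sketch uses only the values listed above; measuring length in Unicode code points is an assumption about how the report defines it.

```python
# Category counts copied from the Common Values table.
counts = {
    "어린이집": 196,
    "공공건축물": 119,
    "대학교": 102,
    "불특정다수이용": 53,
    "기타": 9,
}

total = sum(counts.values())

# Share of each category, rounded to one decimal place as in the report.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

# Mean label length, weighted by how often each label occurs.
mean_length = sum(len(name) * n for name, n in counts.items()) / total

for name, pct in shares.items():
    print(f"{name}: {pct}%")
print(f"Mean length: {mean_length:.7f}")
```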

Length

Histogram of lengths of the category

Distinct: 387
Distinct (%): 80.8%
Missing: 0
Missing (%): 0.0%
Memory size: 3.9 KiB

Length

Max length: 26
Median length: 22
Mean length: 9.605428
Min length: 4

Characters and Unicode

Total characters: 4601
Distinct characters: 355
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
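
As an illustration of such character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for one sample value from this variable; it is not the profiler's own implementation. The two-letter codes correspond to the category names in the tables that follow (`Lo` = Other Letter, `Ps` = Open Punctuation, `Pe` = Close Punctuation, `Po` = Other Punctuation).

```python
import unicodedata
from collections import Counter

# First sample row of this variable.
sample = "경찰교육원(본관/도서관)"

# Map every character to its Unicode general category and tally them.
category_counts = Counter(unicodedata.category(ch) for ch in sample)

for code, n in category_counts.most_common():
    print(code, n)
```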

Unique

Unique: 359
Unique (%): 74.9%

Sample

1st row: 경찰교육원(본관/도서관)
2nd row: 한국폴리텍 아산캠퍼스
3rd row: 국립종자원
4th row: 국립농산물품질관리원
5th row: 한국전력 아산지사

Value | Count | Frequency (%)
금호리조트(주)아산스파비스 | 19 | 3.4%
충무교육원 | 12 | 2.2%
본관 | 11 | 2.0%
한국폴리텍 | 8 | 1.4%
아산캠퍼스 | 8 | 1.4%
가축위생연구소 | 7 | 1.3%
아산지소 | 7 | 1.3%
아산정수장 | 6 | 1.1%
lh아산에너지사업단 | 6 | 1.1%
아산경찰서 | 6 | 1.1%
Other values (408) | 465 | 83.8%

Most occurring characters

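Value | Count | Frequency (%)
[missing] | 230 | 5.0%
[missing] | 206 | 4.5%
[missing] | 200 | 4.3%
[missing] | 195 | 4.2%
) | 149 | 3.2%
( | 149 | 3.2%
[missing] | 147 | 3.2%
[missing] | 141 | 3.1%
[missing] | 131 | 2.8%
[missing] | 118 | 2.6%
Other values (345) | 2935 | 63.8%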

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 4153 | 90.3%
Close Punctuation | 149 | 3.2%
Open Punctuation | 149 | 3.2%
Space Separator | 84 | 1.8%
Uppercase Letter | 37 | 0.8%
Decimal Number | 17 | 0.4%
Other Punctuation | 7 | 0.2%
Connector Punctuation | 2 | < 0.1%
Lowercase Letter | 1 | < 0.1%
Dash Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[missing] | 230 | 5.5%
[missing] | 206 | 5.0%
[missing] | 200 | 4.8%
[missing] | 195 | 4.7%
[missing] | 147 | 3.5%
[missing] | 141 | 3.4%
[missing] | 131 | 3.2%
[missing] | 118 | 2.8%
[missing] | 112 | 2.7%
[missing] | 86 | 2.1%
Other values (318) | 2587 | 62.3%

Uppercase Letter
Value | Count | Frequency (%)
H | 8 | 21.6%
L | 8 | 21.6%
B | 4 | 10.8%
A | 4 | 10.8%
C | 3 | 8.1%
D | 2 | 5.4%
I | 1 | 2.7%
W | 1 | 2.7%
S | 1 | 2.7%
N | 1 | 2.7%
Other values (4) | 4 | 10.8%

Decimal Number
Value | Count | Frequency (%)
1 | 10 | 58.8%
2 | 6 | 35.3%
3 | 1 | 5.9%

Other Punctuation
Value | Count | Frequency (%)
/ | 3 | 42.9%
, | 3 | 42.9%
. | 1 | 14.3%

Close Punctuation
Value | Count | Frequency (%)
) | 149 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 149 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 84 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 2 | 100.0%

Lowercase Letter
Value | Count | Frequency (%)
i | 1 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 1 | 100.0%

Other Symbol
Value | Count | Frequency (%)
[missing] | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 4154 | 90.3%
Common | 409 | 8.9%
Latin | 38 | 0.8%

Most frequent character per script

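Hangul
Value | Count | Frequency (%)
[missing] | 230 | 5.5%
[missing] | 206 | 5.0%
[missing] | 200 | 4.8%
[missing] | 195 | 4.7%
[missing] | 147 | 3.5%
[missing] | 141 | 3.4%
[missing] | 131 | 3.2%
[missing] | 118 | 2.8%
[missing] | 112 | 2.7%
[missing] | 86 | 2.1%
Other values (319) | 2588 | 62.3%

Latin
Value | Count | Frequency (%)
H | 8 | 21.1%
L | 8 | 21.1%
B | 4 | 10.5%
A | 4 | 10.5%
C | 3 | 7.9%
D | 2 | 5.3%
i | 1 | 2.6%
I | 1 | 2.6%
W | 1 | 2.6%
S | 1 | 2.6%
Other values (5) | 5 | 13.2%

Common
Value | Count | Frequency (%)
) | 149 | 36.4%
( | 149 | 36.4%
(space) | 84 | 20.5%
1 | 10 | 2.4%
2 | 6 | 1.5%
/ | 3 | 0.7%
, | 3 | 0.7%
_ | 2 | 0.5%
- | 1 | 0.2%
. | 1 | 0.2%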

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 4153 | 90.3%
ASCII | 447 | 9.7%
None | 1 | < 0.1%
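
Python's standard library has no block lookup, but a block is just a code point range, so a rough classifier is easy to write. The sketch below covers only the blocks named in this table. The function `block_of` and its fallback label "None" are illustrative assumptions, and "Hangul" is taken to mean the Hangul Syllables block (U+AC00 to U+D7AF).

```python
from collections import Counter


def block_of(ch: str) -> str:
    """Classify a character into a coarse Unicode block by code point range."""
    cp = ord(ch)
    if cp < 0x80:
        return "ASCII"    # Basic Latin block
    if 0xAC00 <= cp <= 0xD7AF:
        return "Hangul"   # Hangul Syllables block
    return "None"         # anything not covered by this sketch


# Second sample row of this variable.
sample = "한국폴리텍 아산캠퍼스"
block_counts = Counter(block_of(ch) for ch in sample)
print(block_counts)
```

The single space in the sample falls in the ASCII block, which is why ASCII counts in the table include spaces and punctuation as well as Latin letters and digits.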

Most frequent character per block

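Hangul
Value | Count | Frequency (%)
[missing] | 230 | 5.5%
[missing] | 206 | 5.0%
[missing] | 200 | 4.8%
[missing] | 195 | 4.7%
[missing] | 147 | 3.5%
[missing] | 141 | 3.4%
[missing] | 131 | 3.2%
[missing] | 118 | 2.8%
[missing] | 112 | 2.7%
[missing] | 86 | 2.1%
Other values (318) | 2587 | 62.3%

ASCII
Value | Count | Frequency (%)
) | 149 | 33.3%
( | 149 | 33.3%
(space) | 84 | 18.8%
1 | 10 | 2.2%
H | 8 | 1.8%
L | 8 | 1.8%
2 | 6 | 1.3%
B | 4 | 0.9%
A | 4 | 0.9%
/ | 3 | 0.7%
Other values (16) | 22 | 4.9%

None
Value | Count | Frequency (%)
[missing] | 1 | 100.0%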

동명 (Building wing name)
Text

MISSING 

Distinct: 382
Distinct (%): 90.7%
Missing: 58
Missing (%): 12.1%
Memory size: 3.9 KiB

Length

Max length: 19
Median length: 15
Mean length: 6.0950119
Min length: 2

Characters and Unicode

Total characters: 2566
Distinct characters: 351
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 353
Unique (%): 83.8%

Sample

1st row: 본관동
2nd row: 다동
3rd row: 본관
4th row: 온천동 우체국
5th row: 아산교육지원청

Value | Count | Frequency (%)
본관 | 8 | 1.8%
창고 | 6 | 1.3%
경비실 | 4 | 0.9%
체육관 | 3 | 0.7%
학생회관 | 3 | 0.7%
사무동 | 3 | 0.7%
2동 | 3 | 0.7%
인주농협 | 3 | 0.7%
관리동 | 2 | 0.4%
동심어린이집 | 2 | 0.4%
Other values (380) | 411 | 91.7%

Most occurring characters

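Value | Count | Frequency (%)
[missing] | 219 | 8.5%
[missing] | 197 | 7.7%
[missing] | 195 | 7.6%
[missing] | 191 | 7.4%
[missing] | 74 | 2.9%
[missing] | 59 | 2.3%
[missing] | 48 | 1.9%
[missing] | 42 | 1.6%
[missing] | 36 | 1.4%
(space) | 35 | 1.4%
Other values (341) | 1470 | 57.3%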

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2423 | 94.4%
Decimal Number | 52 | 2.0%
Space Separator | 35 | 1.4%
Uppercase Letter | 24 | 0.9%
Close Punctuation | 9 | 0.4%
Open Punctuation | 9 | 0.4%
Dash Punctuation | 6 | 0.2%
Other Punctuation | 4 | 0.2%
Connector Punctuation | 2 | 0.1%
Lowercase Letter | 1 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[missing] | 219 | 9.0%
[missing] | 197 | 8.1%
[missing] | 195 | 8.0%
[missing] | 191 | 7.9%
[missing] | 74 | 3.1%
[missing] | 59 | 2.4%
[missing] | 48 | 2.0%
[missing] | 42 | 1.7%
[missing] | 36 | 1.5%
[missing] | 31 | 1.3%
Other values (309) | 1331 | 54.9%

Uppercase Letter
Value | Count | Frequency (%)
C | 4 | 16.7%
A | 3 | 12.5%
D | 3 | 12.5%
H | 2 | 8.3%
B | 2 | 8.3%
G | 2 | 8.3%
E | 2 | 8.3%
F | 1 | 4.2%
N | 1 | 4.2%
S | 1 | 4.2%
Other values (3) | 3 | 12.5%

Decimal Number
Value | Count | Frequency (%)
1 | 21 | 40.4%
2 | 13 | 25.0%
3 | 5 | 9.6%
7 | 2 | 3.8%
8 | 2 | 3.8%
6 | 2 | 3.8%
4 | 2 | 3.8%
9 | 2 | 3.8%
5 | 2 | 3.8%
0 | 1 | 1.9%

Other Punctuation
Value | Count | Frequency (%)
, | 2 | 50.0%
/ | 2 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 35 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 9 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 9 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 6 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 2 | 100.0%

Lowercase Letter
Value | Count | Frequency (%)
i | 1 | 100.0%

Other Symbol
Value | Count | Frequency (%)
[missing] | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2424 | 94.5%
Common | 117 | 4.6%
Latin | 25 | 1.0%

Most frequent character per script

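Hangul
Value | Count | Frequency (%)
[missing] | 219 | 9.0%
[missing] | 197 | 8.1%
[missing] | 195 | 8.0%
[missing] | 191 | 7.9%
[missing] | 74 | 3.1%
[missing] | 59 | 2.4%
[missing] | 48 | 2.0%
[missing] | 42 | 1.7%
[missing] | 36 | 1.5%
[missing] | 31 | 1.3%
Other values (310) | 1332 | 55.0%

Common
Value | Count | Frequency (%)
(space) | 35 | 29.9%
1 | 21 | 17.9%
2 | 13 | 11.1%
) | 9 | 7.7%
( | 9 | 7.7%
- | 6 | 5.1%
3 | 5 | 4.3%
7 | 2 | 1.7%
8 | 2 | 1.7%
6 | 2 | 1.7%
Other values (7) | 13 | 11.1%

Latin
Value | Count | Frequency (%)
C | 4 | 16.0%
A | 3 | 12.0%
D | 3 | 12.0%
H | 2 | 8.0%
B | 2 | 8.0%
G | 2 | 8.0%
E | 2 | 8.0%
i | 1 | 4.0%
F | 1 | 4.0%
N | 1 | 4.0%
Other values (4) | 4 | 16.0%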

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2423 | 94.4%
ASCII | 142 | 5.5%
None | 1 | < 0.1%

Most frequent character per block

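Hangul
Value | Count | Frequency (%)
[missing] | 219 | 9.0%
[missing] | 197 | 8.1%
[missing] | 195 | 8.0%
[missing] | 191 | 7.9%
[missing] | 74 | 3.1%
[missing] | 59 | 2.4%
[missing] | 48 | 2.0%
[missing] | 42 | 1.7%
[missing] | 36 | 1.5%
[missing] | 31 | 1.3%
Other values (309) | 1331 | 54.9%

ASCII
Value | Count | Frequency (%)
(space) | 35 | 24.6%
1 | 21 | 14.8%
2 | 13 | 9.2%
) | 9 | 6.3%
( | 9 | 6.3%
- | 6 | 4.2%
3 | 5 | 3.5%
C | 4 | 2.8%
A | 3 | 2.1%
D | 3 | 2.1%
Other values (21) | 34 | 23.9%

None
Value | Count | Frequency (%)
[missing] | 1 | 100.0%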

주소 (Address)
Text

Distinct: 299
Distinct (%): 62.4%
Missing: 0
Missing (%): 0.0%
Memory size: 3.9 KiB