Overview

Dataset statistics

Number of variables: 10
Number of observations: 5065
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 415.6 KiB
Average record size in memory: 84.0 B

Variable types

Numeric: 4
Categorical: 2
Text: 4

Alerts

SIDO_NM is highly overall correlated with PRSMP_LFSTS_PC_FLCTS_ACCTO_MTRNT_RLIMP_INFO_NO and 1 other field [High correlation]
TNSHP_NM is highly overall correlated with PRSMP_LFSTS_PC_FLCTS_ACCTO_MTRNT_RLIMP_INFO_NO and 2 other fields [High correlation]
PRSMP_LFSTS_PC_FLCTS_ACCTO_MTRNT_RLIMP_INFO_NO is highly overall correlated with SIDO_NM and 1 other field [High correlation]
LFSTS_CNTRCT_CASCNT is highly overall correlated with MTHT_CNTRCT_CASCNT [High correlation]
MTHT_CNTRCT_CASCNT is highly overall correlated with LFSTS_CNTRCT_CASCNT [High correlation]
PRSMP_LFSTS_PC_FLCTN_RATE is highly overall correlated with TNSHP_NM [High correlation]
TNSHP_NM is highly imbalanced (75.3%) [Imbalance]
PRSMP_LFSTS_PC_FLCTS_ACCTO_MTRNT_RLIMP_INFO_NO has unique values [Unique]
LFSTS_CNTRCT_CASCNT has 2187 (43.2%) zeros [Zeros]
MTHT_CNTRCT_CASCNT has 1976 (39.0%) zeros [Zeros]
PRSMP_LFSTS_PC_FLCTN_RATE has 159 (3.1%) zeros [Zeros]
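
The zero shares quoted in the alerts follow from the zero counts and the 5065 observations reported in the overview. A minimal Python sketch of that arithmetic is shown here; the helper name `zero_share` is hypothetical and is not part of ydata-profiling.

```python
# Hypothetical helper: share of zero-valued rows, as a percentage
# rounded to one decimal place (the precision used in the alerts).
N_OBSERVATIONS = 5065  # "Number of observations" from the overview


def zero_share(zero_count: int, n_rows: int = N_OBSERVATIONS) -> float:
    """Return the percentage of rows equal to zero, rounded to one decimal."""
    return round(100 * zero_count / n_rows, 1)


# Zero counts copied from the alerts.
zero_counts = {
    "LFSTS_CNTRCT_CASCNT": 2187,
    "MTHT_CNTRCT_CASCNT": 1976,
    "PRSMP_LFSTS_PC_FLCTN_RATE": 159,
}
shares = {name: zero_share(count) for name, count in zero_counts.items()}
```

Under these inputs the computed shares should match the percentages printed next to each zero count.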

Reproduction

Analysis started: 2023-12-11 22:31:21.818373
Analysis finished: 2023-12-11 22:31:25.250587
Duration: 3.43 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

PRSMP_LFSTS_PC_FLCTS_ACCTO_MTRNT_RLIMP_INFO_NO
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 5065
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2533
Minimum: 1
Maximum: 5065
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 44.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 254.2
Q1: 1267
median: 2533
Q3: 3799
95-th percentile: 4811.8
Maximum: 5065
Range: 5064
Interquartile range (IQR): 2532
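
The quantiles above are consistent with linear interpolation between order statistics applied to the integers 1 through 5065. The sketch below assumes that the column holds exactly those integers (it is unique, strictly increasing, and spans 1 to 5065) and assumes the linear interpolation rule, which is the pandas default. It is not taken from the ydata-profiling source, and `linear_quantile` is a hypothetical helper.

```python
import math


def linear_quantile(sorted_values, q):
    """Quantile with linear interpolation between the two nearest order statistics."""
    position = q * (len(sorted_values) - 1)  # fractional 0-based index
    lower = math.floor(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


ids = list(range(1, 5066))  # assumed contents of the ID column
q05, q25, q50, q75, q95 = (
    linear_quantile(ids, q) for q in (0.05, 0.25, 0.5, 0.75, 0.95)
)
iqr = q75 - q25  # interquartile range
```

With these assumptions the 5th and 95th percentiles land between two consecutive IDs, which is why they are the only non-integer quantiles in the list.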

Descriptive statistics

Standard deviation: 1462.2839
Coefficient of variation (CV): 0.57729328
Kurtosis: -1.2
Mean: 2533
Median Absolute Deviation (MAD): 1266
Skewness: 0
Sum: 12829645
Variance: 2138274.2
Monotonicity: Strictly increasing
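
Assuming this column is the sequence 1, 2, ..., 5065, the descriptive statistics above have closed forms: the sum is n(n+1)/2, the mean is (n+1)/2, and the sample variance (n-1 denominator) is n(n+1)/12. The following sketch checks the reported figures using only the Python standard library; the variable names are illustrative.

```python
import math
import statistics

n = 5065
ids = range(1, n + 1)  # assumed contents of the ID column

total = sum(ids)                     # closed form: n * (n + 1) / 2
mean = total / n                     # closed form: (n + 1) / 2
variance = statistics.variance(ids)  # sample variance, n - 1 denominator
std_dev = math.sqrt(variance)
cv = std_dev / mean                  # coefficient of variation

# Median absolute deviation: the median of |x - median|.
median = statistics.median(ids)
mad = statistics.median(abs(x - median) for x in ids)
```

A strictly increasing integer ID like this behaves as a row index, so these statistics describe the row count rather than any measured quantity.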
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
1                     1       < 0.1%
3375                  1       < 0.1%
3382                  1       < 0.1%
3381                  1       < 0.1%
3380                  1       < 0.1%
3379                  1       < 0.1%
3378                  1       < 0.1%
3377                  1       < 0.1%
3376                  1       < 0.1%
3374                  1       < 0.1%
Other values (5055)   5055    99.8%

Value   Count   Frequency (%)
1       1       < 0.1%
2       1       < 0.1%
3       1       < 0.1%
4       1       < 0.1%
5       1       < 0.1%
6       1       < 0.1%
7       1       < 0.1%
8       1       < 0.1%
9       1       < 0.1%
10      1       < 0.1%

Value   Count   Frequency (%)
5065    1       < 0.1%
5064    1       < 0.1%
5063    1       < 0.1%
5062    1       < 0.1%
5061    1       < 0.1%
5060    1       < 0.1%
5059    1       < 0.1%
5058    1       < 0.1%
5057    1       < 0.1%
5056    1       < 0.1%

SIDO_NM
Categorical

HIGH CORRELATION 

Distinct: 17
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 39.7 KiB

경기도: 747
경상남도: 547
경상북도: 533
서울특별시: 467
전라남도: 422
Other values (12): 2349

Length

Max length: 7
Median length: 5
Mean length: 4.1472853
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 강원도
2nd row: 강원도
3rd row: 강원도
4th row: 강원도
5th row: 강원도

Common Values

Value              Count   Frequency (%)
경기도             747     14.7%
경상남도           547     10.8%
경상북도           533     10.5%
서울특별시         467     9.2%
전라남도           422     8.3%
전라북도           410     8.1%
강원도             298     5.9%
충청남도           285     5.6%
충청북도           238     4.7%
대구광역시         204     4.0%
Other values (7)   914     18.0%

Length

Histogram of lengths of the category
Value              Count   Frequency (%)
경기도             747     14.7%
경상남도           547     10.8%
경상북도           533     10.5%
서울특별시         467     9.2%
전라남도           422     8.3%
전라북도           410     8.1%
강원도             298     5.9%
충청남도           285     5.6%
충청북도           238     4.7%
대구광역시         204     4.0%
Other values (7)   914     18.0%
Distinct: 207
Distinct (%): 4.1%
Missing: 0
Missing (%): 0.0%
Memory size: 39.7 KiB

Length

Max length: 4
Median length: 3
Mean length: 2.899309
Min length: 2

Characters and Unicode

Total characters: 14685
Distinct characters: 133
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 춘천시
2nd row: 춘천시
3rd row: 춘천시
4th row: 춘천시
5th row: 춘천시
Value                Count   Frequency (%)
중구                 268     5.3%
창원시               204     4.0%
동구                 143     2.8%
북구                 104     2.1%
서구                 99      2.0%
청주시               95      1.9%
종로구               87      1.7%
전주시               83      1.6%
광산구               79      1.6%
목포시               64      1.3%
Other values (197)   3839    75.8%

Most occurring characters

Value                Count   Frequency (%)
                     2830    19.3%
                     1504    10.2%
                     910     6.2%
                     734     5.0%
                     478     3.3%
                     451     3.1%
                     382     2.6%
                     342     2.3%
                     336     2.3%
                     274     1.9%
Other values (123)   6444    43.9%

Most occurring categories

Value          Count   Frequency (%)
Other Letter   14685   100.0%

Most frequent character per category

Other Letter
Value                Count   Frequency (%)
                     2830    19.3%
                     1504    10.2%
                     910     6.2%
                     734     5.0%
                     478     3.3%
                     451     3.1%
                     382     2.6%
                     342     2.3%
                     336     2.3%
                     274     1.9%
Other values (123)   6444    43.9%

Most occurring scripts

Value    Count   Frequency (%)
Hangul   14685   100.0%

Most frequent character per script

Hangul
Value                Count   Frequency (%)
                     2830    19.3%
                     1504    10.2%
                     910     6.2%
                     734     5.0%
                     478     3.3%
                     451     3.1%
                     382     2.6%
                     342     2.3%
                     336     2.3%
                     274     1.9%
Other values (123)   6444    43.9%

Most occurring blocks

Value    Count   Frequency (%)
Hangul   14685   100.0%

Most frequent character per block

Hangul
Value                Count   Frequency (%)
                     2830    19.3%
                     1504    10.2%
                     910     6.2%
                     734     5.0%
                     478     3.3%
                     451     3.1%
                     382     2.6%
                     342     2.3%
                     336     2.3%
                     274     1.9%
Other values (123)   6444    43.9%

TNSHP_NM
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 33
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 39.7 KiB

<NA>: 4353
진해구: 65
마산합포구: 64
완산구: 46
성산구: 41
Other values (28): 496

Length

Max length: 5
Median length: 4
Mean length: 3.88154
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value               Count   Frequency (%)
<NA>                4353    85.9%
진해구              65      1.3%
마산합포구          64      1.3%
완산구              46      0.9%
성산구              41      0.8%
덕진구              37      0.7%
흥덕구              34      0.7%
덕양구              32      0.6%
상당구              31      0.6%
북구                31      0.6%
Other values (23)   331     6.5%

Length

Histogram of lengths of the category
Value               Count   Frequency (%)
na                  4353    85.9%
진해구              65      1.3%
마산합포구          64      1.3%
완산구              46      0.9%
성산구              41      0.8%
덕진구              37      0.7%
흥덕구              34      0.7%
덕양구              32      0.6%
북구                31      0.6%
상당구              31      0.6%
Other values (23)   331     6.5%

EMD_NM
Text

Distinct: 3975
Distinct (%): 78.5%
Missing: 0
Missing (%): 0.0%
Memory size: 39.7 KiB

Length

Max length: 6
Median length: 3
Mean length: 3.1461007
Min length: 2

Characters and Unicode

Total characters: 15935
Distinct characters: 370
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3385
Unique (%): 66.8%

Sample

1st row: 봉의동
2nd row: 요선동
3rd row: 낙원동
4th row: 중앙로1가
5th row: 중앙로2가
Value                 Count   Frequency (%)
교동                  18      0.4%
송정동                14      0.3%
중동                  13      0.3%
남면                  12      0.2%
중앙동                11      0.2%
금곡동                10      0.2%
신흥동                9       0.2%
내동                  9       0.2%
서면                  9       0.2%
신동                  9       0.2%
Other values (3965)   4951    97.7%

Most occurring characters

Value                Count   Frequency (%)
                     3760    23.6%
                     1183    7.4%
                     510     3.2%
                     376     2.4%
                     246     1.5%
                     221     1.4%
                     214     1.3%
                     214     1.3%
                     199     1.2%
                     186     1.2%
Other values (360)   8826    55.4%

Most occurring categories

Value            Count   Frequency (%)
Other Letter     15500   97.3%
Decimal Number   435     2.7%

Most frequent character per category

Other Letter
Value                Count   Frequency (%)
                     3760    24.3%
                     1183    7.6%
                     510     3.3%
                     376     2.4%
                     246     1.6%
                     221     1.4%
                     214     1.4%
                     214     1.4%
                     199     1.3%
                     186     1.2%
Other values (352)   8391    54.1%

Decimal Number
Value   Count   Frequency (%)
1       133     30.6%
2       132     30.3%
3       87      20.0%
4       40      9.2%
5       24      5.5%
6       12      2.8%
7       6       1.4%
8       1       0.2%

Most occurring scripts

Value    Count   Frequency (%)
Hangul   15500   97.3%
Common   435     2.7%

Most frequent character per script

Hangul
Value                Count   Frequency (%)
                     3760    24.3%
                     1183    7.6%
                     510     3.3%
                     376     2.4%
                     246     1.6%
                     221     1.4%
                     214     1.4%
                     214     1.4%
                     199     1.3%
                     186     1.2%
Other values (352)   8391    54.1%

Common
Value   Count   Frequency (%)
1       133     30.6%
2       132     30.3%
3       87      20.0%
4       40      9.2%
5       24      5.5%
6       12      2.8%
7       6       1.4%
8       1       0.2%

Most occurring blocks

Value    Count   Frequency (%)
Hangul   15500   97.3%
ASCII    435     2.7%

Most frequent character per block

Hangul
Value                Count   Frequency (%)
                     3760    24.3%
                     1183    7.6%
                     510     3.3%
                     376     2.4%
                     246     1.6%
                     221     1.4%
                     214     1.4%
                     214     1.4%
                     199     1.3%
                     186     1.2%
Other values (352)   8391    54.1%

ASCII
Value   Count   Frequency (%)
1       133     30.6%
2       132     30.3%
3       87      20.0%
4       40      9.2%
5       24      5.5%
6       12      2.8%
7       6       1.4%
8       1       0.2%

LFSTS_CNTRCT_CASCNT
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 424
Distinct (%): 8.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 44.00849
Minimum: 0
Maximum: 2421
Zeros: 2187
Zeros (%): 43.2%
Negative: 0
Negative (%): 0.0%
Memory size: 44.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 28
95-th percentile: 246.4
Maximum: 2421
Range: 2421
Interquartile range (IQR): 28

Descriptive statistics

Standard deviation: 122.27172
Coefficient of variation (CV): 2.7783666
Kurtosis: 65.088873
Mean: 44.00849
Median Absolute Deviation (MAD): 1
Skewness: 6.3180179
Sum: 222903
Variance: 14950.373
Monotonicity: Not monotonic
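
The summary statistics for this variable are tied together by simple identities: the mean is the sum divided by the number of observations, the variance is the squared standard deviation, and the coefficient of variation is the standard deviation divided by the mean. The sketch below checks the reported figures against those identities. The function name `summary_is_consistent` and the tolerance are illustrative choices, not part of ydata-profiling.

```python
import math


def summary_is_consistent(n, total, mean, std_dev, variance, cv, rel_tol=1e-6):
    """Check mean = total / n, variance = std_dev ** 2, and cv = std_dev / mean."""
    return (
        math.isclose(total / n, mean, rel_tol=rel_tol)
        and math.isclose(std_dev ** 2, variance, rel_tol=rel_tol)
        and math.isclose(std_dev / mean, cv, rel_tol=rel_tol)
    )


# Figures copied from the LFSTS_CNTRCT_CASCNT statistics (n from the overview).
lfsts_ok = summary_is_consistent(
    n=5065, total=222903, mean=44.00849,
    std_dev=122.27172, variance=14950.373, cv=2.7783666,
)
```

The tolerance is loose on purpose, because the report prints each figure rounded to about eight significant digits.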
Histogram with fixed size bins (bins=50)
Value                Count   Frequency (%)
0                    2187    43.2%
1                    384     7.6%
2                    199     3.9%
3                    122     2.4%
4                    104     2.1%
5                    89      1.8%
6                    68      1.3%
7                    58      1.1%
10                   54      1.1%
8                    52      1.0%
Other values (414)   1748    34.5%

Value   Count   Frequency (%)
0       2187    43.2%
1       384     7.6%
2       199     3.9%
3       122     2.4%
4       104     2.1%
5       89      1.8%
6       68      1.3%
7       58      1.1%
8       52      1.0%
9       42      0.8%

Value   Count   Frequency (%)
2421    1       < 0.1%
1796    1       < 0.1%
1457    1       < 0.1%
1451    1       < 0.1%
1377    1       < 0.1%
1279    1       < 0.1%
1261    1       < 0.1%
1256    1       < 0.1%
1218    1       < 0.1%
1182    1       < 0.1%

MTHT_CNTRCT_CASCNT
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 442
Distinct (%): 8.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 49.11767
Minimum: 0
Maximum: 2742
Zeros: 1976
Zeros (%): 39.0%
Negative: 0
Negative (%): 0.0%
Memory size: 44.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 2
Q3: 35
95-th percentile: 261.8
Maximum: 2742
Range: 2742
Interquartile range (IQR): 35

Descriptive statistics

Standard deviation: 133.03682
Coefficient of variation (CV): 2.7085328
Kurtosis: 84.9719
Mean: 49.11767
Median Absolute Deviation (MAD): 2
Skewness: 6.8755555
Sum: 248781
Variance: 17698.795
Monotonicity: Not monotonic