Overview

Dataset statistics

Number of variables: 5
Number of observations: 38
Missing cells: 66
Missing cells (%): 34.7%
Duplicate rows: 1
Duplicate rows (%): 2.6%
Total size in memory: 1.7 KiB
Average record size in memory: 46.5 B
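
The missing-cell percentage follows from the per-column missing counts listed under Alerts. A minimal sketch of that arithmetic, using only figures stated in this report (the variable names are illustrative):

```python
# Figures copied from the "Dataset statistics" and "Alerts" sections of this report.
n_rows = 38
missing_per_column = {
    "FAMP_ID": 7,
    "FMLD_ADDR": 7,
    "PHT_DT": 7,
    "FILE_NM": 7,
    "IMG_URL": 38,
}

n_cols = len(missing_per_column)                 # 5 variables
missing_cells = sum(missing_per_column.values()) # 7 * 4 + 38
total_cells = n_rows * n_cols
missing_pct = 100 * missing_cells / total_cells

print(missing_cells, total_cells, f"{missing_pct:.1f}%")
```

More than half of the missing cells come from IMG_URL alone, which is empty in every row.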

Variable types

Numeric: 2
Text: 2
Unsupported: 1

Dataset

Description: Sample data
Author: (재)전남정보문화산업진흥원 (Jeonnam Information & Culture Industry Promotion Agency)
URL: https://kadx.co.kr/opmk/frn/pmumkproductDetail/PMU_4c00c53e-9780-4c01-8711-c7dc9454173b/5

Alerts

Dataset has 1 (2.6%) duplicate rows (Duplicates)
FAMP_ID has 7 (18.4%) missing values (Missing)
FMLD_ADDR has 7 (18.4%) missing values (Missing)
PHT_DT has 7 (18.4%) missing values (Missing)
FILE_NM has 7 (18.4%) missing values (Missing)
IMG_URL has 38 (100.0%) missing values (Missing)
IMG_URL is an unsupported type; check whether it needs cleaning or further analysis (Unsupported)

Reproduction

Analysis started: 2023-12-11 20:16:40.787293
Analysis finished: 2023-12-11 20:16:42.602396
Duration: 1.82 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

FAMP_ID
Real number (ℝ)

MISSING 

Distinct: 31
Distinct (%): 100.0%
Missing: 7
Missing (%): 18.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 6863058.4
Minimum: 6860052
Maximum: 6871082
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 474.0 B

Quantile statistics

Minimum: 6860052
5th percentile: 6860226.5
Q1: 6861333.5
Median: 6862644
Q3: 6863968
95th percentile: 6868146.5
Maximum: 6871082
Range: 11030
Interquartile range (IQR): 2634.5

Descriptive statistics

Standard deviation: 2557.051
Coefficient of variation (CV): 0.00037258186
Kurtosis: 3.6224827
Mean: 6863058.4
Median Absolute Deviation (MAD): 1479
Skewness: 1.6960001
Sum: 2.1275481 × 10^8
Variance: 6538510
Monotonicity: Not monotonic
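
Several FAMP_ID figures are derivable from others in this summary: the range from the minimum and maximum, the IQR from Q1 and Q3, the coefficient of variation from the standard deviation and mean, and the variance from the standard deviation. The sketch below recomputes them from the rounded values printed in this report, so agreement is only expected up to rounding; the variable names are illustrative.

```python
import math

# Values copied from the FAMP_ID summary in this report.
minimum, maximum = 6860052, 6871082
q1, q3 = 6861333.5, 6863968
mean, std = 6863058.4, 2557.051

value_range = maximum - minimum   # reported: 11030
iqr = q3 - q1                     # reported: 2634.5
cv = std / mean                   # reported: 0.00037258186
variance = std ** 2               # reported: 6538510

print("range   ", value_range)
print("IQR     ", iqr)
print("CV      ", cv, math.isclose(cv, 0.00037258186, rel_tol=1e-6))
print("variance", variance, math.isclose(variance, 6538510, rel_tol=1e-6))
```

Because FAMP_ID looks like an identifier (every non-missing value is distinct), these moments describe how the IDs happen to be numbered rather than any measured quantity.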
Histogram with fixed size bins (bins=31)
Value | Count | Frequency (%)
6860234 | 1 | 2.6%
6863336 | 1 | 2.6%
6861502 | 1 | 2.6%
6862605 | 1 | 2.6%
6864172 | 1 | 2.6%
6860803 | 1 | 2.6%
6861695 | 1 | 2.6%
6861165 | 1 | 2.6%
6870256 | 1 | 2.6%
6860052 | 1 | 2.6%
Other values (21) | 21 | 55.3%
(Missing) | 7 | 18.4%

Value | Count | Frequency (%)
6860052 | 1 | 2.6%
6860219 | 1 | 2.6%
6860234 | 1 | 2.6%
6860555 | 1 | 2.6%
6860791 | 1 | 2.6%
6860803 | 1 | 2.6%
6860907 | 1 | 2.6%
6861165 | 1 | 2.6%
6861502 | 1 | 2.6%
6861603 | 1 | 2.6%

Value | Count | Frequency (%)
6871082 | 1 | 2.6%
6870256 | 1 | 2.6%
6866037 | 1 | 2.6%
6865212 | 1 | 2.6%
6864629 | 1 | 2.6%
6864427 | 1 | 2.6%
6864172 | 1 | 2.6%
6864168 | 1 | 2.6%
6863768 | 1 | 2.6%
6863610 | 1 | 2.6%

FMLD_ADDR
Text

MISSING 

Distinct: 31
Distinct (%): 100.0%
Missing: 7
Missing (%): 18.4%
Memory size: 436.0 B

Length

Max length: 23
Median length: 22
Mean length: 21.903226
Min length: 20

Characters and Unicode

Total characters: 679
Distinct characters: 41
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
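
As an illustration of those character properties, Python's standard `unicodedata` module reports the general category of each code point. The sketch below counts categories for the first sample address in this report. It covers general categories only, since the standard library does not expose scripts or blocks, and it is not necessarily how the profiler computes its figures.

```python
import unicodedata
from collections import Counter

# First FMLD_ADDR sample value from this report.
address = "전라남도 무안군 운남면 동암리 430-0"

# Two-letter general categories:
#   Lo = Other Letter, Zs = Space Separator,
#   Nd = Decimal Number, Pd = Dash Punctuation
categories = Counter(unicodedata.category(ch) for ch in address)

print(len(address), dict(categories))
```

The four categories that appear here are the same four the report lists for the whole column.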

Unique

Unique: 31
Unique (%): 100.0%

Sample

1st row: 전라남도 무안군 운남면 동암리 430-0
2nd row: 전라남도 무안군 해제면 천장리 712-9
3rd row: 전라남도 무안군 해제면 천장리 725-1
4th row: 전라남도 무안군 해제면 창매리 232-0
5th row: 전라남도 무안군 해제면 광산리 243-1
Value | Count | Frequency (%)
전라남도 | 31 | 20.0%
무안군 | 31 | 20.0%
해제면 | 28 | 18.1%
천장리 | 8 | 5.2%
창매리 | 8 | 5.2%
산길리 | 4 | 2.6%
용학리 | 3 | 1.9%
운남면 | 3 | 1.9%
양매리 | 2 | 1.3%
유월리 | 2 | 1.3%
Other values (35) | 35 | 22.6%

Most occurring characters

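Value | Count | Frequency (%)
(space) | 124 | 18.3%
(not preserved) | 34 | 5.0%
(not preserved) | 31 | 4.6%
(not preserved) | 31 | 4.6%
- | 31 | 4.6%
(not preserved) | 31 | 4.6%
(not preserved) | 31 | 4.6%
(not preserved) | 31 | 4.6%
(not preserved) | 31 | 4.6%
(not preserved) | 31 | 4.6%
Other values (31) | 273 | 40.2%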

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 402 | 59.2%
Space Separator | 124 | 18.3%
Decimal Number | 122 | 18.0%
Dash Punctuation | 31 | 4.6%

Most frequent character per category

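Other Letter
Value | Count | Frequency (%)
(not preserved) | 34 | 8.5%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 28 | 7.0%
Other values (19) | 92 | 22.9%

Decimal Number
Value | Count | Frequency (%)
1 | 28 | 23.0%
2 | 18 | 14.8%
0 | 17 | 13.9%
3 | 12 | 9.8%
4 | 11 | 9.0%
5 | 9 | 7.4%
6 | 8 | 6.6%
9 | 7 | 5.7%
7 | 7 | 5.7%
8 | 5 | 4.1%

Space Separator
Value | Count | Frequency (%)
(space) | 124 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 31 | 100.0%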

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 402 | 59.2%
Common | 277 | 40.8%

Most frequent character per script

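Hangul
Value | Count | Frequency (%)
(not preserved) | 34 | 8.5%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 28 | 7.0%
Other values (19) | 92 | 22.9%

Common
Value | Count | Frequency (%)
(space) | 124 | 44.8%
- | 31 | 11.2%
1 | 28 | 10.1%
2 | 18 | 6.5%
0 | 17 | 6.1%
3 | 12 | 4.3%
4 | 11 | 4.0%
5 | 9 | 3.2%
6 | 8 | 2.9%
9 | 7 | 2.5%
Other values (2) | 12 | 4.3%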

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 402 | 59.2%
ASCII | 277 | 40.8%

Most frequent character per block

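ASCII
Value | Count | Frequency (%)
(space) | 124 | 44.8%
- | 31 | 11.2%
1 | 28 | 10.1%
2 | 18 | 6.5%
0 | 17 | 6.1%
3 | 12 | 4.3%
4 | 11 | 4.0%
5 | 9 | 3.2%
6 | 8 | 2.9%
9 | 7 | 2.5%
Other values (2) | 12 | 4.3%

Hangul
Value | Count | Frequency (%)
(not preserved) | 34 | 8.5%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 28 | 7.0%
Other values (19) | 92 | 22.9%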

PHT_DT
Real number (ℝ)

MISSING 

Distinct: 31
Distinct (%): 100.0%
Missing: 7
Missing (%): 18.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.0230222 × 10^13
Minimum: 2.0230117 × 10^13
Maximum: 2.0230309 × 10^13
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 474.0 B

Quantile statistics

Minimum: 2.0230117 × 10^13
5th percentile: 2.0230117 × 10^13
Q1: 2.0230119 × 10^13
Median: 2.0230222 × 10^13
Q3: 2.0230303 × 10^13
95th percentile: 2.0230309 × 10^13
Maximum: 2.0230309 × 10^13
Range: 1.9208969 × 10^8
Interquartile range (IQR): 1.8454525 × 10^8

Descriptive statistics

Standard deviation: 81602711
Coefficient of variation (CV): 4.0337031 × 10^-6
Kurtosis: -1.6485556
Mean: 2.0230222 × 10^13
Median Absolute Deviation (MAD): 81014388
Skewness: -0.26572985
Sum: 6.271369 × 10^14
Variance: 6.6590024 × 10^15
Monotonicity: Not monotonic
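
PHT_DT was profiled as a real number, but its 14-digit values look like timestamps in YYYYMMDDHHMMSS form. If so, the mean, variance, and skewness are arithmetic on digit strings and are hard to interpret. That reading is an assumption based on the values shown in this report, not something the report states. A sketch of parsing under that assumption (the helper name `parse_pht_dt` is hypothetical):

```python
from datetime import datetime

def parse_pht_dt(value) -> datetime:
    """Interpret a 14-digit number as YYYYMMDDHHMMSS (assumed layout)."""
    return datetime.strptime(str(int(value)), "%Y%m%d%H%M%S")

# Smallest and largest PHT_DT values listed in this report.
earliest = parse_pht_dt(20230117023840)
latest = parse_pht_dt(20230309113534)

print(earliest.isoformat())
print(latest.isoformat())
print(latest - earliest)  # elapsed time between the first and last record
```

Under this reading the records span roughly seven weeks in early 2023, which a datetime column type would summarise more meaningfully than the numeric statistics do.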
Histogram with fixed size bins (bins=31)
Value | Count | Frequency (%)
20230303035032 | 1 | 2.6%
20230117120628 | 1 | 2.6%
20230309111554 | 1 | 2.6%
20230303114555 | 1 | 2.6%
20230118023256 | 1 | 2.6%
20230309111449 | 1 | 2.6%
20230222094509 | 1 | 2.6%
20230309110826 | 1 | 2.6%
20230119123930 | 1 | 2.6%
20230117104447 | 1 | 2.6%
Other values (21) | 21 | 55.3%
(Missing) | 7 | 18.4%

Value | Count | Frequency (%)
20230117023840 | 1 | 2.6%
20230117023948 | 1 | 2.6%
20230117104447 | 1 | 2.6%
20230117112617 | 1 | 2.6%
20230117113033 | 1 | 2.6%
20230117120628 | 1 | 2.6%
20230118023256 | 1 | 2.6%
20230118024038 | 1 | 2.6%
20230119101147 | 1 | 2.6%
20230119123930 | 1 | 2.6%

Value | Count | Frequency (%)
20230309113534 | 1 | 2.6%
20230309111554 | 1 | 2.6%
20230309111449 | 1 | 2.6%
20230309110826 | 1 | 2.6%
20230309102302 | 1 | 2.6%
20230303114947 | 1 | 2.6%
20230303114555 | 1 | 2.6%
20230303112339 | 1 | 2.6%
20230303103349 | 1 | 2.6%
20230303094837 | 1 | 2.6%

FILE_NM
Text

MISSING 

Distinct: 31
Distinct (%): 100.0%
Missing: 7
Missing (%): 18.4%
Memory size: 436.0 B

Length

Max length: 49
Median length: 48
Mean length: 47.903226
Min length: 46

Characters and Unicode

Total characters: 1485
Distinct characters: 46
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 31
Unique (%): 100.0%

Sample

1st row: 06871082_전라남도 무안군 운남면 동암리 430-0_230119101147.jpg
2nd row: 06860234_전라남도 무안군 해제면 천장리 712-9_230303035032.jpg
3rd row: 06860555_전라남도 무안군 해제면 천장리 725-1_230118024038.jpg
4th row: 06862876_전라남도 무안군 해제면 창매리 232-0_230303114947.jpg
5th row: 06860219_전라남도 무안군 해제면 광산리 243-1_230117113033.jpg
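
The sample file names appear to concatenate the other three populated columns: a zero-padded FAMP_ID, the FMLD_ADDR text, and the PHT_DT value without its leading "20". That layout is inferred from the five samples above and is not documented in this report; the pattern and the helper `split_file_nm` below are hypothetical. A sketch of splitting a name under that assumption:

```python
import re

# Assumed layout, inferred from the sample values:
#   <8-digit zero-padded FAMP_ID>_<FMLD_ADDR>_<YYMMDDHHMMSS>.jpg
FILE_NM_PATTERN = re.compile(
    r"(?P<famp_id>\d{8})_(?P<address>.+)_(?P<stamp>\d{12})\.jpg"
)

def split_file_nm(file_nm: str) -> dict:
    """Split a FILE_NM value into the three fields it seems to encode."""
    match = FILE_NM_PATTERN.fullmatch(file_nm)
    if match is None:
        raise ValueError(f"unexpected FILE_NM layout: {file_nm!r}")
    return {
        "FAMP_ID": int(match["famp_id"]),
        "FMLD_ADDR": match["address"],
        # Assumes every date falls in the 2000s, so "20" restores the century.
        "PHT_DT": int("20" + match["stamp"]),
    }

parts = split_file_nm("06871082_전라남도 무안군 운남면 동암리 430-0_230119101147.jpg")
print(parts)
```

If the layout holds for every row, FILE_NM is fully determined by the other columns, which would be consistent with the 1.000 entries it shows in the first correlation table.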
Value | Count | Frequency (%)
무안군 | 31 | 20.0%
해제면 | 28 | 18.1%
천장리 | 8 | 5.2%
창매리 | 8 | 5.2%
산길리 | 4 | 2.6%
용학리 | 3 | 1.9%
운남면 | 3 | 1.9%
유월리 | 2 | 1.3%
양매리 | 2 | 1.3%
06860234_전라남도 | 1 | 0.6%
Other values (65) | 65 | 41.9%

Most occurring characters

Value | Count | Frequency (%)
0 | 144 | 9.7%
(space) | 124 | 8.4%
2 | 108 | 7.3%
1 | 105 | 7.1%
3 | 101 | 6.8%
6 | 87 | 5.9%
_ | 62 | 4.2%
8 | 51 | 3.4%
4 | 47 | 3.2%
5 | 37 | 2.5%
Other values (36) | 619 | 41.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 742 | 50.0%
Other Letter | 402 | 27.1%
Space Separator | 124 | 8.4%
Lowercase Letter | 93 | 6.3%
Connector Punctuation | 62 | 4.2%
Other Punctuation | 31 | 2.1%
Dash Punctuation | 31 | 2.1%

Most frequent character per category

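Other Letter
Value | Count | Frequency (%)
(not preserved) | 34 | 8.5%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 28 | 7.0%
Other values (19) | 92 | 22.9%

Decimal Number
Value | Count | Frequency (%)
0 | 144 | 19.4%
2 | 108 | 14.6%
1 | 105 | 14.2%
3 | 101 | 13.6%
6 | 87 | 11.7%
8 | 51 | 6.9%
4 | 47 | 6.3%
5 | 37 | 5.0%
9 | 33 | 4.4%
7 | 29 | 3.9%

Lowercase Letter
Value | Count | Frequency (%)
g | 31 | 33.3%
j | 31 | 33.3%
p | 31 | 33.3%

Space Separator
Value | Count | Frequency (%)
(space) | 124 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 62 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
. | 31 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 31 | 100.0%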

Most occurring scripts

Value | Count | Frequency (%)
Common | 990 | 66.7%
Hangul | 402 | 27.1%
Latin | 93 | 6.3%

Most frequent character per script

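Hangul
Value | Count | Frequency (%)
(not preserved) | 34 | 8.5%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 28 | 7.0%
Other values (19) | 92 | 22.9%

Common
Value | Count | Frequency (%)
0 | 144 | 14.5%
(space) | 124 | 12.5%
2 | 108 | 10.9%
1 | 105 | 10.6%
3 | 101 | 10.2%
6 | 87 | 8.8%
_ | 62 | 6.3%
8 | 51 | 5.2%
4 | 47 | 4.7%
5 | 37 | 3.7%
Other values (4) | 124 | 12.5%

Latin
Value | Count | Frequency (%)
g | 31 | 33.3%
j | 31 | 33.3%
p | 31 | 33.3%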

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1083 | 72.9%
Hangul | 402 | 27.1%

Most frequent character per block

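ASCII
Value | Count | Frequency (%)
0 | 144 | 13.3%
(space) | 124 | 11.4%
2 | 108 | 10.0%
1 | 105 | 9.7%
3 | 101 | 9.3%
6 | 87 | 8.0%
_ | 62 | 5.7%
8 | 51 | 4.7%
4 | 47 | 4.3%
5 | 37 | 3.4%
Other values (7) | 217 | 20.0%

Hangul
Value | Count | Frequency (%)
(not preserved) | 34 | 8.5%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 31 | 7.7%
(not preserved) | 28 | 7.0%
Other values (19) | 92 | 22.9%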

IMG_URL
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 38
Missing (%): 100.0%
Memory size: 474.0 B

Interactions

[Figures: four interaction plots; images not preserved]

Correlations

 | FAMP_ID | FMLD_ADDR | PHT_DT | FILE_NM
FAMP_ID | 1.000 | 1.000 | 0.493 | 1.000
FMLD_ADDR | 1.000 | 1.000 | 1.000 | 1.000
PHT_DT | 0.493 | 1.000 | 1.000 | 1.000
FILE_NM | 1.000 | 1.000 | 1.000 | 1.000

 | FAMP_ID | PHT_DT
FAMP_ID | 1.000 | -0.165
PHT_DT | -0.165 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
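
Nullity correlation can be illustrated by correlating per-column presence indicators. The sketch below uses a plain Pearson correlation on indicators reconstructed from the counts in this report (31 populated rows, 7 empty rows, IMG_URL empty throughout); it is an illustration of the idea, not necessarily the exact computation behind the heatmap, and the helper `pearson` is written here for the example.

```python
import math

def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        # A column that is always (or never) missing has no variance,
        # so its correlation with anything is undefined.
        return math.nan
    return cov / math.sqrt(vx * vy)

# Presence indicators: 1 = value present, 0 = missing.
present = {
    "FAMP_ID": [1] * 31 + [0] * 7,
    "PHT_DT": [1] * 31 + [0] * 7,
    "IMG_URL": [0] * 38,  # missing in every row
}

r_pair = pearson(present["FAMP_ID"], present["PHT_DT"])
r_img = pearson(present["FAMP_ID"], present["IMG_URL"])
print(round(r_pair, 6), r_img)
```

Because the same seven rows are empty in FAMP_ID, FMLD_ADDR, PHT_DT, and FILE_NM, any pair of those columns is perfectly nullity-correlated, while IMG_URL carries no information about the others.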

Sample

Index | FAMP_ID | FMLD_ADDR | PHT_DT | FILE_NM | IMG_URL
0 | 6871082 | 전라남도 무안군 운남면 동암리 430-0 | 20230119101147 | 06871082_전라남도 무안군 운남면 동암리 430-0_230119101147.jpg | <NA>
1 | 6860234 | 전라남도 무안군 해제면 천장리 712-9 | 20230303035032 | 06860234_전라남도 무안군 해제면 천장리 712-9_230303035032.jpg | <NA>
2 | 6860555 | 전라남도 무안군 해제면 천장리 725-1 | 20230118024038 | 06860555_전라남도 무안군 해제면 천장리 725-1_230118024038.jpg | <NA>
3 | 6862876 | 전라남도 무안군 해제면 창매리 232-0 | 20230303114947 | 06862876_전라남도 무안군 해제면 창매리 232-0_230303114947.jpg | <NA>
4 | 6860219 | 전라남도 무안군 해제면 광산리 243-1 | 20230117113033 | 06860219_전라남도 무안군 해제면 광산리 243-1_230117113033.jpg | <NA>
5 | 6864168 | 전라남도 무안군 해제면 용학리 404-0 | 20230207034507 | 06864168_전라남도 무안군 해제면 용학리 404-0_230207034507.jpg | <NA>
6 | 6864427 | 전라남도 무안군 해제면 용학리 64-2 | 20230303044256 | 06864427_전라남도 무안군 해제면 용학리 64-2_230303044256.jpg | <NA>
7 | 6863610 | 전라남도 무안군 해제면 천장리 901-5 | 20230303103349 | 06863610_전라남도 무안군 해제면 천장리 901-5_230303103349.jpg | <NA>
8 | 6864629 | 전라남도 무안군 해제면 유월리 375-16 | 20230309113534 | 06864629_전라남도 무안군 해제면 유월리 375-16_230309113534.jpg | <NA>
9 | 6863254 | 전라남도 무안군 해제면 창매리 521-1 | 20230117023948 | 06863254_전라남도 무안군 해제면 창매리 521-1_230117023948.jpg | <NA>

Index | FAMP_ID | FMLD_ADDR | PHT_DT | FILE_NM | IMG_URL
28 | 6862605 | 전라남도 무안군 해제면 창매리 126-2 | 20230303114555 | 06862605_전라남도 무안군 해제면 창매리 126-2_230303114555.jpg | <NA>
29 | 6861502 | 전라남도 무안군 해제면 천장리 846-0 | 20230309111554 | 06861502_전라남도 무안군 해제면 천장리 846-0_230309111554.jpg | <NA>
30 | 6863336 | 전라남도 무안군 해제면 천장리 251-1 | 20230117120628 | 06863336_전라남도 무안군 해제면 천장리 251-1_230117120628.jpg | <NA>
31 | <NA> | <NA> | <NA> | <NA> | <NA>
32 | <NA> | <NA> | <NA> | <NA> | <NA>
33 | <NA> | <NA> | <NA> | <NA> | <NA>
34 | <NA> | <NA> | <NA> | <NA> | <NA>
35 | <NA> | <NA> | <NA> | <NA> | <NA>
36 | <NA> | <NA> | <NA> | <NA> | <NA>
37 | <NA> | <NA> | <NA> | <NA> | <NA>

Duplicate rows

Most frequently occurring

Index | FAMP_ID | FMLD_ADDR | PHT_DT | FILE_NM | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | 7
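
The single duplicate entry is the all-empty record, which occurs seven times; the "1 (2.6%)" in the overview appears to count distinct duplicated records rather than repeated rows. The sketch below reproduces that shape with synthetic placeholder rows (the real values are not used, and the helper `duplicate_groups` is hypothetical) and shows the effect of dropping fully empty records.

```python
from collections import Counter

def duplicate_groups(rows):
    """Return {row: occurrences} for every row that occurs more than once."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}

# 31 distinct placeholder records plus 7 fully empty ones, mirroring the
# shape of the profiled data (FAMP_ID, FMLD_ADDR, PHT_DT, FILE_NM).
rows = [(i, f"addr-{i}", 20230101000000 + i, f"file-{i}.jpg") for i in range(31)]
rows += [(None, None, None, None)] * 7

groups = duplicate_groups(rows)
print(len(groups), groups)

# Dropping records with no populated field removes the duplicates entirely.
cleaned = [row for row in rows if any(value is not None for value in row)]
print(len(cleaned), len(duplicate_groups(cleaned)))
```

Removing the seven empty records would also clear every missing-value alert except the one for IMG_URL.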