Overview

Dataset statistics

Number of variables: 18
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 4
Duplicate rows (%): < 0.1%
Total size in memory: 1.6 MiB
Average record size in memory: 167.0 B
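
The memory figures and the duplicate-row percentage in this block are related by simple arithmetic. The following is a minimal sketch of that relationship, assuming the report rounds the total size to one decimal place in MiB; the variable names are illustrative and are not part of the report.

```python
# Figures copied from the "Dataset statistics" block of the report.
n_observations = 10_000
avg_record_bytes = 167.0   # "Average record size in memory"
duplicate_rows = 4

# Total size implied by the average record size, in MiB (1 MiB = 1024**2 bytes).
total_mib = n_observations * avg_record_bytes / 1024**2

# Share of duplicate rows, as a percentage of all observations.
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"implied total size: {total_mib:.2f} MiB")
print(f"duplicate rows: {duplicate_pct:.2f}%")
```

Under that rounding assumption the implied total agrees with the reported 1.6 MiB, and the duplicate share falls below the 0.1% threshold shown in the report.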

Variable types

Categorical: 2
Numeric: 14
Text: 2

Dataset

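Description: Hourly observation status of Automatic Weather Stations (AWS) (original title: 자동기상관측장비(AWS) 시간별 관측현황)
Author: Gyeonggi-do (경기도)
URL: https://data.gg.go.kr/portal/data/service/selectServicePage.do?&infId=458YRRY04VI3BBMI6Q8326869752&infSeq=1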

Alerts

Dataset has 4 (< 0.1%) duplicate rows [Duplicates]
지점번호 is highly overall correlated with 시군명 [High correlation]
WGS84위도 is highly overall correlated with 시군명 [High correlation]
WGS84경도 is highly overall correlated with 고도(m) and 1 other fields [High correlation]
고도(m) is highly overall correlated with WGS84경도 and 1 other fields [High correlation]
기온(℃) is highly overall correlated with 습도(%) [High correlation]
습도(%) is highly overall correlated with 기온(℃) and 1 other fields [High correlation]
현지기압(hPa) is highly overall correlated with 해면기압(hPa) and 1 other fields [High correlation]
해면기압(hPa) is highly overall correlated with 현지기압(hPa) and 1 other fields [High correlation]
시간누적강우량(mm) is highly overall correlated with 일누적강우량(mm) [High correlation]
일누적강우량(mm) is highly overall correlated with 시간누적강우량(mm) [High correlation]
시군명 is highly overall correlated with 지점번호 and 6 other fields [High correlation]
강수감지(0:없음1:있음2:오류) is highly imbalanced (59.5%) [Imbalance]
관측시간 has 436 (4.4%) zeros [Zeros]
풍향(deg) has 884 (8.8%) zeros [Zeros]
풍속(m/s) has 1108 (11.1%) zeros [Zeros]
시간누적강우량(mm) has 8680 (86.8%) zeros [Zeros]
일누적강우량(mm) has 6579 (65.8%) zeros [Zeros]

Reproduction

Analysis started: 2024-07-13 12:40:53.244644
Analysis finished: 2024-07-13 12:42:09.060546
Duration: 1 minute and 15.82 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시군명 (city/county name)
Categorical

HIGH CORRELATION

Distinct: 31
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
포천시: 889
연천군: 705
여주시: 605
화성시: 590
평택시: 538
Other values (26): 6673

Length

Max length: 4
Median length: 3
Mean length: 3.0577
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 이천시
2nd row: 파주시
3rd row: 수원시
4th row: 광주시
5th row: 화성시

Common Values

Value | Count | Frequency (%)
포천시 | 889 | 8.9%
연천군 | 705 | 7.0%
여주시 | 605 | 6.0%
화성시 | 590 | 5.9%
평택시 | 538 | 5.4%
안성시 | 533 | 5.3%
파주시 | 513 | 5.1%
양주시 | 504 | 5.0%
가평군 | 454 | 4.5%
용인시 | 444 | 4.4%
Other values (21) | 4225 | 42.2%

Length

[Figure: Histogram of lengths of the category]

지점번호 (station number)
Real number (ℝ)

HIGH CORRELATION

Distinct: 135
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 468.7314
Minimum: 98
Maximum: 967
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 98
5-th percentile: 326
Q1: 436
median: 470
Q3: 533
95-th percentile: 590
Maximum: 967
Range: 869
Interquartile range (IQR): 97

Descriptive statistics

Standard deviation: 116.11352
Coefficient of variation (CV): 0.24771868
Kurtosis: 5.8678561
Mean: 468.7314
Median Absolute Deviation (MAD): 39
Skewness: 0.24478369
Sum: 4687314
Variance: 13482.35
Monotonicity: Not monotonic
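
Several of the 지점번호 statistics are derived from one another: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the range and IQR are differences of quantiles. The sketch below recomputes them from the reported values as a consistency check. It uses only the numbers printed in this report, so small differences from rounding are expected; the variable names are illustrative.

```python
# Values copied from the 지점번호 statistics in this report.
mean = 468.7314
std = 116.11352
q1, q3 = 436, 533
minimum, maximum = 98, 967

cv = std / mean                   # coefficient of variation
variance = std ** 2               # variance is the square of the standard deviation
iqr = q3 - q1                     # interquartile range
value_range = maximum - minimum   # range

print(f"CV={cv:.6f} variance={variance:.2f} IQR={iqr} range={value_range}")
```

Because 지점번호 appears to be a station identifier rather than a measured quantity, these figures are mainly useful for checking the report's internal consistency.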
[Figure: Histogram with fixed size bins (bins=50)]
Value | Count | Frequency (%)
548 | 94 | 0.9%
434 | 92 | 0.9%
565 | 92 | 0.9%
456 | 91 | 0.9%
353 | 90 | 0.9%
203 | 88 | 0.9%
351 | 88 | 0.9%
538 | 88 | 0.9%
98 | 87 | 0.9%
966 | 87 | 0.9%
Other values (125) | 9103 | 91.0%

Minimum 10 values
Value | Count | Frequency (%)
98 | 87 | 0.9%
99 | 65 | 0.7%
116 | 65 | 0.7%
119 | 77 | 0.8%
202 | 68 | 0.7%
203 | 88 | 0.9%
326 | 66 | 0.7%
351 | 88 | 0.9%
352 | 73 | 0.7%
353 | 90 | 0.9%

Maximum 10 values
Value | Count | Frequency (%)
967 | 70 | 0.7%
966 | 87 | 0.9%
692 | 72 | 0.7%
652 | 70 | 0.7%
599 | 62 | 0.6%
598 | 87 | 0.9%
590 | 87 | 0.9%
589 | 73 | 0.7%
576 | 86 | 0.9%
575 | 83 | 0.8%

Distinct: 135
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 5
Median length: 2
Mean length: 2.5042
Min length: 2

Characters and Unicode

Total characters: 25042
Distinct characters: 124
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 장호원
2nd row: 도라산
3rd row: 경기
4th row: 경기광주
5th row: 운평

Value | Count | Frequency (%)
 | 840 | 7.7%
여주 | 94 | 0.9%
시흥 | 92 | 0.8%
안양 | 92 | 0.8%
연천 | 91 | 0.8%
덕정동 | 90 | 0.8%
이천 | 88 | 0.8%
남면 | 88 | 0.8%
신서 | 88 | 0.8%
과천 | 87 | 0.8%
Other values (126) | 9190 | 84.8%

Most occurring characters

Value | Count | Frequency (%)
 | 1133 | 4.5%
 | 865 | 3.5%
* | 840 | 3.4%
(space) | 840 | 3.4%
 | 815 | 3.3%
 | 719 | 2.9%
 | 692 | 2.8%
 | 610 | 2.4%
 | 561 | 2.2%
 | 463 | 1.8%
Other values (114) | 17504 | 69.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 23232 | 92.8%
Other Punctuation | 840 | 3.4%
Space Separator | 840 | 3.4%
Close Punctuation | 65 | 0.3%
Open Punctuation | 65 | 0.3%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 1133 | 4.9%
 | 865 | 3.7%
 | 815 | 3.5%
 | 719 | 3.1%
 | 692 | 3.0%
 | 610 | 2.6%
 | 561 | 2.4%
 | 463 | 2.0%
 | 447 | 1.9%
 | 441 | 1.9%
Other values (110) | 16486 | 71.0%

Other Punctuation
Value | Count | Frequency (%)
* | 840 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 840 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 65 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 65 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 23232 | 92.8%
Common | 1810 | 7.2%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 1133 | 4.9%
 | 865 | 3.7%
 | 815 | 3.5%
 | 719 | 3.1%
 | 692 | 3.0%
 | 610 | 2.6%
 | 561 | 2.4%
 | 463 | 2.0%
 | 447 | 1.9%
 | 441 | 1.9%
Other values (110) | 16486 | 71.0%

Common
Value | Count | Frequency (%)
* | 840 | 46.4%
(space) | 840 | 46.4%
) | 65 | 3.6%
( | 65 | 3.6%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 23232 | 92.8%
ASCII | 1810 | 7.2%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 1133 | 4.9%
 | 865 | 3.7%
 | 815 | 3.5%
 | 719 | 3.1%
 | 692 | 3.0%
 | 610 | 2.6%
 | 561 | 2.4%
 | 463 | 2.0%
 | 447 | 1.9%
 | 441 | 1.9%
Other values (110) | 16486 | 71.0%

ASCII
Value | Count | Frequency (%)
* | 840 | 46.4%
(space) | 840 | 46.4%
) | 65 | 3.6%
( | 65 | 3.6%

Distinct: 135
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 18
Median length: 15
Mean length: 13.6627
Min length: 10

Characters and Unicode

Total characters: 136627
Distinct characters: 155
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 경기도 이천시 장호원읍 진암리
2nd row: 경기도 파주시 장단면 도라산리
3rd row: 경기도 수원시팔달구 매산로3가
4th row: 경기도 광주시 송정동
5th row: 경기도 화성시 우정읍 운평리

Value | Count | Frequency (%)
경기도 | 10000 | 27.8%
포천시 | 889 | 2.5%
연천군 | 705 | 2.0%
여주시 | 605 | 1.7%
화성시 | 590 | 1.6%
평택시 | 538 | 1.5%
안성시 | 533 | 1.5%
파주시 | 513 | 1.4%
양주시 | 504 | 1.4%
가평군 | 454 | 1.3%
Other values (236) | 20623 | 57.4%

Most occurring characters

Value | Count | Frequency (%)
(space) | 25954 | 19.0%
 | 10311 | 7.5%
 | 10211 | 7.5%
 | 10000 | 7.3%
 | 8640 | 6.3%
 | 6177 | 4.5%
 | 4943 | 3.6%
 | 4698 | 3.4%
 | 3305 | 2.4%
 | 2260 | 1.7%
Other values (145) | 50128 | 36.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 110590 | 80.9%
Space Separator | 25954 | 19.0%
Decimal Number | 83 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 10311 | 9.3%
 | 10211 | 9.2%
 | 10000 | 9.0%
 | 8640 | 7.8%
 | 6177 | 5.6%
 | 4943 | 4.5%
 | 4698 | 4.2%
 | 3305 | 3.0%
 | 2260 | 2.0%
 | 2256 | 2.0%
Other values (143) | 47789 | 43.2%

Space Separator
Value | Count | Frequency (%)
(space) | 25954 | 100.0%

Decimal Number
Value | Count | Frequency (%)
3 | 83 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 110590 | 80.9%
Common | 26037 | 19.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 10311 | 9.3%
 | 10211 | 9.2%
 | 10000 | 9.0%
 | 8640 | 7.8%
 | 6177 | 5.6%
 | 4943 | 4.5%
 | 4698 | 4.2%
 | 3305 | 3.0%
 | 2260 | 2.0%
 | 2256 | 2.0%
Other values (143) | 47789 | 43.2%

Common
Value | Count | Frequency (%)
(space) | 25954 | 99.7%
3 | 83 | 0.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 110590 | 80.9%
ASCII | 26037 | 19.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 25954 | 99.7%
3 | 83 | 0.3%

Hangul
Value | Count | Frequency (%)
 | 10311 | 9.3%
 | 10211 | 9.2%
 | 10000 | 9.0%
 | 8640 | 7.8%
 | 6177 | 5.6%
 | 4943 | 4.5%
 | 4698 | 4.2%
 | 3305 | 3.0%
 | 2260 | 2.0%
 | 2256 | 2.0%
Other values (143) | 47789 | 43.2%

관측일자 (observation date)
Real number (ℝ)

Distinct: 28
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20240663
Minimum: 20240616
Maximum: 20240713
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 20240616
5-th percentile: 20240617
Q1: 20240623
median: 20240630
Q3: 20240707
95-th percentile: 20240712
Maximum: 20240713
Range: 97
Interquartile range (IQR): 84

Descriptive statistics

Standard deviation: 42.110073
Coefficient of variation (CV): 2.0804691 × 10^-6
Kurtosis: -1.953728
Mean: 20240663
Median Absolute Deviation (MAD): 13
Skewness: 0.1043448
Sum: 2.0240663 × 10^11
Variance: 1773.2582
Monotonicity: Not monotonic
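
관측일자 is profiled as a real number, and its values look like dates encoded as YYYYMMDD integers. Under that reading the numeric statistics treat 20240630 and 20240701 as 71 apart rather than one day apart. The sketch below, which assumes the YYYYMMDD interpretation holds for every row, compares the reported integer range with the calendar span. The helper `parse_yyyymmdd` is hypothetical and is not part of the dataset or of ydata-profiling.

```python
from datetime import date, datetime


def parse_yyyymmdd(value: int) -> date:
    """Convert an integer such as 20240616 into a calendar date."""
    return datetime.strptime(str(value), "%Y%m%d").date()


# Minimum and maximum copied from the 관측일자 statistics in this report.
first_day = parse_yyyymmdd(20240616)
last_day = parse_yyyymmdd(20240713)

integer_range = 20240713 - 20240616           # what the report lists as "Range"
calendar_span = (last_day - first_day).days   # days between the two dates
inclusive_days = calendar_span + 1            # calendar days covered, endpoints included

print(integer_range, calendar_span, inclusive_days)
```

The inclusive day count can be compared with the Distinct value reported for this variable. Parsing the column as a date before profiling would make the quantile and dispersion figures interpretable in days.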
[Figure: Histogram with fixed size bins (bins=28)]
Value | Count | Frequency (%)
20240621 | 390 | 3.9%
20240624 | 389 | 3.9%
20240710 | 386 | 3.9%
20240617 | 385 | 3.9%
20240623 | 383 | 3.8%
20240706 | 379 | 3.8%
20240620 | 377 | 3.8%
20240618 | 376 | 3.8%
20240619 | 375 | 3.8%
20240708 | 375 | 3.8%
Other values (18) | 6185 | 61.9%

Minimum 10 values
Value | Count | Frequency (%)
20240616 | 221 | 2.2%
20240617 | 385 | 3.9%
20240618 | 376 | 3.8%
20240619 | 375 | 3.8%
20240620 | 377 | 3.8%
20240621 | 390 | 3.9%
20240622 | 361 | 3.6%
20240623 | 383 | 3.8%
20240624 | 389 | 3.9%
20240625 | 336 | 3.4%

Maximum 10 values
Value | Count | Frequency (%)
20240713 | 359 | 3.6%
20240712 | 362 | 3.6%
20240711 | 355 | 3.5%
20240710 | 386 | 3.9%
20240709 | 321 | 3.2%
20240708 | 375 | 3.8%
20240707 | 374 | 3.7%
20240706 | 379 | 3.8%
20240705 | 357 | 3.6%
20240704 | 353 | 3.5%

관측시간 (observation hour)
Real number (ℝ)

ZEROS

Distinct: 24
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.4367
Minimum: 0
Maximum: 23
Zeros: 436
Zeros (%): 4.4%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 5
median: 11
Q3: 17
95-th percentile: 22
Maximum: 23
Range: 23
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 6.905705
Coefficient of variation (CV): 0.60381972
Kurtosis: -1.1987209
Mean: 11.4367
Median Absolute Deviation (MAD): 6
Skewness: 0.0021487352
Sum: 114367
Variance: 47.688762
Monotonicity: Not monotonic
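
The ZEROS flag on 관측시간 is easier to interpret next to a baseline. With 24 distinct values from 0 to 23, the variable looks like an hour of day, in which case 0 would be the midnight hour rather than a missing reading; that reading is an assumption, not something the report states. The sketch below compares the observed share of zeros with the share expected if all 24 hours were equally frequent. Variable names are illustrative.

```python
# Counts copied from the 관측시간 statistics in this report.
n_observations = 10_000
zero_count = 436        # rows where 관측시간 == 0
distinct_hours = 24

observed_share = zero_count / n_observations
uniform_share = 1 / distinct_hours   # expected share if every hour were equally frequent

print(f"observed {observed_share:.2%}, uniform baseline {uniform_share:.2%}")
```

If the hour-of-day reading is right, a zero share close to the uniform baseline suggests the alert reflects the encoding of midnight rather than a data-quality problem.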
[Figure: Histogram with fixed size bins (bins=24)]
Value | Count | Frequency (%)
12 | 444 | 4.4%
19 | 439 | 4.4%
0 | 436 | 4.4%
2 | 434 | 4.3%
10 | 430 | 4.3%
4 | 428 | 4.3%
6 | 426 | 4.3%
21 | 426 | 4.3%
15 | 425 | 4.2%
14 | 424 | 4.2%
Other values (14) | 5688 | 56.9%

Minimum 10 values
Value | Count | Frequency (%)
0 | 436 | 4.4%
1 | 405 | 4.0%
2 | 434 | 4.3%
3 | 417 | 4.2%
4 | 428 | 4.3%
5 | 393 | 3.9%
6 | 426 | 4.3%
7 | 413 | 4.1%
8 | 399 | 4.0%
9 | 420 | 4.2%

Maximum 10 values
Value | Count | Frequency (%)
23 | 388 | 3.9%
22 | 404 | 4.0%
21 | 426 | 4.3%
20 | 409 | 4.1%
19 | 439 | 4.4%
18 | 384 | 3.8%
17 | 419 | 4.2%
16 | 417 | 4.2%
15 | 425 | 4.2%
14 | 424 | 4.2%

WGS84위도 (WGS84 latitude)
Real number (ℝ)

HIGH CORRELATION

Distinct: 135
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.512361
Minimum: 36.9514
Maximum: 38.1725
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 36.9514
5-th percentile: 37.0326
Q1: 37.2344
median: 37.4403
Q3: 37.8025
95-th percentile: 38.0586
Maximum: 38.1725
Range: 1.2211
Interquartile range (IQR): 0.5681

Descriptive statistics

Standard deviation: 0.33291355
Coefficient of variation (CV): 0.0088747692
Kurtosis: -1.1275207
Mean: 37.512361
Median Absolute Deviation (MAD): 0.2737
Skewness: 0.22978274
Sum: 375123.61
Variance: 0.11083143
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Value | Count | Frequency (%)
37.2688 | 94 | 0.9%
37.3944 | 92 | 0.9%
37.3915 | 92 | 0.9%
38.0963 | 91 | 0.9%
37.8414 | 90 | 0.9%
37.264 | 88 | 0.9%
37.8978 | 88 | 0.9%
38.1725 | 88 | 0.9%
37.9019 | 87 | 0.9%
37.1161 | 87 | 0.9%
Other values (125) | 9103 | 91.0%

Minimum 10 values
Value | Count | Frequency (%)
36.9514 | 72 | 0.7%
36.9672 | 64 | 0.6%
36.9806 | 77 | 0.8%
36.9845 | 76 | 0.8%
36.9877 | 81 | 0.8%
37.0037 | 62 | 0.6%
37.0326 | 82 | 0.8%