Overview

Dataset statistics

Number of variables: 9
Number of observations: 6927
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 521.0 KiB
Average record size in memory: 77.0 B
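
The dataset statistics above can be reproduced in outline with a few lines of Python. The following is a minimal sketch using only the standard library, not the profiler's own implementation; the `dataset_statistics` helper and the sample rows are hypothetical and exist only for illustration. The last line checks that the reported average record size is consistent with the reported total size and row count.

```python
from typing import Any


def dataset_statistics(rows: list[dict[str, Any]]) -> dict[str, float]:
    """Summarise a table given as a list of row dictionaries."""
    n_obs = len(rows)
    columns = list(rows[0].keys()) if rows else []
    n_cells = n_obs * len(columns)
    # A cell counts as missing when it is None or an empty string.
    missing = sum(
        1 for row in rows for col in columns if row.get(col) in (None, "")
    )
    # A row counts as a duplicate when an identical row appeared earlier.
    seen = set()
    duplicates = 0
    for row in rows:
        key = tuple(row.get(col) for col in columns)
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return {
        "variables": len(columns),
        "observations": n_obs,
        "missing_cells": missing,
        "missing_pct": 100.0 * missing / n_cells if n_cells else 0.0,
        "duplicate_rows": duplicates,
        "duplicate_pct": 100.0 * duplicates / n_obs if n_obs else 0.0,
    }


# Hypothetical sample rows, not taken from the dataset.
sample = [
    {"route": "601", "boardings": 12, "alightings": 9},
    {"route": "441", "boardings": 0, "alightings": 3},
    {"route": "441", "boardings": 0, "alightings": 3},
    {"route": "n26", "boardings": None, "alightings": 5},
]
stats = dataset_statistics(sample)

# Consistency check on the reported figures: 521.0 KiB spread over 6927 rows.
avg_record_bytes = 521.0 * 1024 / 6927
```

Dividing the total memory size by the number of observations gives roughly 77 bytes per record, which agrees with the value reported above.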

Variable types

Categorical: 2
Text: 4
Numeric: 3

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-12912/S/1/datasetView.do

Alerts

사용일자 (usage date) has constant value "20230701" [Constant]
등록일자 (registration date) has a constant value [Constant]
승차총승객수 (total boarding passengers) is highly overall correlated with 하차총승객수 (total alighting passengers) [High correlation]
하차총승객수 is highly overall correlated with 승차총승객수 [High correlation]
승차총승객수 has 440 (6.4%) zeros [Zeros]
하차총승객수 has 302 (4.4%) zeros [Zeros]
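
The three kinds of alert listed above (constant columns, zero counts, high correlation) can be illustrated with a short standard-library sketch. The column names come from the report, but the values in `toy` are hypothetical. The report does not state which correlation coefficient triggered the alert, so Pearson's r is used here purely as an illustration.

```python
import math


def constant_columns(table: dict[str, list]) -> list[str]:
    """Return the names of columns that hold a single distinct value."""
    return [name for name, values in table.items() if len(set(values)) == 1]


def zero_share(values: list[float]) -> tuple[int, float]:
    """Return the number of zeros and their share in percent (one decimal)."""
    zeros = sum(1 for v in values if v == 0)
    return zeros, round(100.0 * zeros / len(values), 1)


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical values; only the column names are taken from the report.
toy = {
    "사용일자": ["20230701"] * 5,
    "승차총승객수": [0, 4, 10, 25, 3],
    "하차총승객수": [1, 5, 9, 27, 0],
}

constant = constant_columns(toy)
zeros, zeros_pct = zero_share(toy["승차총승객수"])
r = pearson(toy["승차총승객수"], toy["하차총승객수"])

# The reported percentages follow from the reported counts and 6927 rows.
boarding_zero_pct = round(100.0 * 440 / 6927, 1)
alighting_zero_pct = round(100.0 * 302 / 6927, 1)
```

The last two lines confirm that 440 and 302 zeros out of 6927 observations round to the 6.4% and 4.4% shown in the alerts.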

Reproduction

Analysis started: 2024-07-13 17:38:07.364538
Analysis finished: 2024-07-13 17:38:09.794989
Duration: 2.43 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

사용일자 (usage date)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 54.2 KiB

Value     Count
20230701  6927

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20230701
2nd row: 20230701
3rd row: 20230701
4th row: 20230701
5th row: 20230701

Common Values

Value     Count  Frequency (%)
20230701  6927   100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Distinct: 82
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 54.2 KiB

Length

Max length: 6
Median length: 4
Mean length: 3.5599827
Min length: 3

Characters and Unicode

Total characters: 24660
Distinct characters: 16
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
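
As a rough illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. This is not the profiler's own code; the `category_counts` helper and the `LONG_NAMES` mapping are introduced here for illustration, and the input string is one of the sample route names that appears later in this report.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that occur in this report.
LONG_NAMES = {
    "Nd": "Decimal Number",
    "Lu": "Uppercase Letter",
    "Lo": "Other Letter",
    "Pd": "Dash Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "Sm": "Math Symbol",
    "Zs": "Space Separator",
}


def category_counts(values: list[str]) -> Counter:
    """Count characters per Unicode general category across all values."""
    counts: Counter = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


counts = category_counts(["601번(개화동~종로4가)"])
labelled = {LONG_NAMES.get(code, code): n for code, n in counts.items()}
```

Hangul syllables fall under "Other Letter" and the tilde under "Math Symbol", which is why those categories appear in the character tables of this report.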

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 601
2nd row: 441
3rd row: 601
4th row: 441
5th row: 441

Value              Count  Frequency (%)
n26                256    3.7%
n37                207    3.0%
542                138    2.0%
9701               127    1.8%
661                125    1.8%
541                123    1.8%
441                121    1.7%
9408               121    1.7%
302                120    1.7%
9403               120    1.7%
Other values (72)  5469   79.0%

Most occurring characters

Value             Count  Frequency (%)
7                 3672   14.9%
0                 3291   13.3%
5                 3188   12.9%
1                 2802   11.4%
2                 2685   10.9%
4                 2320   9.4%
6                 2239   9.1%
3                 2215   9.0%
9                 862    3.5%
N                 500    2.0%
Other values (6)  886    3.6%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    23702  96.1%
Uppercase Letter  788    3.2%
Other Letter      154    0.6%
Dash Punctuation  16     0.1%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
7      3672   15.5%
0      3291   13.9%
5      3188   13.5%
1      2802   11.8%
2      2685   11.3%
4      2320   9.8%
6      2239   9.4%
3      2215   9.3%
9      862    3.6%
8      428    1.8%

Uppercase Letter
Value  Count  Frequency (%)
N      500    63.5%
B      200    25.4%
A      88     11.2%

Other Letter
Value        Count  Frequency (%)
(not shown)  77     50.0%
(not shown)  77     50.0%

Dash Punctuation
Value  Count  Frequency (%)
-      16     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  23718  96.2%
Latin   788    3.2%
Hangul  154    0.6%

Most frequent character per script

Common
Value  Count  Frequency (%)
7      3672   15.5%
0      3291   13.9%
5      3188   13.4%
1      2802   11.8%
2      2685   11.3%
4      2320   9.8%
6      2239   9.4%
3      2215   9.3%
9      862    3.6%
8      428    1.8%

Latin
Value  Count  Frequency (%)
N      500    63.5%
B      200    25.4%
A      88     11.2%

Hangul
Value        Count  Frequency (%)
(not shown)  77     50.0%
(not shown)  77     50.0%

Most occurring blocks

Value   Count  Frequency (%)
ASCII   24506  99.4%
Hangul  154    0.6%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
7                 3672   15.0%
0                 3291   13.4%
5                 3188   13.0%
1                 2802   11.4%
2                 2685   11.0%
4                 2320   9.5%
6                 2239   9.1%
3                 2215   9.0%
9                 862    3.5%
N                 500    2.0%
Other values (4)  732    3.0%

Hangul
Value        Count  Frequency (%)
(not shown)  77     50.0%
(not shown)  77     50.0%

Distinct: 84
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 54.2 KiB

Length

Max length: 32
Median length: 21
Mean length: 17.089794
Min length: 12

Characters and Unicode

Total characters: 118381
Distinct characters: 163
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 601번(개화동~종로4가)
2nd row: 441번(월암공영차고지~신사사거리)
3rd row: 601번(개화동~종로4가)
4th row: 441번(월암공영차고지~신사사거리)
5th row: 441번(월암공영차고지~신사사거리)

Value                            Count  Frequency (%)
542번(군포버스공영차고지~신사역       138    1.9%
n26번(강서공영차고지~중랑공영차고지    129    1.8%
9701번(가좌동~서울역               127    1.8%
n26번(중랑공영차고지~강서공영차고지    127    1.8%
661번(부천상동~영등포역,신세계백화점   125    1.7%
541번(군포공영차고지~강남역          123    1.7%
9408번(구미동차고지~고속터미널       121    1.7%
441번(월암공영차고지~신사사거리       121    1.7%
302번(성남~동대문                  120    1.7%
9403번(구미동차고지~중곡역          120    1.7%
Other values (77)                5946   82.6%

Most occurring characters

Value               Count  Frequency (%)
)                   7076   6.0%
(                   7076   6.0%
~                   6927   5.9%
(not shown)         6681   5.6%
(not shown)         4452   3.8%
(not shown)         4137   3.5%
(not shown)         4136   3.5%
(not shown)         3827   3.2%
7                   3672   3.1%
0                   3291   2.8%
Other values (153)  67106  56.7%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       71640  60.5%
Decimal Number     23844  20.1%
Close Punctuation  7076   6.0%
Open Punctuation   7076   6.0%
Math Symbol        6927   5.9%
Uppercase Letter   820    0.7%
Other Punctuation  712    0.6%
Space Separator    270    0.2%
Dash Punctuation   16     < 0.1%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
(not shown)         6681   9.3%
(not shown)         4452   6.2%
(not shown)         4137   5.8%
(not shown)         4136   5.8%
(not shown)         3827   5.3%
(not shown)         3045   4.3%
(not shown)         2862   4.0%
(not shown)         2718   3.8%
(not shown)         1824   2.5%
(not shown)         1784   2.5%
Other values (132)  36174  50.5%

Decimal Number
Value  Count  Frequency (%)
7      3672   15.4%
0      3291   13.8%
5      3188   13.4%
1      2942   12.3%
2      2685   11.3%
4      2322   9.7%
6      2239   9.4%
3      2215   9.3%
9      862    3.6%
8      428    1.8%

Uppercase Letter
Value  Count  Frequency (%)
N      500    61.0%
B      200    24.4%
A      104    12.7%
K      16     2.0%

Other Punctuation
Value  Count  Frequency (%)
,      579    81.3%
.      133    18.7%

Close Punctuation
Value  Count  Frequency (%)
)      7076   100.0%

Open Punctuation
Value  Count  Frequency (%)
(      7076   100.0%

Math Symbol
Value  Count  Frequency (%)
~      6927   100.0%

Space Separator
Value    Count  Frequency (%)
(space)  270    100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      16     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  71640  60.5%
Common  45921  38.8%
Latin   820    0.7%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
(not shown)         6681   9.3%
(not shown)         4452   6.2%
(not shown)         4137   5.8%
(not shown)         4136   5.8%
(not shown)         3827   5.3%
(not shown)         3045   4.3%
(not shown)         2862   4.0%
(not shown)         2718   3.8%
(not shown)         1824   2.5%
(not shown)         1784   2.5%
Other values (132)  36174  50.5%

Common
Value             Count  Frequency (%)
)                 7076   15.4%
(                 7076   15.4%
~                 6927   15.1%
7                 3672   8.0%
0                 3291   7.2%
5                 3188   6.9%
1                 2942   6.4%
2                 2685   5.8%
4                 2322   5.1%
6                 2239   4.9%
Other values (7)  4503   9.8%

Latin
Value  Count  Frequency (%)
N      500    61.0%
B      200    24.4%
A      104    12.7%
K      16     2.0%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  71640  60.5%
ASCII   46741  39.5%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
)                  7076   15.1%
(                  7076   15.1%
~                  6927   14.8%
7                  3672   7.9%
0                  3291   7.0%
5                  3188   6.8%
1                  2942   6.3%
2                  2685   5.7%
4                  2322   5.0%
6                  2239   4.8%
Other values (11)  5323   11.4%

Hangul
Value               Count  Frequency (%)
(not shown)         6681   9.3%
(not shown)         4452   6.2%
(not shown)         4137   5.8%
(not shown)         4136   5.8%
(not shown)         3827   5.3%
(not shown)         3045   4.3%
(not shown)         2862   4.0%
(not shown)         2718   3.8%
(not shown)         1824   2.5%
(not shown)         1784   2.5%
Other values (132)  36174  50.5%

표준버스정류장ID (standard bus stop ID)
Real number (ℝ)

Distinct: 3705
Distinct (%): 53.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.5409767 × 10^8
Minimum: 1 × 10^8
Maximum: 9.998 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 61.0 KiB

Quantile statistics

Minimum: 1 × 10^8
5-th percentile: 1.0200001 × 10^8
Q1: 1.1200041 × 10^8
median: 1.21 × 10^8
Q3: 2.1000034 × 10^8
95-th percentile: 2.220015 × 10^8
Maximum: 9.998 × 10^8
Range: 8.998 × 10^8
Interquartile range (IQR): 97999936
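
The interquartile range is simply Q3 minus Q1. The sketch below shows one way to compute quartiles with the standard library's `statistics.quantiles`, using the `inclusive` method, which interpolates linearly between data points; whether the profiler uses exactly this interpolation is an assumption. The five stop IDs form a hypothetical subset chosen for illustration.

```python
import statistics

# Hypothetical subset of stop IDs, chosen only to illustrate the calculation.
stop_ids = [100000001, 112000416, 117000003, 121000007, 999800005]

# n=4 yields the three quartile cut points: Q1, median, Q3.
q1, median, q3 = statistics.quantiles(stop_ids, n=4, method="inclusive")
iqr = q3 - q1

# Cross-check of the reported values: Q3 and Q1 are displayed rounded to eight
# significant figures, so their difference can deviate slightly from the
# reported IQR.
iqr_from_rounded = 210000340 - 112000410
reported_iqr = 97999936
gap = abs(iqr_from_rounded - reported_iqr)
```

The small gap between the two IQR figures comes from display rounding of Q1 and Q3, not from an inconsistency in the statistics.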

Descriptive statistics

Standard deviation: 65536931
Coefficient of variation (CV): 0.42529477
Kurtosis: 69.052473
Mean: 1.5409767 × 10^8
Median Absolute Deviation (MAD): 14999794
Skewness: 5.7109508
Sum: 1.0674345 × 10^12
Variance: 4.2950893 × 10^15
Monotonicity: Not monotonic
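
Several of these descriptive statistics are linked by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the square of the standard deviation, and the sum is the mean times the number of observations. The following sketch checks the reported figures against those identities; the inputs are the rounded values shown in this report, so agreement is only expected up to rounding.

```python
import math

# Values as displayed in the report (rounded for display).
mean = 1.5409767e8
std = 65536931
n_obs = 6927

cv = std / mean          # coefficient of variation
variance = std ** 2      # variance as the square of the standard deviation
total = mean * n_obs     # sum of all values

cv_matches = math.isclose(cv, 0.42529477, rel_tol=1e-6)
variance_matches = math.isclose(variance, 4.2950893e15, rel_tol=1e-6)
sum_matches = math.isclose(total, 1.0674345e12, rel_tol=1e-6)
```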
[Figure: Histogram with fixed size bins (bins=50)]

Value                Count  Frequency (%)
117000003            11     0.2%
121000007            11     0.2%
121000005            11     0.2%
121000014            10     0.1%
121000009            10     0.1%
121000008            10     0.1%
117000002            10     0.1%
117000004            10     0.1%
121000015            9      0.1%
112000416            9      0.1%
Other values (3695)  6826   98.5%

Minimum 10 values
Value      Count  Frequency (%)
100000001  2      < 0.1%
100000002  1      < 0.1%
100000003  1      < 0.1%
100000004  2      < 0.1%
100000005  1      < 0.1%
100000006  1      < 0.1%
100000007  1      < 0.1%
100000008  1      < 0.1%
100000015  1      < 0.1%
100000016  1      < 0.1%

Maximum 10 values
Value      Count  Frequency (%)
999800005  2      < 0.1%
999800004  1      < 0.1%
999800003  1      < 0.1%
999033574  4      0.1%
998502964  1      < 0.1%
998502269  1      < 0.1%
998501980  2      < 0.1%
998501973  1      < 0.1%
998501932  1      < 0.1%
998501931  1      < 0.1%

Distinct: 3670
Distinct (%): 53.0%
Missing: 0
Missing (%): 0.0%
Memory size: 54.2 KiB