Overview

Dataset statistics

Number of variables: 9
Number of observations: 7151
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 537.9 KiB
Average record size in memory: 77.0 B
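The average record size appears to be the total in-memory size divided by the number of observations. A minimal sketch of that arithmetic, using only the figures printed above (the variable names are illustrative, and 1 KiB is taken as 1024 bytes):

```python
KIB = 1024  # bytes per kibibyte (assumed binary unit, as the "KiB" label suggests)

total_kib = 537.9  # "Total size in memory" from the report
n_rows = 7151      # "Number of observations" from the report

# Average bytes per record = total bytes / number of rows.
avg_record_bytes = total_kib * KIB / n_rows
print(f"{avg_record_bytes:.1f} B per record")
```

The result agrees with the reported 77.0 B to one decimal place.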

Variable types

Categorical: 2
Text: 4
Numeric: 3

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-12912/S/1/datasetView.do

Alerts

사용일자 (usage date) has constant value "" [Constant]
등록일자 (registration date) has constant value "" [Constant]
승차총승객수 (total boarding passengers) is highly overall correlated with 하차총승객수 (total alighting passengers) [High correlation]
하차총승객수 is highly overall correlated with 승차총승객수 [High correlation]
승차총승객수 has 391 (5.5%) zeros [Zeros]
하차총승객수 has 285 (4.0%) zeros [Zeros]
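The Zeros alerts give the number of zero entries and their share of all 7151 observations. The sketch below recomputes the percentages from the counts in the alerts; `zero_share` is a hypothetical helper written for this illustration, not part of ydata-profiling.

```python
def zero_share(zero_count, n_rows):
    """Return the percentage of rows that are zero, rounded to one decimal."""
    return round(100 * zero_count / n_rows, 1)

N_ROWS = 7151  # number of observations in the report

boarding_zero_pct = zero_share(391, N_ROWS)   # 승차총승객수
alighting_zero_pct = zero_share(285, N_ROWS)  # 하차총승객수
print(boarding_zero_pct, alighting_zero_pct)
```

Both values match the percentages shown in the alerts (5.5% and 4.0%).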

Reproduction

Analysis started: 2024-07-13 17:38:26.013945
Analysis finished: 2024-07-13 17:38:28.393570
Duration: 2.38 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
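A report of this kind can be regenerated roughly as follows. This is a sketch under assumptions, not the configuration actually used here: the CSV path, the output path, the report title, and the `cp949` encoding are all hypothetical, and `build_profile` is an illustrative helper name. It needs `pandas` and `ydata-profiling` installed.

```python
def build_profile(csv_path, html_path="report.html", encoding="cp949"):
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imported here so the function can be defined without the optional packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the downloaded file is a CSV in a Korean legacy encoding.
    df = pd.read_csv(csv_path, encoding=encoding)

    # Build the profile with default settings and save it as HTML.
    profile = ProfileReport(df, title="Dataset profile")
    profile.to_file(html_path)
    return profile
```

Calling `build_profile("bus_ridership.csv")` with a local copy of the data (the file name is made up) would write `report.html` next to the script.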

Variables

사용일자
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 56.0 KiB

Value | Count
20230901 | 7151

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20230901
2nd row: 20230901
3rd row: 20230901
4th row: 20230901
5th row: 20230901

Common Values

Value | Count | Frequency (%)
20230901 | 7151 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values bar plot]
Distinct: 89
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 56.0 KiB

Length

Max length: 6
Median length: 4
Mean length: 3.5864914
Min length: 3

Characters and Unicode

Total characters: 25647
Distinct characters: 20
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
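The mean length reported for this variable is consistent with the total character count divided by the number of observations. A small check of that relationship, using only figures from the report (the variable names are illustrative):

```python
total_characters = 25647  # "Total characters" for this variable
n_rows = 7151             # number of observations in the dataset

# Mean string length = total characters / number of values.
mean_length = total_characters / n_rows
print(f"{mean_length:.7f}")
```

The quotient agrees with the reported mean length of 3.5864914 to the printed precision.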

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 601
2nd row: 9409
3rd row: 9409
4th row: 9409
5th row: 9409
Value | Count | Frequency (%)
n26 | 259 | 3.6%
n37 | 214 | 3.0%
4318 | 177 | 2.5%
542 | 138 | 1.9%
9701 | 127 | 1.8%
661 | 125 | 1.7%
541 | 123 | 1.7%
441 | 123 | 1.7%
9403 | 121 | 1.7%
5623 | 120 | 1.7%
Other values (79) | 5624 | 78.6%

Most occurring characters

Value | Count | Frequency (%)
7 | 3823 | 14.9%
0 | 3212 | 12.5%
5 | 3167 | 12.3%
1 | 2939 | 11.5%
3 | 2588 | 10.1%
4 | 2488 | 9.7%
2 | 2429 | 9.5%
6 | 2334 | 9.1%
9 | 859 | 3.3%
8 | 822 | 3.2%
Other values (10) | 986 | 3.8%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 24661 | 96.2%
Uppercase Letter | 741 | 2.9%
Other Letter | 230 | 0.9%
Dash Punctuation | 15 | 0.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
7 | 3823 | 15.5%
0 | 3212 | 13.0%
5 | 3167 | 12.8%
1 | 2939 | 11.9%
3 | 2588 | 10.5%
4 | 2488 | 10.1%
2 | 2429 | 9.8%
6 | 2334 | 9.5%
9 | 859 | 3.5%
8 | 822 | 3.3%

Other Letter
Value | Count | Frequency (%)
 | 77 | 33.5%
 | 77 | 33.5%
 | 29 | 12.6%
 | 18 | 7.8%
 | 18 | 7.8%
 | 11 | 4.8%

Uppercase Letter
Value | Count | Frequency (%)
N | 523 | 70.6%
B | 147 | 19.8%
A | 71 | 9.6%

Dash Punctuation
Value | Count | Frequency (%)
- | 15 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 24676 | 96.2%
Latin | 741 | 2.9%
Hangul | 230 | 0.9%

Most frequent character per script

Common
Value | Count | Frequency (%)
7 | 3823 | 15.5%
0 | 3212 | 13.0%
5 | 3167 | 12.8%
1 | 2939 | 11.9%
3 | 2588 | 10.5%
4 | 2488 | 10.1%
2 | 2429 | 9.8%
6 | 2334 | 9.5%
9 | 859 | 3.5%
8 | 822 | 3.3%

Hangul
Value | Count | Frequency (%)
 | 77 | 33.5%
 | 77 | 33.5%
 | 29 | 12.6%
 | 18 | 7.8%
 | 18 | 7.8%
 | 11 | 4.8%

Latin
Value | Count | Frequency (%)
N | 523 | 70.6%
B | 147 | 19.8%
A | 71 | 9.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 25417 | 99.1%
Hangul | 230 | 0.9%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
7 | 3823 | 15.0%
0 | 3212 | 12.6%
5 | 3167 | 12.5%
1 | 2939 | 11.6%
3 | 2588 | 10.2%
4 | 2488 | 9.8%
2 | 2429 | 9.6%
6 | 2334 | 9.2%
9 | 859 | 3.4%
8 | 822 | 3.2%
Other values (4) | 756 | 3.0%

Hangul
Value | Count | Frequency (%)
 | 77 | 33.5%
 | 77 | 33.5%
 | 29 | 12.6%
 | 18 | 7.8%
 | 18 | 7.8%
 | 11 | 4.8%
Distinct: 93
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 56.0 KiB

Length

Max length: 32
Median length: 23
Mean length: 17.208642
Min length: 12

Characters and Unicode

Total characters: 123059
Distinct characters: 189
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 601번(개화동~종로4가)
2nd row: 9409번(구미동차고지~신사역)
3rd row: 9409번(구미동차고지~신사역)
4th row: 9409번(구미동차고지~신사역)
5th row: 9409번(구미동차고지~신사역)
Value | Count | Frequency (%)
542번(군포버스공영차고지~신사역 | 138 | 1.9%
n26번(중랑공영차고지~강서공영차고지 | 130 | 1.8%
n26번(강서공영차고지~중랑공영차고지 | 129 | 1.7%
9701번(가좌동~서울역 | 127 | 1.7%
661번(부천상동~영등포역,신세계백화점 | 125 | 1.7%
541번(군포공영차고지~강남역 | 123 | 1.7%
441번(월암공영차고지~신사사거리 | 123 | 1.7%
9403번(구미동차고지~중곡역 | 121 | 1.6%
5623번(군포 | 120 | 1.6%
공영차고지~여의도 | 120 | 1.6%
Other values (86) | 6171 | 83.1%

Most occurring characters

Value | Count | Frequency (%)
( | 7304 | 5.9%
) | 7304 | 5.9%
~ | 7151 | 5.8%
 | 6909 | 5.6%
 | 4865 | 4.0%
 | 4017 | 3.3%
 | 3938 | 3.2%
7 | 3823 | 3.1%
 | 3674 | 3.0%
0 | 3212 | 2.6%
Other values (179) | 70862 | 57.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 74666 | 60.7%
Decimal Number | 24819 | 20.2%
Open Punctuation | 7304 | 5.9%
Close Punctuation | 7304 | 5.9%
Math Symbol | 7151 | 5.8%
Uppercase Letter | 771 | 0.6%
Other Punctuation | 753 | 0.6%
Space Separator | 276 | 0.2%
Dash Punctuation | 15 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 6909 | 9.3%
 | 4865 | 6.5%
 | 4017 | 5.4%
 | 3938 | 5.3%
 | 3674 | 4.9%
 | 3062 | 4.1%
 | 3020 | 4.0%
 | 2711 | 3.6%
 | 1877 | 2.5%
 | 1827 | 2.4%
Other values (158) | 38766 | 51.9%

Decimal Number
Value | Count | Frequency (%)
7 | 3823 | 15.4%
0 | 3212 | 12.9%
5 | 3167 | 12.8%
1 | 3082 | 12.4%
3 | 2588 | 10.4%
4 | 2503 | 10.1%
2 | 2429 | 9.8%
6 | 2334 | 9.4%
9 | 859 | 3.5%
8 | 822 | 3.3%

Uppercase Letter
Value | Count | Frequency (%)
N | 523 | 67.8%
B | 147 | 19.1%
A | 86 | 11.2%
K | 15 | 1.9%

Other Punctuation
Value | Count | Frequency (%)
, | 411 | 54.6%
. | 342 | 45.4%

Open Punctuation
Value | Count | Frequency (%)
( | 7304 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 7304 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 7151 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 276 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 15 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 74666 | 60.7%
Common | 47622 | 38.7%
Latin | 771 | 0.6%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 6909 | 9.3%
 | 4865 | 6.5%
 | 4017 | 5.4%
 | 3938 | 5.3%
 | 3674 | 4.9%
 | 3062 | 4.1%
 | 3020 | 4.0%
 | 2711 | 3.6%
 | 1877 | 2.5%
 | 1827 | 2.4%
Other values (158) | 38766 | 51.9%

Common
Value | Count | Frequency (%)
( | 7304 | 15.3%
) | 7304 | 15.3%
~ | 7151 | 15.0%
7 | 3823 | 8.0%
0 | 3212 | 6.7%
5 | 3167 | 6.7%
1 | 3082 | 6.5%
3 | 2588 | 5.4%
4 | 2503 | 5.3%
2 | 2429 | 5.1%
Other values (7) | 5059 | 10.6%

Latin
Value | Count | Frequency (%)
N | 523 | 67.8%
B | 147 | 19.1%
A | 86 | 11.2%
K | 15 | 1.9%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 74666 | 60.7%
ASCII | 48393 | 39.3%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
( | 7304 | 15.1%
) | 7304 | 15.1%
~ | 7151 | 14.8%
7 | 3823 | 7.9%
0 | 3212 | 6.6%
5 | 3167 | 6.5%
1 | 3082 | 6.4%
3 | 2588 | 5.3%
4 | 2503 | 5.2%
2 | 2429 | 5.0%
Other values (11) | 5830 | 12.0%

Hangul
Value | Count | Frequency (%)
 | 6909 | 9.3%
 | 4865 | 6.5%
 | 4017 | 5.4%
 | 3938 | 5.3%
 | 3674 | 4.9%
 | 3062 | 4.1%
 | 3020 | 4.0%
 | 2711 | 3.6%
 | 1877 | 2.5%
 | 1827 | 2.4%
Other values (158) | 38766 | 51.9%

표준버스정류장ID
Real number (ℝ)

Distinct: 3729
Distinct (%): 52.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.5234727 × 10^8
Minimum: 1 × 10^8
Maximum: 9.998 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 63.0 KiB

Quantile statistics

Minimum: 1 × 10^8
5-th percentile: 1.0100027 × 10^8
Q1: 1.1200017 × 10^8
median: 1.2000016 × 10^8
Q3: 2.0900026 × 10^8
95-th percentile: 2.2200158 × 10^8
Maximum: 9.998 × 10^8
Range: 8.998 × 10^8
Interquartile range (IQR): 97000092
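Range and IQR follow directly from the other quantiles: range is maximum minus minimum, and IQR is Q3 minus Q1. A brief check using the rounded figures printed above (the variable names are illustrative):

```python
# Rounded quantile values as displayed in the report.
minimum, maximum = 1.0e8, 9.998e8
q1, q3 = 1.1200017e8, 2.0900026e8

value_range = maximum - minimum  # range = max - min
iqr = q3 - q1                    # interquartile range = Q3 - Q1
print(f"range={value_range:.4g}, IQR={iqr:.0f}")
```

Because the inputs are already rounded for display, the recomputed IQR can differ from the reported 97000092 in its last digits.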

Descriptive statistics

Standard deviation: 67733672
Coefficient of variation (CV): 0.4446005
Kurtosis: 72.059946
Mean: 1.5234727 × 10^8
Median Absolute Deviation (MAD): 10000104
Skewness: 6.14242
Sum: 1.0894353 × 10^12
Variance: 4.5878503 × 10^15
Monotonicity: Not monotonic
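Several of these statistics are tied together by simple identities: CV is the standard deviation divided by the mean, variance is the standard deviation squared, and the sum is the mean times the number of observations. A minimal sketch that checks the reported figures against each other (inputs are the rounded values from the report, so agreement is approximate):

```python
mean = 1.5234727e8  # reported mean
std = 67733672      # reported standard deviation
n_rows = 7151       # number of observations

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance is the square of the standard deviation
total = mean * n_rows  # sum of all values

print(f"CV={cv:.7f}, variance={variance:.7e}, sum={total:.7e}")
```

The three results line up with the reported CV, variance, and sum to the printed precision.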
[Figure: Histogram with fixed size bins (bins=50)]
Value | Count | Frequency (%)
100000380 | 11 | 0.2%
121000007 | 11 | 0.2%
117000003 | 11 | 0.2%
112000398 | 10 | 0.1%
121000008 | 10 | 0.1%
112000401 | 10 | 0.1%
121000010 | 10 | 0.1%
121000009 | 10 | 0.1%
117000004 | 10 | 0.1%
112000408 | 10 | 0.1%
Other values (3719) | 7048 | 98.6%

Value | Count | Frequency (%)
100000001 | 2 | < 0.1%
100000002 | 1 | < 0.1%
100000003 | 1 | < 0.1%
100000004 | 2 | < 0.1%
100000005 | 1 | < 0.1%
100000006 | 1 | < 0.1%
100000007 | 1 | < 0.1%
100000008 | 1 | < 0.1%
100000015 | 1 | < 0.1%
100000016 | 1 | < 0.1%

Value | Count | Frequency (%)
999800005 | 2 | < 0.1%
999800004 | 1 | < 0.1%
999800003 | 1 | < 0.1%
999033574 | 4 | 0.1%
998502944 | 1 | < 0.1%
998502907 | 1 | < 0.1%
998501980 | 2 | < 0.1%
998501977 | 1 | < 0.1%
998501975 | 1 | < 0.1%
998501932 | 1 | < 0.1%
Distinct: 3693
Distinct (%): 51.6%
Missing: 0
Missing (%): 0.0%
Memory size: 56.0 KiB