Overview

Dataset statistics

Number of variables: 9
Number of observations: 7688
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 578.2 KiB
Average record size in memory: 77.0 B

Variable types

Categorical: 2
Text: 4
Numeric: 3

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-12912/S/1/datasetView.do

Alerts

사용일자 has constant value "" (Constant)
등록일자 has constant value "" (Constant)
승차총승객수 is highly overall correlated with 하차총승객수 (High correlation)
하차총승객수 is highly overall correlated with 승차총승객수 (High correlation)
승차총승객수 has 507 (6.6%) zeros (Zeros)
하차총승객수 has 354 (4.6%) zeros (Zeros)
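
The zero percentages in these alerts can be rechecked from the counts and the number of observations given in the overview. The following is a minimal Python sketch; the helper names `pct` and `is_constant` are hypothetical and are not part of ydata-profiling, and the constant check only illustrates the idea behind the Constant alert.

```python
N_OBS = 7688  # "Number of observations" from the overview


def pct(count: int, total: int) -> float:
    """Share of `count` in `total`, as a percentage rounded to one decimal."""
    return round(100 * count / total, 1)


def is_constant(values) -> bool:
    """True when every entry is identical (an empty input returns False)."""
    return len(set(values)) == 1


# Zero counts as reported in the alerts above.
zero_counts = {"승차총승객수": 507, "하차총승객수": 354}

for column, zeros in zero_counts.items():
    print(f"{column}: {zeros} zeros = {pct(zeros, N_OBS)}%")
```

Both results agree with the reported 6.6% and 4.6%.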

Reproduction

Analysis started: 2024-07-13 17:37:27.401332
Analysis finished: 2024-07-13 17:37:29.708013
Duration: 2.31 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
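
A report of this kind is normally produced by loading the data with pandas and passing the DataFrame to ydata-profiling. The sketch below shows one way this might look. The function name `profile_csv`, the file paths, the report title, and the `cp949` encoding are all assumptions; the settings actually used for this report are in config.json and are not reproduced here.

```python
from pathlib import Path


def profile_csv(csv_path: str, html_path: str = "report.html") -> Path:
    """Profile a CSV file and write the HTML report to `html_path`."""
    # Imported inside the function so the sketch can be loaded even when
    # the optional dependencies are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the downloaded file is CP949-encoded; adjust as needed.
    df = pd.read_csv(csv_path, encoding="cp949")

    # Hypothetical title; default profiling settings are used otherwise.
    report = ProfileReport(df, title="Bus ridership profile")
    report.to_file(html_path)
    return Path(html_path)
```

Calling `profile_csv("bus_ridership.csv")` with a hypothetical local file name would write `report.html` to the working directory.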

Variables

사용일자
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 60.2 KiB

Value     Count
20230201  7688

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20230201
2nd row: 20230201
3rd row: 20230201
4th row: 20230201
5th row: 20230201

Common Values

Value     Count  Frequency (%)
20230201  7688   100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Distinct: 94
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 60.2 KiB

Length

Max length: 6
Median length: 4
Mean length: 3.6230489
Min length: 3

Characters and Unicode

Total characters: 27854
Distinct characters: 23
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 601
2nd row: 9703
3rd row: 9703
4th row: 9703
5th row: 9703

Value              Count  Frequency (%)
n26                244    3.2%
n15                229    3.0%
n37                201    2.6%
9408               144    1.9%
542                138    1.8%
9403               136    1.8%
9701               127    1.7%
661                125    1.6%
441                124    1.6%
541                123    1.6%
Other values (84)  6097   79.3%

Most occurring characters

Value              Count  Frequency (%)
7                  4101   14.7%
1                  3764   13.5%
5                  3475   12.5%
0                  3043   10.9%
2                  2754   9.9%
6                  2630   9.4%
3                  2482   8.9%
4                  2450   8.8%
9                  1007   3.6%
N                  674    2.4%
Other values (13)  1474   5.3%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    26356  94.6%
Uppercase Letter  893    3.2%
Other Letter      590    2.1%
Dash Punctuation  15     0.1%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
7      4101   15.6%
1      3764   14.3%
5      3475   13.2%
0      3043   11.5%
2      2754   10.4%
6      2630   10.0%
3      2482   9.4%
4      2450   9.3%
9      1007   3.8%
8      650    2.5%

Other Letter
Value      Count  Frequency (%)
[missing]  148    25.1%
[missing]  148    25.1%
[missing]  106    18.0%
[missing]  98     16.6%
[missing]  29     4.9%
[missing]  20     3.4%
[missing]  20     3.4%
[missing]  15     2.5%
[missing]  6      1.0%

Uppercase Letter
Value  Count  Frequency (%)
N      674    75.5%
B      148    16.6%
A      71     8.0%

Dash Punctuation
Value  Count  Frequency (%)
-      15     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  26371  94.7%
Latin   893    3.2%
Hangul  590    2.1%

Most frequent character per script

Common
Value  Count  Frequency (%)
7      4101   15.6%
1      3764   14.3%
5      3475   13.2%
0      3043   11.5%
2      2754   10.4%
6      2630   10.0%
3      2482   9.4%
4      2450   9.3%
9      1007   3.8%
8      650    2.5%

Hangul
Value      Count  Frequency (%)
[missing]  148    25.1%
[missing]  148    25.1%
[missing]  106    18.0%
[missing]  98     16.6%
[missing]  29     4.9%
[missing]  20     3.4%
[missing]  20     3.4%
[missing]  15     2.5%
[missing]  6      1.0%

Latin
Value  Count  Frequency (%)
N      674    75.5%
B      148    16.6%
A      71     8.0%

Most occurring blocks

Value   Count  Frequency (%)
ASCII   27264  97.9%
Hangul  590    2.1%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
7                 4101   15.0%
1                 3764   13.8%
5                 3475   12.7%
0                 3043   11.2%
2                 2754   10.1%
6                 2630   9.6%
3                 2482   9.1%
4                 2450   9.0%
9                 1007   3.7%
N                 674    2.5%
Other values (4)  884    3.2%

Hangul
Value      Count  Frequency (%)
[missing]  148    25.1%
[missing]  148    25.1%
[missing]  106    18.0%
[missing]  98     16.6%
[missing]  29     4.9%
[missing]  20     3.4%
[missing]  20     3.4%
[missing]  15     2.5%
[missing]  6      1.0%

Distinct: 99
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 60.2 KiB

Length

Max length: 28
Median length: 21
Mean length: 17.375
Min length: 12

Characters and Unicode

Total characters: 133579
Distinct characters: 184
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
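
As an illustration of how such character properties can be read, the sketch below counts Unicode general categories for one sample value from this variable. It uses Python's standard `unicodedata` module, which is an assumption made for illustration and not necessarily what the profiling tool uses internally. The function name `category_counts` is hypothetical, and the standard module does not expose scripts or blocks.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters in `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


# Category codes mapped to the labels used in this report.
LABELS = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Lu": "Uppercase Letter",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Sm": "Math Symbol",
    "Zs": "Space Separator",
}

sample = "601번(개화동~종로4가)"  # 1st sample row of this variable
counts = category_counts(sample)

for code, n in counts.most_common():
    print(f"{LABELS.get(code, code)}: {n}")
```

For this sample the Hangul syllables fall under Other Letter, the digits under Decimal Number, and the tilde under Math Symbol, which matches the category names listed for this variable.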

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 601번(개화동~종로4가)
2nd row: 9703번(신성교통차고지~서울역)
3rd row: 9703번(신성교통차고지~서울역)
4th row: 9703번(신성교통차고지~서울역)
5th row: 9703번(신성교통차고지~서울역)

Value                                  Count  Frequency (%)
9408번(성남                            144    1.8%
분당~영등포                            144    1.8%
n15번(우이동성원아파트~남태령역        142    1.8%
542번(군포버스공영차고지~신사역        138    1.7%
9403번(성남분당~을지로5가              136    1.7%
9701번(가좌동~서울역                   127    1.6%
661번(부천상동~영등포역,신세계백화점   125    1.5%
441번(월암공영차고지~신사사거리        124    1.5%
n26번(중랑공영차고지~강서공영차고지    123    1.5%
541번(군포공영차고지~강남역            123    1.5%
Other values (93)                      6782   83.6%

Most occurring characters

Value               Count  Frequency (%)
)                   7772   5.8%
(                   7772   5.8%
~                   7688   5.8%
[missing]           7294   5.5%
[missing]           4557   3.4%
[missing]           4546   3.4%
[missing]           4251   3.2%
7                   4107   3.1%
1                   4055   3.0%
[missing]           4041   3.0%
Other values (174)  77496  58.0%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       81672  61.1%
Decimal Number     26799  20.1%
Close Punctuation  7772   5.8%
Open Punctuation   7772   5.8%
Math Symbol        7688   5.8%
Uppercase Letter   893    0.7%
Other Punctuation  548    0.4%
Space Separator    420    0.3%
Dash Punctuation   15     < 0.1%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
[missing]           7294   8.9%
[missing]           4557   5.6%
[missing]           4546   5.6%
[missing]           4251   5.2%
[missing]           4041   4.9%
[missing]           3318   4.1%
[missing]           3140   3.8%
[missing]           2857   3.5%
[missing]           1571   1.9%
[missing]           1523   1.9%
Other values (154)  44574  54.6%

Decimal Number
Value  Count  Frequency (%)
7      4107   15.3%
1      4055   15.1%
5      3611   13.5%
0      3043   11.4%
2      2754   10.3%
6      2630   9.8%
3      2482   9.3%
4      2460   9.2%
9      1007   3.8%
8      650    2.4%

Uppercase Letter
Value  Count  Frequency (%)
N      674    75.5%
B      148    16.6%
A      71     8.0%

Other Punctuation
Value  Count  Frequency (%)
,      411    75.0%
.      137    25.0%

Close Punctuation
Value  Count  Frequency (%)
)      7772   100.0%

Open Punctuation
Value  Count  Frequency (%)
(      7772   100.0%

Math Symbol
Value  Count  Frequency (%)
~      7688   100.0%

Space Separator
Value    Count  Frequency (%)
(space)  420    100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      15     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  81672  61.1%
Common  51014  38.2%
Latin   893    0.7%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
[missing]           7294   8.9%
[missing]           4557   5.6%
[missing]           4546   5.6%
[missing]           4251   5.2%
[missing]           4041   4.9%
[missing]           3318   4.1%
[missing]           3140   3.8%
[missing]           2857   3.5%
[missing]           1571   1.9%
[missing]           1523   1.9%
Other values (154)  44574  54.6%

Common
Value             Count  Frequency (%)
)                 7772   15.2%
(                 7772   15.2%
~                 7688   15.1%
7                 4107   8.1%
1                 4055   7.9%
5                 3611   7.1%
0                 3043   6.0%
2                 2754   5.4%
6                 2630   5.2%
3                 2482   4.9%
Other values (7)  5100   10.0%

Latin
Value  Count  Frequency (%)
N      674    75.5%
B      148    16.6%
A      71     8.0%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  81672  61.1%
ASCII   51907  38.9%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
)                  7772   15.0%
(                  7772   15.0%
~                  7688   14.8%
7                  4107   7.9%
1                  4055   7.8%
5                  3611   7.0%
0                  3043   5.9%
2                  2754   5.3%
6                  2630   5.1%
3                  2482   4.8%
Other values (10)  5993   11.5%

Hangul
Value               Count  Frequency (%)
[missing]           7294   8.9%
[missing]           4557   5.6%
[missing]           4546   5.6%
[missing]           4251   5.2%
[missing]           4041   4.9%
[missing]           3318   4.1%
[missing]           3140   3.8%
[missing]           2857   3.5%
[missing]           1571   1.9%
[missing]           1523   1.9%
Other values (154)  44574  54.6%

표준버스정류장ID
Real number (ℝ)

Distinct: 4014
Distinct (%): 52.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.4787591 × 10^8
Minimum: 1 × 10^8
Maximum: 9.998 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 67.7 KiB

Quantile statistics

Minimum: 1 × 10^8
5-th percentile: 1.0100012 × 10^8
Q1: 1.1100093 × 10^8
median: 1.19 × 10^8
Q3: 2.0800012 × 10^8
95-th percentile: 2.2200062 × 10^8
Maximum: 9.998 × 10^8
Range: 8.998 × 10^8
Interquartile range (IQR): 96999187
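
The range and the interquartile range follow directly from the quantiles listed above. The sketch below recomputes them in Python from the values as printed in this report. Because those values are rounded to eight significant digits, the recomputed IQR can differ from the reported one by a few units.

```python
# Quantiles as printed in the report (rounded for display).
MINIMUM = 100_000_000   # 1 × 10^8
Q1 = 111_000_930        # 1.1100093 × 10^8
Q3 = 208_000_120        # 2.0800012 × 10^8
MAXIMUM = 999_800_000   # 9.998 × 10^8

value_range = MAXIMUM - MINIMUM  # spread between the extreme values
iqr = Q3 - Q1                    # spread of the middle 50% of values

print(f"Range: {value_range}")
print(f"IQR:   {iqr}")
```

The range comes out as 899800000, that is 8.998 × 10^8, and the IQR as 96999190, within a few units of the reported 96999187.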

Descriptive statistics

Standard deviation: 64667584
Coefficient of variation (CV): 0.43730979
Kurtosis: 74.93714
Mean: 1.4787591 × 10^8
Median Absolute Deviation (MAD): 8999989
Skewness: 6.1489039
Sum: 1.13687 × 10^12
Variance: 4.1818964 × 10^15
Monotonicity: Not monotonic
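
Several of these figures are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the square of the standard deviation, and the sum is the mean times the number of observations. A short Python check using the rounded values printed in this report (so agreement is only approximate in the last digits):

```python
import math

N_OBS = 7688          # number of observations, from the overview
MEAN = 1.4787591e8    # mean as printed (rounded)
STD = 64_667_584.0    # standard deviation as printed

cv = STD / MEAN          # coefficient of variation
variance = STD ** 2      # variance is the squared standard deviation
total = MEAN * N_OBS     # sum of all values

print(f"CV:       {cv:.6f}")
print(f"Variance: {variance:.6e}")
print(f"Sum:      {total:.5e}")
```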
[Figure: Histogram with fixed size bins (bins=50)]

Value                Count  Frequency (%)
100000384            12     0.2%
121000005            12     0.2%
121000007            12     0.2%
121000008            11     0.1%
100000380            11     0.1%
121000009            11     0.1%
121000006            11     0.1%
112000401            10     0.1%
121000003            10     0.1%
112000408            10     0.1%
Other values (4004)  7578   98.6%

Minimum 10 values

Value      Count  Frequency (%)
100000001  3      < 0.1%
100000002  3      < 0.1%
100000003  2      < 0.1%
100000004  3      < 0.1%
100000005  3      < 0.1%
100000006  1      < 0.1%
100000007  1      < 0.1%
100000008  1      < 0.1%
100000015  1      < 0.1%
100000016  1      < 0.1%

Maximum 10 values

Value      Count  Frequency (%)
999800005  2      < 0.1%
999800004  1      < 0.1%
999800003  1      < 0.1%
999033574  4      0.1%
998502944  1      < 0.1%
998502907  1      < 0.1%
998502062  1      < 0.1%
998501980  1      < 0.1%
998501973  1      < 0.1%
998501932  1      < 0.1%

Distinct: 3973
Distinct (%): 51.7%
Missing: 0
Missing (%): 0.0%
Memory size: 60.2 KiB