Overview

Dataset statistics

Number of variables: 9
Number of observations: 6931
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 521.3 KiB
Average record size in memory: 77.0 B
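
The average record size is consistent with the total memory size divided by the number of observations. A quick check, assuming 1 KiB = 1024 bytes (the variable names are illustrative, not taken from the profiling tool):

```python
# Figures copied from the dataset statistics above.
total_size_kib = 521.3
n_observations = 6931

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Average record size: {avg_record_bytes:.1f} B")
```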

Variable types

Categorical: 2
Text: 4
Numeric: 3

Dataset

Description: File download (파일 다운로드)
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-12912/S/1/datasetView.do

Alerts

사용일자 (usage date) has constant value "" [Constant]
등록일자 (registration date) has constant value "" [Constant]
승차총승객수 (total boarding passengers) is highly overall correlated with 하차총승객수 (total alighting passengers) [High correlation]
하차총승객수 is highly overall correlated with 승차총승객수 [High correlation]
승차총승객수 has 477 (6.9%) zeros [Zeros]
하차총승객수 has 302 (4.4%) zeros [Zeros]
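
The zero percentages in the alerts can be reproduced from the zero counts and the 6931 observations reported in the overview. The sketch below only redoes that arithmetic; it does not reproduce how the profiling tool decides when to raise an alert.

```python
# Zero counts per column, as reported in the alerts.
n_observations = 6931
zero_counts = {"승차총승객수": 477, "하차총승객수": 302}

# Share of rows that are exactly zero, in percent, rounded to one decimal.
zero_share = {
    name: round(100 * count / n_observations, 1)
    for name, count in zero_counts.items()
}

for name, share in zero_share.items():
    print(f"{name}: {share}% zeros")
```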

Reproduction

Analysis started: 2024-07-13 17:37:51.677742
Analysis finished: 2024-07-13 17:37:53.487997
Duration: 1.81 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
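
The reported duration can be cross-checked against the two timestamps. A minimal sketch using only the standard library, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2024-07-13 17:37:51.677742")
finished = datetime.fromisoformat("2024-07-13 17:37:53.487997")

# Elapsed wall-clock time in seconds.
duration_seconds = (finished - started).total_seconds()

print(f"Duration: {duration_seconds:.2f} seconds")
```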

Variables

사용일자 (usage date)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 54.3 KiB

Value                Count
20230501              6931

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20230501
2nd row: 20230501
3rd row: 20230501
4th row: 20230501
5th row: 20230501

Common Values

Value                Count   Frequency (%)
20230501              6931   100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value                Count   Frequency (%)
20230501              6931   100.0%
Distinct: 84
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 54.3 KiB

Length

Max length: 6
Median length: 4
Mean length: 3.6013562
Min length: 3

Characters and Unicode

Total characters: 24961
Distinct characters: 17
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
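
As a rough illustration of how such per-character properties can be tallied, the sketch below uses Python's standard `unicodedata` module to count general categories across a few sample values that appear in this report. `CATEGORY_NAMES` and `category_counts` are illustrative names, not part of ydata-profiling, and this is not necessarily how the tool computes its tables. Scripts and blocks are left out because the standard library does not expose them.

```python
import unicodedata
from collections import Counter

# Two-letter general category codes mapped to the long names used in the report.
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Lu": "Uppercase Letter",
    "Lo": "Other Letter",
    "Pd": "Dash Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "Sm": "Math Symbol",
    "Zs": "Space Separator",
}


def category_counts(values):
    """Count characters per Unicode general category over all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the raw code for categories not listed above.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# Sample values copied from the report's sample rows.
samples = ["100", "9408", "100번(하계동~용산구청)"]
counts = category_counts(samples)

for name, count in counts.most_common():
    print(f"{name}: {count}")
```

Hangul syllables fall under "Other Letter", which is why that category dominates the route-name text while the route-number text is almost entirely "Decimal Number".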

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 100
2nd row: 9408
3rd row: 9408
4th row: 9408
5th row: 9408
Value                Count   Frequency (%)
n26                    241   3.5%
n15                    205   3.0%
n37                    195   2.8%
542                    138   2.0%
9701                   127   1.8%
661                    125   1.8%
441                    124   1.8%
541                    123   1.8%
9403                   121   1.7%
5623                   120   1.7%
Other values (74)     5412   78.1%

Most occurring characters

Value                Count   Frequency (%)
7                     3812   15.3%
1                     3324   13.3%
5                     3134   12.6%
0                     2719   10.9%
6                     2527   10.1%
2                     2498   10.0%
4                     2310   9.3%
3                     2236   9.0%
9                      969   3.9%
N                      641   2.6%
Other values (7)       791   3.2%

Most occurring categories

Value                Count   Frequency (%)
Decimal Number       23960   96.0%
Uppercase Letter       792   3.2%
Other Letter           194   0.8%
Dash Punctuation        15   0.1%

Most frequent character per category

Decimal Number
Value                Count   Frequency (%)
7                     3812   15.9%
1                     3324   13.9%
5                     3134   13.1%
0                     2719   11.3%
6                     2527   10.5%
2                     2498   10.4%
4                     2310   9.6%
3                     2236   9.3%
9                      969   4.0%
8                      431   1.8%

Uppercase Letter
Value                Count   Frequency (%)
N                      641   80.9%
B                      148   18.7%
A                        3   0.4%

Other Letter
Value                Count   Frequency (%)
                        97   50.0%
                        77   39.7%
                        20   10.3%

Dash Punctuation
Value                Count   Frequency (%)
-                       15   100.0%

Most occurring scripts

Value                Count   Frequency (%)
Common               23975   96.0%
Latin                  792   3.2%
Hangul                 194   0.8%

Most frequent character per script

Common
Value                Count   Frequency (%)
7                     3812   15.9%
1                     3324   13.9%
5                     3134   13.1%
0                     2719   11.3%
6                     2527   10.5%
2                     2498   10.4%
4                     2310   9.6%
3                     2236   9.3%
9                      969   4.0%
8                      431   1.8%

Latin
Value                Count   Frequency (%)
N                      641   80.9%
B                      148   18.7%
A                        3   0.4%

Hangul
Value                Count   Frequency (%)
                        97   50.0%
                        77   39.7%
                        20   10.3%

Most occurring blocks

Value                Count   Frequency (%)
ASCII                24767   99.2%
Hangul                 194   0.8%

Most frequent character per block

ASCII
Value                Count   Frequency (%)
7                     3812   15.4%
1                     3324   13.4%
5                     3134   12.7%
0                     2719   11.0%
6                     2527   10.2%
2                     2498   10.1%
4                     2310   9.3%
3                     2236   9.0%
9                      969   3.9%
N                      641   2.6%
Other values (4)       597   2.4%

Hangul
Value                Count   Frequency (%)
                        97   50.0%
                        77   39.7%
                        20   10.3%
Distinct: 87
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 54.3 KiB

Length

Max length: 32
Median length: 21
Mean length: 17.36041
Min length: 12

Characters and Unicode

Total characters: 120325
Distinct characters: 176
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 100번(하계동~용산구청)
2nd row: 9408번(구미동차고지~고속터미널)
3rd row: 9408번(구미동차고지~고속터미널)
4th row: 9408번(구미동차고지~고속터미널)
5th row: 9408번(구미동차고지~고속터미널)
Value                                  Count   Frequency (%)
n15번(우이동성원아파트~남태령역            140   1.9%
542번(군포버스공영차고지~신사역            138   1.9%
9701번(가좌동~서울역                       127   1.8%
661번(부천상동~영등포역,신세계백화점       125   1.7%
441번(월암공영차고지~신사사거리            124   1.7%
541번(군포공영차고지~강남역                123   1.7%
n26번(강서공영차고지~중랑공영차고지        121   1.7%
9403번(구미동차고지~중곡역                 121   1.7%
공영차고지~여의도                          120   1.7%
n26번(중랑공영차고지~강서공영차고지        120   1.7%
Other values (80)                       5947   82.5%

Most occurring characters

Value                Count   Frequency (%)
(                     6998   5.8%
)                     6998   5.8%
~                     6931   5.8%
                      6653   5.5%
                      4591   3.8%
                      4435   3.7%
                      4346   3.6%
                      4125   3.4%
7                     3812   3.2%
1                     3396   2.8%
Other values (166)   68040   56.5%

Most occurring categories

Value                Count   Frequency (%)
Other Letter         73823   61.4%
Decimal Number       24032   20.0%
Open Punctuation      6998   5.8%
Close Punctuation     6998   5.8%
Math Symbol           6931   5.8%
Uppercase Letter       822   0.7%
Other Punctuation      431   0.4%
Space Separator        275   0.2%
Dash Punctuation        15   < 0.1%

Most frequent character per category

Other Letter
Value                Count   Frequency (%)
                      6653   9.0%
                      4591   6.2%
                      4435   6.0%
                      4346   5.9%
                      4125   5.6%
                      3017   4.1%
                      2869   3.9%
                      2725   3.7%
                      1440   2.0%
                      1288   1.7%
Other values (145)   38334   51.9%

Decimal Number
Value                Count   Frequency (%)
7                     3812   15.9%
1                     3396   14.1%
5                     3134   13.0%
0                     2719   11.3%
6                     2527   10.5%
2                     2498   10.4%
4                     2310   9.6%
3                     2236   9.3%
9                      969   4.0%
8                      431   1.8%

Uppercase Letter
Value                Count   Frequency (%)
N                      641   78.0%
B                      148   18.0%
A                       18   2.2%
K                       15   1.8%

Other Punctuation
Value                Count   Frequency (%)
,                      300   69.6%
.                      131   30.4%

Open Punctuation
Value                Count   Frequency (%)
(                     6998   100.0%

Close Punctuation
Value                Count   Frequency (%)
)                     6998   100.0%

Math Symbol
Value                Count   Frequency (%)
~                     6931   100.0%

Space Separator
Value                Count   Frequency (%)
                       275   100.0%

Dash Punctuation
Value                Count   Frequency (%)
-                       15   100.0%

Most occurring scripts

Value                Count   Frequency (%)
Hangul               73823   61.4%
Common               45680   38.0%
Latin                  822   0.7%

Most frequent character per script

Hangul
Value                Count   Frequency (%)
                      6653   9.0%
                      4591   6.2%
                      4435   6.0%
                      4346   5.9%
                      4125   5.6%
                      3017   4.1%
                      2869   3.9%
                      2725   3.7%
                      1440   2.0%
                      1288   1.7%
Other values (145)   38334   51.9%

Common
Value                Count   Frequency (%)
(                     6998   15.3%
)                     6998   15.3%
~                     6931   15.2%
7                     3812   8.3%
1                     3396   7.4%
5                     3134   6.9%
0                     2719   6.0%
6                     2527   5.5%
2                     2498   5.5%
4                     2310   5.1%
Other values (7)      4357   9.5%

Latin
Value                Count   Frequency (%)
N                      641   78.0%
B                      148   18.0%
A                       18   2.2%
K                       15   1.8%

Most occurring blocks

Value                Count   Frequency (%)
Hangul               73823   61.4%
ASCII                46502   38.6%

Most frequent character per block

ASCII
Value                Count   Frequency (%)
(                     6998   15.0%
)                     6998   15.0%
~                     6931   14.9%
7                     3812   8.2%
1                     3396   7.3%
5                     3134   6.7%
0                     2719   5.8%
6                     2527   5.4%
2                     2498   5.4%
4                     2310   5.0%
Other values (11)     5179   11.1%

Hangul
Value                Count   Frequency (%)
                      6653   9.0%
                      4591   6.2%
                      4435   6.0%
                      4346   5.9%
                      4125   5.6%
                      3017   4.1%
                      2869   3.9%
                      2725   3.7%
                      1440   2.0%
                      1288   1.7%
Other values (145)   38334   51.9%

표준버스정류장ID (standard bus stop ID)
Real number (ℝ)

Distinct: 3738
Distinct (%): 53.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.5023111 × 10^8
Minimum: 1 × 10^8
Maximum: 9.998 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 61.0 KiB

Quantile statistics

Minimum: 1 × 10^8
5th percentile: 1.0100027 × 10^8
Q1: 1.1200001 × 10^8
Median: 1.190003 × 10^8
Q3: 2.0900006 × 10^8
95th percentile: 2.2200157 × 10^8
Maximum: 9.998 × 10^8
Range: 8.998 × 10^8
Interquartile range (IQR): 97000052

Descriptive statistics

Standard deviation: 65901581
Coefficient of variation (CV): 0.438668
Kurtosis: 72.080571
Mean: 1.5023111 × 10^8
Median Absolute Deviation (MAD): 12000133
Skewness: 6.0197208
Sum: 1.0412518 × 10^12
Variance: 4.3430184 × 10^15
Monotonicity: Not monotonic
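
Several of these figures are related by simple identities (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 − Q1, range = maximum − minimum, sum = mean × count). The sketch below recomputes them from the values printed in this report as a consistency check. Because the inputs are the rounded display values rather than the raw data, small differences are expected; for example, the recomputed IQR is 97000050 against the reported 97000052.

```python
# Figures copied from the report for 표준버스정류장ID (rounded as displayed).
n_observations = 6931
mean = 1.5023111e8
std = 65901581
q1, q3 = 1.1200001e8, 2.0900006e8
minimum, maximum = 1e8, 9.998e8

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # spread between the extremes
total = mean * n_observations    # sum of all values

print(f"CV       = {cv:.6f}")
print(f"Variance = {variance:.7e}")
print(f"IQR      = {iqr:.0f}")
print(f"Range    = {value_range:.4e}")
print(f"Sum      = {total:.7e}")
```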
[Figure: Histogram with fixed size bins (bins=50)]
Value                Count   Frequency (%)
121000007               12   0.2%
121000006               11   0.2%
121000009               11   0.2%
121000008               11   0.2%
121000005               11   0.2%
121000014               10   0.1%
112000005               10   0.1%
111000004               10   0.1%
121000003               10   0.1%
111000006               10   0.1%
Other values (3728)   6825   98.5%
Value                Count   Frequency (%)
100000001                3   < 0.1%
100000002                2   < 0.1%
100000003                2   < 0.1%
100000004                3   < 0.1%
100000005                1   < 0.1%
100000006                1   < 0.1%
100000007                1   < 0.1%
100000008                1   < 0.1%
100000015                1   < 0.1%
100000016                1   < 0.1%
Value                Count   Frequency (%)
999800005                2   < 0.1%
999800004                1   < 0.1%
999033574                4   0.1%
998502062                1   < 0.1%
998501980                1   < 0.1%
998501976                1   < 0.1%
998501973                1   < 0.1%
998501932                1   < 0.1%
998501931                1   < 0.1%
998001700                1   < 0.1%
Distinct: 3699
Distinct (%): 53.4%
Missing: 0
Missing (%): 0.0%
Memory size: 54.3 KiB