Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 10
Duplicate rows (%): 0.1%
Total size in memory: 507.8 KiB
Average record size in memory: 52.0 B
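The percentage and per-record figures above follow from the raw counts. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes and that the average record size is the total memory divided by the number of observations (the variable names are illustrative and not part of the report):

```python
# Values copied from the dataset statistics in this report.
n_observations = 10_000
n_duplicate_rows = 10
total_memory_kib = 507.8

# Assumption: 1 KiB = 1024 bytes.
total_memory_bytes = total_memory_kib * 1024

# Share of duplicate rows and average memory per record.
duplicate_pct = 100 * n_duplicate_rows / n_observations
avg_record_bytes = total_memory_bytes / n_observations

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```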

Variable types

Categorical: 2
Numeric: 3

Dataset

Description: Sample
Author: 오아시스비즈니스 (Oasis Business)
URL: https://www.bigdata-realestate.kr/rebpp/usr/prd/prdInfoDetail.do?req_productId=72

Alerts

data_strd_ym has constant value "202306" (Constant)
Dataset has 10 (0.1%) duplicate rows (Duplicates)
pnu is highly overall correlated with legaldong_cd (High correlation)
legaldong_cd is highly overall correlated with pnu (High correlation)
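One plausible reason for the perfect correlation between pnu and legaldong_cd: in the sample rows of this report, legaldong_cd equals the first eight digits of pnu. A minimal sketch that checks this on five rows copied from the Sample section. `leading_digits` is a hypothetical helper, and the pattern is verified only for the rows listed here, not for the full dataset:

```python
# (pnu, legaldong_cd) pairs copied from the Sample section of this report.
SAMPLE_ROWS = [
    (1129013800101800013, 11290138),
    (1121510300102430002, 11215103),
    (1141011200100900105, 11410112),
    (1138010700100890001, 11380107),
    (1111017200100160001, 11110172),
]


def leading_digits(value: int, n_digits: int) -> int:
    """Return the first n_digits decimal digits of a positive integer."""
    return int(str(value)[:n_digits])


# Compare the 8-digit prefix of each pnu with the reported legaldong_cd.
matches = [leading_digits(pnu, 8) == code for pnu, code in SAMPLE_ROWS]
print(all(matches))
```

If the same relationship holds for every row, one of the two columns carries no extra information for correlation purposes, which would explain the alert.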

Reproduction

Analysis started: 2023-12-11 22:33:25.159209
Analysis finished: 2023-12-11 22:33:27.113754
Duration: 1.95 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

data_strd_ym
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value   Count
202306  10000

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202306
2nd row: 202306
3rd row: 202306
4th row: 202306
5th row: 202306

Common Values

Value   Count  Frequency (%)
202306  10000  100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count  Frequency (%)
202306  10000  100.0%

pnu
Real number (ℝ)

HIGH CORRELATION

Distinct: 9081
Distinct (%): 90.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.1275431 × 10^18
Minimum: 1.1110101 × 10^18
Maximum: 1.1440124 × 10^18
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1.1110101 × 10^18
5-th percentile: 1.1110164 × 10^18
Q1: 1.1200115 × 10^18
Median: 1.1260105 × 10^18
Q3: 1.1350106 × 10^18
95-th percentile: 1.144012 × 10^18
Maximum: 1.1440124 × 10^18
Range: 3.30023 × 10^16
Interquartile range (IQR): 1.49991 × 10^16

Descriptive statistics

Standard deviation: 1.0156322 × 10^16
Coefficient of variation (CV): 0.0090074807
Kurtosis: -1.0673278
Mean: 1.1275431 × 10^18
Median Absolute Deviation (MAD): 8.9974 × 10^15
Skewness: 0.09893963
Sum: 4.4700513 × 10^18
Variance: 1.0315088 × 10^32
Monotonicity: Not monotonic
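The Sum above looks inconsistent with the Mean: 10000 observations with a mean near 1.1275 × 10^18 should total about 1.1275 × 10^22, yet the report shows about 4.47 × 10^18. That gap is what a wrapped signed 64-bit accumulator would produce. A minimal sketch of the effect in plain Python, assuming the column was stored as 64-bit integers (the report does not state the dtype) and using the rounded mean as a stand-in for every row. `wrap_int64` is a hypothetical helper, not part of ydata-profiling:

```python
def wrap_int64(value: int) -> int:
    """Reduce an arbitrary Python int to the signed 64-bit range (two's complement)."""
    return (value + 2**63) % 2**64 - 2**63


# Stand-in data: the rounded mean from this report, repeated for every observation.
mean_pnu = 1_127_543_100_000_000_000
n_observations = 10_000

exact_sum = mean_pnu * n_observations   # Python ints have arbitrary precision
wrapped_sum = wrap_int64(exact_sum)     # value a 64-bit accumulator would hold

print(f"exact:   {exact_sum:.4e}")
print(f"wrapped: {wrapped_sum:.4e}")
```

Because the stand-in uses a rounded mean, the wrapped value only approximates the reported Sum. If the overflow explanation is right, the Sum for pnu should be treated as unreliable, and identifier-like columns such as pnu are arguably better profiled as text.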
Histogram with fixed size bins (bins=50)

Common Values

Value                Count  Frequency (%)
1132010600107180007  6      0.1%
1132010600107160005  6      0.1%
1144010800100120041  6      0.1%
1129013400111710000  5      0.1%
1129013800103160003  5      0.1%
1135010500106510002  4      < 0.1%
1120011500102750028  4      < 0.1%
1141011700101310001  4      < 0.1%
1126010600107970000  4      < 0.1%
1126010500104790000  4      < 0.1%
Other values (9071)  9952   99.5%

Minimum 10 values

Value                Count  Frequency (%)
1111010100100940001  1      < 0.1%
1111010200100700010  1      < 0.1%
1111010400100510001  1      < 0.1%
1111010400100520004  1      < 0.1%
1111010500100170001  1      < 0.1%
1111010500100980003  1      < 0.1%
1111010500101350000  1      < 0.1%
1111010500101360000  1      < 0.1%
1111010500101400000  1      < 0.1%
1111010500101530001  1      < 0.1%

Maximum 10 values

Value                Count  Frequency (%)
1144012400102600012  2      < 0.1%
1144012400102590010  1      < 0.1%
1144012400102580003  1      < 0.1%
1144012400102570022  1      < 0.1%
1144012400102570019  1      < 0.1%
1144012400102570006  1      < 0.1%
1144012400102550020  1      < 0.1%
1144012400102550019  1      < 0.1%
1144012400102550017  1      < 0.1%
1144012400102540008  1      < 0.1%

legaldong_cd
Real number (ℝ)

HIGH CORRELATION

Distinct: 317
Distinct (%): 3.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11275431
Minimum: 11110101
Maximum: 11440124
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 11110101
5-th percentile: 11110164
Q1: 11200115
Median: 11260105
Q3: 11350106
95-th percentile: 11440120
Maximum: 11440124
Range: 330023
Interquartile range (IQR): 149991

Descriptive statistics

Standard deviation: 101563.22
Coefficient of variation (CV): 0.0090074807
Kurtosis: -1.0673278
Mean: 11275431
Median Absolute Deviation (MAD): 89974
Skewness: 0.09893963
Sum: 1.1275431 × 10^11
Variance: 1.0315088 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value               Count  Frequency (%)
11260101            296    3.0%
11305103            285    2.9%
11440120            277    2.8%
11350105            262    2.6%
11305101            240    2.4%
11215101            217    2.2%
11215105            208    2.1%
11230106            199    2.0%
11215103            184    1.8%
11320107            183    1.8%
Other values (307)  7649   76.5%

Minimum 10 values

Value     Count  Frequency (%)
11110101  1      < 0.1%
11110102  1      < 0.1%
11110104  2      < 0.1%
11110105  6      0.1%
11110106  2      < 0.1%
11110107  10     0.1%
11110108  17     0.2%
11110109  3      < 0.1%
11110110  9      0.1%
11110111  7      0.1%

Maximum 10 values

Value     Count  Frequency (%)
11440124  58     0.6%
11440123  153    1.5%
11440122  85     0.9%
11440121  103    1.0%
11440120  277    2.8%
11440118  3      < 0.1%
11440116  1      < 0.1%
11440115  43     0.4%
11440114  26     0.3%
11440113  12     0.1%

induty_cd
Categorical

Distinct: 41
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value              Count
A01                1707
A03                1407
C01                762
B02                730
C05                522
Other values (36)  4872

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: C05
2nd row: B11
3rd row: B01
4th row: A14
5th row: C01

Common Values

Value              Count  Frequency (%)
A01                1707   17.1%
A03                1407   14.1%
C01                762    7.6%
B02                730    7.3%
C05                522    5.2%
B01                399    4.0%
C06                373    3.7%
C03                368    3.7%
B05                330    3.3%
C07                316    3.2%
Other values (31)  3086   30.9%

Length

Histogram of lengths of the category

Value              Count  Frequency (%)
a01                1707   17.1%
a03                1407   14.1%
c01                762    7.6%
b02                730    7.3%
c05                522    5.2%
b01                399    4.0%
c06                373    3.7%
c03                368    3.7%
b05                330    3.3%
c07                316    3.2%
Other values (31)  3086   30.9%

snp_price_scor
Real number (ℝ)

Distinct: 3058
Distinct (%): 30.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 66.487985
Minimum: 35.48
Maximum: 94.81
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 35.48
5-th percentile: 54.11
Q1: 61.4175
Median: 66.62
Q3: 71.61
95-th percentile: 78.43
Maximum: 94.81
Range: 59.33
Interquartile range (IQR): 10.1925

Descriptive statistics

Standard deviation: 7.6268802
Coefficient of variation (CV): 0.11471065
Kurtosis: 0.36987153
Mean: 66.487985
Median Absolute Deviation (MAD): 5.1
Skewness: -0.011733886
Sum: 664879.85
Variance: 58.169301
Monotonicity: Not monotonic
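Several of the snp_price_scor statistics are derived from others in the same report: the range from the extremes, the IQR from the quartiles, the CV from the standard deviation and mean, and the variance from the standard deviation. A minimal sketch that recomputes them from the reported figures and compares the results, assuming the usual textbook definitions (the report does not spell out its formulas):

```python
import math

# Values copied from the snp_price_scor statistics in this report.
minimum, maximum = 35.48, 94.81
q1, q3 = 61.4175, 71.61
mean, std = 66.487985, 7.6268802

# Assumed definitions: range = max - min, IQR = Q3 - Q1,
# CV = std / mean, variance = std squared.
derived = {
    "range": maximum - minimum,
    "iqr": q3 - q1,
    "cv": std / mean,
    "variance": std ** 2,
}

# Figures as printed in the report, for comparison.
reported = {
    "range": 59.33,
    "iqr": 10.1925,
    "cv": 0.11471065,
    "variance": 58.169301,
}

# Allow for rounding in the printed inputs.
agreement = {
    name: math.isclose(derived[name], reported[name], rel_tol=1e-6)
    for name in reported
}
print(agreement)
```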
Histogram with fixed size bins (bins=50)

Common Values

Value                Count  Frequency (%)
68.19                14     0.1%
60.93                14     0.1%
66.51                14     0.1%
68.8                 13     0.1%
66.59                13     0.1%
66.24                13     0.1%
64.05                12     0.1%
66.88                12     0.1%
64.37                12     0.1%
70.03                12     0.1%
Other values (3048)  9871   98.7%

Minimum 10 values

Value  Count  Frequency (%)
35.48  1      < 0.1%
35.78  1      < 0.1%
38.08  1      < 0.1%
38.78  1      < 0.1%
38.87  1      < 0.1%
40.22  1      < 0.1%
40.62  1      < 0.1%
40.77  1      < 0.1%
41.33  1      < 0.1%
41.38  1      < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
94.81  1      < 0.1%
92.91  1      < 0.1%
92.74  1      < 0.1%
92.57  1      < 0.1%
91.95  1      < 0.1%
91.91  1      < 0.1%
91.89  1      < 0.1%
91.69  1      < 0.1%
91.65  1      < 0.1%
91.42  1      < 0.1%

Interactions

[Figures: 9 interaction plots]

Correlations

Variable        pnu    legaldong_cd  induty_cd  snp_price_scor
pnu             1.000  1.000         0.282      0.576
legaldong_cd    1.000  1.000         0.282      0.576
induty_cd       0.282  0.282         1.000      0.357
snp_price_scor  0.576  0.576         0.357      1.000

Variable        pnu     legaldong_cd  snp_price_scor  induty_cd
pnu             1.000   1.000         -0.179          0.100
legaldong_cd    1.000   1.000         -0.179          0.100
snp_price_scor  -0.179  -0.179        1.000           0.130
induty_cd       0.100   0.100         0.130           1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index  data_strd_ym  pnu                  legaldong_cd  induty_cd  snp_price_scor
49382  202306        1129013800101800013  11290138      C05        71.44
26274  202306        1121510300102430002  11215103      B11        58.81
73731  202306        1141011200100900105  11410112      B01        74.98
69275  202306        1138010700100890001  11380107      A14        61.03
5279   202306        1111017200100160001  11110172      C01        68.49
3183   202306        1111015100101390000  11110151      A13        67.96
46774  202306        1129013300100160053  11290133      A13        55.28
38251  202306        1126010100101610006  11260101      A01        60.08
18289  202306        1120010500103360007  11200105      C05        71.71
42564  202306        1126010500101270033  11260105      B02        56.46

Index  data_strd_ym  pnu                  legaldong_cd  induty_cd  snp_price_scor
38448  202306        1126010100101810039  11260101      C07        67.19
77696  202306        1144010200104620000  11440102      B19        68.0
49270  202306        1129013800100680088  11290138      A01        71.11
50993  202306        1130510100101370020  11305101      A01        67.44
30955  202306        1123010100101140066  11230101      A01        64.72
55679  202306        1130510300107290000  11305103      A08        65.97
25573  202306        1121510300100660092  11215103      A02        74.65
54582  202306        1130510300102290053  11305103      A12        72.84
28432  202306        1121510500106180003  11215105      A02        69.62
32137  202306        1123010300111410042  11230103      C01        58.68

Duplicate rows

Most frequently occurring

Index  data_strd_ym  pnu                  legaldong_cd  induty_cd  snp_price_scor  # duplicates
6      202306        1129013400111710000  11290134      B11        60.46           3
0      202306        1111014800100600000  11110148      C03        70.81           2
1      202306        1120011200105750000  11200112      B15        66.51           2
2      202306        1121510300102260005  11215103      A03        65.11           2
3      202306        1121510400102560001  11215104      A03        67.39           2
4      202306        1123010300109900000  11230103      C07        61.22           2
5      202306        1126010600105710027  11260106      B02        64.36           2
7      202306        1138010400104700007  11380104      A02        60.91           2
8      202306        1141011600100670016  11410116      A03        68.45           2
9      202306        1144010400101790014  11440104      A01        77.97           2
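
The table above lists each duplicated row pattern once, together with how often it occurs. The figure of 10 duplicate rows in the Overview matches the number of patterns listed here rather than the number of redundant copies. A small tally of the difference, assuming the table shows every duplicated pattern and that the last digit of each row is its "# duplicates" count:

```python
# "# duplicates" column copied from the duplicate-rows table in this report.
occurrences = [3, 2, 2, 2, 2, 2, 2, 2, 2, 2]

# Number of distinct row patterns that appear more than once.
distinct_duplicated_rows = len(occurrences)

# Total rows that belong to some duplicated pattern.
rows_involved = sum(occurrences)

# Rows that would be dropped if only one copy of each pattern were kept.
extra_copies = sum(count - 1 for count in occurrences)

print(distinct_duplicated_rows, rows_involved, extra_copies)
```

Under these assumptions, deduplicating would remove 11 rows, not 10, because one pattern occurs three times.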