Overview

Dataset statistics

Number of variables: 5
Number of observations: 230
Missing cells: 97
Missing cells (%): 8.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 74.1 KiB
Average record size in memory: 329.7 B
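
The missing-cell percentage follows from the counts in the dataset statistics: 97 missing cells out of 230 x 5 = 1150 cells. A minimal Python check of that arithmetic is sketched below; the function name `missing_cell_pct` is illustrative and not part of ydata-profiling.

```python
def missing_cell_pct(n_missing: int, n_rows: int, n_cols: int) -> float:
    """Share of missing cells in an n_rows x n_cols table, as a percentage."""
    return 100.0 * n_missing / (n_rows * n_cols)


# Figures taken from the dataset statistics in this report.
pct = missing_cell_pct(n_missing=97, n_rows=230, n_cols=5)
print(f"{pct:.1f}%")  # 8.4%
```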

Variable types

Text: 3
Categorical: 2

Alerts

Status is highly imbalanced (88.7%) Imbalance
Status has 97 (42.2%) missing values Missing
StationId has unique values Unique
StationName has unique values Unique
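
The 88.7% imbalance figure is consistent with a normalised-entropy score, 1 - H / log2(k), computed over the non-missing Status values (131 Active, 2 Inactive), and the 42.2% missing figure with 97 of 230 rows. The Python sketch below reproduces both numbers under that assumption; it is not code taken from ydata-profiling, and `imbalance_score` is a hypothetical helper.

```python
import math


def imbalance_score(counts: list[int]) -> float:
    """1 - normalised Shannon entropy of the class counts.

    0 means perfectly balanced; values near 1 mean one class dominates.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        # Convention chosen for this sketch: nothing to compare against.
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Non-missing Status counts from this report: 131 Active, 2 Inactive.
print(f"{imbalance_score([131, 2]):.1%}")  # 88.7%

# Missing share of Status: 97 missing out of 230 observations.
print(f"{97 / 230:.1%}")  # 42.2%
```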

Reproduction

Analysis started: 2024-11-14 23:59:02.947662
Analysis finished: 2024-11-14 23:59:03.739076
Duration: 0.79 seconds
Software version: ydata-profiling v4.12.0
Download configuration: config.json

Variables

StationId
Text

Unique 

Distinct: 230
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 14.1 KiB

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 1150
Distinct characters: 29
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 230
Unique (%): 100.0%

Sample

1st row: AP001
2nd row: AP002
3rd row: AP003
4th row: AP004
5th row: AP005

Value               Count  Frequency (%)
ap001               1      0.4%
dl012               1      0.4%
dl011               1      0.4%
ap003               1      0.4%
ap004               1      0.4%
ap005               1      0.4%
as001               1      0.4%
br001               1      0.4%
br002               1      0.4%
br003               1      0.4%
Other values (220)  220    95.7%

Most occurring characters

Value              Count  Frequency (%)
0                  371    32.3%
1                  97     8.4%
2                  59     5.1%
P                  55     4.8%
H                  53     4.6%
R                  49     4.3%
L                  47     4.1%
M                  40     3.5%
D                  40     3.5%
3                  35     3.0%
Other values (19)  304    26.4%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  1150   100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  1150   100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  1150   100.0%

StationName
Text

Unique 

Distinct: 230
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 19.7 KiB

Length

Max length: 53
Median length: 44
Mean length: 30.091304
Min length: 17

Characters and Unicode

Total characters: 6921
Distinct characters: 64
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 230
Unique (%): 100.0%

Sample

1st row: Secretariat, Amaravati - APPCB
2nd row: Anand Kala Kshetram, Rajamahendravaram - APPCB
3rd row: Tirumala, Tirupati - APPCB
4th row: PWD Grounds, Vijayawada - APPCB
5th row: GVM Corporation, Visakhapatnam - APPCB

Value               Count  Frequency (%)
(blank)             234    20.3%
delhi               38     3.3%
hspcb               28     2.4%
nagar               27     2.3%
dpcc                24     2.1%
mpcb                22     1.9%
uppcb               22     1.9%
kspcb               17     1.5%
cpcb                16     1.4%
wbpcb               14     1.2%
Other values (470)  712    61.7%

Most occurring characters

Value              Count  Frequency (%)
(space)            924    13.4%
a                  755    10.9%
r                  367    5.3%
C                  318    4.6%
P                  311    4.5%
i                  302    4.4%
B                  263    3.8%
-                  254    3.7%
e                  235    3.4%
l                  233    3.4%
Other values (54)  2959   42.8%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  6921   100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  6921   100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  6921   100.0%

City
Text

Distinct: 127
Distinct (%): 55.2%
Missing: 0
Missing (%): 0.0%
Memory size: 14.6 KiB

Length

Max length: 18
Median length: 14
Mean length: 7.2913043
Min length: 4

Characters and Unicode

Total characters: 1677
Distinct characters: 46
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 106
Unique (%): 46.1%

Sample

1st row: Amaravati
2nd row: Rajamahendravaram
3rd row: Tirupati
4th row: Vijayawada
5th row: Visakhapatnam

Value               Count  Frequency (%)
delhi               38     16.1%
mumbai              13     5.5%
bengaluru           10     4.2%
kolkata             7      3.0%
patna               6      2.5%
hyderabad           6      2.5%
noida               6      2.5%
lucknow             5      2.1%
gurugram            4      1.7%
chennai             4      1.7%
Other values (118)  137    58.1%

Most occurring characters

Value              Count  Frequency (%)
a                  312    18.6%
i                  132    7.9%
r                  124    7.4%
h                  99     5.9%
u                  98     5.8%
l                  96     5.7%
e                  87     5.2%
n                  82     4.9%
d                  60     3.6%
o                  46     2.7%
Other values (36)  541    32.3%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  1677   100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  1677   100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  1677   100.0%

State
Categorical

Distinct: 21
Distinct (%): 9.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.9 KiB

Delhi              38
Haryana            29
Uttar Pradesh      26
Maharashtra        22
Karnataka          20
Other values (16)  95

Length

Max length: 14
Median length: 11
Mean length: 8.8478261
Min length: 5

Characters and Unicode

Total characters: 2035
Distinct characters: 36
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5
Unique (%): 2.2%

Sample

1st row: Andhra Pradesh
2nd row: Andhra Pradesh
3rd row: Andhra Pradesh
4th row: Andhra Pradesh
5th row: Andhra Pradesh

Common Values

Value              Count  Frequency (%)
Delhi              38     16.5%
Haryana            29     12.6%
Uttar Pradesh      26     11.3%
Maharashtra        22     9.6%
Karnataka          20     8.7%
Madhya Pradesh     16     7.0%
West Bengal        14     6.1%
Rajasthan          10     4.3%
Bihar              10     4.3%
Punjab             8      3.5%
Other values (11)  37     16.1%

Length

[Figure: Histogram of lengths of the category]

Value              Count  Frequency (%)
pradesh            47     15.9%
delhi              38     12.8%
haryana            29     9.8%
uttar              26     8.8%
maharashtra        22     7.4%
karnataka          20     6.8%
madhya             16     5.4%
west               14     4.7%
bengal             14     4.7%
rajasthan          10     3.4%
Other values (14)  60     20.3%

Most occurring characters

Value              Count  Frequency (%)
a                  494    24.3%
r                  198    9.7%
h                  177    8.7%
e                  128    6.3%
t                  124    6.1%
n                  100    4.9%
s                  97     4.8%
d                  77     3.8%
l                  72     3.5%
(space)            66     3.2%
Other values (26)  502    24.7%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  2035   100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  2035   100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  2035   100.0%

Status
Categorical

Imbalance  Missing 

Distinct: 2
Distinct (%): 1.5%
Missing: 97
Missing (%): 42.2%
Memory size: 12.1 KiB

Active    131
Inactive  2

Length

Max length: 8
Median length: 6
Mean length: 6.0300752
Min length: 6

Characters and Unicode

Total characters: 802
Distinct characters: 9
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Active
2nd row: Active
3rd row: Active
4th row: Active
5th row: Active

Common Values

Value      Count  Frequency (%)
Active     131    57.0%
Inactive   2      0.9%
(Missing)  97     42.2%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Value     Count  Frequency (%)
active    131    98.5%
inactive  2      1.5%
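
The share of Active stations appears as 57.0% in the Common Values table but as 98.5% in the table just above. The figures are consistent with the first dividing by all 230 rows and the second dividing by the 133 non-missing rows. A minimal Python sketch of that reading follows; `frequency_tables` is a hypothetical helper, and representing missing values as `None` is an assumption made for illustration.

```python
from collections import Counter


def frequency_tables(values):
    """Return (share of all rows, share of non-missing rows) per value, in percent."""
    present = [v for v in values if v is not None]
    counts = Counter(present)
    of_all = {v: 100.0 * c / len(values) for v, c in counts.items()}
    of_present = {v: 100.0 * c / len(present) for v, c in counts.items()}
    return of_all, of_present


# Counts from this report: 131 Active, 2 Inactive, 97 missing (None).
status = ["Active"] * 131 + ["Inactive"] * 2 + [None] * 97
of_all, of_present = frequency_tables(status)

print(f"{of_all['Active']:.1f}% vs {of_present['Active']:.1f}%")      # 57.0% vs 98.5%
print(f"{of_all['Inactive']:.1f}% vs {of_present['Inactive']:.1f}%")  # 0.9% vs 1.5%
```

The same non-missing denominator also matches the reported Distinct (%) for Status: 2 distinct values over 133 non-missing rows is about 1.5%.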

Most occurring characters

Value  Count  Frequency (%)
c      133    16.6%
t      133    16.6%
i      133    16.6%
v      133    16.6%
e      133    16.6%
A      131    16.3%
I      2      0.2%
n      2      0.2%
a      2      0.2%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  802    100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  802    100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  802    100.0%

Correlations

        State  Status
State   1.000  0.000
Status  0.000  1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

     StationId  StationName                                     City               State           Status
0    AP001      Secretariat, Amaravati - APPCB                  Amaravati          Andhra Pradesh  Active
1    AP002      Anand Kala Kshetram, Rajamahendravaram - APPCB  Rajamahendravaram  Andhra Pradesh  NaN
2    AP003      Tirumala, Tirupati - APPCB                      Tirupati           Andhra Pradesh  NaN
3    AP004      PWD Grounds, Vijayawada - APPCB                 Vijayawada         Andhra Pradesh  NaN
4    AP005      GVM Corporation, Visakhapatnam - APPCB          Visakhapatnam      Andhra Pradesh  Active
5    AS001      Railway Colony, Guwahati - APCB                 Guwahati           Assam           Active
6    BR001      Collectorate, Gaya - BSPCB                      Gaya               Bihar           NaN
7    BR002      SFTI Kusdihra, Gaya - BSPCB                     Gaya               Bihar           NaN
8    BR003      Industrial Area, Hajipur - BSPCB                Hajipur            Bihar           NaN
9    BR004      Muzaffarpur Collectorate, Muzaffarpur - BSPCB   Muzaffarpur        Bihar           NaN

     StationId  StationName                                   City      State        Status
220  WB005      Ghusuri, Howrah - WBPCB                       Howrah    West Bengal  NaN
221  WB006      Padmapukur, Howrah - WBPCB                    Howrah    West Bengal  NaN
222  WB007      Ballygunge, Kolkata - WBPCB                   Kolkata   West Bengal  Active
223  WB008      Bidhannagar, Kolkata - WBPCB                  Kolkata   West Bengal  Active
224  WB009      Fort William, Kolkata - WBPCB                 Kolkata   West Bengal  Active
225  WB010      Jadavpur, Kolkata - WBPCB                     Kolkata   West Bengal  Active
226  WB011      Rabindra Bharati University, Kolkata - WBPCB  Kolkata   West Bengal  Active
227  WB012      Rabindra Sarobar, Kolkata - WBPCB             Kolkata   West Bengal  Active
228  WB013      Victoria, Kolkata - WBPCB                     Kolkata   West Bengal  Active
229  WB014      Ward-32 Bapupara, Siliguri - WBPCB            Siliguri  West Bengal  NaN