Berdasarkan isu #172: anfrek: Normal
Referensi Isu:
7329.pdf
)Deskripsi Isu:
Diskusi Isu:
Strategi:
import numpy as np
import pandas as pd
from scipy import stats
# contoh data diambil dari buku
# hidrologi: Aplikasi Metode Statistik untuk Analisa Data hal. 12
_DEBIT = [
149.4, 132.4, 125.0, 121.0, 114.7, 109.0, 101.7, 99.2, 97.8, 97.4, 91.1,
90.0, 89.1, 84.6, 83.8, 83.6, 78.6, 77.8, 73.0, 68.5, 65.0, 45.2, 41.6
]
# Data aslinya Tabel 3.4
# 109.0, 125.0, 121.0, 97.4, 78.6, 149.4, 90.0, 114.1, 91.1, 84.6, 132.4,
# 83.8, 73.0, 65.0, 97.8, 77.8, 45.2, 68.5, 83.6, 191.7, 99.2, 41.6, 89.1
# terdapat perbedaan nilai (max seharusnya 191.7) dan nilai 114.1 -> 114.7
data = pd.DataFrame(
data=_DEBIT, columns=['debit'], index=range(1, 24)
)
data
debit | |
---|---|
1 | 149.4 |
2 | 132.4 |
3 | 125.0 |
4 | 121.0 |
5 | 114.7 |
6 | 109.0 |
7 | 101.7 |
8 | 99.2 |
9 | 97.8 |
10 | 97.4 |
11 | 91.1 |
12 | 90.0 |
13 | 89.1 |
14 | 84.6 |
15 | 83.8 |
16 | 83.6 |
17 | 78.6 |
18 | 77.8 |
19 | 73.0 |
20 | 68.5 |
21 | 65.0 |
22 | 45.2 |
23 | 41.6 |
Terdapat 1 tabel untuk modul hk172
yaitu:
t_normal_sw
: Tabel nilai $k$ dari Tabel 3.3 Nilai Variabel Reduksi Gauss. Sumber: hidrologi: Aplikasi Metode Statistik untuk Analisa Data.Dalam modul hk172
nilai $k$ akan dibangkitkan menggunakan scipy
secara default
. Mohon diperhatikan jika ingin menggunakan nilai $k$ yang berasal dari sumber lain.
# Tabel Nilai Variabel Reduksi Gauss
# Dari buku hidrologi: Aplikasi Metode Statistik untuk Analisa Data. hal.119
# KODE: SW
_DATA_SW = [
[1.001, 0.999, -3.050],
[1.005, 0.995, -2.580],
[1.010, 0.990, -2.330],
[1.050, 0.950, -1.640],
[1.110, 0.900, -1.280],
[1.250, 0.800, -0.840],
[1.330, 0.750, -0.670],
[1.430, 0.700, -0.520],
[1.670, 0.600, -0.250],
[2.000, 0.500, 0.000],
[2.500, 0.400, 0.250],
[3.330, 0.300, 0.520],
[4.000, 0.250, 0.670],
[5.000, 0.200, 0.840],
[10.000, 0.100, 1.280],
[20.000, 0.050, 1.640],
[50.000, 0.200, 2.050],
[100.000, 0.010, 2.330],
[200.000, 0.005, 2.580],
[500.000, 0.002, 2.880],
[1000.000, 0.001, 3.090],
]
_COL_SW = ['periode_ulang', 'peluang', 'k']
t_normal_sw = pd.DataFrame(
data=_DATA_SW, columns=_COL_SW
)
t_normal_sw
periode_ulang | peluang | k | |
---|---|---|---|
0 | 1.001 | 0.999 | -3.05 |
1 | 1.005 | 0.995 | -2.58 |
2 | 1.010 | 0.990 | -2.33 |
3 | 1.050 | 0.950 | -1.64 |
4 | 1.110 | 0.900 | -1.28 |
5 | 1.250 | 0.800 | -0.84 |
6 | 1.330 | 0.750 | -0.67 |
7 | 1.430 | 0.700 | -0.52 |
8 | 1.670 | 0.600 | -0.25 |
9 | 2.000 | 0.500 | 0.00 |
10 | 2.500 | 0.400 | 0.25 |
11 | 3.330 | 0.300 | 0.52 |
12 | 4.000 | 0.250 | 0.67 |
13 | 5.000 | 0.200 | 0.84 |
14 | 10.000 | 0.100 | 1.28 |
15 | 20.000 | 0.050 | 1.64 |
16 | 50.000 | 0.200 | 2.05 |
17 | 100.000 | 0.010 | 2.33 |
18 | 200.000 | 0.005 | 2.58 |
19 | 500.000 | 0.002 | 2.88 |
20 | 1000.000 | 0.001 | 3.09 |
def _find_k_in_table(return_period, table):
x = table.periode_ulang
y = table.k
return np.interp(return_period, x, y)
def _calc_prob_in_table(k, table):
x = table.k
y = table.peluang
return np.interp(k, x, y)
def find_K(return_period, source='scipy'):
if source.lower() == 'soewarno':
return _find_k_in_table(return_period, t_normal_sw)
elif source.lower() == 'scipy':
return_period = np.array(return_period)
return stats.norm.ppf(1 - 1/return_period)
def calc_x_normal(x, return_period=[5], source='scipy', show_stat=False):
return_period = np.array(return_period)
x = np.array(x)
x_mean = np.mean(x)
x_std = np.std(x, ddof=1)
n = x.size
k = find_K(return_period, source=source)
if show_stat:
print(f'x_mean = {x_mean:.5f}')
print(f'x_std = {x_std:.5f}')
print(f'k = {k}')
val_x = x_mean + k * x_std
return val_x
def freq_normal(
df, col=None,
return_period=[2, 5, 10, 20, 25, 50, 100], show_stat=False, source='scipy',
col_name='Normal', index_name='Kala Ulang'):
col = df.columns[0] if col is None else col
x = df[col].copy()
arr = calc_x_normal(
x, return_period=return_period, show_stat=show_stat, source=source)
result = pd.DataFrame(
data=arr, index=return_period, columns=[col_name]
)
result.index.name = index_name
return result
def calc_prob(k, source='scipy'):
if source.lower() == 'soewarno':
k = np.array(k)
return 1 - _calc_prob_in_table(k, t_normal_sw)
elif source.lower() == 'scipy':
return stats.norm.cdf(k)
find_K(return_period, ...)
¶Function: find_K(return_period, source='scipy')
Fungsi find_K(...)
digunakan untuk mencari nilai $K$ dari berbagai sumber berdasarkan kala ulang.
return_period
: kala ulang. Dapat diisi dengan scalar atau array_like.source
: sumber nilai $k$, scipy
(default). Sumber yang dapat digunakan antara lain: Soewarno ('soewarno'
).Catatan: Fungsi ini sama dengan .hk124
.
find_K(10)
1.2815515655446004
find_K([2, 5, 10], source='soewarno')
array([0. , 0.84, 1.28])
# perbandingan antara masing-masing sumber
_rp = [2, 5, 10, 15, 20, 25, 27, 50, 100]
source_test = ['soewarno', 'scipy']
for _source in source_test:
print(f'k {_source:10}= {find_K(_rp, source=_source)}')
k soewarno = [0. 0.84 1.28 1.46 1.64 1.70833333 1.73566667 2.05 2.33 ] k scipy = [0. 0.84162123 1.28155157 1.50108595 1.64485363 1.75068607 1.78615556 2.05374891 2.32634787]
calc_x_normal(x, ...)
¶Function: calc_x_normal(x, return_period=[5], source='scipy', show_stat=False)
Fungsi calc_x_lognormal(...)
digunakan untuk mencari besar $X$ berdasarkan kala ulang (return period), yang hasilnya dalam bentuk numpy.array
.
x
: array.return_period
: kala ulang (tahun), [5]
(default).source
: sumber nilai $k$, 'scipy'
(default). Sumber yang dapat digunakan antara lain: Soewarno ('soewarno'
).show_stat
: menampilkan parameter statistik. False
(default).calc_x_normal(data.debit)
array([114.00269627])
calc_x_normal(data.debit, show_stat=True)
x_mean = 92.15217 x_std = 25.96242 k = [0.84162123]
array([114.00269627])
calc_x_normal(data.debit, return_period=[5, 10, 15, 20, 21], show_stat=True)
x_mean = 92.15217 x_std = 25.96242 k = [0.84162123 1.28155157 1.50108595 1.64485363 1.66839119]
array([114.00269627, 125.42435149, 131.12399486, 134.85655151, 135.46764366])
freq_normal(df, ...)
¶Function: freq_normal(df, col=None, return_period=[2, 5, 10, 20, 25, 50, 100], show_stat=False, source='scipy', col_name='Normal')
Fungsi freq_normal(...)
merupakan fungsi kembangan lebih lanjut dari calc_x_normal(...)
yang menerima input pandas.DataFrame
dan memiliki luaran berupa pandas.DataFrame
.
df
: pandas.DataFrame
.col
: nama kolom, None
(default). Jika tidak diisi menggunakan kolom pertama dalam df
sebagai data masukan.return_period
: kala ulang (tahun), [2, 5, 10, 20, 25, 50, 100]
(default).source
: sumber nilai $k$, 'scipy'
(default). Sumber yang dapat digunakan antara lain: Soewarno ('soewarno'
).show_stat
: menampilkan parameter statistik. False
(default).col_name
: nama kolom luaran, Log Normal
(default).freq_normal(data)
Normal | |
---|---|
Kala Ulang | |
2 | 92.152174 |
5 | 114.002696 |
10 | 125.424351 |
20 | 134.856552 |
25 | 137.604218 |
50 | 145.472462 |
100 | 152.549790 |
freq_normal(data, source='soewarno', col_name='Normal (Soewarno)')
Normal (Soewarno) | |
---|---|
Kala Ulang | |
2 | 92.152174 |
5 | 113.960605 |
10 | 125.384069 |
20 | 134.730540 |
25 | 136.504638 |
50 | 145.375131 |
100 | 152.644608 |
freq_normal(data, 'debit', source='scipy', col_name='Normal (scipy)', show_stat=True)
x_mean = 92.15217 x_std = 25.96242 k = [0. 0.84162123 1.28155157 1.64485363 1.75068607 2.05374891 2.32634787]
Normal (scipy) | |
---|---|
Kala Ulang | |
2 | 92.152174 |
5 | 114.002696 |
10 | 125.424351 |
20 | 134.856552 |
25 | 137.604218 |
50 | 145.472462 |
100 | 152.549790 |
_res = []
for _s in ['soewarno', 'scipy']:
_res += [freq_normal(data, 'debit', source=_s, col_name=f'Normal ({_s})')]
pd.concat(_res, axis=1)
Normal (soewarno) | Normal (scipy) | |
---|---|---|
Kala Ulang | ||
2 | 92.152174 | 92.152174 |
5 | 113.960605 | 114.002696 |
10 | 125.384069 | 125.424351 |
20 | 134.730540 | 134.856552 |
25 | 136.504638 | 137.604218 |
50 | 145.375131 | 145.472462 |
100 | 152.644608 | 152.549790 |
calc_prob(k, ...)
¶Function: calc_prob(k, source='scipy')
Fungsi calc_prob(...)
digunakan untuk mencari nilai peluang/probabilitas dari berbagai sumber berdasarkan nilai $K$.
k
: $K$ (faktor frekuensi).source
: sumber nilai peluang, scipy
(default). Sumber yang dapat digunakan antara lain: Soewarno ('soewarno'
).calc_prob(-0.25)
0.4012936743170763
calc_prob(0.52, source='soewarno')
0.7
# perbandingan antara masing-masing sumber
_k = [
-3.09, -2.58, -2.33, -1.67, 0. , 0.84, 1.28, 1.5 , 1.64,
1.75, 1.79, 2.05, 2.33
]
source_test = ['soewarno', 'scipy']
for _source in source_test:
print(f'prob {_source:10}= {calc_prob(_k, source=_source)}')
prob soewarno = [0.001 0.005 0.01 0.04826087 0.5 0.8 0.9 0.93055556 0.95 0.9097561 0.89512195 0.8 0.99 ] prob scipy = [0.00100078 0.00494002 0.00990308 0.04745968 0.5 0.79954581 0.89972743 0.9331928 0.94949742 0.95994084 0.96327304 0.97981778 0.99009692]
- 20220416 - 1.1.1 - koreksi nama kolom
- 20220323 - 1.1.0 - tambah argumen index_name="Kala Ulang" pada fungsi freq_normal() untuk penamaan index
- 20220316 - 1.0.5 - ubah fungsi _calc_prob(...) (hasil menjadi 1-P)
- 20220315 - 1.0.4 - ubah nama fungs _find_prob_in_table -> _calc_prob_in_table(...)
- 20220315 - 1.0.3 - Ubah nama fungsi find_prob(...) -> calc_prob(...)
- 20220314 - 1.0.2 - Typo
- 20220314 - 1.0.1 - Tambah fungsi find_prob(...)
- 20220311 - 1.0.0 - Initial
Source code in this notebook is licensed under a MIT License. Data in this notebook is licensed under a Creative Common Attribution 4.0 International.