Based on issue #124: anfrek: Log Normal 2 Parameter
Issue references:
Issue description:
Issue discussion:
Strategy:
import numpy as np
import pandas as pd
from scipy import stats
# sample data taken from the book
# Hidrologi: Aplikasi Metode Statistik untuk Analisa Data, p. 144
_DEBIT = [
    58.3, 50.5, 46.0, 41.8, 38.2, 37.9, 37.7, 35.3, 35.2, 33.4, 31.9,
    31.1, 30.9, 30.1, 28.8, 24.7, 23.6, 23.5, 23.1, 22.5, 21.1, 20.5,
    20.5, 20.3, 20.2, 18.7, 17.2, 14.9, 12.4, 11.8
]
data = pd.DataFrame(
    data=_DEBIT, columns=['debit'], index=range(1, 31)
)
data
| | debit |
---|---|
1 | 58.3 |
2 | 50.5 |
3 | 46.0 |
4 | 41.8 |
5 | 38.2 |
6 | 37.9 |
7 | 37.7 |
8 | 35.3 |
9 | 35.2 |
10 | 33.4 |
11 | 31.9 |
12 | 31.1 |
13 | 30.9 |
14 | 30.1 |
15 | 28.8 |
16 | 24.7 |
17 | 23.6 |
18 | 23.5 |
19 | 23.1 |
20 | 22.5 |
21 | 21.1 |
22 | 20.5 |
23 | 20.5 |
24 | 20.3 |
25 | 20.2 |
26 | 18.7 |
27 | 17.2 |
28 | 14.9 |
29 | 12.4 |
30 | 11.8 |
There is 1 table for the hk124 module, namely:
- t_normal_sw: table of $k$ values from Table 3.3, Nilai Variabel Reduksi Gauss (values of the reduced Gaussian variable). Source: Hidrologi: Aplikasi Metode Statistik untuk Analisa Data.
In the hk124 module, $k$ values are generated with scipy by default. Take care if you want to use $k$ values from another source.
# Table of reduced Gaussian variable values
# From the book Hidrologi: Aplikasi Metode Statistik untuk Analisa Data, p. 119
# CODE: SW
_DATA_SW = [
    [1.001, 0.999, -3.050],
    [1.005, 0.995, -2.580],
    [1.010, 0.990, -2.330],
    [1.050, 0.950, -1.640],
    [1.110, 0.900, -1.280],
    [1.250, 0.800, -0.840],
    [1.330, 0.750, -0.670],
    [1.430, 0.700, -0.520],
    [1.670, 0.600, -0.250],
    [2.000, 0.500, 0.000],
    [2.500, 0.400, 0.250],
    [3.330, 0.300, 0.520],
    [4.000, 0.250, 0.670],
    [5.000, 0.200, 0.840],
    [10.000, 0.100, 1.280],
    [20.000, 0.050, 1.640],
    [50.000, 0.020, 2.050],
    [100.000, 0.010, 2.330],
    [200.000, 0.005, 2.580],
    [500.000, 0.002, 2.880],
    [1000.000, 0.001, 3.090],
]
_COL_SW = ['periode_ulang', 'peluang', 'k']
t_normal_sw = pd.DataFrame(
    data=_DATA_SW, columns=_COL_SW
)
t_normal_sw
| | periode_ulang | peluang | k |
---|---|---|---|
0 | 1.001 | 0.999 | -3.05 |
1 | 1.005 | 0.995 | -2.58 |
2 | 1.010 | 0.990 | -2.33 |
3 | 1.050 | 0.950 | -1.64 |
4 | 1.110 | 0.900 | -1.28 |
5 | 1.250 | 0.800 | -0.84 |
6 | 1.330 | 0.750 | -0.67 |
7 | 1.430 | 0.700 | -0.52 |
8 | 1.670 | 0.600 | -0.25 |
9 | 2.000 | 0.500 | 0.00 |
10 | 2.500 | 0.400 | 0.25 |
11 | 3.330 | 0.300 | 0.52 |
12 | 4.000 | 0.250 | 0.67 |
13 | 5.000 | 0.200 | 0.84 |
14 | 10.000 | 0.100 | 1.28 |
15 | 20.000 | 0.050 | 1.64 |
16 | 50.000 | 0.020 | 2.05 |
17 | 100.000 | 0.010 | 2.33 |
18 | 200.000 | 0.005 | 2.58 |
19 | 500.000 | 0.002 | 2.88 |
20 | 1000.000 | 0.001 | 3.09 |
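The three columns of the table are tied together: peluang is the exceedance probability $1/T$ of the return period $T$, and $k$ is the standard normal quantile at $1 - 1/T$, rounded to two decimals. The sketch below checks this for a handful of rows copied from the table. The names rows, period, prob, and k_table are illustrative and are not part of the module.

```python
import numpy as np
from scipy import stats

# A few rows copied from the table above: (periode_ulang, peluang, k).
rows = np.array([
    [2.0, 0.500, 0.00],
    [5.0, 0.200, 0.84],
    [10.0, 0.100, 1.28],
    [20.0, 0.050, 1.64],
    [100.0, 0.010, 2.33],
])
period, prob, k_table = rows.T

# peluang is the exceedance probability 1/T.
prob_from_period = 1 / period

# k is the standard normal quantile at the non-exceedance probability 1 - 1/T.
k_scipy = stats.norm.ppf(1 - prob_from_period)

# True if the probability column matches 1/T for these rows.
print(np.allclose(prob, prob_from_period))
# Largest gap between the tabulated k and the scipy quantile.
print(np.abs(k_scipy - k_table).max())
```

The gap between the tabulated and computed $k$ stays below the two-decimal rounding of the table, which is why both sources give nearly the same results at tabulated return periods.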
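def _find_k_in_table(return_period, table):
    # linear interpolation of k over the return-period column
    x = table.periode_ulang
    y = table.k
    return np.interp(return_period, x, y)
def _calc_prob_in_table(k, table):
    # linear interpolation of the tabulated probability over the k column
    x = table.k
    y = table.peluang
    return np.interp(k, x, y)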
def find_K(return_period, source='scipy'):
    if source.lower() == 'soewarno':
        return _find_k_in_table(return_period, t_normal_sw)
    elif source.lower() == 'scipy':
        # standard normal quantile at the non-exceedance probability 1 - 1/T
        return_period = np.array(return_period)
        return stats.norm.ppf(1 - 1/return_period)
    raise ValueError(f'unknown source: {source!r}')
def calc_x_lognormal(x, return_period=[5], source='scipy', show_stat=False):
    return_period = np.array(return_period)
    # work in log10 space: y = log10(x)
    y = np.log10(x)
    y_mean = np.mean(y)
    y_std = np.std(y, ddof=1)  # sample standard deviation
    k = find_K(return_period, source=source)
    if show_stat:
        print(f'y_mean = {y_mean:.5f}')
        print(f'y_std = {y_std:.5f}')
        print(f'k = {k}')
    # frequency-factor equation, then back-transform to the original units
    val_y = y_mean + k * y_std
    val_x = np.power(10, val_y)
    return val_x
def freq_lognormal(
        df, col=None,
        return_period=[2, 5, 10, 20, 25, 50, 100], show_stat=False, source='scipy',
        col_name='Log Normal', index_name='Kala Ulang'):
    # use the first column when no column name is given
    col = df.columns[0] if col is None else col
    x = df[col].copy()
    arr = calc_x_lognormal(
        x, return_period=return_period, show_stat=show_stat, source=source)
    result = pd.DataFrame(
        data=arr, index=return_period, columns=[col_name]
    )
    result.index.name = index_name
    return result
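def calc_prob(k, source='scipy'):
    if source.lower() == 'soewarno':
        k = np.array(k)
        # the table stores exceedance probabilities, hence 1 - P
        return 1 - _calc_prob_in_table(k, t_normal_sw)
    elif source.lower() == 'scipy':
        return stats.norm.cdf(k)
    raise ValueError(f'unknown source: {source!r}')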
find_K(return_period, ...)
Function: find_K(return_period, source='scipy')
The find_K(...) function returns the value of $K$ for a given return period, taken from one of several sources.
- return_period: return period. Accepts a scalar or array_like.
- source: source of the $k$ values, 'scipy' (default). Other available sources: Soewarno ('soewarno').
find_K(10)
1.2815515655446004
find_K([2, 5, 10], source='soewarno')
array([0. , 0.84, 1.28])
# comparison between the sources
_rp = [2, 5, 10, 15, 20, 25, 27, 50, 100]
source_test = ['soewarno', 'scipy']
for _source in source_test:
    print(f'k {_source:10}= {find_K(_rp, source=_source)}')
k soewarno = [0. 0.84 1.28 1.46 1.64 1.70833333 1.73566667 2.05 2.33 ]
k scipy = [0. 0.84162123 1.28155157 1.50108595 1.64485363 1.75068607 1.78615556 2.05374891 2.32634787]
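Read from the code, the 'scipy' branch of find_K(...) evaluates the inverse standard normal CDF at the non-exceedance probability, while the 'soewarno' branch interpolates linearly between the two neighbouring table rows. Written out, with $\Phi$ the standard normal CDF and $(T_i, k_i)$, $(T_{i+1}, k_{i+1})$ the table rows bracketing $T$ (this notation is introduced here for illustration only):

```latex
K_T^{\text{scipy}} = \Phi^{-1}\!\left(1 - \frac{1}{T}\right),
\qquad
K_T^{\text{table}} = k_i + \frac{T - T_i}{T_{i+1} - T_i}\,\bigl(k_{i+1} - k_i\bigr),
\quad T_i \le T \le T_{i+1}.
```

For example, $T = 15$ lies between the rows for 10 and 20 years, so the table value is $1.28 + \tfrac{5}{10}(1.64 - 1.28) = 1.46$, which matches the comparison output above.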
calc_x_lognormal(x, ...)
Function: calc_x_lognormal(x, return_period=[5], source='scipy', show_stat=False)
The calc_x_lognormal(...) function computes the magnitude $X$ for the given return periods and returns the result as a numpy.array.
- x: array.
- return_period: return period (years), [5] (default).
- source: source of the $k$ values, 'scipy' (default). Other available sources: Soewarno ('soewarno').
- show_stat: print the statistical parameters, False (default).
calc_x_lognormal(data.debit)
array([37.23254445])
calc_x_lognormal(data.debit, show_stat=True)
y_mean = 1.42638
y_std = 0.17174
k = [0.84162123]
array([37.23254445])
calc_x_lognormal(data.debit, return_period=[5, 10, 15, 20, 21], show_stat=True)
y_mean = 1.42638
y_std = 0.17174
k = [0.84162123 1.28155157 1.50108595 1.64485363 1.66839119]
array([37.23254445, 44.30748222, 48.32593502, 51.15300628, 51.63135738])
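The 5-year result can be reproduced by hand from the printed statistics. This is a minimal sketch that assumes the rounded values y_mean = 1.42638 and y_std = 0.17174 shown above, so it agrees with the function only to about three decimals.

```python
import numpy as np
from scipy import stats

# Rounded statistics printed by show_stat=True above (log10 of the discharge).
y_mean = 1.42638
y_std = 0.17174

# Frequency factor for a 5-year return period:
# standard normal quantile at the non-exceedance probability 1 - 1/5.
k = stats.norm.ppf(1 - 1 / 5)

# Frequency-factor equation in log10 space, then back-transform with 10**y.
y_5 = y_mean + k * y_std
x_5 = 10 ** y_5

print(round(x_5, 2))
```

The printed value should be close to the 37.23 returned by calc_x_lognormal(data.debit); the small remaining gap comes from rounding the statistics to five decimals.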
freq_lognormal(df, ...)
Function: freq_lognormal(df, col=None, return_period=[2, 5, 10, 20, 25, 50, 100], show_stat=False, source='scipy', col_name='Log Normal', index_name='Kala Ulang')
The freq_lognormal(...) function builds on calc_x_lognormal(...): it accepts a pandas.DataFrame as input and returns a pandas.DataFrame.
- df: pandas.DataFrame.
- col: column name, None (default). If omitted, the first column of df is used as the input data.
- return_period: return period (years), [2, 5, 10, 20, 25, 50, 100] (default).
- source: source of the $k$ values, 'scipy' (default). Other available sources: Soewarno ('soewarno').
- show_stat: print the statistical parameters, False (default).
- col_name: name of the output column, 'Log Normal' (default).
- index_name: name of the output index, 'Kala Ulang' (default).
freq_lognormal(data)
Kala Ulang | Log Normal |
---|---|
2 | 26.692012 |
5 | 37.232544 |
10 | 44.307482 |
20 | 51.153006 |
25 | 53.339262 |
50 | 60.130596 |
100 | 66.974904 |
freq_lognormal(data, source='soewarno', col_name='Log Normal (Soewarno)')
Kala Ulang | Log Normal (Soewarno) |
---|---|
2 | 26.692012 |
5 | 37.208682 |
10 | 44.280305 |
20 | 51.054919 |
25 | 52.453355 |
50 | 60.041518 |
100 | 67.071701 |
freq_lognormal(data, 'debit', source='scipy', col_name='Log Normal (scipy)', show_stat=True)
y_mean = 1.42638
y_std = 0.17174
k = [0. 0.84162123 1.28155157 1.64485363 1.75068607 2.05374891 2.32634787]
Kala Ulang | Log Normal (scipy) |
---|---|
2 | 26.692012 |
5 | 37.232544 |
10 | 44.307482 |
20 | 51.153006 |
25 | 53.339262 |
50 | 60.130596 |
100 | 66.974904 |
_res = []
for _s in ['soewarno', 'scipy']:
    _res += [freq_lognormal(data, 'debit', source=_s, col_name=f'Log Normal ({_s})')]
pd.concat(_res, axis=1)
Kala Ulang | Log Normal (soewarno) | Log Normal (scipy) |
---|---|---|
2 | 26.692012 | 26.692012 |
5 | 37.208682 | 37.232544 |
10 | 44.280305 | 44.307482 |
20 | 51.054919 | 51.153006 |
25 | 52.453355 | 53.339262 |
50 | 60.041518 | 60.130596 |
100 | 67.071701 | 66.974904 |
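The two sources differ most at the 25-year return period because the table has no row for $T = 25$: the 'soewarno' value is interpolated linearly in $T$ between the rows for 20 and 50 years, whereas 'scipy' evaluates the normal quantile directly. A short sketch of that difference, using only the two bracketing rows from the table (the variable names are illustrative):

```python
import numpy as np
from scipy import stats

# Table rows bracketing T = 25: return periods and their k values.
t_rows = [20.0, 50.0]
k_rows = [1.64, 2.05]

# 'soewarno' path: linear interpolation in the return period.
k_table = np.interp(25, t_rows, k_rows)

# 'scipy' path: standard normal quantile at the non-exceedance probability 1 - 1/25.
k_scipy = stats.norm.ppf(1 - 1 / 25)

print(k_table, k_scipy)
```

The interpolated factor is lower than the exact quantile, which is why the 'soewarno' column gives 52.45 against 53.34 for 'scipy' at 25 years.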
calc_prob(k, ...)
Function: calc_prob(k, source='scipy')
The calc_prob(...) function returns the probability for a given value of $K$, taken from one of several sources.
- k: $K$ (frequency factor).
- source: source of the probability values, 'scipy' (default). Other available sources: Soewarno ('soewarno').
Note: this function is identical to the one in the hk172 module, or #172 (Anfrek: Normal).
calc_prob(-0.25)
0.4012936743170763
calc_prob(0.52, source='soewarno')
0.7
# comparison between the sources
_k = [
    -3.09, -2.58, -2.33, -1.67, 0. , 0.84, 1.28, 1.5 , 1.64,
    1.75, 1.79, 2.05, 2.33
]
source_test = ['soewarno', 'scipy']
for _source in source_test:
    print(f'prob {_source:10}= {calc_prob(_k, source=_source)}')
prob soewarno = [0.001 0.005 0.01 0.04826087 0.5 0.8 0.9 0.93055556 0.95 0.95804878 0.96097561 0.98 0.99 ]
prob scipy = [0.00100078 0.00494002 0.00990308 0.04745968 0.5 0.79954581 0.89972743 0.9331928 0.94949742 0.95994084 0.96327304 0.97981778 0.99009692]
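With the 'scipy' source, calc_prob(...) and find_K(...) are inverses of each other: the probability returned for $K_T$ is the non-exceedance probability $1 - 1/T$. A minimal check of that relationship, written directly against scipy.stats.norm rather than the module functions so that it runs on its own:

```python
import numpy as np
from scipy import stats

return_period = np.array([2, 5, 10, 25, 50, 100])

# find_K(..., source='scipy') evaluates this quantile.
k = stats.norm.ppf(1 - 1 / return_period)

# calc_prob(..., source='scipy') evaluates this CDF.
prob = stats.norm.cdf(k)

# Converting the non-exceedance probability back gives the return period.
recovered = 1 / (1 - prob)

print(np.allclose(recovered, return_period))
```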
- 20220323 - 1.1.0 - add the index_name="Kala Ulang" argument to freq_lognormal() for naming the index
- 20220316 - 1.0.3 - change the _calc_prob(...) function (result becomes 1-P)
- 20220315 - 1.0.2 - rename function find_prob -> calc_prob(...)
- 20220314 - 1.0.1 - add function find_prob(...)
- 20220310 - 1.0.0 - Initial
Source code in this notebook is licensed under the MIT License. Data in this notebook is licensed under a Creative Commons Attribution 4.0 International license.