**docs/data/interpolation.md** (+2 -2)
@@ -1,6 +1,6 @@
# Data and Interpolation

- [← Previous: Hypothesis Tests](../statistics/hypothesis-tests.md) | [Back to Index](../index.md) | [Next: Linear Regression →](regression.md)
+ [← Previous: ODE Solvers](../mathematics/ode-solvers.md) | [Back to Index](../index.md) | [Next: Linear Regression →](regression.md)

Interpolation is the process of estimating values between known data points. Given a set of $n$ data points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, interpolation constructs a function $p(x)$ that passes through all the data points, i.e., $p(x_i) = y_i$ for all $i$. This is distinct from regression, which fits a function that approximates the data while minimizing some error criterion.
@@ -360,4 +360,4 @@ for (double t = 0; t <= 1; t += 0.2)
---

- [← Previous: Hypothesis Tests](../statistics/hypothesis-tests.md) | [Back to Index](../index.md) | [Next: Linear Regression →](regression.md)
+ [← Previous: ODE Solvers](../mathematics/ode-solvers.md) | [Back to Index](../index.md) | [Next: Linear Regression →](regression.md)
**docs/data/time-series.md** (+2 -2)
@@ -1,6 +1,6 @@
# Time Series

- [← Previous: Linear Regression](regression.md) | [Back to Index](../index.md) | [Next: Random Generation →](../sampling/random-generation.md)
+ [← Previous: Linear Regression](regression.md) | [Back to Index](../index.md) | [Next: Descriptive Statistics →](../statistics/descriptive.md)

The ***Numerics*** library provides a comprehensive `TimeSeries` class for working with time-indexed data. This class supports regular and irregular time intervals, statistical operations, transformations, and analysis methods essential for hydrological and environmental data.
@@ -721,4 +721,4 @@ for (int m = 1; m <= 12; m++)
---

- [← Previous: Linear Regression](regression.md) | [Back to Index](../index.md) | [Next: Random Generation →](../sampling/random-generation.md)
+ [← Previous: Linear Regression](regression.md) | [Back to Index](../index.md) | [Next: Descriptive Statistics →](../statistics/descriptive.md)
@@ -27,7 +27,7 @@ public enum ParameterEstimationMethod
|**Method of Moments**| Simple, fast | Inefficient, biased for small samples | Quick estimates, stable parameters |
|**Method of Percentiles**| Intuitive, robust | Less efficient | Expert judgment, special cases |

- **Recommendation for Hydrological Applications:** L-Moments are recommended by USGS [[1]](#1) for flood frequency analysis due to superior performance with small samples and robustness to outliers.
+ **Recommendation for Hydrological Applications:** L-moments are recommended by USGS [[1]](#1) for flood frequency analysis due to superior performance with small samples and robustness to outliers.
L-skewness is bounded in $[-1, 1]$ and L-kurtosis in $[\frac{1}{4}(5\tau_3^2 - 1),\; 1]$, unlike conventional skewness and kurtosis which are unbounded. This boundedness makes L-moment ratios more interpretable and stable.
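As a quick sanity check on these bounds (a worked example, not part of the original text): a symmetric distribution has $\tau_3 = 0$, so its L-kurtosis can be no smaller than

```math
\tau_4 \;\geq\; \frac{1}{4}\left(5\tau_3^2 - 1\right) \Big|_{\tau_3 = 0} \;=\; -\frac{1}{4},
```

and the normal distribution's theoretical L-kurtosis of about $0.1226$ sits comfortably inside the interval $[-\tfrac{1}{4},\, 1]$.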
**Sample estimation.** Given a sorted sample $x_{1:n} \leq x_{2:n} \leq \cdots \leq x_{n:n}$, the unbiased sample PWM estimators are:
The `Statistics.LinearMoments()` method computes these sample PWMs and returns the array $[\lambda_1,\; \lambda_2,\; \tau_3,\; \tau_4]$.
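A minimal usage sketch follows. The sample values are made up for illustration, and the namespace containing the `Statistics` class is assumed to be in scope (this excerpt only shows `using Numerics.Distributions;`); only the return layout $[\lambda_1,\; \lambda_2,\; \tau_3,\; \tau_4]$ is taken from the text above.

```cs
// Hypothetical annual-maximum sample (illustrative values only).
double[] sample = { 120.5, 98.2, 143.7, 110.1, 87.6, 156.3, 102.9, 131.4, 95.8, 118.0 };

// Returns [λ1, λ2, τ3, τ4]: L-location, L-scale, L-skewness, L-kurtosis.
double[] lmom = Statistics.LinearMoments(sample);

Console.WriteLine($"L-location (λ1) = {lmom[0]:F3}");
Console.WriteLine($"L-scale    (λ2) = {lmom[1]:F3}");
Console.WriteLine($"L-skewness (τ3) = {lmom[2]:F3}");
Console.WriteLine($"L-kurtosis (τ4) = {lmom[3]:F3}");
```

These four values are exactly the inputs needed by the L-moment parameter estimators discussed below.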
**Why L-moments are preferred for small samples.** Conventional moments involve powers of deviations from the mean, so a single extreme observation can dominate the skewness or kurtosis estimate. L-moments use only linear combinations of order statistics, which makes them far more robust to outliers and nearly unbiased even for samples as small as $n = 10$. For hydrological applications where sample sizes are often 30--60 years of annual data, this robustness is critical.
**L-moment ratio diagrams.** Plotting sample L-skewness ($\tau_3$) against L-kurtosis ($\tau_4$) and comparing to the theoretical curves of candidate distributions is a powerful tool for distribution identification. Each distribution family traces a distinct curve (or point) in L-moment ratio space, making visual comparison straightforward [[2]](#2).
### Properties of L-Moments
- 1. **More robust** than conventional moments - less influenced by outliers
+ 1. **More robust** than conventional moments -- less influenced by outliers
2. **Less biased** for small samples
- 3. **More efficient** - smaller sampling variance
- 4. **Bounded** - L-moment ratios are bounded, unlike conventional moments
+ 3. **More efficient** -- smaller sampling variance
+ 4. **Bounded** -- L-moment ratios are bounded, unlike conventional moments
5. **Nearly unbiased** even for very small samples (n = 10)

- MLE finds parameters that maximize the likelihood of observing the data [[3]](#3):
+ Maximum Likelihood Estimation (MLE) finds the parameter values that make the observed data most probable under the assumed model [[3]](#3).
### Mathematical Formulation
Given independent observations $x_1, x_2, \ldots, x_n$ from a distribution with PDF $f(x|\boldsymbol{\theta})$, the likelihood function is the joint probability of the data viewed as a function of the parameters:

For some distributions (e.g., Normal, Exponential), the MLE has a closed-form solution. For most distributions used in hydrology (GEV, LP3, Weibull), the optimization must be solved numerically. The library uses constrained optimization with initial values derived from L-moment estimates.
**Fisher Information and standard errors.** The Fisher Information matrix quantifies the curvature of the log-likelihood surface at the maximum:
**Strengths:** Asymptotically efficient (achieves the lowest possible variance among consistent estimators), asymptotically unbiased, invariant under reparameterization, provides a natural framework for model comparison via AIC and BIC.
**Weaknesses:** Requires numerical optimization that may fail to converge, sensitive to outliers, can be biased and inefficient for small samples, requires specification of the full probability model.
### Using MLE
```cs
using Numerics.Distributions;
```
@@ -275,7 +367,49 @@ catch (Exception ex)
## Method of Moments
- MOM matches sample moments with theoretical moments:
+ The Method of Moments (MOM) is the oldest and simplest approach to parameter estimation. The core idea is to equate sample moments to the corresponding theoretical moments of the distribution and solve for the unknown parameters.
### Mathematical Formulation
Given a sample $x_1, x_2, \ldots, x_n$, the first four sample moments are the mean, standard deviation, skewness, and kurtosis:

```math
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i
```

```math
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}
```

The `Statistics.ProductMoments()` method returns these four quantities as the array $[\bar{x},\; s,\; \hat{\gamma},\; \hat{\kappa}]$.

MOM estimation sets the theoretical moments equal to the sample moments and solves for the distribution parameters. For a two-parameter distribution, only the first two moments (mean and standard deviation) are needed. For three-parameter distributions, skewness is also required.
**Example: Normal distribution.** The Normal($\mu$, $\sigma$) has $E[X] = \mu$ and $\text{SD}[X] = \sigma$. Equating sample to theoretical moments yields:
```math
\hat{\mu} = \bar{x}, \quad \hat{\sigma} = s
```
**Example: Gamma distribution.** The Gamma($\kappa$, $\theta$) has $E[X] = \kappa\theta$ and $\text{Var}[X] = \kappa\theta^2$. Solving for the parameters:
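Spelled out for concreteness (a standard result): the moment equations are $\kappa\theta = \bar{x}$ and $\kappa\theta^2 = s^2$; dividing the second by the first isolates $\theta$, and substituting back gives $\kappa$:

```math
\hat{\theta} = \frac{s^2}{\bar{x}}, \qquad \hat{\kappa} = \frac{\bar{x}}{\hat{\theta}} = \frac{\bar{x}^2}{s^2}
```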
**Weaknesses:** Not statistically efficient (higher variance than MLE), can produce invalid parameters for skewed distributions, estimates are sensitive to outliers because conventional moments give disproportionate weight to extreme values.
@@ -298,6 +432,53 @@ Console.WriteLine($" Sample mean = {moments[0]:F2}");
Console.WriteLine($" Sample std dev = {moments[1]:F2}");
```
## Method of Percentiles
The Method of Percentiles (also called least-squares fitting or quantile matching) estimates parameters by matching theoretical quantiles of the distribution to empirical quantiles computed from the data.

### Mathematical Formulation
Given a sorted sample $x_{(1)} \leq x_{(2)} \leq \cdots \leq x_{(n)}$, each observation is assigned a plotting position $p_i$ that estimates $F(x_{(i)})$. A common choice is the Weibull plotting position:
```math
p_i = \frac{i}{n + 1}
```
The parameters $\boldsymbol{\theta}$ are then chosen so that the theoretical quantile function (inverse CDF) matches the observed data as closely as possible. For a distribution with quantile function $F^{-1}(p;\,\boldsymbol{\theta})$, the parameters minimize the sum of squared differences:
For a two-parameter distribution, it is sufficient to select two percentiles (e.g., the median and the 84th percentile) and solve the resulting system of two equations:
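As a concrete illustration (not given in the excerpt): for the Normal distribution, $F^{-1}(0.5) = \mu$ and $F^{-1}(0.8413) \approx \mu + \sigma$ (since $\Phi(1) \approx 0.8413$), so matching the empirical median $\hat{x}_{0.50}$ and 84th percentile $\hat{x}_{0.84}$ gives closed-form estimates:

```math
\hat{\mu} = \hat{x}_{0.50}, \qquad \hat{\sigma} = \hat{x}_{0.84} - \hat{x}_{0.50}
```

This also explains the common choice of the 84th percentile: it lies one standard deviation above the mean for the Normal distribution.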
**Strengths:** Intuitive and easy to visualize, always produces estimates, moderately robust to outliers in the tails, useful when expert judgment suggests specific quantile targets.
**Weaknesses:** Uses only selected data points or gives equal weight to all quantiles (not statistically efficient), lower precision than MLE or L-moments for most distributions.
## Estimation Method Comparison
464
+
465
+
The choice of estimation method depends on sample size, data quality, and application requirements. The following table summarizes the key trade-offs:
466
+
467
+
| Method | Efficiency | Small Samples | Robustness | Complexity | Best For |
|--------|------------|---------------|------------|------------|----------|
- **n < 50:** Prefer L-moments. With small samples, robustness matters more than asymptotic efficiency, and L-moment estimates are nearly unbiased.
- **n > 100:** MLE becomes competitive and provides standard errors via Fisher Information, enabling confidence intervals and hypothesis tests.
- **Skewed distributions:** L-moments substantially outperform MOM, because conventional skewness estimates are highly variable for small samples.
- **US flood frequency analysis:** L-moments are recommended by USGS Bulletin 17C [[1]](#1). The Expected Moments Algorithm (EMA) extends the framework to handle censored and historical data.
- **Model selection:** When comparing candidate distributions, MLE enables the use of information criteria (AIC, BIC) for objective model ranking.
## Distribution-Specific Estimation
### Log-Pearson Type III (USGS Bulletin 17C)
@@ -501,26 +682,6 @@ foreach (var (name, dist) in candidates)