Conversation
- np.split: splits array into equal sub-arrays (raises ArgumentException if not divisible) - np.array_split: splits array into sub-arrays allowing unequal division - Supports int (number of sections) and int[] (split indices) - Supports axis parameter for multi-dimensional arrays - Follows NumPy's swap-slice-swap approach for axis handling - Returns views (not copies) matching NumPy behavior - Full test coverage with NumPy 2.4.2 verified behavior NumPy reference: numpy/lib/_shape_base_impl.py (v2.4.2)
Weibull distribution implementation matching NumPy 2.x behavior: - weibull(a, size=None) - NumPy-aligned with optional size - weibull(a, Shape size) - Shape overload - weibull(a, int size) - 1D convenience - weibull(a, params int[] size) - multi-dimensional Features: - Inverse transform method: X = (-ln(1-U))^(1/a) - Validates a >= 0 (throws ArgumentException for negative) - a=0 returns all zeros (NumPy behavior) - a=1 reduces to exponential distribution (mean=1) - Returns float64 dtype Tests cover: - Scalar and array returns - Shape overloads - Validation (negative a throws, a=0 returns zeros) - Statistical properties (a=1 exponential, a=2 mean~0.886) - Non-negative output values - Heavy tail behavior (small a) and concentrated values (large a)
Standard Cauchy (Lorentz) distribution with loc=0, scale=1. Implementation: - standard_cauchy(Shape? size = null) - NumPy-aligned with optional size - standard_cauchy(Shape size) - Shape overload - standard_cauchy(int size) - 1D convenience - standard_cauchy(params int[] size) - multi-dimensional Algorithm: - Inverse transform: X = tan(pi * (U - 0.5)) where U ~ Uniform(0,1) Properties (verified in tests): - Median = 0 (mean/variance undefined due to heavy tails) - Interquartile range = 2 (Q1=-1, Q3=1) - Heavy tails produce extreme values - Returns float64 dtype Tests cover: - Scalar and array returns - Shape overloads - Statistical properties (median, IQR) - Heavy tail behavior - Reproducibility with seed
…ibution) - Draw samples from the von Mises distribution on [-pi, pi] - Matches NumPy 2.4.2 algorithm from distributions.c exactly: - kappa < 1e-8: uniform distribution on circle - 1e-5 <= kappa <= 1e6: rejection sampling algorithm - kappa > 1e6: wrapped normal approximation - Supports mu (mean direction) and kappa (concentration) parameters - kappa >= 0 required (raises ArgumentException otherwise) - Full test coverage including edge cases and statistical properties NumPy reference: numpy/random/src/distributions/distributions.c (v2.4.2)
F distribution (Fisher distribution) for ANOVA tests. Implementation: - f(dfnum, dfden, Shape? size = null) - NumPy-aligned - f(dfnum, dfden, Shape size) - Shape overload - f(dfnum, dfden, int size) - 1D convenience - f(dfnum, dfden, params int[] size) - multi-dimensional Algorithm: - F = (chi2_num * dfden) / (chi2_den * dfnum) - Reuses existing chisquare() which uses gamma() - Added GammaSample/MarsagliaSample helpers for scalar generation Validation: - dfnum > 0 required (throws ArgumentException) - dfden > 0 required (throws ArgumentException) Properties (verified): - Mean = dfden/(dfden-2) for dfden > 2 - All values positive - Returns float64 dtype Tests: 13 covering scalar/array, validation, statistics
Implements the logarithmic series distribution following NumPy 2.4.2 behavior exactly. The distribution returns positive integers k >= 1 with probability P(k) = -p^k / (k * ln(1-p)). Implementation details: - Uses NumPy's rejection sampling algorithm from distributions.c - Parameter validation: p must be in range [0, 1), throws for NaN - Returns int64 dtype (matches NumPy on 64-bit systems) - Edge cases: p=0 returns all 1s, small p produces mostly 1s Algorithm (from NumPy's random_logseries): 1. r = log(1-p) 2. If V >= p, return 1 3. q = 1 - exp(r * U) 4. If V <= q^2, return floor(1 + log(V)/log(q)) 5. If V >= q, return 1 6. Otherwise return 2 Tests cover: - Scalar and array output shapes - Small/high p value distribution characteristics - Validation (negative p, p=1, p>1, NaN) - Statistical mean approximation to theoretical value - All values >= 1 constraint Verified against NumPy 2.4.2 output via Python battletest.
Add standard_normal(Shape? size = null) overload to NumPyRandom: - Returns 0-d NDArray scalar when size is null (matches NumPy behavior) - Uses existing NextGaussian() helper for Box-Muller transform - Delegates to normal(0, 1.0, size) for array generation - Verified: randn, standard_normal, and normal(0,1) produce identical results Statistical verification: - mean ~= 0 (tested: 0.0098) - std ~= 1 (tested: 0.9933) - variance ~= 1 (tested: 0.9867) Tests: 15 test cases covering shapes, statistics, reproducibility
- Implements Student's t distribution matching NumPy 2.4.2 API - Supports df (degrees of freedom) parameter - Supports size parameter for array output - Includes unit tests
- Implements triangular distribution matching NumPy 2.4.2 API - Parameters: left, mode, right for triangular shape - Supports size parameter for array output - Includes unit tests validating shape, dtype, and bounds
Exhaustive inventory of all public APIs exposed by NumPy 2.4.2 as np.* covering constants, data types, array creation, manipulation, math functions, statistics, linear algebra, random sampling, FFT, and more. This serves as the authoritative reference for API parity work.
Documents all 142 np.* APIs in NumSharp 0.41.x with status: - 118 working, 12 partial, 12 broken/stub - Covers array creation, math, statistics, manipulation, random sampling - Tracks known behavioral differences from NumPy 2.x - Lists missing functions and TODO items for future work
…ions - hsplit: splits horizontally (column-wise) along axis 1 for 2D+, axis 0 for 1D - vsplit: splits vertically (row-wise) along axis 0, requires ndim >= 2 - dsplit: splits along depth (axis 2), requires ndim >= 3 All support both integer (equal division) and indices array overloads. Delegate to existing np.split with appropriate axis parameter. NumPy-aligned parameter validation and error messages.
Comprehensive tests based on NumPy 2.4.2 behavior: - hsplit: 1D/2D/3D arrays, indices overload, 0D error handling - vsplit: 2D/3D arrays, indices overload, 1D error handling - dsplit: 3D/4D arrays, indices overload, 1D/2D error handling Verifies shape correctness and element values match NumPy output.
Draw samples from the Dirichlet distribution - a distribution over vectors that sum to 1. Used for Bayesian inference and topic modeling. Algorithm: Draw Y_i ~ Gamma(alpha_i, 1), then normalize X = Y / sum(Y). Supports double[] and NDArray alpha, Shape or int[] size parameters. Output shape is (*size, k) where k is length of alpha.
Draw samples from the Gumbel distribution used to model distribution of maximum/minimum values. Common in extreme value statistics. PDF: p(x) = (1/scale) * exp(-(x-loc)/scale) * exp(-exp(-(x-loc)/scale)) Mean = loc + scale * γ (Euler-Mascheroni ≈ 0.5772) Algorithm matches NumPy's random_gumbel in distributions.c.
Draw samples from hypergeometric distribution - models drawing without replacement (e.g., white/black marbles from an urn). Parameters: ngood, nbad (populations), nsample (draws). Returns: number of "good" items in sample. Mean = nsample * ngood / (ngood + nbad) Uses complement optimization for large samples.
Draw samples from Laplace distribution - similar to Gaussian but with sharper peak and fatter tails. Represents difference between two independent exponential random variables. PDF: f(x; μ, λ) = (1/2λ) * exp(-|x - μ| / λ) Algorithm matches NumPy's random_laplace in distributions.c.
Draw samples from logistic distribution - similar to normal but with heavier tails. Used in extreme value problems, finance, growth modeling. PDF: f(x; μ, s) = exp(-(x-μ)/s) / (s * (1 + exp(-(x-μ)/s))^2) Mean = loc, Variance = scale^2 * π^2 / 3 Uses inverse transform: X = loc + scale * ln(U / (1 - U))
Draw samples from multinomial distribution - multivariate generalization of binomial. Models n experiments where each yields one of k outcomes. Algorithm from NumPy: for each category i, draw binomial with adjusted probability, subtract from remaining trials. Output shape is (*size, k) where k is length of pvals. Each row sums to n.
Draw samples from N-dimensional normal distribution specified by mean vector and covariance matrix. Algorithm: Cholesky decomposition L = cholesky(cov), then X = mean + L @ Z where Z ~ N(0, I). Supports check_valid modes: "warn", "raise", "ignore". Includes fallback for nearly positive-semidefinite matrices.
Draw samples from negative binomial - models number of failures before n successes with probability p per trial. Mean = n * (1-p) / p, Variance = n * (1-p) / p^2 Algorithm: Gamma-Poisson mixture (Y ~ Gamma, X ~ Poisson(Y)). Includes Marsaglia-Tsang gamma sampling and Knuth/rejection Poisson.
Draw samples from noncentral chi-square - generalization of chi-square with non-centrality parameter. Mean = df + nonc. NumPy algorithm: - nonc == 0: return chisquare(df) - df > 1: return chisquare(df-1) + (N(0,1) + sqrt(nonc))^2 - df <= 1: return chisquare(df + 2*Poisson(nonc/2))
Draw samples from noncentral F distribution - used for power analysis in hypothesis testing when null hypothesis is false. Algorithm from NumPy: t = noncentral_chisquare(dfnum, nonc) * dfden return t / (chisquare(dfden) * dfnum)
Draw samples from Pareto II (Lomax) distribution - not classical Pareto. Used for modeling heavy-tailed phenomena like wealth distribution. PDF: f(x; a) = a / (1 + x)^(a+1) for x >= 0 Mean = 1/(a-1) for a > 1 Uses inverse transform: X = (1 / U^(1/a)) - 1
Draw samples in [0, 1] from power distribution with exponent a-1. Also known as power function distribution - inverse of Pareto. PDF: P(x; a) = a * x^(a-1) for 0 <= x <= 1, a > 0 Special case of Beta distribution. Uses inverse transform: U^(1/a)
Draw samples from Rayleigh distribution - models wind speed when East/North components have identical zero-mean Gaussian distributions. PDF: P(x; scale) = (x/scale^2) * exp(-x^2/(2*scale^2)) Mean = scale * sqrt(π/2) ≈ 1.253 * scale Algorithm: scale * sqrt(-2 * log(1 - U))
Draw samples from standard exponential distribution (scale=1). Equivalent to exponential(scale=1.0, size=size). Mean = 1, Variance = 1 Uses inverse transform: X = -log(1 - U) where U ~ Uniform(0, 1)
Draw samples from standard Gamma distribution (scale=1). For different scale, multiply result: scale * standard_gamma(shape). PDF: p(x) = x^(shape-1) * e^(-x) / Gamma(shape) shape=0 returns all zeros. Reuses SampleStandardGamma() from np.random.standard_t.cs.
Draw samples from Wald (inverse Gaussian) distribution - first studied in relation to Brownian motion. Approaches Gaussian as scale → ∞. PDF: P(x;μ,λ) = √(λ/(2πx³)) * exp(-λ(x-μ)²/(2μ²x)) Algorithm from NumPy's random_wald in distributions.c.
Draw samples from Zipf distribution - models word frequency in texts, city sizes, and many power-law phenomena. PMF: p(k) = k^(-a) / ζ(a) where ζ is Riemann zeta function Requires a > 1. Returns positive integers (long). Uses rejection sampling algorithm from NumPy's random_zipf.
This was referenced Apr 12, 2026
Closed
Closed
Closed
Closed
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Major refactor of NumSharp's random sampling module to achieve 1-to-1 parity with NumPy 2.x.
int[](non-params) +params long[]size overloads across all distributionsStats
Breaking Changes
rand(5, 10)no longer compilesrand(5L, 10L)ornew Shape(5, 10)[0]or use.GetValue()Related
Test plan