Bezierv: Bézier Random Variables

bezierv is a Python package for fitting, analyzing, and sampling from Bézier random variables. A Bézier random variable is defined by a small set of control points that shape its CDF, which makes it flexible enough to approximate virtually any continuous distribution shape.

New to Bézier distributions?

Start with our Quick Start Guide for a hands-on introduction, or explore the Interactive Demo to see Bézier curves in action.


✨ Key Features

  • 🎯 Flexible Fitting: Adapt to virtually any continuous distribution shape
  • ⚡ Multiple Algorithms: 3 MSE algorithms (projgrad, nonlinear, neldermead) plus MLE fitting
  • 🔄 Convolution Support: Compute sums of random variables exactly or via Monte Carlo
  • 🎮 Interactive Tools: Browser-based curve editor with real-time updates
  • 📊 Rich Visualization: Built-in plotting for CDFs, PDFs, and control points
  • 🔢 Statistical Functions: Moments, quantiles, sampling, and probability calculations

Quick Start

Installation

Install bezierv using pip:

pip install bezierv

Basic Example

Fit a Bézier distribution to your data in just a few lines:

import numpy as np
from bezierv import DistFit

# Generate sample data (replace with your own)
rng = np.random.default_rng(42)
data = rng.beta(2, 5, 1000)  # Skewed distribution

# Fit Bézier distribution
fitter = DistFit(data, n=5)  # 5 control segments (6 control points)
bezier_rv, mse = fitter.fit(method='mse', algorithm='projgrad')

print(f"Fit completed with MSE: {mse:.6f}")

# Use the fitted distribution
samples = bezier_rv.random(100)   # Generate new samples
mean = bezier_rv.get_mean()       # Compute mean
q90 = bezier_rv.quantile(0.90)    # 90th percentile
cdf_val = bezier_rv.cdf_x(0.5)    # P(X ≤ 0.5)

print(f"Mean: {mean:.3f}, 90% quantile: {q90:.3f}")
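
To make the quantile and sampling calls above less of a black box, here is a minimal pure-Python sketch of how they could work for a curve given by control points. It assumes the parametrization used later in this page (`controls_x` for the domain, `controls_z` for CDF values) and solves z(t) = u by bisection. The helper names `bezier`, `quantile`, and `sample` are hypothetical and are not bezierv's API or its actual implementation.

```python
import random
from math import comb


def bezier(t, controls):
    """Evaluate a Bézier curve with the given control values at t in [0, 1]."""
    n = len(controls) - 1
    return sum(comb(n, i) * t**i * (1 - t) ** (n - i) * c
               for i, c in enumerate(controls))


def quantile(u, controls_x, controls_z, tol=1e-10):
    """Solve z(t) = u by bisection, then return x(t).

    Assumes controls_z is non-decreasing from 0 to 1, so z(t) is monotone.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bezier(mid, controls_z) < u:
            lo = mid
        else:
            hi = mid
    return bezier(0.5 * (lo + hi), controls_x)


def sample(size, controls_x, controls_z, seed=None):
    """Inverse-transform sampling: push uniform draws through the quantile."""
    rng = random.Random(seed)
    return [quantile(rng.random(), controls_x, controls_z) for _ in range(size)]


controls_x = [0.0, 0.25, 0.75, 1.0]  # X-coordinates (domain)
controls_z = [0.0, 0.1, 0.9, 1.0]    # Z-coordinates (CDF values)

median = quantile(0.5, controls_x, controls_z)
draws = sample(1000, controls_x, controls_z, seed=42)
print(f"Median: {median:.3f}, sample mean: {sum(draws) / len(draws):.3f}")
```

Bisection is used here only because it is simple and robust for a monotone function; a real implementation may well use a faster root finder.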

Visualization

Compare your fitted distribution with the empirical data:

import matplotlib.pyplot as plt

# Create side-by-side plots
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))

# Plot CDF comparison
bezier_rv.plot_cdf(data, ax=ax1)
ax1.set_title("Cumulative Distribution Function")

# Plot PDF
bezier_rv.plot_pdf(ax=ax2)
ax2.set_title("Probability Density Function")

plt.tight_layout()
plt.show()

🎮 Interactive Visualization

Launch an interactive Bézier curve editor to explore how control points affect distribution shape:

from bezierv.classes.bezierv import InteractiveBezierv
from bokeh.plotting import curdoc

# Define initial control points
controls_x = [0.0, 0.25, 0.75, 1.0]  # X-coordinates (domain)
controls_z = [0.0, 0.1, 0.9, 1.0]    # Z-coordinates (CDF values)

# Create interactive editor
editor = InteractiveBezierv(controls_x, controls_z)

# Launch in Bokeh server
curdoc().add_root(editor.layout)
curdoc().title = "Bézier Distribution Editor"

Save as bezier_app.py and run:

python -m bokeh serve --show bezier_app.py

This opens an interactive tool in your browser where you can:

  • ✏️ Edit control points by clicking and dragging with the Point Draw Tool
  • ➕ Add or remove points to change complexity (click with the Point Draw Tool to add; use the delete button to remove)
  • 📊 View real-time updates of both the CDF and the PDF
  • 💾 Export control points as CSV
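
The effect of moving control points can also be seen without the browser tool. The sketch below evaluates the curve from its control points with the Bernstein basis and derives the density as z'(t) / x'(t). It assumes strictly increasing `controls_x` so that x'(t) is never zero. The function `curve_points` is a hypothetical helper written for this illustration, not part of bezierv.

```python
from math import comb


def bernstein(i, n, t):
    """Bernstein basis polynomial B_{i,n}(t)."""
    return comb(n, i) * t**i * (1 - t) ** (n - i)


def curve_points(controls_x, controls_z, num=5):
    """Return (x, cdf, pdf) triples at `num` evenly spaced values of t."""
    n = len(controls_x) - 1
    dx = [controls_x[i + 1] - controls_x[i] for i in range(n)]
    dz = [controls_z[i + 1] - controls_z[i] for i in range(n)]
    points = []
    for k in range(num):
        t = k / (num - 1)
        x = sum(bernstein(i, n, t) * controls_x[i] for i in range(n + 1))
        z = sum(bernstein(i, n, t) * controls_z[i] for i in range(n + 1))
        # The derivative of a degree-n Bézier curve is n times a
        # degree-(n-1) Bézier curve built from consecutive differences.
        x_prime = n * sum(bernstein(i, n - 1, t) * dx[i] for i in range(n))
        z_prime = n * sum(bernstein(i, n - 1, t) * dz[i] for i in range(n))
        points.append((x, z, z_prime / x_prime))
    return points


controls_x = [0.0, 0.25, 0.75, 1.0]
steep = curve_points(controls_x, [0.0, 0.1, 0.9, 1.0])
flat = curve_points(controls_x, [0.0, 0.3, 0.7, 1.0])

for (x, z, pdf), (_, z2, pdf2) in zip(steep, flat):
    print(f"x={x:.3f}  cdf={z:.3f} pdf={pdf:.3f}  |  cdf={z2:.3f} pdf={pdf2:.3f}")
```

Pulling the two inner Z-coordinates toward the diagonal flattens the density: at the midpoint it drops from 1.2 to about 0.93, while the tails gain mass.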


🔄 Convolution: Sums of Random Variables

Monte Carlo Convolution (Fast)

import numpy as np
from bezierv import DistFit, Convolver

# Fit two separate distributions
rng = np.random.default_rng(42)
data1 = rng.gamma(2, 2, 1000)
data2 = rng.exponential(1, 1000)

rv1, _ = DistFit(data1, n=4).fit(method='mse', algorithm='projgrad')
rv2, _ = DistFit(data2, n=4).fit(method='mse', algorithm='projgrad')

# Compute their sum via Monte Carlo
convolver = Convolver([rv1, rv2])
sum_rv, _ = convolver.convolve(n_sims=10000, rng=42)

print(f"Sum mean: {sum_rv.get_mean():.3f}")
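
Conceptually, a Monte Carlo convolution draws one value from each summand, adds them, and repeats. The sketch below shows that idea with the standard library only, using gamma and exponential samplers as stand-ins for the fitted variables. `monte_carlo_sum` is a hypothetical helper, and how `Convolver` turns the simulated sums back into a Bézier random variable is not covered here.

```python
import random
import statistics


def monte_carlo_sum(samplers, n_sims, seed=None):
    """Simulate n_sims realizations of the sum of independent variables.

    Each sampler is a callable taking a random.Random and returning one draw.
    """
    rng = random.Random(seed)
    return [sum(draw(rng) for draw in samplers) for _ in range(n_sims)]


samplers = [
    lambda rng: rng.gammavariate(2.0, 2.0),  # shape 2, scale 2 (mean 4)
    lambda rng: rng.expovariate(1.0),        # rate 1 (mean 1)
]

sums = monte_carlo_sum(samplers, n_sims=10_000, seed=42)
sum_mean = statistics.fmean(sums)
print(f"Simulated mean of the sum: {sum_mean:.3f} (theoretical value: 5)")
```

The simulated sums can then be treated like any other data set, for example as input to a new fit.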

🔧 Fitting Algorithms

Choose the best algorithm for your use case:

Objective | Algorithm              | Call
----------|------------------------|--------------------------------------
MSE       | Projected Gradient     | method='mse', algorithm='projgrad'
MSE       | Nonlinear Optimization | method='mse', algorithm='nonlinear'
MSE       | Nelder-Mead            | method='mse', algorithm='neldermead'
MLE       | Primal Gradient        | method='mle'

Algorithm Comparison Example

import numpy as np
from bezierv import DistFit

rng = np.random.default_rng(42)
data = rng.beta(2, 5, 1000)

# MSE-based algorithms
mse_algorithms = ["projgrad", "nonlinear", "neldermead"]
for algo in mse_algorithms:
    fitter = DistFit(data, n=5)
    bz, mse = fitter.fit(method='mse', algorithm=algo)
    print(f"mse/{algo:12s}: MSE = {mse:.6f}, Mean = {bz.get_mean():.4f}")

# MLE fitting
fitter = DistFit(data, n=5)
bz_mle, nll = fitter.fit(method='mle')
print(f"mle/primgrad  : NLL = {nll:.6f}, Mean = {bz_mle.get_mean():.4f}")
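
The two objectives reported above measure different things, so their values are not directly comparable. As a rough illustration, the sketch below computes an MSE between a candidate CDF and the empirical CDF, and a negative log-likelihood (NLL) from a candidate PDF. The empirical CDF convention used here (i/m at the i-th sorted point) is an assumption and may differ from what bezierv uses internally. Both function names are hypothetical.

```python
import math


def empirical_cdf_mse(data, cdf):
    """Mean squared gap between cdf(x) and the empirical CDF at the data."""
    xs = sorted(data)
    m = len(xs)
    return sum((cdf(x) - (i + 1) / m) ** 2 for i, x in enumerate(xs)) / m


def negative_log_likelihood(data, pdf):
    """NLL of the data under a candidate density (lower is better)."""
    return -sum(math.log(pdf(x)) for x in data)


data = [0.1, 0.3, 0.5, 0.7, 0.9]

# Candidate 1: uniform on [0, 1]
mse_uniform = empirical_cdf_mse(data, lambda x: x)
nll_uniform = negative_log_likelihood(data, lambda x: 1.0)

# Candidate 2: CDF x^2 on [0, 1], with density 2x
mse_square = empirical_cdf_mse(data, lambda x: x * x)
nll_square = negative_log_likelihood(data, lambda x: 2.0 * x)

print(f"uniform: MSE = {mse_uniform:.4f}, NLL = {nll_uniform:.4f}")
print(f"x^2    : MSE = {mse_square:.4f}, NLL = {nll_square:.4f}")
```

For this evenly spaced toy data both criteria prefer the uniform candidate, but on real data the MSE and MLE fits can rank candidates differently.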

📊 Advanced Examples

Multi-Modal Distributions

Fit complex, multi-modal distributions:

import numpy as np
from bezierv import DistFit

# Create bimodal data (seeded for reproducibility)
rng = np.random.default_rng(42)
data_bimodal = np.concatenate([
    rng.normal(2, 0.5, 500),    # First mode
    rng.normal(8, 0.8, 500)     # Second mode
])

# Use more control points for complex shapes
fitter = DistFit(data_bimodal, n=10)
bimodal_rv, mse = fitter.fit(method='mse', algorithm='nonlinear')

# Visualize the complex fit
bimodal_rv.plot_pdf()

🎯 Best Practices

Choosing the Number of Control Points

  • Simple data: n=3-5 (few parameters, fast fitting)
  • Complex shapes: n=6-10 (more flexibility)
  • Multi-modal: n=8-15 (capture multiple peaks)

Overfitting

More control points are not always better. Start with a small n and increase it only if the fit is clearly inadequate.
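
One way to apply this advice is to increase n step by step and stop once the error no longer improves by a meaningful margin. The sketch below is a generic stopping rule, not a bezierv feature: `choose_n` and its `min_improvement` threshold are hypothetical, and the error values in `toy_errors` are made-up placeholders rather than real fit results.

```python
def choose_n(fit_error, candidates, min_improvement=0.1):
    """Pick the last n whose error improved on the previous one
    by at least `min_improvement` (as a fraction of the previous error).

    fit_error: callable mapping n to a fit error (lower is better).
    candidates: increasing list of n values to try.
    """
    best_n = candidates[0]
    best_err = fit_error(best_n)
    for n in candidates[1:]:
        err = fit_error(n)
        if best_err - err < min_improvement * best_err:
            break  # extra control points no longer pay off
        best_n, best_err = n, err
    return best_n, best_err


# Placeholder errors standing in for real fits (illustrative only).
toy_errors = {3: 4e-3, 4: 1e-3, 5: 9.5e-4, 6: 9.4e-4}

n, err = choose_n(toy_errors.__getitem__, [3, 4, 5, 6])
print(f"Selected n={n} with error {err:.1e}")
```

In practice `fit_error` could wrap a call such as `DistFit(data, n=n).fit(method='mse', algorithm='projgrad')` and return the reported MSE. Evaluating the error on held-out data rather than on the training sample would give a stricter guard against overfitting.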

Algorithm Selection Guide

  1. Start with method='mse', algorithm='projgrad': it is the fastest option and works well for most cases
  2. Try method='mse', algorithm='nonlinear' if you need the lowest possible MSE
  3. Use method='mle' to fit by maximum likelihood (returns the negative log-likelihood, NLL, instead of the MSE)

Performance Tips

import numpy as np
from bezierv import DistFit

# For large datasets, consider subsampling for the initial fit
# (`data` is your 1-D sample array)
if len(data) > 10000:
    rng = np.random.default_rng(42)
    subset = rng.choice(data, 5000, replace=False)
    fitter = DistFit(subset, n=5)
    quick_fit, _ = fitter.fit(method='mse', algorithm='projgrad')

📄 Citation

If you use bezierv in your research, please cite the accompanying paper (forthcoming on arXiv):

@article{leiva2026bezierv,
  title   = {Computational Framework for {B\'{e}zier} Distributions},
  author  = {Leiva, Esteban and Medaglia, Andr\'{e}s L. and Zuluaga, Luis F.},
  year    = {2026},
  note    = {Manuscript under review}
}