Bezierv: Flexible Bézier Random Variables

bezierv is a Python package for fitting, analyzing, and sampling from Bézier-based random variables. Bézier random variables can adapt to virtually any continuous distribution shape.

New to Bézier distributions?

Start with our Quick Start Guide for a hands-on introduction, or explore the Interactive Demo to see Bézier curves in action.

✨ Key Features

🎯 Flexible Fitting: Adapt to virtually any continuous distribution shape
⚡ Multiple Algorithms: Choose from four optimization methods
🔄 Convolution Support: Compute sums of random variables exactly or via Monte Carlo
🎮 Interactive Tools: Browser-based curve editor with real-time updates
📊 Rich Visualization: Built-in plotting for CDFs, PDFs, and control points
🔢 Statistical Functions: Moments, quantiles, sampling, and probability calculations

Quick Start

Installation

Install bezierv using pip:

pip install bezierv

Basic Example

Fit a Bézier distribution to your data in just a few lines:

import numpy as np
from bezierv.classes.distfit import DistFit

# Generate sample data (replace with your own)
np.random.seed(42)
data = np.random.beta(2, 5, 1000)  # Skewed distribution

# Fit Bézier distribution
fitter = DistFit(data, n=5)  # 5 control segments (6 control points)
bezier_rv, mse = fitter.fit(method="projgrad")

print(f"Fit completed with MSE: {mse:.6f}")

# Use the fitted distribution
samples = bezier_rv.random(100)   # Generate new samples
mean = bezier_rv.get_mean()       # Compute mean
q90 = bezier_rv.quantile(0.90)    # 90th percentile
cdf_val = bezier_rv.cdf_x(0.5)    # P(X ≤ 0.5)

print(f"Mean: {mean:.3f}, 90% quantile: {q90:.3f}")
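For intuition about what a quantile lookup and sampling involve, the following is a minimal pure-Python sketch, not bezierv's implementation. It assumes the CDF is the parametric Bézier curve (x(t), z(t)), where the x control points span the domain and the z control points are CDF values. The control points are illustrative, and the helper names `bezier_point` and `bezier_quantile` are hypothetical.

```python
import random
from math import comb


def bezier_point(controls, t):
    """Evaluate a 1-D Bézier curve (Bernstein form) at t in [0, 1]."""
    n = len(controls) - 1
    return sum(comb(n, i) * t**i * (1 - t) ** (n - i) * c
               for i, c in enumerate(controls))


def bezier_quantile(controls_x, controls_z, p, tol=1e-10):
    """Return x with F(x) = p by bisecting on the curve parameter t.

    Assumes controls_z is nondecreasing, so z(t) is nondecreasing too.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if bezier_point(controls_z, mid) < p:
            lo = mid
        else:
            hi = mid
    return bezier_point(controls_x, (lo + hi) / 2)


# Illustrative control points (not fitted to any data)
controls_x = [0.0, 0.25, 0.75, 1.0]
controls_z = [0.0, 0.1, 0.9, 1.0]

median = bezier_quantile(controls_x, controls_z, 0.5)

# Inverse-transform sampling: push uniform draws through the quantile function
rng = random.Random(42)
samples = [bezier_quantile(controls_x, controls_z, rng.random())
           for _ in range(1000)]
print(f"Median: {median:.3f}, sample mean: {sum(samples) / len(samples):.3f}")
```

Bisection is used here only because it is simple and robust; a library implementation may well use a different root finder.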

Visualization

Compare your fitted distribution with the empirical data:

import matplotlib.pyplot as plt

# Create side-by-side plots
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))

# Plot CDF comparison
bezier_rv.plot_cdf(data, ax=ax1)
ax1.set_title("Cumulative Distribution Function")

# Plot PDF
bezier_rv.plot_pdf(ax=ax2)
ax2.set_title("Probability Density Function")

plt.tight_layout()
plt.show()

🎮 Interactive Visualization

Launch an interactive Bézier curve editor to explore how control points affect distribution shape:

from bezierv.classes.bezierv import InteractiveBezierv
from bokeh.plotting import curdoc

# Define initial control points
controls_x = [0.0, 0.25, 0.75, 1.0]  # X-coordinates (domain)
controls_z = [0.0, 0.1, 0.9, 1.0]    # Z-coordinates (CDF values)

# Create interactive editor
editor = InteractiveBezierv(controls_x, controls_z)

# Launch in Bokeh server
curdoc().add_root(editor.layout)
curdoc().title = "Bézier Distribution Editor"

Save as bezier_app.py and run:

bokeh serve --show bezier_app.py

This opens an interactive tool in your browser where you can:

✏️ Edit control points by clicking and dragging
➕ Add/remove points to change complexity
📊 View real-time updates of both CDF and PDF
💾 Export control points as CSV
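To see without the browser tool how control points shape the curve, here is a small standard-library sketch that evaluates the Bézier curve for the control points used in the editor example. It is an illustration of the underlying Bernstein-polynomial formula, not code from bezierv, and `bezier_point` is a hypothetical helper name.

```python
from math import comb


def bezier_point(controls, t):
    """Evaluate a 1-D Bézier curve (Bernstein form) at t in [0, 1]."""
    n = len(controls) - 1
    return sum(comb(n, i) * t**i * (1 - t) ** (n - i) * c
               for i, c in enumerate(controls))


controls_x = [0.0, 0.25, 0.75, 1.0]  # X-coordinates (domain)
controls_z = [0.0, 0.1, 0.9, 1.0]    # Z-coordinates (CDF values)

# Each pair (x(t), z(t)) is a point (x, F(x)) on the CDF
curve = [(bezier_point(controls_x, k / 10), bezier_point(controls_z, k / 10))
         for k in range(11)]

for x, z in curve:
    print(f"x = {x:.3f}  F(x) = {z:.3f}")
```

Changing an interior entry of `controls_z` and rerunning shows how probability mass shifts while the endpoints stay pinned at 0 and 1.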

🔄 Convolution: Sums of Random Variables

Monte Carlo Convolution (Fast)

import numpy as np
from bezierv.classes.convolver import Convolver
from bezierv.classes.distfit import DistFit

# Fit two separate distributions
data1 = np.random.gamma(2, 2, 1000)
data2 = np.random.exponential(1, 1000)

rv1 = DistFit(data1, n=4).fit()[0]
rv2 = DistFit(data2, n=4).fit()[0]

# Compute their sum via Monte Carlo
convolver = Convolver([rv1, rv2])
sum_rv = convolver.convolve(n_sims=10000, rng=42)

print(f"Sum mean: {sum_rv.get_mean():.3f}")
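The idea behind a Monte Carlo convolution can be sketched with the standard library alone: draw from each variable, add the draws pairwise, and treat the sums as a sample from the distribution of the sum. How `Convolver.convolve` works internally is not described here, so this illustrates the general technique under that assumption; the draws below come from Python's `random` module as stand-ins for fitted Bézier variables.

```python
import random
from statistics import fmean

rng = random.Random(42)
n_sims = 10_000

# Stand-ins for draws from two fitted random variables
draws1 = [rng.gammavariate(2, 2) for _ in range(n_sims)]  # shape 2, scale 2
draws2 = [rng.expovariate(1.0) for _ in range(n_sims)]    # rate 1

# Pairwise sums form a sample from the distribution of X1 + X2
sums = [a + b for a, b in zip(draws1, draws2)]

print(f"Sum mean: {fmean(sums):.3f}")
```

A new distribution can then be fitted to `sums` like any other data set, with accuracy improving as `n_sims` grows.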

🔧 Fitting Algorithms

Choose the best algorithm for your use case:

Algorithm	Method Call
Projected Gradient	`method="projgrad"`
Projected Subgradient	`method="projsubgrad"`
Nonlinear Optimization	`method="nonlinear"`
Nelder-Mead	`method="neldermead"`

Algorithm Comparison Example

methods = ["projgrad", "projsubgrad", "nonlinear", "neldermead"]
results = {}

for method in methods:
    fitter = DistFit(data, n=5)
    bz, mse = fitter.fit(method=method, max_iter_PG=1000)
    results[method] = {"mse": mse, "mean": bz.get_mean()}
    print(f"{method:12s}: MSE = {mse:.6f}, Mean = {bz.get_mean():.6f}")
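When comparing MSE values across methods, it helps to know what kind of quantity is being measured. The sketch below computes a mean squared error between a candidate CDF and the empirical CDF of the data. Whether bezierv's reported `mse` uses exactly this definition, including the (i + 1) / m empirical-CDF convention, is an assumption; the data and the uniform candidate CDF are toy values for illustration.

```python
def empirical_cdf(sorted_data):
    """Empirical CDF values at each sorted observation: (i + 1) / m."""
    m = len(sorted_data)
    return [(i + 1) / m for i in range(m)]


def cdf_mse(model_cdf, data):
    """Mean squared gap between a model CDF and the empirical CDF."""
    xs = sorted(data)
    emp = empirical_cdf(xs)
    return sum((model_cdf(x) - e) ** 2 for x, e in zip(xs, emp)) / len(xs)


def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1], used as a toy candidate."""
    return min(max(x, 0.0), 1.0)


toy_data = [0.5, 0.1, 0.9, 0.4]  # toy observations, not real results
toy_mse = cdf_mse(uniform_cdf, toy_data)
print(f"MSE against empirical CDF: {toy_mse:.5f}")
```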

📊 Advanced Examples

Fit complex, multi-modal distributions:

# Create bimodal data
data_bimodal = np.concatenate([
    np.random.normal(2, 0.5, 500),    # First mode
    np.random.normal(8, 0.8, 500)     # Second mode
])

# Use more control points for complex shapes
fitter = DistFit(data_bimodal, n=10)
bimodal_rv, mse = fitter.fit(method="nonlinear")

# Visualize the complex fit
bimodal_rv.plot_pdf()

🎯 Best Practices

Choosing the Number of Control Points

Here n is the argument passed to DistFit; a fit with n segments uses n + 1 control points.

Simple data: n=3–5 (few parameters, fast fitting)
Complex shapes: n=6–10 (more flexibility)
Multi-modal: n=8–15 (capture multiple peaks)

Overfitting

More control points are not always better. Start simple and increase n only if the fit is inadequate.
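One way to apply this advice is to raise n step by step and stop once the MSE no longer improves meaningfully. The helper below is a hypothetical sketch, not part of bezierv: `choose_n` and its `min_gain` threshold are illustrative names, and the MSE values in the demo are made-up placeholders. With bezierv, `fit_mse` could be something like `lambda n: DistFit(data, n=n).fit(method="projgrad")[1]`.

```python
def choose_n(fit_mse, candidates, min_gain=0.1):
    """Return (n, mse) for the last n that cut the MSE by at least
    `min_gain` (as a fraction of the previous best MSE)."""
    best_n = candidates[0]
    best_mse = fit_mse(best_n)
    for n in candidates[1:]:
        mse = fit_mse(n)
        if best_mse - mse < min_gain * best_mse:
            break  # improvement too small to justify more control points
        best_n, best_mse = n, mse
    return best_n, best_mse


# Placeholder MSE values standing in for real fits (purely illustrative)
placeholder_mse = {3: 0.010, 4: 0.004, 5: 0.0038, 6: 0.0037}

chosen_n, chosen_mse = choose_n(placeholder_mse.get, [3, 4, 5, 6])
print(f"Chosen n = {chosen_n} (MSE = {chosen_mse})")
```

A relative threshold is used so the rule behaves the same regardless of the scale of the MSE; the 10% default is arbitrary and worth tuning.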

Algorithm Selection Guide

Start with projgrad: it is the fastest and works well for most cases
Try nonlinear if you need the highest accuracy and can tolerate an occasional failed fit
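If a failed fit is acceptable only when there is a backup, the two recommendations can be combined in a fallback wrapper. This is a hypothetical sketch: it assumes a failed fit surfaces as a Python exception, which may not match how bezierv reports failures, and `fit_with_fallback` is not a bezierv function. With bezierv, `fit` could be `lambda m: DistFit(data, n=5).fit(method=m)`; the demo uses a stub so it runs on its own.

```python
def fit_with_fallback(fit, preferred="nonlinear", fallback="projgrad"):
    """Try the preferred method; on any exception, use the fallback.

    `fit` is a callable taking a method name and returning the fit result.
    """
    try:
        return preferred, fit(preferred)
    except Exception as exc:
        print(f"{preferred} failed ({exc}); falling back to {fallback}")
        return fallback, fit(fallback)


def stub_fit(method):
    """Stand-in for a real fit call; pretends the nonlinear solver fails."""
    if method == "nonlinear":
        raise RuntimeError("solver did not converge")
    return f"result from {method}"


used_method, result = fit_with_fallback(stub_fit)
print(used_method, "->", result)
```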

Performance Tips

# For large datasets, consider subsampling for initial fit
if len(data) > 10000:
    subset = np.random.choice(data, 5000, replace=False)
    fitter = DistFit(subset, n=5)
    quick_fit, _ = fitter.fit(method="projgrad")

📚 Next Steps

🔧 API Reference - Complete function documentation
📖 Tutorials - Step-by-step learning with examples
🐛 Issues - Report bugs or request features