Statistical Moments Module
This module provides functions for estimating and shrinking statistical moments (mean and covariance matrix) of asset returns.
It includes:
- estimate_sample_moments: Computes the weighted sample mean and covariance.
- shrink_mean_jorion: Implements the Bayes-Stein shrinkage estimator for the mean vector.
- shrink_covariance_ledoit_wolf: Implements the Ledoit-Wolf shrinkage estimator for the covariance matrix.
- pyvallocation.moments.estimate_sample_moments(R: numpy.ndarray | pandas.Series | pandas.DataFrame, p: numpy.ndarray | pandas.Series | pandas.DataFrame) → Tuple[numpy.ndarray | pandas.Series | pandas.DataFrame, numpy.ndarray | pandas.Series | pandas.DataFrame]
Estimates the weighted mean vector and covariance matrix from scenarios.
This function computes the first two statistical moments (mean and covariance) of asset returns, given a set of scenarios and their associated probabilities. The scenarios R represent different possible outcomes for asset returns, and p represents the probability of each scenario.
- Parameters:
R (ArrayLike) – A 2D array-like object (e.g., numpy.ndarray, pandas.DataFrame) of shape (T, N), where T is the number of scenarios/observations and N is the number of assets. Each row represents a scenario of asset returns.
p (ArrayLike) – A 1D array-like object (e.g., numpy.ndarray, pandas.Series) of shape (T,), representing the probabilities associated with each scenario in R. These probabilities must be non-negative and sum to one.
- Returns:
- A tuple containing:
mu (ArrayLike): The weighted mean vector of asset returns. If R or p were pandas objects, mu will be a pandas.Series.
S (ArrayLike): The weighted covariance matrix of asset returns. If R or p were pandas objects, S will be a pandas.DataFrame.
- Return type:
Tuple[ArrayLike, ArrayLike]
- Raises:
ValueError – If p has a length mismatch with R, or if p contains negative values or does not sum to one.
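The probability-weighted moments described above can be sketched in plain NumPy. This is a minimal illustration, not the library's implementation: the helper name weighted_moments is hypothetical, and the sketch assumes no small-sample correction is applied to the covariance (the documentation does not say whether the library applies one).

```python
import numpy as np

def weighted_moments(R, p):
    """Probability-weighted mean and covariance (illustrative sketch only)."""
    R = np.asarray(R, dtype=float)
    p = np.asarray(p, dtype=float)
    # Mirror the documented checks: matching length, non-negative, sums to one.
    if R.ndim != 2 or p.shape != (R.shape[0],):
        raise ValueError("R must have shape (T, N) and p must have shape (T,)")
    if np.any(p < 0) or not np.isclose(p.sum(), 1.0):
        raise ValueError("p must be non-negative and sum to one")
    mu = p @ R                       # weighted mean, shape (N,)
    X = R - mu                       # scenarios centered on the weighted mean
    S = (X * p[:, None]).T @ X       # weighted covariance, shape (N, N)
    return mu, S

# Three scenarios for two assets, with unequal scenario probabilities.
R = np.array([[0.01, 0.02],
              [0.03, -0.01],
              [-0.02, 0.00]])
p = np.array([0.5, 0.3, 0.2])
mu, S = weighted_moments(R, p)
```

With equal probabilities this reduces to the ordinary sample mean and the 1/T-normalised sample covariance.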
- pyvallocation.moments.shrink_covariance_ledoit_wolf(R: numpy.ndarray | pandas.Series | pandas.DataFrame, S_hat: numpy.ndarray | pandas.Series | pandas.DataFrame, target: str = 'identity') → numpy.ndarray | pandas.Series | pandas.DataFrame
Applies the Ledoit–Wolf shrinkage estimator for the covariance matrix [Ledoit and Wolf, 2004].
This estimator provides a well-conditioned covariance matrix, especially useful when the number of observations is small relative to the number of assets, or when the sample covariance matrix is ill-conditioned. It shrinks the sample covariance matrix towards a structured target matrix.
- Parameters:
R (ArrayLike) – A 2D array-like object (e.g., numpy.ndarray, pandas.DataFrame) of shape (T, N), where T is the number of observations and N is the number of assets. These are the returns data.
S_hat (ArrayLike) – The sample covariance matrix (2D array-like, N×N). Can be a numpy.ndarray or pandas.DataFrame.
target (str, optional) – The shrinkage target: "identity" shrinks towards a scaled identity matrix; "constant_correlation" shrinks towards a constant correlation matrix. Defaults to "identity".
- Returns:
The shrunk covariance matrix. If R or S_hat were pandas objects, the output will be a pandas.DataFrame.
- Return type:
ArrayLike
- Raises:
ValueError – If input dimensions are invalid (e.g., T = 0, or S_hat shape mismatch), or if an unsupported target is specified.
Notes
The function calculates various components of the Ledoit-Wolf formula:
F: The target matrix.
pi_mat, pi_hat, diag_pi, off_pi, rho_hat: Components related to the estimation of the optimal shrinkage intensity.
gamma_hat: The squared Frobenius norm of the difference between the sample covariance and the target matrix.
kappa: Intermediate value for shrinkage intensity.
delta: The optimal shrinkage intensity, clipped between 0 and 1.
The final shrunk covariance matrix is ensured to be positive semi-definite using ensure_psd_matrix.
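The components listed in these notes can be illustrated for the "identity" target with a short NumPy sketch. It is written under several assumptions that the documentation does not confirm: the helper name lw_identity_sketch is hypothetical, S_hat is taken to be the 1/T-normalised sample covariance of R, the diag_pi and off_pi terms are omitted, rho_hat is set to zero for the scaled-identity target, and the final ensure_psd_matrix step is skipped. The library's actual computation may differ in any of these details.

```python
import numpy as np

def lw_identity_sketch(R, S_hat):
    """Shrink S_hat towards a scaled identity (illustrative sketch only)."""
    R = np.asarray(R, dtype=float)
    S_hat = np.asarray(S_hat, dtype=float)
    T, N = R.shape
    X = R - R.mean(axis=0)                        # demeaned returns
    F = np.mean(np.diag(S_hat)) * np.eye(N)       # target: average variance times identity
    pi_mat = (X**2).T @ (X**2) / T - S_hat**2     # variance estimate for each entry of S_hat
    pi_hat = pi_mat.sum()
    rho_hat = 0.0                                 # assumption for the identity target
    gamma_hat = np.linalg.norm(S_hat - F, "fro") ** 2
    kappa = (pi_hat - rho_hat) / gamma_hat if gamma_hat > 0 else 0.0
    delta = float(np.clip(kappa / T, 0.0, 1.0))   # shrinkage intensity in [0, 1]
    return delta * F + (1.0 - delta) * S_hat, delta

# Synthetic returns: 60 observations of 5 assets (made-up data for illustration).
rng = np.random.default_rng(0)
R = 0.02 * rng.normal(size=(60, 5))
S_hat = np.cov(R, rowvar=False, bias=True)
S_shrunk, delta = lw_identity_sketch(R, S_hat)
```

Because the result is a convex combination of two positive semi-definite matrices, it is positive semi-definite in exact arithmetic; a final projection step guards against floating-point round-off.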
- pyvallocation.moments.shrink_mean_jorion(mu: numpy.ndarray | pandas.Series | pandas.DataFrame, S: numpy.ndarray | pandas.Series | pandas.DataFrame, T: int) → numpy.ndarray | pandas.Series | pandas.DataFrame
Applies Bayes–Stein shrinkage to the mean vector as in Jorion [Jorion, 1986].
This shrinkage estimator aims to improve the out-of-sample performance of mean estimates, especially when the number of assets (N) is large relative to the number of observations (T). It shrinks the sample mean towards a common mean (e.g., the global minimum variance portfolio mean).
- Parameters:
mu (ArrayLike) – The sample mean vector (1D array-like, length N). Can be a numpy.ndarray or pandas.Series.
S (ArrayLike) – The sample covariance matrix (2D array-like, N×N). Can be a numpy.ndarray or pandas.DataFrame.
T (int) – The number of observations (scenarios) used to estimate mu and S.
- Returns:
The Bayes-Stein shrunk mean vector. If mu was a pandas.Series, the output will also be a pandas.Series.
- Return type:
ArrayLike
- Raises:
ValueError – If input dimensions are invalid (e.g., T <= 0, N <= 2, or S shape mismatch), or if the covariance matrix S is singular.
Notes
A small jitter (1e-8 * identity matrix) is added to S before inversion to handle potential singularity issues. The shrinkage intensity v is clipped between 0 and 1 to ensure a valid shrinkage factor.
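The jitter and clipping described in these notes can be placed in context with a NumPy sketch of the textbook Bayes-Stein formula, which shrinks towards the mean return of the global minimum-variance portfolio. The helper name bayes_stein_sketch is hypothetical, and the sketch assumes S is used directly without any small-sample rescaling; the library may differ on both the target and the scaling.

```python
import numpy as np

def bayes_stein_sketch(mu, S, T):
    """Bayes-Stein shrinkage of a mean vector (illustrative sketch only)."""
    mu = np.asarray(mu, dtype=float)
    S = np.asarray(S, dtype=float)
    N = mu.shape[0]
    if T <= 0 or N <= 2 or S.shape != (N, N):
        raise ValueError("need T > 0, N > 2 and S of shape (N, N)")
    # Jitter before inversion, as described in the notes.
    S_inv = np.linalg.inv(S + 1e-8 * np.eye(N))
    ones = np.ones(N)
    # Mean return of the global minimum-variance portfolio (assumed target).
    mu_g = (ones @ S_inv @ mu) / (ones @ S_inv @ ones)
    d = mu - mu_g
    # Shrinkage intensity, clipped to [0, 1].
    v = (N + 2) / ((N + 2) + T * (d @ S_inv @ d))
    v = float(np.clip(v, 0.0, 1.0))
    return (1.0 - v) * mu + v * mu_g, v

# Made-up inputs: four assets with a diagonal covariance matrix.
mu = np.array([0.08, 0.05, 0.03, 0.06])
S = np.diag([0.04, 0.02, 0.01, 0.03])
mu_bs, v = bayes_stein_sketch(mu, S, T=120)
```

Every element moves towards the common target by the same fraction v, so the spread of the shrunk means is (1 - v) times the spread of the sample means.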