Statistical Moments Module

This module provides functions for estimating and shrinking statistical moments (mean and covariance matrix) of asset returns.

It includes:
  • estimate_sample_moments: For computing weighted sample mean and covariance.

  • shrink_mean_jorion: Implements the Bayes-Stein shrinkage estimator for the mean vector.

  • shrink_covariance_ledoit_wolf: Implements the Ledoit-Wolf shrinkage estimator for the covariance matrix.

pyvallocation.moments.estimate_sample_moments(R: numpy.ndarray | pandas.Series | pandas.DataFrame, p: numpy.ndarray | pandas.Series | pandas.DataFrame) → Tuple[numpy.ndarray | pandas.Series | pandas.DataFrame, numpy.ndarray | pandas.Series | pandas.DataFrame]

Estimates the weighted mean vector and covariance matrix from scenarios.

This function computes the first two statistical moments (mean and covariance) of asset returns, given a set of scenarios and their associated probabilities. The scenarios R represent different possible outcomes for asset returns, and p represents the probability of each scenario.

Parameters:
  • R (ArrayLike) – A 2D array-like object (e.g., numpy.ndarray, pandas.DataFrame) of shape (T, N), where T is the number of scenarios/observations and N is the number of assets. Each row represents a scenario of asset returns.

  • p (ArrayLike) – A 1D array-like object (e.g., numpy.ndarray, pandas.Series) of shape (T,), representing the probabilities associated with each scenario in R. These probabilities must be non-negative and sum to one.

Returns:

A tuple containing:
  • mu (ArrayLike): The weighted mean vector of asset returns. If R or p were pandas objects, mu will be a pandas.Series.

  • S (ArrayLike): The weighted covariance matrix of asset returns. If R or p were pandas objects, S will be a pandas.DataFrame.

Return type:

Tuple[ArrayLike, ArrayLike]

Raises:

ValueError – If p has a length mismatch with R, or if p contains negative values or does not sum to one.
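
As an illustration of the computation described above, the following is a minimal NumPy sketch of probability-weighted moments. It is not the library's implementation: the helper name weighted_moments is hypothetical, and the sketch assumes that no small-sample (bias) correction is applied to the covariance.

```python
import numpy as np

def weighted_moments(R, p):
    """Probability-weighted mean and covariance (illustrative sketch only)."""
    R = np.asarray(R, dtype=float)
    p = np.asarray(p, dtype=float)
    if p.shape != (R.shape[0],):
        raise ValueError("p must have one entry per row of R")
    if np.any(p < 0) or not np.isclose(p.sum(), 1.0):
        raise ValueError("p must be non-negative and sum to one")
    mu = p @ R                   # mu_n = sum_t p_t * R[t, n]
    X = R - mu                   # deviations from the weighted mean
    S = (X * p[:, None]).T @ X   # S = sum_t p_t * x_t x_t^T (no bias correction)
    return mu, S

# Three scenarios for two assets, equally likely.
R = np.array([[0.01, 0.02], [0.03, -0.01], [-0.02, 0.04]])
p = np.full(3, 1 / 3)
mu, S = weighted_moments(R, p)
```

With equal probabilities, this reduces to the ordinary sample mean and the population (divide-by-T) covariance.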

pyvallocation.moments.shrink_covariance_ledoit_wolf(R: numpy.ndarray | pandas.Series | pandas.DataFrame, S_hat: numpy.ndarray | pandas.Series | pandas.DataFrame, target: str = 'identity') → numpy.ndarray | pandas.Series | pandas.DataFrame

Applies the Ledoit–Wolf shrinkage estimator for the covariance matrix [Ledoit and Wolf, 2004].

This estimator produces a well-conditioned covariance matrix by shrinking the sample covariance matrix towards a structured target matrix. It is especially useful when the number of observations is small relative to the number of assets, or when the sample covariance matrix is ill-conditioned.

Parameters:
  • R (ArrayLike) – A 2D array-like object (e.g., numpy.ndarray, pandas.DataFrame) of shape (T, N), where T is the number of observations and N is the number of assets. These are the returns data.

  • S_hat (ArrayLike) – The sample covariance matrix (2D array-like, N×N). Can be a numpy.ndarray or pandas.DataFrame.

  • target (str, optional) – The shrinkage target. Defaults to "identity".
      - "identity": Shrinks towards a scaled identity matrix.
      - "constant_correlation": Shrinks towards a constant correlation matrix.

Returns:

The shrunk covariance matrix. If R or S_hat were pandas objects, the output will be a pandas.DataFrame.

Return type:

ArrayLike

Raises:

ValueError – If input dimensions are invalid (e.g., T = 0, or S_hat shape mismatch), or if an unsupported target is specified.
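
To make the two target options concrete, here is a hedged sketch of how such targets are commonly constructed from a sample covariance matrix. The helper names identity_target and constant_correlation_target are hypothetical, and the exact scaling used inside the library is an assumption: the identity target is scaled by the average sample variance, and the constant-correlation target keeps the sample variances while replacing every pairwise correlation with the average sample correlation.

```python
import numpy as np

def identity_target(S):
    """Scaled identity: average sample variance on the diagonal (assumed scaling)."""
    N = S.shape[0]
    return (np.trace(S) / N) * np.eye(N)

def constant_correlation_target(S):
    """Keep sample variances; set every correlation to the average one."""
    N = S.shape[0]
    sd = np.sqrt(np.diag(S))
    corr = S / np.outer(sd, sd)
    r_bar = (corr.sum() - N) / (N * (N - 1))  # mean off-diagonal correlation
    F = r_bar * np.outer(sd, sd)
    np.fill_diagonal(F, np.diag(S))           # diagonal stays at sample variances
    return F

S = np.array([[0.04, 0.01, 0.00],
              [0.01, 0.09, 0.03],
              [0.00, 0.03, 0.16]])
F_id = identity_target(S)
F_cc = constant_correlation_target(S)
```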

Notes

The function calculates various components of the Ledoit-Wolf formula:

  • F: The target matrix.

  • pi_mat, pi_hat, diag_pi, off_pi, rho_hat: Components related to the estimation of the optimal shrinkage intensity.

  • gamma_hat: The squared Frobenius norm of the difference between the sample covariance and the target matrix.

  • kappa: Intermediate value for shrinkage intensity.

  • delta: The optimal shrinkage intensity, clipped between 0 and 1.

The final shrunk covariance matrix is ensured to be positive semi-definite using ensure_psd_matrix.
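
The quantities listed above can be sketched for the identity target as follows. This is a simplified illustration rather than the library's code: the function name ledoit_wolf_identity is hypothetical, rho_hat is assumed to be zero (which corresponds to the scaled-identity estimator in Ledoit and Wolf, 2004), the sample covariance is computed with a divide-by-T convention, and the final ensure_psd_matrix step is omitted.

```python
import numpy as np

def ledoit_wolf_identity(R):
    """Shrink the sample covariance towards a scaled identity (sketch only)."""
    R = np.asarray(R, dtype=float)
    T, N = R.shape
    X = R - R.mean(axis=0)
    S = X.T @ X / T                        # sample covariance (divide by T)
    F = (np.trace(S) / N) * np.eye(N)      # target matrix

    # pi_mat[i, j] estimates the asymptotic variance of sqrt(T) * S[i, j]:
    # (1/T) * sum_t (x_ti * x_tj - s_ij)^2 = mean(x_ti^2 * x_tj^2) - s_ij^2
    pi_mat = (X**2).T @ (X**2) / T - S**2
    pi_hat = pi_mat.sum()

    gamma_hat = np.sum((S - F) ** 2)       # squared Frobenius distance to target
    # rho_hat is taken as 0 here (assumption); guard against S == F.
    kappa = pi_hat / gamma_hat if gamma_hat > 0 else np.inf
    delta = float(np.clip(kappa / T, 0.0, 1.0))

    return delta * F + (1.0 - delta) * S, delta

rng = np.random.default_rng(0)
R = rng.normal(scale=0.02, size=(30, 10))  # 30 observations, 10 assets
S_shrunk, delta = ledoit_wolf_identity(R)
```

Because the result is a convex combination of two positive semi-definite matrices, it is positive semi-definite in exact arithmetic; an explicit PSD repair step guards against floating-point round-off.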

pyvallocation.moments.shrink_mean_jorion(mu: numpy.ndarray | pandas.Series | pandas.DataFrame, S: numpy.ndarray | pandas.Series | pandas.DataFrame, T: int) → numpy.ndarray | pandas.Series | pandas.DataFrame

Applies Bayes–Stein shrinkage to the mean vector as in Jorion [Jorion, 1986].

This shrinkage estimator aims to improve the out-of-sample performance of mean estimates, especially when the number of assets (N) is large relative to the number of observations (T). It shrinks each element of the sample mean towards a common mean; in Jorion's formulation, this is the mean return of the global minimum variance portfolio.

Parameters:
  • mu (ArrayLike) – The sample mean vector (1D array-like, length N). Can be a numpy.ndarray or pandas.Series.

  • S (ArrayLike) – The sample covariance matrix (2D array-like, N×N). Can be a numpy.ndarray or pandas.DataFrame.

  • T (int) – The number of observations (scenarios) used to estimate mu and S.

Returns:

The Bayes-Stein shrunk mean vector. If mu was a pandas.Series, the output will also be a pandas.Series.

Return type:

ArrayLike

Raises:

ValueError – If input dimensions are invalid (e.g., T <= 0, N <= 2, or S shape mismatch), or if the covariance matrix S is singular.

Notes

A small jitter (1e-8 * identity matrix) is added to S before inversion to handle potential singularity issues. The shrinkage intensity v is clipped between 0 and 1 to ensure a valid shrinkage factor.
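
The steps described in these notes can be sketched as below. This follows the commonly quoted form of the Bayes-Stein estimator, with the global minimum variance portfolio mean as the shrinkage target; the function name bayes_stein_mean is hypothetical, and the finite-sample rescaling of S used in Jorion's original paper is omitted, so the library's numerical output may differ.

```python
import numpy as np

def bayes_stein_mean(mu, S, T, jitter=1e-8):
    """Bayes-Stein shrinkage of the mean vector (illustrative sketch only)."""
    mu = np.asarray(mu, dtype=float)
    S = np.asarray(S, dtype=float)
    N = mu.shape[0]
    if T <= 0 or N <= 2 or S.shape != (N, N):
        raise ValueError("need T > 0, N > 2 and an N x N covariance matrix")

    # Small jitter on the diagonal before inversion, as described in the notes.
    S_inv = np.linalg.inv(S + jitter * np.eye(N))
    ones = np.ones(N)

    # Mean return of the global minimum variance portfolio (shrinkage target).
    mu_g = (ones @ S_inv @ mu) / (ones @ S_inv @ ones)

    # Shrinkage intensity, clipped to [0, 1].
    d = mu - mu_g
    v = (N + 2) / ((N + 2) + T * (d @ S_inv @ d))
    v = float(np.clip(v, 0.0, 1.0))

    return (1.0 - v) * mu + v * mu_g, v

mu = np.array([0.02, 0.05, 0.08, 0.03])
S = np.diag([0.04, 0.09, 0.16, 0.0625])
mu_bs, v = bayes_stein_mean(mu, S, T=60)
```

The shrunk vector keeps the ordering of the sample means but compresses their spread by the factor (1 - v).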