pyvallocation.bayesian module

class pyvallocation.bayesian.NIWParams(T1: int, mu1: numpy.typing.NDArray[numpy.floating] | pandas.Series[numpy.floating], nu1: int, sigma1: numpy.typing.NDArray[numpy.floating] | pandas.DataFrame[numpy.floating])[source]

Bases: object

A container for the parameters of a Normal-Inverse-Wishart (NIW) posterior distribution.

These parameters are the result of a Bayesian update that combines an NIW prior with market data, as detailed in Meucci (2005). The formulas for the posterior parameters are given in Eqs. (11)–(14) of that paper.

T1: int

mu1: numpy.typing.NDArray[numpy.floating] | pandas.Series[numpy.floating]

nu1: int

sigma1: numpy.typing.NDArray[numpy.floating] | pandas.DataFrame[numpy.floating]

class pyvallocation.bayesian.NIWPosterior(prior_mu: numpy.typing.NDArray[numpy.floating] | pandas.Series[numpy.floating], prior_sigma: numpy.typing.NDArray[numpy.floating] | pandas.DataFrame[numpy.floating], t0: int, nu0: int)[source]

Bases: object

Computes and manages Normal-Inverse-Wishart (NIW) posterior parameters.

This class implements the Bayesian update rules for an NIW distribution, which is the conjugate prior for a multivariate normal likelihood with unknown mean and covariance. The methodology follows Section 3 of Meucci (2005). It provides methods to calculate posterior parameters, classical-equivalent estimators, and factors used in robust Bayesian asset allocation.

The model assumes that asset returns are independently and identically distributed according to a normal distribution. The investor’s prior knowledge is modeled as an NIW distribution.

How to Use:

1. Initialize the NIWPosterior object with prior parameters:
   - prior_mu (\(\mu_0\)): The prior estimate for the mean vector.
   - prior_sigma (\(\Sigma_0\)): The prior scale matrix for the covariance.
   - t0 (\(T_0\)): The confidence in prior_mu, expressed as a pseudo-count of observations.
   - nu0 (\(\nu_0\)): The confidence in prior_sigma, expressed as a pseudo-count of observations.
2. Call the update() method with sample statistics from observed data:
   - sample_mu (\(\hat{\mu}\)): The mean vector from the data.
   - sample_sigma (\(\hat{\Sigma}\)): The covariance matrix from the data.
   - n_obs (\(T\)): The number of observations in the data sample.
3. The update() method returns an NIWParams object with the posterior parameters (\(T_1, \mu_1, \nu_1, \Sigma_1\)), which blend the prior with the market data.
4. Use accessor methods such as get_mu_ce() and get_S_mu() to retrieve quantities derived from the posterior distribution.

prior_mu: The prior mean vector (\(\mu_0\)).

prior_sigma: The prior scale matrix (\(\Sigma_0\)).

t0: The prior pseudo-count for the mean (\(T_0\)).

nu0: The prior pseudo-count for the covariance (\(\nu_0\)).

N: The number of assets.

_asset_index: Stores pandas.Index if pandas objects are used.

_posterior: Stores the computed posterior parameters.

cred_radius_mu(p_mu: float) → float[source]

Computes the credibility factor \(\gamma_\mu\) for the mean’s uncertainty.

This factor, \(\gamma_\mu\), appears in the simplified robust mean-variance optimization problem (Eq. 19). It scales the portfolio’s posterior standard deviation to penalize for estimation risk in the mean vector. Its formula is given by Eq. (20):

\[\gamma_\mu = \sqrt{ \frac{q_\mu^2}{T_1} \frac{\nu_1}{\nu_1 - 2} }\]

where \(q_\mu^2 = Q_{\chi^2_N}(p_\mu)\) is the squared radius factor, set to the \(p_\mu\)-quantile of the chi-square distribution with \(N\) degrees of freedom.

Parameters:

p_mu – The confidence level for \(\mu\) (0 < p_mu < 1), which reflects aversion to estimation risk.

Returns:

The credibility factor \(\gamma_\mu\).

Raises:

RuntimeError – If posterior parameters have not been computed.
ValueError – If \(\nu_1 \le 2\) or p_mu is not in (0,1).
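
Example:

As a rough illustration of Eq. (20), the sketch below evaluates \(\gamma_\mu\) using only the standard library. The helper name gamma_mu is hypothetical and is not part of pyvallocation; it takes the squared radius \(q_\mu^2\) as an input instead of computing a general chi-square quantile. The example assumes N = 2 assets, where the chi-square quantile has the closed form \(-2\ln(1 - p)\).

```python
import math


def gamma_mu(q_mu2: float, t1: float, nu1: float) -> float:
    """Hypothetical helper: credibility factor gamma_mu from Eq. (20)."""
    if nu1 <= 2:
        raise ValueError("nu1 must be greater than 2")
    return math.sqrt(q_mu2 / t1 * nu1 / (nu1 - 2))


# With N = 2 degrees of freedom the chi-square quantile is -2 * ln(1 - p).
p_mu = 0.5
q_mu2 = -2.0 * math.log(1.0 - p_mu)

# Illustrative posterior values (assumed, not taken from any data set).
g = gamma_mu(q_mu2, t1=100, nu1=102)
```

A larger p_mu increases \(q_\mu^2\) and therefore \(\gamma_\mu\), while a larger \(T_1\) shrinks it.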

cred_radius_sigma_factor(p_sigma: float) → float[source]

Computes the scaling factor for the worst-case portfolio variance.

In the robust framework, the maximum possible variance of a portfolio within the uncertainty ellipsoid for \(\Sigma\) is not simply \(w'\Sigma_1 w\), but a scaled version of it. This method computes that scaling factor, which we can call \(C_\Sigma\). The derivation is shown in Appendix 7.2, and the final result is presented in the maximization step in Eq. (47):

\[\max_{\Sigma \in \Theta_\Sigma} w'\Sigma w = \underbrace{ \left[ \frac{\nu_1}{\nu_1 + N + 1} + \sqrt{\frac{2\nu_1^2 q_\Sigma^2}{(\nu_1 + N + 1)^3}} \right] }_{C_\Sigma} (w'\Sigma_1 w)\]

where \(q_\Sigma^2 = Q_{\chi^2_{dof}}(p_\Sigma)\) with \(dof = N(N+1)/2\).

Parameters:

p_sigma – The confidence level for \(\Sigma\) (0 < p_sigma < 1), reflecting aversion to estimation risk.

Returns:

The credibility factor \(C_\Sigma\) for scaling the portfolio variance.

Raises:

RuntimeError – If posterior parameters have not been computed.
ValueError – If p_sigma is not in (0,1) or internal terms are invalid.
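
Example:

The bracketed term in Eq. (47) can be evaluated directly once \(q_\Sigma^2\) is known. The sketch below is a minimal illustration using only the standard library; the helper name c_sigma is hypothetical, and the squared radius is passed in as a number instead of being derived from p_sigma.

```python
import math


def c_sigma(q_sigma2: float, nu1: float, n_assets: int) -> float:
    """Hypothetical helper: the scaling factor C_Sigma from Eq. (47)."""
    denom = nu1 + n_assets + 1
    if denom <= 0 or q_sigma2 < 0:
        raise ValueError("invalid inputs for C_Sigma")
    shrink = nu1 / denom                                     # first term
    spread = math.sqrt(2.0 * nu1**2 * q_sigma2 / denom**3)   # second term
    return shrink + spread


# Illustrative values (assumed): nu1 = 13, N = 2, q_sigma^2 = 8.
factor = c_sigma(q_sigma2=8.0, nu1=13, n_assets=2)

# Worst-case variance for a portfolio whose posterior variance w' Sigma_1 w is 0.04.
worst_case_var = factor * 0.04
```

With these inputs both terms equal 13/16, so the factor is 1.625 and the worst-case variance is 0.065.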

get_S_mu() → npt.NDArray[np.floating] | 'pd.DataFrame[np.floating]'[source]

Computes the scatter matrix \(S_{\mu}\) for the posterior of \(\mu\).

This matrix describes the dispersion of the marginal posterior distribution of \(\mu\) and is used to define the location-dispersion ellipsoid for robust optimization. It is defined in Eq. (16):

\[S_{\mu} = \frac{1}{T_1} \frac{\nu_1}{\nu_1 - 2} \Sigma_1\]

This computation requires the posterior degrees of freedom \(\nu_1 > 2\).

Returns:

The scatter matrix \(S_{\mu}\) as a NumPy array or pandas DataFrame.

Raises:

RuntimeError – If posterior parameters have not been computed.
ValueError – If \(\nu_1 \le 2\), as the scatter matrix is not defined.
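
Example:

Eq. (16) is a scalar rescaling of \(\Sigma_1\). The NumPy sketch below shows that arithmetic; the function name scatter_mu is hypothetical and the input values are assumed for illustration only.

```python
import numpy as np


def scatter_mu(sigma1, t1: float, nu1: float) -> np.ndarray:
    """Hypothetical helper: S_mu = (1 / T1) * nu1 / (nu1 - 2) * Sigma1."""
    if nu1 <= 2:
        raise ValueError("nu1 must be greater than 2")
    return (nu1 / (nu1 - 2)) / t1 * np.asarray(sigma1, dtype=float)


sigma1 = np.array([[0.04, 0.01],
                   [0.01, 0.09]])
s_mu = scatter_mu(sigma1, t1=50, nu1=52)
```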

get_mu_ce() → npt.NDArray[np.floating] | 'pd.Series[np.floating]'[source]

Computes the classical-equivalent estimator for the mean, \(\hat{\mu}_{ce}\).

For the NIW model, this estimator is the posterior mean \(\mu_1\), as defined in Eq. (15).

\[\hat{\mu}_{ce} = \mu_1\]

Returns:

The posterior mean vector \(\mu_1\) as a NumPy array or pandas Series.

Raises:

RuntimeError – If posterior parameters have not been computed via update().

get_posterior() → NIWParams | None[source]

Retrieves the computed posterior parameters.

Returns:

A NIWParams instance containing the posterior parameters (\(T_1\), \(\mu_1\), \(\nu_1\), \(\Sigma_1\)), or None if update() has not been called.

get_sigma_ce() → npt.NDArray[np.floating] | 'pd.DataFrame[np.floating]'[source]

Computes the classical-equivalent estimator for the covariance, \(\hat{\Sigma}_{ce}\).

This estimator is a shrunk version of the posterior scale matrix \(\Sigma_1\), as defined in Eq. (17). It serves as the center of the uncertainty ellipsoid for \(\Sigma\).

\[\hat{\Sigma}_{ce} = \frac{\nu_1}{\nu_1 + N + 1} \Sigma_1\]

Returns:

The classical-equivalent estimator \(\hat{\Sigma}_{ce}\) as a NumPy array or pandas DataFrame.

Raises:

RuntimeError – If posterior parameters have not been computed.
ValueError – If \(\nu_1 + N + 1 = 0\), which is highly unlikely with valid inputs.
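
Example:

The shrinkage in Eq. (17) depends only on \(\nu_1\) and the number of assets. The sketch below illustrates it with NumPy; the function name sigma_ce is hypothetical, and N is inferred from the shape of the matrix passed in.

```python
import numpy as np


def sigma_ce(sigma1, nu1: float) -> np.ndarray:
    """Hypothetical helper: Sigma_ce = nu1 / (nu1 + N + 1) * Sigma1."""
    sigma1 = np.asarray(sigma1, dtype=float)
    n_assets = sigma1.shape[0]
    return nu1 / (nu1 + n_assets + 1) * sigma1


# Assumed inputs: with nu1 = 3 and N = 2 the shrinkage factor is 3 / 6 = 0.5.
estimate = sigma_ce(np.eye(2), nu1=3)
```

As \(\nu_1\) grows relative to N, the factor approaches 1 and the estimator approaches \(\Sigma_1\).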

update(sample_mu: npt.NDArray[np.floating] | 'pd.Series[np.floating]', sample_sigma: npt.NDArray[np.floating] | 'pd.DataFrame[np.floating]', n_obs: int) → NIWParams[source]

Updates the posterior parameters using sample statistics.

This method implements the Bayesian update rules for the NIW parameters as given by Eqs. (11)–(14) in Meucci (2005).

\[\begin{aligned} T_1 &= T_0 + T \\ \mu_1 &= \frac{T_0\mu_0 + T\hat{\mu}}{T_1} \\ \nu_1 &= \nu_0 + T \\ \Sigma_1 &= \frac{1}{\nu_1} \left[ \nu_0\Sigma_0 + T\hat{\Sigma} + \frac{(\mu_0 - \hat{\mu})(\mu_0 - \hat{\mu})'}{\frac{1}{T} + \frac{1}{T_0}} \right] \end{aligned}\]

The resulting posterior parameters blend the investor’s prior with information from the market, with the balance determined by the relative confidence levels (\(T_0, \nu_0\)) versus the amount of data (\(T\)).

Parameters:

sample_mu – 1D array (length N) of sample means (\(\hat{\mu}\)).
sample_sigma – 2D array (N x N) of the sample covariance matrix (\(\hat{\Sigma}\)).
n_obs – The number of observations in the sample (\(T\)).

Returns:

A NIWParams instance with the updated posterior parameters. If pandas objects were used in initialization, returns \(\mu_1\) as a Series and \(\Sigma_1\) as a DataFrame.

Raises:

ValueError – If sample statistics have inconsistent shapes or n_obs is not a positive integer.
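
Example:

The four update rules above can be written out in a few lines of NumPy. This is a minimal sketch of the arithmetic only, not the library's implementation: the function name niw_update is hypothetical, it returns a plain tuple instead of an NIWParams instance, and it assumes t0 > 0 and n_obs > 0.

```python
import numpy as np


def niw_update(mu0, sigma0, t0, nu0, sample_mu, sample_sigma, n_obs):
    """Hypothetical helper: NIW posterior parameters (T1, mu1, nu1, Sigma1)."""
    mu0 = np.asarray(mu0, dtype=float)
    sigma0 = np.asarray(sigma0, dtype=float)
    sample_mu = np.asarray(sample_mu, dtype=float)
    sample_sigma = np.asarray(sample_sigma, dtype=float)

    t1 = t0 + n_obs
    nu1 = nu0 + n_obs
    mu1 = (t0 * mu0 + n_obs * sample_mu) / t1

    # Cross term that penalizes disagreement between prior and sample means.
    diff = mu0 - sample_mu
    cross = np.outer(diff, diff) / (1.0 / n_obs + 1.0 / t0)
    sigma1 = (nu0 * sigma0 + n_obs * sample_sigma + cross) / nu1
    return t1, mu1, nu1, sigma1


# Assumed toy inputs: equal weight on the prior and on a single observation.
post = niw_update(
    mu0=[0.0, 0.0], sigma0=np.eye(2), t0=1, nu0=1,
    sample_mu=[2.0, 0.0], sample_sigma=np.eye(2), n_obs=1,
)
```

With equal pseudo-counts the posterior mean sits halfway between the prior and the sample mean, here [1, 0], and the disagreement in the first coordinate inflates the corresponding entry of \(\Sigma_1\) to 2.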

pyvallocation.bayesian.chi2_quantile(p: float, dof: int, sqrt: bool = False) → float[source]

Computes the quantile of the chi-square (χ²) distribution.

This function is used to determine the size of the uncertainty ellipsoids for the market parameters (mean and covariance). The size is determined by a radius factor, \(q\), which is set according to a quantile of the chi-square distribution.

For the mean vector \(\mu\), under the assumption that its posterior distribution is normal, the squared Mahalanobis distance is chi-square distributed. The radius factor squared, \(q_\mu^2\), is set using a quantile of the \(\chi^2_N\) distribution.

\[q_\mu^2 = Q_{\chi_N^2}(p_\mu)\]

For the covariance matrix \(\Sigma\), a similar approach is used, based on a heuristic argument that the squared Mahalanobis distance is approximately \(\chi^2\)-distributed with \(N(N+1)/2\) degrees of freedom.

\[q_\Sigma^2 = Q_{\chi_{N(N+1)/2}^2}(p_\Sigma)\]

Parameters:

p – The probability level (0 < p < 1) for the quantile.
dof – The degrees of freedom for the chi-square distribution.
sqrt – If True, returns the square root of the quantile. Defaults to False.

Returns:

The chi-square quantile \(Q_{\chi^2}(p)\), or \(\sqrt{Q_{\chi^2}(p)}\) if sqrt is True.

Raises:

ValueError – If p is not strictly between 0 and 1.
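
Example:

One way to obtain such quantiles is SciPy's chi-square inverse CDF. The sketch below assumes SciPy is available and makes no claim about how pyvallocation computes the value internally; the function name chi2_quantile_sketch is hypothetical and only mirrors the signature documented above.

```python
import math

from scipy.stats import chi2


def chi2_quantile_sketch(p: float, dof: int, sqrt: bool = False) -> float:
    """Hypothetical stand-in: chi-square quantile via scipy.stats.chi2.ppf."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    q = float(chi2.ppf(p, dof))
    return math.sqrt(q) if sqrt else q


# Squared radius factors for an assumed universe of N = 3 assets.
n_assets = 3
q_mu2 = chi2_quantile_sketch(0.9, n_assets)
q_sigma2 = chi2_quantile_sketch(0.9, n_assets * (n_assets + 1) // 2)
```

At the same probability level the covariance radius is larger than the mean radius, because \(N(N+1)/2 \ge N\) degrees of freedom enter the quantile.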