GP.autocorrelate


GP.autocorrelate(p: Any, Y: Array, max_sep_l: int | None = None, max_sep_t: int | None = None, include_gp_mean: bool | None = True, mat: Array | None = None, plot: bool | None = True, plot_kwargs: dict | None = None, zero_centre: bool | None = False, cov: bool | None = False) → Array | Tuple[Array, Figure]

Performs a quick, approximate 2D autocorrelation of the residuals (the observed data minus the GP predictive mean) using jax.scipy.signal.correlate2d, to check whether any correlation remains in the residuals.

Note

This function assumes the data is evenly spaced in both dimensions. It is not an exact autocorrelation: the mean is not subtracted for each set of residuals, so the residuals are assumed to always have zero mean. In addition, rather than dividing by the standard deviations of the specific residuals being multiplied together, all values are divided by the overall variance of the residuals. As a result, some correlation values can lie outside the interval [-1, 1], but the calculation runs very efficiently and should be reasonably accurate except for correlations between widely separated parts of the data. For this reason, by default the plots only show separations up to half the extent of the data.

If include_gp_mean = False, the function instead shows the autocorrelation of the observed data minus the mean function (without subtracting the GP predictive mean), which is useful for deciding which kernel function to use when fitting with a GP.

Alternatively, a general matrix mat can be supplied to run the autocorrelation on, in which case the inputs p and Y are ignored.
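The approximation described in the note can be illustrated with a minimal sketch. This is not the library's implementation: the helper name approx_autocorrelate is hypothetical, and dividing by the number of overlapping elements at each separation is an assumption made here so that the zero-separation value equals the overall (zero-mean) variance. Only jax.scipy.signal.correlate2d is taken from the description above.

```python
import jax.numpy as jnp
from jax.scipy.signal import correlate2d


def approx_autocorrelate(R, max_sep_l=None, max_sep_t=None,
                         zero_centre=False, cov=False):
    """Approximate 2D autocorrelation of a residual matrix R of shape (N_l, N_t).

    Hypothetical helper for illustration only; assumes evenly spaced data
    and residuals with zero mean (no mean is subtracted).
    """
    N_l, N_t = R.shape
    # Default to half the extent of the data in each dimension.
    max_sep_l = N_l // 2 if max_sep_l is None else max_sep_l
    max_sep_t = N_t // 2 if max_sep_t is None else max_sep_t

    # Full correlation of R with itself: shape (2*N_l - 1, 2*N_t - 1),
    # with zero separation at index (N_l - 1, N_t - 1).
    raw = correlate2d(R, R, mode="full")

    # Assumption: average over the number of overlapping elements at each
    # separation, giving an autocovariance estimate for zero-mean residuals.
    sep_l = jnp.abs(jnp.arange(-(N_l - 1), N_l))
    sep_t = jnp.abs(jnp.arange(-(N_t - 1), N_t))
    n_overlap = (N_l - sep_l)[:, None] * (N_t - sep_t)[None, :]
    acf = raw / n_overlap

    if not cov:
        # Divide everything by the overall variance (the zero-separation
        # value), so values at large separations can fall outside [-1, 1].
        acf = acf / acf[N_l - 1, N_t - 1]

    # Keep only separations up to max_sep_l rows and max_sep_t columns.
    acf = acf[N_l - 1 - max_sep_l: N_l + max_sep_l,
              N_t - 1 - max_sep_t: N_t + max_sep_t]

    if zero_centre:
        # Zero the centre element so small correlations are easier to see.
        acf = acf.at[max_sep_l, max_sep_t].set(0.0)
    return acf
```

For a residual matrix of shape (6, 8) and the default separations, the sketch returns an array of shape (7, 9) whose centre element corresponds to zero separation.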

Parameters:
  • p (PyTree) – Pytree of hyperparameters used to calculate the covariance matrix, together with any mean function parameters needed to calculate the mean function.

  • Y (JAXArray) – Observed data to fit, must be of shape (N_l, N_t).

  • max_sep_l (int, optional) – The maximum separation in wavelength/rows for which to visualise the correlation, given as an integer number of rows. Defaults to half the number of rows in the observed data Y.

  • max_sep_t (int, optional) – The maximum separation in time/columns for which to visualise the correlation, given as an integer number of columns. Defaults to half the number of columns in the observed data Y.

  • include_gp_mean (bool, optional) – Whether to subtract the GP predictive mean from the observed data when calculating the residuals. If False, the deterministic mean function is still subtracted, which is useful for visualising correlation in a data set to aid kernel choice before fitting with a GP. Defaults to True.

  • mat (JAXArray, optional) – If provided, the autocorrelation is calculated for this matrix instead of the residuals of the observed data.

  • plot (bool, optional) – If True, produces a plot visualising the autocorrelation. Defaults to True.

  • zero_centre (bool, optional) – Whether to set the correlation at zero separation to 0. The correlation at zero separation is always 1, which can make very small correlation values hard to see, so setting this to True can aid visualisation. Defaults to False.

  • cov (bool, optional) – If True, returns the autocovariance matrix instead of the autocorrelation matrix. Defaults to False.

Returns:

The autocorrelation matrix of the residuals. If plot = True, returns a tuple containing the autocorrelation matrix and the generated plt.Figure.

Return type:

JAXArray or (JAXArray, plt.Figure)
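Because the return type depends on plot, calling code may want to normalise the result. The following is a small sketch of one way to do that; the helper name split_autocorrelate_result is hypothetical and not part of the library, and the names gp, p and Y in the comment are placeholders for an existing GP object and its inputs.

```python
def split_autocorrelate_result(result):
    """Return (acf, fig), with fig set to None when no figure was returned.

    Hypothetical convenience helper: accepts either a bare array
    (plot = False) or an (array, figure) tuple (plot = True).
    """
    if isinstance(result, tuple):
        acf, fig = result
        return acf, fig
    return result, None


# Intended use, assuming gp, p and Y already exist:
# acf, fig = split_autocorrelate_result(gp.autocorrelate(p, Y, plot=False))
```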