{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# NumPy Exercise: Correlated Gaussian Random Variables"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In many situations, we need to have a series of correlated Gaussian random variables, which can then be transformed into other distributions of interest (uniform, lognormal, etc.). Let's see how to do that with NumPy in Python.\n",
"\n",
"### Given: \n",
"\n",
"|Variable | Value | Description |\n",
"| ---: | :---: | :--- |\n",
"|`n_real` | `1E6` | number of realizations|\n",
"|`n_vars` | 3 | number of variables to correlate|\n",
"|`cov` | `[[ 1. , 0.2, 0.4], [ 0.2, 0.8, 0.3], [ 0.4, 0.3, 1.1]]` | covariance matrix|\n",
"\n",
"### Theory\n",
"\n",
"The procedure for generating correlated Gaussian variables is as follows:\n",
"1. Sample `[n_vars x n_real]` (uncorrelated) normal random variables\n",
"2. Calculate `chol_mat`, the Cholesky decomposition of the covariance matrix\n",
"3. Left-multiply your array of random variables by `chol_mat` to produce a `[n_vars x n_real]` array of correlated Gaussian variables\n",
"\n",
"### Exercise\n",
"\n",
"Do the following: \n",
"1. Fill in the blank cells below so that the code follows the theory outlined above.\n",
"2. Calculate the variances of the three samples of random variables. Do they match the diagonal of the covariance matrix?\n",
"3. Calculate the correlation coefficient between the first and second random samples. Does it match `cov[0, 1]`?\n",
"\n",
"### Hints\n",
"\n",
"- In the arrays of random variables, each row `i` corresponds to a *sample* of random variable `i` (just FYI).\n",
"- Google is your friend :)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import numpy as np # import any needed modules here"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"n_real = int(1E6) # number of realizations\n",
"n_vars = 3 # number of random variables we want to correlate\n",
"cov = np.array([[ 1. , 0.2, 0.4], [ 0.2, 0.8, 0.3], [ 0.4, 0.3, 1.1]]) # covariance matrix"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"unc_vars = np.random.randn(n_vars, n_real) # create [n_vars x n_real] array of uncorrelated (unc) normal random variables"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"chol_mat = np.linalg.cholesky(cov) # calculate the cholesky decomposition of the covariance matrix"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"cor_vars = chol_mat @ unc_vars # [n_vars x n_real] array of correlated (cor) random variables"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"cor_vars.var(axis=1) # calculate variances of each sample of random variables"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"np.corrcoef(cor_vars[0, :], cor_vars[1, :]) # correlation matrix of the first and second samples; the off-diagonal entry is the correlation coefficient"
]
}
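,
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Sanity check (optional)\n",
"\n",
"As a further check (not part of the original exercise), the empirical covariance matrix of `cor_vars` should be close to `cov`. The introduction also mentions transforming the correlated Gaussians into other distributions; one way to get an (approximately) uniform sample is to standardize a sample and apply the Gaussian CDF. Using `scipy.stats.norm` for the CDF is one possible choice."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from scipy.stats import norm # one option for the Gaussian CDF\n",
"\n",
"print(np.cov(cor_vars)) # empirical covariance matrix; should be close to cov\n",
"\n",
"# transform the first correlated sample to an approximately uniform sample on (0, 1):\n",
"# standardize by its theoretical standard deviation, then apply the standard normal CDF\n",
"unif = norm.cdf(cor_vars[0, :] / np.sqrt(cov[0, 0]))\n",
"print(unif.min(), unif.max()) # should span roughly (0, 1)"
]
}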
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.1"
}
},
"nbformat": 4,
"nbformat_minor": 2
}