Compare revisions

Jennifer Rinker · 610fe28f
--- a/materials/2c-exercise_numpy_solutions.ipynb
+++ b/materials/2c-exercise_numpy_solutions.ipynb
+%% Cell type:markdown id: tags:
+
+# NumPy Exercise: Correlated Gaussian Random Variables
+
+%% Cell type:markdown id: tags:
+
+In many situations, we need to have a series of correlated Gaussian random variables, which can then be transformed into other distributions of interest (uniform, lognormal, etc.). Let's see how to do that with NumPy in Python.
+
+### Given:
+
+|Variable | Value | Description |
+| ---: | :---: | :--- |
+|`n_real` | `1E6` | number of realizations|
+|`n_vars` | 3 | number of variables to correlate|
+|`cov` | `[[ 1. ,  0.2,  0.4], [ 0.2,  0.8,  0.3], [ 0.4,  0.3,  1.1]]` | covariance matrix|
+
+### Theory
+
+The procedure for generating correlated Gaussian random variables is as follows:
+1. Sample `[n_vars x n_real]` (uncorrelated) normal random variables
+2. Calculate `chol_mat`, the Cholesky decomposition of the covariance matrix
+3. Matrix-multiply your random variables with `chol_mat` to produce a `[n_vars x n_real]` array of correlated Gaussian variables
+
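+A short sketch of why this procedure works. The symbols $Z$, $L$, $X$ and $\Sigma$ are notation introduced here for illustration, not taken from the exercise. Let $Z$ be a column of `n_vars` uncorrelated standard normal variables, so $\mathrm{Cov}(Z) = I$, and let $L$ be the lower-triangular Cholesky factor of the covariance matrix $\Sigma$ (`cov`), so $L L^T = \Sigma$. Then $X = L Z$ satisfies
+
+$$\mathrm{Cov}(X) = L \, \mathrm{Cov}(Z) \, L^T = L L^T = \Sigma$$
+
+This assumes the lower-triangular convention, which is what `np.linalg.cholesky` returns by default.
+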
+### Exercise
+
+Do the following:
+1. Fill in the blank cells below so that the code follows the theory outlined above.
+2. Calculate the variances of the three samples of random variables. Do they match the diagonal of the covariance matrix?
+3. Calculate the correlation coefficient between the first and second random samples. Does it match `cov[0, 1]`?
+
+### Hints
+
+- In the arrays of random variables, each row `i` corresponds to a *sample* of random variable `i` (just FYI).
+- Google is your friend :)
+
+%% Cell type:code id: tags:
+
+``` python
+import numpy as np  # import any needed modules here
+```
+
+%% Cell type:code id: tags:
+
+``` python
+n_real = int(1E6)  # number of realizations
+n_vars = 3  # number of random variables we want to correlate
+cov = np.array([[ 1. ,  0.2,  0.4], [ 0.2,  0.8,  0.3], [ 0.4,  0.3,  1.1]])  # covariance matrix
+```
+
+%% Cell type:code id: tags:
+
+``` python
+unc_vars = np.random.randn(n_vars, n_real)  # create [n_vars x n_real] array of uncorrelated (unc) normal random variables
+```
+
+%% Cell type:code id: tags:
+
+``` python
+chol_mat = np.linalg.cholesky(cov)  # calculate the Cholesky decomposition of the covariance matrix
+```
+
+%% Cell type:code id: tags:
+
+``` python
+cor_vars = chol_mat @ unc_vars  # [n_vars x n_real] array of correlated (cor) random variables
+```
+
+%% Cell type:code id: tags:
+
+``` python
+cor_vars.var(axis=1)  # calculate variances of each sample of random variables
+```
+
+%% Cell type:code id: tags:
+
+``` python
+np.corrcoef(cor_vars[0, :], cor_vars[1, :])  # calculate the correlation coefficient between the first and second random samples
+```
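
As a possible follow-up check to the solution cells above, the sketch below compares the full sample covariance with `cov` and compares the sample correlation coefficient with its theoretical value. It is not part of the original notebook. The seeded generator `rng`, the reduced `n_real`, and the names `sample_cov`, `sample_corr` and `theory_corr` are assumptions made for illustration.

```python
import numpy as np

# Covariance matrix from the exercise
cov = np.array([[1.0, 0.2, 0.4], [0.2, 0.8, 0.3], [0.4, 0.3, 1.1]])

# Seeded generator so the check is repeatable (the seed value is arbitrary)
rng = np.random.default_rng(42)
n_vars, n_real = 3, 200_000  # fewer realizations than the exercise, to keep the check fast

unc_vars = rng.standard_normal((n_vars, n_real))  # uncorrelated standard normals
chol_mat = np.linalg.cholesky(cov)  # lower-triangular L with L @ L.T == cov
cor_vars = chol_mat @ unc_vars  # correlated samples, one variable per row

sample_cov = np.cov(cor_vars)  # rows are variables, columns are realizations
sample_corr = np.corrcoef(cor_vars[0, :], cor_vars[1, :])[0, 1]

# Theoretical correlation coefficient: covariance scaled by both standard deviations
theory_corr = cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1])

print("sample covariance:\n", sample_cov.round(2))
print("sample correlation:", round(sample_corr, 3))
print("theoretical correlation:", round(theory_corr, 3))
```

The correlation coefficient is the covariance divided by the product of the two standard deviations, so it equals `cov[0, 1]` only when both variances are 1. Here the second variance is 0.8, so the two values differ.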
--- a/materials/3b-exercise_matplotlib.ipynb
+++ b/materials/3b-exercise_matplotlib.ipynb
+%% Cell type:markdown id: tags:
+
+# Matplotlib Exercise: Visualizing Correlated Gaussian Random Variables
+
+%% Cell type:markdown id: tags:
+
+Now that we know how to generate correlated random variables, let's visualize them.
+
+### Exercise
+
+Make the following plots. All plots must have x and y labels, titles, and legends if there is more than one dataset in the same axes.
+
+1. Overlaid histograms of your samples of uncorrelated random variables with 30 bins (use `histtype='step'`)
+2. A scatterplot of $X_2$ vs $X_1$ with marker size equal to 2. Overlay the theoretical line ($y=x$) as a black, dashed line.
+3. Overlaid histograms of your samples of correlated random variables with 30 bins (use `histtype='step'`)
+
+### Hints
+
+- In the arrays of random variables, each row `i` corresponds to a *sample* of random variable `i` (just FYI).
+- Google is your friend :)
+
+%% Cell type:code id: tags:
+
+``` python
+import matplotlib.pyplot as plt  # need to import matplotlib, of course
+import numpy as np  # import any needed modules here
+```
+
+%% Cell type:code id: tags:
+
+``` python
+n_real = int(1E6)  # number of realizations
+n_vars = 3  # number of random variables we want to correlate
+cov = np.array([[ 1. ,  0.2,  0.4], [ 0.2,  0.8,  0.3], [ 0.4,  0.3,  1.1]])  # covariance matrix
+```
+
+%% Cell type:code id: tags:
+
+``` python
+unc_vars = np.random.randn(n_vars, n_real)  # create [n_vars x n_real] array of uncorrelated (unc) normal random variables
+```
+
+%% Cell type:code id: tags:
+
+``` python
+chol_mat = np.linalg.cholesky(cov)  # calculate the Cholesky decomposition of the covariance matrix
+```
+
+%% Cell type:code id: tags:
+
+``` python
+cor_vars = chol_mat @ unc_vars  # [n_vars x n_real] array of correlated (cor) random variables
+```
+
+%% Cell type:code id: tags:
+
+``` python
+cor_vars.var(axis=1)  # calculate variances of each sample of random variables
+```
+
+%% Cell type:code id: tags:
+
+``` python
+np.corrcoef(cor_vars[0, :], cor_vars[1, :])  # calculate the correlation coefficient between the first and second random samples
+```
+
+%% Cell type:markdown id: tags:
+
+## Plot 1: Histogram of Uncorrelated Variables
+
+%% Cell type:markdown id: tags:
+
+Make a plot with overlaid histograms of your samples of uncorrelated random variables with 30 bins (use `histtype='step'`).
+
+%% Cell type:code id: tags:
+
+``` python
+  # insert code here
+```
+
+%% Cell type:markdown id: tags:
+
+## Plot 2: Scatterplot of X2 vs. X1
+
+%% Cell type:markdown id: tags:
+
+Make a scatterplot of $X_2$ vs $X_1$ with marker size equal to 2. Overlay the theoretical line ($y=x$) as a black, dashed line.
+
+%% Cell type:code id: tags:
+
+``` python
+  # insert code here
+```
+
+%% Cell type:markdown id: tags:
+
+## Plot 3: Histogram of Correlated Variables
+
+%% Cell type:markdown id: tags:
+
+Make a plot with overlaid histograms of your samples of correlated random variables with 30 bins (use `histtype='step'`).
+
+%% Cell type:code id: tags:
+
+``` python
+  # insert code here
+```
--- a/materials/3c-exercise_matplotlib_solutions.ipynb
+++ b/materials/3c-exercise_matplotlib_solutions.ipynb
--- a/materials/4c-exercise_pandas_solutions.ipynb
+++ b/materials/4c-exercise_pandas_solutions.ipynb