Download this page as a JupyterLab notebook (right-click, save target as ...) from: Lab14-TH


Laboratory 14: Correlation

LAST NAME, FIRST NAME

R00000000

ENGR 1330 Laboratory 14 - Homework

Exercise 1. Revisit Concrete

Recall that in Lab10-TH you accessed a file of concrete strength and related mixture variables.

Then you changed some column names.

Then you made the multiple plots.

So it's a cool plot, but the meaningful data science question is: which variable(s) have predictive value for estimating concrete strength?

Answer by:

  1. Determine the correlation coefficient for each pair of variables.
  2. Rank the variables by predictive value, from highest to lowest correlation magnitude.
  3. Build a linear data model based on the Cement variable: $Strength_{model} = \beta_0 + \beta_1 \cdot Cement$. What is its correlation coefficient?
  4. Build a scatterplot of the data model and the observations, and use the plot to find values of the two parameters $\beta_0$ and $\beta_1$.
  5. What is your assessment of the data model's utility for this database?
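
As a starting point for steps 1 and 2, the sketch below ranks predictors by the magnitude of their Pearson correlation with a target column. It runs on a small synthetic dataframe, not the concrete database; the column names, the made-up coefficients, and the helper `rank_predictors` are assumptions for illustration only. Substitute the dataframe and column names you built in Lab10-TH.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data (hypothetical values, NOT the Lab10-TH concrete file)
rng = np.random.default_rng(0)
n = 200
demo = pd.DataFrame({
    "Cement": rng.uniform(100, 500, n),
    "Water": rng.uniform(120, 250, n),
    "Age": rng.integers(1, 365, n).astype(float),
})
# Made-up relationship so that the example has something to detect
demo["Strength"] = 0.08 * demo["Cement"] - 0.05 * demo["Water"] + rng.normal(0, 5, n)

def rank_predictors(df, target):
    """Return each column's Pearson r with `target`, sorted by |r| descending."""
    r = df.corr()[target].drop(target)          # correlations with the target only
    order = r.abs().sort_values(ascending=False).index
    return r.reindex(order)                     # keep the sign, order by magnitude

print(demo.corr())                              # step 1: all variable pairs
ranking = rank_predictors(demo, "Strength")     # step 2: ranked by magnitude
print(ranking)
```

Ranking by absolute value matters because a strongly negative correlation is as useful for prediction as a strongly positive one.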

Repeat the exercise using Age as the predictor variable.
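
Because the same steps are repeated for Cement and Age, it may help to wrap the fit and the plot in small functions that take any predictor. The sketch below is one possible approach, not the required method: it estimates $\beta_0$ and $\beta_1$ by least squares with `numpy.polyfit` instead of adjusting them by eye on the plot, so it can serve as a check on values you find graphically. The function names `fit_line` and `plot_fit` and the demo numbers are hypothetical.

```python
import numpy as np

def fit_line(x, y):
    """Least-squares fit of y = b0 + b1*x; returns (b0, b1, r)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    b1, b0 = np.polyfit(x, y, 1)        # polyfit returns the highest power first
    r = np.corrcoef(x, y)[0, 1]         # Pearson r between predictor and response
    return b0, b1, r

def plot_fit(x, y, b0, b1, xlabel, ylabel="Strength"):
    """Scatterplot of observations with the data model drawn as a line."""
    import matplotlib.pyplot as plt     # imported here so fit_line works without it
    x = np.asarray(x, dtype=float)
    xs = np.linspace(x.min(), x.max(), 100)
    fig, ax = plt.subplots()
    ax.scatter(x, y, s=10, label="observations")
    ax.plot(xs, b0 + b1 * xs, color="red", label="data model")
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
    ax.legend()
    return ax

# Made-up values for illustration only (not the concrete database)
rng = np.random.default_rng(1)
x_demo = rng.uniform(100, 500, 100)
y_demo = 10.0 + 0.08 * x_demo + rng.normal(0, 4, 100)

b0, b1, r = fit_line(x_demo, y_demo)
print(f"beta0 = {b0:.3f}, beta1 = {b1:.4f}, r = {r:.3f}")
```

With your own dataframe, a call such as `plot_fit(df["Age"], df["Strength"], b0, b1, "Age")` would draw the comparison plot, assuming those column names. For a one-predictor linear model, the correlation between model values and observations equals the magnitude of the correlation between the predictor and the observations.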