

Laboratory 20: Regression Goodness of Fit

LAST NAME, FIRST NAME

R00000000

ENGR 1330 Laboratory 20 - Homework


In [1]:
# Preamble script block to identify host, user, and kernel
import sys
! hostname
! whoami
print(sys.executable)
print(sys.version)
print(sys.version_info)
atomickitty
sensei
/opt/jupyterhub/bin/python3
3.8.10 (default, Sep 28 2021, 16:10:42) 
[GCC 9.3.0]
sys.version_info(major=3, minor=8, micro=10, releaselevel='final', serial=0)

Exercise: Ice Cream Cones!


The 'icecreamcone.csv' file contains daily records of temperature, relative humidity, cone strength, and cone weight, based on noon readings for 20 days of cone making. Follow the steps below and answer the questions:

  • Step1: Read the "icecreamcone.csv" file as a dataframe. Explore the dataframe and, in a markdown cell, briefly describe it in your own words.

  • Step2: Calculate and compare the correlation coefficients of the cone's weight with all the other parameters. In a markdown cell, explain the results and state which parameters have the strongest and weakest relationships with the cone's weight.

  • Step3: Use linear regression modeling with statsmodels, with humidity as the predictor and cone's weight as the outcome. Get the linear model's coefficients, make a plot, and VISUALLY assess the quality of the linear fit. Then use RMSE, Pearson's r, and NSE to describe the performance of your model. Explain the results of this analysis in a markdown cell.

  • Step4: Use linear regression modeling with statsmodels, with cone's strength as the predictor and cone's weight as the outcome. Get the linear model's coefficients, make a plot, and VISUALLY assess the quality of the linear fit. Then use RMSE, Pearson's r, and NSE to describe the performance of your model. Explain the results of this analysis in a markdown cell.

  • Step5: Use multiple linear regression modeling with scikit-learn, using all three predictor parameters to predict the cone's weight. Then use RMSE, Pearson's r, and NSE to describe the performance of your model. Explain the results of this analysis in a markdown cell.

  • Step6: As a conclusion, make a statement about the quality of the three predictive models you built and compare their performance.

Data Source: V.T. Huang, S.T. Luebbers, J.B. Lindamood, P.M.T. Hansen (1989). "Ice Cream Cone Baking: 2. Textured Characteristics of Rolled Sugar Cones," Food Hydrocolloids, Vol. 3, #1, pp. 41-55.
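
For reference, the three goodness-of-fit metrics named in the steps above are commonly defined as shown below. The notation is chosen here for illustration and does not come from the lab itself: $o_i$ are the observed values, $m_i$ are the model predictions, overbars denote sample means, and $n$ is the number of observations. NSE is taken to mean the Nash-Sutcliffe efficiency.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(m_i - o_i\right)^2}

r = \frac{\sum_{i=1}^{n}\left(o_i - \bar{o}\right)\left(m_i - \bar{m}\right)}
         {\sqrt{\sum_{i=1}^{n}\left(o_i - \bar{o}\right)^2}\,
          \sqrt{\sum_{i=1}^{n}\left(m_i - \bar{m}\right)^2}}

\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{n}\left(o_i - m_i\right)^2}
                        {\sum_{i=1}^{n}\left(o_i - \bar{o}\right)^2}
```

An RMSE of 0 and an NSE of 1 both indicate a perfect match. An NSE of 0 means the model does no better than predicting the mean of the observations.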

In [1]:
import requests  # module to process http/https requests
remote_url = "http://54.243.252.9/engr-1330-webroot/4-Databases/icecreamcone.csv"  # set the url
rget = requests.get(remote_url, allow_redirects=True)  # get the remote resource, following redirects
with open('icecreamcone.csv', 'wb') as local_file:  # write the response body to a local file of the same name
    local_file.write(rget.content)
In [10]:
#Step1:Read the "icecreamcone.csv" file as a dataframe
In [11]:
# Explore the dataframe: Describe the df
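
One way to approach Step1 is sketched below. It is only an illustration: the column names (`temperature`, `humidity`, `strength`, `weight`) and every value in the sample text are invented stand-ins, not the contents of the real file, so check the actual header of `icecreamcone.csv` before reusing any of the names.

```python
import io
import pandas as pd

# Hypothetical stand-in for icecreamcone.csv. The column names and values
# are invented for illustration and are NOT the real data.
sample_csv = io.StringIO(
    "temperature,humidity,strength,weight\n"
    "70,40,30.0,20.0\n"
    "72,45,31.5,20.4\n"
    "75,50,33.0,21.1\n"
    "78,55,33.8,21.3\n"
)

# With the downloaded file in place this would be pd.read_csv('icecreamcone.csv')
cones = pd.read_csv(sample_csv)

print(cones.head())      # first rows of the dataframe
cones.info()             # column names, dtypes, and non-null counts
print(cones.describe())  # count, mean, std, min, quartiles, and max per column
```

`head()`, `info()`, and `describe()` together usually give enough material for the short written description the step asks for.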
In [12]:
#Step2: Calculate and compare the correlation coefficient
#What can we infer?
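
A possible pattern for Step2 is sketched below, assuming hypothetical column names and invented values (the real file may differ on both). It computes Pearson correlation coefficients of every other column with `weight` and ranks them by absolute value.

```python
import pandas as pd

# Invented illustration data; column names are assumptions, not the real header.
cones = pd.DataFrame({
    "temperature": [70, 72, 75, 78, 80],
    "humidity":    [40, 55, 45, 60, 50],
    "strength":    [30.0, 31.5, 33.0, 33.8, 35.1],
    "weight":      [20.0, 20.4, 21.1, 21.3, 22.0],
})

# Pearson correlation of each predictor with weight (weight itself is dropped)
corr_with_weight = cones.corr()["weight"].drop("weight")

# Rank by magnitude, because a strong negative correlation is still strong
ranked = corr_with_weight.abs().sort_values(ascending=False)
strongest = ranked.index[0]
weakest = ranked.index[-1]

print(corr_with_weight)
print("strongest:", strongest, "| weakest:", weakest)
```

Ranking on the absolute value is a design choice: the sign only gives the direction of the relationship, while the magnitude gives its strength.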
In [ ]:

In [13]:
#Step3: humidity as the predictor
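
A minimal sketch of a single-predictor fit with the statsmodels formula interface follows. The dataframe is invented for illustration and the column names `humidity` and `weight` are assumptions. The same pattern would apply to Step4 by swapping the predictor name in the formula.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Invented illustration data; column names are assumptions, not the real header.
cones = pd.DataFrame({
    "humidity": [40, 45, 50, 55, 60, 65],
    "weight":   [20.1, 20.3, 20.9, 21.0, 21.6, 21.8],
})

# Ordinary least squares: weight = intercept + slope * humidity
model = smf.ols("weight ~ humidity", data=cones).fit()

intercept = model.params["Intercept"]
slope = model.params["humidity"]
fitted = model.fittedvalues  # model predictions at the observed humidity values

print(f"weight = {intercept:.3f} + {slope:.4f} * humidity")
print(model.summary())
```

For the visual assessment, a scatter plot of the observations with the line through `fitted` overlaid is one common choice.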
In [14]:
#GOF metrics:
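
The three goodness-of-fit metrics could be written as small helper functions, as in the sketch below. The function names `rmse`, `pearson_r`, and `nse` are hypothetical helpers introduced here, NSE is taken to mean the Nash-Sutcliffe efficiency, and the sample numbers are invented.

```python
import numpy as np

def rmse(observed, modeled):
    """Root mean squared error between observations and predictions."""
    observed = np.asarray(observed, dtype=float)
    modeled = np.asarray(modeled, dtype=float)
    return float(np.sqrt(np.mean((modeled - observed) ** 2)))

def pearson_r(observed, modeled):
    """Pearson correlation coefficient between observations and predictions."""
    return float(np.corrcoef(observed, modeled)[0, 1])

def nse(observed, modeled):
    """Nash-Sutcliffe efficiency: 1 - SS_residual / SS_about_observed_mean."""
    observed = np.asarray(observed, dtype=float)
    modeled = np.asarray(modeled, dtype=float)
    ss_residual = np.sum((observed - modeled) ** 2)
    ss_total = np.sum((observed - observed.mean()) ** 2)
    return float(1.0 - ss_residual / ss_total)

# Invented values, only to show the calls
observed = [20.0, 20.4, 21.1, 21.3, 22.0]
modeled = [20.1, 20.5, 20.9, 21.4, 21.9]
print("RMSE:", rmse(observed, modeled))
print("Pearson r:", pearson_r(observed, modeled))
print("NSE:", nse(observed, modeled))
```

The same helpers can be reused for each model by passing the observed weights and that model's predictions.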
In [15]:
#Step4: strength as the predictor
In [16]:
#GOF metrics:
In [17]:
#Step5: 3 predictor - Multiple Linear Regression
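
A hedged sketch of a three-predictor fit with scikit-learn is shown below. As before, the data are invented and the column names are assumptions that need to be checked against the real file.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Invented illustration data; column names are assumptions, not the real header.
cones = pd.DataFrame({
    "temperature": [70, 72, 75, 78, 80, 83],
    "humidity":    [40, 55, 45, 60, 50, 65],
    "strength":    [30.0, 31.5, 33.0, 33.8, 35.1, 35.9],
    "weight":      [20.0, 20.4, 21.1, 21.3, 22.0, 22.2],
})

predictors = ["temperature", "humidity", "strength"]
X = cones[predictors]  # 2-D feature matrix, one column per predictor
y = cones["weight"]    # 1-D target

mlr = LinearRegression().fit(X, y)
predicted = mlr.predict(X)  # predictions on the same rows used for fitting
r2 = mlr.score(X, y)        # coefficient of determination, 1 - SS_res / SS_tot

print("intercept:", mlr.intercept_)
print("coefficients:", dict(zip(predictors, mlr.coef_)))
print("R^2 on the fitting data:", r2)
```

The value returned by `score` uses the same formula as NSE applied to the observed and predicted weights, so it can serve as a cross-check on a hand-written NSE calculation.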
In [18]:
#GOF metrics:
In [19]:
#Step6:
In [4]:


In [ ]: