Download (right-click, save target as ...) this page as a JupyterLab notebook from: ES-4
Use the National Bridge Inventory Database and perform a rudimentary data analysis and content summary (just use ES-2 results).
Extract a Texas subset and a North Dakota subset.
Use the North Dakota subset to build a logistic classification model that distinguishes failed bridges (condition code 3 or smaller) from adequate bridges, using the predictor variables described in "Prediction of Bridge Component Ratings Using Ordinal Logistic Regression Model".
Repeat with the Texas subset.
Comment on the performance of your model(s).
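For the bridge classification task above, the labeling and fitting steps might be sketched as follows. This is a minimal illustration on synthetic data, not the inventory itself: the predictors `age` and `adt` (average daily traffic) and the way the rating is generated are assumptions made up for the example, and the real predictors should come from the cited paper. Only the threshold (code 3 or smaller means failed) comes from the problem statement.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 500

# Hypothetical predictors (assumptions for illustration only)
age = rng.uniform(0, 100, n)        # bridge age in years
adt = rng.uniform(100, 20000, n)    # average daily traffic

# Synthetic condition rating on a 0-9 scale that declines with age
rating = np.clip(np.round(9 - 0.07 * age + rng.normal(0, 1, n)), 0, 9).astype(int)

# Binary target from the problem statement: 1 = failed (code <= 3), 0 = adequate
failed = (rating <= 3).astype(int)

X = np.column_stack([age, np.log(adt)])
X_train, X_test, y_train, y_test = train_test_split(
    X, failed, test_size=0.25, random_state=0, stratify=failed
)

# Standardize the predictors, then fit the logistic classifier
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Rows are true classes, columns are predicted classes, in the order [adequate, failed]
cm = confusion_matrix(y_test, model.predict(X_test), labels=[0, 1])
failed_recall = cm[1, 1] / cm[1].sum()   # share of truly failed bridges that were flagged
print(cm)
print(f"recall on failed bridges: {failed_recall:.2f}")
```

Because failed bridges are typically a small share of the inventory, the confusion matrix and the recall on the failed class are usually more informative than overall accuracy when commenting on performance.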
Use the Algerian Forest Fires dataset: https://archive.ics.uci.edu/ml/datasets/Algerian+Forest+Fires+Dataset++
Make a classification model (logistic) that predicts fire/not-fire based on remaining variables.
Does region (one of the input variables) matter?
Suppose you build a model using data from only a single region. How well does that model perform on the other region?
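The single-region question can be approached by fitting on one region and scoring on the other. The sketch below uses synthetic data as a stand-in for the real file: the two predictors, the regional offsets, and the rule that generates `fire` are all assumptions invented for the illustration, so the printed accuracies say nothing about the actual dataset.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
n = 240

# Hypothetical region indicator: 0 for the first region, 1 for the second
region = np.repeat([0, 1], n // 2)

# Synthetic weather predictors with a small regional shift (assumption)
temperature = rng.normal(32, 4, n) + 2 * region
humidity = rng.normal(60, 15, n) - 5 * region

# Synthetic fire / not-fire label driven by the two predictors plus noise
score = 0.3 * (temperature - 32) - 0.08 * (humidity - 60) + rng.normal(0, 1, n)
fire = (score > 0).astype(int)

X = np.column_stack([temperature, humidity])
in_first = region == 0

# Fit on the first region only
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X[in_first], fire[in_first])

# Compare accuracy on the training region with accuracy on the unseen region
acc_same_region = model.score(X[in_first], fire[in_first])
acc_other_region = model.score(X[~in_first], fire[~in_first])
print(f"same region:  {acc_same_region:.2f}")
print(f"other region: {acc_other_region:.2f}")
```

Swapping the roles of the two regions and repeating the comparison gives a second estimate; a large gap between the two accuracies would suggest that region carries information the other predictors do not.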
Repeat the above problems (1 and 2) using k-nearest neighbors (KNN) as the classification engine. Be sure to use the same training sets. In your opinion, does KNN perform better or worse than logistic regression?
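One way to keep the comparison fair is to create the train/test split once and pass the same arrays to both classifiers. The following sketch assumes scikit-learn and uses `make_classification` to generate placeholder data; the choice of `n_neighbors=5` is an arbitrary starting value, not a recommendation from the assignment.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Placeholder data standing in for either the bridge or the forest fire table
X, y = make_classification(
    n_samples=400, n_features=6, n_informative=4, random_state=0
)

# Split once so that both models see exactly the same training set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# KNN is distance based, so both pipelines standardize the predictors first
models = {
    "logistic": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
}

scores = {}
for name, clf in models.items():
    clf.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, clf.predict(X_test))

print(scores)
```

Trying a few values of `n_neighbors` on the same split shows how sensitive KNN is to that setting, which is worth mentioning when giving an opinion on which method performs better.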