% [Notebook output: one East Fork 1976 sample record and the dataset column names
%  (ID, Location, Q_m3_s, q_m2_s, U_m_s, W_m, H_m, R_m, S_m_m, D16_m, D50_m, D84_m,
%  D90_m, Sorting, Rhos_kg_m3, Rho_kg_m3, qb_kg_m_s, C_ppht, Record_Number, Froude,
%  GammaS_N_m3, GammaF_N_m3, Tau0_kg_m_s2, TauStar, Ustar_m_s, ManningN, WP_m, A_m2);
%  candidate features: Q_m3_s, q_m2_s, U_m_s, W_m, H_m, R_m, S_m_m, D50_m,
%  Rhos_kg_m3, Rho_kg_m3]

\subsection*{Homework: Feature Reduction and Distance Metric in KNN Regression}

Using your previous implementation of \texttt{KNeighborsRegressor}, complete the following tasks:

\begin{enumerate}
  \item Use the \texttt{permutation\_importance} function from \texttt{sklearn.inspection} to estimate the importance of each feature in your dataset.
  \item Identify the top 5 or 6 most important features based on the permutation importance scores.
  \item Retrain the \texttt{KNeighborsRegressor} model using only the reduced feature set, keeping the same number of neighbors as in the full model.
  \item Compare the performance of the reduced model to the full model using at least the following metrics:
  \begin{itemize}
    \item Coefficient of determination ($R^2$ score)
    \item Root mean square error (RMSE)
  \end{itemize}
  \item For a fixed input vector (you may use the one from the earlier example), report the estimate computed by both models (full and reduced). Are they close? Provide a short explanation.
  \item Repeat the modeling using two different distance powers: $p=1$ (Manhattan) and $p=2$ (Euclidean).
  Comment on how the distance metric affects:
  \begin{itemize}
    \item Feature importance rankings
    \item Estimated values
    \item Model performance
  \end{itemize}
\end{enumerate}

\textbf{Note:} This exercise aims to build intuition about (a) which features matter most in KNN regression and (b) how the choice of distance metric influences model behavior.
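A minimal sketch of tasks 1--4 follows. It uses a synthetic dataset from \texttt{make\_regression} as a stand-in for the sediment-transport data (which is not reproduced here), so the variable names (\texttt{X}, \texttt{y}, \texttt{top}) and all numeric settings other than $k$ and the feature count are illustrative assumptions, not part of the assignment:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.inspection import permutation_importance
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the 10-feature dataset; only some features are informative.
X, y = make_regression(n_samples=400, n_features=10, n_informative=5,
                       noise=5.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Full model: scale features so no single feature dominates the distance.
full = make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=5))
full.fit(X_train, y_train)

# Task 1: permutation importance, estimated on held-out data.
result = permutation_importance(full, X_test, y_test,
                                n_repeats=20, random_state=0)

# Task 2: indices of the top-5 features by mean importance.
top = np.argsort(result.importances_mean)[::-1][:5]

# Task 3: reduced model with the same k, restricted to the top-5 features.
reduced = make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=5))
reduced.fit(X_train[:, top], y_train)

# Task 4: compare R^2 and RMSE on the test set.
for name, model, Xt in [("full", full, X_test),
                        ("reduced", reduced, X_test[:, top])]:
    pred = model.predict(Xt)
    rmse = mean_squared_error(y_test, pred) ** 0.5
    print(f"{name}: R2 = {r2_score(y_test, pred):.3f}, RMSE = {rmse:.2f}")
```

Note that the importances are computed on the test split; scores from the training split tend to be optimistic for KNN, since each point is close to itself in feature space.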