Download (right-click, save target as ...) this page as a jupyterlab notebook from: Lab25
LAST NAME, FIRST NAME
R00000000
ENGR 1330 Laboratory 25
Population: In statistics, a population is the entire pool from which a statistical sample is drawn. A population may refer to an entire group of people, objects, events, hospital visits, or measurements. 
Sample: In statistics and quantitative research methodology, a sample is a set of individuals or objects collected or selected from a statistical population by a defined procedure. The elements of a sample are known as sample points, sampling units or observations.
Distribution (Data Model): A data distribution is a function or a listing which shows all the possible values (or intervals) of the data. It also (and this is important) tells you how often each value occurs. 
From https://www.investopedia.com/terms
https://www.statisticshowto.com/data-distribution/
The file 08068500.pkf is an actual WATSTORE formatted file for a USGS gage at Spring Creek, Texas. The first few lines of the file look like:
Z08068500                       USGS 
H08068500       3006370952610004848339SW12040102409    409     72.6             
N08068500       Spring Ck nr Spring, TX
Y08068500       
308068500       19290530  483007              34.30        1879          
308068500       19390603    838               13.75                    
308068500       19400612   3420               21.42                    
308068500       19401125  42700               33.60                    
308068500       19420409  14200               27.78                    
308068500       19430730   8000               25.09                    
308068500       19440319   5260               23.15                    
308068500       19450830  31100               32.79                    
308068500       19460521  12200               27.97                 
The first column are some agency codes that identify the station , the second column after the fourth row is a date in YYYYMMDD format, the third column is a discharge in CFS, the fourth and fifth column are not relevant for this laboratory exercise. The file was downloadef from
https://nwis.waterdata.usgs.gov/tx/nwis/peak?site_no=08068500&agency_cd=USGS&format=hn2
In the original file there are a couple of codes that are manually removed:
The laboratory task is to fit the data models to this data, decide the best model from visual perspective, and report from that data model the magnitudes of peak flow associated with the probebilitiess below (i.e. populate the table)
| Exceedence Probability | Flow Value | Remarks | 
|---|---|---|
| 25% | ???? | 75% chance of greater value | 
| 50% | ???? | 50% chance of greater value | 
| 75% | ???? | 25% chance of greater value | 
| 90% | ???? | 10% chance of greater value | 
| 99% | ???? | 1% chance of greater value (in flood statistics, this is the 1 in 100-yr chance event) | 
| 99.8% | ???? | 0.002% chance of greater value (in flood statistics, this is the 1 in 500-yr chance event) | 
| 99.9% | ???? | 0.001% chance of greater value (in flood statistics, this is the 1 in 1000-yr chance event) | 
The first step is to read the file, skipping the first part, then build a dataframe:
# Get the datafile
# Read the data file
amatrix = [] # null list to store matrix reads
rowNumA = 0
matrix1=[]
col0=[]
col1=[]
col2=[]
with open('08068500.pkf','r') as afile:
    lines_after_4 = afile.readlines()[4:]
afile.close() # Disconnect the file
howmanyrows = len(lines_after_4)
for i in range(howmanyrows):
    matrix1.append(lines_after_4[i].strip().split())
for i in range(howmanyrows):
    col0.append(matrix1[i][0])
    col1.append(matrix1[i][1])
    col2.append(matrix1[i][2])
# col2 is date, col3 is peak flow
#now build a datafranem
import pandas 
df = pandas.DataFrame(col0)
df['date']= col1
df['flow']= col2
df.head()
Now explore if you can plot the dataframe as a plot of peaks versus date.
# Plot here
From here on you can proceede using the lecture notebook as a go-by, although you should use functions as much as practical to keep your work concise
# Descriptive Statistics 
# Weibull Plotting Position Function
# Normal Quantile Function 
# Fitting Data to Normal Data Model 
| Exceedence Probability | Flow Value | Remarks | 
|---|---|---|
| 25% | ???? | 75% chance of greater value | 
| 50% | ???? | 50% chance of greater value | 
| 75% | ???? | 25% chance of greater value | 
| 90% | ???? | 10% chance of greater value | 
| 99% | ???? | 1% chance of greater value (in flood statistics, this is the 1 in 100-yr chance event) | 
| 99.8% | ???? | 0.002% chance of greater value (in flood statistics, this is the 1 in 500-yr chance event) | 
| 99.9% | ???? | 0.001% chance of greater value (in flood statistics, this is the 1 in 1000-yr chance event) |