Download this page as a JupyterLab notebook (right-click, save target as ...) from: Lab12


Laboratory 12: Practice with Pandas

LAST NAME, FIRST NAME

R00000000

ENGR 1330 Laboratory 12 - In-Lab

Exercise 1

Profile your computer

Run the script below exactly as written

In [6]:
import sys
! hostname
! whoami
print(sys.executable)
atomickitty
sensei
/opt/jupyterhub/bin/python3
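
The `!` lines in the script are notebook shell escapes, so they only run inside Jupyter or IPython. As a minimal sketch, assuming you want roughly the same information from plain Python, the standard-library modules `socket` and `getpass` can usually report it. The helper `current_user` and the variable names are illustrative and not part of the lab; for grading, still run the script above exactly as written.

```python
import getpass
import socket
import sys


def current_user():
    """Return the login name, or "unknown" if the system cannot report one."""
    try:
        return getpass.getuser()
    except Exception:
        # Some restricted environments have no user database entry
        return "unknown"


host_name = socket.gethostname()  # typically matches the output of `! hostname`
user_name = current_user()        # typically matches the output of `! whoami`
interpreter = sys.executable      # path of the running Python interpreter

print(host_name)
print(user_name)
print(interpreter)
```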

Example

Population Lines

Use pandas to read the file http://54.243.252.9/engr-1330-webroot/4-Databases/census_18.csv into a dataframe.
Then produce a line plot of the population counts by age for the 2010 census: the x-axis is the AGE column and the y-axis is the 2010 census values.

In [12]:
# get the file (using requests, or just download it to your computer by hand)
import requests  # module to process http/https requests

remote_url = "http://54.243.252.9/engr-1330-webroot/4-Databases/census_18.csv"  # set the url
rget = requests.get(remote_url, allow_redirects=True)  # get the remote resource, following any redirects
rget.raise_for_status()  # stop here if the download failed

# write the downloaded bytes to a local file with the same name
with open('census_18.csv', 'wb') as f:
    f.write(rget.content)
In [13]:
# read the file into a dataframe
import pandas as pd
df = pd.read_csv('census_18.csv')
df.head() # Examine dataframe layout
Out[13]:
AGE 2010 2014
0 0 3951330 3949775
1 1 3957888 3949776
2 2 4090862 3959664
3 3 4111920 4007079
4 4 4077551 4005716
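
One detail of this layout is easy to miss: the year columns are labeled with the strings "2010" and "2014", not with integers. The sketch below shows this on an in-memory copy of the five rows printed above, so it runs without the download. It assumes the file is comma-separated with the header row shown; the names `sample_csv` and `sample` are illustrative.

```python
import io

import pandas as pd

# First five rows of census_18.csv, copied from the df.head() output above.
# Assumption: the file is comma-separated with this header row.
sample_csv = """AGE,2010,2014
0,3951330,3949775
1,3957888,3949776
2,4090862,3959664
3,4111920,4007079
4,4077551,4005716
"""

# read_csv accepts a file-like object as well as a file name
sample = pd.read_csv(io.StringIO(sample_csv))

print(sample.columns.tolist())  # labels are strings: select with "2010", not 2010
print(sample.dtypes)            # data types pandas inferred for each column
print(sample.shape)             # (rows, columns)
```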
In [14]:
# plotting
df.plot.line(x="AGE", y="2010", label="2010", c="blue")  # line plot of the 2010 counts by age
Out[14]:
<AxesSubplot:xlabel='AGE'>

Exercise 2

Using your dataframe from above, plot both the 2010 and 2014 census values by age. Plot the 2010 distribution in blue and the 2014 distribution in red.

In [ ]:
ax = df.plot.line(x="", y="", label="", c="blue") # fill in the parameters
df.plot.line(x="", y="", label="", c="red", ax=ax)
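
The template relies on one idea: the first `plot.line` call returns a matplotlib axes object, and passing it back through `ax=` draws the next line on the same axes. Here is a minimal sketch of that pattern on made-up data. The dataframe `demo` and its columns are hypothetical and unrelated to the census file, and matplotlib must be installed for pandas plotting to work.

```python
import pandas as pd

# Hypothetical data, unrelated to the census file
demo = pd.DataFrame({
    "x": [0, 1, 2, 3],
    "first": [10, 12, 15, 19],
    "second": [11, 11, 14, 20],
})

# The first call creates the axes and returns them
ax = demo.plot.line(x="x", y="first", label="first", color="green")

# Passing ax= draws the second line on the same axes instead of a new figure
demo.plot.line(x="x", y="second", label="second", color="orange", ax=ax)

ax.set_ylabel("value")  # label the shared y-axis
```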

Exercise 3

  1. What is the population for age 9 in the 2010 census?
  2. What is the population for age 9 in the 2014 census?
  3. From 2010 to 2014, is the portion of the population over 9 years old increasing, decreasing, or staying the same?
In [ ]:
# your code here
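
One possible approach, sketched on the five rows shown in the earlier `df.head()` output rather than on the full file: a boolean mask selects rows by age, and a ratio of sums gives a share of the total. The ages used here (3 and 2) are arbitrary stand-ins, and the shares are relative to this five-row sample only, so the printed numbers are not the answers to the questions above.

```python
import pandas as pd

# Rows for ages 0 to 4, copied from the df.head() output earlier in this lab
sample = pd.DataFrame({
    "AGE": [0, 1, 2, 3, 4],
    "2010": [3951330, 3957888, 4090862, 4111920, 4077551],
    "2014": [3949775, 3949776, 3959664, 4007079, 4005716],
})

# Boolean mask: keep only the row where AGE equals 3
row = sample[sample["AGE"] == 3]
pop_2010 = int(row["2010"].iloc[0])
pop_2014 = int(row["2014"].iloc[0])
print(pop_2010, pop_2014)

# Share of the sample total held by ages above 2, for each census year
over = sample[sample["AGE"] > 2]
share_2010 = over["2010"].sum() / sample["2010"].sum()
share_2014 = over["2014"].sum() / sample["2014"].sum()
print(share_2010, share_2014)
```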

Bonus

Place the new histogram and the previous one side by side, and explain what you can infer by comparing them.
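
For the side-by-side layout, one option is to create two axes with `matplotlib.pyplot.subplots` and send each plot to its own axes through `ax=`. The sketch below uses made-up data and line plots only to show the layout; the dataframe `demo`, its columns, and the figure size are hypothetical choices, and the same `ax=` argument applies to other pandas plot kinds.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data, unrelated to the census file
demo = pd.DataFrame({
    "x": [0, 1, 2, 3],
    "before": [10, 12, 15, 19],
    "after": [11, 11, 14, 20],
})

# One row, two columns of axes; sharey=True makes the vertical scales comparable
fig, (left, right) = plt.subplots(1, 2, figsize=(10, 4), sharey=True)

demo.plot.line(x="x", y="before", color="blue", ax=left, title="before")
demo.plot.line(x="x", y="after", color="red", ax=right, title="after")

fig.tight_layout()  # avoid overlapping labels between the two panels
```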