Laboratory 5: Classifiles- On Classes and Files

In [1]:
# Preamble script block to identify host, user, and kernel
import sys
! hostname
! whoami
print(sys.executable)
print(sys.version)
print(sys.version_info)
DESKTOP-EH6HD63
desktop-eh6hd63\farha
C:\Users\Farha\Anaconda3\python.exe
3.7.4 (default, Aug  9 2019, 18:34:13) [MSC v.1915 64 bit (AMD64)]
sys.version_info(major=3, minor=7, micro=4, releaselevel='final', serial=0)

Full name:

R#:

Title of the notebook:

Date:


Part I : Classes and Objects

In object-oriented programming, a class is an extensible program-code-template for creating objects, providing initial values for state (member variables) and implementations of behavior (member functions or methods). In many languages, the class name is used as the name for the class (the template itself), the name for the default constructor of the class (a subroutine that creates objects), and as the type of objects generated by instantiating the class; these distinct concepts are easily conflated.

When an object is created by a constructor of the class, the resulting object is called an instance of the class, and the member variables specific to the object are called instance variables, to contrast with the class variables shared across the class. Classes provide a means of bundling data and functionality together. Creating a new class creates a new type of object, allowing new instances of that type to be made. Each class instance can have attributes attached to it for maintaining its state. Class instances can also have methods (defined by their class) for modifying their state.
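
The distinction between class variables and instance variables can be sketched in a few lines. The class name Counter and its attributes are hypothetical, chosen only for illustration; they are not part of the laboratory examples.

```python
class Counter:
    # class variable: one value shared by every instance
    created = 0

    def __init__(self, label):
        # instance variable: each object keeps its own value
        self.label = label
        # update the shared value through the class
        Counter.created += 1


a = Counter('first')
b = Counter('second')

print(a.label, b.label)   # each instance has its own label
print(Counter.created)    # the shared count, read through the class
print(a.created)          # the same shared count, read through an instance
```

Reading a.created works because attribute lookup falls back to the class when the instance itself has no attribute of that name.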

Class definitions, like function definitions (def statements), must be executed before they have any effect. (You could conceivably place a class definition in a branch of an if statement, or inside a function.)

In practice, the statements inside a class definition will usually be function definitions, but other statements are allowed, and sometimes useful — we’ll come back to this later. The function definitions inside a class normally have a peculiar form of argument list, dictated by the calling conventions for methods — again, this is explained later.

When a class definition is entered, a new namespace is created, and used as the local scope — thus, all assignments to local variables go into this new namespace. In particular, function definitions bind the name of the new function here.

When a class definition is left normally (via the end), a class object is created. This is basically a wrapper around the contents of the namespace created by the class definition; we’ll learn more about class objects in the next section. The original local scope (the one in effect just before the class definition was entered) is reinstated, and the class object is bound here to the class name given in the class definition header (ClassName in the example).
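
A small sketch of the mechanism described above: the names assigned inside the class body end up in the namespace wrapped by the class object, and the class object is bound to the class name. The attribute greeting and the method speak are hypothetical names used only for illustration.

```python
class ClassName:
    """A tiny illustrative class."""
    greeting = 'hello'              # an assignment inside the class body

    def speak(self):                # a function definition inside the class body
        return self.greeting


# Both names were stored in the namespace wrapped by the class object
print('greeting' in ClassName.__dict__)
print('speak' in ClassName.__dict__)

# The class object itself is bound to the name ClassName in the enclosing scope
print(type(ClassName))
print(ClassName().speak())
```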

What is an object

.....

Learn more at

  1. https://docs.python.org/3/tutorial/classes.html
  2. https://en.wikipedia.org/wiki/Class_(computer_programming)


Example

Let's write a class named 'Personnel' to store the information of a company's employees and generate email addresses for them with the following format "firstname.lastname@DunderMifflinPaperCompany.com"

Here are the employee names:

First Name Last Name Age
Michael Scott 43
Pam Beesly 24
Jim Halpert 26
Dwight Schrute 36
Creed Bratton 63
In [2]:
# First, let's create an empty class and call it "Personnel"
class Personnel:                 #This is how we create classes - Kinda similar to functions
    pass                         #This is to avoid getting an error - because we want to leave the class empty, for now.

emp_1 = Personnel()               #Let's define an instance for the Personnel class
emp_2 = Personnel()               #Let's define another instance for the Personnel class
emp_3 = Personnel()               
In [3]:
print(emp_1)                      #Let's see what these are: they are identified as "Personnel" objects
print(emp_2)                      #In other words, these are instances of the Personnel class
print(emp_3)
<__main__.Personnel object at 0x000001C4AC3D2A48>
<__main__.Personnel object at 0x000001C4AC3D22C8>
<__main__.Personnel object at 0x000001C4AC3D2548>
In [4]:
emp_1.first = 'Michael'           #We can give attributes to these instances, For example
emp_1.last = 'Scott'
emp_1.age = 43

emp_2.first = 'Pam'
emp_2.last = 'Beesly'
emp_2.age = 24

emp_3.first = 'Jim'
emp_3.last = 'Halpert'
emp_3.age = 26
In [5]:
print(emp_1.first)                #Now let's see how successful we were...
print(emp_2.last)
print(emp_3.age)
Michael
Beesly
26
In [6]:
# We have managed to successfully create instances of a class called Personnel and add attributes to them.
# BUT there is really no benefit if we have to do everything like this, one by one. It is also prone to error.
# So instead of an empty class, we can define our Personnel class with more details:

class Personnel:                                             #No change from the last time
    """Some documentation"""
    def __init__(self,first,last,age):                       #__init__ initializes each new instance | self: the instance itself; the name is a strongly recommended convention
        self.first= first                                    #set all the "instance variables"
        self.last= last
        self.age= age
        self.email = first + '.' + last + '@DunderMifflinPaperCompany.com'     # use the given format to generate the email addresses
In [7]:
emp_1 = Personnel('Michael','Scott',43)                      #Fill the instances
emp_2 = Personnel('Pam','Beesly',24)
emp_3 = Personnel('Jim','Halpert',26)
emp_4 = Personnel('Dwight','Schrute',36)
emp_5 = Personnel('Creed','Bratton',63)
In [8]:
print(emp_1.email)                                           #Let's check if our email address generator is working as expected
print(emp_2.email)
print(emp_3.email)
print(emp_4.email)
print(emp_5.email)
Michael.Scott@DunderMifflinPaperCompany.com
Pam.Beesly@DunderMifflinPaperCompany.com
Jim.Halpert@DunderMifflinPaperCompany.com
Dwight.Schrute@DunderMifflinPaperCompany.com
Creed.Bratton@DunderMifflinPaperCompany.com
In [10]:
# At this point, we have done everything the question has asked. But let's go a little further and see if we can get the full name combination.
# One way would be this:
print('{} {}'.format(emp_1.first,emp_1.last))             #This is again too manual and needs too much typing!
Michael Scott
In [11]:
# We can update our class with a "method":
class Personnel:                                             #No change from the last time
    def __init__(self,first,last,age):
        self.first= first
        self.last= last
        self.age= age
        self.email = first + '.' + last + '@DunderMifflinPaperCompany.com'     # use the given format to generate the email addresses
    def fullname(self):                                 #This is a method! - very similar to a function within our class
        return '{} {}'.format(self.first,self.last)     #Mind the difference in syntax from the last cell: self.first instead of emp_1.first
In [12]:
emp_1 = Personnel('Michael','Scott',43)                      #Fill the instances
emp_2 = Personnel('Pam','Beesly',24)
emp_3 = Personnel('Jim','Halpert',26)
emp_4 = Personnel('Dwight','Schrute',36)
emp_5 = Personnel('Creed','Bratton',63)
In [13]:
print(Personnel.fullname(emp_4))                              #Let's check if we can get Dwight's full name
print(emp_4.fullname()) 
Dwight Schrute
Dwight Schrute
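
Creating each instance by hand still repeats a lot of typing. One possible way to reduce it, sketched below, is to keep the table of names in a list and build the instances in a loop. The names roster and staff are hypothetical; the class is the one defined above.

```python
class Personnel:
    def __init__(self, first, last, age):
        self.first = first
        self.last = last
        self.age = age
        self.email = first + '.' + last + '@DunderMifflinPaperCompany.com'

    def fullname(self):
        return '{} {}'.format(self.first, self.last)


# the employee table from the example, one tuple per row
roster = [
    ('Michael', 'Scott', 43),
    ('Pam', 'Beesly', 24),
    ('Jim', 'Halpert', 26),
    ('Dwight', 'Schrute', 36),
    ('Creed', 'Bratton', 63),
]

# build one Personnel instance per row
staff = [Personnel(first, last, age) for first, last, age in roster]

for emp in staff:
    print(emp.fullname(), emp.email)
```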

Example:

Given that we are in 2020, use the script from the previous example and modify the "Personnel" class so that it can calculate the year each employee was born.

Here are the employee names:

First Name Last Name Age
Michael Scott 43
Pam Beesly 24
Jim Halpert 26
Dwight Schrute 36
Creed Bratton 63
In [14]:
class Personnel:                                             #No change from the last time
    def __init__(self,first,last,age):
        self.first= first
        self.last= last
        self.age= age
        self.email = first + '.' + last + '@DunderMifflinPaperCompany.com'     # use the given format to generate the email addresses
    def fullname(self):                                 #Same method as before
        return '{} {}'.format(self.first,self.last)
    def yearborn(self):                                 #New method: birth year, taking 2020 as the current year
        year = 2020 - self.age
        return year
In [15]:
emp_1 = Personnel('Michael','Scott',43)                      #Fill the instances
emp_2 = Personnel('Pam','Beesly',24)
emp_3 = Personnel('Jim','Halpert',26)
emp_4 = Personnel('Dwight','Schrute',36)
emp_5 = Personnel('Creed','Bratton',63)
In [16]:
print(Personnel.yearborn(emp_5))                              #Let's check if we can get what year Creed was born
print(emp_5.yearborn())
1957
1957
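
Hard-coding the year inside the method means the class must be edited whenever the reference year changes. One possible variation, sketched below, passes the year as an argument. The parameter name current_year and its default value are assumptions made for illustration, and the class is trimmed to the parts needed here.

```python
class Personnel:
    def __init__(self, first, last, age):
        self.first = first
        self.last = last
        self.age = age

    def yearborn(self, current_year=2020):
        # current_year is a hypothetical parameter; the default matches the cell above
        return current_year - self.age


emp_5 = Personnel('Creed', 'Bratton', 63)
print(emp_5.yearborn())        # uses the default reference year
print(emp_5.yearborn(2021))    # same employee, different reference year
```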

Part II : Files and Filesystems

Background

A computer file is a computer resource for recording data discretely (that is, as a distinct unit stored at a specific place on a piece of hardware, not in the secretive sense) in a computer storage device. Just as words can be written to paper, so can information be written to a computer file. Files can be edited on the computer system that holds them and transferred to other systems, for example through the internet.

There are different types of computer files, designed for different purposes. A file may be designed to store a picture, a written message, a video, a computer program, or a wide variety of other kinds of data. Some types of files can store several types of information at once.

By using computer programs, a person can open, read, change, save, and close a computer file. Computer files may be reopened, modified, and copied an arbitrary number of times.

Typically, files are organised in a file system, which keeps track of where the files are located on disk and enables user access.

File system

In computing, a file system (or filesystem) controls how data is stored and retrieved. Without a file system, data placed in a storage medium would be one large body of data with no way to tell where one piece of data stops and the next begins. By separating the data into pieces and giving each piece a name, the data is isolated and identified. Taking its name from paper-based data management systems, each group of data is called a “file”. The structure and logic rules used to manage the groups of data and their names are called a “file system”.

Path

A path, the general form of the name of a file or directory, specifies a unique location in a file system. A path points to a file system location by following the directory tree hierarchy expressed in a string of characters in which path components, separated by a delimiting character, represent each directory. The delimiting character is most commonly the slash (”/”), the backslash character (”\”), or colon (”:”), though some operating systems may use a different delimiter. Paths are used extensively in computer science to represent the directory/file relationships common in modern operating systems, and are essential in the construction of Uniform Resource Locators (URLs). Resources can be represented by either absolute or relative paths. As an example consider the following two files:

  1. /Users/theodore/MyGit/@atomickitty/hurri-sensors/.git/Guest.conf
  2. /etc/apache2/users/Guest.conf

They both have the same file name, but are located on different paths. Failure to provide the path when addressing the file can be a problem. Another way to interpret this is that the two files actually have different names, and only part of those names (Guest.conf) is common. The two names above (including the path) are called fully qualified filenames (or absolute names). A relative path, by contrast, depends on where in the directory structure the file lives, usually relative to the file or program of interest. If we are currently in the .git directory (the location of the first file), the path to that file is just the filename.
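
The two example paths can be examined with the standard pathlib module. This is a minimal sketch; PurePosixPath is used so that the slash-separated paths are handled the same way on any operating system, and no file is actually touched.

```python
from pathlib import PurePosixPath

first = PurePosixPath('/Users/theodore/MyGit/@atomickitty/hurri-sensors/.git/Guest.conf')
second = PurePosixPath('/etc/apache2/users/Guest.conf')

print(first.name == second.name)   # the file names are the same
print(first == second)             # the fully qualified names are not
print(first.parent)                # the directory that holds the first file
print(first.is_absolute())         # an absolute path starts at the root

# Relative to its own directory, the path to the first file is just the file name
relative = first.relative_to(first.parent)
print(relative)
print(relative.is_absolute())
```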

We have experienced path issues with dependencies on .png files - in general your JupyterLab notebooks on CoCalc can only look at the local directory which is why we have to copy files into the directory for things to work.

File Types

  1. Text Files. Text files are regular files that contain information readable by the user. This information is stored in ASCII. You can display and print these files. The lines of a text file must not contain NUL characters, and none can exceed a prescribed (by architecture) length, including the new-line character. The term text file does not prevent the inclusion of control or other nonprintable characters (other than NUL). Therefore, standard utilities that list text files as inputs or outputs are either able to process the special characters gracefully or they explicitly describe their limitations within their individual sections.

  2. Binary Files. Binary files are regular files that contain information readable by the computer. Binary files may be executable files that instruct the system to accomplish a job. Commands and programs are stored in executable, binary files. Special compiling programs translate ASCII text into binary code. The only difference between text and binary files is that text files have lines of less than some length, with no NUL characters, each terminated by a new-line character.

  3. Directory Files. Directory files contain information the system needs to access all types of files, but they do not contain the actual file data. As a result, directories occupy less space than a regular file and give the file system structure flexibility and depth. Each directory entry represents either a file or a subdirectory. Each entry contains the name of the file and the file's index node reference number (i-node). The i-node points to the unique index node assigned to the file. The i-node describes the location of the data associated with the file. Directories are created and controlled by a separate set of commands.
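
From Python, the same file can be opened in text mode or in binary mode. The sketch below writes a short ASCII text file to a scratch directory and reads it back both ways; the file name sample.txt is hypothetical.

```python
import os
import tempfile

folder = tempfile.mkdtemp()                   # scratch directory for the demonstration
path = os.path.join(folder, 'sample.txt')     # hypothetical file name

# text mode: we write a str; newline='\n' stores the line ending unchanged on every system
with open(path, 'w', encoding='ascii', newline='\n') as f:
    f.write('abc\n')

# text mode: we read a str
with open(path, 'r', encoding='ascii') as f:
    as_text = f.read()

# binary mode: we read the raw bytes
with open(path, 'rb') as f:
    as_bytes = f.read()

print(repr(as_text))
print(as_bytes)
print(list(as_bytes))                         # the ASCII codes stored on disk

# clean up the scratch file and directory
os.remove(path)
os.rmdir(folder)
```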

File Manipulation

For this laboratory we will learn just a handful of file manipulations which are quite useful. Files can be "created", "read", "updated", or "deleted" (CRUD).


Example: Create a file, write to it.

Below is an example of creating a file that does not yet exist. The script is a bit pedantic on purpose.

First we will use some system commands to view the contents of the local directory.

In [31]:
import sys
# delete the file if it exists (use "! rm -f myfirstfile.txt" on Mac)
! del myfirstfile.txt
# show the name of the working directory; it includes the path, so it is an absolute path
%pwd
Out[31]:
'C:\\Users\\Farha'
In [32]:
# create file example
externalfile = open("myfirstfile.txt",'w') # create connection to file, set to write (w), file does not need to exist
mymessage = 'message in a bottle' #some object to write, in this case a string
externalfile.write(mymessage)# write the contents of mymessage to the file
externalfile.close() # close the file connection

At this point our new file should exist. To confirm, we will use a shell command to look at the contents of the file: type on Windows (used below), or cat on Mac and Linux.

In [33]:
! type myfirstfile.txt
message in a bottle

That's about it. Use of system commands, of course, depends on the system. The examples above use the Windows commands del and type; on CoCalc or a Macintosh the shell commands are a little different (rm -f and cat). If you have the Linux subsystem installed on Windows, the Linux forms should work as is.
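
Because shell commands differ between systems, the same checks can also be done from Python itself with the standard os module, which behaves the same way on Windows, Mac, and Linux. The sketch below works in a scratch directory so it does not disturb the files used in this laboratory; the variable names folder, listing, and contents are hypothetical.

```python
import os
import tempfile

folder = tempfile.mkdtemp()                        # scratch directory
path = os.path.join(folder, 'myfirstfile.txt')

# the with statement closes the file automatically, even if an error occurs
with open(path, 'w') as externalfile:
    externalfile.write('message in a bottle')

listing = os.listdir(folder)                       # plays the role of dir or ls
print(listing)
print(os.path.exists(path))                        # does the file exist?

with open(path, 'r') as externalfile:
    contents = externalfile.read()                 # plays the role of type or cat
print(contents)

# clean up the scratch file and directory
os.remove(path)
os.rmdir(folder)
```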


Example: Read from an existing file.

We will continue using the file we just made and read from it; the example is below.

In [34]:
# read file example
externalfile = open("myfirstfile.txt",'r') # create connection to file, set to read (r), file must exist
silly_string = externalfile.read() # read the contents
externalfile.close() # close the file connection
print(silly_string)
message in a bottle

Example: Update a file.

This example continues with our same file, but we will now add contents without destroying existing contents. The key is to open the file in append mode ('a').

In [35]:
externalfile = open("myfirstfile.txt",'a') # create connection to file, set to append (a), file does not need to exist
externalfile.write('\n') # adds a newline character
what_to_add = 'I love rock-and-roll, put another dime in the jukebox baby ... \n' 
externalfile.write(what_to_add) # add a string including the linefeed
what_to_add = '... the waiting is the hardest part \n' 
externalfile.write(what_to_add) # add a string including the linefeed
mylist = [1,2,3,4,5] # a list of numbers
what_to_add = ','.join(map(repr, mylist)) + "\n" # one way to write the list
externalfile.write(what_to_add)
what_to_add = ','.join(map(repr, mylist[0:len(mylist)])) + "\n" # another way to write the list
externalfile.write(what_to_add)
externalfile.close()

As before we can examine the contents using a shell command sent from the notebook.

In [36]:
! type myfirstfile.txt
message in a bottle
I love rock-and-roll, put another dime in the jukebox baby ... 
... the waiting is the hardest part 
1,2,3,4,5
1,2,3,4,5
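
The difference between append mode and write mode can be checked directly. In this minimal sketch (the file name modes.txt is hypothetical, and a scratch directory is used), appending keeps the earlier contents while reopening with 'w' discards them.

```python
import os
import tempfile

folder = tempfile.mkdtemp()                  # scratch directory
path = os.path.join(folder, 'modes.txt')     # hypothetical file name

with open(path, 'w') as f:                   # create the file
    f.write('first\n')

with open(path, 'a') as f:                   # append: existing contents are kept
    f.write('second\n')
with open(path, 'r') as f:
    after_append = f.read()

with open(path, 'w') as f:                   # write: existing contents are discarded
    f.write('third\n')
with open(path, 'r') as f:
    after_write = f.read()

print(repr(after_append))
print(repr(after_write))

# clean up the scratch file and directory
os.remove(path)
os.rmdir(folder)
```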

A little discussion on the part where we wrote numbers

what_to_add = ','.join(map(repr, mylist[0:len(mylist)])) + "\n"

Here are descriptions of the two functions map and repr

map(function, iterable, ...) Return an iterator that applies function to every item of iterable, yielding the results. If additional iterable arguments are passed, function must take that many arguments and is applied to the items from all iterables in parallel. With multiple iterables, the iterator stops when the shortest iterable is exhausted. In Python 3 (the version used in this notebook) the result is a lazy iterator rather than a list; the join method consumes it directly.

repr(object) Return a string containing a printable representation of an object. For many types, this function makes an attempt to return a string that would yield an object with the same value when passed to eval(); otherwise the representation is a string enclosed in angle brackets that contains the name of the type of the object together with additional information often including the name and address of the object. A class can control what this function returns for its instances by defining a __repr__() method.
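
A short sketch of how the two functions behave under Python 3, the version reported by the preamble cell of this notebook. The variable names mapped and as_strings are hypothetical.

```python
mylist = [1, 2, 3, 4, 5]

mapped = map(repr, mylist)       # a lazy map object; nothing is converted yet
print(type(mapped).__name__)

as_strings = list(mapped)        # consuming the iterator produces the converted items
print(as_strings)

# repr() and str() agree for integers but differ for strings:
# repr() keeps the quotes so the result could be typed back into Python
print(repr(3), str(3))
print(repr('3'), str('3'))
```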

What they do in this script is important. The statement:

what_to_add = ','.join(map(repr, mylist[0:len(mylist)])) + "\n"

builds a string from the elements of mylist[0:len(mylist)] (a slice that covers the whole list, so it is equivalent to mylist). The repr() function converts each element to its string representation, the join method inserts a comma delimiter between the converted elements, and because everything is now a string, the

... + "\n"

puts a linefeed character at the end of the string so the output will start a new line the next time something is written.
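
As a quick check of the statement above, the sketch below builds the same line and then reverses the process, recovering the numbers from the text. The names line and recovered are hypothetical.

```python
mylist = [1, 2, 3, 4, 5]

# build the comma-delimited line exactly as in the append example
line = ','.join(map(repr, mylist)) + "\n"
print(repr(line))                 # repr() makes the trailing linefeed visible

# reverse the process: drop the linefeed, split on commas, convert back to integers
recovered = [int(item) for item in line.strip().split(',')]
print(recovered == mylist)
```

The conversion step matters because join only accepts strings; passing the list of integers to join directly raises a TypeError.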

Example: Delete a file

Delete can be done by a system call as we did above to clear the local directory

In a JupyterLab notebook, we can either use

import sys
# delete the file if it exists (use "! rm -f myfirstfile.txt" on Mac)
! del myfirstfile.txt

or

import os
os.remove("myfirstfile.txt")

Both have the same effect, and both are equally dangerous to your filesystem.

Learn more about CRUD with text files at https://www.guru99.com/reading-and-writing-files-in-python.html

Learn more about file delete at https://www.dummies.com/programming/python/how-to-delete-a-file-in-python/

In [37]:
import os
file2kill = "myfirstfile.txt"
try:
    os.remove(file2kill) # file must exist or will generate an exception
except FileNotFoundError:
    pass # the file was already missing, so there is nothing to delete
print(file2kill, " missing or deleted !")
myfirstfile.txt  missing or deleted !
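
An alternative to catching the exception is to test for the file before removing it. The helper name delete_if_present is hypothetical, and the sketch works in a scratch directory so it does not touch the laboratory files.

```python
import os
import tempfile


def delete_if_present(path):
    """Remove path if it is an existing file; report whether anything was deleted."""
    if os.path.isfile(path):
        os.remove(path)
        return True
    return False


folder = tempfile.mkdtemp()                          # scratch directory
target = os.path.join(folder, 'myfirstfile.txt')
with open(target, 'w') as f:
    f.write('message in a bottle')

first_try = delete_if_present(target)    # the file exists, so it is removed
second_try = delete_if_present(target)   # the file is already gone
print(first_try, second_try)

os.rmdir(folder)                         # clean up the scratch directory
```

The try/except form is usually preferred when another program might remove the file between the check and the deletion; the explicit check is shown here because it makes the two outcomes easy to see.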





Exercise: Your Favorite Quotation!

  • create a text file, name it "MyFavoriteQuotation".
  • Write your favorite quotation in the file.
  • Read the file.
  • Add this string to it in a new line : "And that's something I wish I had said..."
  • Show the final outcome.
In [40]:
# create the "My Favorite Quotation" file:
externalfile = open("MyFavoriteQuotation.txt",'w')         # create connection to file, set to write (w)
myquotation = 'The path of the righteous man is beset on all sides by the inequities of the selfish and the tyranny of evil men. Blessed is he who, in the name of charity and good will, shepherds the weak through the valley of darkness. For he is truly his brother’s keeper and the finder of lost children. And I will strike down upon thee with great vengeance and furious anger those who attempt to poison and destroy my brothers. And you will know my name is the Lord when I lay my vengeance upon you.' #My choice: quotation from Pulp Fiction
externalfile.write(myquotation)# write the contents of myquotation to the file
externalfile.close() # close the file connection
#Let's read the file
#! type MyFavoriteQuotation.txt 
# Let's add the string
externalfile = open("MyFavoriteQuotation.txt",'a')  #create connection to file, set to append (a)
externalfile.write('\n') # adds a newline character
what_to_add = "And that's something I wish I had said ... \n"
externalfile.write(what_to_add)
externalfile.close()
#Let's read the file one last time
! type MyFavoriteQuotation.txt 
The path of the righteous man is beset on all sides by the inequities of the selfish and the tyranny of evil men. Blessed is he who, in the name of charity and good will, shepherds the weak through the valley of darkness. For he is truly his brother’s keeper and the finder of lost children. And I will strike down upon thee with great vengeance and furious anger those who attempt to poison and destroy my brothers. And you will know my name is the Lord when I lay my vengeance upon you.
And that's something I wish I had said ... 
