ENGR 1330 Computational Thinking with Data Science

Last GitHub Commit Date: 31 January 2021

Lesson 6 Classes, Objects, and File Handling:


Special Script Blocks


Objectives

Classes and Objects

In object-oriented programming, a class is an extensible program-code-template for creating objects, providing initial values for state (member variables) and implementations of behavior (member functions or methods). In many languages, the class name is used as the name for the class (the template itself), the name for the default constructor of the class (a subroutine that creates objects), and as the type of objects generated by instantiating the class; these distinct concepts are easily conflated.

When an object is created by a constructor of the class, the resulting object is called an instance of the class, and the member variables specific to the object are called instance variables, to contrast with the class variables shared across the class.
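The contrast between class variables and instance variables can be sketched as follows. The class name `Sensor` and its attributes are hypothetical, chosen only for illustration.

```python
class Sensor:
    """A hypothetical sensor used to contrast class and instance variables."""

    units = "degrees C"  # class variable: shared by every instance

    def __init__(self, name, reading):
        # instance variables: each object stores its own values
        self.name = name
        self.reading = reading


# Calling the class name runs the constructor and returns an instance.
a = Sensor("roof", 21.5)
b = Sensor("lab", 19.0)

print(a.name, a.reading, a.units)
print(b.name, b.reading, b.units)
print(type(a) is Sensor)  # the class name is also the type of its instances
```

Here one name, `Sensor`, serves as the template, the constructor, and the type, which is the conflation the text warns about.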

Classes provide a means of bundling data and functionality together. Creating a new class creates a new type of object, allowing new instances of that type to be made. Each class instance can have attributes attached to it for maintaining its state. Class instances can also have methods (defined by its class) for modifying its state.

Class definitions, like function definitions (def statements), must be executed before they have any effect. (You could conceivably place a class definition in a branch of an if statement, or inside a function.)

In practice, the statements inside a class definition will usually be function definitions, but other statements are allowed, and sometimes useful — we’ll come back to this later. The function definitions inside a class normally have a peculiar form of argument list, dictated by the calling conventions for methods — again, this is explained later.

When a class definition is entered, a new namespace is created, and used as the local scope — thus, all assignments to local variables go into this new namespace. In particular, function definitions bind the name of the new function here.

When a class definition is left normally (via the end), a class object is created. This is basically a wrapper around the contents of the namespace created by the class definition; we’ll learn more about class objects in the next section. The original local scope (the one in effect just before the class definition was entered) is reinstated, and the class object is bound here to the class name given in the class definition header (ClassName in the example).
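A minimal sketch of what the class namespace and the resulting class object look like. The attribute `greeting` and the method `method` are illustrative assumptions; `ClassName` is the placeholder name used in the text.

```python
class ClassName:
    """Placeholder class used to inspect the namespace a class body creates."""

    greeting = "hello"        # an assignment inside the class body

    def method(self):         # a function definition inside the class body
        return self.greeting


# Once the class body has finished, the name ClassName is bound to a class object.
print(type(ClassName))

# The names assigned inside the class body live in the class namespace.
print("greeting" in vars(ClassName))
print("method" in vars(ClassName))

# The method is reached through an instance of the class.
print(ClassName().method())
```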

What is an object?

An object is simply a collection of data (variables) and methods (functions) that act on those data. Similarly, a class is a blueprint for that object.

We can think of a class as a sketch (prototype) of a house. It contains all the details about the floors, doors, windows, and so on. Based on these descriptions we build the house. The house is the object.

Just as many houses can be made from one blueprint, we can create many objects from a class. An object is also called an instance of a class, and the process of creating this object is called instantiation.
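The blueprint analogy can be sketched in code. The class `House` and its attributes (floors, doors, windows) are assumptions taken from the analogy, not from any lesson code.

```python
class House:
    """Blueprint for a house; the attributes are illustrative assumptions."""

    def __init__(self, floors, doors, windows):
        self.floors = floors
        self.doors = doors
        self.windows = windows

    def describe(self):
        """Return a one-line summary of this house."""
        return f"{self.floors} floor(s), {self.doors} door(s), {self.windows} window(s)"


# Instantiation: each call builds a separate object from the same blueprint.
cottage = House(floors=1, doors=2, windows=4)
townhouse = House(floors=3, doors=3, windows=10)

print(cottage.describe())
print(townhouse.describe())
print(isinstance(cottage, House), cottage is townhouse)
```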

Learn more at

  1. https://docs.python.org/3/tutorial/classes.html
  2. https://en.wikipedia.org/wiki/Class_(computer_programming)

An Example:

Write a class named 'Tax' to calculate the state tax (in dollars) of Employees at Texas Tech University based on their annual salary.
The state tax is 16% if the annual salary is below 80,000 dollars and 22% if the salary is more than 80,000 dollars.

Employee    Annual salary (dollars)
Bob         150,000
Mary        78,000
John        55,000
Danny       175,000

Notes:

  1. Use docstrings to describe the purpose of the class.
  2. Use if...else conditional statements within the method of the class to choose the relevant tax % based on the annual salary.
  3. Create an object for each employee and display the output as shown below.

Bob's tax amount (in dollars): AMOUNT
Mary's tax amount (in dollars): AMOUNT
John's tax amount (in dollars): AMOUNT
Danny's tax amount (in dollars): AMOUNT
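One possible solution is sketched below; it is not necessarily the intended one. It reads the salaries for Bob and Danny as 150,000 and 175,000 dollars. The exercise does not say which rate applies at exactly 80,000 dollars, so applying 22% there is an assumption. The method name `amount` and the class constants are also hypothetical.

```python
class Tax:
    """Compute the state tax (in dollars) owed on an annual salary.

    Rates follow the exercise statement: 16% below 80,000 dollars, 22% above.
    Treating a salary of exactly 80,000 dollars at 22% is an assumption.
    """

    THRESHOLD = 80_000  # dollars
    LOW_RATE = 16       # percent
    HIGH_RATE = 22      # percent

    def __init__(self, name, salary):
        self.name = name
        self.salary = salary

    def amount(self):
        """Return the tax owed, in dollars."""
        if self.salary < self.THRESHOLD:
            rate = self.LOW_RATE
        else:
            rate = self.HIGH_RATE
        return self.salary * rate / 100


salaries = {"Bob": 150_000, "Mary": 78_000, "John": 55_000, "Danny": 175_000}

for name, salary in salaries.items():
    employee = Tax(name, salary)  # one object per employee
    print(f"{employee.name}'s tax amount (in dollars): {employee.amount():,.2f}")
```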

Numbers, strings, lists, and dictionaries are all objects; each one is an instance of a built-in class such as int, float, str, list, or dict.

To get more information about the built-in classes and objects, use the dir() and help() functions.
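A short sketch of inspecting a built-in object with type(), dir(), and help(). The variable names are hypothetical.

```python
numbers = [3, 1, 2]

print(type(numbers))  # the class this object is an instance of

# dir() lists attribute names; filter out the ones that start with an underscore.
public = [name for name in dir(numbers) if not name.startswith("_")]
print(public)

help(list.append)  # prints the documentation for a single method
```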

User-defined classes: Defining docstrings
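A minimal sketch of a user-defined class with docstrings, assuming a hypothetical `Circle` class. A docstring is the string literal placed first in a class or function body, and help() displays it.

```python
import math


class Circle:
    """A circle described by its radius (hypothetical example class)."""

    def __init__(self, radius):
        self.radius = radius

    def area(self):
        """Return the area of the circle."""
        return math.pi * self.radius ** 2


print(Circle.__doc__)       # the class docstring
print(Circle.area.__doc__)  # the method docstring
print(Circle(2.0).area())
```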

Files and Filesystems

Background

A computer file is a computer resource for recording data discretely (as a distinct unit somewhere on a piece of hardware, not in the secretive sense) in a computer storage device. Just as words can be written to paper, so can information be written to a computer file. Files can be edited on the computer system that stores them and transferred to other systems through the internet.

There are different types of computer files, designed for different purposes. A file may be designed to store a picture, a written message, a video, a computer program, or a wide variety of other kinds of data. Some types of files can store several types of information at once.

By using computer programs, a person can open, read, change, save, and close a computer file. Computer files may be reopened, modified, and copied an arbitrary number of times.

Typically, files are organised in a file system, which keeps track of where the files are located on disk and enables user access.

File system

In computing, a file system (or filesystem) controls how data is stored and retrieved. Without a file system, data placed in a storage medium would be one large body of data with no way to tell where one piece of data stops and the next begins. By separating the data into pieces and giving each piece a name, the data is isolated and identified. Taking its name from paper-based data management systems, each group of data is called a “file”. The structure and logic rules used to manage the groups of data and their names are called a “file system”.

Path

A path, the general form of the name of a file or directory, specifies a unique location in a file system. A path points to a file system location by following the directory tree hierarchy expressed in a string of characters in which path components, separated by a delimiting character, represent each directory. The delimiting character is most commonly the slash (”/”), the backslash character (”\”), or colon (”:”), though some operating systems may use a different delimiter. Paths are used extensively in computer science to represent the directory/file relationships common in modern operating systems, and are essential in the construction of Uniform Resource Locators (URLs). Resources can be represented by either absolute or relative paths. As an example consider the following two files:

  1. /Users/theodore/MyGit/@atomickitty/hurri-sensors/.git/Guest.conf
  2. /etc/apache2/users/Guest.conf

They both have the same file name, but are located on different paths. Failure to provide the path when addressing the file can be a problem. Another way to interpret this is that the two files actually have different names, and only part of those names (Guest.conf) is common. The two names above (including the path) are called fully qualified filenames (or absolute names). A relative path is usually relative to the file or program of interest, and depends on where in the directory structure the file lives. If we are currently in the .git directory (the location of the first file), the path to the file is just the filename.

We have experienced path issues with dependencies on .png files. In general, your JupyterLab notebooks on CoCalc can only look at the local directory, which is why we have to copy files into the directory for things to work.
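The two example paths can be compared with the standard pathlib module. This is a sketch that only manipulates the path strings; it does not touch the filesystem, and the variable names are hypothetical.

```python
from pathlib import PurePosixPath

first = PurePosixPath("/Users/theodore/MyGit/@atomickitty/hurri-sensors/.git/Guest.conf")
second = PurePosixPath("/etc/apache2/users/Guest.conf")

print(first.name == second.name)  # the file names match
print(first == second)            # the fully qualified names do not
print(first.is_absolute(), second.is_absolute())

# Relative to the .git directory, the first file is just its file name.
current_directory = first.parent
print(first.relative_to(current_directory))
```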

File Types

  1. Text Files. Text files are regular files that contain information readable by the user. This information is stored in ASCII. You can display and print these files. The lines of a text file must not contain NUL characters, and none can exceed a prescribed (by architecture) length, including the new-line character. The term text file does not prevent the inclusion of control or other nonprintable characters (other than NUL). Therefore, standard utilities that list text files as inputs or outputs are either able to process the special characters gracefully or they explicitly describe their limitations within their individual sections.

  2. Binary Files. Binary files are regular files that contain information readable by the computer. Binary files may be executable files that instruct the system to accomplish a job. Commands and programs are stored in executable, binary files. Special compiling programs translate ASCII text into binary code. The only difference between text and binary files is that text files have lines of less than some length, with no NUL characters, each terminated by a new-line character.

  3. Directory Files. Directory files contain information the system needs to access all types of files, but they do not contain the actual file data. As a result, directories occupy less space than a regular file and give the file system structure flexibility and depth. Each directory entry represents either a file or a subdirectory. Each entry contains the name of the file and the file's index node reference number (i-node). The i-node points to the unique index node assigned to the file. The i-node describes the location of the data associated with the file. Directories are created and controlled by a separate set of commands.
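The three file types above can be explored from Python. The sketch below works in a temporary directory so that nothing is left behind; the file name `sample.txt` is hypothetical. It reads the same file as text and as raw bytes, then looks at the directory entry and the i-node number reported by os.stat (the value of st_ino is platform dependent).

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "sample.txt")

    # newline="\n" keeps the line ending identical on every platform.
    with open(path, "w", encoding="ascii", newline="\n") as handle:
        handle.write("hello\n")

    with open(path, "r", encoding="ascii") as handle:
        as_text = handle.read()   # text mode returns str

    with open(path, "rb") as handle:
        as_bytes = handle.read()  # binary mode returns bytes

    entries = os.listdir(folder)  # names recorded in the directory
    inode = os.stat(path).st_ino  # index node number of the file

print(repr(as_text), repr(as_bytes))
print(entries, inode)
```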

File Manipulation

For this lesson we examine just a handful of file manipulations which are quite useful. Files can be "created", "read", "updated", or "deleted" (CRUD).

Example: Create a file, write to it.

Below is an example of creating a file that does not yet exist. The script is a bit pedantic on purpose.

First we will use some system commands to view the contents of the local directory.

At this point our new file should exist; let's list the directory and see if that is so.

Sure enough, it's there. We will use the shell command cat to look at the contents of the file.
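The steps above can be sketched in plain Python, using os.listdir in place of the shell listing and a read in place of cat. The file name myfirstfile.txt is the one used later in this lesson; the text written to the file and the values in `mylist` are hypothetical.

```python
import os

filename = "myfirstfile.txt"
mylist = [1, 2, 3, 4, 5]  # hypothetical values to store

print(sorted(os.listdir(".")))  # directory contents before the file is created

# Mode "w" creates the file, or truncates it if it already exists.
with open(filename, "w") as outfile:
    outfile.write("my first file\n")
    what_to_add = ",".join(map(repr, mylist)) + "\n"
    outfile.write(what_to_add)

print(filename in os.listdir("."))  # the new file is now in the directory

# Reading the file back does the same job as `cat myfirstfile.txt`.
with open(filename, "r") as infile:
    contents = infile.read()
print(contents, end="")
```

The with statement closes the file automatically, even if a write fails.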

Example: Read from an existing file.

We will continue using the file we just made and read from it. The example is below.
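A minimal sketch of reading a file line by line. So that it runs on its own, it first creates myfirstfile.txt with hypothetical contents if the file is missing.

```python
import os

filename = "myfirstfile.txt"

# Setup so the sketch runs on its own: create the file only if it is missing.
if not os.path.exists(filename):
    with open(filename, "w") as outfile:
        outfile.write("first line\nsecond line\n")  # hypothetical contents

# Iterating over the file object yields one line at a time.
with open(filename, "r") as infile:
    lines = [line.rstrip("\n") for line in infile]

for number, line in enumerate(lines, start=1):
    print(number, line)
```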

Example: Update a file.

This example continues with our same file, but we will now add contents without destroying existing contents. The keyword is append.

As before, we can examine the contents using a shell command sent from the notebook.
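A sketch of appending, assuming the same file name. Opening with mode "a" adds new text at the end and leaves the existing contents intact. The lines written here are hypothetical.

```python
import os

filename = "myfirstfile.txt"

# Setup so the sketch runs on its own: create the file only if it is missing.
if not os.path.exists(filename):
    with open(filename, "w") as outfile:
        outfile.write("original line\n")  # hypothetical contents

with open(filename, "r") as infile:
    before = infile.read()

# Mode "a" places every write at the end of the file.
with open(filename, "a") as outfile:
    outfile.write("appended line\n")

with open(filename, "r") as infile:
    after = infile.read()

print(after, end="")
print(after.startswith(before))  # the earlier contents are still there
```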

Example: Delete a file

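Delete can be done by a system call, as we did above to clear the local directory.

In a JupyterLab notebook, we can either use

! rm -rf myfirstfile.txt # delete file if it exists

or

import os
os.remove("myfirstfile.txt") # raises FileNotFoundError if the file does not exist

Both delete the file, and both are equally dangerous to your filesystem. They differ only when the file is missing: rm -f stays silent, while os.remove raises FileNotFoundError.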
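A slightly more cautious delete can be sketched as below. The helper `delete_if_present` and the scratch file name are hypothetical; the check limits the deletion to a regular file that actually exists.

```python
import os

filename = "scratch_to_delete.txt"  # hypothetical scratch file

# Setup: create something to delete.
with open(filename, "w") as outfile:
    outfile.write("scratch\n")


def delete_if_present(path):
    """Remove a regular file and report whether anything was deleted."""
    if os.path.isfile(path):
        os.remove(path)
        return True
    return False


first_attempt = delete_if_present(filename)   # the file exists, so it is removed
second_attempt = delete_if_present(filename)  # nothing left to remove

print(first_attempt, second_attempt)
```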

Learn more about CRUD with text files at https://www.guru99.com/reading-and-writing-files-in-python.html

Learn more about file delete at https://www.dummies.com/programming/python/how-to-delete-a-file-in-python/

A little discussion on the part where we wrote numbers

what_to_add = ','.join(map(repr, mylist[0:len(mylist)])) + "\n"

Here are descriptions of the two functions map and repr

map(function, iterable, ...) Return an iterator that applies function to every item of iterable, yielding the results. If additional iterable arguments are passed, function must take that many arguments and is applied to the items from all iterables in parallel. With multiple iterables, the iterator stops when the shortest iterable is exhausted. The iterable arguments may be a sequence or any iterable object; wrap the result in list() when a list is needed.

repr(object) Return a string containing a printable representation of an object. For many types, this function makes an attempt to return a string that would yield an object with the same value when passed to eval(); otherwise, the representation is a string enclosed in angle brackets that contains the name of the type of the object together with additional information, often including the name and address of the object. A class can control what this function returns for its instances by defining a __repr__() method.
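A short sketch of how the two functions behave under Python 3, the version covered by the tutorial linked earlier. The sample values are hypothetical.

```python
values = [1, 2.5, "three"]

mapped = map(repr, values)  # in Python 3 this is a lazy iterator, not a list
as_list = list(mapped)
print(as_list)

# With several iterables, map stops at the shortest one.
sums = list(map(lambda x, y: x + y, [1, 2, 3], [10, 20]))
print(sums)

# repr keeps the quotes that str drops, so eval can rebuild the value.
print(repr("three"), str("three"))
print(eval(repr(2.5)) == 2.5)
```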

What they do in this script is important. The statement:

what_to_add = ','.join(map(repr, mylist[0:len(mylist)])) + "\n"

is building a string composed of the elements of mylist[0:len(mylist)], which is simply every element of mylist. The repr() function converts each element to its printable string representation, and the join method places the delimiter, a comma, between those strings. Because everything is now a string, the

... + "\n"

puts a linefeed character at the end of the string so the output will start a new line the next time something is written.
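The statement can be tried on its own, assuming some hypothetical contents for `mylist`:

```python
mylist = [10, 20.5, "thirty"]  # hypothetical list contents

what_to_add = ",".join(map(repr, mylist[0:len(mylist)])) + "\n"

# The slice covers the whole list, so passing mylist directly gives the same string.
same_thing = ",".join(map(repr, mylist)) + "\n"

print(repr(what_to_add))  # repr makes the trailing linefeed visible
print(what_to_add == same_thing)
```

Because repr is applied to each element, the string element keeps its quotes in the output while the numbers appear as written.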

Example
