
ENGR 1330 Computational Thinking with Data Science

Copyright © 2021 Theodore G. Cleveland and Farhang Forghanparast

Last GitHub Commit Date: 13 July 2021

Lesson 6 Functions:



Objectives

  1. Awareness of intrinsic functions available in core Python
  2. Import and implement external functions in libraries (external modules)
  3. Create and implement user-written functions
  4. Awareness of variable scope
  5. Save user-written functions to a file for import later

What are Functions?

Functions are simply pre-written code fragments that perform a certain task. In older procedural languages functions and subroutines are similar, but a function returns a value whereas a subroutine operates on data. The difference is subtle but important.

More recent thinking has functions being able to operate on data (they always could), and the value returned may simply be an exit code. An analogy is the set of functions in MS Excel. To add numbers, we can use the sum(range) function and type =sum(A1:A5) instead of typing =A1+A2+A3+A4+A5.

To restate: a function is a block of code which only runs when it is called; it is pre-defined (prototyped) using parameters (placeholder variables).

The GIF below animates these ideas:

Things you do using functions:

Calling the Function

We call a function simply by typing the name of the function or by using a dot notation. Whether we can use the dot notation or not depends on how the function is written, whether it is part of a class, and how it is imported into a program.

Some functions expect us to pass data to them to perform their tasks. These data are known as arguments, or collectively the argument list (the matching placeholder names in the function definition are the parameters), and we pass them to the function by enclosing their values in parentheses ( ), separated by commas.

For instance, the print() function for displaying text on the screen is "called" by typing print('Hello World') where print is the name of the function and the literal (a string) 'Hello World' is the argument.
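The calling conventions described above can be sketched in a few lines. The specific calls to max() and upper() are illustrative choices, not taken from the original lesson.

```python
# Call a built-in function by name, passing one argument in parentheses.
print('Hello World')

# Several arguments are separated by commas.
largest = max(3, 7, 5)

# Dot notation: upper() belongs to the string object, so it is called through it.
shout = 'hello'.upper()

print(largest, shout)
```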

Program flow

A function, whether built-in or added, must be defined before it is called; otherwise the script will fail. Certain built-in functions "self define" at start-up (such as print() and type()), and we need not worry about those functions. The diagram below illustrates the requisite flow control for functions that need to be defined before use.

An example below will illustrate. Change the cell to code and run it; you should get an error. Then fix the indicated line (remove the leading "#" in the import math ... line) and rerun; you should get a functioning script.
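A minimal sketch of such a script follows. The original cell is not reproduced here, so the call to math.sqrt(16.0) is an assumption, and the try/except wrapper is an addition that reports the failure instead of halting the run.

```python
# import math   # remove the leading "#" so that math.sqrt is defined before use

try:
    result = math.sqrt(16.0)      # fails while the import above is commented out
except NameError as err:
    result = None
    print("NameError:", err)      # the name math was never defined
else:
    print(result)
```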

An alternate way to load just the sqrt() function is shown below, either way is fine.
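A short sketch of that alternative, assuming the same square-root call as an illustration:

```python
from math import sqrt   # load only sqrt() from the math module

root = sqrt(16.0)       # no "math." prefix is needed with this form of import
print(root)             # 4.0
```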

Built-In in Primitive Python (Base install)

The base Python installation has functions and types built into it that are always available; the figure below lists those functions.

Notice that all have the structure function_name(), except __import__(), whose double-underscore name marks it as a special function that is not intended for routine use (among other things, it allows assigning multiple names to an imported module, which usually breaks things badly). Instead, we should stick with the plain old import module_name statement rather than a function call.

Added-In using External Packages/Modules and Libraries (e.g. math)

Python is also distributed with a large number of external functions. These functions are saved in files known as modules. To use the built-in codes in Python modules, we have to import them into our programs first. We do that by using the import keyword. There are three ways to import:

  1. Import the entire module by writing import moduleName. For instance, to import the random module, we write import random. To use the randrange() function in the random module, we write random.randrange(1, 10);
  2. Import and rename the module by writing import random as r (where r is any name of your choice). Now to use the randrange() function, you simply write r.randrange(1, 10); and
  3. Import specific functions from the module by writing from moduleName import name1[,name2[, ... nameN]]. For instance, to import the randrange() function from the random module, we write from random import randrange. To import multiple functions, we separate them with a comma. To import the randrange() and randint() functions, we write from random import randrange, randint. To use the function now, we do not have to use the dot notation anymore. Just write randrange(1, 10).
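The three import styles listed above can be compared side by side. The variable names a, b, c and d are only for illustration.

```python
import random                           # 1. import the entire module
import random as r                      # 2. import and rename the module
from random import randrange, randint   # 3. import specific functions

a = random.randrange(1, 10)   # integer from 1 to 9; the stop value 10 is excluded
b = r.randrange(1, 10)        # same function, reached through the shorter name
c = randrange(1, 10)          # no dot notation needed
d = randint(1, 10)            # integer from 1 to 10; randint includes both ends
print(a, b, c, d)
```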

The modules that come with Python are extensive and listed at https://docs.python.org/3/py-modindex.html. There are also other modules that can be downloaded and used (just like user defined modules below). In these labs we are building primitive codes to learn how to code and how to create algorithms. For many practical cases you will want to load a well-tested package to accomplish the tasks.

That exercise is saved for the end of the document.

User-Built

We can define our own functions in Python and reuse them throughout the program. The syntax for defining a function is:

def functionName( argument ):
    code detailing what the function should do
    note the colon above and indentation
    ...
    ...
    return [expression]

The keyword def tells the program that the indented code from the next line onwards is part of the function. The keyword return tells the program to return an answer from the function. There can be multiple return statements in a function. Once the function executes a return statement, the program exits the function and continues with its next executable statement. If the function does not need to return any value, you can omit the return statement; the function then returns None, a special constant whose logical value is False.

It's probably a good habit to have a return statement anyway, and just return None.

Functions can be pretty elaborate; they can search for things in a list, determine variable types, open and close files, read and write to files.

To get started we will build a few really simple mathematical functions; we will need this skill in the future anyway, especially in scientific programming contexts.

User-built within a Code Block

For our first function we will code $$f(x) = x\sqrt{1 + x}$$ into a function named dusty().

When you run the next cell, all it does is prototype the function (define it); nothing happens until we use the function.
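The original cell is not reproduced here, so the following is a reconstruction under stated assumptions: the local names x and temp and the program variables xvalue and yvalue follow the names used in the Variable Scope discussion, and the test value 3.0 is arbitrary.

```python
import math

def dusty(x):
    """Evaluate f(x) = x * sqrt(1 + x); requires x >= -1."""
    temp = x * math.sqrt(1.0 + x)
    return temp

# Defining dusty() produces no output; the function runs only when it is called.
xvalue = 3.0
yvalue = dusty(xvalue)
print(xvalue, yvalue)   # 3.0 6.0
```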

Example

Sum of numbers from 0 to N
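One possible version of this example is sketched below; the function name sum_to_n is hypothetical.

```python
def sum_to_n(n):
    """Return 0 + 1 + 2 + ... + n for a non-negative integer n."""
    total = 0
    for i in range(n + 1):   # range(n + 1) yields 0, 1, ..., n
        total = total + i
    return total

print(sum_to_n(10))   # 55
```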

Example

Sum of numbers from 0 to N, but forget the return statement
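A sketch of what happens when the return statement is left out; the function name sum_no_return is hypothetical. The loop still runs, but the caller receives None instead of the total.

```python
def sum_no_return(n):
    total = 0
    for i in range(n + 1):
        total = total + i
    # the return statement is missing, so the computed total is discarded

answer = sum_no_return(10)
print(answer)   # None
```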

Example

Sum of numbers from 0 to N, and including defaults.
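A default value lets the caller omit an argument. In this sketch the function name sum_with_default and the default of 10 are assumptions chosen for illustration.

```python
def sum_with_default(n=10):
    """Return 0 + 1 + ... + n; n defaults to 10 when no argument is supplied."""
    total = 0
    for i in range(n + 1):
        total = total + i
    return total

print(sum_with_default())      # uses the default n=10, prints 55
print(sum_with_default(100))   # an explicit argument overrides the default, prints 5050
```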

Example

Create a Fahrenheit to Celsius converter and test it for these values:

Hint: the formula is (°F − 32) × 5/9 = °C
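A minimal sketch of such a converter follows. The list of test values from the original lesson is not available, so the three temperatures below (freezing point, boiling point, and the crossover at -40) are stand-ins chosen for illustration, and the function name is hypothetical.

```python
def fahrenheit_to_celsius(deg_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (deg_f - 32.0) * 5.0 / 9.0

# Illustrative test values only; substitute the values assigned in class.
for deg_f in [32.0, 212.0, -40.0]:
    print(deg_f, "F =", fahrenheit_to_celsius(deg_f), "C")
```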

Problem Solving Process

Step 1

Define the problem (problem statement):

Step 2

Gather information (identify known and unknown values, and governing equations)

Step 3

Generate and evaluate potential solutions

Step 4

Refine and implement a solution

  1. Create function to evaluate input and produce output
  2. Create wrapper to prompt for input, execute function, label output

Step 5

Verify and test the solution.

Step 6

Refine to be useful

Variable Scope

An important concept when defining a function is the concept of variable scope. Variables defined inside a function are treated differently from variables defined outside. Any variable declared within a function is only accessible within the function. These are known as local variables.

In the dusty() function, the variables x and temp are local to the function. Any variable declared outside a function in a main program is known as a program variable and is accessible anywhere in the program.

In the example, the variables xvalue and yvalue are program variables (global to the program; if they are addressed within a function, they could be operated on). Generally we want to protect the program variables from the function unless the intent is to change their values. The way the function is written in the example, the function cannot damage xvalue or yvalue.

If a local variable shares the same name as a program variable, any code inside the function is accessing the local variable. Any code outside is accessing the program variable.
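The name-sharing rule can be seen in a short sketch. This version of dusty() deliberately overwrites its local x (an addition made only to demonstrate the point), and the program variable of the same name is unaffected.

```python
import math

def dusty(x):
    temp = x * math.sqrt(1.0 + x)   # x and temp are local to dusty()
    x = -999.0                      # rebinds only the local x
    return temp

x = 3.0              # program variable that shares its name with the local x
yvalue = dusty(x)
print(x, yvalue)     # 3.0 6.0; the program variable x is unchanged

try:
    print(temp)      # temp exists only inside dusty()
except NameError as err:
    print("NameError:", err)
```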

As Separate Module/File

In this section we will invent the neko() function and export it to a file, so we can reuse it in later notebooks without having to retype or cut-and-paste. The neko() function evaluates:

$$f(x) = x\sqrt{|(1 + x)|}$$

It's the same as the dusty() function, except that it operates on the absolute value inside the radical.

  1. Create a text file named "mylibrary.txt"
  2. Copy the neko() function script below into that file.

     def neko(input_argument) :
         import math #ok to import into a function
         local_variable = input_argument * math.sqrt(abs(1.0+input_argument))
         return local_variable
  3. Rename mylibrary.txt to mylibrary.py
  4. Modify the wrapper script to use the neko function as an external module
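The steps above can be sketched in one self-contained script. As an assumption made so the sketch runs anywhere, it writes mylibrary.py into a temporary folder and adds that folder to the import path; in the notebook workflow the file would simply sit next to the notebook, and only the import and the call would be needed.

```python
import importlib
import pathlib
import sys
import tempfile

# Source text of the module, matching the neko() function given above.
source = '''
def neko(input_argument):
    import math  # ok to import into a function
    local_variable = input_argument * math.sqrt(abs(1.0 + input_argument))
    return local_variable
'''

# Assumption: a temporary folder stands in for the notebook's working folder.
workdir = pathlib.Path(tempfile.mkdtemp())
(workdir / "mylibrary.py").write_text(source)
sys.path.insert(0, str(workdir))
importlib.invalidate_caches()

import mylibrary   # the wrapper script now uses neko() as an external module

print(mylibrary.neko(3.0))    # 6.0
print(mylibrary.neko(-5.0))   # -10.0, the absolute value keeps the root real
```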

In JupyterHub environments, you may discover that changes you make to your external Python file are not reflected when you re-run your script; you need to restart the kernel for the changes to take effect. The figure below depicts the relationship between the notebook and the external file.

Recursion

Functions can call themselves as part of their definition; this is called recursion. Use the privilege with care; it is easy to clobber the memory stack using recursive structures. Consider the factorial function definition from math class:

$$ 0! = 1 \\ 1! = 0! \times 1 \\ 2! = 1! \times 2 \\ \dots \\ N! = (N-1)! \times N $$
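The definition translates almost line for line into a recursive function. This is a minimal sketch: the name factorial and the guard against negative input are additions for illustration.

```python
def factorial(n):
    """Return n! for a non-negative integer n using N! = (N-1)! * N."""
    if n < 0:
        raise ValueError("factorial is not defined for negative integers")
    if n == 0:
        return 1                     # base case, 0! = 1, stops the recursion
    return factorial(n - 1) * n      # the function calls itself with a smaller value

print(factorial(5))   # 120
```

Each call waits on the call below it, so a very large n can exceed Python's recursion limit; a loop avoids that cost when depth is a concern.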

Rudimentary Graphics

Graphing values is part of the broader field of data visualization, which has two main goals:

  1. To explore data, and
  2. To communicate data.

In this subsection we will concentrate on introducing skills to start exploring data and to produce meaningful visualizations we can use throughout the rest of this notebook. Data visualization is a rich field of study that fills entire books. The reason to start visualization here instead of elsewhere is that with mathematical functions plotting is a natural activity and we have to import the matplotlib module to make the plots.

The example below is code adapted from Grus (2015) that illustrates simple generic plots. I added a single line (label the x-axis), and corrected some transcription errors (not the original author's mistake, just the consequence of how the API handled the cut-and-paste), but otherwise the code is unchanged.

Now let's put the plotting script into a function so we can make line charts of any two numeric lists.

Example

Use the plotting script and create a function that draws a straight line between two points.
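One possible approach is sketched below, assuming matplotlib is installed. The function name make_line_chart, its label parameters, the styling, and the two points are all hypothetical; this is not the code from Grus (2015).

```python
import matplotlib.pyplot as plt

def make_line_chart(xvalues, yvalues, xlabel="x", ylabel="y", title="Line chart"):
    """Plot yvalues against xvalues as a connected line and return the figure."""
    fig, ax = plt.subplots()
    ax.plot(xvalues, yvalues, color="green", marker="o", linestyle="solid")
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
    ax.set_title(title)
    return fig

# A straight line between two points, A = (1, 2) and B = (4, 8).
point_a = (1.0, 2.0)
point_b = (4.0, 8.0)
figure = make_line_chart([point_a[0], point_b[0]],
                         [point_a[1], point_b[1]],
                         title="Line from A to B")
# In a script, call plt.show() to open the plot window; a notebook displays the figure itself.
```

Because the function accepts any two numeric lists of equal length, the same call serves for longer data series as well as the two-point case.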

References

  1. Grus, Joel (2015-04-14). Data Science from Scratch: First Principles with Python (Kindle Locations 1190-1191). O'Reilly Media. Kindle Edition.

  2. Call Expressions in "Adhikari, A. and DeNero, J. Computational and Inferential Thinking The Foundations of Data Science" https://www.inferentialthinking.com/chapters/03/3/Calls.html

  3. Functions and Tables in "Adhikari, A. and DeNero, J. Computational and Inferential Thinking The Foundations of Data Science" https://www.inferentialthinking.com/chapters/08/Functions_and_Tables.html

  4. Visualization in "Adhikari, A. and DeNero, J. Computational and Inferential Thinking The Foundations of Data Science" https://www.inferentialthinking.com/chapters/07/Visualization.html

  5. Documentation; The Python Standard Library; 9. Numeric and Mathematical Modules https://docs.python.org/2/library/math.html

  6. Code.org; Chris Bosh of Miami Heat and Jess Lee CEO of Polyvore. Let's use code to join Anna and Elsa as they explore the magic and beauty of ice. https://youtu.be/0eo0ESEX9DE

  7. ENGR-1330 Functions and Variable Scope (Fall 2020) https://3.137.111.182/engr-1330-webroot/1-Lessons/Lesson05/OriginalPowerpoint/PythonFunctions-VariableScope-ENGR1330-CTwithDS.pdf