Categories
Uncategorized

12/17/20

20201217
0745-0833
Begin
Search “python create a button”
Watch https://www.youtube.com/watch?v=30oWH6yKuB4, “Python 101 – Making a Button” by HowCode [← Useless.]
Search “tkinter”
Read “Tkinter is a Python binding to the Tk GUI toolkit. It is the standard Python interface to the Tk GUI toolkit, and is Python’s de facto standard GUI. Tkinter is included with standard Linux, Microsoft Windows and Mac OS X installs of Python. The name Tkinter comes from Tk interface.”
Try using “from tkinter import *” within a function.
Fail “‘from tkinter import *’ only allowed at module level.
Try code as written.
Fail “name ‘exit’ is not defined”.
Watch https://www.youtube.com/watch?v=yuuDJ3-EdNQ, “Creating Buttons With TKinter – Python Tkinter GUI Tutorial #3” by Codemy.com
Try code as written.
Success.

from tkinter import *

root = Tk()

def myClick():
    myLabel = Label(root, text="Look! I clicked a Button!")
    myLabel.pack()

myButton = Button(root, text="Click Me!", padx=50, pady=50,
                command=myClick, fg="white", bg="#000000")
myButton.pack()

root.mainloop()

Keywords: button, button color, RGB, button size, button action.

0833-0937
Read some of:
https://realpython.com/python-gui-tkinter/
https://tkdocs.com/tutorial/widgets.html
https://tkdocs.com/tutorial/fonts.html
https://tkdocs.com/tutorial/styles.html
https://tkdocs.com/tutorial/idle.html
https://tkdocs.com/tutorial/morewidgets.html
https://tkdocs.com/tutorial/concepts.html
https://tkdocs.com/tutorial/index.html
https://www.activestate.com/blog/top-10-must-have-python-packages/
https://www.activestate.com/blog/neural-network-showdown-tensorflow-vs-pytorch/

0937-1234
Watched first 2.5 videos of https://developers.google.com/machine-learning/crash-course
Played with:
http://playground.tensorflow.org/#activation=sigmoid&batchSize=3&dataset=spiral&regDataset=reg-plane&learningRate=0.3&regularizationRate=0&noise=5&networkShape=8&seed=0.49245&showTestData=false&discretize=false&percTrainData=50&x=true&y=true&xTimesY=true&xSquared=true&ySquared=true&cosX=false&sinX=true&cosY=false&sinY=true&collectStats=false&problem=classification&initZero=false&hideText=false&dataset_hide=false
Note:
Vertical and horizontal depth of virtual neurons helps with ReLU and Tanh activation. I didn’t find a purpose for linear activation.
With sigmoid activation, horizontal depth actually killed the learning. Sigmoid activation with one horizontal depth seemed the best for the spiral dataset.
Sigmoid refers to an s-curve.
Learning rates of >= 1 were useless. Learning too fast means the weights (think myelinization) increase too quickly leading to clunky approximation.
Learning rates of < 0.01 were unpleasant because the weights took too long to update.
Slower learning led to more elegant weighting (less overall weight).
Limiting the number of input functions could help decrease calculation time if you knew which ones to choose.
Having too many input functions (6) was far superior to having too few (1). Lacking a key input function hamstrung the learning.

4hrs 50mins, 3hrs 10mins left – 1614 quit if no breaks
1324-1406
Read https://www.activestate.com/blog/neural-network-showdown-tensorflow-vs-pytorch/
Try code as written.
Fail “No module named ‘torch'”.
Read https://pytorch.org/.
Try “conda install pytorch torchvision -c pytorch” in Spyder.
Feedback no visible response after a little while.
Switch to reading.

1406-1442
Read “The Pragmatic Programmer” by David Thomas and Andrew Hunt, Ch1.

1443-1457
Success, pytorch appears to have installed while I was reading.
Try running code again.
Fail “NameError: name ‘data_x’ is not defined”. This will also be true of ‘data_y’.
Try commenting out original code and swapping in x_data and y_data.
Success. Received a message telling me to update my NVidia driver.

import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
import torch.autograd as autog

x_data = np.array([[0,0], [0,1], [1,0]])
y_data = np.array([[0,1,1]]).T
x_data = autog.Variable(torch.FloatTensor(x_data))
y_data = autog.Variable(torch.FloatTensor(y_data), requires_grad=False)

in_dim = 2
out_dim = 1
epochs = 15000
epoch_print = epochs/5
l_rate = 0.001

class NeuralNet(nn.Module):
    def __init__(self, input_size, output_size):
        super(NeuralNet, self).__init__()
        self.lin1 = nn.Linear(input_size, output_size)
    
    def forward(self, x):
        return self.lin1(x)

model = NeuralNet(in_dim, out_dim)
criterion = nn.BCEWithLogitsLoss()
optimizer = optim.Adam(model.parameters(), lr=l_rate)

for epoch in range(epochs):
    optimizer.zero_grad()
    pred = model(x_data)
    loss = criterion(pred, y_data)
    loss.backward()
    optimizer.step()
    if (epoch + 1) % epoch_print == 0:
        print("Epoch %d Loss %.3f" %(epoch + 1, loss.item()))

for x, y in zip(x_data, y_data):
    pred = model(x)
    print("Input", list(map(int, x)), "Pred", int(pred > 0), "Output", int(y))

Note:
The program didn’t deliver on what I understood to be it’s only task: determining the 4th state of the or function. Perhaps I misunderstand or the variable swap broke it.
PyTorch appears to be a thing I can now use.
PyTorch is an alternative to TensorFlow (Google’s neural network software package).

1457-1614
Read up to http://neuralnetworksanddeeplearning.com/chap1.html. It promises to recognize handwritten numbers with 74 lines of code.

20201217 summary
Tkinter is a built-in means of providing a GUI in Python.
Google offers a crash-course in machine learning: https://developers.google.com/machine-learning/crash-course.
More virtual neurons don’t always improve performance. Activation type, learning speed, input, etc. matter a great deal to ML performance.
I can use PyTorch.

Categories
Uncategorized

1c4 – Introduction to Algorithms

1 – Asymptotic Complexity, Peak Finding

Teaching Assistant:

Θ (theta) represents the complexity of the function. If you zoom out on a graph of computation time, which term of the function describing computation time dominates?
O (big O) represents the upper bound on the computation time of the function. As programmers don’t care how long a program takes if they get lucky, this is the primary measure of computation complexity they care about.
Ω (omega) represents the lower bound of the function’s complexity. This is the smallest amount of time the program will take to run. Note this may always be close to zero if the program gets lucky under certain (or cyclical) conditions.

He notes everything in the course will be:
♀♪♫☼►◄↕‼¶§▬↨↑↓→←∟↔▲▼ !”#$%&'()*+,-./01

Categories
Uncategorized

1c2 Intro to Computational Thinking and Data Science

1 – Optimization Problems

3 kinds of models:
Optimization models – start with an objective function to be minimized or maximized. This usually has a set of constraints.
Statistical models
Simulation models

Knapsack problem has 2 versions: the discrete version (0/1 version) where you take the item or you don’t and the continuous version where you can take fractions of items. The continuous version is boring; you use a greedy algorithm to take gold until the gold is gone or you run out of space, then repeat for silver, bronze, etc.
The discrete version is more complicated. If you use a brute force algorithm, it doesn’t work because running time is exponential.
The greedy algorithm asks what is the “best” item is and add that to the knapsack and repeat until you run out of space. This program’s main drawback in computational time is the sort. Python uses a version of the merge sort, so that happens in n log n time (log-linear time). Very efficient compared to brute force.

Lambda is used to create an anonymous function (it has no name). It builds a function to evaluate a sequence of identifiers and some expression, returning a function of n arguments. Professor Guttag maintains that lambda expressions should only be used if they fit on a single line. Otherwise write a function because they’re easier to debug.

Greedy algorithms optimize for a single variable, so while locally the best answer (local meaning for that variable), rarely globally the best answer. Professor Guttag drew two hills for a hypothetical hill-climbing algorithm. One hill is clearly taller than the other one. If the greedy algorithm never goes down, then the algorithm can get stuck on the lower hill. This optimizes locally (the local maximum) but not globally (the global maximum which is what was sought).

2 – Optimization Problems (cont.)

Pros of greedy algorithms:
→ easy to implement
→ computationally efficient (n log n time)
Cons of greedy algorithms:
→ does not always yield the best solution
→ you don’t know how good the approximation is (how close to optimal?)

Dynamic programming is meaningless name for a useful methodology. Store and recall answers that will be used more than once, a process known as memoization (like I created memo to remind myself of the answer).

Memoization on the recursive computing of the Fibonacci sequence. Note the use of a dictionary instead of a list.
def fastFib(n, memo = {}):
“””Assumes n is an int >= 0, memo used only by recursive calls
Returns Fibonacci of n”””
if n == 0 or n == 1:
return 1
try:
return memo[n]
except KeyError:
result = fastFib(n-1, memo) +\
fastFib(n-2, memo)
memo[n] = result
return result

Memoization works if you have two conditions:
→ Optimal substructure: a globally optimal solution can be found by combining optimal solutions to local subproblems.
→ Overlapping subproblems: finding an optimal solution involves solving the same problem multiple times.

Modifying maxVal (a program that attempts to get the most calories under a specific limit, a version of the knapsack problem) to use a memo:
→ add memo as a third argument
↑ def fastMaxVal(toConsider, avail, memo = {}):
→ key of memo is a tuple
↑ (items left to be considered, available weight)
↑ items left to be considered represented by len(toConsider)
→ first thing the body of function does is check whether the optimal choice of items given the available weight is already in the memo
→ last thing the body of the function does is update the memo

Comparing the runtimes of the two maxVal programs (the second with dynamic programming) makes the first look exponential and the second polynomial. If the problem is exponential, how can we get polynomial running times? Computational complexity can be subtle.

Many problems of practical importance can be formulated as optimization problems. Greedy algorithms often provide adequate (maybe not optimal) solutions. Finding an optimal solution is usually exponentially hard. Dynamic programming often yields good performance for a subclass of optimization problems–those with optimal substructure and overlapping subproblems–where the solution is always correct and the algorithm is fast in the right circumstances.

3 – Graph-theoretic Models

This lecture is given by Dr. Eric Grimson instead of Dr. John Guttag.

Graphs in computation have nodes (vertices) and edges (arcs) connecting those nodes. Edges can be directed or undirected (directed edges are arrows) and they can be weighted or unweighted (think Markov chains). A digraph is a directed graph, the edges pass only in one direction.

Trees are a special case of a graph in which any pair of nodes is connected by a single path.

Common graph problems:
→ Is there a path from A to B? (A sequence of edges and nodes.)
→ What is the least expensive path?
→ Graph Partition Problem – how do you separate a graph into unconnected connected elements?
→ Min-cut Max-flow Problem – is there an efficient way to separate out the highly connected elements, the things that interact a lot and separate out how many of those kinds of subgraphs are inside of my graph?

Class Node:
class Node(object):
def __init__(self, name):
“””Assumes name is a string”””
self.name = name
def getName(self):
return self.name
def __str__(self):
return self.name

Class Edge:
class Edge(object):
def __init__(self, src, dest):
“””Assumes src and dest are nodes”””
self.src = src
self.dest = dest
def getSource(self):
return self.src
def getDestination(self):
return self.dest
def __str__(self):
return self.src.getName() + ‘→’\
+ self.dest.getName()

A methodology known as an adjacency list stores the graph information by listing each node’s possible destinations using a single edge.

Class Digraph
class Digraph(object):
“””edges is a dict mapping each node to a list of its children”””

def __init__(self):
self.edges = {}
def addNode(self, node):
if node in self.edges:
raise ValueError(‘Duplicate node’)
else:
self.edges[node] = []
def addEdge(self, edge):
src = edge.getSource()
dest = edge.getDestination()
if not (src in self.edges and dest in self.edges):
raise ValueError(‘Node not in graph’)
self.edges[src].append(dest)
def childrenOf(self, node):
return self.edges[node]
def hasNode(self, node):
return node in self.edges
def getNode(self, name):
for n in self.edges:
if n.getName() == name:
return n
raise NameError(name)
def __str__(self):
result = ”
for src in self.edges:
for dest in self.edges[src]:
result = result + src.getName() + ‘→’\
+ dest.getName() + ‘\n’
return result[:-1] #omit final newline

Class Graph
class Graph(Digraph):
def addEdge(self, edge):
Digraph.addEdge(self, edge)
rev = Edge(edge.getDestination(), edge.getSource())
Digraph.addEdge(self, rev)

Adjacency list of an example graph:
Boston: Providence, New York
Providence: Boston, New York
New York: Chicago
Chicago: Denver, Phoenix
Denver: Phoenix, New York
Los Angeles: Boston
Phoenix [nothing]

Dr. Grimson proceeds to put that graph into code:

def buildCityGraph(graphType):
g = graphType()
for name in (‘Boston’, ‘Providence’, ‘New York’, ‘Chicago’, ‘Denver’,
‘Phoenix’, ‘Los Angeles’): # Create 7 nodes
g.addNode(Node(name))
g.addEdge(Edge(g.getNode(‘Boston’), g.getNode(‘Providence’)))
g.addEdge(Edge(g.getNode(‘Boston’), g.getNode(‘New York’)))
g.addEdge(Edge(g.getNode(‘Providence’), g.getNode(‘Boston’)))
g.addEdge(Edge(g.getNode(‘Providence’), g.getNode(‘New York’)))
g.addEdge(Edge(g.getNode(‘New York’), g.getNode(‘Chicago’)))
g.addEdge(Edge(g.getNode(‘Chicago’), g.getNode(‘Denver’)))
g.addEdge(Edge(g.getNode(‘Chicago’), g.getNode(‘Phoenix’)))
g.addEdge(Edge(g.getNode(‘Denver’), g.getNode(‘Phoenix’)))
g.addEdge(Edge(g.getNode(‘Denver’), g.getNode(‘New York’)))
g.addEdge(Edge(g.getNode(‘Los Angeles’), g.getNode(‘Boston’)))
return g

Depth-first search.
Unlike binary trees, graphs can contain loops. To prevent an endless loop, this algorithm will keep track of its path and not repeat a leg of the trip.
Finding a path can be broken down into finding various intermediate waypoints. Thus this can be solved recursively.
→ Look at origin node.
→ Look at all edges leaving the origin node, in some order.
→ Follow the first edge. Check to see if you’re at the destination, if you are, return path because you’re done.
→→ If not, repeat the process from new node.
→ Continue until either find goal node, or run out of options.
→→ When out of options, backtrack to the previous node and try the next edge, repeating this process.

def DFS(graph, start, end, path, shortest, toPring = False):
path = path + [start]
if toPrint:
print(‘Current DFS path:’, printPath(path))
if start == end:
return path
for node in graph.childrenOf(start):
if node not in path: # avoid cycles
if shortest == None or len(path) < len(shortest):
newPath = DFS(graph, node, end, path, shortest, toPrint)
if newPath != None:
shortest = newPath
elif toPrint:
print(‘Already visited’, node)
return shortest

def shortestPath(graph, start, end, toPrint = False):
return DFS(graph, start, end, [], None, toPrint)

The second function here is a wrapper function. It gets the recursion started properly (it has an empty list). It provides appropriate abstraction. To test these:

def testSP(source, destination):
g = buildCityGraph(DiGraph)
sp = shortestPath(g, g.getNode(source), g.getNode(destination),
toPrint = True)
if sp != None:
print(‘Shortest path from’, source, ‘to’, destination, ‘is’,
printPath(sp))
else:
print(‘There is no path from’, source, ‘to’, destination)

testSP(‘Boston’, ‘Chicago’)

Breadth First Search.
It’s like depth first search, except sideways instead of down. You always stop if you find a path to the destination because that’s always a shortest path.

def BFS(graph, start, end, toPrint = False):
initPath = [start]
pathQueue = [initPath]
while len(pathQueue) != 0:
#Get and remove oldest element in pathQueue
tmpPath = pathQueue.pop(0)
if toPrint:
print(‘Current BFS path:’, printPath(tmpPath))
if lastNode == end:
return tmpPath
for nextNode in graph.childrenOf(lastNode):
if nextNode not in tmpPath:
newPath = tmpPath + [nextNode]
pathQueue.append(newPath)
return None

Weighted Shortest Path.
This form of search assumes a cost for each leg of the journey.
Breadth first search doesn’t yield to modification for weighted shortest path because the longest leg taken first could result in the shortest path (all possibilities should be considered). So if you want weighted shortest path, modify depth first search.

Summary
Graphs are a collection of entities and connections between those entities. Searches are a means of finding a path between entities.

4 – Stochastic Thinking

Categories
Computer Science School

1c1 Intro to CS and Python

1 – What is Computation

A program is a recipe.

  • Sequence of steps
  • Flow of control, determining when steps execute
  • A means of determining when to stop

Basic machine architecture 

  • Memory – stores programs and data.
  • CPU
    • Control unit – program counter, tells ALU what ops to do on what data, increments when ALU performs, accesses programs.
    • Arithmetic logic unit – does primitive operations / tests, accesses data, write data.
  • Input/output

Alan Turing proved you could do anything with the six primitives. The six primitives are move left, move right, read, right, scan, and do nothing. We can abstract methods to create new primitives, so modern languages have more primitives. Anything computable in one language is computable in another language.

Errors

  • Syntactic errors (e.g. 3″hi”) – easily caught
  • Static semantic errors (e.g. 3 + “hi”) – some languages check prior to running. These can cause unpredictable behavior.
  • No semantic errors, but different meaning than user intends
    • Program crashes
    • Program runs forever
    • Program returns “incorrect” answer

 A program is a sequence of definitions and commands. It evaluates definitions and executes commands. These are the tests and operations from basic machine architecture.

Programs manipulate data objects. Objects have types defining allowable operations for that type. Objects are scalar or non-scalar. Scalar examples: int, float, bool, NoneType. You can use type parentheses parentheses to see an object’s type. You can convert objects from one type to another (float(3) converts integer to float, int(3.9) truncates a float to the integer 3, str(3) converts 3 to “3”).  Such conversion is called casting.

Operaters
+ → sum, concatenation for strings
– → difference
* → product, repetition for a number of times for strings (“Ho ” * 3 is “Ho Ho Ho “)
/ → division (always a float)
% → modulus
** → power (^ in some languages, the ^ is a bitwise operator in Python)
= → bind a value to a variable. Variables are case sensitive in Python!
# → comment a line
// → floor division, it rounds the result down (result is an integer)

Bitwise operaters
x<<y → x is suffixed by y zeroes. Equivalent to x*2**y.
x>>y → x is shifted right y places. Equivalent to x//2**y.
x&y → (and) for each bit, if x & y are 1, 1, else 0.
x|y → (or) for each bit, if either x or y are 1, 1, else 0.
~x → bitwise complement of x. 01 = 10.
x^y → bitwise exclusive or (xor). Flip a bit in x if bit in y is 1.

2 – Branching and Iteration

Input prints what is in quotes, user type something and presses enter, and the input is bound as a string to a variable.

text = input("Type a string.")
print(text)

If you want the input as another type, cast it as that type.

num = float(input("Enter a number."))
print(num)

Comparing (tests)

i > j #compares strings lexicographically
i >= j
i < j
i <= j
i == j
i != j

Boolean operators: not, and, or.

Control flow through branching (and here I thought if statements were nothing special).

if:
else:

if <condition for case 1>:
elif <condition for case 2>:
elif <condition for case 3>:
else:

Control flow through iteration (loops). Loops and if statements pay close attention to the indentation. If it’s not indented (4 spaces by convention), it’s not in the loop or part of the if statement.

while <condition>:
    Indented code
    More indented code
Code not in the while loop

#This:

n = 0
while n < 5:
    print(n)
    n = n + 1

#is equivalent to this:

for n in range(5):
    print(n)

Range(start, stop, step) 
Stop is the only required argument; start defaults to zero and step defaults to one. If provided two arguments, the function interprets them as the start and stop. The for loop continues until stop minus one. The arguments must be integers.

Break ends the loop it is found in.

For loops

  • known number of iterations
  • uses a counter
  • can be written as a while loop

While loops

  • unbounded number of iterations
  • can use a counter, but must initialize before the loop and increment it in the loop
  • not all loops can be written as for loops

If you enter into an infinite loop, try Ctrl + C.
“\n” starts a new line in text.

3 – String Manipulation, Guess and Check, Approximation, Bisection

s = "abc"
len(s) # is 3

#indexing
s[0]  # is a
s[1]  # is b
s[2]  # is c
s[3]  # is trying to index out of bounds, error
s[-1] # is c
s[-2] # is b
s[-3] # is a

#slicing
#differentiated from indexing by at least one colon
#[start : stop : step], step defaults to 1
s = "abcdefgh"
s[3:6]    # "def"
s[3:6:2]  # "df"
s[::]     # "abcdefgh", equivalent to s[0:len(s):1]
s[::-1]   # "hgfedbca", equivalent to s[-1:-(len(s)+1):-1]
s[4:1:-2] # "ec"

Strings are immutable. You cannot hot-swap letters in a string. You must redefine the string. You can use a string to redefine the string.

s = "hello"
s[0] = 'j' # this gives an error.
s = 'j' + s[1:len(s)]

# Check each letter in string s for an i or a u.
s = "abcdefghi"

for char in s:
    if char == 'i' or char == 'u':
        print("This character is an i or u.")
    else:
        print("This character is not an i or u.")

Cheerleader program Dr. Ana Bell wrote:

# An example written by Dr. Ana Bell

an_letters = "aefhilmnorsxAEFHILMNORSX"
word = input("I will cheer for you!  Enter a word: ")
times = int(input("Enthusiasm level (1-10): "))

for char in word:
    char = word[i]
    if char in an_letters:
        print("Give me an " + char + "! " + char)
    else:
        print("Give me a " + char + "! " + char)
print("What does that spell?")
for i in range(times):
    print(word, "!!!")

Guess-and-check methodology

cube = int(input("Enter an integer for cube root: "))
for guess in range(abs(cube)+1cube = int(input("Enter an integer for cube root: "))
for guess in range(abs(cube)+1):
    if guess**3 >= abs(cube):
        break
if guess**3 != abs(cube):
    print(cube, 'is not a perfect cube.')
else:
    if cube < 0:
        guess = -guess
    print('Cube root of '+str(cube)+' is '+str(guess))

This can be changed to be far more accurate using bisection when guessing, determining if the guess is high or low, and re-bisecting it accordingly. I changed Dr. Bell’s code to have the epsilon much smaller. It takes the program longer, but the results for perfect cubes are more likely to be integers.

cube = int(input("Enter an integer for cube root: "))
epsilon = .0000000000001
num_guesses = 0
low = 0
high = cube
guess = (high + low) / 2.0

while abs(guess**3-cube) >= epsilon:
    if guess**3 < cube:
        low = guess
    else:
        high = guess
    guess = (high + low) / 2.0
    num_guesses += 1
print('num_guesses =', num_guesses)
print(guess, 'is close to the cube root of', cube)

4 – Decomposition, Abstraction, Functions

Abstraction: you don’t need to know how a black box works in order to use the black box. E.g. we use computers and very few of us understand from start to finish how typing in a letter results in that letter being printed on the screen and stored in memory. Functions are the same way, they abstract blocks of code into easily read and used single-line entities. Suppress details.

Decomposition: divide the programming. Divide it into modules, functions, and classes.

  • self-contained
  • breaks up code
  • intended to be reusable
  • keeps code organized
  • keeps code coherent

Functions’ characteristics:

  • name
  • parameters (0 or more arguments)
  • docstring (optional but recommended for explaining what the function does to other people and future you)
  • body
  • returns something
def is_even(i):
    """
    Input: i, a positive int
    Returns True if i is even, otherwise False
    """
    print("inside is_even")
    return i%2 == 0

If the function doesn’t return anything, python will add “return None” if you don’t manually put it. (It’s optional.)

5 – Tuples, Lists, Aliasing, Mutability, and Cloning

Tuple: an ordered sequence of elements. You can mix element types. Tuples are immutable.
t = () # declares an empty tuple
t = (2, “MIT”, 3)
t[0] # evaluates to 2
(2, “MIT”, 3) + (5, 6) # evaluates to (2, “MIT”, 3, 5, 6)
t[1:2] # slicing the tuple results in (“MIT”, ). The comma in the parentheses indicates that it’s a tuple. Without the comma, it’s a string.
t[1:3] # slicing the tuple results in (“MIT”, 3)
len(t) # evaluates to 3
t[1] = 4 # results in error, can’t modify tuple object.

Tuples are convenient for swapping variable values.
You can’t swap variables by:
x = y
y = x
because you’re overwriting x.
You can:
temp = x
x = y
y = temp
But with tuples, it’s one line:
(x, y) = (y, x)

Functions are allowed to only return one object. Tuples are objects with multiple objects inside. So you can use tuples to get multiple things from a function.

def quotient_and_remainder(x, y):
    q = x // y
    r = x % y
    return(q, r)

You can iterate over tuples.

def get_data(aTuple):
    nums = ()
    words = ()
    for t in aTuple:
        nums = nums + (t[0],)
        if t[1] not in words:
            words = words + (t[1],)
    min_n = min(nums)
    max_n = max(nums)
    unique_words = len(words)
    return (min_n, max_n, unique_words)

Lists are a lot like tuples, but they are shown by square brackets instead of parentheses and are mutable!

Common pattern, iterate over list elements:

total = 0
for i in L:
    total += i
print(total)

# is equivalent to:

total = 0
for i in range(len(L)):
    total += L[i]
print(total)

# the first is more Pythonic.

Add elements to the list:
L = [2, 1, 3]
L.append(5) # changes the list L to [2, 1, 3, 5]

The dot in “.append” indicates something being done to an object. In the past I’ve found this frequently in VBA for Excel. Objects have data, methods, and functions. For instance, math.pi calls up the pi value from the math module.

L1 = [2, 1, 3]
L2 = [4, 5, 6]
L3 = L1 + L2 # L3 is [2, 1, 3, 4, 5, 6]. L1 and L2 remain unchanged.
L1.extend([0, 6]) # mutated L1 to [2, 1, 3, 0, 6].

L = [2, 1, 3, 6, 3, 7, 0]
L.remove(2) # L = [1, 3, 6, 3, 7, 0]. The function finds the first value 2 and removes it.
L.remove(3) # L = [1, 6, 3, 7, 0].
del(L[1]) # L = [1, 3, 7, 0]. The del function deletes nth element of the list.
L.pop() # function returns last element and deletes it. L = [1, 3, 7].

Converting lists to strings and back.
s = “I<3 cs”
list(s) # returns [‘I’, ‘<‘, ‘3’, ‘ ‘, ‘c’, ‘s’]. Note the returned object is a list.
s.split(‘<‘) # returns [‘I’, ‘3 cs’]. Again, string becomes list. If called without a parameter, split() splits on spaces.
L = [‘a’, ‘b’, ‘c’]
”.join(L) # returns “abc”. Note list becomes a string.
‘ ‘.join(L) # returns “a b c”.

Other list operations
L = [9, 6, 0, 3]
sorted(L) # returns a sorted list, but doesn’t modify L.
L.sort() # sorts L.
L.reverse() # reverse sorts L.

As lists are mutable, if you change one and refer to it in multiple places, it will change in each of those places.
a = 1
b = a
print(a)
> 1
print(b)
> 1 # because b is referring to a.

warm = [‘red’, ‘yellow’, ‘orange’]
hot = warm
hot.append(‘pink’)
print(hot)
> [‘red’, ‘yellow’, ‘orange’, ‘pink’]
print(warm)
> [‘red’, ‘yellow’, ‘orange’, ‘pink’]

Get around this by cloning (copying) the list for the new variable.
cool = [‘blue’, ‘green’, ‘grey’]
chill = cool[:] # that sweet cloning action
chill.append(‘black’)
print(chill)
> [‘blue’, ‘green’, ‘grey’, ‘black’]
print(cool)
> [‘blue’, ‘green’, ‘grey’] # because you cloned it and didn’t mess with the clone.

You can have lists in lists.

6 – Recursion and Dictionaries

This lecture was taught by Dr. Eric Grimson.

Recursion can be thought of as reduce and conquer. It’s a program that calls itself.
• It must have 1 or more base cases that are easy to solve. Once this is solved, the recursion ceases.
• It solve the same problem on some other input with the goal of simplifying the larger problem input.

def mult_iter(a, b):
    result = 0
    while b > 0:
        result += a
        b -= 1
    return result

# This adds a to itself b times.  The same, with recursion:

def mult(a, b):
    if b == 1: 
        return a # the if is the base case, an exit clause.
    else:
        return a + mult(a, b-1)

So recursion keeps reducing the problem until it gets to something it can solve directly.

def fact(n):
    if n == 1:
        return 1
    else:
        return n*fact(n-1)

fact(5)
    > 120

So, normally, if I return 1, that’s the output of the function. Why isn’t the output 1? Well, it is, five levels deep. Stack diagram:

Global scope → fact(5)
fact scope (5) → return 5 * fact(4)
fact scope (4) → return 4 * fact(3)
fact scope (3) → return 3 * fact(2)
fact scope (2) → return 2 * fact(1)
fact scope (1) → return 1

Global scope → fact(5)
fact scope (5) → return 5 * fact(4)
fact scope (4) → return 4 * fact(3)
fact scope (3) → return 3 * fact(2)
fact scope (2) → return 2 * 1

Global scope → fact(5)
fact scope (5) → return 5 * fact(4)
fact scope (4) → return 4 * fact(3)
fact scope (3) → return 3 * 2

Global scope → fact(5)
fact scope (5) → return 5 * fact(4)
fact scope (4) → return 4 * 6

Global scope → fact(5)
fact scope (5) → return 5 * 24

Global scope → 120

Dictionaries store pairs of data, a key and a value.
my_dictionary = {}
grades = {‘Ana’: ‘B’, ‘John’:’A+’, ‘Denise’:’A’, ‘Katy’:’A’}
grades[‘Sylvan’] = ‘A’ # add an entry
‘John’ in grades # returns True. This means you can test if a key exists.
‘Daniel’ in grades # returns False.
del(grades[‘Ana’]) # deletes Ana entry in the dictionary.

Dictionaries can have any data type for values.
Keys must be an immutable data type. Keys must be unique. No order is guaranteed. Dictionaries match “keys” to “values”.

7 – Testing, Debugging, Exceptions, and Assertions

In order to make testing easier, have as many functions as possible.
Check that your program runs (no syntax/semantic errors).
Know test cases; know what outputs you expect for given inputs.

Classes of tests:
1. Unit testing
→ validate each piece of the program
→ test each function separately
2. Regression testing
→ add tests for bugs as you find them
→ catch reintroduced errors that were previously fixed
→ back to 1.
3. Integration testing
→ does the overall program work?
→ resist the urge to rush this.
→ back to 1 and 2.

Testing approaches:
1. Intuition. Some programs will have natural boundaries. If the program is designed that one parameter is always bigger than the other, it won’t work if that’s violated. Document these and move on.
2. Random testing. The probability that the code works increases with more tests. Black box and glass box testing are preferred.
3. Black box testing. Explore paths through specification.
4. Glass box testing. Explore paths through code.

Black box testing of a square root program:
Boundary → 0, -1
Perfect square → 25, 400
Less than 1 → .05, .5
Irrational square root → 2
Large → 999,999,999,999
Small → 0.000 000 000 001

Glass box testing uses code to guide test case design.
Path-complete if testing evaluates every potential path.
Guidelines:
→ branches → exercise all parts of a conditional
→ for loops
→ → loop not entered
→ → loop executed once
→ → loop executed more than once
→ while loops
→ → same as for loops.
→ → look at all ways to exit the loop.

Notes on glass box testing:
• path-complete testing can still miss bugs
• you should still test boundary cases

Use print() to debug. Put it in loops, use it at approximate half-way points (bisection).

Study the program code. Don’t ask “what is wrong?”. Ask, “How did I get the unexpected result?”. Is it part of a family of errors?

Use the scientific method.
1. Study available data. (Gather more data.)
2. Form hypothesis.
3. Repeatable experiments.
4. Pick simplest input to test with.

Types of errors (exceptions):
1. trying to access beyond the limits of a list
→ test = [1, 2, 3]
→ test[4]
→ IndexError
2. trying to convert (cast) an inappropriate type
→ int(test)
→ TypeError
3. referencing a non-existent variable
→ a
→ NameError
4. mixing data types without appropriate coercion
→ ‘3’ / 4
→ TypeError
5. forgetting to close parentheses, quotes, etc.
→ a = len([1,2,3]
→ print(a)
→ SyntaxError
6. logic errors. Think before writing new code. Draw pictures, take a break. Explain the code to someone else or a rubber ducky.
7. AttributeError. Attribute reference fails.
8. ValueError. Operand type okay, but value is illegal.
9. IOError. IO system reports malfunction (e.g. file not found).

Do unit testing. Write a function, test the function, debug the function. Then do integration testing (testing the functions’ interplay). This systematic methodology will cut down on debugging time.

If you change the code, back it up first. Do this in a systematic way. The alternative is to possibly lose track of if what bugs are caused by the original coding or changes introduced.

Use except.
→ try:
→ [code]
→ except:
→ print(“Bug in user input.”)

You can use multiple except statements like if statements.
→ try:
→ [code]
→ except ValueError:
→ print(“Could not convert to a number.”)
→ except ZeroDivisionError:
→ print(“Cannot divide by zero.”)
→ except:
→ print(“Something went very wrong.”)

else block. A body of code executed after associated try body completes with no exceptions.

finally block. A body of code always executed after try, else, and except clauses, even if they raised errors or executed a break, continue, or return. Useful for clean-up code that should be run no matter what else happened (e.g. close a file).

What you can do with exceptions:
1. Fail silently. Substitute default values or just continue. Bad idea because user gets no warning.
2. Return an “error” value. What value do you choose? Complicates code having to check for a special value.
3. Stop execution, signal error condition. In Python, raise an exception.
→ raise Exception(“descriptive string”)

The Python assert (assertion) can head potential errors off at the pass.
→ def avg(grades):
→ assert not len(grades) == 0, ‘no grades data’
→ return sum(grades) / len(grades)
This raises an AssertionError for empty grade lists, otherwise runs fine.

The goal is to spot bugs as soon as they are introduced and make it clear what happened. Use assertions as a supplement to testing. Raise exceptions if users supply bad data input. Use assertions to:
• check types of arguments or values
• check that invariants on data structures are met
• check constraints on return values
• check for violations of constraints on procedures (e.g. no duplicates in a list)

8 – Object Oriented Programming

Objects
1. How is it represented? (Internal representation should be private.)
2. How do you manipulate it?

Implementing an object → defining a class. Using an object → using an instance of the class.
Creating a class: (1) define its name, (2) define its attributes.
Using a class: (1) create an instance of the class, (2) doing operations on that instance.

Defining an object in Python is similar to def (defining a function), indented lines following constitute the class’ definition.
• Coordinate is a subclass of object.
• Object is a superclass of Coordinate.
• Coordinate inherits all object’s attributes. (See next lecture.)

class Coordinate(object):

Attributes are data and procedures belonging to a class.
Data attributes are the types of data making a class.
• Coordinate is made of two numbers.
Methods (procedural attributes) are functions that only work with this class. Methods are how to interact with the object.
• Two coordinates have a distance between them, so x2+y2=d2 could be used to determine that distance.

Defining how to create an instance of a class uses the __init__() function. Yes, it uses two double underscores. Self is a parameter defining an instance. You can use any name, but convention is self, so stick to that.

class Coordinate(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

Once you’ve given how to create an instance of a class, proceed to creating methods for the class.

class Coordinate(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def distance (self, other):
        x_diff_sq = (self.x-other.x)**2
        y_diff_sq = (self.y-other.y)**2
        returnt (x_diff_sq + y_diff_sq)**0.5

Other than self and dot notation, methods behave just like functions; they take parameters, do operations, and return objects. Note how the distance method would work:

c = Coordinate(3,4)
zero = Coordinate(0,0)
print(c.distance(zero))

print(c.distance(zero)) equates to print(Coordinate.distance(c, zero)).

You’ll also want to define what print does for an object. Otherwise Python returns your defined class and the instance’s memory location.

    def __str__(self):
        return "<" + str(self.x) + ", " + str(self.y) + ">"

You can override many common operators to work with your class.
__add__(self, other) → self + other
__sub__(self, other) → self – other
__eq__(self, other) → self == other
__len__(self) → len(self)

See https://docs.python.org/3/reference/datamodel.html#basic-customization.

You can use isinstance() to check if an object is a type of object: print(isinstance(c, Coordinate)).

9 – Python Classes and Inheritance

Implement vs using classes. Implementing the class: define the class, define its attributes (what it is), define its methods (how to use it). Using the class: create instances of the object type and do operations with them.

Getter and Setter methods allow the user to retrieve and change (get and set) attributes of a class. While Python allows directly accessing and writing data from and to your class instance, this is discouraged.

Default arguments allow the user to not have to enter in certain information into a function.
def set_name(self, newname = “”):
self.name = newname

Classes can have hierarchies. A parent class (superclass) can have a child class (subclass) that inherits all data and behaviors of the parent class, adds more information, adds more behavior, and can override behavior. In the following examples, Animal is the superclass.
class Cat(Animal):
def speak(self):
print(“meow”)
def __str__(self):
return “cat:” + str(self.name) + “:” + str(self.age)

class Person(Animal):
Animal.__init__(self, age)
self.set_name(name)
self.friends = []
def get_friends(self):
return self.friends
def add_friend(self, fname):
if fname not in self.friends:
self.friends.append(fname)
def speak(self):
print(“hello”)
def age_diff(self, other):
diff = self.age – other.age
print(abs(diff), “year difference”)
def __str__(self):
return “person:” + str(self.name) + “:” + str(self.age)

You can create variables within a class that can be used to create a unique ID for each class instance. See the tag and rid (rabbit ID) below. Zfill(n) puts leading zeroes to make the number n characters long.
class Rabbit(Animal):
tag = 1
def __init__(self, age, parent1=None, parent2=None):
Animal.__init__(self, age)
self.parent1 = parent1
self.parent2 = parent2
self.rid = Rabbit.tag
Rabbit.tag += 1
def get_rid(self):
return str(self.rid).zfill(3)
def get_parent1(self):
return self.parent1
def get_parent2(self):
return self.parent2

In conclusion, classes allow consistent handling of data of the same type. This continues the ideas of decomposition and abstraction in programming. (Wikipedia: Decomposition in computer science, also known as factoring, is breaking a complex problem or system into parts that are easier to conceive, understand, program, and maintain. https://whatis.techtarget.com/definition/abstraction: Abstraction hides all but the relevant data about an object in order to reduce complexity and increase efficiency.)

10 – Understanding Program Efficiency, part 1

Efficiency is often a trade-off between processing time and memory storage. For instance, if you store the 100th and 101st terms every hundred places for the Fibonacci sequence, it reduces computing time but increases memory storage for the program. Different algorithms take different times and different implementations of those algorithms vary it time taken even further.

Methodologies:
1. time the methodologies and choose the fastest.
2. more abstractly, count the operations: mathematical operations, comparisons, setting values, and retrieving values.
3. order of growth (complexity classes) is what computer scientists use.

Timing a program:
import time
def c_to_f(c):
return c*9/5 + 32
t0 = time.clock()
c_to_f(100000)
t1 = time.clock() – t0
print(“t = “, t, “:”, t1, “s”)

One of the main issues with timing programs is the conflation of implementations of algorithms with the algorithms themselves. Run time varies between computers. Run time is not predictable based on small inputs. Time varies for different inputs and it’s difficult to express a relationship between inputs and time.

Counting operations of a program:
Assume these take place in constant time: mathematical operations, comparisons, assignments, accessing objects in memory. Count the number of operations executed as function of size of input. For instance, count the number of operations performed within a loop and then multiply that by the number of times the loop runs.

This is closer to what is desired because it (1) depends on the algorithm, (2) is independent of computers’ speed, and it (3) varies its results based on inputs thereby generating a relationship between input and count. It still has the drawbacks of (1) depending on implementations and a new drawback of no clear definition of which operations to count. This new drawback can be mitigated by defining and timing types of operations.

Orders of growth (“big O notation”):
You can take the best case, worst case, and average case scenarios. We focus on the worst case scenario.

See https://en.wikipedia.org/wiki/Big_O_notation, Orders of common functions; https://en.wikipedia.org/wiki/Time_complexity#Table_of_common_time_complexities. Know:
1. constant time. The program always takes a certain number of steps.
2. double logarithmic time.
3. logarithmic time.
4. fractional power.
5. linear time.
6. n log n time. AKA “log-linear”.
7. quadratic, cubic, polynomial time.
8. exponential time.
9. factorial time.
10. double exponential time.

def fact_iter(n):
“””assumes n an int >= 0″””
answer = 1
while n > 1:
answer *= n
n -= 1
return answer

The above computes factorials. Steps = 5n + 2. The professor counts each incremental operation as 2 steps because you calculate the new number and then set the variable to the new number. When calculating O(n), ignore additive constants, ignore multiplicative constants. You’re interested in the worst case scenario, the asymptotic behavior. The highest order term is the term that captures the complexity. Focus on dominant terms. Keep in mind the behavior of a function can be different for smaller values of n.

n**2 + 2n + 2: O(n**2)
n**2 + 100000n + 3**1000: O(n**2)
log(n) + n + 4: O(n)
0.0001*n*log(n) + 300n: O(n log n)
2n**30 + 3n: O(3**n) ← exponentials are always expensive to compute

Nested loops with variables in each range are generally quadratic. (Polynomial time.)

Your goal is to always be as close to linear time as possible.

Linear search on an unsorted list:
def linear_search(L, e):
found = False
for i in range(len(L)):
if e == L[i]:
found = True
return found
The algorithm searches through everything. This is linear.

Linear search on a sorted list:
def search(L, e):
for i in range(len(L)):
if L[i] == e:
return True
if L[i] > e:
return False
return False
This only looks until it reaches a number greater than the element sought. Overall complexity is still O(n), even though average runtime is very different.

Quadratic complexity shown in a function determining if L1 is a subset of L2, assuming no duplicate terms:
def is Subset(L1, L2):
for e1 in L1:
matched = False
for e2 in L2:
if e1 == e2:
matched = True
break
if not matched:
return False
return True

Quadratic complexity with a function finding the intersection of two lists:
def intersect(L1, L2):
tmp = []
for e1 in L1:
for e2 in L2:
if e1 == e2:
tmp.append(e1)
res = []
for i in tmp:
if not (e in res):
res.append(e)
return res
This is actually two quadratics; the second part has to look through the list it’s creating, so its quadratic as well.

11 – Understanding Program Efficiency, part 2

  1. constant time. The program always takes a certain number of steps.
  2. double logarithmic time.
  3. logarithmic time.
  4. fractional power.
  5. linear time.
  6. n log n time. AKA “log-linear”.
  7. quadratic, cubic, polynomial time.
  8. exponential
  9. factorial time.
  10. double exponential time.

Be as close to linear time as possible.

Logarithmic time example: bisection search through a sorted list of numbers. The algorithm will finish searching in a worst case scenario when 1 = n / 2**i, so i = log n. Complexity is O(log n).

def bisect_search1(L, e):
if L == []:
return False
elif len(L) == 1:
return L[0] == e
else:
half = len(L)//2
if L[half] > e:
return bisect_search1( L[:half], e)
else:
return bisect_search1( L[half:], e)

The above isn’t a true bisection search because it’s copying the list when it slices the list. Thus this bit of code isn’t done in log time, it’s done in linear time. This is suboptimal.

def bisect_search2(L, e):
def bisect_search_helper(L, e, low, high):
if high == low:
return L[low] == e
mid = (low + high) // 2
if L[mid] == e:
return True
elif L[mid] > e:
if low == mid: #nothing left to search
return False
else:
return bisect_search_helper(L, e, low, mid – 1)
else:
return bisect_search_helper(L, e, mid + 1, high)
if len(L) == 0:
return False
else:
return bisect_search_helper(L, e, 0, len(L) – 1)

Professor Eric Grimson emphasizes noting the approach of the algorithm. If it reduces the problem by one each time (or any fixed constant), it’s linear. If it reduces the problem by half (or any fraction of the solution space), it’s logarithmic.

Finding power sets (all possible subsets formed from a superset) is an exponential problem.
def genSubsets(L):
if len(L) == 0:
return [[]] # list of empty list
smaller = genSubsets(L[:-1]) # all subsets w/o last element
extra = L[-1:] # create a list of just last element
new = []
for small in smaller:
new.append(small + extra) # for all smaller solutions, add one with last element
return smaller + new # combine those with last element and those w/o

Complexity classes:
O(1) – code does not depend on size of problem
O(log n) – reduce problem in half each time through process
O(n) – simple iterative or recursive programs
O(n log n) – next lecture
O(n**c) – nested loops or recursive calls
O(c**n) – multiple recursive calls at each level

Complexity of common python functions:
Lists: n is len(L)
→ O(1) index
→ O(1) store
→ O(1) length
→ O(1) append
→ O(n) ==
→ O(n) remove
→ O(n) copy
→ O(n) reverse
→ O(n) iteration
→ O(n) in list
Dictionaries: n is len(d)
worst case:
→ O(n) index
→ O(n) store
→ O(n) length
→ O(n) delete
→ O(n) iteration
average case:
→ O(1) index
→ O(1) store
→ O(1) delete
→ O(n) iteration

12 – Searching and Sorting

Searching – finding an item or group of items from a class. Types of search:
→ brute force – look at every item in the list.
→ bisection – in a sorted list, compare the query against the middle and cut the population in two, repeat.
→ linear search on sorted list – check each item sequentially until the item checked is larger than the queried item.

Types of sort:
→ monkey sort (AKA slow sort, permutation sort, shotgun sort, etc.) – randomize or go through all permutations until they’re sorted. Complexity is unbounded if unlucky. Anything worse than double exponentiation should not be considered. This is bad.
def bogo_sort(L):
while not is_sorted(L):
random.shuffle(L)
→ bubble sort – compare each pair stepping along L[i], swapping them if the larger is closer to L[0].
def bubble_sort(L):
swap = False
while not swap:
swap = True
for j in range(1, len(L)):
if L[ j – 1 ] > L[ j ]:
swap = False
temp = L[ j ]
L[ j ] = L[ j – 1 ]
L[ j – 1 ] = temp
→ selection sort – find the smallest element, move it to the start of the list (then the second, third, etc.).
→ merge sort – split the list into groups of one. Compare two at a time, switching so the smallest is at front. Continue this process, merging the groups into larger sorted groups by ever comparing the first of each group. This is n log n time (log-linear time). As such, it beats the pants off of all the quadratic time search algorithms above. It’s also the best algorithm known for worst-case scenario sorts.
def merge(left, right):
result = []
i, j = 0, 0
while i < len(left) and j < len(right):
if left[i] < right[j]:
result.append(left[i])
i += 1
else:
result.append(right[j])
j += 1
while (i < len(left)):
result.append(left[i])
i += 1
while (j < len(right)):
result.append(right[j])
j += 1
return result
def merge_sort(L):
if len(L) < 2:
return L[:]
else:
middle = len(L) // 2
left = merge_sort(L[ :middle])
right = merge_sort(L[middle: ])
return merge(left, right)

End of Course. I really enjoyed Doctors Ana Bell and Eric Grimson. MIT Opencourseware delivers the goods.

Categories
Computer Science School

01 12

Progress:

Intro to CS & Python, 2 lectures.
Think Python, 2 chapters.

  1. 8/12 Intro to CS & Python – Dr. Ana Bell, Dr. Eric Grimson
  2. 1/22 Structures
  3. 2/30 AI – Dr. Patrick Winston
  4. 0/4 (0/11, 0/11, 0/5, 0/8) Math for CS
  5. 1/11 Puzzled
  6. ? EECS, Remedial 8/19 Think Python – Dr. Allen Downey

***

Testing, Debugging, Exceptions, and Assertions; Intro to CS and Python lecture #7

In order to make testing easier, have as many functions as possible.
Check that your program runs (no syntax/semantic errors).
Know test cases; know what outputs you expect for given inputs.

Classes of tests:
1. Unit testing
→ validate each piece of the program
→ test each function separately
2. Regression testing
→ add tests for bugs as you find them
→ catch reintroduced errors that were previously fixed
→ back to 1.
3. Integration testing
→ does the overall program work?
→ resist the urge to rush this.
→ back to 1 and 2.

Testing approaches:
1. Intuition. Some programs will have natural boundaries. If the program is designed that one parameter is always bigger than the other, it won’t work if that’s violated. Document these and move on.
2. Random testing. The probability that the code works increases with more tests. Black box and glass box testing are preferred.
3. Black box testing. Explore paths through specification.
4. Glass box testing. Explore paths through code.

Black box testing of a square root program:
Boundary → 0, -1
Perfect square → 25, 400
Less than 1 → .05, .5
Irrational square root → 2
Large → 999,999,999,999
Small → 0.000 000 000 001

Glass box testing uses code to guide test case design.
Path-complete if testing evaluates every potential path.
Guidelines:
→ branches → exercise all parts of a conditional
→ for loops
→ → loop not entered
→ → loop executed once
→ → loop executed more than once
→ while loops
→ → same as for loops.
→ → look at all ways to exit the loop.

Notes on glass box testing:
• path-complete testing can still miss bugs
• you should still test boundary cases

Use print() to debug. Put it in loops, use it at approximate half-way points (bisection).

Study the program code. Don’t ask “what is wrong?”. Ask, “How did I get the unexpected result?”. Is it part of a family of errors?

Use the scientific method.
1. Study available data. (Gather more data.)
2. Form hypothesis.
3. Repeatable experiments.
4. Pick simplest input to test with.

Types of errors (exceptions):
1. trying to access beyond the limits of a list
→ test = [1, 2, 3]
→ test[4]
→ IndexError
2. trying to convert (cast) an inappropriate type
→ int(test)
→ TypeError
3. referencing a non-existent variable
→ a
→ NameError
4. mixing data types without appropriate coercion
→ ‘3’ / 4
→ TypeError
5. forgetting to close parentheses, quotes, etc.
→ a = len([1,2,3]
→ print(a)
→ SyntaxError
6. logic errors. Think before writing new code. Draw pictures, take a break. Explain the code to someone else or a rubber ducky.
7. AttributeError. Attribute reference fails.
8. ValueError. Operand type okay, but value is illegal.
9. IOError. IO system reports malfunction (e.g. file not found).

Do unit testing. Write a function, test the function, debug the function. Then do integration testing (testing the functions’ interplay). This systematic methodology will cut down on debugging time.

If you change the code, back it up first. Do this in a systematic way. The alternative is to possibly lose track of if what bugs are caused by the original coding or changes introduced.

Use except.
→ try:
→ [code]
→ except:
→ print(“Bug in user input.”)

You can use multiple except statements like if statements.
→ try:
→ [code]
→ except ValueError:
→ print(“Could not convert to a number.”)
→ except ZeroDivisionError:
→ print(“Cannot divide by zero.”)
→ except:
→ print(“Something went very wrong.”)

else block. A body of code executed after associated try body completes with no exceptions.

finally block. A body of code always executed after try, else, and except clauses, even if they raised errors or executed a break, continue, or return. Useful for clean-up code that should be run no matter what else happened (e.g. close a file).

What you can do with exceptions:
1. Fail silently. Substitute default values or just continue. Bad idea because user gets no warning.
2. Return an “error” value. What value do you choose? Complicates code having to check for a special value.
3. Stop execution, signal error condition. In Python, raise an exception.
→ raise Exception(“descriptive string”)

The Python assert (assertion) can head potential errors off at the pass.
→ def avg(grades):
→ assert not len(grades) == 0, ‘no grades data’
→ return sum(grades) / len(grades)
This raises an AssertionError for empty grade lists, otherwise runs fine.

The goal is to spot bugs as soon as they are introduced and make it clear what happened. Use assertions as a supplement to testing. Raise exceptions if users supply bad data input. Use assertions to:
• check types of arguments or values
• check that invariants on data structures are met
• check constraints on return values
• check for violations of constraints on procedures (e.g. no duplicates in a list)

***

Object Oriented Programming; Intro to CS and Python lecture #8

Objects
1. How is it represented? (Internal representation should be private.)
2. How do you manipulate it?

Implementing an object → defining a class. Using an object → using an instance of the class.
Creating a class: (1) define its name, (2) define its attributes.
Using a class: (1) create an instance of the class, (2) doing operations on that instance.

Defining an object in Python is similar to def (defining a function), indented lines following constitute the class’ definition.
• Coordinate is a subclass of object.
• Object is a superclass of Coordinate.
• Coordinate inherits all object’s attributes. (See next lecture.)

class Coordinate(object):

Attributes are data and procedures belonging to a class.
Data attributes are the types of data making a class.
• Coordinate is made of two numbers.
Methods (procedural attributes) are functions that only work with this class. Methods are how to interact with the object.
• Two coordinates have a distance between them, so x2+y2=d2 could be used to determine that distance.

Defining how to create an instance of a class uses the __init__() function. Yes, it uses two double underscores. Self is a parameter defining an instance. You can use any name, but convention is self, so stick to that.

class Coordinate(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

Once you’ve given how to create an instance of a class, proceed to creating methods for the class.

class Coordinate(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def distance (self, other):
        x_diff_sq = (self.x-other.x)**2
        y_diff_sq = (self.y-other.y)**2
        returnt (x_diff_sq + y_diff_sq)**0.5

Other than self and dot notation, methods behave just like functions; they take parameters, do operations, and return objects. Note how the distance method would work:

c = Coordinate(3,4)
zero = Coordinate(0,0)
print(c.distance(zero))

print(c.distance(zero)) equates to print(Coordinate.distance(c, zero)).

You’ll also want to define what print does for an object. Otherwise Python returns your defined class and the instance’s memory location.

    def __str__(self):
        return "<" + str(self.x) + ", " + str(self.y) + ">"

You can override many common operators to work with your class.
__add__(self, other) → self + other
__sub__(self, other) → self – other
__eq__(self, other) → self == other
__len__(self) → len(self)

See https://docs.python.org/3/reference/datamodel.html#basic-customization.

You can use isinstance() to check if an object is a type of object: print(isinstance(c, Coordinate)).

***

Think Python CH 8

Printing a string backwards.

>>> def backwardString(s):
...     n = ""
...     i = 1
...     while len(n) < len(s):
...         n = n + s[len(s)-i]
...         i += 1
...     return n
...
>>> print(backwardString("Marcus Aurelius"))
suileruA sucraM

Printing a string backwards, one character per line.

>>> def backwardAcrostic(s):
...     i = 1
...     while i <= len(s):
...         print(s[len(s)-i])
...         i += 1
...
>>> backwardAcrostic("ABC")
C
B
A

***

Categories
Uncategorized

01 11

No progress. (Sunday.)

  1. 6/12 Intro to CS & Python – Dr. Ana Bell, Dr. Eric Grimson
  2. 1/22 Structures
  3. 2/30 AI – Dr. Patrick Winston
  4. 0/4 (0/11, 0/11, 0/5, 0/8) Math for CS
  5. 1/11 Puzzled
  6. ? EECS, Remedial 6/19 Think Python – Dr. Allen Downey
Categories
Uncategorized

01 10

No progress. (Saturday.)

  1. 6/12 Intro to CS & Python – Dr. Ana Bell, Dr. Eric Grimson
  2. 1/22 Structures
  3. 2/30 AI – Dr. Patrick Winston
  4. 0/4 (0/11, 0/11, 0/5, 0/8) Math for CS
  5. 1/11 Puzzled
  6. ? EECS, Remedial 6/19 Think Python – Dr. Allen Downey
Categories
Uncategorized

01 09

Progress

Puzzled, first puzzle.

  1. 6/12 Intro to CS & Python – Dr. Ana Bell, Dr. Eric Grimson
  2. 1/22 Structures
  3. 2/30 AI – Dr. Patrick Winston
  4. 0/4 (0/11, 0/11, 0/5, 0/8) Math for CS
  5. 1/11 Puzzled
  6. ? EECS, Remedial 6/19 Think Python – Dr. Allen Downey
Categories
Computer Science School

01 08

Progress

Think Python, 1 chapter.

  1. 6/12 Intro to CS & Python – Dr. Ana Bell, Dr. Eric Grimson
  2. 1/22 Structures
  3. 2/30 AI – Dr. Patrick Winston
  4. 0/4 (0/11, 0/11, 0/5, 0/8) Math for CS
  5. 0/11 Puzzled
  6. ? EECS, Remedial 6/19 Think Python – Dr. Allen Downey

***

Euler Project 1-4.

***

Think Python, CH6

The sixth chapter of Think Python seems to have greatly influenced the sixth lecture of Intro to CS & Python. They even use some of the same examples. I don’t know if that’s because there are only so many elementary recursion examples and those include factorials and Fibonacci bunnies, but see notes from yesterday.

***

Categories
Computer Science School

01 07

Progress

Think Python, 2 chapters.
Intro to CS & Python, 2 lectures.

  1. 6/12 Intro to CS & Python – Dr. Ana Bell, Dr. Eric Grimson
  2. 1/22 Structures
  3. 2/30 AI – Dr. Patrick Winston
  4. 0/4 (0/11, 0/11, 0/5, 0/8) Math for CS
  5. 0/11 Puzzled
  6. ? EECS, Remedial 5/19 Think Python – Dr. Allen Downey

***

Think Python, CH 4

Interface design. Dr. Downey covers the following concepts.

Generalization adds parameters to functions to make them able to perform a broader range of actions. Dr. Downey’s example is taking a function that draws a square and making it to draw regular polygons instead. It can still draw squares, but it can draw other polygons too.

“It is legal, and sometimes helpful, to include the names of the parameters in the argument list: polygon(bob, n=7, length=70)”

Refactoring rearranges code to maximize code reuse. It’s like factoring in math; you factor out the constants.

A development plan (verbatim):

  1. Start by writing a small program with no function definitions.
  2. Once you get the program working, encapsulate it in a function and give it a name.
  3. Generalize the function by adding appropriate parameters.
  4. Repeat steps 1–3 until you have a set of working functions. Copy and paste working code to avoid retyping (and re-debugging).
  5. Look for opportunities to improve the program by refactoring. For example, if you have similar code in several places, consider factoring it into an appropriately general function.

Dr. Downey covers docstrings in making functions. Unlike where I’ve seen it before (was that Dr. Bell’s example?), Dr. Downey starts the explanation of the function right after the first triple quote, not the next line.

“””Draws n line segments with the given length and angle (in degrees) between them. t is a turtle.
“””

Instead of:

“””
Draws n line segments with the given length and angle (in degrees) between them. t is a turtle.
“””

In the perennial debugging section of the chapter, Dr. Downey argues that as long as the required preconditions of the function are properly documented, failure to adhere to them is user error. Along with preconditions, he mentions postconditions. I don’t like his definition, so I’ll go with Wikipedia: “a condition or predicate that must always be true just after the execution of some section of code or after an operation in a formal specification”. Dr. Downey makes it sound like the execution of the function is also a postcondition which violates the “post” portion of the word. A precondition is before the function, a postcondition is after the function.

***

Think Python, CH 5

Modulus operator. 7 % 3 = 1 because 7 / 3 has a remainder of 1. x % 10 gives the last digit of x. x % 100 gives the last 2 digits of x.

When covering logical operators, he points out that any number other zero is interpreted as “true”. I tested negative numbers and nonintegers. They are interpreted as true. When I put in 0, instead of returning False like I was expecting, it returned 0. When I put False into myBool variable, it returned False.

He covers chained and nested if statements.

Recursion. He covers functions that call themselves. I enjoyed his example. My version of Python requires parentheses for the print function.

>>> def countdown(n):
...     if n <= 0:
...         print("Blastoff!")
...     else:
...         print(n)
...         countdown(n-1)
...
>>> countdown(3)
3
2
1
Blastoff!
>>> countdown(10)
10
9
8
7
6
5
4
3
2
1
Blastoff!

He covers stack diagrams for recursive functions, keeping track of what’s going on.
<module>
countdown – n→3
countdown – n→2
countdown – n→1
countdown – n→0

Infinite recursion doesn’t happen because Python caps it at 1,000.

Error messages in Python point to where a problem is discovered, not where it is caused.

***

Turtles in Python

So in Chapter 4, Dr. Downey talked about using turtles and downloading something from his site. I wasn’t inclined to do so. But I did a search for “draw in Python” and found that in 3.9, turtles are baked into Python (https://docs.python.org/3/library/turtle.html). I only messed around with it briefly, but I might go back and do the turtle examples in chapters 4 and 5.

With turtles imported, I drew an 18 pointed star. This reminds me of the Spirograph I had when I was a kid.

>>> reset()
>>> for x in range(18):
...     turtle.forward(100) #pixels
...     turtle.right(100) #degrees

***

Tuples, Lists, Aliasing, Mutability, and Cloning; Intro to CS and Python lecture #5

Tuple: an ordered sequence of elements. You can mix element types. Tuples are immutable.
t = () # declares an empty tuple
t = (2, “MIT”, 3)
t[0] # evaluates to 2
(2, “MIT”, 3) + (5, 6) # evaluates to (2, “MIT”, 3, 5, 6)
t[1:2] # slicing the tuple results in (“MIT”, ). The comma in the parentheses indicates that it’s a tuple. Without the comma, it’s a string.
t[1:3] # slicing the tuple results in (“MIT”, 3)
len(t) # evaluates to 3
t[1] = 4 # results in error, can’t modify tuple object.

Tuples are convenient for swapping variable values.
You can’t swap variables by:
x = y
y = x
because you’re overwriting x.
You can:
temp = x
x = y
y = temp
But with tuples, it’s one line:
(x, y) = (y, x)

Functions are allowed to only return one object. Tuples are objects with multiple objects inside. So you can use tuples to get multiple things from a function.

def quotient_and_remainder(x, y):
    q = x // y
    r = x % y
    return(q, r)

You can iterate over tuples.

def get_data(aTuple):
    nums = ()
    words = ()
    for t in aTuple:
        nums = nums + (t[0],)
        if t[1] not in words:
            words = words + (t[1],)
    min_n = min(nums)
    max_n = max(nums)
    unique_words = len(words)
    return (min_n, max_n, unique_words)

Lists are a lot like tuples, but they are shown by square brackets instead of parentheses and are mutable!

Common pattern, iterate over list elements:

total = 0
for i in L:
    total += i
print(total)

# is equivalent to:

total = 0
for i in range(len(L)):
    total += L[i]
print(total)

# the first is more Pythonic.

Add elements to the list:
L = [2, 1, 3]
L.append(5) # changes the list L to [2, 1, 3, 5]

The dot in “.append” indicates something being done to an object. In the past I’ve found this frequently in VBA for Excel. Objects have data, methods, and functions. For instance, math.pi calls up the pi value from the math module.

L1 = [2, 1, 3]
L2 = [4, 5, 6]
L3 = L1 + L2 # L3 is [2, 1, 3, 4, 5, 6]. L1 and L2 remain unchanged.
L1.extend([0, 6]) # mutated L1 to [2, 1, 3, 0, 6].

L = [2, 1, 3, 6, 3, 7, 0]
L.remove(2) # L = [1, 3, 6, 3, 7, 0]. The function finds the first value 2 and removes it.
L.remove(3) # L = [1, 6, 3, 7, 0].
del(L[1]) # L = [1, 3, 7, 0]. The del function deletes nth element of the list.
L.pop() # function returns last element and deletes it. L = [1, 3, 7].

Converting lists to strings and back.
s = “I<3 cs”
list(s) # returns [‘I’, ‘<‘, ‘3’, ‘ ‘, ‘c’, ‘s’]. Note the returned object is a list.
s.split(‘<‘) # returns [‘I’, ‘3 cs’]. Again, string becomes list. If called without a parameter, split() splits on spaces.
L = [‘a’, ‘b’, ‘c’]
”.join(L) # returns “abc”. Note list becomes a string.
‘ ‘.join(L) # returns “a b c”.

Other list operations
L = [9, 6, 0, 3]
sorted(L) # returns a sorted list, but doesn’t modify L.
L.sort() # sorts L.
L.reverse() # reverse sorts L.

As lists are mutable, if you change one and refer to it in multiple places, it will change in each of those places.
a = 1
b = a
print(a)
> 1
print(b)
> 1 # because b is referring to a.

warm = [‘red’, ‘yellow’, ‘orange’]
hot = warm
hot.append(‘pink’)
print(hot)
> [‘red’, ‘yellow’, ‘orange’, ‘pink’]
print(warm)
> [‘red’, ‘yellow’, ‘orange’, ‘pink’]

Get around this by cloning (copying) the list for the new variable.
cool = [‘blue’, ‘green’, ‘grey’]
chill = cool[:] # that sweet cloning action
chill.append(‘black’)
print(chill)
> [‘blue’, ‘green’, ‘grey’, ‘black’]
print(cool)
> [‘blue’, ‘green’, ‘grey’] # because you cloned it and didn’t mess with the clone.

You can have lists in lists.

Recursion and Dictionaries; Intro to CS and Python lecture #6

This lecture was taught by Dr. Eric Grimson.

Recursion can be thought of as reduce and conquer. It’s a program that calls itself.
• It must have 1 or more base cases that are easy to solve. Once this is solved, the recursion ceases.
• It solve the same problem on some other input with the goal of simplifying the larger problem input.

def mult_iter(a, b):
    result = 0
    while b > 0:
        result += a
        b -= 1
    return result

# This adds a to itself b times.  The same, with recursion:

def mult(a, b):
    if b == 1: 
        return a # the if is the base case, an exit clause.
    else:
        return a + mult(a, b-1)

So recursion keeps reducing the problem until it gets to something it can solve directly.

def fact(n):
    if n == 1:
        return 1
    else:
        return n*fact(n-1)

fact(5)
    > 120

So, normally, if I return 1, that’s the output of the function. Why isn’t the output 1? Well, it is, five levels deep. Stack diagram:

Global scope → fact(5)
fact scope (5) → return 5 * fact(4)
fact scope (4) → return 4 * fact(3)
fact scope (3) → return 3 * fact(2)
fact scope (2) → return 2 * fact(1)
fact scope (1) → return 1

Global scope → fact(5)
fact scope (5) → return 5 * fact(4)
fact scope (4) → return 4 * fact(3)
fact scope (3) → return 3 * fact(2)
fact scope (2) → return 2 * 1

Global scope → fact(5)
fact scope (5) → return 5 * fact(4)
fact scope (4) → return 4 * fact(3)
fact scope (3) → return 3 * 2

Global scope → fact(5)
fact scope (5) → return 5 * fact(4)
fact scope (4) → return 4 * 6

Global scope → fact(5)
fact scope (5) → return 5 * 24

Global scope → 120

Dictionaries store pairs of data, a key and a value.
my_dictionary = {}
grades = {‘Ana’: ‘B’, ‘John’:’A+’, ‘Denise’:’A’, ‘Katy’:’A’}
grades[‘Sylvan’] = ‘A’ # add an entry
‘John’ in grades # returns True. This means you can test if a key exists.
‘Daniel’ in grades # returns False.
del(grades[‘Ana’]) # deletes Ana entry in the dictionary.

Dictionaries can have any data type for values.
Keys must be an immutable data type. Keys must be unique. No order is guaranteed. Dictionaries match “keys” to “values”.