Categories
Computer Science School

1c1 Intro to CS and Python

1 – What is Computation

A program is a recipe.

  • Sequence of steps
  • Flow of control, determining when steps execute
  • A means of determining when to stop

Basic machine architecture 

  • Memory – stores programs and data.
  • CPU
    • Control unit – program counter, tells ALU what ops to do on what data, increments when ALU performs, accesses programs.
    • Arithmetic logic unit – does primitive operations / tests, accesses data, write data.
  • Input/output

Alan Turing proved you could do anything with the six primitives. The six primitives are move left, move right, read, right, scan, and do nothing. We can abstract methods to create new primitives, so modern languages have more primitives. Anything computable in one language is computable in another language.

Errors

  • Syntactic errors (e.g. 3″hi”) – easily caught
  • Static semantic errors (e.g. 3 + “hi”) – some languages check prior to running. These can cause unpredictable behavior.
  • No semantic errors, but different meaning than user intends
    • Program crashes
    • Program runs forever
    • Program returns “incorrect” answer

 A program is a sequence of definitions and commands. It evaluates definitions and executes commands. These are the tests and operations from basic machine architecture.

Programs manipulate data objects. Objects have types defining allowable operations for that type. Objects are scalar or non-scalar. Scalar examples: int, float, bool, NoneType. You can use type parentheses parentheses to see an object’s type. You can convert objects from one type to another (float(3) converts integer to float, int(3.9) truncates a float to the integer 3, str(3) converts 3 to “3”).  Such conversion is called casting.

Operaters
+ → sum, concatenation for strings
– → difference
* → product, repetition for a number of times for strings (“Ho ” * 3 is “Ho Ho Ho “)
/ → division (always a float)
% → modulus
** → power (^ in some languages, the ^ is a bitwise operator in Python)
= → bind a value to a variable. Variables are case sensitive in Python!
# → comment a line
// → floor division, it rounds the result down (result is an integer)

Bitwise operaters
x<<y → x is suffixed by y zeroes. Equivalent to x*2**y.
x>>y → x is shifted right y places. Equivalent to x//2**y.
x&y → (and) for each bit, if x & y are 1, 1, else 0.
x|y → (or) for each bit, if either x or y are 1, 1, else 0.
~x → bitwise complement of x. 01 = 10.
x^y → bitwise exclusive or (xor). Flip a bit in x if bit in y is 1.

2 – Branching and Iteration

Input prints what is in quotes, user type something and presses enter, and the input is bound as a string to a variable.

text = input("Type a string.")
print(text)

If you want the input as another type, cast it as that type.

num = float(input("Enter a number."))
print(num)

Comparing (tests)

i > j #compares strings lexicographically
i >= j
i < j
i <= j
i == j
i != j

Boolean operators: not, and, or.

Control flow through branching (and here I thought if statements were nothing special).

if:
else:

if <condition for case 1>:
elif <condition for case 2>:
elif <condition for case 3>:
else:

Control flow through iteration (loops). Loops and if statements pay close attention to the indentation. If it’s not indented (4 spaces by convention), it’s not in the loop or part of the if statement.

while <condition>:
    Indented code
    More indented code
Code not in the while loop

#This:

n = 0
while n < 5:
    print(n)
    n = n + 1

#is equivalent to this:

for n in range(5):
    print(n)

Range(start, stop, step) 
Stop is the only required argument; start defaults to zero and step defaults to one. If provided two arguments, the function interprets them as the start and stop. The for loop continues until stop minus one. The arguments must be integers.

Break ends the loop it is found in.

For loops

  • known number of iterations
  • uses a counter
  • can be written as a while loop

While loops

  • unbounded number of iterations
  • can use a counter, but must initialize before the loop and increment it in the loop
  • not all loops can be written as for loops

If you enter into an infinite loop, try Ctrl + C.
“\n” starts a new line in text.

3 – String Manipulation, Guess and Check, Approximation, Bisection

s = "abc"
len(s) # is 3

#indexing
s[0]  # is a
s[1]  # is b
s[2]  # is c
s[3]  # is trying to index out of bounds, error
s[-1] # is c
s[-2] # is b
s[-3] # is a

#slicing
#differentiated from indexing by at least one colon
#[start : stop : step], step defaults to 1
s = "abcdefgh"
s[3:6]    # "def"
s[3:6:2]  # "df"
s[::]     # "abcdefgh", equivalent to s[0:len(s):1]
s[::-1]   # "hgfedbca", equivalent to s[-1:-(len(s)+1):-1]
s[4:1:-2] # "ec"

Strings are immutable. You cannot hot-swap letters in a string. You must redefine the string. You can use a string to redefine the string.

s = "hello"
s[0] = 'j' # this gives an error.
s = 'j' + s[1:len(s)]

# Check each letter in string s for an i or a u.
s = "abcdefghi"

for char in s:
    if char == 'i' or char == 'u':
        print("This character is an i or u.")
    else:
        print("This character is not an i or u.")

Cheerleader program Dr. Ana Bell wrote:

# An example written by Dr. Ana Bell

an_letters = "aefhilmnorsxAEFHILMNORSX"
word = input("I will cheer for you!  Enter a word: ")
times = int(input("Enthusiasm level (1-10): "))

for char in word:
    char = word[i]
    if char in an_letters:
        print("Give me an " + char + "! " + char)
    else:
        print("Give me a " + char + "! " + char)
print("What does that spell?")
for i in range(times):
    print(word, "!!!")

Guess-and-check methodology

cube = int(input("Enter an integer for cube root: "))
for guess in range(abs(cube)+1cube = int(input("Enter an integer for cube root: "))
for guess in range(abs(cube)+1):
    if guess**3 >= abs(cube):
        break
if guess**3 != abs(cube):
    print(cube, 'is not a perfect cube.')
else:
    if cube < 0:
        guess = -guess
    print('Cube root of '+str(cube)+' is '+str(guess))

This can be changed to be far more accurate using bisection when guessing, determining if the guess is high or low, and re-bisecting it accordingly. I changed Dr. Bell’s code to have the epsilon much smaller. It takes the program longer, but the results for perfect cubes are more likely to be integers.

cube = int(input("Enter an integer for cube root: "))
epsilon = .0000000000001
num_guesses = 0
low = 0
high = cube
guess = (high + low) / 2.0

while abs(guess**3-cube) >= epsilon:
    if guess**3 < cube:
        low = guess
    else:
        high = guess
    guess = (high + low) / 2.0
    num_guesses += 1
print('num_guesses =', num_guesses)
print(guess, 'is close to the cube root of', cube)

4 – Decomposition, Abstraction, Functions

Abstraction: you don’t need to know how a black box works in order to use the black box. E.g. we use computers and very few of us understand from start to finish how typing in a letter results in that letter being printed on the screen and stored in memory. Functions are the same way, they abstract blocks of code into easily read and used single-line entities. Suppress details.

Decomposition: divide the programming. Divide it into modules, functions, and classes.

  • self-contained
  • breaks up code
  • intended to be reusable
  • keeps code organized
  • keeps code coherent

Functions’ characteristics:

  • name
  • parameters (0 or more arguments)
  • docstring (optional but recommended for explaining what the function does to other people and future you)
  • body
  • returns something
def is_even(i):
    """
    Input: i, a positive int
    Returns True if i is even, otherwise False
    """
    print("inside is_even")
    return i%2 == 0

If the function doesn’t return anything, python will add “return None” if you don’t manually put it. (It’s optional.)

5 – Tuples, Lists, Aliasing, Mutability, and Cloning

Tuple: an ordered sequence of elements. You can mix element types. Tuples are immutable.
t = () # declares an empty tuple
t = (2, “MIT”, 3)
t[0] # evaluates to 2
(2, “MIT”, 3) + (5, 6) # evaluates to (2, “MIT”, 3, 5, 6)
t[1:2] # slicing the tuple results in (“MIT”, ). The comma in the parentheses indicates that it’s a tuple. Without the comma, it’s a string.
t[1:3] # slicing the tuple results in (“MIT”, 3)
len(t) # evaluates to 3
t[1] = 4 # results in error, can’t modify tuple object.

Tuples are convenient for swapping variable values.
You can’t swap variables by:
x = y
y = x
because you’re overwriting x.
You can:
temp = x
x = y
y = temp
But with tuples, it’s one line:
(x, y) = (y, x)

Functions are allowed to only return one object. Tuples are objects with multiple objects inside. So you can use tuples to get multiple things from a function.

def quotient_and_remainder(x, y):
    q = x // y
    r = x % y
    return(q, r)

You can iterate over tuples.

def get_data(aTuple):
    nums = ()
    words = ()
    for t in aTuple:
        nums = nums + (t[0],)
        if t[1] not in words:
            words = words + (t[1],)
    min_n = min(nums)
    max_n = max(nums)
    unique_words = len(words)
    return (min_n, max_n, unique_words)

Lists are a lot like tuples, but they are shown by square brackets instead of parentheses and are mutable!

Common pattern, iterate over list elements:

total = 0
for i in L:
    total += i
print(total)

# is equivalent to:

total = 0
for i in range(len(L)):
    total += L[i]
print(total)

# the first is more Pythonic.

Add elements to the list:
L = [2, 1, 3]
L.append(5) # changes the list L to [2, 1, 3, 5]

The dot in “.append” indicates something being done to an object. In the past I’ve found this frequently in VBA for Excel. Objects have data, methods, and functions. For instance, math.pi calls up the pi value from the math module.

L1 = [2, 1, 3]
L2 = [4, 5, 6]
L3 = L1 + L2 # L3 is [2, 1, 3, 4, 5, 6]. L1 and L2 remain unchanged.
L1.extend([0, 6]) # mutated L1 to [2, 1, 3, 0, 6].

L = [2, 1, 3, 6, 3, 7, 0]
L.remove(2) # L = [1, 3, 6, 3, 7, 0]. The function finds the first value 2 and removes it.
L.remove(3) # L = [1, 6, 3, 7, 0].
del(L[1]) # L = [1, 3, 7, 0]. The del function deletes nth element of the list.
L.pop() # function returns last element and deletes it. L = [1, 3, 7].

Converting lists to strings and back.
s = “I<3 cs”
list(s) # returns [‘I’, ‘<‘, ‘3’, ‘ ‘, ‘c’, ‘s’]. Note the returned object is a list.
s.split(‘<‘) # returns [‘I’, ‘3 cs’]. Again, string becomes list. If called without a parameter, split() splits on spaces.
L = [‘a’, ‘b’, ‘c’]
”.join(L) # returns “abc”. Note list becomes a string.
‘ ‘.join(L) # returns “a b c”.

Other list operations
L = [9, 6, 0, 3]
sorted(L) # returns a sorted list, but doesn’t modify L.
L.sort() # sorts L.
L.reverse() # reverse sorts L.

As lists are mutable, if you change one and refer to it in multiple places, it will change in each of those places.
a = 1
b = a
print(a)
> 1
print(b)
> 1 # because b is referring to a.

warm = [‘red’, ‘yellow’, ‘orange’]
hot = warm
hot.append(‘pink’)
print(hot)
> [‘red’, ‘yellow’, ‘orange’, ‘pink’]
print(warm)
> [‘red’, ‘yellow’, ‘orange’, ‘pink’]

Get around this by cloning (copying) the list for the new variable.
cool = [‘blue’, ‘green’, ‘grey’]
chill = cool[:] # that sweet cloning action
chill.append(‘black’)
print(chill)
> [‘blue’, ‘green’, ‘grey’, ‘black’]
print(cool)
> [‘blue’, ‘green’, ‘grey’] # because you cloned it and didn’t mess with the clone.

You can have lists in lists.

6 – Recursion and Dictionaries

This lecture was taught by Dr. Eric Grimson.

Recursion can be thought of as reduce and conquer. It’s a program that calls itself.
• It must have 1 or more base cases that are easy to solve. Once this is solved, the recursion ceases.
• It solve the same problem on some other input with the goal of simplifying the larger problem input.

def mult_iter(a, b):
    result = 0
    while b > 0:
        result += a
        b -= 1
    return result

# This adds a to itself b times.  The same, with recursion:

def mult(a, b):
    if b == 1: 
        return a # the if is the base case, an exit clause.
    else:
        return a + mult(a, b-1)

So recursion keeps reducing the problem until it gets to something it can solve directly.

def fact(n):
    if n == 1:
        return 1
    else:
        return n*fact(n-1)

fact(5)
    > 120

So, normally, if I return 1, that’s the output of the function. Why isn’t the output 1? Well, it is, five levels deep. Stack diagram:

Global scope → fact(5)
fact scope (5) → return 5 * fact(4)
fact scope (4) → return 4 * fact(3)
fact scope (3) → return 3 * fact(2)
fact scope (2) → return 2 * fact(1)
fact scope (1) → return 1

Global scope → fact(5)
fact scope (5) → return 5 * fact(4)
fact scope (4) → return 4 * fact(3)
fact scope (3) → return 3 * fact(2)
fact scope (2) → return 2 * 1

Global scope → fact(5)
fact scope (5) → return 5 * fact(4)
fact scope (4) → return 4 * fact(3)
fact scope (3) → return 3 * 2

Global scope → fact(5)
fact scope (5) → return 5 * fact(4)
fact scope (4) → return 4 * 6

Global scope → fact(5)
fact scope (5) → return 5 * 24

Global scope → 120

Dictionaries store pairs of data, a key and a value.
my_dictionary = {}
grades = {‘Ana’: ‘B’, ‘John’:’A+’, ‘Denise’:’A’, ‘Katy’:’A’}
grades[‘Sylvan’] = ‘A’ # add an entry
‘John’ in grades # returns True. This means you can test if a key exists.
‘Daniel’ in grades # returns False.
del(grades[‘Ana’]) # deletes Ana entry in the dictionary.

Dictionaries can have any data type for values.
Keys must be an immutable data type. Keys must be unique. No order is guaranteed. Dictionaries match “keys” to “values”.

7 – Testing, Debugging, Exceptions, and Assertions

In order to make testing easier, have as many functions as possible.
Check that your program runs (no syntax/semantic errors).
Know test cases; know what outputs you expect for given inputs.

Classes of tests:
1. Unit testing
→ validate each piece of the program
→ test each function separately
2. Regression testing
→ add tests for bugs as you find them
→ catch reintroduced errors that were previously fixed
→ back to 1.
3. Integration testing
→ does the overall program work?
→ resist the urge to rush this.
→ back to 1 and 2.

Testing approaches:
1. Intuition. Some programs will have natural boundaries. If the program is designed that one parameter is always bigger than the other, it won’t work if that’s violated. Document these and move on.
2. Random testing. The probability that the code works increases with more tests. Black box and glass box testing are preferred.
3. Black box testing. Explore paths through specification.
4. Glass box testing. Explore paths through code.

Black box testing of a square root program:
Boundary → 0, -1
Perfect square → 25, 400
Less than 1 → .05, .5
Irrational square root → 2
Large → 999,999,999,999
Small → 0.000 000 000 001

Glass box testing uses code to guide test case design.
Path-complete if testing evaluates every potential path.
Guidelines:
→ branches → exercise all parts of a conditional
→ for loops
→ → loop not entered
→ → loop executed once
→ → loop executed more than once
→ while loops
→ → same as for loops.
→ → look at all ways to exit the loop.

Notes on glass box testing:
• path-complete testing can still miss bugs
• you should still test boundary cases

Use print() to debug. Put it in loops, use it at approximate half-way points (bisection).

Study the program code. Don’t ask “what is wrong?”. Ask, “How did I get the unexpected result?”. Is it part of a family of errors?

Use the scientific method.
1. Study available data. (Gather more data.)
2. Form hypothesis.
3. Repeatable experiments.
4. Pick simplest input to test with.

Types of errors (exceptions):
1. trying to access beyond the limits of a list
→ test = [1, 2, 3]
→ test[4]
→ IndexError
2. trying to convert (cast) an inappropriate type
→ int(test)
→ TypeError
3. referencing a non-existent variable
→ a
→ NameError
4. mixing data types without appropriate coercion
→ ‘3’ / 4
→ TypeError
5. forgetting to close parentheses, quotes, etc.
→ a = len([1,2,3]
→ print(a)
→ SyntaxError
6. logic errors. Think before writing new code. Draw pictures, take a break. Explain the code to someone else or a rubber ducky.
7. AttributeError. Attribute reference fails.
8. ValueError. Operand type okay, but value is illegal.
9. IOError. IO system reports malfunction (e.g. file not found).

Do unit testing. Write a function, test the function, debug the function. Then do integration testing (testing the functions’ interplay). This systematic methodology will cut down on debugging time.

If you change the code, back it up first. Do this in a systematic way. The alternative is to possibly lose track of if what bugs are caused by the original coding or changes introduced.

Use except.
→ try:
→ [code]
→ except:
→ print(“Bug in user input.”)

You can use multiple except statements like if statements.
→ try:
→ [code]
→ except ValueError:
→ print(“Could not convert to a number.”)
→ except ZeroDivisionError:
→ print(“Cannot divide by zero.”)
→ except:
→ print(“Something went very wrong.”)

else block. A body of code executed after associated try body completes with no exceptions.

finally block. A body of code always executed after try, else, and except clauses, even if they raised errors or executed a break, continue, or return. Useful for clean-up code that should be run no matter what else happened (e.g. close a file).

What you can do with exceptions:
1. Fail silently. Substitute default values or just continue. Bad idea because user gets no warning.
2. Return an “error” value. What value do you choose? Complicates code having to check for a special value.
3. Stop execution, signal error condition. In Python, raise an exception.
→ raise Exception(“descriptive string”)

The Python assert (assertion) can head potential errors off at the pass.
→ def avg(grades):
→ assert not len(grades) == 0, ‘no grades data’
→ return sum(grades) / len(grades)
This raises an AssertionError for empty grade lists, otherwise runs fine.

The goal is to spot bugs as soon as they are introduced and make it clear what happened. Use assertions as a supplement to testing. Raise exceptions if users supply bad data input. Use assertions to:
• check types of arguments or values
• check that invariants on data structures are met
• check constraints on return values
• check for violations of constraints on procedures (e.g. no duplicates in a list)

8 – Object Oriented Programming

Objects
1. How is it represented? (Internal representation should be private.)
2. How do you manipulate it?

Implementing an object → defining a class. Using an object → using an instance of the class.
Creating a class: (1) define its name, (2) define its attributes.
Using a class: (1) create an instance of the class, (2) doing operations on that instance.

Defining an object in Python is similar to def (defining a function), indented lines following constitute the class’ definition.
• Coordinate is a subclass of object.
• Object is a superclass of Coordinate.
• Coordinate inherits all object’s attributes. (See next lecture.)

class Coordinate(object):

Attributes are data and procedures belonging to a class.
Data attributes are the types of data making a class.
• Coordinate is made of two numbers.
Methods (procedural attributes) are functions that only work with this class. Methods are how to interact with the object.
• Two coordinates have a distance between them, so x2+y2=d2 could be used to determine that distance.

Defining how to create an instance of a class uses the __init__() function. Yes, it uses two double underscores. Self is a parameter defining an instance. You can use any name, but convention is self, so stick to that.

class Coordinate(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

Once you’ve given how to create an instance of a class, proceed to creating methods for the class.

class Coordinate(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def distance (self, other):
        x_diff_sq = (self.x-other.x)**2
        y_diff_sq = (self.y-other.y)**2
        returnt (x_diff_sq + y_diff_sq)**0.5

Other than self and dot notation, methods behave just like functions; they take parameters, do operations, and return objects. Note how the distance method would work:

c = Coordinate(3,4)
zero = Coordinate(0,0)
print(c.distance(zero))

print(c.distance(zero)) equates to print(Coordinate.distance(c, zero)).

You’ll also want to define what print does for an object. Otherwise Python returns your defined class and the instance’s memory location.

    def __str__(self):
        return "<" + str(self.x) + ", " + str(self.y) + ">"

You can override many common operators to work with your class.
__add__(self, other) → self + other
__sub__(self, other) → self – other
__eq__(self, other) → self == other
__len__(self) → len(self)

See https://docs.python.org/3/reference/datamodel.html#basic-customization.

You can use isinstance() to check if an object is a type of object: print(isinstance(c, Coordinate)).

Categories
Computer Science School

01 12

Progress:

Intro to CS & Python, 2 lectures.
Think Python, 2 chapters.

  1. 8/12 Intro to CS & Python – Dr. Ana Bell, Dr. Eric Grimson
  2. 1/22 Structures
  3. 2/30 AI – Dr. Patrick Winston
  4. 0/4 (0/11, 0/11, 0/5, 0/8) Math for CS
  5. 1/11 Puzzled
  6. ? EECS, Remedial 8/19 Think Python – Dr. Allen Downey

***

Testing, Debugging, Exceptions, and Assertions; Intro to CS and Python lecture #7

In order to make testing easier, have as many functions as possible.
Check that your program runs (no syntax/semantic errors).
Know test cases; know what outputs you expect for given inputs.

Classes of tests:
1. Unit testing
→ validate each piece of the program
→ test each function separately
2. Regression testing
→ add tests for bugs as you find them
→ catch reintroduced errors that were previously fixed
→ back to 1.
3. Integration testing
→ does the overall program work?
→ resist the urge to rush this.
→ back to 1 and 2.

Testing approaches:
1. Intuition. Some programs will have natural boundaries. If the program is designed that one parameter is always bigger than the other, it won’t work if that’s violated. Document these and move on.
2. Random testing. The probability that the code works increases with more tests. Black box and glass box testing are preferred.
3. Black box testing. Explore paths through specification.
4. Glass box testing. Explore paths through code.

Black box testing of a square root program:
Boundary → 0, -1
Perfect square → 25, 400
Less than 1 → .05, .5
Irrational square root → 2
Large → 999,999,999,999
Small → 0.000 000 000 001

Glass box testing uses code to guide test case design.
Path-complete if testing evaluates every potential path.
Guidelines:
→ branches → exercise all parts of a conditional
→ for loops
→ → loop not entered
→ → loop executed once
→ → loop executed more than once
→ while loops
→ → same as for loops.
→ → look at all ways to exit the loop.

Notes on glass box testing:
• path-complete testing can still miss bugs
• you should still test boundary cases

Use print() to debug. Put it in loops, use it at approximate half-way points (bisection).

Study the program code. Don’t ask “what is wrong?”. Ask, “How did I get the unexpected result?”. Is it part of a family of errors?

Use the scientific method.
1. Study available data. (Gather more data.)
2. Form hypothesis.
3. Repeatable experiments.
4. Pick simplest input to test with.

Types of errors (exceptions):
1. trying to access beyond the limits of a list
→ test = [1, 2, 3]
→ test[4]
→ IndexError
2. trying to convert (cast) an inappropriate type
→ int(test)
→ TypeError
3. referencing a non-existent variable
→ a
→ NameError
4. mixing data types without appropriate coercion
→ ‘3’ / 4
→ TypeError
5. forgetting to close parentheses, quotes, etc.
→ a = len([1,2,3]
→ print(a)
→ SyntaxError
6. logic errors. Think before writing new code. Draw pictures, take a break. Explain the code to someone else or a rubber ducky.
7. AttributeError. Attribute reference fails.
8. ValueError. Operand type okay, but value is illegal.
9. IOError. IO system reports malfunction (e.g. file not found).

Do unit testing. Write a function, test the function, debug the function. Then do integration testing (testing the functions’ interplay). This systematic methodology will cut down on debugging time.

If you change the code, back it up first. Do this in a systematic way. The alternative is to possibly lose track of if what bugs are caused by the original coding or changes introduced.

Use except.
→ try:
→ [code]
→ except:
→ print(“Bug in user input.”)

You can use multiple except statements like if statements.
→ try:
→ [code]
→ except ValueError:
→ print(“Could not convert to a number.”)
→ except ZeroDivisionError:
→ print(“Cannot divide by zero.”)
→ except:
→ print(“Something went very wrong.”)

else block. A body of code executed after associated try body completes with no exceptions.

finally block. A body of code always executed after try, else, and except clauses, even if they raised errors or executed a break, continue, or return. Useful for clean-up code that should be run no matter what else happened (e.g. close a file).

What you can do with exceptions:
1. Fail silently. Substitute default values or just continue. Bad idea because user gets no warning.
2. Return an “error” value. What value do you choose? Complicates code having to check for a special value.
3. Stop execution, signal error condition. In Python, raise an exception.
→ raise Exception(“descriptive string”)

The Python assert (assertion) can head potential errors off at the pass.
→ def avg(grades):
→ assert not len(grades) == 0, ‘no grades data’
→ return sum(grades) / len(grades)
This raises an AssertionError for empty grade lists, otherwise runs fine.

The goal is to spot bugs as soon as they are introduced and make it clear what happened. Use assertions as a supplement to testing. Raise exceptions if users supply bad data input. Use assertions to:
• check types of arguments or values
• check that invariants on data structures are met
• check constraints on return values
• check for violations of constraints on procedures (e.g. no duplicates in a list)

***

Object Oriented Programming; Intro to CS and Python lecture #8

Objects
1. How is it represented? (Internal representation should be private.)
2. How do you manipulate it?

Implementing an object → defining a class. Using an object → using an instance of the class.
Creating a class: (1) define its name, (2) define its attributes.
Using a class: (1) create an instance of the class, (2) doing operations on that instance.

Defining an object in Python is similar to def (defining a function), indented lines following constitute the class’ definition.
• Coordinate is a subclass of object.
• Object is a superclass of Coordinate.
• Coordinate inherits all object’s attributes. (See next lecture.)

class Coordinate(object):

Attributes are data and procedures belonging to a class.
Data attributes are the types of data making a class.
• Coordinate is made of two numbers.
Methods (procedural attributes) are functions that only work with this class. Methods are how to interact with the object.
• Two coordinates have a distance between them, so x2+y2=d2 could be used to determine that distance.

Defining how to create an instance of a class uses the __init__() function. Yes, it uses two double underscores. Self is a parameter defining an instance. You can use any name, but convention is self, so stick to that.

class Coordinate(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

Once you’ve given how to create an instance of a class, proceed to creating methods for the class.

class Coordinate(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def distance (self, other):
        x_diff_sq = (self.x-other.x)**2
        y_diff_sq = (self.y-other.y)**2
        returnt (x_diff_sq + y_diff_sq)**0.5

Other than self and dot notation, methods behave just like functions; they take parameters, do operations, and return objects. Note how the distance method would work:

c = Coordinate(3,4)
zero = Coordinate(0,0)
print(c.distance(zero))

print(c.distance(zero)) equates to print(Coordinate.distance(c, zero)).

You’ll also want to define what print does for an object. Otherwise Python returns your defined class and the instance’s memory location.

    def __str__(self):
        return "<" + str(self.x) + ", " + str(self.y) + ">"

You can override many common operators to work with your class.
__add__(self, other) → self + other
__sub__(self, other) → self – other
__eq__(self, other) → self == other
__len__(self) → len(self)

See https://docs.python.org/3/reference/datamodel.html#basic-customization.

You can use isinstance() to check if an object is a type of object: print(isinstance(c, Coordinate)).

***

Think Python CH 8

Printing a string backwards.

>>> def backwardString(s):
...     n = ""
...     i = 1
...     while len(n) < len(s):
...         n = n + s[len(s)-i]
...         i += 1
...     return n
...
>>> print(backwardString("Marcus Aurelius"))
suileruA sucraM

Printing a string backwards, one character per line.

>>> def backwardAcrostic(s):
...     i = 1
...     while i <= len(s):
...         print(s[len(s)-i])
...         i += 1
...
>>> backwardAcrostic("ABC")
C
B
A

***

Categories
Computer Science School

01 08

Progress

Think Python, 1 chapter.

  1. 6/12 Intro to CS & Python – Dr. Ana Bell, Dr. Eric Grimson
  2. 1/22 Structures
  3. 2/30 AI – Dr. Patrick Winston
  4. 0/4 (0/11, 0/11, 0/5, 0/8) Math for CS
  5. 0/11 Puzzled
  6. ? EECS, Remedial 6/19 Think Python – Dr. Allen Downey

***

Euler Project 1-4.

***

Think Python, CH6

The sixth chapter of Think Python seems to have greatly influenced the sixth lecture of Intro to CS & Python. They even use some of the same examples. I don’t know if that’s because there are only so many elementary recursion examples and those include factorials and Fibonacci bunnies, but see notes from yesterday.

***

Categories
Computer Science School

01 07

Progress

Think Python, 2 chapters.
Intro to CS & Python, 2 lectures.

  1. 6/12 Intro to CS & Python – Dr. Ana Bell, Dr. Eric Grimson
  2. 1/22 Structures
  3. 2/30 AI – Dr. Patrick Winston
  4. 0/4 (0/11, 0/11, 0/5, 0/8) Math for CS
  5. 0/11 Puzzled
  6. ? EECS, Remedial 5/19 Think Python – Dr. Allen Downey

***

Think Python, CH 4

Interface design. Dr. Downey covers the following concepts.

Generalization adds parameters to functions to make them able to perform a broader range of actions. Dr. Downey’s example is taking a function that draws a square and making it to draw regular polygons instead. It can still draw squares, but it can draw other polygons too.

“It is legal, and sometimes helpful, to include the names of the parameters in the argument list: polygon(bob, n=7, length=70)”

Refactoring rearranges code to maximize code reuse. It’s like factoring in math; you factor out the constants.

A development plan (verbatim):

  1. Start by writing a small program with no function definitions.
  2. Once you get the program working, encapsulate it in a function and give it a name.
  3. Generalize the function by adding appropriate parameters.
  4. Repeat steps 1–3 until you have a set of working functions. Copy and paste working code to avoid retyping (and re-debugging).
  5. Look for opportunities to improve the program by refactoring. For example, if you have similar code in several places, consider factoring it into an appropriately general function.

Dr. Downey covers docstrings in making functions. Unlike where I’ve seen it before (was that Dr. Bell’s example?), Dr. Downey starts the explanation of the function right after the first triple quote, not the next line.

“””Draws n line segments with the given length and angle (in degrees) between them. t is a turtle.
“””

Instead of:

“””
Draws n line segments with the given length and angle (in degrees) between them. t is a turtle.
“””

In the perennial debugging section of the chapter, Dr. Downey argues that as long as the required preconditions of the function are properly documented, failure to adhere to them is user error. Along with preconditions, he mentions postconditions. I don’t like his definition, so I’ll go with Wikipedia: “a condition or predicate that must always be true just after the execution of some section of code or after an operation in a formal specification”. Dr. Downey makes it sound like the execution of the function is also a postcondition which violates the “post” portion of the word. A precondition is before the function, a postcondition is after the function.

***

Think Python, CH 5

Modulus operator. 7 % 3 = 1 because 7 / 3 has a remainder of 1. x % 10 gives the last digit of x. x % 100 gives the last 2 digits of x.

When covering logical operators, he points out that any number other zero is interpreted as “true”. I tested negative numbers and nonintegers. They are interpreted as true. When I put in 0, instead of returning False like I was expecting, it returned 0. When I put False into myBool variable, it returned False.

He covers chained and nested if statements.

Recursion. He covers functions that call themselves. I enjoyed his example. My version of Python requires parentheses for the print function.

>>> def countdown(n):
...     if n <= 0:
...         print("Blastoff!")
...     else:
...         print(n)
...         countdown(n-1)
...
>>> countdown(3)
3
2
1
Blastoff!
>>> countdown(10)
10
9
8
7
6
5
4
3
2
1
Blastoff!

He covers stack diagrams for recursive functions, keeping track of what’s going on.
<module>
countdown – n→3
countdown – n→2
countdown – n→1
countdown – n→0

Infinite recursion doesn’t happen because Python caps it at 1,000.

Error messages in Python point to where a problem is discovered, not where it is caused.

***

Turtles in Python

So in Chapter 4, Dr. Downey talked about using turtles and downloading something from his site. I wasn’t inclined to do so. But I did a search for “draw in Python” and found that in 3.9, turtles are baked into Python (https://docs.python.org/3/library/turtle.html). I only messed around with it briefly, but I might go back and do the turtle examples in chapters 4 and 5.

With turtles imported, I drew an 18 pointed star. This reminds me of the Spirograph I had when I was a kid.

>>> reset()
>>> for x in range(18):
...     turtle.forward(100) #pixels
...     turtle.right(100) #degrees

***

Tuples, Lists, Aliasing, Mutability, and Cloning; Intro to CS and Python lecture #5

Tuple: an ordered sequence of elements. You can mix element types. Tuples are immutable.
t = () # declares an empty tuple
t = (2, “MIT”, 3)
t[0] # evaluates to 2
(2, “MIT”, 3) + (5, 6) # evaluates to (2, “MIT”, 3, 5, 6)
t[1:2] # slicing the tuple results in (“MIT”, ). The comma in the parentheses indicates that it’s a tuple. Without the comma, it’s a string.
t[1:3] # slicing the tuple results in (“MIT”, 3)
len(t) # evaluates to 3
t[1] = 4 # results in error, can’t modify tuple object.

Tuples are convenient for swapping variable values.
You can’t swap variables by:
x = y
y = x
because you’re overwriting x.
You can:
temp = x
x = y
y = temp
But with tuples, it’s one line:
(x, y) = (y, x)

Functions are allowed to only return one object. Tuples are objects with multiple objects inside. So you can use tuples to get multiple things from a function.

def quotient_and_remainder(x, y):
    q = x // y
    r = x % y
    return(q, r)

You can iterate over tuples.

def get_data(aTuple):
    nums = ()
    words = ()
    for t in aTuple:
        nums = nums + (t[0],)
        if t[1] not in words:
            words = words + (t[1],)
    min_n = min(nums)
    max_n = max(nums)
    unique_words = len(words)
    return (min_n, max_n, unique_words)

Lists are a lot like tuples, but they are shown by square brackets instead of parentheses and are mutable!

Common pattern, iterate over list elements:

total = 0
for i in L:
    total += i
print(total)

# is equivalent to:

total = 0
for i in range(len(L)):
    total += L[i]
print(total)

# the first is more Pythonic.

Add elements to the list:
L = [2, 1, 3]
L.append(5) # changes the list L to [2, 1, 3, 5]

The dot in “.append” indicates something being done to an object. In the past I’ve found this frequently in VBA for Excel. Objects have data, methods, and functions. For instance, math.pi calls up the pi value from the math module.

L1 = [2, 1, 3]
L2 = [4, 5, 6]
L3 = L1 + L2 # L3 is [2, 1, 3, 4, 5, 6]. L1 and L2 remain unchanged.
L1.extend([0, 6]) # mutated L1 to [2, 1, 3, 0, 6].

L = [2, 1, 3, 6, 3, 7, 0]
L.remove(2) # L = [1, 3, 6, 3, 7, 0]. The function finds the first value 2 and removes it.
L.remove(3) # L = [1, 6, 3, 7, 0].
del(L[1]) # L = [1, 3, 7, 0]. The del function deletes nth element of the list.
L.pop() # function returns last element and deletes it. L = [1, 3, 7].

Converting lists to strings and back.
s = “I<3 cs”
list(s) # returns [‘I’, ‘<‘, ‘3’, ‘ ‘, ‘c’, ‘s’]. Note the returned object is a list.
s.split(‘<‘) # returns [‘I’, ‘3 cs’]. Again, string becomes list. If called without a parameter, split() splits on spaces.
L = [‘a’, ‘b’, ‘c’]
”.join(L) # returns “abc”. Note list becomes a string.
‘ ‘.join(L) # returns “a b c”.

Other list operations
L = [9, 6, 0, 3]
sorted(L) # returns a sorted list, but doesn’t modify L.
L.sort() # sorts L.
L.reverse() # reverse sorts L.

As lists are mutable, if you change one and refer to it in multiple places, it will change in each of those places.
a = 1
b = a
print(a)
> 1
print(b)
> 1 # because b is referring to a.

warm = [‘red’, ‘yellow’, ‘orange’]
hot = warm
hot.append(‘pink’)
print(hot)
> [‘red’, ‘yellow’, ‘orange’, ‘pink’]
print(warm)
> [‘red’, ‘yellow’, ‘orange’, ‘pink’]

Get around this by cloning (copying) the list for the new variable.
cool = [‘blue’, ‘green’, ‘grey’]
chill = cool[:] # that sweet cloning action
chill.append(‘black’)
print(chill)
> [‘blue’, ‘green’, ‘grey’, ‘black’]
print(cool)
> [‘blue’, ‘green’, ‘grey’] # because you cloned it and didn’t mess with the clone.

You can have lists in lists.

Recursion and Dictionaries; Intro to CS and Python lecture #6

This lecture was taught by Dr. Eric Grimson.

Recursion can be thought of as reduce and conquer. It’s a program that calls itself.
• It must have 1 or more base cases that are easy to solve. Once this is solved, the recursion ceases.
• It solve the same problem on some other input with the goal of simplifying the larger problem input.

def mult_iter(a, b):
    result = 0
    while b > 0:
        result += a
        b -= 1
    return result

# This adds a to itself b times.  The same, with recursion:

def mult(a, b):
    if b == 1: 
        return a # the if is the base case, an exit clause.
    else:
        return a + mult(a, b-1)

So recursion keeps reducing the problem until it gets to something it can solve directly.

def fact(n):
    if n == 1:
        return 1
    else:
        return n*fact(n-1)

fact(5)
    > 120

So, normally, if I return 1, that’s the output of the function. Why isn’t the output 1? Well, it is, five levels deep. Stack diagram:

Global scope → fact(5)
fact scope (5) → return 5 * fact(4)
fact scope (4) → return 4 * fact(3)
fact scope (3) → return 3 * fact(2)
fact scope (2) → return 2 * fact(1)
fact scope (1) → return 1

Global scope → fact(5)
fact scope (5) → return 5 * fact(4)
fact scope (4) → return 4 * fact(3)
fact scope (3) → return 3 * fact(2)
fact scope (2) → return 2 * 1

Global scope → fact(5)
fact scope (5) → return 5 * fact(4)
fact scope (4) → return 4 * fact(3)
fact scope (3) → return 3 * 2

Global scope → fact(5)
fact scope (5) → return 5 * fact(4)
fact scope (4) → return 4 * 6

Global scope → fact(5)
fact scope (5) → return 5 * 24

Global scope → 120

Dictionaries store pairs of data, a key and a value.
my_dictionary = {}
grades = {‘Ana’: ‘B’, ‘John’:’A+’, ‘Denise’:’A’, ‘Katy’:’A’}
grades[‘Sylvan’] = ‘A’ # add an entry
‘John’ in grades # returns True. This means you can test if a key exists.
‘Daniel’ in grades # returns False.
del(grades[‘Ana’]) # deletes Ana entry in the dictionary.

Dictionaries can have any data type for values.
Keys must be an immutable data type. Keys must be unique. No order is guaranteed. Dictionaries match “keys” to “values”.

Categories
Computer Science School

01 06

Progress – No progress.

1. 4/12 Intro to CS & Python – Dr. Ana Bell
2. 1/22 Structures
3. 2/30 AI – Dr. Patrick Winston
4. 0/4 (0/11, 0/11, 0/5, 0/8) Math for CS
5. 0/11 Puzzled
6. ? EECS, Remedial 3/19 Think Python – Dr. Allen Downey

***

In trying to get a visual for all the Alt codes (Alt + Numpad number = special character), I wrote the following in HTML/Javascript:

<!DOCTYPE html>
<html>
<head>
<title>Characters</title>
</head>
<body style="color:#00FF00; background-color:#000000;">

<h1>The goal of this program is to print out characters and their Alt code references.</h1>
<p>If failed at showing the Alt codes.  It shows instead JavaScript's fromCharCode outputs.</p>
<p id="demo"></p>

<script>
var text = "";
var i;
var j;
for (i = 0; i < 65560; i=i+10) {
  text += i + " " 
  for (j = 0; j < 10; j++) {
	text += String.fromCharCode(i+j)+ " "
  }
  text += "<br>";
}
document.getElementById("demo").innerHTML = text;
</script>

</body>
</html> 

This doesn’t give the Alt codes, it gives JavaScript’s fromCharCode outputs. Still an interesting result.

***

Categories
Computer Science School

01 05

Progress

1. Introduction to CS and Programming in Python (Intro to CS & Python)
2. Computation Structures (Structures)
3. Artificial Intelligence (AI)
4. Mathematics for CS (Math for CS)
5. Programming for the Puzzled (Puzzled)
6. Introduction to Electrical Engineering and CS 1 (EECS1)

1. 4/12 (1/3!) Dr. Ana Bell
2. 1/22
3. 2/30 Dr. Patrick Winston
4. 0/4 units. Units: 0/11, 0/11, 0/5, 0/8.
5. 0/11
6. ? In remedial reading, I’m 3/19 (about 1/6).

***

Introduction and Scope; AI #1

Algorithms enabled by constraints exposed by representations that support models targeted at looping through thinking, perception, and action.

Methods:
Generate and test. [Much like Dr. Ana Bell’s bisection.]

If your representation is right, you’re almost done.
Simple ≠ trivial. Simple can be powerful; trivial implies useless.
Rumpelstiltskin Principle – if you know the name of a concept, you have power over that concept. The thing that binds the ends of your shoelaces is an aglet. Likewise, know the concepts of programming.

A brief history of computing.
♠♠ Age of Speculation
♦ Ada Lovelace in 1842 was answering questions regarding a computation machine. As she is the first to recognize applications for Charles Babbage’s machine (Analytical Engine) beyond computation and to write a program for it, she is considered the first programmer.
♦ Alan Turing came up with his Turing machine and helped win World War II. He also came up with the Turing test and wrote a paper in 1950 “Computing Machinery and Intelligence”. The British government drove him to suicide because he was gay.
♠♠ Dawn Age
♦ Marvin Minsky in 1962 published a (7, 4) Turing machine and proved it universal.
♦ James R. Slagle in 1961 programmed an IBM 7090 to solve freshman integration problems. Slagle worked under Minsky at MIT. Slagle was blind. Link to his paper: https://dspace.mit.edu/bitstream/handle/1721.1/11997/31225400-MIT.pdf?sequence=2.
♦ Joseph Weizenbaum in 1964 finished the Eliza program, an attempt at replicating a Rogerian psychotherapist. Weizenbaum’s own secretary ascribed human-like emotions to the program, despite Weizenbaum’s objections.
♦ Computers are programmed to analyze visual input. This included visual analogies and properties of more complex input.
♦ Computers are taught basic learning.
♦ Expert systems. A program was written that diagnosed infections in the blood better than most doctors. It was never used because they used broad-spectrum antibiotics. Delta uses a program to park airplanes to save on fuel.
♠♠ Bulldozer Age
♦ People used computers in brute force fashion.
♦ Deep blue beats the champions at Jeopardy.
♠♠ Imagination Age
♦ Dr. Winston guesses the current age is going to be about solving problems in a more concise form.

MIT is about skill-building and big ideas.

Course structure:
• Lectures to introduce materials and powerful ideas.
• Recitations for accuracy(?) and have a venue small enough for discussion.
• Mega recitations to have graduate students showing how to work prior quiz problems.
• Tutorials help with the homework.
• Each portion of the subject matter is covered by a quiz and the final, except for a portion taught at the end which will not have a quiz. Each student’s score is the max(quiz score, final score). Some people have passed with only taking the final, but it’s not recommended because most people would run out of time. Students that have done well on quizzes can skip the portions of the final that they’re already satisfied with their corresponding quiz score.

***

Reasoning: Goal Trees and Problem Solving; AI #2

• Modeling Problem Solving
• Generate & Test
• Problem Reduction
• Problem Reduction Tree (And/Or Tree, Goal Tree)
• Transforms & Example
• Reflections

Knowledge about knowledge is power

Catechism to ask when designing an expert system.
• What kind? What kind of knowledge is needed?
• How represented? How is that represented?
• How used? How is it used?
• How much? How much knowledge is needed?
• What exactly? What exactly is the next step?

Skill – In order to build a skill you need to
Understand. In order to understand,
Witness a specific example and abstract up.

Dr. Winston seeks to teach the basics of AI through a manual demonstration of how Slagle’s integration program works.

Apply all safe transforms → Check table for matches
Apply heuristics → Some transforms that might help, such as trigonometric form to polynomial form or vice versa.

Slagle’s program answered 54 of 56 problems. It was missing two transforms to solve the two it failed to solve. Its max depth was 7 and its average depth was 3. Its unused branches is 1. Slagle needed about 12 safe transforms, about 12 heuristics, and 26 identities in the table. His program’s successors went on to be able to beat the best mathematicians in the world at integration and became MatLab.

***

Categories
Computer Science School

01 04

Progress

I did not know a way to actually run a .py file. The runfile() does not seem to be a valid command. After floundering around with some commands from the internet, I found out that in Windows you can run scripts directly from the command line (Winkey + R, CMD). When in the DOS box, type python and then the file path. This is easier by locating the python file in Windows Explorer, Shift + right-click the file, and choose “Copy as path”. This took about 2 hours to find.

I looked at the coursework for “A Gentle Introduction to Programming Using Python (January IAP 2011)”. I consider it redundant to what I’m learning in the remedial portion of EECS. It even uses Think Python as its text. I’m dropping the course and will pick something else up in its place.

I looked at the coursework for “The History of Computing”. It’s a lot of reading in books I don’t have. I don’t know if they printed pages from various books or what, but merely getting all of the reading from the library via interlibrary loans would take weeks. I don’t know that the broader view of computing is worth finding all of this. I’m dropping the course and will pick something else.

I looked at “Artificial Intelligence” and watched some of the first lecture. I’m taking this course, I like the professor and think that I’ll like learning more about the subject from him. My only concern is that I might lack prerequisite knowledge.

I’m going to try “Mathematics for Computer Science”. It looks brutal, but it also seems necessary. I have no clue how to prove anything mathematically outside of parroting some Euclidean geometric proof right after reviewing it. Maybe I can change that with this course.

So this semester is looking like:
1. Introduction to CS and Programming in Python
2. Computation Structures
3. Artificial Intelligence
4. Mathematics for CS
5. Programming for the Puzzled
6. Introduction to Electrical Engineering and CS 1

Progress:
1. 4/12 (1/3!)
2. 1/22
3. 0/30
4. 0/4 units. Units: 0/11, 0/11, 0/5, 0/8.
5. 0/11
6. ? In remedial reading, I’m 3/19 (about 1/6).

Notes on coursework order:
1. I pretty much have to complete Intro to CS & Python and Think Python before I can do Programming for the Puzzled or EECS.
2. Based on my experience in Structures so far (1 lecture), I might need a full 22 days to digest it. That said, it doesn’t seem contingent on anything, so I can attack it whenever I feel like it.
3. I’m not sure I’m up to Mathematics for CS this semester, but I have to address the math at some point.

***

String Manipulation, Guess and Check, Approximation, Bisection; Intro to CS and Python lecture #3

s = "abc"
len(s) # is 3

#indexing
s[0]  # is a
s[1]  # is b
s[2]  # is c
s[3]  # is trying to index out of bounds, error
s[-1] # is c
s[-2] # is b
s[-3] # is a

#slicing
#differentiated from indexing by at least one colon
#[start : stop : step], step defaults to 1
s = "abcdefgh"
s[3:6]    # "def"
s[3:6:2]  # "df"
s[::]     # "abcdefgh", equivalent to s[0:len(s):1]
s[::-1]   # "hgfedbca", equivalent to s[-1:-(len(s)+1):-1]
s[4:1:-2] # "ec"

Strings are immutable. You cannot hot-swap letters in a string. You must redefine the string. You can use a string to redefine the string.

s = "hello"
s[0] = 'j' # this gives an error.
s = 'j' + s[1:len(s)]

# Check each letter in string s for an i or a u.
s = "abcdefghi"

for char in s:
    if char == 'i' or char == 'u':
        print("This character is an i or u.")
    else:
        print("This character is not an i or u.")

Cheerleader program Dr. Ana Bell wrote:

# An example written by Dr. Ana Bell

an_letters = "aefhilmnorsxAEFHILMNORSX"
word = input("I will cheer for you!  Enter a word: ")
times = int(input("Enthusiasm level (1-10): "))

for char in word:
    char = word[i]
    if char in an_letters:
        print("Give me an " + char + "! " + char)
    else:
        print("Give me a " + char + "! " + char)
print("What does that spell?")
for i in range(times):
    print(word, "!!!")

Guess-and-check methodology

cube = int(input("Enter an integer for cube root: "))
for guess in range(abs(cube)+1cube = int(input("Enter an integer for cube root: "))
for guess in range(abs(cube)+1):
    if guess**3 >= abs(cube):
        break
if guess**3 != abs(cube):
    print(cube, 'is not a perfect cube.')
else:
    if cube < 0:
        guess = -guess
    print('Cube root of '+str(cube)+' is '+str(guess))

This can be changed to be far more accurate using bisection when guessing, determining if the guess is high or low, and re-bisecting it accordingly. I changed Dr. Bell’s code to have the epsilon much smaller. It takes the program longer, but the results for perfect cubes are more likely to be integers.

cube = int(input("Enter an integer for cube root: "))
epsilon = .0000000000001
num_guesses = 0
low = 0
high = cube
guess = (high + low) / 2.0

while abs(guess**3-cube) >= epsilon:
    if guess**3 < cube:
        low = guess
    else:
        high = guess
    guess = (high + low) / 2.0
    num_guesses += 1
print('num_guesses =', num_guesses)
print(guess, 'is close to the cube root of', cube)

***

Decomposition, Abstraction, Functions; Intro to CS and Python lecture #4

Abstraction: you don’t need to know how a black box works in order to use the black box. E.g. we use computers and very few of us understand from start to finish how typing in a letter results in that letter being printed on the screen and stored in memory. Functions are the same way, they abstract blocks of code into easily read and used single-line entities. Suppress details.

Decomposition: divide the programming. Divide it into modules, functions, and classes.

  • self-contained
  • breaks up code
  • intended to be reusable
  • keeps code organized
  • keeps code coherent

Functions’ characteristics:

  • name
  • parameters (0 or more arguments)
  • docstring (optional but recommended for explaining what the function does to other people and future you)
  • body
  • returns something
def is_even(i):
    """
    Input: i, a positive int
    Returns True if i is even, otherwise False
    """
    print("inside is_even")
    return i%2 == 0

If the function doesn’t return anything, python will add “return None” if you don’t manually put it. (It’s optional.)

***

Categories
Computer Science School

01 03

No progress today. I went to the store, played disc golf, installed 4 4-foot LED lights on my garage ceiling with my brother, and watched the movie Unbreakable with family.

Categories
Computer Science School

01 02

Performance

EECS1 – read 2, 3 of 19 chapters of Think Python. That sounds good until you realize it’s a remedial portion of the class. Also, I haven’t started the real reading in that class.

Intro to CS & Python – lectures 1, 2 of 12.

Extra – pseudo code for Huffman’s Algorithm (see Computation Structures notes from yesterday). I did this first thing in the morning and it already looks bad in light of some of what I learned today.

I dictated my written notes into my phone. The automated dictation software has issues. For instance, “new line” sometimes prints out “new line” and sometimes verbally commands a carriage return. “New paragraph” doesn’t seem to have this issue. “Colon” always prints “colon” despite “semicolon” working as a verbal command for “;”. “Paren” or “parenthesis” almost never gives me a “(” and I’ve never successfully commanded a close parenthesis “)”. Dictation was fine for prose, but unusable for lists and code. Fixing the dictation for lists and code takes longer than typing them.

Part of that may be due to using WordPress. Perhaps the text would be easier to work with in Word or Google docs and then I can transfer that to here. That would mean my process is (1) write notes, (2) dictate notes, (3) clean notes in a word processor, (4) copy notes here, (5) clean notes for web presentation. Alternatively, (1) type notes in WordPress.

***

What is Computation, Intro to CS and Python lecture #1

A program is a recipe.

  • Sequence of steps
  • Flow of control, determining when steps execute
  • A means of determining when to stop

Basic machine architecture 

  • Memory – stores programs and data.
  • CPU
    • Control unit – program counter, tells ALU what ops to do on what data, increments when ALU performs, accesses programs.
    • Arithmetic logic unit – does primitive operations / tests, accesses data, write data.
  • Input/output

Alan Turing proved you could do anything with the six primitives. The six primitives are move left, move right, read, right, scan, and do nothing. We can abstract methods to create new primitives, so modern languages have more primitives. Anything computable in one language is computable in another language.

Errors

  • Syntactic errors (e.g. 3″hi”) – easily caught
  • Static semantic errors (e.g. 3 + “hi”) – some languages check prior to running. These can cause unpredictable behavior.
  • No semantic errors, but different meaning than user intends
    • Program crashes
    • Program runs forever
    • Program returns “incorrect” answer

 A program is a sequence of definitions and commands. It evaluates definitions and executes commands. These are the tests and operations from basic machine architecture.

Programs manipulate data objects. Objects have types defining allowable operations for that type. Objects are scalar or non-scalar. Scalar examples: int, float, bool, NoneType. You can use type parentheses parentheses to see an object’s type. You can convert objects from one type to another (float(3) converts integer to float, int(3.9) truncates a float to the integer 3, str(3) converts 3 to “3”).  Such conversion is called casting.

Operaters
+ → sum, concatenation for strings
– → difference
* → product, repetition for a number of times for strings (“Ho ” * 3 is “Ho Ho Ho “)
/ → division (always a float)
% → modulus
** → power (^ in some languages, the ^ is a bitwise operator in Python)
= → bind a value to a variable. Variables are case sensitive in Python!
# → comment a line
// → floor division, it rounds the result down (result is an integer)

Bitwise operaters
x<<y → x is suffixed by y zeroes. Equivalent to x*2**y.
x>>y → x is shifted right y places. Equivalent to x//2**y.
x&y → (and) for each bit, if x & y are 1, 1, else 0.
x|y → (or) for each bit, if either x or y are 1, 1, else 0.
~x → bitwise complement of x. 01 = 10.
x^y → bitwise exclusive or (xor). Flip a bit in x if bit in y is 1.

***

Branching and Iteration, Intro to CS and Python lecture #2

Input prints what is in quotes, user type something and presses enter, and the input is bound as a string to a variable.

text = input("Type a string.")
print(text)

If you want the input as another type, cast it as that type.

num = float(input("Enter a number."))
print(num)

Comparing (tests)

i > j #compares strings lexicographically
i >= j
i < j
i <= j
i == j
i != j

Boolean operators: not, and, or.

Control flow through branching (and here I thought if statements were nothing special).

if:
else:

if <condition for case 1>:
elif <condition for case 2>:
elif <condition for case 3>:
else:

Control flow through iteration (loops). Loops and if statements pay close attention to the indentation. If it’s not indented (4 spaces by convention), it’s not in the loop or part of the if statement.

while <condition>:
    Indented code
    More indented code
Code not in the while loop

#This:

n = 0
while n < 5:
    print(n)
    n = n + 1

#is equivalent to this:

for n in range(5):
    print(n)

Range(start, stop, step) 
Stop is the only required argument; start defaults to zero and step defaults to one. If provided two arguments, the function interprets them as the start and stop. The for loop continues until stop minus one. The arguments must be integers.

Break ends the loop it is found in.

For loops

  • known number of iterations
  • uses a counter
  • can be written as a while loop

While loops

  • unbounded number of iterations
  • can use a counter, but must initialize before the loop and increment it in the loop
  • not all loops can be written as for loops

If you enter into an infinite loop, try Ctrl + C.
“\n” starts a new line in text.

***

Think Python, CH 3

Functions
function(argument)
Arguments are assigned to variables called parameters. Variables and parameters are local. Void functions don’t return anything; return functions do. 

Modules contain collections of related functions.

import math
import [module]
math.pi
math.[math function]

# Instead of
    import math
    print(math.pi)
# you can 
    from math import pi
    print pi
# or get everything from the math module
    from math import *
    print pi

Define a new function using def. First line is the header, the rest is the body.

def print_lyrics()
    print("Somewhere over the rainbow...")

Flow of execution does not necessarily flow from top to bottom. As functions must be defined before they are executed, each may serve as a detour from the top down flow. Reading programs following the flow of execution is smoother than reading at top to bottom.

Stack diagrams.  Shown in a coding block for the fixed-width font.  Stack diagrams track variables and their values across modules and functions.  

<module>
    line1 "homage" #think the French pronunciation
    line2 "to orange"
function1()
    part1 "homage"
    part2 "to orange"
    part3 "and Omaha"
function2()
...

Functions help by:

  • Dividing things up so you can name a function, its arguments, and its variables. This makes reading and debugging easier.
  • Eliminating repetitive code.
  • Allowing you to build pieces and assemble them into a whole.
  • Allowing you to reuse well-designed code in other programs.

Python doesn’t recognize tabs, so if your text editor isn’t converting the tabs to spaces, it’ll have run errors.

When debugging, ensure you are calling the most recently debugged version of the program and not the original bugged program.

Categories
Computer Science School

01 01 – Starting CS

I want to study computer science because

  • I’d like to write a stock trading program (I’ve been told this is so common that it’s a banned topic in some machine learning classes)
  • I’d like to write a chat bot that helps depressed people
  • I’d like to write a program that analyzes potential legislation through giving unique words (likely pork if it mentions a location) and giving the frequency of other words
  • I’d like to write a program that systematically aggregates and translates from non-English sources
  • I’d like to know programming because it can improve the quality of life for anything else I do
  • I’d like to program for a living because it seems like it would pay more than being an auditor.

To this end, I’m going to utilize MIT’s Open Course Ware. Yesterday I found a list of courses at https://ocw.mit.edu/courses/find-by-topic/#cat=engineering&subcat=computerscience. This list was daunting. I want to take 107 courses, about twice what a lot of degrees would require. (5 courses per semester * 8 semesters = 40 courses, 6 * 8 = 48 courses.) The amount to be consumed is a problem.

My advantages are manifold. I’m not taking the traditional courses in-person. I can drop a class if I don’t think it will be useful. I can proceed at my own pace. I can retake the tests. In some courses, I can take multiple midterm tests from different semesters if I feel like I need the practice. We will see if the advantages can surmount the number of courses I’d like to take.

***

Zipf’s law: the idea that the 2nd most frequently used word appears 1/2 as often as the most frequently used word, the 3rd most frequently used word appears 1/3 as often as the most frequently used word, etc. Generalized, the eponymous law states the Nth frequent word appears 1/N as often as the most frequently used word where N !< 2. As “the” is the most frequent word in English, its frequency would then seem to be important.

According to https://www.english-corpora.org/coca/, the frequency of “the” in the COCA (Corpus Of Contemporary American English) is 50,066,305. https://corpus.byu.edu/coca/compare-bnc.asp lists the millions of words in each genre and these add up to 575 million. 50066305 / 575000000 = 0.0870718347826087. 1 / .0870718347826087 = 11.484770046441414. So about every 12th word is “the”.

The Vsauce video on Zipf’s law says the frequency is more like 6%, so about every 16th word is “the”. That video also states the most frequent 100 words account for about half of all words in a work. A word used only once in a work is a “hapax legomenon”. Quizzaciously was, at the time the video was posted, a Google hapax legomenon.

Hapax legomenon are important because they constitute the gist of the message couched in the most frequent words. Additionally, they can make up a significant portion of a work. It stands to reason deep conversation might similarly be better defined not by frequency but by infrequency.

***

The Alt code for capital sigma (signifying summation) is Alt + 228. Σ!

***

Today I am starting with Introduction to Electrical Engineering and Computer Science 1. There is a Python Tutorial section which seems to be an indication I should know a certain amount of Python. I don’t know Python. They point to Think Python: How to Think Like a Computer Scientist by Allen B. Downey. Another resource is the official tutorial. Having read the introduction to both, I’m proceeding with Think Python because it seems more user-friendly.

I read chapter 1 of 19 chapters.

***

Computation Structures, 1 Basics of Information

Information resolves uncertainty about a fact or circumstance. The amount of uncertainty resolved, the more information conveyed. Claude Shannon defines information as I(xi) = log2(1 / pi) where discrete random variable X has N possibilities with P probabilities. (1 / pi) is the probability of a specific instance of information and log2 ensures the information is represented in bits, that is, the number of bits required to encode the information.

Common case: faced with N equally probable cases, you receive data narrowing cases to M choices. The probability such data is sent is M * (1 / N). I(data) = log2( 1 / (M * (1 / N)) ) = log2(N / M) bits.

Common case examples:

  • Coin flip => N=2 & M=1. log2(2/1) = 1 bit.
  • A drawn card’s suit => N=4 & M=1. log2(4/1) = 2 bits.
  • Roll of 2 dice => N=36 (the dice are different colors) & M=1. log2(36/1) = 5.17 bits. This represents the most efficient encoding scheme possible.
  • [mine] roll of 2 identical dice => N=21 & M=1. log2(21/1) = 4.39 bits. I did not find a calculator that worked for this, so I’m using Google to type things like “log2(21)”.

Entropy is the average information received in each datum received regarding X. That is, H(X) = E(I(X)) = Σpi * log2(1 / pi) where H is entropy, E is the expected value (the average), and I is the information. In other words, multiply each datum’s information in bits by the datum’s likelihood and add the products.

Entropy provides a yardstick to measure average bits used in an encoding scheme. If on average the encoding scheme uses less than the entropy, the scheme is not resolving uncertainty. If on average the encoding scheme uses more than the entropy, the scheme may be inefficient. The entropy should be just over H(X), the scheme communicating everything it needs to as efficiently as possible.

Encoding is an unambiguous mapping between bit strings and the set of possible data. There are fixed length and variable length encoding. Variable length encoding can be represented as a binary tree. An example would be to compress text wherein certain words occur more frequently than others.

Fixed length encoding assumes equal probability of the characters (or other piece of information). Similarly, for unknown probabilities and lacking contravening evidence of their equality, the programmer defaults to fixed length encoding. Thus a binary tree for a fixed length encoding scheme would have all the termini at the same height of the tree, otherwise it would be a variable length encoding scheme. Fixed length encoding allows random access, meaning the user can access the Nth character of a string via {(N-1) * [binary tree height] + 1}. The entropy for fixed length encoding is Σ(1/N)log2( 1 / (1/N) ) = log2(N) bits. So for the ASCII 128 characters, that’s 7 bits. Sure enough, log2(128) = 7.

Hexadecimal is often represented by preceding the number with a 0x, e.g. 0x7D1.

N-bit numbers. The highest order bit is the “sign bit” with 0 being positive and 1 being negative. The most negative number is 10…000?! So negative numbers are from opposite land? Yep, it seems a negative number for a “two’s complement value” is its bitwise complement and add 1. The C representation of bitwise complement is ~X where X is the things you want the bitwise complement of.

Huffman’s Algorithm problem: “given a set of symbols and their probabilities, construct an optimal variable length encoding”. Huffman’s Algorithm: (1) starting with the two least probable symbols, create a subtree. (2) Continue to make subtrees using the two least probable symbols or subtrees. Evaluate subtrees’ likelihoods as the sum of their symbols’ likelihoods. (3) Stop when the last symbol has been used. The result is an optimal constructed from the bottom up.

Huffman’s Algorithm need not be applied to single symbols. It can be applied to pairs of symbols or any other length desired. It can also apply to words.

The professor mentions the LZW algorithm. It’s a phenomenal algorithm, but I don’t understand it yet. I did take an hour or two to mess around with Huffman’s Algorithm. I think I’ve got it.

Hamming distance is the number of bits required to change one valid code word to another valid code word. For instance, if a coin flip is encoded to a single bit, the Hamming distance between heads and tails is 1, so a single bit error is undetectable. If the Hamming distance between to code words is zero, they are the same. If the minimum Hamming distance in an encoding scheme is 2, a single bit error can no longer go undetected.

A simple way of increasing Hamming distance is to add even or odd parity to the code words. For instance, if coin flips are encoded as 1 for heads and 0 for tails, adding an even parity bit to the end would give 11 and 00. If a parity check is run and the word isn’t even, an error has occurred. A two bit error would go undetected. To detect E errors, the minimum Hamming distance must be E + 1.

To correct E errors, set code words a minimum of 2E+1 Hamming distance apart. This is required so that no corruption of E magnitude will change any code word into any other code word’s possible corrupted versions.

I don’t think this was an hour long, but it took hours to digest it and type notes. There are 21 lectures, so this would take about 21 days. At first this seemed way too slow, but if you bank on about 120 days in a semester and divide by 6 courses, that’s 20 days per course.

***

Python commands used

  • help() – brings up a help utility
  • [in help()] “keywords” – brought up keywords help could help with
  • [in help()] quit – quits help utility
  • import random – imports the random module
  • random.randint(70, 80) – provides a random integer between 70 and 80, inclusive
  • import math – imports the math module (you get the idea)
  • math.sqrt(4) – gives the square root of 4
  • import calendar
  • calendar.HTMLCalendar(firstweekday=0) – makes a file at a given hexadecimal address. I tried to find said file by looking at files created today. Forgot that I reinstalled python today. I can try again tomorrow.
  • print(“Hello, world!”) – a traditional first step