The Importance of Being Python

Saturday 4 April 2015

Throwback Sunday: Looking back at "Adapting in_between() for Binary Search Trees"

Alex makes a good point about art and programming in her SLOG. I couldn't agree more with her. Programming can be a tricky mistress sometimes, yet always creative and fun. This last semester I have had some experience helping my wife with her Art & Code projects and, though Java is not my area of expertise, I did learn a lot about what programming can do for the arts. Moreover, being creative can lead you to new ways to solve old problems. Hence, we will look back at one of the problems seen previously, Adapting in_between() for Binary Search Trees, and see how new ideas on recursion, namely Easy Recursion, could shape a new answer to this question.

Since adapting some code by adding new lines is hardly groundbreaking, let's come back with a new solution involving our triad of One Function, One Helper and One Global Variable. The question was how to look for values inside a BST that satisfy a set of conditions (e.g. inside a range of numbers, multiple of 3). This can be done rather easily with the triad. Take a look at the code below:

_gather_values = []

def in_between(node, start, end):

    # Clear the global variable, run the helper, return the result
    _gather_values.clear()

    _get_values(node, start, end)

    return _gather_values


def _get_values(node, start, end):

    # An empty subtree has nothing to gather
    if node is None:
        return

    # Inclusive range, as in the original in_between()
    if start <= node.data <= end and node.data % 3 == 0:
        _gather_values.append(node.data)

    # A BTNode has at most one left and one right child: visit both
    _get_values(node.left, start, end)
    _get_values(node.right, start, end)

Now, it does have a major flaw: it neither takes advantage, nor tries to take advantage, of the properties of BSTs. The code at the end of the helper function could be changed to use those properties (values get smaller down the left branch and bigger down the right branch). Regardless, this could be improved quickly. The point was to show that this code, which solves the problem, could be constructed easily.
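
As a rough sketch of that improvement, the helper below only goes left when smaller values could still be in range, and only goes right when bigger values could. The range is treated as inclusive, as in the original in_between(). The BTNode class is a minimal stand-in for the one from class (an assumption: just data, left and right), and the names in_between_pruned and _get_values_pruned are made up for this illustration.

```python
class BTNode:
    '''Minimal stand-in for the class BTNode (assumed: data, left, right).'''
    def __init__(self, data, left=None, right=None):
        self.data, self.left, self.right = data, left, right


_gather_values = []


def in_between_pruned(node, start, end):
    '''Return multiples of 3 in [start, end] from the BST rooted at node.'''
    _gather_values.clear()
    _get_values_pruned(node, start, end)
    return _gather_values


def _get_values_pruned(node, start, end):
    if node is None:
        return
    # Left subtree holds smaller values: only useful if node.data > start
    if node.data > start:
        _get_values_pruned(node.left, start, end)
    if start <= node.data <= end and node.data % 3 == 0:
        _gather_values.append(node.data)
    # Right subtree holds bigger values: only useful if node.data < end
    if node.data < end:
        _get_values_pruned(node.right, start, end)


# Hand-built BST:      9
#                    6   11
#                   2 7
root = BTNode(9, BTNode(6, BTNode(2), BTNode(7)), BTNode(11))
print(in_between_pruned(root, 4, 11))  # [6, 9]
```

Because the node is checked between the left and the right visit, the values happen to come out already sorted.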

Looking back at that old post, I can see how many different ways you can solve a problem. Creatively, or not, it is always exciting. Whilst I aim for writing elegant code, this does not negate the power of quick, easy solutions. Let us keep coming up with creative solutions, and improve upon our code writing.

Holy Mutations! Things getting dicey from now on...

We have done a lot of work traversing trees and linking nodes. While it has been all fun, we now need to step things up a notch: we need to mutate the contents inside these trees and linked nodes. Let us remember that the data in each node is mutable, and thus can be changed at any point in time. Let us look at one example of mutation, in this case in binary search trees.

Delete in Binary Search Trees
Algorithm
(Note: Remember that in a BST, the left child is smaller than the node, and the right child is bigger than the node)

1. If the node is None (there is no node), then there is nothing to do
2. If data is less than the data in the node, go down the left (which has smaller values)
3. If data is more than the data in the node, go down the right (which has bigger values)
4. If the node has fewer than two children, return the child that exists, or None if there is none (which essentially brings up the only child to take the place of the deleted node)
5. Otherwise, if both children exist, replace the node's data with the largest value in the left subtree, then recursively delete that value from the left subtree, and finally return the node

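On this final step, it is important to notice that, when deleting an element with two children from a BST, this algorithm pulls the replacement from the left subtree: its largest value is still bigger than everything else on the left and smaller than everything on the right. (The smallest value of the right subtree would work just as well.) To make this clear, we could follow this example: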

8
6
5
4
3
2
1

Let's delete the five (5). Then we need to grab a value to replace the five, and it only makes sense to take the largest value from the left. In this case it would be the four (4). The four is still smaller than everything on the right, and bigger than everything else on the left. Then we would get the following:

8
6
4
3
2
1

Code

This code was taken from class, so credit to whoever wrote it! It is shown here to give more context to what was done above.

def delete(node, data):
    ''' (BTNode, data) -> BTNode

    Delete, if it exists, node with data and return resulting tree.

    >>> b = BTNode(8)
    >>> b = insert(b, 4)
    >>> b = insert(b, 2)
    >>> b = insert(b, 6)
    >>> b = insert(b, 12)
    >>> b = insert(b, 14)
    >>> b = insert(b, 10)
    >>> b = delete(b, 12)
    >>> print(b)
            14
        10
    8
            6
        4
            2
    <BLANKLINE>
    '''
    return_node = node
    if not node: # Step 1 from the algorithm
        pass
    elif data < node.data: # Step 2 from the algorithm
        node.left = delete(node.left, data)
    elif data > node.data: # Step 3 from the algorithm
        node.right = delete(node.right, data)
    elif not node.left: # Step 4 from the algorithm
        return_node = node.right
    elif not node.right: # Step 4, again, from the algorithm
        return_node = node.left
    else: # Step 5 from the algorithm
        node.data = _find_max(node.left).data
        node.left = delete(node.left, node.data)
    return return_node

There is a helper function, _find_max(), that follows the right children down to the highest value. It is used in the function above so that the highest value in the left subtree becomes the new data of the node, and that value is then deleted from the left subtree. After all, the highest value in the left subtree is found by starting at the left child and going right for as long as possible.

def _find_max(node):
    ''' (BSTNode) -> BSTNode
    
    Return maximal node in the BST rooted at node.
    Assume node is not None.
    
    >>> b = BTNode(8)
    >>> b = insert(b, 4)
    >>> b = insert(b, 2)
    >>> b = insert(b, 6)
    >>> b = insert(b, 12)
    >>> b = insert(b, 14)
    >>> b = insert(b, 10)
    >>> _find_max(b).data
    14
    >>> _find_max(b.left).data
    6
    '''
    return _find_max(node.right) if node.right else node
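
To try delete() and _find_max() outside of class, something has to play the role of BTNode and insert(), which are not shown in this post. The sketch below assumes a bare-bones BTNode (data, left, right) and a plain recursive insert(); both are stand-ins of mine, not the course versions, and the inorder() helper is made up just to inspect the result.

```python
class BTNode:
    '''Stand-in for the course BTNode (assumed attributes only).'''
    def __init__(self, data, left=None, right=None):
        self.data, self.left, self.right = data, left, right


def insert(node, data):
    '''Assumed version of insert(): add data to the BST, return its root.'''
    if node is None:
        return BTNode(data)
    if data < node.data:
        node.left = insert(node.left, data)
    elif data > node.data:
        node.right = insert(node.right, data)
    return node


def _find_max(node):
    return _find_max(node.right) if node.right else node


def delete(node, data):
    '''Same logic as the class code above, repeated so this runs alone.'''
    return_node = node
    if not node:
        pass
    elif data < node.data:
        node.left = delete(node.left, data)
    elif data > node.data:
        node.right = delete(node.right, data)
    elif not node.left:
        return_node = node.right
    elif not node.right:
        return_node = node.left
    else:
        node.data = _find_max(node.left).data
        node.left = delete(node.left, node.data)
    return return_node


def inorder(node):
    '''Hypothetical helper: values of the BST in increasing order.'''
    return inorder(node.left) + [node.data] + inorder(node.right) if node else []


b = None
for value in [8, 4, 2, 6, 12, 14, 10]:
    b = insert(b, value)

b = delete(b, 12)      # two children: 10 (largest on the left) moves up
print(inorder(b))      # [2, 4, 6, 8, 10, 14]
print(b.right.data)    # 10
```

Deleting 12 exercises step 5 of the algorithm; deleting a value that is not in the tree falls through to step 1 and leaves the tree as it was.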

The Challenge
I wish I could say this code is easy to understand. Once you analyze it, it makes sense how it works and what it does. Regardless, there is a difference between knowing what to do and knowing how it needs to be done. The algorithm is not that intuitive, and requires close attention to notice exactly why things happen, and then how.

Friday 3 April 2015

Easy Recursion: One Function, One Helper, One Global Variable

Just like Andrew said in his SLOG, when working on recursive code, it is often the case that simple instructions will not solve a problem. But what if I were to tell you that there is a simple way to achieve recursion? You would probably tell me I am lying, or otherwise say that I have probably broken Python (just like Dima told me when I mentioned I found a way). So what is the solution I found? It can be broken down into these elements:

One Global Variable: Holds the information gathered by recursive code
One Helper Function: Uses recursion to gather information into the global variable
One Function: Clears the global variable, executes the helper function, and returns the global variable

Of course I would be penalized for efficiency, but bear in mind that this is an easy solution to what can seem a murky task to many. It uses recursion, yes, and it solves a problem, yes. Later on when you have more experience programming you can ditch this, but as a temporary solution, it should work great.

Let's put things into perspective. This is what we have seen in class:

def list_all(t):
    ''' (Tree) -> list

    Return list of values in t.

    >>> t = Tree(0)
    >>> list_all(t)
    [0]
    >>> t = descendents_from_list(Tree(0), [1, 2, 3, 4, 5, 6, 7], 3)
    >>> L = list_all(t)
    >>> L.sort()
    >>> L
    [0, 1, 2, 3, 4, 5, 6, 7]
    '''
    # implicit base case when len(t.children) == 0
    return [t.value] + gather_lists([list_all(c) for c in t.children])

(Credit to whoever did this code from Lab 05 solution)

This code is elegant, short, beautiful. It uses a helper function gather_lists(), which concatenates all sublists, and that is about it. The return line is only one line long. Now this will look like a travesty: I will do exactly the same, but using the method described before.
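
For anyone without the lab files at hand, gather_lists() could look roughly like the sketch below. This is my guess at its behaviour based on the description above, not the course code, and the small Tree class is an assumed stand-in with only value and children.

```python
class Tree:
    '''Assumed stand-in for the course Tree: a value and a list of children.'''
    def __init__(self, value, children=None):
        self.value = value
        self.children = children if children is not None else []


def gather_lists(list_of_lists):
    '''Concatenate all the sublists of list_of_lists into one new list.'''
    result = []
    for sublist in list_of_lists:
        result.extend(sublist)
    return result


def list_all(t):
    '''Return list of values in t (same one-liner as above).'''
    return [t.value] + gather_lists([list_all(c) for c in t.children])


t = Tree(0, [Tree(1, [Tree(3)]), Tree(2)])
print(gather_lists([[1, 3], [2]]))  # [1, 3, 2]
print(list_all(t))                  # [0, 1, 3, 2]
```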

# Global variable here will hold all the info gathered
_all_elements = []

def list_all(t):
    ''' (Tree) -> list

    Return list of values in t.

    >>> t = Tree(0)
    >>> list_all(t)
    [0]
    >>> t = descendents_from_list(Tree(0), [1, 2, 3, 4, 5, 6, 7], 3)
    >>> L = list_all(t)
    >>> L.sort()
    >>> L
    [0, 1, 2, 3, 4, 5, 6, 7]
    '''

    # Clear the global variable first
    _all_elements.clear()

    # Call helper function
    _get_elements(t)
    
    # Return global variable
    return _all_elements
    
def _get_elements(t):
    ''' (Tree) -> None
    
    Gather all objects from the tree rooted at t in the global
    variable _all_elements
    '''
    
    # Append value to global variable
    _all_elements.append(t.value)
    
    # If there are children...
    if len(t.children) != 0:
        # Recursively work with the children
        for x in t.children:
            _get_elements(x)

So it does the same, but it is easier to understand, and easier to construct. It is also longer, less elegant, and probably not very efficient. It might be vulnerable to outside interference since it relies on using a global variable as a temporary vehicle.
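
One concrete form of that interference, as far as I can tell: the function hands back the global list itself, so a second call silently rewrites the result of the first. The snippet repeats the two functions in compact form, with an assumed minimal Tree class (value and children only), so that it runs on its own.

```python
class Tree:
    '''Assumed stand-in for the course Tree: a value and a list of children.'''
    def __init__(self, value, children=None):
        self.value = value
        self.children = children if children is not None else []


_all_elements = []


def list_all(t):
    _all_elements.clear()
    _get_elements(t)
    return _all_elements


def _get_elements(t):
    _all_elements.append(t.value)
    for x in t.children:
        _get_elements(x)


first = list_all(Tree(0, [Tree(1), Tree(2)]))
print(first)             # [0, 1, 2]
second = list_all(Tree(9))
print(second)            # [9]
print(first)             # [9]  <- the first result changed too!
print(first is second)   # True: both names point at the global list
```

One possible way around it is to return a copy, for example list(_all_elements), at the cost of one extra pass over the list.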

Challenges

Recursion can be very challenging, but having a way to split the steps to visualize it better, while achieving results, can be encouraging. I encourage everyone to apply this method when unsure about what to do.

The challenge, hence, is how to get from this method to the more elegant, efficient one. There is also the question of whether this method really consumes much more computing power than the elegant one. Perhaps the elegant method is not way better, but only slightly better.
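
One possible stepping stone between the two styles, sketched here as a suggestion rather than anything from class: keep the helper, but pass the list in as a parameter instead of keeping it global. The names list_all_acc and _collect are invented for this example, and Tree is again an assumed minimal class.

```python
class Tree:
    '''Assumed stand-in for the course Tree: a value and a list of children.'''
    def __init__(self, value, children=None):
        self.value = value
        self.children = children if children is not None else []


def list_all_acc(t):
    '''Return list of values in t, gathered by a helper.'''
    result = []           # local, so every call gets a fresh list
    _collect(t, result)
    return result


def _collect(t, acc):
    '''Append the values of the tree rooted at t to the list acc.'''
    acc.append(t.value)
    for child in t.children:
        _collect(child, acc)


t = Tree(0, [Tree(1, [Tree(3)]), Tree(2)])
a = list_all_acc(t)
b = list_all_acc(Tree(9))
print(a)   # [0, 1, 3, 2]
print(b)   # [9]
```

The structure is still One Function plus One Helper, but nothing outside the call can reach the list while it is being filled.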

Sunday 15 March 2015

Linking Nodes: Wrapping up your one child tree

The instant I read about Linked Lists, what I thought about was the impossible bet explained in Minute Physics. The solution to that puzzle involved a linked list of sorts and, while the video is not about linked lists, it does outline one of the benefits of this data structure: it is an organized way to sequentially traverse through information.

What is a Linked List Node (LLNode)?
It is an object that has two pieces of information:

value = object held by the node (e.g. a number)
next = reference to the next node (i.e. another LLNode)

Does this ring any bells? Well, of course: it is identical to the attributes from our original Tree, with next instead of children (and next being the next node, while children was a list of nodes).
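
In code, a node along those lines might look like the sketch below. The attribute names value and next come from the description above; the parameter name nxt and the rest of the details are my own assumptions, not the class version.

```python
class LLNode:
    '''A node in a linked list.'''
    def __init__(self, value, nxt=None):
        self.value = value   # object held by the node
        self.next = nxt      # next LLNode, or None at the end of the chain


# Build the chain 1 -> 2 -> 3, innermost node first
chain = LLNode(1, LLNode(2, LLNode(3)))

# Walk the chain sequentially
values = []
node = chain
while node is not None:
    values.append(node.value)
    node = node.next
print(values)  # [1, 2, 3]
```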

What is a wrapper?
Nodes could go on forever, or perhaps not. When we want to use a particular section of a chain, we use a wrapper to define the beginning and the end of it, as well as the size. The attributes of a wrapper are the following:

size = size of chain
start = starting node
end = ending node

As the name implies, the wrapper wraps itself around the nodes.

The Challenge

The appearance of a wrapper changes everything! While we used to use recursion in a kind of happy-go-lucky way, we now need to keep in mind that recursive code has to deal with the nodes inside the wrapper. Hence, it will have a different nature since changes will need to be made on the nodes and/or the wrapper. These are some examples and what they change:

Delete Node: Nodes + Wrapper
Change Node Value: Nodes
Add Node: Nodes + Wrapper
Shorten Wrapper: Wrapper
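
A small sketch of three of these cases, to make the split between nodes and wrapper visible. The wrapper attributes size, start and end are the ones listed earlier; the class name LinkedList and the method names append, delete_front and values are my own choices for illustration, not the ones from class.

```python
class LLNode:
    def __init__(self, value, nxt=None):
        self.value = value
        self.next = nxt


class LinkedList:
    '''Wrapper around a chain of LLNodes (class name is my own choice).'''
    def __init__(self):
        self.size = 0
        self.start = None
        self.end = None

    def append(self, value):
        '''Add Node: touches the nodes AND the wrapper.'''
        new_node = LLNode(value)
        if self.end is None:          # empty chain
            self.start = new_node     # wrapper change
        else:
            self.end.next = new_node  # node change
        self.end = new_node           # wrapper change
        self.size += 1                # wrapper change

    def delete_front(self):
        '''Delete Node: touches the nodes AND the wrapper.'''
        if self.start is None:
            return
        old = self.start
        self.start = old.next         # wrapper change
        old.next = None               # node change: unlink the old node
        if self.start is None:        # the chain is now empty
            self.end = None
        self.size -= 1                # wrapper change

    def values(self):
        '''Return the values in the chain, from start to end.'''
        out, node = [], self.start
        while node is not None:
            out.append(node.value)
            node = node.next
        return out


lst = LinkedList()
for v in [1, 2, 3]:
    lst.append(v)
lst.start.next.value = 20      # Change Node Value: nodes only
lst.delete_front()
print(lst.values(), lst.size)  # [20, 3] 2
```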

Let's keep these in mind for the next post, where we will explore how recursive code can change the wrapper and the nodes.

Adapting in_between() for Binary Search Trees

Last week we had the second Term Test, and one of the questions we had involved manipulating a piece of code we had seen in the lab. While I am usually horrible at writing recursive code, I have a good time interpreting it. Therefore, today we are going to adapt one of the functions seen in class: list_between(). We have changed the function a little bit and named it in_between().

This is the code we are working with:

def in_between(node, start, end):
    ''' (BTNode, object, object) -> list

    Return a Python list of all values in the binary search tree
    rooted at node that are between start and end (inclusive).

    >>> in_between(None, 4, 11)
    []
    >>> b = BTNode(8)
    >>> b = insert(b, 6)
    >>> b = insert(b, 2)
    >>> b = insert(b, 5)
    >>> b = insert(b, 11)
    >>> b = insert(b, 7)
    >>> b = insert(b, 1)
    >>> in_between(b, 4, 5)
    [5]
    >>> L = in_between(b, 6, 11)
    >>> L.sort()
    >>> L
    [6, 7, 8, 11]
    '''

    #when there is no node
    #return an empty list
    #alternatively use:
    #if node is None:
    if not node:
        return []

    #you need to act recursively,
    #gathering numbers from node.data
    #while checking if you can go
    #further left or further right
    else:
        #go further left
        lst_left = (in_between(node.left, start, end)
                    if node.data > start
                    else [])

        #go further right
        lst_right = (in_between(node.right, start, end)
                     if node.data < end
                     else [])

        #check on the node
        lst_node = ([node.data]
                    if (start <= node.data <= end)
                    else [])

        #join all lists
        return lst_left + lst_node + lst_right

Let's say for example you want to do something a bit different. Now you want to search strictly for numbers that are multiples of three. This can be accomplished with a simple addition. What we are going to do is add the condition node.data % 3 == 0 in the part of the code where the objects are checked for compliance with start and end. Additionally, we are going to change the docstring so that it is strictly for numbers.

This is the modified code:

def threes_in_between(node, start, end):
    ''' (BTNode, number, number) -> list

    Return a Python list of all values in the binary search tree
    rooted at node that are between start and end (inclusive),
    and are multiples of three.

    >>> threes_in_between(None, 4, 11)
    []
    >>> b = BTNode(9)
    >>> b = insert(b, 6)
    >>> b = insert(b, 2)
    >>> b = insert(b, 5)
    >>> b = insert(b, 11)
    >>> b = insert(b, 7)
    >>> b = insert(b, 1)
    >>> threes_in_between(b, 4, 5)
    []
    >>> L = threes_in_between(b, 6, 11)
    >>> L.sort()
    >>> L
    [6, 9]
    '''
    #when there is no node
    #return an empty list
    #alternatively use:
    #if node is None:
    if not node:
        return []

    #you need to act recursively,
    #gathering numbers from node.data
    #while checking if you can go
    #further left or further right
    else:
        #go further left
        lst_left = (threes_in_between(node.left, start, end)
                    if node.data > start
                    else [])

        #go further right
        lst_right = (threes_in_between(node.right, start, end)
                     if node.data < end
                     else [])

        #check on the node
        lst_node = ([node.data]
                    if (start <= node.data <= end
                        and node.data % 3 == 0)
                    else [])

        #join all lists
        return lst_left + lst_node + lst_right

So it was a simple change that allowed us to be selective about the data we gather. It is important to notice, too, that lst_node is the only list that gathers data, while the other ones just act recursively on the left and right, deciding where to stop (remember that a Binary Search Tree's data increases to the right and decreases to the left).

Challenges

You might feel inclined to add the condition to lst_right and lst_left, too.

    else:
        #go further left
        lst_left = (threes_in_between(node.left, start, end)
                    if node.data > start and node.data % 3 == 0
                    else [])

        #go further right
        lst_right = (threes_in_between(node.right, start, end)
                     if node.data < end and node.data % 3 == 0
                     else [])

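However, this would be a huge mistake since their only job is to dig (or climb, since it is a tree) further down the tree with start and end as limits, finally gathering the numbers when lst_node does its magic. For instance, in a tree with 8 at the root, the search would stop right at the root, because 8 is not a multiple of three, and nothing below it would ever be checked.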
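
A quick experiment makes the point. The two functions below are compact rewrites of the correct and the mistaken version; the names threes_ok and threes_wrong are invented for this comparison, and BTNode is an assumed minimal stand-in for the class from the course.

```python
class BTNode:
    '''Stand-in for the course BTNode (assumed: data, left, right).'''
    def __init__(self, data, left=None, right=None):
        self.data, self.left, self.right = data, left, right


def threes_ok(node, start, end):
    '''Multiples of three in [start, end]; the % 3 test is on the node only.'''
    if not node:
        return []
    left = threes_ok(node.left, start, end) if node.data > start else []
    right = threes_ok(node.right, start, end) if node.data < end else []
    here = ([node.data]
            if start <= node.data <= end and node.data % 3 == 0 else [])
    return left + here + right


def threes_wrong(node, start, end):
    '''Same, but with the % 3 test wrongly added to the navigation too.'''
    if not node:
        return []
    left = (threes_wrong(node.left, start, end)
            if node.data > start and node.data % 3 == 0 else [])
    right = (threes_wrong(node.right, start, end)
             if node.data < end and node.data % 3 == 0 else [])
    here = ([node.data]
            if start <= node.data <= end and node.data % 3 == 0 else [])
    return left + here + right


# BST with 8 at the root:   8
#                         6   12
#                        3 7
root = BTNode(8, BTNode(6, BTNode(3), BTNode(7)), BTNode(12))
print(threes_ok(root, 1, 12))     # [3, 6, 12]
print(threes_wrong(root, 1, 12))  # []  (8 is not a multiple of 3, so it stops)
```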

Sunday 8 March 2015

Doodad Directed Programming (An overview on OOP)

I feel humbled by my friend Dima's explanation of Object Oriented Programming (OOP). It is a very entertaining account that includes both technical terms and his personal experience on learning about and applying this concept. Having said that, I will do my best to explain what OOP is and why it is important.

What is Object Oriented Programming (OOP)?
Some friends of mine believe that programming involves a lot of binary digits and their bending à la Matrix. What seems to be a lesser known fact about programming is that there are different levels of communication between programmers and machines.

Diagram 1: Representation of programmer / machine levels of communication

(Source: Programming Language, Webopedia)

As you can see in Diagram 1, we have hardware at the bottom and programming languages at the top. Machine language is the only language that machines understand, and it is composed mainly of numbers. Each type of CPU uses a different one, hence it is impractical for a person to learn it. To make it practical for people to create programs, high-level languages were created. These resemble English, some more than others (that is, if you don't know English, you might have a harder time understanding why things are the way they are in programming).

OK, this explanation seems to have drifted away from OOP. Do I still have a point to make here? Yes, and it has to do with the mentality applied when creating high-level languages. Machines do not know what a number is, or the heavy burden of daily life. They know about numbers (probably in the form of binary gates XOR XOR 110011101) and that's it. Hence, when creating a language they needed to tell the machine what a number is and how it works. They also needed to tell the machine what a string is and how it works. And so on...

But why should the person who engineers a programming language carry the burden of describing everything in the whole wide world? We find objects in the world, and they have definitions and rules. OOP is centered around creating objects that are properly defined through attributes and follow rules through methods, and around using them to fulfil a goal.

In short, OOP revolves around creating and using objects. In Python and other programming languages, objects are defined by classes, which are data structures that have attributes and methods.

What is a class?
A class is a data structure that has both attributes and methods (you are probably tired of hearing these two words). An attribute is any form of data that is contained inside the class. A method is a function that is part of the class, and typically focuses on working with the attributes that are part of the class.
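
Here is a tiny, made-up example to pin those two words down. The class Doodad, its attributes name and weight, and its method describe() are all invented for this post.

```python
class Doodad:
    '''A thing with a name and a weight (purely illustrative).'''

    def __init__(self, name, weight):
        # Attributes: data stored inside the object
        self.name = name
        self.weight = weight

    def describe(self):
        # Method: a function that works with the object's attributes
        return '{} weighs {} kg'.format(self.name, self.weight)


gizmo = Doodad('gizmo', 2)
print(gizmo.name)        # gizmo
print(gizmo.describe())  # gizmo weighs 2 kg
```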

Classes can have children. Why couldn't they? As long as they are caring parents and pass on some attributes and methods, there should not be a problem.

What is inheritance?
Inheritance occurs when a subclass is created. This subclass will have all the attributes and methods of the parent class, unless they are overridden in the subclass itself. The idea is that you can make a main, generic class and have its offspring be specific for a certain task.
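
As a minimal illustration (the classes Animal and Dog and their methods are invented here, not taken from class), the subclass below inherits two methods untouched and overrides a third:

```python
class Animal:
    '''Generic parent class.'''

    def __init__(self, name):
        self.name = name

    def speak(self):
        return 'some sound'

    def introduce(self):
        return '{} says {}'.format(self.name, self.speak())


class Dog(Animal):
    '''Specific child class: only speak() is overridden.'''

    def speak(self):
        return 'woof'


rex = Dog('Rex')                      # __init__ and introduce() are inherited
print(rex.introduce())                # Rex says woof
print(Animal('Generic').introduce())  # Generic says some sound
```

Notice that the inherited introduce() picks up the overridden speak() when it is called on a Dog.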

Check my POST on inheritance for a more detailed explanation with examples!

Saturday 7 February 2015

Recursion(Recursion(Recursion(...)))

Recursion is a powerful property of many programming languages. Alex's SLOG called it "code-ception", which I thought was a catchy name and a spot on way to describe it. Just like a dream inside a dream inside a dream, a recursive function is a function that goes inside itself, and then goes inside itself, and so on (as many times as needed). The difference between Inception and Recursion is that in Inception you go, while in a dream, into dreams of different people, while in Recursion you go, while the code is being executed, into the same code. Hence, the title Recursion(Recursion(Recursion(...))).

Why is it useful?
Save space; save trees. Well, not physical trees (unless for some reason you want to print your code [but not print() your code]), but space in your code. The alternative is writing several functions with if statements that call each other when needed. A clear problem appears as you realize that you would have to write a lot of code if many steps are needed!

Tracing Back
Tracing back is a good exercise when you want to understand what a piece of recursive code does. Let's try tracing back the following piece of code:

def looper(L):
    ''' (list) -> int

    Awful docstring that explains nothing.
    '''
    return sum([looper(x) for x in L]) if isinstance(L, list) else 5

(Oh! BTW, that is a new trick I learned from the labs. You can put an if/else inside a return statement, which Python calls a conditional expression)

These are the examples that we will trace back:
looper([]) --> 0 #since L has no elements, recursion does not take place

looper([17]) --> sum([looper(x) for x in [17]])
--> sum([5])
--> 5

looper([1, 2, 3, 4, [5, 6]]) --> sum([looper(x) for x in [1, 2, 3, 4, [5, 6]]])
--> sum([5, 5, 5, 5, 10]) #the inner call looper([5, 6]) is replaced by its value, 10, without tracing it again
--> 30

looper(['some', 'here', 5, ['hello', 'there']]) --> sum([looper(x) for x in ['some', 'here', 5, ['hello', 'there']]])
--> sum([5, 5, 5, 10])
--> 25
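
If you want to check the traces above against Python itself, a short script along these lines should do it. It only repeats looper() and calls it on the same four examples.

```python
def looper(L):
    '''Return 5 for anything that is not a list; for a list, add up
    the results of looper on each of its elements.'''
    return sum([looper(x) for x in L]) if isinstance(L, list) else 5


print(looper([]))                                         # 0
print(looper([17]))                                       # 5
print(looper([1, 2, 3, 4, [5, 6]]))                       # 30
print(looper(['some', 'here', 5, ['hello', 'there']]))    # 25
```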