08 October 2014

Hackbright Day 7: Markov Chains and Mary Had a Little Lamb

Today started out with a review of the previous day's exercise 6, where we were given a colon separated values list of restaurants and ratings and we were told to use a dictionary to store the information and then present it in two ways: alphabetical and then by rating. Sorting alphabetically is easy, because we used the restaurant name as the key and a sort on the dictionary will sort on the key names.

The sort on the rating is a bit more complicated. My initial thought was to make the ratings the key and the restaurant names the values, but that doesn't work because you can't have duplicate keys. We ended up doing this in two ways: first, we created a list of tuples out of the dictionary and then sorted on the second element (which was the rating) and then printed from the list of tuples. The other way that we approached this was to create a list of the keys (or restaurant names in this case) and then use that list to create a list of tuples that contained the restaurant ratings and names and then used list sorting to order them by rating. While both approaches seem the same, the subtle difference is how we sorted them. In the first example, we needed to use a special argument in the sort command called a key. A key is a function and we used that function to return the second element of the tuple, upon which the list was sorted. In the second case, the list of tuples was constructed such that the tuples contained the (rating, name,) instead of the same order that was in the dictionary.

One of the biggest things that I learned from this exercise is that there are many ways to approach a problem. We also were warned against trusting anyone who uses the program to provide sane input. One of the instructors called this "defensive coding".

If we ever have a boy, it's totally going to be named Little Bobby Tables

In the afternoon, we discussed Markov chains. There are a lot of wonderful explanations of Markov chains. Markov chains are used in weather prediction, Google page rank, and in predictive typing, among other applications. However, the best use of Markov chains clearly is for humor. In fact, Chris once used a Markov chain text generator on his tweets. It's pretty hilarious. Also hilarious? Kim Kierkegaardashian, a mash up of tweets from Kim Kardashian and the philosophy of Soren Kieregaard, the only existentialist author I could ever stand to read for long periods of time. (College was a moody time, OKAY?!)

Here is my take on Markov chains:

Markov chains are constructed by the probability of an event occurring at a given state. If you have a state and then event a happens, then you can go to event a and then look at the probabilities associated with that event of what might happen next. I realize that this is a very terrible explanation, so let's look at an example.

Mary had a little lamb,
little lamb, little lamb
Mary had a little lamb,
whose fleece was white as snow

Every time you see the word Mary, it's followed by the word had. Similarly, every time you see the word little, it's followed by lamb. When you look at the word lamb, however, it's followed by little, little, Mary, and whose.

Thus, if you start with the word lamb, there's a 50% chance that the next word will be little, a 25% chance that the next word will be Mary, and a 25% chance that the next word will be whose.

The Markov chain text generator that we worked on this afternoon basically worked by looking at an entire piece of text and then determining the possible set of next words for a given word.

We then started with a seed word (or a few words, but we'll get to that in a minute), randomly choosing the next word from the list of possible next words that we generated from analyzing the text. At this point, we have: seed + new word. It then evaluates the new word to figure out the likely next word, and so on.

Markov chains are pretty good at generating text, but they work better if you evaluate more than just one word at a time. This set of words is referred to as an n-gram, where n refers to the number of words that you string together. So in the example I gave, instead of looking at just "Mary", we could look at Mary had or little lamb instead of just little. This is actually a pretty terrible example, but let's explore it for a little lamb. I mean bit. little bit.

The way you would approach this using a bi-gram (or an n-gram with 2 words), is by first evaluating Mary had, which always is followed by a.
seed = Mary had
new = a

then we need to evaluate the next set of two, which would be:

had a, which is always followed by little.

So now we have:
Mary had

And then we evaluate a little, and so on. The way that we attacked this in python was to create a dictionary, where the key was a tuple that had n-elements, where n was the number of words that made up each n-gram. The corresponding values were lists that contained the possible outcomes. Let's look at the first four entries in a dictionary for a bi-gram of Mary had a little lamb:

('Mary','had',) : ['a'],
('had','a',) : ['little'],
('a', 'little',) : ['lamb']:
('little', 'lamb',) : ['little', 'little', 'Mary', 'whose'],

So after we figured out that, we worked on trying to get the random text to end of an ending punctuation mark (i.e., a period, question mark, or exclamation point) and then trying to truncate the phrase to less than 140 characters. This is because the next step is to make the tweets appear on twitter. We'll be working on that tomorrow morning. I'll share the link to the twitter profile and describe how the twitter API works tomorrow if we can get it up and tweeting.

Wow. This is insanely long. I'm sort of surprised that I'm still faithfully blogging after every class, but so far, it's been the easiest way for me to synthesize what I've been working on. Here's hoping that I can continue on for at least the next 4 weeks. Also, for everyone who was sitting through my Debbie Downer-isms at lunch today, thank you for letting me vent. It's oddly comforting to know that other people feel the same way.


Dan N said...

import operator

x = {"McDonalds":0, "Burger King":1, "Wendy's":2, "Subway":2, "Carl's Jr":1}

sorted_by_key = sorted(x.items(), key=operator.itemgetter(0))
sorted_by_val = sorted(x.items(), key=operator.itemgetter(1))

julie k h aka jkru said...

That was supremely useful. Thanks!

Dan N said...

Or without using itemgetter.

sorted_by_key2 = sorted(x.items, key=lambda x: x[0])
sorted_by_val2 = sorted(x.items, key=lambda x: x[1])

julie k h aka jkru said...

yeah, we did something similar without a lambda function where we defined a real function that we literally only used for this, but i think it was more of an object lesson than a practical application.