MIT staff blogger Chris Peterson SM '13

So @mitblogs_ebooks exists

by Chris Peterson SM '13

'as you might know, i am a full time internet'

Some of you may remember @horse_ebooks, a Twitter account that, before it was bought and subverted by Buzzfeed, was a truly delightful gibberish machine, spouting pseudorandomly generated spam tweets from a collection of source texts. Some of my favorites:

[embedded @horse_ebooks tweets]

Fraudulent or not, @horse_ebooks helped inspire an entire genre of surrealist _ebooks-style Twitter bots, which actually do take source texts and produce randomly generated tweets inflected by the voice of various academics, journalists, and programmers. Because they are randomly generated, many, perhaps most, of these tweets aren’t very funny. But some of them are really funny, if in an admittedly odd way, because while they are consonant in subject and voice with the source texts, they are probabilistically written in ways that the ‘actual’ authors never would. The practical result is that you get tweets which sound strangely familiar but are off just enough to be startling and (sometimes) funny.

A few months ago I decided I wanted to make one for the blogs. Over the last few weeks, after reading and committee ended, I actually did. Here’s how:

First, I wrote a crude but effective scraper in Python. This script crawls the blogs, downloads every entry ever written, uses the BeautifulSoup library to parse the HTML, and writes each parsed line to a text file.
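A minimal sketch of that parsing step, not the original script: it swaps in Python's built-in `html.parser` for BeautifulSoup so it runs without installing anything, and the names `TextExtractor`, `extract_lines`, `fetch`, and `scrape` are made up for illustration. The crawl itself (finding every entry's URL) is left out, since the post doesn't describe how that was done.

```python
from html.parser import HTMLParser
from urllib.request import urlopen

# Tags whose contents are never visible text.
SKIP_TAGS = {"script", "style"}
# Tags that should start a new line of output (a rough, assumed list).
BLOCK_TAGS = {"p", "div", "br", "li", "h1", "h2", "h3", "h4", "h5", "h6"}


class TextExtractor(HTMLParser):
    """Collect visible text, putting line breaks around block-level tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        elif tag in BLOCK_TAGS:
            self.chunks.append("\n")

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            if self._skip_depth > 0:
                self._skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self.chunks.append("\n")

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def extract_lines(html):
    """Return the non-empty, whitespace-trimmed text lines of an HTML page."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = "".join(parser.chunks)
    return [line.strip() for line in text.splitlines() if line.strip()]


def fetch(url):
    """Download one page and decode it as UTF-8 (assumed encoding)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def scrape(urls, out_path):
    """Append the parsed lines of each page in `urls` to a text file."""
    with open(out_path, "a", encoding="utf-8") as out:
        for url in urls:
            for line in extract_lines(fetch(url)):
                out.write(line + "\n")
```

With BeautifulSoup installed, `extract_lines` could be replaced by a call to `BeautifulSoup(html, "html.parser").get_text()` followed by the same line splitting.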

Then, I cobbled together a tweet generator in Ruby. This script takes the text file as a source, uses the MarkyMarkov gem to map probabilistic word relationships, randomly generates sentences, rolls a D20 to decide if they should be SHOUTED IN ALL CAPS, and posts the final result to Twitter.
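The idea behind the generator can be sketched in a few lines. This is a Python analogue, not the Ruby script and not MarkyMarkov's API: the function names, the two-word chain order, the 140-character cap, and the choice to shout only on a natural 20 are all assumptions for illustration. Posting to Twitter is omitted.

```python
import random
from collections import defaultdict


def build_chain(text, order=2):
    """Map each run of `order` words to the words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        key = tuple(words[i:i + order])
        chain[key].append(words[i + order])
    return dict(chain)


def generate(chain, rng, max_words=20, max_chars=140):
    """Random-walk the chain from a random starting key."""
    if not chain:
        raise ValueError("source text is too short to build a chain")
    key = rng.choice(sorted(chain))
    out = list(key)
    while len(out) < max_words:
        followers = chain.get(tuple(out[-len(key):]))
        if not followers:
            break  # dead end: this word pair only appears at the very end
        out.append(rng.choice(followers))
    text = " ".join(out)
    # Trim whole words until the result fits the character limit.
    while len(text) > max_chars and " " in text:
        text = text.rsplit(" ", 1)[0]
    return text


def maybe_shout(text, rng):
    """Roll a D20; shout on a 20 (the threshold is an assumption)."""
    return text.upper() if rng.randint(1, 20) == 20 else text


if __name__ == "__main__":
    rng = random.Random()
    source = "the cat sat on the mat and the cat ran to the door"
    print(maybe_shout(generate(build_chain(source), rng), rng))
```

Because repeated followers are stored as duplicates in each list, picking uniformly from the list reproduces the observed word frequencies, which is what makes the output sound like the source.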

I uploaded the source text and the Ruby script to scripts, a free hosting service operated by MIT students for the MIT community, and set up a cron job to run it every three hours.

TL;DR: @mitblogs_ebooks is now a thing. Everything it says is randomly generated from a source text of every blog entry ever written. I like to think of it as admissions advice from an alternate universe, spoken not by any single blogger but by the rumbling chorus of a collective, semi-sentient blogger organism:

[embedded @mitblogs_ebooks tweets]

So there you go. I had never used Ruby, or MarkyMarkov, or a lot of other things before I began this piece of carpentry, but I find that trying (and failing, and trying again) is the best way to learn. In making @mitblogs_ebooks, I learned a lot, and sometimes the thing I made even makes me laugh because of how weird it is, which is a bonus. If you want to try your own exercise in computationally generated weirdness, you can download my code here. Happy making!