Poemgen – learning Python through automated “poetry” generation

In an attempt to learn some Python skills and delve a bit into the area of natural language, I decided to make this project I call “poemgen”. It’s at a very simple stage at the moment, but I plan to expand and improve it in the future.

The basic idea was this: write a program that produces sensical English statements in a “poetic” structure.

The way I accomplished this was by defining the different parts of speech that could be used. Nouns and verbs were some of the first to come to mind, of course, but I later expanded to articles, adjectives, prepositions, and gerunds, among others. I assigned each part of speech a number to identify it by. For example, nouns are #1, verbs are #2, articles are #3, and so on. Each part of speech has its own list of words.

Example from a preliminary version of the nouns list:

nouns = ["flower", "tree", "child", "sun", "moon", "darkness", "light", "rain", "beach", "earth", "dog", "cat", "lover", "life", "love", "heart", "fish", "poetry", "music", "happiness", "peace", "serenity", "quiet", "blossoms", "blooms", "madness", "anger", "sadness", "dance", "books"]
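One way the numbered parts of speech could be stored is a dictionary keyed by ID. This is only a sketch, not the actual poemgen code: IDs 1 to 3 follow the numbering given above, while IDs 4 and 5, the name PARTS_OF_SPEECH, and the shortened word lists are assumptions made for illustration.

```python
# A sketch of the "ID -> word list" table. IDs 1-3 follow the post;
# IDs 4 and 5 are assumed here, and the word lists are shortened samples.
PARTS_OF_SPEECH = {
    "1": ["flower", "tree", "child", "sun", "moon"],   # nouns
    "2": ["taste", "grow", "sing"],                    # verbs
    "3": ["a", "the"],                                 # articles
    "4": ["under", "with", "into"],                    # prepositions (assumed ID)
    "5": ["lovely", "sad", "nice"],                    # adjectives (assumed ID)
}

# Look up the word list for a given part-of-speech ID.
print(PARTS_OF_SPEECH["1"])
```

Using string keys means each character of a structure such as "314351" can be used directly as a lookup key.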

After I compiled some sample words for each list, I thought about how English sentences are formed. Sure, we say hundreds of sentences every day, but on closer inspection, most have a rudimentary structure that can be copied to make other sentences. I labelled each part I found with one of the ID numbers for the parts of speech above, and soon came to classify sentences as “314351” or some other combination of numbers. It may look cryptic at first glance, but there’s a method here. Each digit refers to a part of speech I defined in my table, so putting the digits together allows a sentence to be built using the parts of speech as elements. By breaking sentences down into their parts, we have the tools with which to build our own sentences in an algorithmic way.
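To make a structure string less cryptic, each digit can be translated back into the name of its part of speech. The sketch below assumes that 4 means preposition and 5 means adjective; the post only fixes 1 to 3, and PART_NAMES and describe_structure are hypothetical names.

```python
# Map each ID digit to a readable name. IDs 4 and 5 are assumptions.
PART_NAMES = {
    "1": "noun",
    "2": "verb",
    "3": "article",
    "4": "preposition",  # assumed
    "5": "adjective",    # assumed
}

def describe_structure(structure):
    """Return the part-of-speech name for every digit in a structure string."""
    return [PART_NAMES[digit] for digit in structure]

print(describe_structure("314351"))
# ['article', 'noun', 'preposition', 'article', 'adjective', 'noun']
```

Under these assumptions, "314351" would describe a phrase shaped like "the heart under a lovely sadness". One limitation of single-digit IDs is that they only leave room for nine parts of speech.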

The rest of the Python script parses these sentence structure IDs and, for each digit, fetches a random word from the specified part of speech. It concatenates the words together, and there you have a (hopefully) sensical English statement. For the “poem” part, I simply start a new line and choose a new sentence structure to use.
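The generation step described above might look roughly like the following. This is a minimal sketch rather than the real script: the WORDS table, the STRUCTURES list, the function names, and the meaning of IDs 4 and 5 are all assumptions.

```python
import random

# Shortened word lists. IDs 1-3 follow the post; 4 and 5 are assumed.
WORDS = {
    "1": ["flower", "heart", "child", "darkness"],  # nouns
    "2": ["taste", "grow", "sing"],                 # verbs
    "3": ["a", "the"],                              # articles
    "4": ["under", "with", "into"],                 # prepositions (assumed)
    "5": ["lovely", "sad", "nice"],                 # adjectives (assumed)
}

# Hypothetical sentence structures, one digit per word.
STRUCTURES = ["314351", "3512", "351435"]

def generate_line(structure, rng=random):
    """Pick one random word per digit and join them with spaces."""
    return " ".join(rng.choice(WORDS[digit]) for digit in structure)

def generate_poem(num_lines, rng=random):
    """Build a poem by choosing a fresh structure for every line."""
    lines = [generate_line(rng.choice(STRUCTURES), rng) for _ in range(num_lines)]
    return "\n".join(lines)

print(generate_poem(4))
```

Passing a seeded random.Random(...) instance as rng makes the output repeatable, which helps when comparing the effect of new word lists or structures.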

A lot of what I got out of the program at first didn’t make much sense, and I’m still working on making it more believable. I’ve received some interesting outputs, however.

Here’s an example of a “poem” that my program has generated:

 building a lively heart under a lovely sadness , shaking or singing

the shining happiness taste to a sad earth

the nice heart or the sad lover , growing

shaking the nice music with the beautiful darkness , thinking or loving

being a sad child into the lovely flower , being and singing

As you can see, there’s a lot of work to be done, but I plan to continually improve upon it in the future. It was a fun exercise and pretty neat for something I did as a first Python project.

To-do’s for the future:

  • expand word lists
  • capitalize first words in lines
  • remove the stray space before commas
  • add more sentence structures
  • add different tenses of words and teach the program how to understand them
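Two of these items, capitalizing the first word and removing the space before commas, could be handled by a small clean-up pass over each generated line. The function name tidy_line is hypothetical, and this is just one possible approach.

```python
import re

def tidy_line(line):
    """Trim the line, remove whitespace before commas, capitalize the first letter."""
    line = re.sub(r"\s+,", ",", line.strip())
    # Upper-case only the first character; str.capitalize() would also
    # lower-case the rest of the line, which may not always be wanted.
    return line[:1].upper() + line[1:]

print(tidy_line(" building a lively heart under a lovely sadness , shaking or singing"))
# Building a lively heart under a lovely sadness, shaking or singing
```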

If you’re interested, you can view the code for poemgen here.


2 comments on “Poemgen – learning Python through automated “poetry” generation

  1. Fun. You might have fun with Connor’s Every Haiku site.

