Towards A Computational Model of Poetry Generation is a paper by Manurung, Ritchie and Thompson (whoever they are), published in May 2000, and so far it seems like the best starting point for my graduation thesis.
There are three main parts to this story:
- what makes it hard
- how it used to be done
- how it should be done
Obviously my idea of how it should be done matches the authors' somewhat, otherwise I wouldn't hold the paper in such high regard :P
Poetry is a unique artifact of human natural language production, with the distinctive feature of having a strong unity between its content and its form. The creation of poetry is a task that requires intelligence, expert mastery over world and linguistic knowledge, and creativity. Although some research work has been devoted towards creative language such as story generation, poetry writing has not been afforded the same attention. It is the aim of this research to fill that gap, and to shed some light on what often seems to be the most enigmatic and mysterious forms of artistic expression.
In short, this is a really cool problem to solve because it's something that hasn't really been done before. And it just looks interesting when you give computers - strictly logical reasonable machines - even the most modest of abilities to create art.
When rationalization is required ... well somebody needs to create all those pop songs. Imagine if we could get a computer to write them, then sell them to all the pop music labels to feed to the Spears and Aguileras of this world. Marvelous!
The problem with poetry is two-fold. On the one hand, you find yourself at a serious disadvantage compared to building a traditional NLG (natural language generation) tool, because poetry is less rigid. There is no specific message you are trying to convey, so the goal you are trying to achieve is a bit muddy.
Furthermore, poetry possesses a certain unity between message and form. As Levin stated in 1962 "In poetry the form of the discourse and its meaning are fused into a higher unity".
This creates a problem because traditional NLG systems are decomposed into simpler units of content determination, text planning and surface realisation. In poetry form and message are too intertwined to allow this approach.
The only advantage we might have from creating poetry instead of informative text is that readers expect to do a lot of interpretative work. Realistically, though, this again only makes it harder to define a goal for our algorithm.
Essentially: it's hard to decide what we _want_ and traditional well researched approaches break down. Great.
As with a lot of NLP research, there were several attempts at a strictly academic approach to poetry generation back in the '80s. More recently, there have been several-ish web-based attempts, but using similar techniques.
The problem with old approaches is that they were basically party tricks - humans generated extremely detailed grammars and poetic structures, so the computer ended up just semi-randomly filling in the words. This produced good-looking poetry, and in the case of RACTER even led to publication.
Leaving aside considerations of whether the poetry at this point is even computer generated, all of these approaches completely ignored semantic meaning and most poetics as well - rhythm, rhyme and figurative language.
Just as I'm planning to in my thesis, the authors of this paper have decided to focus mostly on poetic form and ensure their results follow a strict verse, rhyming structure and other features usually attributed to classic poetry. Mostly because it is easy to pass off almost anything as modern experimental poetry, and the more structure we can muster, the easier it becomes to actually verify results.
The approach they suggest is a stochastic hillclimbing algorithm - specifically an evolutionary approach where the algorithm loops between generating new candidates and evaluating them.
Generation is done by mutating the best candidates from the previous cycle through three simple constructs: Adding, Changing and Deleting.
So if you had a verse "John walked" it could become "John walked to the store", or "John lumbered". Or "John likes Jill and Mary" could become "John likes Jill".
Because of the integrated architecture (combining all of semantics, form, etc.) these changes can and should happen at any level, so special care must be taken that changing the semantics of a verse doesn't negatively affect its rhythm. Although from the paper I don't grok why exactly this is a problem, since the evaluation step should take care of negative mutations.
After we have a new population, the candidates are evaluated for correctness, and those with the highest scores from the fitness functions get to go into the next round of mutations.
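The mutate-evaluate-select loop above can be sketched in a few lines of Python. To be clear, this is my own minimal sketch, not code from the paper: I'm representing a candidate poem as a flat list of words and inventing the operator details, where the real system mutates a much richer linguistic structure.

```python
import random

# The three mutation operators: Add, Change, Delete.
# A candidate is simplified to a list of words; vocab is the lexicon.
def add(words, vocab):
    i = random.randrange(len(words) + 1)
    return words[:i] + [random.choice(vocab)] + words[i:]

def change(words, vocab):
    i = random.randrange(len(words))
    return words[:i] + [random.choice(vocab)] + words[i + 1:]

def delete(words):
    if len(words) <= 1:
        return words[:]  # never shrink to an empty poem
    i = random.randrange(len(words))
    return words[:i] + words[i + 1:]

def evolve(seed, vocab, fitness, generations=50, population=20, survivors=5):
    """Stochastic hillclimbing: mutate the pool, keep the fittest."""
    pool = [seed[:] for _ in range(population)]
    for _ in range(generations):
        # Generate: mutate every candidate with a random operator
        mutants = [random.choice([
            lambda w: add(w, vocab),
            lambda w: change(w, vocab),
            delete,
        ])(c) for c in pool]
        # Evaluate and select: the highest-scoring candidates survive
        best = sorted(pool + mutants, key=fitness, reverse=True)[:survivors]
        pool = best * (population // survivors)
    return max(pool, key=fitness)
```

Because the survivors always include the best candidate seen so far, fitness never decreases across generations - which is the hillclimbing part.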
The authors go into little detail about assessing the different poetic structures, mostly because they have not yet cracked how to implement all of them. Detecting rhythm seems to be where they've made the most progress and a solution is suggested where we give the algorithm a target phonetic form, like this limerick:
- w,s,w,w,s,w,w,s (a)
- w,s,w,w,s,w,w,s (a)
- w,s,w,w,s (b)
- w,s,w,w,s (b)
- w,s,w,w,s,w,w,s (a)
Where w means a weakly stressed syllable and s a strongly stressed one. The letters in parentheses describe the rhyming structure.
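One way to turn such a target form into a fitness score - my assumption of how it could work, not the paper's actual method - is to compare a candidate line's stress pattern against the target and score the similarity. Looking up which syllables are stressed would happen elsewhere, e.g. in a pronunciation dictionary.

```python
from difflib import SequenceMatcher

# Target stress pattern for one long limerick line: w,s,w,w,s,w,w,s
TARGET_LINE = "wswwswws"

def rhythm_score(candidate_stresses, target=TARGET_LINE):
    """Similarity (0..1) between a candidate's stress pattern and the target.

    `candidate_stresses` is a string of 'w'/'s' marks; extracting it from
    actual words is assumed to be handled by a pronunciation lookup.
    """
    return SequenceMatcher(None, candidate_stresses, target).ratio()
```

A perfect match scores 1.0 and anything shorter or misplaced scores proportionally less, which gives the evolutionary loop a smooth gradient to climb.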
It is suggested that, with the exception of figurative language, all the criteria can be dumbed down to numerical scores that can be easily quantified through different means that are only loosely described.
They also go into a bit of formal detail explaining the sort of grammars that are used, but you should go read the actual paper if you're interested in such heavily theoretical things :)
Here is an example of what their algorithm can produce:
john(1), mary(2), dog(3), bottle(4), love(5,6,7), slow(8), smile(9,10)
the bottle was loved by Luke
a bottle was loved by a dog