This post is summarized from Chapter 3 of Ruli Manurung‘s An evolutionary algorithm approach to poetry generation from 2003 – it is essentially 10 years old research from a fast moving field of science. However, these are core principles and techniques; a casual perusal of wikipedia indicates they are still valid.

If you know of something new and spectacular, I’d love to know.

Natural Language Generation

So what is natural language generation anyway? The only thing everyone can agree on is that the algorithm should take some manner of input and output a sensible text that a human can read and understand.

Trisk - a conversational robot

Trisk - a conversational robot

Formally defined, a NLG system should accept a <k,c,u,d> tuple, where k is the knowledge source, c is the communicative goal, u is the user model and d is the discourse model. Essentially this means that “You know K and you want to say C to U, using the style of D” … remember back to your high school days – before writing an essay the teacher would tell you exactly this.

The output … well that’s easier, it should make sense, fulfill the communicative goal and look like it was written at least by a trained monkey.

A NLG system usually involves three processes:

  1. Content determination – what you’re going to say, based on knowledge, communicative goal and user’s expectations
  2. Sentence planning – how are you going to say it. Remember, semantically sane!
  3. Surface realisation – the final output, which specific words to use, do sentences follow each other nicely and so on. You could call this style or flow.

Traditional architecture

Traditionally this has been approached by discretely implementing the three stages and assembling them into a pipeline of some sort.

Three reiter pipeline architecture

More modernly Wikipedia suggests a six stage solution divided into: Content determination, Document structuring, Aggregation, Lexical choice, Referring expression generation and Realisation. However this is still essentially the same approach.

But what if you don’t have a clear communicative goal and just want to say something interesting about a subject? Or perhaps a later stage might find a better solution that has been disallowed by an earlier stage.

A number of different architectures have emerged to combat these problems:

NLG architectures

The revision approach combats problems by iteratively fixing each stage in hopes of finding something better, feedback simply feeds findings from later stages back into earlier stages, which can open up options that would otherwise be pruned away, blackboard is particularly interesting because it uses a common space where results from different stages are posted so every stage has access to what’s already out there.

But without a clear communicative goal or end user in mind, the integrated approach looks most promising.

NLG as a search problem

One way to approach natural language generation in an integrated manner is by posing it as a search problem – you have a search space of possible solutions, your job is merely to find the best.

A search space

A search space

This can be done in many ways:

  • hillclimbing – take a random spot on the map. With each step, consider all possible moves and move in the best direction. Hope you don’t get trapped in a local maximum
  • systemic search– go through all possible solutions and find the best one. Problem with this approach is that it can take a while to finish, but you always get the most optimal solutionFour main ways exist to implement this: chart generation (use charts to define the search space and go from there), truth maintenance systems (have assumptions, make sure they stay true), “redefining the search problem” (a lot of invalid choices can be pruned out based on linguistic knowledge, which helps with the explosive search space size), constraint logic programming (everyone who’s done a prolog course has seen this I think)
  • stochastic search – this is my favorite approach since it involves a bunch of interesting algorithms. The basic idea is that you can define a set of constraints and then “guess” solutions based on those constraints. But since each constraint is probabilistic, you have a lot of flexibility in coming up with solutions.

Overgeneration and ranking

This approach is based on the model of an author and a reviewer. The author continuously outputs a bunch of plausible solutions that the reviewer then ranks on viability and finally picks one to present to the user.

Author-reviewer model

The beauty of this approach lies in the fact that each part of the system only has to concern itself with what it does best, while they still collaborate so you aren’t trapped in a pipeline architecture.

Opportunistic planning

Opportunistic planning is an approach that works best when you don’t have a clearly defined communicative goal – so how do you know when you’ve found the solution?

A jewelry

A jewelry

This approach was developed for a system that produced description labels for items in a museum. From what I understand users could virtually click around a collection of jewelry, giving the NLG system a simple goal of tell me something interesting about this.

Without any clear goals and users, even without much advance knowledge of what the system will be talking about, the only viable approach is generating text based on opportunity, like a human guide would.

The crowning moment is when it can connect subsequent items into  something like “…and it was work like this which directly inspired work like the Roger Morris brooch on the stand which we looked at earlier”

Poetry?

Poetry suffers from a lot of the problems other NLG systems have already had to tackle – a unity of content and form, lack of clearly defined goals etc.

Therefore the best approach we can take is some combination of opportunistic planning, which gives us the ability to have lightbulb moments while writing poetry, and stochastic search, which gives us the needed flexibility in the face of many constraints without any goals.

Enhanced by Zemanta

Learned something new? Want to improve your skills?

Join over 10,000 engineers just like you already improving their skills!

Here's how it works 👇

Leave your email and I'll send you an Interactive Modern JavaScript Cheatsheet 📖right away. After that you'll get thoughtfully written emails every week about React, JavaScript, and your career. Lessons learned over my 20 years in the industry working with companies ranging from tiny startups to Fortune5 behemoths.

PS: You should also follow me on twitter 👉 here.
It's where I go to shoot the shit about programming.