All computer source code presented on this page, unless it includes attribution to another author, is provided by Ed Halley under the Artistic License. Use such code freely and without any expectation of support. I would like to know if you make anything cool with the code, or need questions answered.
# buzz - a phrase generator which assembles phrases from random pieces



    buzz - a phrase generator which assembles phrases from random pieces


    >>> buzz = Buzz('(She|He) (found|saw) a (red|green|blue) (ball|block).')
    'She saw a green ball.'
    'He found a red block.'
    >>>    # each call returns a new unique permutation
    'He saw a blue block.'
    >>>    # returns None if new permutation not found quickly
    >>> buzz.sample()    # choose one of the earlier select() results
    'He saw a blue block.'
    >>> buzz.sample(2)   # choose N of the earlier select() results
    [ 'He found a red block.', 'He saw a blue block.' ]


    This module was inspired by many novelty tools which purport to
    create convincing buzzword-filled jargon, by assembling sentences
    from a few scraps of impressive but meaningless prose:

        "As a resultant implication, " +
        "initiation of critical subsystem development " +
        "is further compounded when taking into account " +
        "the evolution of specifications over a given time period."

    Besides this narrow novelty application, I have found that having
    pronounceable names, or thematic phrases, has been a recurring
    requirement in many gaming and non-gaming applications.  A
    SimCity-like game may use this sort of generator to produce
    convincing but unique street names or neighborhoods, for example:

        neigbborhoods = Buzz("%{serene} %{botany} %{geography}", terms)

        'Shady Oakwood Estates'
        'Rolling Pine Meadows'
        'Breezy Willow River'
        'Cozy Cottonwood Creek'
        'Rolling Oakwood Heights'

    The constructor supports several ways of defining the pieces which
    are assembled into the final formulaic phrase constructions.  The
    buzzword object keeps track of past selections, so as to avoid
    unintended repeat results.  The terms can even be redefined later, to
    allow more choices when the original set starts to wear thin.


    Ed Halley ( 21 April 2008



import random
import re

class Buzz:

    '''A phrase generator, turning sets of fragments into unique phrases.'''

    TRIFIX = re.compile(r'\A(.*?)(\%\{.*?\}|\%[A-Za-z])(.*)\Z', re.DOTALL)
    EXPAND = re.compile(r'(\%\{.*?\}|\%[A-Za-z])', re.DOTALL)
    PARENS = re.compile(r'(\(([^()]*?)\))', re.DOTALL)
    def __init__(self, format='', *dicts, **terms):
        '''Constructs a new buzz generator that can return unique phrases.
        The first argument should define the basic format for all phrases
        generated.  There are two special strings that can be used in a
        format string: (x|y|z) or %{name}.  If the %{name} syntax is
        used, then the list of choices for that name should be passed in
        additional arguments, such as a dict or a named parameter.

        Each example gives identical results, but uses alternate
        arguments to define the expandable choices.

            # generates phrases like "a red block" or "a green car"
            buzz = Buzz('a (red|green|blue) (ball|block|car)')

            buzz = Buzz('a %{color} %{toy}',

            buzz = Buzz('a %{color} %{toy}',
                        { 'color': "red|green|blue",
                        'toy': ['ball','block','car'] } )

        Note that expansion terms can be nested, referring to other terms:

            buzz = Buzz('a %{adj} %{noun}',
                        adj='%{texture} %{color}',
                        garment='shirt|dress|hat|pair of boots')

        Circular or self-referential terms are not detected, but an error
        may be raised when trying to call the pick() method.

        If called without arguments, the strings returned are always
        empty.  The same options exist for the Buzz.setup() method which
        can be called subsequently.

        self.setup(format, *dicts, **terms)

    def reset(self):
        '''Forgets all previously returned results from select().'''
        self.spent = set([]) = 0

    def addterm(self, name, choices):
        '''Adds or replaces one term with a new set of expansion choices.
        If the example in the __init__() documentation is followed, then
        this method can replace existing terms:

            # changes the toy selection
            buzz.addterm( 'toy', 'doll|ball|truck|book' )
        if isinstance(choices, (str, unicode)):
            choices = choices.split('|')
        self.terms[name] = choices
        self.factors[name] = len(choices) = 0

    def addterms(self, terms):
        '''Adds or replaces multiple expansion terms from a dictionary.'''
        for term in terms:
            self.addterm(term, terms[term])

    def setup(self, format, *dicts, **terms):
        '''Replaces the entire expansion term selection from all arguments.
        This routine is called on initialization, but can be called at any
        other time as well.  See the __init__() documentation on the forms
        of arguments accepted.
        self.factors = { }
        self.terms = { }
        for d in dicts:
        if terms:
        format = self.autodef(format)
        self.format = self.split(format)
        self.picks = { }

    def autodef(self, format):
        '''Defines any terms directly from the format, in (x|y|z) syntax.
        Chooses new unique numerical names for the replacement terms.
        Returns a new format string which uses %{1} syntax instead.
        counter = 0
        m =
        while m:
            counter += 1
            while str(counter) in self.terms:
                counter += 1
            name = '%%{%s}' % counter
            format = format.replace(, name, 1)
            self.addterm( str(counter), )
            m =
        return format

    def split(self, format):
        '''Converts format string into list of string components.'''
        parts = [ ]
        m = Buzz.TRIFIX.match(format)
        while m:
            format =
            m = Buzz.TRIFIX.match(format)
        if format:
        return parts

    def pick(self):
        '''Returns a random phrase expanded from the defined choices.
        Self-referential terms for expansion will raise an error.
        An example of a self-referential expansion term would be:
            Buzz('a %{fruit} is sweet', fruit=['%{fruit}'])
        names = set([])
        parts = [ ]
        for part in self.format:
            m =
            while m:
                term =[2:-1]
                if term in names:
                    raise ValueError, 'self-referential choices '
                if term in self.terms:
                    choice = random.choice(self.terms[term])
                    part = part.replace(, choice)
                m =
        return ''.join(parts)

    def permutes(self):
        '''Returns a conservative estimate of the number of unique string
        selections that can be returned by pick().  This may be above or
        below the actual number of unique strings possible.  Given the
        way that terms can refer to other nested terms, we can only
        determine an estimate, and the select() method uses this estimate
        as a default number of retries to search for a new unique result.
        if > 0:
        total = 1
        for factor in self.factors.values(): total *= factor = total
        return total

    def select(self, retries=None):
        '''Tries to return a never-before returned phrase to the caller.

        Each time select() is called, a unique phrase is returned.
        If no unique phrases can be found, the None value is returned.
        This will happen if all permutations have already been selected.
        The None value may also be returned when nearly all of the
        possible permutations have been selected, and a new unique
        permutation is not found in a limited number of internal retries.
        See the permutes() method for more information.
        if retries is None:
            retries = self.permutes()
        for attempt in xrange(retries):
            try: choice = self.pick()
            except ValueError: continue
            if not choice in self.spent:
                return choice
        return None

    def sample(self, k=None):
        '''Returns a random previous result from earlier select() calls.
        Raises an error if select() has not been called since reset().
        if k is None:
            return self.sample(1)[0]
        if len(self.spent) < k:
            raise IndexError, 'must select() more before sample(%d)' % k
        return random.sample(self.spent, k)


# Run the module directly for a demonstration or self-test.

if __name__ == '__main__':

    buzz1 = Buzz('a %{adj} %{noun}',
                 adj='%{texture} %{color}',
                 garment='shirt|dress|hat|pair of boots')

    buzz2 = Buzz('a %{color} %{toy}',

    buzz3 = Buzz('(She|He) (found|saw) a (red|green|blue) (ball|block).')

    for buzz in (buzz1, buzz2, buzz3):
        seen = set([])
        for i in xrange(buzz.permutes() * 2):
            choice =
            if choice in seen:
                raise ValueError, 'repeated choice ' + repr(choice)
            if choice is not None:
                print repr(choice)
        print '%d unique selections' % len(seen)
        print '%d estimated permutes' % buzz.permutes()
        print '(expected:  selections <= permutes <= selections*10)'
        print repr(buzz.sample(2))

Contact Ed Halley by email at
Text, code, layout and artwork are Copyright © 1996-2013 Ed Halley.
Copying in whole or in part, with author attribution, is expressly allowed.
Any references to trademarks are illustrative and are controlled by their respective owners.
Make donations with PayPal - it's fast, free and secure!