Wordagen

Generate unique names for characters, online aliases, places, products, pets, brand names, hostnames or other realistic-sounding but original words.

Two generation methods are available:

  • Syllable-based: Fast generation using phonetic rules (default)
  • Markov chains: More realistic words trained on dictionary data (auto-enabled by Markov options)
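
As a rough illustration of the syllable-based idea, the sketch below glues together onset, vowel, and coda fragments until the word is long enough. The fragment lists and the names `make_syllable` and `make_word` are assumptions made for this example; Wordagen's actual phonetic rules may look quite different.

```python
import random

# Hypothetical fragment inventories, chosen only for illustration.
ONSETS = ["b", "br", "d", "dr", "g", "k", "l", "m", "n", "pl", "s", "st", "t", "tr"]
VOWELS = ["a", "e", "i", "o", "u", "ea", "ie", "ou"]
CODAS = ["", "", "n", "r", "s", "t", "l"]  # empty entries make open syllables more likely


def make_syllable(rng):
    """Build one onset + vowel + optional coda syllable (2 to 5 letters)."""
    return rng.choice(ONSETS) + rng.choice(VOWELS) + rng.choice(CODAS)


def make_word(min_len=6, max_len=12, rng=None):
    """Append syllables until min_len is reached; retry if max_len is exceeded."""
    if min_len < 1 or max_len < 2 or min_len > max_len:
        raise ValueError("need 1 <= min_len <= max_len and max_len >= 2")
    rng = rng or random.Random()
    while True:
        word = ""
        while len(word) < min_len:
            word += make_syllable(rng)
        if len(word) <= max_len:
            return word


print(make_word(6, 12))
```

No dictionary data is involved, which is consistent with this method being the fast default.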

Installation

git clone https://github.com/j33433/wordagen.git
cd wordagen
python wordagen.py

Example Output

# Batch of unique names generated by Markov chains trained on US name lists
python wordagen.py --name --order=4 --count=10
Carletha Widden
Jeanmarilou Binnis
Millis Carottsch
Talita Butchel
Maryroseanna Frisker
Fletchelle Gloeck
Merlie Vanhollard
Marthey Battison
Laurenae Tucken
Latashida Santanas

# Spanish nonsense words (Hunspell dictionary with morphological expansion)
python wordagen.py --words=es --order=4 --count=20
exime         estomano      mación        faneguillar   anecer      
liviador      almente       ladrilar      fullecedor    mascaloide  
calencia      alfalfeta     huélano       sufrar        melodra     
indonero      percos        manos         emparecen     seminencia  

# Names starting with a prefix (--prefix automatically enables Markov mode)
python wordagen.py --words=names --prefix=joe --order=2 --count=10 --length=5-20
joemardoretta  joell          joellys        joemikaridy    joeston      
joelana        joelie         joellina       joeminicollia  joelanne     

# Words ending with common suffixes (--suffix automatically enables Markov mode)
python wordagen.py --suffix=ing --count=10
rementing     naling        glanning      twing         reminereping
istricarming  sumbetauding  rumming       drobberring   scaring     

# Words with both prefix and suffix
python wordagen.py --prefix=pre --suffix=ing --count=5 --order=4 --length=8-15
prewheretting   prewoodcurring  prewrapping     pretening       prewashinning 

# Syllable-based simple algorithm (default)
python wordagen.py --single
drubriebreat

# Hyphen-joined Latin tokens, word-word-word (--order automatically enables Markov mode)
python wordagen.py --token --order=2 --words=la
sonis-calita-coris

Parameters

Length Control (--length)

  • --length=MIN-MAX: Range (e.g., 5-8)
  • --length=N: Exact length (e.g., 10)
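
The two accepted forms can be parsed with a few lines of Python. This is a minimal sketch; `parse_length` is a hypothetical helper written for this example, not a function taken from Wordagen.

```python
def parse_length(spec):
    """Parse 'MIN-MAX' or 'N' into a (min_len, max_len) tuple."""
    if "-" in spec:
        low, high = spec.split("-", 1)
        min_len, max_len = int(low), int(high)
    else:
        # A single number means an exact length.
        min_len = max_len = int(spec)
    if min_len < 1 or min_len > max_len:
        raise ValueError(f"invalid length spec: {spec!r}")
    return min_len, max_len


print(parse_length("5-8"))
print(parse_length("10"))
```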

Markov Chain Options

Markov mode is enabled automatically when any of the options below is used.

Order (--order, default: 2)

  • 1: Most creative, often unpronounceable
  • 2..3: Good balance
  • 4..6: More realistic
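
To show what the order controls, here is a small character-level Markov chain written for this README. It is a sketch under assumptions: the padding markers `^` and `$` and the functions `build_chain` and `generate` are illustrative and are not Wordagen's internals.

```python
import random
from collections import defaultdict

START, END = "^", "$"  # assumed padding markers for word start and word end


def build_chain(words, order=2):
    """Count which character follows each `order`-character context."""
    chain = defaultdict(lambda: defaultdict(int))
    for word in words:
        padded = START * order + word.lower() + END
        for i in range(len(padded) - order):
            context, nxt = padded[i:i + order], padded[i + order]
            chain[context][nxt] += 1
    return chain


def generate(chain, order=2, max_len=12, rng=None):
    """Walk the chain from the start context until END is drawn or max_len is hit."""
    rng = rng or random.Random()
    context, out = START * order, ""
    while len(out) < max_len:
        followers = chain[context]
        nxt = rng.choices(list(followers), weights=list(followers.values()))[0]
        if nxt == END:
            break
        out += nxt
        context = (context + nxt)[-order:]  # slide the context window forward
    return out


names = ["maria", "marion", "martin", "marta", "mario"]
chain = build_chain(names, order=2)
print(generate(chain, order=2))
```

A longer context leaves fewer possible continuations at each step. On a list as small as the one above, order=4 can only reproduce the training words, while order=1 recombines letters much more freely.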

Word Lists (--words, default: "en")

  • Language dictionaries (Hunspell): en, es, fr, de, it, pt, ar, bg, ca, cs, etc.
  • Special word lists: names, surnames, pet
  • Custom URLs: https://example.com/wordlist.txt
  • Use --list to see all available options (50+ languages supported via Hunspell)

Cutoff (--cutoff, default: 0.1)

  • 0.0: Include all transitions (most random)
  • 0.1: Filter rare patterns (balanced)
  • 0.5: Conservative, predictable output
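
One plausible reading of the cutoff is a minimum share that a transition must hold within its context before it is kept. The sketch below follows that reading; it is an assumption, since the exact rule Wordagen applies is not spelled out here, and `prune` is a hypothetical helper.

```python
def prune(followers, cutoff=0.1):
    """Drop followers whose share of the context's total count is below cutoff."""
    total = sum(followers.values())
    kept = {ch: n for ch, n in followers.items() if n / total >= cutoff}
    if not kept:
        # Never leave a context empty: keep the single most common follower.
        best = max(followers, key=followers.get)
        kept = {best: followers[best]}
    return kept


counts = {"a": 90, "e": 8, "z": 2}  # made-up follower counts for one context
print(prune(counts, 0.0))   # keeps every transition
print(prune(counts, 0.05))  # drops the rarest follower
print(prune(counts, 0.1))   # keeps only the dominant follower
```

Under this interpretation, raising the cutoff removes rare letter combinations, which makes the output more predictable.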

Prefix (--prefix)

  • Start generated words with a specific prefix
  • Example: --prefix=steve generates words like "stevenson", "stevie"

Suffix (--suffix)

  • End generated words with a specific suffix
  • Can be combined with --prefix
  • Example: --suffix=ing generates words like "processing", "marketing"
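
One simple way to honor both constraints with a character-level chain is to seed the walk with the prefix and then reject candidates that do not end with the suffix. The sketch below takes that approach as an assumption; Wordagen may implement prefixes and suffixes differently, and `build_chain` and `generate_with_affixes` are names invented for this example.

```python
import random
from collections import defaultdict

START, END = "^", "$"  # assumed padding markers


def build_chain(words, order=2):
    """Map each `order`-character context to the list of characters seen after it."""
    chain = defaultdict(list)
    for word in words:
        padded = START * order + word + END
        for i in range(len(padded) - order):
            chain[padded[i:i + order]].append(padded[i + order])
    return chain


def generate_with_affixes(chain, order, prefix="", suffix="", max_len=15,
                          max_tries=2000, rng=None):
    """Seed the walk with prefix; keep only finished words that end with suffix."""
    rng = rng or random.Random()
    for _ in range(max_tries):
        out = prefix
        context = (START * order + prefix)[-order:]
        finished = False
        while context in chain and len(out) <= max_len:
            nxt = rng.choice(chain[context])
            if nxt == END:
                finished = True
                break
            out += nxt
            context = (context + nxt)[-order:]
        if finished and out.endswith(suffix):
            return out
    return None  # the prefix was never seen in training, or no candidate matched


words = ["walking", "talking", "singing", "ringing",
         "joking", "making", "taking", "baking"]
chain = build_chain(words, order=2)
print(generate_with_affixes(chain, 2, prefix="ta", suffix="ing"))
```

Rejection is easy to reason about but can need many tries when the suffix is rare in the training data, so a real implementation might prefer a smarter strategy.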

Manual Markov Mode (--markov)

  • Explicitly enable Markov chains with default settings
  • Not needed if other Markov options are used

Python API

Syllable Generator

from syllable_generator import SyllableWordGenerator

generator = SyllableWordGenerator()
# One word, 6 to 12 characters long
word = generator.generate(min_len=6, max_len=12)
# Ten words, each 4 to 8 characters long
words = generator.generate_batch(10, min_len=4, max_len=8)

Markov Generator

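from markov_generator import MarkovWordGenerator

# Order-2 chain trained on the English ("en") word list
generator = MarkovWordGenerator(order=2, cutoff=0.1, words="en")
word = generator.generate(min_len=6, max_len=12)
# Ten words using the default length settings
words = generator.generate_batch(10)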

Performance Notes

  • First run downloads word lists and builds chains (slower)
  • Subsequent runs use cached files (much faster)
  • Use --verbose to see initialization progress
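
The speedup on later runs comes from not repeating expensive work. A generic build-once, load-afterwards cache can be sketched as follows; the pickle format, the hashed file name, and the `cached` helper are assumptions for illustration and do not describe Wordagen's actual cache layout.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path


def cached(cache_dir, key, build):
    """Return the stored result for key, calling build() only on a cache miss."""
    name = hashlib.sha256(key.encode()).hexdigest()[:16] + ".pkl"
    path = Path(cache_dir) / name
    if path.exists():
        with path.open("rb") as fh:
            return pickle.load(fh)
    value = build()  # the slow step: download, parse, build the chain
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as fh:
        pickle.dump(value, fh)
    return value


def slow_build():
    print("building chain...")
    return {"ab": ["c", "d"]}  # stand-in for a real chain


with tempfile.TemporaryDirectory() as tmp:
    cached(tmp, "en-order2-cutoff0.1", slow_build)  # builds and stores
    cached(tmp, "en-order2-cutoff0.1", slow_build)  # loads from disk
```

Including the word list, order, and cutoff in the key keeps chains built with different settings from overwriting each other.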

Vibe Coded 🤖

Built with Aider, using Claude and ChatGPT.

License

MIT
