A generative model that uses a Markov chain algorithm to analyse a corpus of text, learn statistical patterns of word sequences, and use those patterns to generate new, original text
The program is built around a Markov chain model. The model works in two phases (a code sketch of both phases follows the list below):
- Training (Build Phase):
  - The program reads a source text (the "corpus").
  - It scans the text and breaks it down into sequences of words called prefixes. The length of these prefixes is determined by the `prefixLen` constant (e.g., a length of 2 means it looks at pairs of words).
  - For each prefix, it records the word that immediately follows it (the suffix).
  - It builds a map where each key is a prefix and the value is a list of all the suffixes that have appeared after that prefix in the corpus, for example `{"A computer": ["is", "system"]}`.
- Generation (Generate Phase):
  - The program starts with a random prefix from the ones it learned.
  - It randomly selects one of the possible suffixes for that prefix to be the next word.
  - The prefix is then updated by "sliding" it one word forward (dropping the first word and adding the newly chosen word).
  - This process repeats until the desired number of words has been generated, creating a new block of text.
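
As a rough illustration of both phases, a minimal sketch of such a model is shown below. The `Model` type, the `Build` method, and the `chain` field are names invented for this sketch; only `prefixLen`, `corpus`, `main()`, and `model.Generate()` appear in `main.go`, and the actual implementation there may differ.

```go
package main

import (
	"math/rand"
	"strings"
)

// prefixLen mirrors the constant at the top of main.go: the number of words
// in each prefix (2 means the model looks at pairs of words).
const prefixLen = 2

// Model maps each prefix (prefixLen words joined by spaces) to the list of
// suffix words observed immediately after that prefix in the corpus.
// The type and field names are assumptions made for this sketch.
type Model struct {
	chain map[string][]string
}

// Build is the training phase: it scans the corpus and records, for every
// prefix, the word that immediately follows it.
func (m *Model) Build(corpus string) {
	m.chain = make(map[string][]string)
	words := strings.Fields(corpus)
	for i := 0; i+prefixLen < len(words); i++ {
		prefix := strings.Join(words[i:i+prefixLen], " ")
		suffix := words[i+prefixLen]
		m.chain[prefix] = append(m.chain[prefix], suffix)
	}
}

// Generate is the generation phase: it starts from a random learned prefix,
// repeatedly picks a random suffix for the current prefix, and slides the
// prefix forward one word until n words have been produced.
func (m *Model) Generate(n int) string {
	if len(m.chain) == 0 {
		return "" // nothing was learned during the build phase
	}

	// Start from a random prefix seen in the corpus.
	prefixes := make([]string, 0, len(m.chain))
	for p := range m.chain {
		prefixes = append(prefixes, p)
	}
	prefix := prefixes[rand.Intn(len(prefixes))]

	out := strings.Fields(prefix)
	for len(out) < n {
		suffixes, ok := m.chain[prefix]
		if !ok {
			break // dead end: this prefix never appeared with a suffix
		}
		next := suffixes[rand.Intn(len(suffixes))]
		out = append(out, next)
		// Slide the window: drop the first word of the prefix, append the new word.
		words := strings.Fields(prefix)
		prefix = strings.Join(append(words[1:], next), " ")
	}
	return strings.Join(out, " ")
}
```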
When the program runs, you will see output in your terminal: first a confirmation that the model has been trained, then the newly generated text.
You can easily customise the behaviour of the text generator by changing the constants and variables in `main.go` (a sketch of a matching `main()` follows this list):

- Change the corpus: modify the `corpus` constant in the `main()` function. You can paste any text you like; for larger texts, consider reading from an external file.
- Adjust the prefix length: change the `prefixLen` constant at the top of the file. A larger number (e.g., 3) produces text that is more coherent but less varied, as it relies on longer learned phrases. A smaller number (e.g., 1) produces more random output.
- Change the output length: in the `main()` function, change the number passed to `model.Generate()`. For example, `model.Generate(100)` generates 100 words.
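
Tying these knobs together, a `main()` along the following lines would complete the sketch shown earlier (the corpus text here is a placeholder, and `"fmt"` would need to be added to the sketch's import list):

```go
func main() {
	// The training text; paste any text you like, or read it from an
	// external file for larger corpora.
	const corpus = `A computer is a machine. A computer system processes data
according to a list of instructions.`

	model := &Model{}
	model.Build(corpus) // training phase: learn prefix -> suffix patterns
	fmt.Println("model trained")

	// Generation phase: produce 100 words of new text.
	fmt.Println(model.Generate(100))
}
```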