
Thursday, November 24, 2016

Basic word prediction in C#


In this little post I will go through a small and very basic prediction engine written in C# for one of my projects.
Basically, it does two things:
  • It collects data in the form of lists of strings.
  • Given an input, it returns a list of predictions for the next item, in descending probability order.
So no rocket science, but a good and basic prediction engine that can be built upon depending on what scenarios you want to solve.


First, some unit tests:


[TestMethod]
public void PredictorTest_NoMatch()
{
    var target = new Predictor();
    var actual = target.Predict("The rabbit-hole went straight on like a tunnel for some way");
    Assert.AreEqual(0, actual.Count);
}
[TestMethod]
public void PredictorTest_SingleMatch()
{
    var target = new Predictor();
    target.Predict("Alice opened the door and found that it led into a small passage");
    var actual = target.Predict("Alice opened the door");
    Assert.AreEqual(1, actual.Count);
    Assert.AreEqual("and", actual.First());
}
[TestMethod]
public void PredictorTest_SingleMatch_OtherNoise()
{
    var target = new Predictor();
    target.Predict("Alice opened the door and found that it led into a small passage");
    target.Predict("Alice took up the fan and gloves");
    var actual = target.Predict("Alice opened the door");
    Assert.AreEqual(1, actual.Count);
    Assert.AreEqual("and", actual.First());
}
[TestMethod]
public void PredictorTest_SingleMatch_TwoResultsInOrder()
{
    var target = new Predictor();
    target.Predict("Alice thought she might as well wait");
    target.Predict("Alice thought this a very curious thing");
    target.Predict("Alice thought she had never seen such a curious");
    var actual = target.Predict("Alice thought");
    Assert.AreEqual(2, actual.Count);
    Assert.AreEqual("she", actual.First());
    Assert.AreEqual("this", actual.Last());
}

So nothing too fancy. You provide it with input and, as expected, it returns a list of possible next values. The examples are from Alice's Adventures in Wonderland by Lewis Carroll, the Project Gutenberg edition.


Predictor class

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public class Predictor
{
    // Maps a prefix ("Alice opened the") to counts of the words seen after it.
    private readonly Dictionary<string, Dictionary<string, int>> _items = new Dictionary<string, Dictionary<string, int>>();
    private readonly char[] _tokenDelimiters = {' '};

    public List<string> Predict(string input)
    {
        var tokens = input.Split(_tokenDelimiters, StringSplitOptions.RemoveEmptyEntries);
        var previousBuilder = new StringBuilder();
        Dictionary<string, int> nextFullList;
        foreach (var token in tokens)
        {
            // Count this token as a follower of everything seen so far.
            nextFullList = GetOrCreate(_items, previousBuilder.ToString());
            if (nextFullList.ContainsKey(token))
                nextFullList[token] += 1;
            else
                nextFullList.Add(token, 1);

            if (previousBuilder.Length > 0)
                previousBuilder.Append(" ");
            previousBuilder.Append(token);
        }

        // Return the followers of the full input, most frequent first.
        nextFullList = GetOrCreate(_items, previousBuilder.ToString());
        var prediction = (from x in nextFullList
            orderby x.Value descending
            select x.Key).ToList();

        return prediction;
    }

    private static T GetOrCreate<T>(Dictionary<string, T> d, string key) where T : new()
    {
        if (d.ContainsKey(key))
        {
            return d[key];
        }
        var t = new T();
        d.Add(key, t);
        return t;
    }
}

It's not smart, and it needs to have seen quite a lot of samples to work well.
One improvement would be to also store previous/next token pairs, so that if the full-prefix search doesn't yield any result, a guess can be made based only on the previous word. But I guess that is up to you.
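That fallback could look something like the following. This is only a sketch, kept separate from the Predictor class above; the class name and structure are my own invention, and it reuses the same ContainsKey-based counting style as the post's code:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch of the bigram fallback: count single previous/next
// word pairs so a guess can be made from just the last word when the
// full prefix has never been seen before.
public class BigramFallback
{
    // Maps a single word to counts of the words seen directly after it.
    private readonly Dictionary<string, Dictionary<string, int>> _pairs =
        new Dictionary<string, Dictionary<string, int>>();

    public List<string> Predict(string input)
    {
        var tokens = input.Split(new[] {' '}, StringSplitOptions.RemoveEmptyEntries);

        // Record every previous/next pair in the input.
        for (var i = 0; i < tokens.Length - 1; i++)
        {
            if (!_pairs.ContainsKey(tokens[i]))
                _pairs.Add(tokens[i], new Dictionary<string, int>());
            var nexts = _pairs[tokens[i]];
            if (nexts.ContainsKey(tokens[i + 1]))
                nexts[tokens[i + 1]] += 1;
            else
                nexts.Add(tokens[i + 1], 1);
        }

        // Guess from the last word only, most frequent follower first.
        if (tokens.Length == 0 || !_pairs.ContainsKey(tokens.Last()))
            return new List<string>();
        return _pairs[tokens.Last()]
            .OrderByDescending(x => x.Value)
            .Select(x => x.Key)
            .ToList();
    }
}
```

One would query this only when the full-prefix lookup comes back empty, so the more precise prediction always wins when it exists.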

All code is provided as-is. It's copied from my own code-base and may need some additional programming to work in yours.



Wednesday, November 23, 2016

Defining Intelligence

Lately I've been thinking a lot about intelligence, knowledge and thinking, and about how to define the goal of my Artificial Intelligence project. What is it that I want to achieve?
This post is basically a lot of in-progress thoughts and references to popular culture.

So where to begin?

The inner voice(s)

Early humans may have thought that the voice in their head was the voice of God, but when did we become aware of a self?
Is one voice enough? Or should one strive to include at least two, maybe with different approaches, in a dialog with each other?
Maybe a loop of self-triggered input that iterates over core values and thus would shape the system over time. Even if a lot of input is given from various sources, the repetition of core values would make those beliefs strong in different parts of the system. Like a backstory. I guess more ideas on this will come with more episodes of Westworld.

Neural networks, the brain


I have a rudimentary knowledge of how the human brain works at its lowest level: mainly by building huge networks of interconnected cells, neurons, that can propagate triggers between them. When a neuron is triggered, it will in turn trigger its output, which can be connected to multiple other neurons. A single neuron can also receive triggers from multiple neurons. This gets out of hand quite fast.
There is a lot of research into modelling artificial neural networks, and honestly my own work with neural networks has been to solve very specific tasks. Maybe the system should be able to train its own networks to solve different kinds of tasks? I'll put that in the backlog for future pondering. To start with, I think this is too low a level at which to look at intelligence.


Active Symbols


Something I picked up from the book 'Gödel, Escher, Bach: An Eternal Golden Braid' by Douglas R. Hofstadter: the idea of looking at the brain at a higher level than the neural one. A symbol could be anything, for example a word or a concept. Each symbol is built from neurons activating in certain patterns. Neurons could be re-used between different symbols, and symbols could activate other symbols in a network of their own.
Does an artificial intelligence need the neural layer? Or could we build just the symbols and activate them in different ways?
It would also need to be able to merge multiple symbols into new ones.
In the end this could give the entity some sort of associative power: activating symbols based on input and seeing what other symbols also activate from past knowledge.

Feeling of joy


When do humans feel joy? When we discover new things, when things go according to plan.

Should my project 'feel'?
The following dialog between two characters, played by Dustin Hoffman and Samuel L. Jackson, is from the movie Sphere.
Norman:
I would be happy if Jerry had no emotions whatsoever. Because the thing of it is once you go down that road... here's Jerry, an emotional being cooped up for 300 years with no one to talk to... none of the socialization, the emotional growth that comes from contact with other emotional beings...
Harry:
So...?
Norman:
What happens if Jerry gets mad?
To the point, is it OK to build something that can feel? Could it be a bad thing? But if you build something intelligent that can't feel, are you creating a sociopath?
What about boredom? Doing routine tasks makes me bored, and that feeling makes me look at problems in other ways, maybe automate them. Should the agent try to minimize boredom and maximize joy? Or are there different kinds of joy? A too-greedy algorithm would maybe not be able to do long-term planning, or go through periods of focus to achieve something great.
Should the fitness function be floating, like it is for me? Sometimes I feel happy just watching TV; other times only a 10K run gets me there. Maybe a healthy balance is what we should strive for here as well.

Breathing

At least for me, a lot of my thinking is based around the basic need to breathe. An artificial entity would not have that need, but maybe there should be some kind of core-layer pulse that keeps things going. Something that triggers and activates itself; maybe the inner voice should be linked to this.
Is attention span linked to breathing as well? For how many breaths can you keep a thought in active play before something else pops up instead?

Compartmentalizing

Working with a lot of information and knowledge will lead to conflicting ideas. Some way to compartmentalize, that is to separate and isolate, would be a good thing to have.

Prediction

The ability to predict future events based on past experiences. A simple way would be to just store sequences of words and increase counters when they are seen again. When a known sequence appears, the different possible predictions would trigger and could be acted on. Maybe one part global and another part connected to a place or person. Being able to predict how someone might react seems core when trying to decide what to do next. This would require some sort of linking of past knowledge to a source.
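As a rough sketch of that per-source idea, one could keep one prediction model per person or place alongside a shared global one. The class and method names here are hypothetical, and Predictor is the class from the previous post:

```csharp
using System.Collections.Generic;

// Hypothetical sketch: a source-specific model (one per person/place)
// is consulted first, and the shared global model acts as a fallback.
public class SourcedPredictor
{
    private readonly Predictor _global = new Predictor();
    private readonly Dictionary<string, Predictor> _bySource =
        new Dictionary<string, Predictor>();

    public List<string> Predict(string source, string input)
    {
        if (!_bySource.ContainsKey(source))
            _bySource.Add(source, new Predictor());

        // Train and query both models; the source-specific one wins
        // whenever it has seen something relevant.
        var local = _bySource[source].Predict(input);
        var global = _global.Predict(input);
        return local.Count > 0 ? local : global;
    }
}
```

This is only one way to do the linking; the point is that "Alice usually says X after Y" and "people in general say Z after Y" become separate, queryable pieces of knowledge.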

Lastly

Should we strive to create something artificial with all the restrictions that humans have, or should we strive for something with no restrictions other than the computer hardware itself? Maybe a lot of the self that we know comes from the constant struggle with the faulty hardware we are running on. Or is the movie Transcendence, with Johnny Depp and Rebecca Hall, onto something with its idea of unlimited resources?

This was a lot longer than I thought it would be. In the end, a lot of questions and few answers, but I guess that was expected. I hope this sparked some ideas, thoughts or feelings. :)