User Guide

The easiest way to get started exploring SpeakToMe is to download the released source code from the download page.  Once you have the source, load the solution into Visual Studio, set the TestUI project as the startup project, and run the application.  A window like the one shown below should appear.

[Figure: TestUI window]

In the textbox at the bottom, enter “hello”, “hi there”, or “hello mr. roboto”.  The application should respond with “hello, test”.

Note that two projects in the source code download, SpeakToMe.Core and SpeakToMe.Speech, constitute the processor itself.  The remaining projects (SpeakToMe.Data, SpeakToMe.Presence, and TestUI) are included for demonstration purposes only and show how to implement the interfaces or derive from the base classes that the library exposes.  The sections below give more detail on what is required to use the library in a new solution.

How it Works / Architecture

Below is a diagram that illustrates, at a high level, how SpeakToMe works.

[Figure: SpeakToMe architecture diagram]

Inside the engine, several Tokens are defined and each knows how to find particular words or phrases in the input.  When the token finds one of these phrases, it returns a TokenResult which includes the token’s type, a value that has been interpreted from the phrase, where in the input the phrase was found, etc.  When input is received, each token gets a shot at processing it and returning its results.  These TokenResults are then organized into a structure that facilitates their being matched with a rule.
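The token pass described above can be sketched as follows.  This is only an illustration, not the library's code: the WordToken class, the shape of TokenResult (type, value, position), and the TokenDemo helper are all assumptions made for the example.  Each token scans the whole input for its words, and the results are then ordered by where they were found.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical result type: which token matched, the interpreted value,
// and the character offset where the phrase was found.
public class TokenResult
{
    public string TokenType { get; set; }
    public string Value { get; set; }
    public int Position { get; set; }
}

// Hypothetical token that knows a fixed list of words or phrases.
public class WordToken
{
    private readonly string tokenType;
    private readonly List<string> words;

    public WordToken(string tokenType, params string[] words)
    {
        this.tokenType = tokenType;
        this.words = words.ToList();
    }

    // Returns one result per whole-word occurrence of any known phrase.
    public IEnumerable<TokenResult> Parse(string input)
    {
        foreach (string word in words)
        {
            string pattern = @"\b" + Regex.Escape(word) + @"\b";
            foreach (Match m in Regex.Matches(input, pattern, RegexOptions.IgnoreCase))
            {
                yield return new TokenResult
                {
                    TokenType = tokenType,
                    Value = m.Value.ToLowerInvariant(),
                    Position = m.Index
                };
            }
        }
    }
}

public static class TokenDemo
{
    // Every token gets a shot at the input; results are ordered by position.
    public static List<TokenResult> Run(string input)
    {
        var tokens = new[]
        {
            new WordToken("TokenHello", "hello", "hi there"),
            new WordToken("TokenYesNo", "yes", "no")
        };
        return tokens.SelectMany(t => t.Parse(input)).OrderBy(r => r.Position).ToList();
    }
}
```

Calling TokenDemo.Run("Well hello, yes I know") yields two results, a TokenHello followed by a TokenYesNo.  The word-boundary pattern keeps “no” from matching inside “know”.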

Rules are simply methods that have been defined on a class implementing the IRuleClass interface.  These are loaded when the process starts and are cached.  The method signatures for a rule are constructed in such a way as to define a set of tokens as parameters.   When the token results are being matched against rules, the order and type of the tokens must match, exactly, the tokens specified in the rule parameter list.  Once a matching rule has been found, it is executed and the token results are passed in as parameters.
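One plausible way to perform this kind of matching is reflection over the rule class.  The sketch below is an assumption about the approach, not the engine's actual implementation: ConversationContext, the token classes, SampleRules, and RuleMatcher are minimal stand-ins invented for the example.  Parameter types are compared with IsInstanceOfType so that a token derived from the declared parameter type also matches.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical stand-ins for the library's types.
public class ConversationContext
{
    public List<string> Said { get; } = new List<string>();
    public void Say(string text) { Said.Add(text); }
}
public class TokenHello { }
public class TokenQuotedPhrase { }

public static class SampleRules
{
    public static void Hello(ConversationContext c, TokenHello h)
    {
        c.Say("rule: hello");
    }

    public static void HelloThenPhrase(ConversationContext c, TokenHello h, TokenQuotedPhrase p)
    {
        c.Say("rule: hello + phrase");
    }
}

public static class RuleMatcher
{
    // Finds the first public static void method whose parameters are a
    // ConversationContext followed by the given tokens, in the same order.
    public static MethodInfo FindRule(Type ruleClass, IReadOnlyList<object> tokens)
    {
        return ruleClass
            .GetMethods(BindingFlags.Public | BindingFlags.Static)
            .Where(m => m.ReturnType == typeof(void))
            .FirstOrDefault(m =>
            {
                ParameterInfo[] p = m.GetParameters();
                if (p.Length != tokens.Count + 1) return false;
                if (p[0].ParameterType != typeof(ConversationContext)) return false;
                for (int i = 0; i < tokens.Count; i++)
                {
                    if (!p[i + 1].ParameterType.IsInstanceOfType(tokens[i])) return false;
                }
                return true;
            });
    }

    // Invokes the matching rule, passing the context and the tokens as arguments.
    public static bool Execute(Type ruleClass, ConversationContext context, IReadOnlyList<object> tokens)
    {
        MethodInfo rule = FindRule(ruleClass, tokens);
        if (rule == null) return false;
        object[] args = new object[] { context }.Concat(tokens).ToArray();
        rule.Invoke(null, args);
        return true;
    }
}
```

With these stand-ins, passing a TokenHello followed by a TokenQuotedPhrase runs HelloThenPhrase, while the same two tokens in the opposite order match nothing and Execute returns false.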

Extending the Processor

There are five places where you can plug your own code into the processor.

Bootstrapper

The engine uses MEF (the Managed Extensibility Framework) as an IoC container.  To initialize the container with all the types the application will need, a catalog must be created.  To facilitate this, the concept of a bootstrapper was borrowed from the PRISM guidance.  Your application must extend the supplied BootStrapperBase class and override the AddCustomAssembliesToCatalog method.  This method is passed the catalog that will be used to resolve dependencies, and the developer can add whatever types are needed to it.  The base class loads all types in the SpeakToMe.Core.dll and SpeakToMe.Speech.dll assemblies.  The developer is responsible for adding any additional required types via the bootstrapper descendant.  See the TestUI project in the source code download for an example of a bootstrapper implementation.
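A bootstrapper descendant might look roughly like the sketch below.  The BootStrapperBase shown here is a hypothetical stand-in reduced to the single hook mentioned above; the real class's method signature, visibility, and catalog type may differ, so check the TestUI project for the actual shape.  AggregateCatalog and AssemblyCatalog are standard MEF types, and the example needs a reference to System.ComponentModel.Composition.

```csharp
using System.ComponentModel.Composition.Hosting;

// Hypothetical stand-in for the library's BootStrapperBase. The real class
// also loads SpeakToMe.Core.dll and SpeakToMe.Speech.dll into the catalog.
public abstract class BootStrapperBase
{
    // Assumed signature: the catalog is passed in so descendants can extend it.
    protected abstract void AddCustomAssembliesToCatalog(AggregateCatalog catalog);

    public AggregateCatalog BuildCatalog()
    {
        var catalog = new AggregateCatalog();
        AddCustomAssembliesToCatalog(catalog);
        return catalog;
    }
}

public class MyBootStrapper : BootStrapperBase
{
    protected override void AddCustomAssembliesToCatalog(AggregateCatalog catalog)
    {
        // Register every exported part in the assembly that contains this class,
        // for example an IUserData implementation or custom tokens.
        catalog.Catalogs.Add(new AssemblyCatalog(typeof(MyBootStrapper).Assembly));
    }
}
```

Adding a whole assembly through AssemblyCatalog means any class in it that carries an Export attribute becomes available to the container without further registration.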

IUserData

The processor needs access to data about users and their conversations.  Rather than dictate a data access approach, the library provides the IUserData interface.  Internally, the processing engine accesses data through this interface and loads the implementation from the MEF container.  This means the developer must provide an implementation of the interface and export it so that MEF can load it into the catalog.  A class implementing the interface would be declared along the following lines.

IUserData Implementation
    using System.ComponentModel.Composition;

    [Export(typeof(IUserData))]
    [PartCreationPolicy(CreationPolicy.Shared)]
    public class UserData : IUserData
    {
        // Implement the IUserData members here.
    }

Also, in the bootstrapper implementation, the assembly containing the class must be added to the catalog.

Tokens

Tokens define the vocabulary of the processor, that is, the words and phrases it knows how to deal with.  Many tokens are already defined for general concepts, dates and times, numbers, etc.  However, for a specific application, the vernacular of the business domain will most likely need to be added to the system.  A token can be as simple as locating a word in the input and representing it as a string, or it can be considerably more complex.  Consider the word “first”.  Does this mean the first of the month?  The first in a series?  The position of the winner of a race?  To determine the meaning, the token must examine the surrounding words to establish the context in which the word was used, which can get quite complicated.

There is a base Token class that already implements the ability to find a word or phrase.  If all your token needs to do is find a piece of text, inheriting from the base class makes it a very simple task.

Token Implementation
    using System.Collections.Generic;
    using System.ComponentModel.Composition;
    using System.Runtime.Serialization;

    [DataContract]
    [Export(typeof(IParseToken))]
    public class TokenYesNo : Token, IParseToken
    {
        /// <summary>
        /// Initializes a new instance of the TokenYesNo class.
        /// </summary>
        public TokenYesNo()
        {
            // The base Token class searches the input for these words.
            this.Words = new List<string> { "yes", "no" };
        }
    }

In this example, if either “yes” or “no” is found in the input, a TokenResult for a TokenYesNo will be generated.  Notice the Export attribute.  If the class does not implement IParseToken, or does not export itself as this type, it will not be loaded or used to parse the input.

Also of note is the ability to use inheritance in the token structure to make life easier when defining rules (discussed below).  Another way to implement tokens for “yes” and “no” would be to define a TokenYesNo class that does not inherit from the base Token class and does not implement any logic, and then build TokenYes and TokenNo classes that inherit from TokenYesNo.  A rule could then declare a TokenYesNo parameter and, because of polymorphism, either a TokenYes or a TokenNo would match it.
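The inheritance approach can be illustrated with a few stand-in classes.  Everything here is a simplified assumption for the example: the real tokens would also implement IParseToken and carry the Export attribute, the real ConversationContext.Say takes a second argument, and ConfirmRules is a hypothetical rule class.

```csharp
using System;

// Hypothetical stand-in for the library's ConversationContext.
public class ConversationContext
{
    public string LastReply { get; private set; }
    public void Say(string text) { LastReply = text; }
}

// Base token with no logic of its own; it exists only to group the two answers.
public class TokenYesNo { }
public class TokenYes : TokenYesNo { }
public class TokenNo : TokenYesNo { }

public static class ConfirmRules
{
    // One rule handles both answers because the parameter is the shared base type.
    public static void RespondToAnswer(ConversationContext cContext, TokenYesNo answer)
    {
        cContext.Say(answer is TokenYes ? "Great, continuing." : "Okay, cancelled.");
    }
}
```

Without the shared base type, the same behavior would need two rules with otherwise identical signatures, one for TokenYes and one for TokenNo.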

Rules

If tokens are the vocabulary of the system, then rules are its cognition: its ability to recognize patterns of words and act on them.  Rules are just methods.  However, the following requirements must be met for a rule to be recognized and properly used by the system.

  • Rules must be defined in a class that implements the IRuleClass interface (defined in SpeakToMe.Core.Interfaces)
  • Rules must be static methods
  • Rules must return void
  • Rules must take, as their first parameter, a parameter of type ConversationContext
  • Rules must be defined in the SpeakToMe.Speech project (This is because of how the rules are loaded at startup).

After the first parameter, which must be an instance of ConversationContext, any number of token types can be specified as parameters.  These tokens are meant to follow the structure of the sentence the rule reacts to.  For example, here are the rules defined in the source code that is available for download.

Rule Definitions
    using SpeakToMe.Core.Interfaces;

    namespace SpeakToMe.Speech.Rules
    {
        public class MiscRules : IRuleClass
        {
            // Matches a solitary "hello".
            public static void RespondToHello(ConversationContext cContext, TokenHello hello)
            {
                cContext.Say("Hello " + cContext.ConversationUser.FirstName, null);
            }

            // Matches "hello" followed by any phrase.
            public static void RespondToHello(ConversationContext cContext, TokenHello hello, TokenQuotedPhrase phrase)
            {
                cContext.Say("Hello " + cContext.ConversationUser.FirstName, null);
            }

            // Matches any phrase followed by "hello".
            public static void RespondToHello(ConversationContext cContext, TokenQuotedPhrase phrase, TokenHello hello)
            {
                cContext.Say("Hello " + cContext.ConversationUser.FirstName, null);
            }

            // Matches "hello" with a phrase on both sides.
            public static void RespondToHello(ConversationContext cContext, TokenQuotedPhrase phrase1, TokenHello hello, TokenQuotedPhrase phrase2)
            {
                cContext.Say("Hello " + cContext.ConversationUser.FirstName, null);
            }
        }
    }

The purpose of all of these rules is to respond to a greeting.  The first rule will respond to a solitary “hello”.  The second will match on “hello _______”, where the blank can be anything.  The system doesn’t care what follows the hello.  The third rule will match “_______ hello” such as “well hello” and the last rule matches “_________ hello ________” such as “well hello there”.  It is often a good idea to create groups of rules in this way to make what the user can input more natural and flexible.

IPresence

How does input from the user reach the system?  The library uses the idea of presence for this purpose.  An application can have many different types of presence: it can be present on IM, email, a web service, etc.  The library includes the IPresence interface as a way to plug in these different channels of communication.  The interface is shown below.

Code Snippet
    using System.ComponentModel.Composition;

    namespace SpeakToMe.Core.Interfaces
    {
        // Classes that implement this interface represent a form of presence / mode of communication with the system.
        public interface IPresence : IPartImportsSatisfiedNotification
        {
            bool IsConnected { get; }
            void Initialize();
            void ProcessCommand(string command, int userId, ISmartHomeServiceCallback callback);
        }
    }

In your custom implementation of this interface, you are responsible for implementing the required protocol and listening on it.  Once you determine that a message has been received, identify the user that sent it (via the address or id associated with the message) and pass the user id along with the command to your ProcessCommand call, which in turn calls into the command processor.  The callback parameter is only relevant if the input was received from an in-process call.  When the command has been processed and the system is ready to respond to the user, an event is raised; you must handle that event and return the reply to the user.  Have a look at the sample in the SpeakToMe.Presence project.
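The flow described above (receive a message, resolve the user, submit the command, route the reply back) can be sketched with an in-memory channel.  All of the types below are hypothetical stand-ins: the IPresence here omits IPartImportsSatisfiedNotification and the callback parameter, and CommandProcessor with its ReplyReady event is an invented placeholder for the real command processor and its reply event, whose names are not given in this guide.

```csharp
using System;
using System.Collections.Generic;

// Simplified stand-in for the library's IPresence interface.
public interface IPresence
{
    bool IsConnected { get; }
    void Initialize();
    void ProcessCommand(string command, int userId);
}

// Hypothetical processor facade: accepts a command and raises an event with the reply.
public class CommandProcessor
{
    public event Action<int, string> ReplyReady;

    public void Submit(string command, int userId)
    {
        // A real processor would tokenize the input and run a matching rule here.
        ReplyReady?.Invoke(userId, "You said: " + command);
    }
}

// A presence that "listens" on an in-memory channel instead of IM or email.
public class InMemoryPresence : IPresence
{
    private readonly CommandProcessor processor;
    private readonly Dictionary<string, int> userIdsByAddress;

    public List<string> Outbox { get; } = new List<string>();
    public bool IsConnected { get; private set; }

    public InMemoryPresence(CommandProcessor processor, Dictionary<string, int> userIdsByAddress)
    {
        this.processor = processor;
        this.userIdsByAddress = userIdsByAddress;
    }

    public void Initialize()
    {
        // Subscribe so replies can be sent back over this channel.
        processor.ReplyReady += (userId, reply) => Outbox.Add(userId + ": " + reply);
        IsConnected = true;
    }

    // Called by the transport when a message arrives from a given address.
    public void OnMessageReceived(string fromAddress, string text)
    {
        // Messages from unknown senders are ignored in this sketch.
        if (userIdsByAddress.TryGetValue(fromAddress, out int userId))
        {
            ProcessCommand(text, userId);
        }
    }

    public void ProcessCommand(string command, int userId)
    {
        processor.Submit(command, userId);
    }
}
```

A real presence would replace OnMessageReceived with whatever listener the protocol provides, and replace the Outbox list with an actual send over that protocol.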

More Resources

The following are other resources you can use if you have questions about this project.

Please note that as we become aware of common questions, we will expand this documentation to address them.

To Contribute

We welcome all who would like to contribute to this project!  You may email me for more information at ghostwrtrone at yahoo dot com.



Last edited Dec 3, 2012 at 11:59 PM by GhostWrtrOne, version 6
