Creating Mix.nlu models

Use Mix.nlu to build a highly accurate, high quality custom natural language understanding (NLU) system quickly and easily, even if you have never worked with NLU before.

About Mix.nlu

Mix.nlu provides a convenient web front-end allowing you to:

The goal is to create application-specific language models (ALMs) for Natural Language Understanding (NLU) and domain language models (DLMs) for Automatic Speech Recognition (ASR). Model resources are built and deployed from the Mix Project Dashboard.

Client applications can then harness these models to transcribe speech into text using the ASR as a Service gRPC API and interpret text meaning using the NLU as a Service gRPC API.

The underlying ontology developed in Mix.nlu is also shared with the Mix.dialog tool, which is used to design conversational agent models that can leverage ASR and NLU resources to interpret user intent and respond appropriately to what people write and say. Client applications harness dialog models using the Dialog as a Service gRPC API.

Mix.nlu is the departure point for this conversational AI journey.

Note that [Mix.api](../mix-api/v4/#mix-api) provides a REST API that lets you programmatically interact with your Mix models to perform different tasks, including authoring.

Model development workflow

The following steps summarize the workflow to develop, deploy, and iterate on an NLU model and optionally a recognition-only domain language model (DLM):

  1. Create a project: The first step is to create a project in Mix.dashboard. This project contains all the data necessary for building your models.
  2. Develop your model: You then develop your model in Mix.nlu by creating your ontology and adding training samples.
  3. Train your model: Training is the process of having the model learn model parameters based on the training data that you have provided.
  4. Test your model: After you train your model, use the Try panel to test it interactively on sample sentences and tune it.
  5. Build your model: When you make a build, you create a model version, which is a snapshot of your model as it exists now.
  6. Create your application configuration: To use your model in an application, you create your application configuration, which is the combination of the model versions that you want to use in your application (for example, Mix.asr model v2 with Mix.nlu model v3 for project CoffeeMaker).
  7. Deploy your application configuration to an environment that is accessible by your application.
  8. Discover what your users say: Collect feedback on how well your model is performing by viewing how the model handled actual user utterances in the deployed application configuration.
  9. Circle back to step 2, refining the model based on insight from user data.

Open the project in Mix.nlu

To open a project in Mix.nlu:

  1. From Mix.dashboard, select your project in the Projects list.
  2. Click the .nlu icon.

Mix.nlu UI overview

The Mix.nlu interface is divided into three tabs, each offering functionality to help you develop, optimize, and refine your NLU model.

  1. Develop tab: Define the types of things your users will say, create and annotate examples of these sentences, and use these to train and test your model. The Develop tab offers a simpler interface intended for novice users working on smaller projects.
  2. Optimize tab: Offers the same functionality as the Develop tab, with more advanced tooling to optimize development. The Optimize tab is intended for more advanced users working on larger projects.
  3. Discover tab: For projects with a deployed application configuration, this tab shows recent data on what real users said, with information on how well your model understood what the users were saying. This gives useful feedback to further refine the model.

Click on one of the icons to enter the tab.

About the Mix.nlu Develop tab

You use the Mix.nlu Develop tab to create intents and entities, add samples, try your model, and then train it.

Develop tab

When you open the Develop tab, you see the following elements:

View samples for an intent

In the intents area, click an intent to select it. The list of intents is replaced with a view of that intent, any entities linked to it, and a table of its samples. A newly created intent has no linked entities and no samples; you link entities as needed and add samples.

View samples

If there are a lot of samples under the chosen intent, the samples will be displayed in pages. By default, 50 samples are shown per page. Controls at the bottom of the samples area let you navigate from page to page as well as change the number of samples displayed per page.

Multiple language support

Mix.nlu supports multiple languages (or locales) per project. As you can imagine, sample phrases of what your users may say will differ from one language to another. Your samples, therefore, will be different per language/locale.

To filter the list of samples, select the language code from the menu near the name of your project. (If your project includes a single language, no menu appears.)

For example, this project supports three locales, with en_US currently selected:

[Screenshot: language menu with en_US selected]

Mix.nlu also allows you to define different literals for list-type entity values per language/locale. This allows you to support the various languages in which your users might ask for an item, such as "coffee", "café", or "kaffee" for a "drip" coffee. More information on how to do this is provided in the sections that follow.

Develop your model

To develop your model, you:

  1. Add intents to your model. An intent defines and identifies an intended action. An utterance or query spoken by a user will express an intent, for example, to order a drink. As you develop an NLU model, you define intents based on what you expect your users to do in your application.
  2. Add entities to your model. Entities identify details or categories of information relevant to your application. While the intent is the overall meaning of a sentence, entities and values capture the meaning of individual words and phrases in that sentence.
  3. Link your entities to your intents. Intents are almost always associated with entities that serve to further specify particulars about the intended action.
  4. Add samples. Samples are typical sentences that your users might say. They teach Mix how your users will interact with your application.
  5. Annotate your samples. Once you define entities in an ontology, you need to annotate the tokens within the samples so that the machine learns.
  6. Modify intents and annotations. Make any required modifications to your intents and annotations.
  7. Verify samples before training. As a final step, review the verification status of each sample phrase or sentence. This is an essential step that has a direct impact on the accuracy of the data used to create your model(s).
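
The relationships among intents, entities, and samples described in the steps above can be pictured as a small data structure. The following Python sketch is for illustration only: the class and field names (`Entity`, `Intent`, `Sample`, `link`) are hypothetical and do not reflect how Mix stores a project.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    name: str  # for example, COFFEE_TYPE


@dataclass
class Sample:
    text: str  # a sentence a user might say
    # Maps each annotated literal to the entity name it is labeled with.
    annotations: dict = field(default_factory=dict)


@dataclass
class Intent:
    name: str
    linked_entities: list = field(default_factory=list)
    samples: list = field(default_factory=list)

    def link(self, entity: Entity) -> None:
        """Link an entity to this intent (step 3), ignoring duplicates."""
        if entity.name not in self.linked_entities:
            self.linked_entities.append(entity.name)


# Steps 1 to 5 for a single sample (hypothetical names throughout).
order = Intent("ORDER_COFFEE")
for entity in (Entity("COFFEE_SIZE"), Entity("COFFEE_TYPE")):
    order.link(entity)

sample = Sample("I want a large cappuccino")
sample.annotations = {"large": "COFFEE_SIZE", "cappuccino": "COFFEE_TYPE"}
order.samples.append(sample)
```

The sketch keeps the link between an intent and its entities separate from the annotations on each sample, mirroring the separate linking and annotating steps in the list above.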

Add intents to your model

An intent is something a user might want to do in your application. You might think of intents as actions (verbs); for example, to order. For more information about intents, see Intents.

To add intents to your model:

  1. In Mix.nlu, click the Develop tab.
  2. On the Intents bar, click the plus (+) icon to add an intent.
  3. Type the name of your intent (for example, ORDER_COFFEE) and press Enter.

The intent name is added to the list of intents.

Edit an intent name

To edit an intent name:

  1. In the Develop tab intents list, open the menu for the intent.
  2. Select Edit intent name. You can now edit the text of the intent name.
  3. Make the edits to the intent name.
  4. Press Enter or click the check icon to make the change. If you instead want to cancel the edit and go back to the existing name, press Escape or click the x icon.

Add entities to your model

Entities collect additional important information related to your intent. You might think of entities as analogous to variable slots or parameters that, when filled in with user-provided details, make the intent specific and actionable.

For example, if a user has the intent to order a coffee-based drink, the user needs to tell the agent what type of coffee they want, how big a cup they want, any flavoring they want to add, and so on. These details vary from order to order, but some of them always need to be specified to make a coffee order. In this case you might include entities such as COFFEE_TYPE, COFFEE_SIZE, and FLAVOR.

Each entity, as a variable, can take on some set of possible values. For example, when a user wants to order a coffee and says "Can I have a large vanilla latte," entities take on the following values:

    • COFFEE_SIZE: large
    • FLAVOR: vanilla
    • COFFEE_TYPE: latte

This section describes how to create and define custom entities, which are specific to the project. It also describes the configurable settings for entities.

Note that when you define entities for your intents, you can also use predefined entities. These have already been defined, which saves you the trouble of creating them from scratch.

Examples of predefined entities include:

For more information, see Predefined entities.

To simplify your model, avoid adding a unique entity for each instance of a similar item. Instead, add a single entity that describes a general type of item. For example, if you are making a model that will handle orders for Cappuccino, Espresso, and Americano, it doesn't make sense to treat these as different entities, because they are closely related. It makes sense to treat these as different values of a common entity named COFFEE_TYPE.

Data types

An entity is like a variable containing a piece of information relevant to an intent. Like a variable in a computer program, an entity in Mix can be specified with a data type aligned with the kind of contents the entity will hold. Entities in Mix are shared between Mix.nlu and Mix.dialog. The data type forms a contract between Mix.nlu and Mix.dialog that allows dialog designers to use methods and formatting appropriate to the data type of the entity in messages and conditions.

The available data types are as follows:

| Data type | Description | Use case examples |
|---|---|---|
| Generic | Text data without any special format | A name of a person, names of product types |
| Yes/No | Yes or no | The answer to a yes/no question |
| Boolean | True or false | The answer to a true/false question |
| Number | A numerical quantity | A quantity measured with a whole number or decimal |
| Digits | A sequence of digits from 0-9 | A PIN, an ID code |
| Alphanumeric | A sequence of letters or numbers, A-Z, a-z, 0-9 | A user name, an ID code, a license plate number |
| Date | A YYYYMMDD date | A calendar date |
| Time | An HHMM time | A clock time |
| Amount | A quantity with units, defined by the magnitude and units | A monetary amount |
| Distance | A measure of distance, including magnitude and distance unit | Distance in kilometers, meters, miles, and so on |
| Temperature | A measure of temperature, including possibly signed magnitude and units | Temperature in Celsius or Fahrenheit |

In a project that existed before data types were available, previously created entities initially have a special data type of "Not set." This behaves the same as the Generic type. You cannot set a newly created entity to Not set.

Collection methods

An entity also has a collection method. The collection method describes how the set of possible values of the entity is enumerated or defined.

The entity collection methods are as follows:

| Collection method | Description |
|---|---|
| List | A list entity has possible values that can be enumerated in a list. For example, if you have defined an intent called ORDER_COFFEE, the entity COFFEE_TYPE would have a list of the types of drinks that can be ordered. See List entities. |
| Regex-based | A regex-based entity defines a set of possible structured text string values using a regular expression pattern. For example, account numbers, postal (zip) codes, order numbers, and other pattern-based formats. See Regex-based. |
| Rule-based | A custom rule-based entity defines a set of values based on a GrXML grammar file. While regular expressions can be useful for matching patterns in text-based input, grammars are useful for matching multi-word patterns in spoken user inputs. This type is only available for some users. See Rule-based entities. |
| Relationship | A relationship entity has a specific relationship to one or more existing entities, with either a subtype (isA) or composition (hasA) relationship. For example, a NAME hasA FIRST_NAME and hasA LAST_NAME; a DESTINATION isA LOCATION. See Relationship entities. |
| Freeform | A freeform entity is used to capture user input that you cannot enumerate in a list. For example, a text message body could be any sequence of words of any length. In the query "send a message to Adam hey I'm going to be ten minutes late", the phrase "hey I'm going to be ten minutes late" becomes associated with the freeform entity MESSAGE_BODY. See Freeform entities. |

The collection method determines how the NLU service will look for and collect matches for the entity in user text input. If the data type specifies what is collected, the collection method specifies how it is collected. Choosing the right collection method makes it easier for your semantic model to pick out the appropriate entity content and interpret entity values from user utterances.
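
As an illustration of the idea behind the regex-based collection method, a pattern for a structured value such as a US ZIP code might look like the following Python sketch. The pattern and the function name `find_zip_codes` are assumptions chosen for the example; the sketch shows general pattern matching and makes no claim about the regular expression syntax or matching behavior that Mix.nlu uses.

```python
import re

# Hypothetical pattern: five digits, optionally followed by a hyphen and
# four more digits. Word boundaries keep it from matching inside longer numbers.
ZIP_CODE = re.compile(r"\b\d{5}(?:-\d{4})?\b")


def find_zip_codes(text: str) -> list:
    """Return every substring of text that matches the ZIP code pattern."""
    return ZIP_CODE.findall(text)


matches = find_zip_codes("Ship it to 02139 or 94105-1234")
```

A pattern like this suits values whose shape is fixed but whose members cannot reasonably be listed one by one, which is the distinction the table draws between the list and regex-based methods.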

Data type and collection method compatibility

Specific data types are compatible with some collection methods but not with others. Each data type has a default collection method which will be set initially if a data type is selected but the collection method is not specified.

| Data type | Compatible collection methods | Default collection method |
|---|---|---|
| Generic | All collection methods | List |
| Yes/No | List; Rule-based; Relationship isA YES_NO | Relationship isA YES_NO |
| Boolean | List; Rule-based; Relationship isA nuance_BOOLEAN | Relationship isA nuance_BOOLEAN |
| Number | Rule-based; Regex; Relationship isA nuance_CARDINAL_NUMBER; Relationship isA nuance_DOUBLE; Relationship isA nuance_NUMBER | Relationship isA nuance_NUMBER |
| Digits | Rule-based; Regex; Relationship isA nuance_CARDINAL_NUMBER | Relationship isA nuance_CARDINAL_NUMBER |
| Alphanumeric | List; Rule-based; Regex | List |
| Date | Rule-based; Relationship isA DATE | Relationship isA DATE |
| Time | Rule-based; Relationship isA TIME | Relationship isA TIME |
| Amount | Rule-based; Relationship isA nuance_AMOUNT | Relationship isA nuance_AMOUNT |
| Distance | Rule-based; Relationship isA nuance_DISTANCE | Relationship isA nuance_DISTANCE |
| Temperature | Rule-based; Relationship isA nuance_TEMPERATURE | Relationship isA nuance_TEMPERATURE |

When creating a new entity, Mix will support you in selecting a compatible collection method. When you first create your entity, Mix will automatically assign the default compatible collection method.

If you then decide to choose a different collection method, Mix will give you recommendations for the most compatible collection methods and advise you on which collection methods are not recommended for the data type.

If you use Relationship isA as a collection method, the predefined entities available to choose from for the isA relationship will be restricted based on what is compatible with the chosen data type. For example, if your data type is Date, Mix will allow you to choose Relationship isA DATE.

Use the Generic data type if you want an entity with the Relationship isA collection method to point to a predefined entity that is not covered by the other data types, for example nuance_DURATION or nuance_QUANTITY.
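
The compatibility rules in this section can be read as a simple lookup. The following Python sketch transcribes the compatibility table into two dictionaries and checks a combination against them. The names `COMPATIBLE`, `DEFAULT`, and `is_compatible`, and the shorthand "isA X" for "Relationship isA X", are assumptions made for the example; this is not how Mix itself performs the check.

```python
# "isA X" is shorthand for the Relationship isA collection method
# pointing at predefined entity X.
COMPATIBLE = {
    "Yes/No": ["List", "Rule-based", "isA YES_NO"],
    "Boolean": ["List", "Rule-based", "isA nuance_BOOLEAN"],
    "Number": ["Rule-based", "Regex", "isA nuance_CARDINAL_NUMBER",
               "isA nuance_DOUBLE", "isA nuance_NUMBER"],
    "Digits": ["Rule-based", "Regex", "isA nuance_CARDINAL_NUMBER"],
    "Alphanumeric": ["List", "Rule-based", "Regex"],
    "Date": ["Rule-based", "isA DATE"],
    "Time": ["Rule-based", "isA TIME"],
    "Amount": ["Rule-based", "isA nuance_AMOUNT"],
    "Distance": ["Rule-based", "isA nuance_DISTANCE"],
    "Temperature": ["Rule-based", "isA nuance_TEMPERATURE"],
}

DEFAULT = {
    "Generic": "List",
    "Yes/No": "isA YES_NO",
    "Boolean": "isA nuance_BOOLEAN",
    "Number": "isA nuance_NUMBER",
    "Digits": "isA nuance_CARDINAL_NUMBER",
    "Alphanumeric": "List",
    "Date": "isA DATE",
    "Time": "isA TIME",
    "Amount": "isA nuance_AMOUNT",
    "Distance": "isA nuance_DISTANCE",
    "Temperature": "isA nuance_TEMPERATURE",
}


def is_compatible(data_type: str, method: str) -> bool:
    """Return True if the collection method is listed as compatible."""
    if data_type == "Generic":  # Generic accepts all collection methods
        return True
    return method in COMPATIBLE[data_type]
```

For example, `is_compatible("Date", "Regex")` is false under this table, which corresponds to a combination for which Mix would show a compatibility warning.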

Why is compatibility important?

Choosing collection methods compatible with the data type helps Dialog work more effectively when Dialog is using the NLU service for interpretation of the text of user inputs. In this case NLU is more likely to capture entity values whose format aligns with the format of the data type Dialog expects. This allows you to more effectively tune conditions and message formatting in your dialog flows.

Impacts of changes to data type or collection method

If you try to change either the data type or the collection method in a way that would break compatibility, you will receive a warning, and be invited to select a collection method compatible with your data type.

You can however choose to ignore the compatibility warning and proceed.

Create an entity

To add entities to your model:

  1. On the Entities bar, click the search bar.
  2. Type the name of the entity (for example, COFFEE_TYPE) and click the Add entity icon.
    A menu Add a new entity to collect appears.
  3. Under Add a new entity to collect, select a data type for your entity.
  4. Click the Add entity icon to create the new entity.
  5. Click the name of the new entity in the Custom Entities list to open the entity editor and perform additional configurations. The entity editor appears. It contains two sections: Data type and Advanced settings. The Data type section is collapsed initially, but allows you to modify the data type. The Advanced settings section allows you to set other configurable items for the entity.
  6. The default collection method for your chosen data type is preselected. If you want to keep the default and it is not the list collection method, you are done. If you want to keep the default and it is list, proceed to step 8. Otherwise, click the Edit collection method toggle and continue with step 7.
  7. Under How you are collecting from the user, select a collection method for the entity. Mix gives you a short list of recommended collection methods for your chosen data type. Again, the most recommended default option for your data type is preselected.
  8. Select the Sensitive checkbox if your entity will collect sensitive data that should not appear in call logs.
  9. Configure other details of the entity as appropriate (see the Advanced settings table below for a description of the fields).

Advanced settings

The following settings are available in the advanced settings section. Note that some of these are applicable only when specific collection methods are selected.

| Field | Description |
|---|---|
| Collection method | Specifies the type of entity. Selectable under How you are collecting info from the user. |
| Referenced as | Defines how the entity can be referred to; for example, whether it is referring to a person (CONTACT: "him"), a place (CITY: "there"), a thing (APPOINTMENT: "it"), or a moment in time (APPOINTMENT_TIME: "then"). These are used for handling anaphoras in dialogs. |
| Sensitive | Indicates whether the entity contains sensitive personally identifiable information. Values assigned at runtime to any entity marked as Sensitive appear in call logs as a masked value. For more details, see Handling sensitive information. Note: This only applies to call logs, not diagnostic logs. |
| Dynamic | (Appears when editing entities with list collection method only) Indicates if the entity is dynamic or not. Dynamic list entities allow you to upload data dynamically at runtime. See Dynamic list entities. |
| Literals | (Appears when editing entities with list collection method only) Lets you enter literals and values. A set of literals is the range of tokens in a user's query that corresponds to a certain entity. With literals, you can specify misspellings and synonyms for an entity's value. For example, in the queries "I'd like a large t-shirt" and "I'd like t-shirt, size L", the literals corresponding to the entity SHIRT_SIZE are "large" and "L", respectively. In both cases, the value is the same. Literals can be paired with values, which are then returned in the NLU interpretation result. For example, "small", "medium", and "large" can be paired with the values "S", "M", and "L". For projects that include multiple languages, you can specify variations per language/locale for an entity value. See List entities for details. Note: There is a limit to the number of literals that you can enter. See Limits for more information. |
| Your relationships | (Appears when editing entities with relationship collection method only) Lets you define the entity in relation to other user-defined or predefined entities. |
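
The pairing of literals with values described for the Literals field can be pictured as a many-to-one mapping. The following Python sketch uses the SHIRT_SIZE example from the table; the dictionary layout, the misspelling "lrage", and the function name `resolve` are hypothetical and are not the format Mix uses to store or return literals.

```python
from typing import Optional

# Hypothetical literal-to-value pairs for a SHIRT_SIZE list entity.
# Several literals (synonyms, abbreviations, misspellings) share one value.
SHIRT_SIZE = {
    "small": "S",
    "medium": "M",
    "large": "L",
    "l": "L",       # abbreviation used as a literal
    "lrage": "L",   # an invented misspelling, for illustration
}


def resolve(literal: str) -> Optional[str]:
    """Return the value paired with a literal, or None if it is not listed."""
    return SHIRT_SIZE.get(literal.strip().lower())
```

With this mapping, both "large" and "L" resolve to the same value "L", which is the behavior the t-shirt example describes.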

Link your entities to your intents

The next step is to link your entities to your intents so that they can be interpreted.

For example, if you have an intent called ORDER_COFFEE that uses the COFFEE_SIZE and COFFEE_TYPE entities, you need to link these entities with the ORDER_COFFEE intent. You also need to link any predefined entities that you want to use.

To link entities to your intents:

  1. On the Intents bar, select the intent.
  2. Click the link entity plus (+) icon and select the entity to link.
    link_entity_to_intent
  3. Repeat for each entity that you want to link to the intent.

Add samples

Samples are typical phrases or sentences that your users might say. They teach Mix how your users think (their mental models) when interacting with your application.

If your project includes multiple languages, be sure to select the appropriate language before you start to enter samples.

You can enter a maximum of 500 characters per sample.

In Mix.nlu, you can add samples in a few different ways:

Samples can be added one at a time under a selected intent in the Develop tab. Samples can also be added up to 100 at a time in the Optimize tab.

Samples can be uploaded as a .txt file from:

    • The Develop tab
    • The Optimize tab
    • Mix.dashboard

The more samples you include for each intent, the better your model will become at interpreting.

For optimal machine learning, samples should be based on data of real-world usage.

Add samples one at a time under a selected intent

To add samples:

  1. (As required) Select the language from the menu near the name of the project.
  2. In the Intents area, click the name of the intent.
  3. In the "The user says" field, type a sample utterance and press Enter. For example, "I want a double espresso."
  4. Repeat this procedure as needed to add samples.

Import multiple samples at once using text file import

To add multiple samples at once via a .txt file upload:

  1. (As required) Select the language from the menu near the name of the project.
  2. In the intents bar, click the upload icon. An Upload a file dialog opens.
  3. Use the file picker to select a .txt file containing samples.
  4. Select an intent under which to upload the samples.
  5. Click Upload to initiate the upload.

Samples uploaded to a specific intent are attached to that intent in Mix.nlu, but the new samples carry no annotations. Add annotations after uploading.

The file upload in the Develop tab is intended for simple imports under one intent.

More advanced text file upload of samples is available in Mix.dashboard and in the Optimize tab. The dashboard and Optimize file import allow you to apply Auto-intent to the samples.

For additional details on importing samples, see Import data. For information about creating data sets see Generating data and training the initial model.
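
Before uploading, it can help to check samples against the 500-character limit mentioned earlier. The following Python sketch does that and writes the remaining samples to a text file. It assumes one sample per line and UTF-8 encoding, neither of which is specified in this section; the function names and the file name are hypothetical.

```python
from pathlib import Path

MAX_SAMPLE_CHARS = 500  # per-sample limit stated in this section


def prepare_samples(samples):
    """Split samples into (kept, rejected) based on emptiness and length."""
    kept, rejected = [], []
    for sample in samples:
        cleaned = " ".join(sample.split())  # collapse whitespace and newlines
        if cleaned and len(cleaned) <= MAX_SAMPLE_CHARS:
            kept.append(cleaned)
        else:
            rejected.append(cleaned)
    return kept, rejected


def write_samples(kept, path: Path) -> None:
    """Write samples one per line (assumed layout) as UTF-8 text."""
    path.write_text("\n".join(kept) + "\n", encoding="utf-8")


kept, rejected = prepare_samples(
    ["I want a  double espresso", "", "x" * 501]
)
# write_samples(kept, Path("order_coffee.txt"))  # hypothetical file name
```

Collapsing whitespace first means a sample is judged on the text that would actually be stored on its line, and rejected samples are returned so they can be reviewed instead of silently dropped.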

Note on samples and contractions

Contractions are common in a number of languages, in particular in many European languages like English, French, and Italian. A contraction is a shortened version of a word or group of words combined together by dropping letters and joining with an apostrophe. For example, he's and didn't in English, c'est and l'argent in French, and c'è and l'estratto in Italian.

When sample sentences are added to Mix, whether via import or by typing the sentences in the Develop tab under an intent, the sample sentence is tokenized — broken up into individual tokens (individual units of meaning, usually words) that can be marked up with annotations.

For some languages, the tokenization may work differently than you might expect when encountering contractions using an apostrophe. Sometimes, the tokenization will split the two parts at the apostrophe, with the first part, apostrophe, and second part split as separate tokens.

There is not currently a workaround for this, but be aware that you may see this behavior in some cases.
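
The splitting behavior described above can be imitated with a simple rule that treats every punctuation mark, including the apostrophe, as its own token. The following Python sketch is only an imitation for illustration; it is not the tokenizer that Mix uses, and the function name is hypothetical.

```python
import re


def split_at_punctuation(sentence: str) -> list:
    """Tokenize into runs of word characters and single punctuation marks."""
    # \w+ matches a run of letters or digits; [^\w\s] matches one
    # punctuation character, so an apostrophe becomes a separate token.
    return re.findall(r"\w+|[^\w\s]", sentence)


tokens = split_at_punctuation("he didn't")
```

Under this rule "didn't" yields three tokens, the first part, the apostrophe, and the second part, which matches the case described in the text.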

Edit the sample text

To edit the text of a sample:

  1. Open the menu for the sample.
  2. Select Edit sample.
  3. Make the edits to the sample text.
  4. Press Enter or click the check icon to make the change. If you instead want to cancel the edit and go back to the existing text, press Escape or click the x icon.

Annotate your samples

The final step in developing your training set is to annotate the literals in your samples with entities and tag modifiers.

This will help your model learn to not only interpret intents, but also the entities related to the intents.

Annotated sentence example

As a simple example, consider the following sentence for an intent ORDER_COFFEE:

I want a large cappuccino.

Suppose that this intent has two linked entities, COFFEE_SIZE and COFFEE_TYPE. You can annotate this sample sentence to indicate which entities correspond to what literals. You could annotate the sample as follows:

I want a [COFFEE_SIZE]large[/] [COFFEE_TYPE]cappuccino[/]

Here, the word large is annotated with the COFFEE_SIZE entity and cappuccino is annotated with the COFFEE_TYPE entity.
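
To make the bracket notation concrete, the following Python sketch extracts the entity and literal pairs from an annotated sample written in the notation used in this section. It handles only flat annotations, not nested ones, and the function name `parse_annotations` is hypothetical. The sketch makes no claim about how Mix stores or exports annotations.

```python
import re

# [ENTITY]literal[/] where ENTITY is letters and underscores.
ANNOTATION = re.compile(r"\[([A-Za-z_]+)\](.*?)\[/\]")


def parse_annotations(sample: str):
    """Return (plain_text, [(entity, literal), ...]) for a flat annotation."""
    pairs = [(m.group(1), m.group(2)) for m in ANNOTATION.finditer(sample)]
    plain = ANNOTATION.sub(lambda m: m.group(2), sample)  # drop the markup
    return plain, pairs


plain, pairs = parse_annotations(
    "I want a [COFFEE_SIZE]large[/] [COFFEE_TYPE]cappuccino[/]"
)
```

Here `plain` is the original sentence without markup and `pairs` lists each entity with the literal it labels.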

Annotation use cases

Be aware that some of the details of annotation depend on whether you are:

    • Annotating tokens with no previous annotations
    • Annotating previously annotated tokens

More details are available in the sections below.

Selecting tokens

To annotate a sample, you first need to select the relevant tokens in the sample that you want to annotate. Note that a literal can potentially span multiple consecutive tokens, for example, "United States of America". Click on the first and last words for the literal. This highlights and brackets the span of words you want to label. It also opens an entity selection menu to select an entity label.

If you make a mistake and need to deselect and start again, simply click anywhere on the screen. Once you have finished selecting the relevant tokens, select the appropriate entity from the menu to apply the annotation.

Annotating tokens with no previous annotations

If you are annotating a previously un-annotated span of tokens, you can choose an entity from one of two sources in the entity selection menu:

  1. From a list of entities that have already been linked to the present intent. If any entities have already been linked, these will appear at the top of the list in the menu.
  2. From one of the other user-defined or predefined entities available in your project, using Link Entity.
    1. Select Link Entity from the menu.
    2. Select Custom Entities to browse the list of user-defined entities, or Predefined Entities to browse the list of predefined entities.
    3. Select an entity to complete the annotation. This entity will also be linked to the present intent.

Annotating previously annotated tokens

If you try to annotate a span of text that has already been annotated with an entity, the Link Entity option will be unavailable.

Generally, you will also not be able to annotate that span of text with any of the other entities linked to the intent. The exception to this is if a hierarchical relationship (hasA) entity has already been linked to the intent, and the entity for the annotated text is either the inner or outer part of that relationship. In that case the other entity will be available in the list of entities and you will be able to annotate over or within the same text.

For example, suppose your intent has a linked entity FULL_NAME, which is a hasA relationship entity containing two inner entities GIVEN_NAME and FAMILY_NAME. Suppose you have a sample with the following partial annotation:

Notify [FULL_NAME]John Anderson[/].

You will still be able to annotate within this span of text to annotate John with GIVEN_NAME and Anderson with FAMILY_NAME.

You can also still apply tag modifiers, as applicable.

Tag modifiers

A tag modifier modifies or combines entities using a logical operator AND, OR, or NOT.

AND and OR modify two instances of the same entity type to represent one entity value and/or the other. NOT modifies one entity to represent not selecting that entity.

To add AND, OR, or NOT tag modifiers to your annotation, first annotate the entities you want to modify. Then select the entities to modify by clicking the first annotation and then clicking the last annotation. Select Tag Modifier and the appropriate modifier from the entity selection menu.

For example, consider the following partially annotated sentence:

I want a [COFFEE_TYPE]cappuccino[/] and a [COFFEE_TYPE]latte[/]

To annotate with the AND modifier, click the annotation for cappuccino and then the annotation for latte to select both as well as any tokens in between. With the span encompassing both COFFEE_TYPE annotations selected, choose the AND modifier in the Tag modifier sub-menu. The AND modifier is added, wrapping the two COFFEE_TYPE annotations:

I want a [AND][COFFEE_TYPE]cappuccino[/] and a [COFFEE_TYPE]latte[/][/]

Annotating with an OR modifier is similar.

To understand how to annotate with a NOT modifier, consider the following partially annotated sentence:

I would like a [COFFEE_SIZE]large[/] [COFFEE_TYPE]coffee[/] with no [SWEETENER]sugar[/].

Here you want to add a NOT annotation to the sample to help your model distinguish between asking for sweetener and asking specifically not to add sweetener. Click the word no and the SWEETENER annotation to select both, and then choose NOT from the Tag modifier sub-menu. The NOT modifier is added:

I would like a [COFFEE_SIZE]large[/] [COFFEE_TYPE]coffee[/] with [NOT]no [SWEETENER]sugar[/][/].
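
Because tag modifiers wrap other annotations, reading the bracket notation requires tracking nesting. The following Python sketch parses the notation used in this section into a small tree using a stack. The node layout (`label`, `text`, `children`) and the function name `parse_nested` are hypothetical, and the sketch says nothing about how Mix represents modifiers internally.

```python
import re

# Matches either a closing tag [/] or an opening tag such as [AND].
TAG = re.compile(r"\[/\]|\[([A-Za-z_]+)\]")


def parse_nested(sample: str) -> list:
    """Parse bracket annotations into a list of nested nodes."""
    root = {"label": None, "text": "", "children": []}
    stack = [root]
    pos = 0
    for m in TAG.finditer(sample):
        chunk = sample[pos:m.start()]
        for node in stack:  # text belongs to every annotation still open
            node["text"] += chunk
        if m.group(0) == "[/]":
            if len(stack) == 1:
                raise ValueError("closing tag without a matching opening tag")
            stack.pop()
        else:
            node = {"label": m.group(1), "text": "", "children": []}
            stack[-1]["children"].append(node)
            stack.append(node)
        pos = m.end()
    if len(stack) != 1:
        raise ValueError("unclosed annotation")
    return root["children"]


tree = parse_nested(
    "I want a [AND][COFFEE_TYPE]cappuccino[/] and a [COFFEE_TYPE]latte[/][/]"
)
```

The result has a single AND node whose text spans both drinks and whose children are the two COFFEE_TYPE annotations. A malformed tag, such as an opening bracket with no closing bracket, surfaces as a `ValueError` because its closing `[/]` has nothing to match.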

For information on verifying the status of samples, see Verify samples.

Modify intents and annotations

Mix.nlu provides various ways to modify the intents and annotations that you have added.

Fix incorrect samples

If you make typos while adding samples, or if some samples were not transcribed correctly, you should fix them to make sure that they correspond to what users actually said. This builds a better model.

To fix an incorrect sample:

  1. Click the ellipsis icon beside the sample that you want to edit and click Edit.
  2. Correct the text as appropriate.
  3. Click the checkmark to save your changes.

Edit or remove annotations

To change an entity that annotates a sample:

  1. Click the entity in the sample then click Remove.
  2. To choose a new entity, click the literal and choose a new entity.

Change intent

To assign one or more samples to a different intent, use the Move selected Samples dialog. When moving sample sentences, you can choose to also move or delete any annotations that you've made.

You can move the samples to either an existing intent, or a new intent that you create on the fly.

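There are three ways to initiate a change of intent for samples:

    • The ellipsis menu for a sample
    • The intents dropdown for a sample (Optimize tab)
    • The checkboxes, together with the change intent icon in the header bar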

To assign one or more sample sentences to a different intent:

  1. Select one or more samples. You can click the ellipsis icon ellipsis icon or the intents dropdown (Optimize tab) for the sample to select a single sample, or use the checkboxes to select one or more samples.
  2. Select to move sample using one of the available ways:
    • If using the ellipsis menu, click Move sample.
    • If using the intents dropdown in Optimize, select one of the existing intents or create a new one. If you choose NO_INTENT or UNASSIGNED_SAMPLES, or create a new intent, the sample will be moved immediately to the chosen intent, and you will be done. Otherwise, proceed to step 3.
    • If selecting with checkboxes, click the change intent icon in the header bar. This launches the Move sample(s) dialog.
  3. In the Move samples dialog, if not done in the previous step, select an existing intent to move to, or create a new one. If choosing an existing Intent, pick a specific other intent, NO_INTENT, or UNASSIGNED_SAMPLES. If creating a new Intent, enter a name for the new intent.
  4. Click Move to proceed.
    Mix.nlu will review the samples you are moving, the entity annotations for those samples, the target intent, and its linked entities as applicable. In the following cases, Mix.nlu will simply proceed with the move, and you will be done (otherwise proceed to step 5):
    • The samples do not contain annotations
    • You are moving the samples to a newly created intent. In this case, the entities will automatically be linked to the new intent upon moving.
    • You are moving to an existing intent and the entities in the annotations are all already linked to the new intent
  5. If the samples do contain annotations, and some of the entities are not already linked to the target intent, you will be invited to either keep the annotations and import the entities or remove them from the samples. (This choice is not available when moving samples to UNASSIGNED_SAMPLES. Annotations will be removed if moving to UNASSIGNED_SAMPLES.)
  6. Click Move.
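
The check described in steps 4 and 5 can be sketched as a small predicate. This is an illustration of the documented rules only, not Mix.nlu code: the function name and parameters are hypothetical, and the special case of moving to UNASSIGNED_SAMPLES (where annotations are always removed) is deliberately not modeled.

```python
def move_needs_confirmation(sample_entities, target_linked_entities,
                            target_is_new_intent):
    """Return True when the user has to choose between importing the
    entities or removing the annotations before the move can complete."""
    if not sample_entities:
        return False   # the samples do not contain annotations
    if target_is_new_intent:
        return False   # entities are linked to the new intent automatically
    # Existing intent: ask only if some entity is not linked to it yet.
    return not set(sample_entities) <= set(target_linked_entities)
```

For example, moving samples annotated with `COFFEE_SIZE` to an existing intent that only links `COFFEE_TYPE` returns `True`, which corresponds to step 5.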

The verification status of the samples after the move depends on the initial verification state and how sample entities are being handled.

| Initial verification status | Final verification status |
| --- | --- |
| Excluded | Excluded flag removed. Goes to either Intent-assigned or Annotation-assigned depending on native state and previous considerations. |
| UNASSIGNED_SAMPLES | Goes to Intent-assigned. |
| Existing intent, Intent-assigned | Goes to Intent-assigned. |
| Existing intent, Annotation-assigned | If removing entity annotations, goes to Intent-assigned. If not removing entity annotations, goes to Annotation-assigned. |
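
The status rules above can be expressed as a small function. This is a sketch under stated assumptions: the function name and the lowercase state strings are hypothetical, and for excluded samples it assumes that "native state" means the state the sample had before it was excluded, falling back to Intent-assigned when that state is unknown.

```python
def status_after_move(initial, remove_annotations, native=None):
    """Return the verification status of a sample after a move.

    initial: 'excluded', 'unassigned', 'intent-assigned' or
             'annotation-assigned'.
    native:  for excluded samples, the state held before exclusion
             (an assumption about what "native state" means).
    """
    if initial == "excluded":
        # The excluded flag is dropped and the underlying state is evaluated.
        if native in (None, "excluded"):
            native = "intent-assigned"   # assumed fallback
        return status_after_move(native, remove_annotations)
    if initial in ("unassigned", "intent-assigned"):
        return "intent-assigned"
    if initial == "annotation-assigned":
        return "intent-assigned" if remove_annotations else "annotation-assigned"
    raise ValueError("unknown status: " + initial)
```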

Assign NO_INTENT

Sometimes an entity applies to more than one intent or, to look at it another way, an entity can mean different things depending on the dialog state. Rather than add this entity to multiple intents, it's best to use NO_INTENT.

Consider these two example interactions. The first one is in the context of booking a meeting.

User: Create a meeting
System: For when?
User: Tomorrow at 2

This second example is in the context of booking a flight.

User: Book flight to Paris
System: For when?
User: Tomorrow at 2

In each of these interactions, there is a clear intent in the user's first statement, but the second utterance on its own has no clear intent.

In this case, it's best to tag "Tomorrow at 2" as [nuance_CALENDARX]Tomorrow at 2[/] to cover both scenarios (and not as [MEETING_TIME]Tomorrow at 2[/] or [FLIGHT_DEPARTURE_TIME]Tomorrow at 2[/]).

As shown in the examples, often these words or phrases are fragments and are used in a dialog as follow-up statements or queries.

NO_INTENT can also be used to support the recognition of global commands like "goodbye," "agent" / "operator," and "main menu" in dialogs. For more information, see configure global commands in the Mix.dialog documentation.

Verify samples before training

Before generating models, verify your training sample data. This step involves reviewing each sample phrase or sentence for intents and entities and ensuring that they have been assigned the correct status. It also involves confirming which samples to include in the training set for the model, and which to exclude.

This process improves your model's accuracy.

Verification of the sample data needs to be carried out for each language in the model, and for each intent.

Open and view samples by language and intent

To get started, open up the set of sample sentences for the language and intent.

  1. Open the Develop tab.
  2. (For multi-language projects) Select the language from the menu near the name of the project.
  3. Click an intent to view the samples.

Display status information

By default, status information for samples is not displayed. To see the status information, click the status visibility toggle.

Status icons will then appear to the left of the sample items (or on the right for samples in right-to-left scripts).

View samples with status

In the same area as the Status visibility toggle are toggles for:

Overview of verification states

Samples can be in the following verification states:

| Icon | State | Description |
| --- | --- | --- |
| Half-filled circle | Intent-assigned | Indicates that the sample has been assigned an intent. For example, via .txt or TRSX file upload, by adding a sample using Try, or by manually adding a sample phrase or sentence to an intent in the Mix.nlu UI. Sample may or may not be annotated. Impact of this state on the model: Samples assigned this state will only be used to detect the intent. The data provided by this sample will not be used to detect the presence of entities. |
| Filled circle | Annotation-assigned | Indicates that the sample has been assigned an intent and annotation is complete. Sample can be annotation-assigned via TRSX file upload or in the Mix.nlu UI. Sample may or may not be annotated. Impact of this state on the model: Samples assigned this state are used to detect the intent as well as any annotated entities. If such a sample contains a literal that appears in an entity but is not annotated, it will be used as a "counter example" for that entity; that is, it will lower the chance of such entity literals being detected. |
| "Pause" icon | Excluded | Indicates that the sample, although assigned an intent, is to be excluded from the model. Sample can be excluded in the UI or via TRSX file upload. Sample may or may not be annotated. Impact of this state on the model: Samples assigned this state are excluded. |

Samples assigned to UNASSIGNED_SAMPLES, either via .txt or TRSX file upload or manually in the UI, do not have a status icon. These samples contain no annotations and are excluded from the model.
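
As a compact restatement of how each state affects training, the mapping below records whether a sample contributes to intent detection and to entity detection. It is an illustrative sketch; the function name and the lowercase state strings are hypothetical and are not identifiers used by Mix.nlu.

```python
def training_roles(state):
    """Return (used_for_intent, used_for_entities) for a verification state."""
    roles = {
        "intent-assigned": (True, False),      # intent detection only
        "annotation-assigned": (True, True),   # intent and annotated entities
        "excluded": (False, False),            # left out of the model
        "unassigned": (False, False),          # UNASSIGNED_SAMPLES, no status icon
    }
    return roles[state]
```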

Exclude or include samples

You can exclude a sample from your model without having to delete and then add it again. By default, new samples are included in the next model that you build. By excluding a sample, you specify that you do not want it to be used for training a new model. For example, you might want to exclude a sample from the model that does not yet fit the business requirements of your app.

To exclude a sample, click the ellipsis icon beside the sample and then choose Exclude.

An excluded sample appears with gray diagonal bars and the status icon changes to indicate it is excluded.

You can still modify the excluded sample. Any annotations that were attached to the sample before it was excluded are saved in case you want to re-include it later.

To include a previously excluded sample, either use the ellipsis icon menu or click on the status icon. The sample is restored to its previous state with any previous intent and annotations restored.

Change the status of a sample

When you start annotating a sample assigned to an intent, its state automatically changes from Intent-assigned to Annotation-assigned. This signals to Mix.nlu that you intend to add the sample to your model(s). You can always choose to assign a different state to the sample; for example, to exclude it (change the state to Excluded) or to use it to detect intent only (change to Intent-assigned).

To change the status of a sample, hover over the status icon and click. This will allow you to change the state from Intent-assigned to Annotation-assigned or vice-versa.

Filter displayed samples by status

When there are a lot of samples for an intent, you may want to filter the displayed samples by status. To do this, open the drop-down menu next to the status visibility toggle to choose the status to display.

Notes

Bulk operations

For convenience, bulk operations are available to allow you to perform actions on multiple samples within an intent at once. You can include or exclude samples, assign them as Intent-assigned, or assign them as Annotation-assigned. You can also choose to remove the selected samples or move them to another intent.

Before you can apply a bulk operation, you first need to select one or more samples.

There are a few ways to do this.

To choose a few samples on the present page, use the check boxes beside the samples to individually select the samples.

Alternatively, you can select all samples on the current page by clicking the Select this page check box above the list of samples. Clicking the check box beside an individual selected sample deselects that sample.

There is an indicator on the row above the samples indicating how many samples are currently selected out of how many total samples. When you have not yet selected samples, this will show 0 / total samples. The total samples count is shown as a hyperlink. Clicking the total selects all samples on all pages.

Deselecting an individual sample when all samples on all pages are selected deselects that sample, as well as the samples on the other pages not currently displayed.

Changing the number of rows per page or navigating to a different page within the intent will not affect the current selection if no other changes are made.

However, all selected samples will be deselected if you do any of the following:

Once you have selected a set of samples, apply the bulk operation to the selected samples by clicking the appropriate icon in the row above the samples.

The general idea here is that bulk operations apply to all selected samples, but there are operation-specific particularities you should be aware of.

| Operation | Notes on behavior |
| --- | --- |
| Exclude | Already excluded samples will stay as-is. Intent-assigned and Annotation-assigned samples will be excluded, but the previous state, including any assigned intent and annotations, will be remembered in case you want to re-include the sample. |
| Include | Already included samples will stay as-is. Previously excluded samples will be re-included with the same verification state as they had before being excluded. |
| Intent-assigned | Excluded samples are not impacted and stay excluded. |
| Annotation-assigned | Excluded samples are not impacted and stay excluded. |
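
The operation-specific behavior above can be modeled as a small state update. The sketch below is illustrative only: the `Sample` class, the `bulk_apply` function, and the state strings are hypothetical, and re-including a sample whose earlier state is unknown falls back to Intent-assigned as an assumption.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Sample:
    text: str
    state: str                       # 'intent-assigned', 'annotation-assigned' or 'excluded'
    previous: Optional[str] = None   # state remembered while the sample is excluded

def bulk_apply(samples, operation):
    """Apply one bulk operation to every selected sample, in place."""
    for s in samples:
        if operation == "exclude":
            if s.state != "excluded":            # already excluded: stay as-is
                s.previous, s.state = s.state, "excluded"
        elif operation == "include":
            if s.state == "excluded":            # already included: stay as-is
                s.state = s.previous or "intent-assigned"  # assumed fallback
                s.previous = None
        elif operation in ("intent-assigned", "annotation-assigned"):
            if s.state != "excluded":            # excluded samples are not impacted
                s.state = operation
        else:
            raise ValueError("unknown operation: " + operation)
```

Excluding and then including a sample returns it to the state it had before, which matches the "remembered" behavior described for Exclude and Include.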

Only visible samples can be selected for bulk changes, that is, samples that have not been filtered from the view.

It is also possible to perform bulk operations in the Optimize tab. The Optimize tab allows a broader set of operations which can be applied across all intents rather than just one.

Train your model

Training is the process of building a model based on the data that you have provided.

If your project (or locale) contains no samples, you cannot train a model. You need at least one sample sentence that is either intent-assigned or annotation-assigned. Be sure to verify samples.

Developing a model is an iterative process that includes multiple training passes. For example, you can retrain your model when you add or remove sample sentences, annotate samples, verify samples, include or exclude certain samples, and so on. When you change the training data, your model no longer reflects the most up-to-date data. When this happens, retrain the model so that you can test the changes and expose any errors and inconsistencies.

Training a model

To train your model:

  1. In Mix.nlu, click the Develop tab.
  2. (As required) Select the locale from the menu near the name of the project.
  3. Click Train Model.

Mix.nlu trains your model. This may take some time if you have a large training set. A status message is displayed when your model is trained.

To view all status messages (notifications), open the Console panel.

Training a model that includes prebuilt domains

If you have imported one or more prebuilt domains, click the Train Model button to choose to include your own data and/or the prebuilt domains. Since some prebuilt domains are quite large and complex, you may not want to include them when training your model.

To train your model to include one or more domains:

  1. Click the arrow beside Train Model.
    The list of prebuilt domains is displayed in addition to your own data.
    In the example below, the Nuance TV and Nuance Weather prebuilt domains have been imported into the project:
  2. Check the domains you want to include.
  3. Check My data to include your data.
  4. Click Train Model.

Training warning and error logs

Sometimes during the training process, issues can arise with the training set. This can result in either warnings or errors or both.

Errors are more serious issues that cause the training to fail outright.

Warnings are other issues that are not serious enough to make the training fail but nevertheless need to be brought to your attention.

Samples that contain invalid characters, and entity literals and values that contain invalid characters, are skipped during training, but training continues. Such a sample is set to excluded in the training set so that it will not be used in the next training run or build.

Detailed information about any errors and warnings encountered during training is provided as a downloadable log file in CSV format. If only warnings are encountered, a warning log file is generated. If any errors are encountered, an error log file is generated describing errors and also any warnings.

A download link appears next to the Train Model button. The type of log file (error or warning) is indicated by an icon beside the link: an error icon for errors and a warning icon for warnings. Click the link to download the CSV file.

The file includes one line for each error and/or warning encountered, with two columns. One column gives the severity of the issue, either WARNING or ERROR, while the other column gives a message containing details.
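
Once downloaded, a log in this two-column layout is easy to summarize with a script. The following is a minimal sketch using Python's standard `csv` module. It assumes the severity column comes first and the message column second, which this guide does not specify, and the function name and example messages are hypothetical.

```python
import csv
import io
from collections import Counter

def summarize_training_log(csv_text):
    """Count issues by severity and collect the messages of ERROR rows."""
    counts = Counter()
    errors = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 2:
            continue                 # skip blank or malformed lines
        severity, message = row[0].strip().upper(), row[1].strip()
        if severity not in ("WARNING", "ERROR"):
            continue                 # for example a header row, if one is present
        counts[severity] += 1
        if severity == "ERROR":
            errors.append(message)
    return counts, errors

# Hypothetical log content, for illustration only.
example = (
    'WARNING,"Sample skipped: invalid character"\n'
    'ERROR,"Hypothetical fatal issue"\n'
)
counts, errors = summarize_training_log(example)
```

If the actual file orders its columns differently, swap the two indexes in the loop.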

Test it

After you train your model, you can test it interactively in the Try panel. Use testing to tune your model so that your client application can better understand its users.

The Try panel is available in both the Develop and Optimize tabs.

Try to interpret a new sentence

To test your model:

  1. In Mix.nlu, click the Develop tab.
  2. (As required) Select the language from the menu near the name of the project.
  3. Click Try. The Try panel appears.
  4. Enter a sentence your users might say and press Enter.

Read and understand the results

The Try panel presents the response from the NLU engine.

The Results area shows the interpretation of the sentence by the model with the highest confidence. In the image here, the Results area displays the orderCoffee intent with a confidence score of 1.00. The Results area also shows any entity annotations the model has been able to identify.

Note that the Results area will not reflect any changes you have made to intents and entities since the last time you trained the model.

No annotations appear in the Results area if the NLU engine cannot interpret the entities in your sample using your model. Also, there is no annotation for dynamic list entities. Only your client application can provide this information at runtime.

Full information from the NLU engine, including all interpretations, appears formatted as a JSON object. For easier reading, you can expand or collapse sections of the information. You can also copy the results JSON, or sections of it to the clipboard.

For more information on the fields in an interpretation, see InterpretResult in the NLUaaS API documentation.

Add the sentence to the training set

If you are unsatisfied with the result in Try, you can add the sentence to your project as a new sample and then manually correct the intent or annotations. Realistic sentences that the model understands poorly are excellent candidates to add to the training set. Adding correctly annotated versions of such sentences helps the model learn, improving your model in the next round of training.

To add a sentence you have just tested, click Add Sample. The sample will be added to the training set for the intent identified by the model, along with any entity annotations the model recognized.

If Try recognized an intent, but no entities, the new sample will be added as Intent-assigned.

If Try also recognized entities, the new sample will be added as Annotation-assigned.

If the same sentence is already in the training set with the same annotations, the count will be updated for that sentence. If the same sentence is already in the training set, but with different annotations, then to maintain consistency in the training set you will not be able to add the sample from Try.
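
The rules for adding a sample from Try can be summarized in code. This is a sketch of the documented behavior rather than the actual implementation: the function name, the dictionary layout, and the returned outcome strings are hypothetical, and duplicates are compared on annotations only because the guide does not say whether the intent is also compared.

```python
def add_from_try(training_set, sentence, intent, annotations):
    """Add a sentence tested in Try to the training set.

    training_set maps a sentence to a dict with the keys 'intent',
    'annotations', 'count' and 'state'. annotations is a tuple of
    (entity, literal) pairs. Returns a short outcome string.
    """
    existing = training_set.get(sentence)
    if existing is not None:
        if existing["annotations"] == annotations:
            existing["count"] += 1      # same sentence, same annotations
            return "count-updated"
        return "rejected"               # same sentence, different annotations
    training_set[sentence] = {
        "intent": intent,
        "annotations": annotations,
        "count": 1,
        # Entities recognized: Annotation-assigned; intent only: Intent-assigned.
        "state": "annotation-assigned" if annotations else "intent-assigned",
    }
    return "added"
```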

Correct errors in the interpretation

Once the sample is added into the training set, make corrections to the intent and annotation labels to help the model better recognize such sentences in the future.

If the recognized intent was incorrect, change the intent.

If the annotated entities were incorrect, edit the annotation.

Roll out your model

Now that you have developed, trained, and tested out your model, you are ready to roll out the model and the project. This way, users can interact with it via an application and you can see how well your application works "in the wild".

To do this, you need to:

  1. Build your model resources.
  2. Create and deploy an application configuration.
  3. Create authorization credentials.

This will build and deploy resources and give you application-specific credentials to access the resources.

With resources deployed and credentials in hand, you will be able to build a client application that harnesses the resources. Resources are accessed via the NLUaaS gRPC API or the ASRaaS gRPC API.

The data collected from applications can then be brought back in to Mix.nlu via the Discover tab.

Discover what your users say

Now that your model is ready, and rolled out to users in an application, you can look at what people say or type while using your application. These samples from users can be brought in and visualized in the Discover tab, along with information about the origin of the samples and how your model interpreted each sample. You’ll review them there, then add the ones you want directly into your intents in your training set to improve and grow your model.

Gain access to Discover data

To bring user data from a deployed application into Discover, you need to have call logs and the feedback loop enabled for your specific Mix application.

Contact your Nuance representative for more details about how to set this up.

To view the data in the Discover tab, you also need to be a member of the organization where the project associated with the application lives, as well as the project itself.

View Discover data

To open the Discover tab for a project:

  1. From the Mix Dashboard, select a project with a deployed application configuration.
  2. Click the .nlu icon to open Mix.nlu.
  3. Select the Discover tab.

When you first open the Discover tab, there will be no data displayed, and you will be prompted to select a source of data to display.

To access data for an application configuration within the Discover tab:

  1. Use the source selectors at the top of the tab to identify the source and time range from which to pull data. Select the application, associated context tag, environment, and date range using the selectors. This will specify an application configuration over the selected period of time. By default, date range will select the past seven days, but you can choose a custom date range using either a start and end date, a number of days, or one of the available preset range options.
    NOTE: The start date can be no more than 28 days prior to the current date.
  2. Click Load Samples.
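
The date range constraints in step 1 can be checked with a few lines of date arithmetic. The sketch below is illustrative: the function and constant names are hypothetical, and it assumes that the default "past seven days" means a start date seven days before today.

```python
from datetime import date, timedelta

MAX_LOOKBACK_DAYS = 28    # the start date can be at most 28 days before today
DEFAULT_RANGE_DAYS = 7    # default range: the past seven days (assumed meaning)

def discover_date_range(today, start=None, end=None):
    """Return a validated (start, end) pair for loading Discover samples."""
    end = end or today
    start = start or end - timedelta(days=DEFAULT_RANGE_DAYS)
    if start > end:
        raise ValueError("start date is after end date")
    if start < today - timedelta(days=MAX_LOOKBACK_DAYS):
        raise ValueError("start date is more than 28 days before today")
    return start, end
```

For example, with `today = date(2024, 3, 29)`, a start date of 1 March is accepted, while 29 February raises a `ValueError`.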

Mix.nlu will look for user sample data from the specified source and time frame. If there is data from the application in the selected time frame available to retrieve, it will be displayed in a table. The Load Samples button becomes a Reload Samples button.

If there is no applicable data, you will see a no samples screen instead.

Refresh Discover data

Sometimes you might want to refresh the displayed data for the same application configuration and date range. For example, if the date range includes the current day, you might want to see the very latest user inputs. To refresh the loaded samples, click the Reload Samples button.

Discover tab contents

Within the Discover tab, you can view information on speech or text input from application users. The information is presented in tabular format, with one row for each sample.

Here is more detail about the contents for each column in the table.

| Column | Description |
| --- | --- |
| Intent | The intent identified by the model for the user input. If the model determines that the sample does not seem to fit any of the expected intents, it will show NO_MATCH. NO_MATCH cases can help you identify intents that were not considered before but which are important to users. These can be added to refine and improve the model. |
| Samples | The content of the user input, as text. The sample may include annotations attached by the model if (1) the model identified an intent, (2) the identified intent has entities defined, and (3) the model confidently identified entity values in the sample. Note: For entities marked as sensitive in the model underlying the application, the information will show up as ****redacted****. |
| Score | The model’s level of confidence in the inferred intent, as a decimal between 0.00 and 1.00. |
| Collected on | Date and time the input was collected in your time zone. |
| Region | Deployment region where the user interaction occurred. |

If there is a lot of user data, the data is presented in pages.

You can sort the rows by the values of the Intent, Score, Collected on, or Region columns. Click on the column title to sort. By default, the data is sorted on the Collected on column to show the data in chronological order. Clicking on a column header a second time will sort on that column in the opposite order.

Invalid intents and entities

If you have changed the model ontology since last deploying your application configuration, and these changes impact the intents and/or entities interpreted for the samples, this is flagged in the table contents to remind you that the interpreted results are based on an outdated version of the model.

Intents and entities within the table will be visibly flagged with an orange marker if the intent or entity inferred by the application is no longer in the model ontology in Mix.nlu.
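
The flagging rule amounts to a membership check against the current ontology. The following sketch illustrates it with hypothetical names and a hypothetical row layout. How NO_MATCH rows are treated is not stated in this guide, so the sketch leaves it to the caller to decide whether to include NO_MATCH in the set of current intents.

```python
def flag_outdated(row, current_intents, current_entities):
    """Report which parts of an interpreted sample no longer exist in the ontology.

    row: {'intent': str, 'entities': [str, ...]} as interpreted by the
         deployed model (hypothetical layout).
    """
    return {
        "intent_invalid": row["intent"] not in current_intents,
        "invalid_entities": [
            entity for entity in row["entities"]
            if entity not in current_entities
        ],
    }
```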

Filtering displayed data

As the usage of your application ramps up, and you get multiple pages of loaded user data, the amount of recent data displayed in Discover can become difficult to make sense of.

The Discover tab provides filters to help reduce the loaded and displayed samples down to a smaller subset of samples. To do this, use the filter panel beside the table.

You can filter the samples on the following dimensions:

For Intents and Entities, you can select multiple items to include in each filter by clicking the available checkboxes. Click once on a checkbox to select and a second time to deselect.

Filters for which at least one selection has been made are marked with a blue dot. When you select the first item, the filter value is displayed on the filter label. If you select more than one item, a simple count of how many are selected out of the total number of options is displayed.

Within the Intents and Entities filters you can click Select All to check all the checkboxes. This makes it easier to select all but a few items: select all, then deselect the specific items you don't want to see. Clear All unchecks all the checkboxes for a filter.

Once you have chosen the filters you want to apply, click Apply in the filters header. The data displayed in the table will update to show only data corresponding to the filter values.

Clicking Clear all in the Filters header resets the selections in the filters to their original defaults and displays all samples.

You can hide the filter panel to free up space as needed and open it again to go back.

Change the intent for a sample

You can change the intent for a sample to one of the intents that are currently in the model ontology. This is useful if the model version used in the application interpreted the sample as an intent that is no longer in the model. This could happen, for example, if you have recently refactored your ontology.

To change the intent for a sample, open the intent menu and select the desired intent.

You can choose either one of the existing intents, or UNASSIGNED_SAMPLES.

The sample will be labeled with the updated valid intent, and the intent column will be marked with a blue dot to indicate that the intent has been updated.

Hovering over the dot will reveal a tooltip indicating the originally inferred intent.

Add samples to the training set

From the Discover tab, you can add selected samples for valid intents directly to the training set.

There are two options available for this:

Samples can be added to the training set under one of three verification states:

Note the following behaviors which apply to importing individual samples and bulk imports:

Note that once a sample has been imported to the training set, the sample will remain in Discover.

Add an individual sample

To add a sample with a valid intent to the training set:

  1. Click the add sample icon to open the add menu.
  2. Select one of the verification state options from the menu to add the sample to the training set with the chosen verification state.

Add multiple samples using bulk-add

To save time adding multiple samples from Discover to your training set, you can select multiple samples at once for import, and then add the samples to the training set in a chosen verification state.

Checkboxes are provided beside each sample to select the samples. A checkbox in the header above the samples allows you to select all selectable samples on the current page.

A bulk-add samples button in the header allows you to choose the target verification state for the selected samples.

To add a selection of samples:

  1. Use the checkboxes to select samples.
  2. Select the desired state for the samples in the bulk actions bar above the samples.

Download bulk-add errors data

When bulk-adding multiple samples, it is possible that errors and warnings will be produced. A pop up appears when a bulk-add is completed, summarizing the results of the operation, including any errors and warnings. To read detailed error logs, you can download an errors log file in CSV format. A Download Logs button for the CSV file will be displayed in the popup. To download the file, click the button.

Download Discover data

You can download the currently loaded data from the Discover tab as a CSV file. This includes, for each sample, any entity annotations identified by the model and displayed in Discover.

If filters are currently applied, only the filtered portion of the data will be downloaded.

To download the sample data as CSV, click on the download icon above the table. You can then process the CSV data externally into a format that can be imported into Mix.nlu. For more information about importing data into a model, see Importing and exporting data.

If you change the application, associated context tag, environment, or date range using the source selectors, the download option is disabled until you press Reload Samples. Note that in this case, any filters that were set are cleared.

Iterating your model

Using the insights gained from the Discover tab, you can refine your training data set, build and redeploy your updated model, and finally view the data from your refined model on the Discover tab. Rinse and repeat! You can improve your model (and your application) over time using an iterative feedback loop.

Optimize model development

The Optimize tab is a feature intended for advanced power users.

It provides advanced automation tools to help make it more efficient to develop larger or more complex projects and perform more sophisticated work on your NLU models.

For users new to Mix.nlu, the Develop tab is the best place to start developing models. The Develop tab is more appropriate for smaller DIY projects.

Optimize tab overview

Visible at the top of the screen are:

The Train Model button initiates training using the training data samples.

The Try panel, as in the Develop tab, allows you to interactively test the model by typing in a new sentence.

Sample Sentences panel

The Sample Sentences panel gives a unified view of all samples in the project for the currently selected language, of all intent types and all verification statuses.

The Optimize tab also gives a unified set of controls to perform operations on samples, whether for a single sample, or a chosen set of samples.

The data is displayed in a table, with one row for each sample and with data displayed for the following columns:

Column Description
Intent Intent type for the sample. This can have one of the following values:
  • A user-defined Intent type
  • NO_INTENT
  • UNASSIGNED_SAMPLES
  • A new intent suggestion coming from an Auto-intent run
An intent menu in this column for each sample allows you to change the intent for the sample, either to an existing intent or a new intent created on the fly.
Status Indicates the sample status with an icon. This includes the same values used in the Develop tab.
  • Excluded
  • Intent-suggested: Sample with intent in a suggested state pending acceptance of Auto-intent suggestion by user
  • Intent-assigned
  • Annotation-assigned
Note: UNASSIGNED_SAMPLES do not have a verification status, and appear in this column with a dash.
Sample The text of the sample, along with any already assigned entity annotations, as well as:
  • Checkbox selector to select multiple samples for bulk operations.
  • Ellipsis menu to perform actions on the individual sample.
  • Count indicator (optional) showing the number of times the exact sample appears in the corpus. You can also increase or decrease the number of appearances.
Note: Counts and annotations can be toggled on and off using the controls in the sample column header.

The data in the table can be sorted by column values:

Click on the column header to sort the samples by that column. Click again to sort in the opposite order.

Sample status progress bar

A progress bar above the data table gives a visual sense of what proportion of the sample data has been processed through to Annotation-assigned, and is thereby ready to use for training a model.

As with the Develop tab, when there are a lot of samples, the contents will be divided into pages. Similar to the Develop tab, controls on the bottom of the table let you navigate between pages and change the number of samples per page.

Visibility toggles

The header bar above the Sample contents column has toggles to control the visibility of:

Personal Data: show or hide personally identifying information (PII) in the displayed samples

Filter displayed samples

By default, the Optimize tab displays all samples.

To filter the samples down to a smaller subset of samples, use the filter panel beside the table. You can filter the samples on these dimensions:

For Intents and Entities, you can select multiple items to include in each filter by clicking the available checkboxes. Click once on a checkbox to select and a second time to deselect.

Filters for which at least one selection has been made are marked with a blue dot. When you select the first item, the filter value is displayed on the filter label. If you select more than one item, a simple count of how many are selected out of the total number of options is displayed.

Filters selected

Within the Intents and Entities filters you can click Select All to check all the checkboxes; this makes it easier to select all except by selecting all then deselecting the specific items you don't want to see. Clear All unchecks all the checkboxes for a filter.

Once you have chosen the filters you want to apply, click Apply in the filters header. The data displayed in the table will update to show only data corresponding to the filter values. If there are enough samples fitting the filter criteria, they will be displayed in pages.

Clicking Clear all in the filters header resets the selections in the filters to their original defaults and displays all samples.

You can hide the filter panel to free up space as needed and open it again to go back.

Apply automation

The Automate data menu appears in the samples actions bar above the samples. It provides options for automating basic tasks of grouping and annotating samples. Currently this menu supports one automation task, Auto-intent. Additional automations will be added in future releases.

Clicking Automate data launches the Automate data pop-up module, where you select the automation to apply. Currently, Auto-intent is the only available automation.

Note: Automation can also be applied when importing a file with samples, whether in the Develop tab of Mix.nlu or in Mix.dashboard. See the Import project data documentation for more details on file import options.

Auto-intent

Auto-intent performs an analysis of UNASSIGNED_SAMPLES, suggesting intents for these samples.

Each previously unassigned sample is tentatively labeled with one of a small number of auto-detected intents present within the set of unassigned samples.

There are two options for Auto-intent:

If a sample is recognized as fitting the pattern of an already defined intent, Auto-intent suggests this existing intent.

In the second option, for groups of samples that appear related to each other, but which do not appear to fit the pattern of an existing intent, the samples are labeled generically as AUTO_INTENT_01, AUTO_INTENT_02, and so on.

Health checks

When an automation action is initiated, Mix.nlu runs a health check of the training samples, the model, and the data sent for automation. This involves a check of several things:

health-check-go

These checks ensure that you have a robust, up-to-date model and that the Auto-intent run will give useful results. When the checks are done, the results are displayed visually in the Automate data pop-up module.

If the checks all pass, you will be able to proceed straightaway with automation using the existing trained model.

Consequences of failed health checks

If any of the checks do not pass, you will be informed and advised of how that impacts the next steps.

| Health check | Consequences if check fails |
|---|---|
| Quantity of annotated samples | Informs that adding a starting ontology and/or more annotated samples will improve performance of the predicted intents. |
| Model available | Informs that a trained model is needed and that a new one will be trained before running automations, adding additional latency. |
| Project data reflected in model | Informs that a new model will be trained due to the changes in the data, adding additional latency. |
| Quantity data sent for automation | Informs that the automation needs a sufficient volume of samples to be performant. Smaller uploads will have sub-optimal performance. |

If you don't have any UNASSIGNED_SAMPLES on which to apply Auto-intent, you will not be able to proceed with the automation.

If there are not enough annotated samples in your training set, you will be advised to add more. You can still continue with the Auto-intent request.

If there is no existing trained model or your model is out of date, Mix.nlu will train a new model before proceeding with the automation. This will add some time to the automation process.

Run Auto-intent on UNASSIGNED_SAMPLES

Note: To run Auto-intent, you need to have UNASSIGNED_SAMPLES.

To run Auto-intent:

  1. Choose Automate data from the actions bar above the table. This will launch a pop-up automation module.
  2. Choose the automation to apply. Currently, Auto-intent is the only option and is pre-selected.
    auto-intent
  3. Select Identify new intents for the Auto-intent run if needed, using the toggle.
  4. Click Next step. This will initiate health checks of the samples and model. Depending on the results of the pre-check, you may receive feedback.
  5. Click Automate or Train and automate to continue. (Train and automate appears if the health check reveals there is no trained NLU model or the model is out of date).
    health-check

This initiates the Auto-intent process. When the run is finished, it returns a suggested intent classification for each previously unassigned sample.

Review Auto-intent suggestions

When the Auto-intent operation completes, you can view the suggestions. Initially, these suggestions are tentative, and from a verification perspective, they are in the status Intent-suggested. No intent is yet assigned.

If there are any newly identified intents, you should review the new intents to see if any of them need to be merged after the fact.

Accept or discard Auto-intent suggestions

You can next choose to accept or discard the Auto-intent suggestions.

Clicking the checkmark icon accepts a suggestion, while clicking the x icon discards the suggestion.

For a sample with a suggestion for an existing intent, accepting the suggestion assigns the sample to that intent and moves the sample from Intent-suggested to Intent-assigned. Discarding the suggestion moves the sample back to UNASSIGNED_SAMPLES. A toast icon will be displayed to confirm your choice has been applied.
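The status changes described above can be sketched as a small state transition. This is only an illustration of the documented behavior for suggestions of existing intents, not Mix.nlu code: the function name `resolve_suggestion` and the dictionary keys are hypothetical, and newly identified intents (which require a rename first) are not covered.

```python
def resolve_suggestion(sample: dict, accept: bool) -> dict:
    """Return a copy of the sample after accepting or discarding its suggestion.

    Hypothetical data model: each sample is a dict with the keys
    "intent", "suggested_intent", and "status".
    """
    if sample.get("status") != "Intent-suggested":
        raise ValueError("sample has no pending Auto-intent suggestion")

    updated = dict(sample)
    updated["suggested_intent"] = None
    if accept:
        # Accepting assigns the suggested intent and advances the status.
        updated["intent"] = sample["suggested_intent"]
        updated["status"] = "Intent-assigned"
    else:
        # Discarding returns the sample to UNASSIGNED_SAMPLES,
        # which carries no verification status.
        updated["intent"] = "UNASSIGNED_SAMPLES"
        updated["status"] = None
    return updated


pending = {
    "text": "I'd like an espresso",  # illustrative sample text
    "intent": "UNASSIGNED_SAMPLES",
    "suggested_intent": "ORDER_COFFEE",
    "status": "Intent-suggested",
}
print(resolve_suggestion(pending, accept=True)["status"])
print(resolve_suggestion(pending, accept=False)["intent"])
```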

For any individual samples that were misidentified, you can manually change the sample intent.

Rename a newly identified intent

For a sample identified as a newly identified intent (AUTO_INTENT_01, AUTO_INTENT_02...), you are prompted to rename the intent to a meaningful name when you try to accept the suggestion.

Enter a new name in the text field provided and press Enter.

Three things happen when you do this:

auto-intent

Merge two newly identified intents

You may find in some cases that Auto-intent will interpret multiple new intents that in reality represent the same intent. The Auto-intent algorithm inclines toward identifying "smaller" intents to give more flexibility to developers.

If you find that this has happened, it is relatively simple to merge the two newly identified intents.

First, rename one of the intents.

Then move the samples (for example, using bulk move intents) from the second new intent to the renamed intent.

Auto-annotation

Auto-annotation is a feature that works on un-annotated samples (Intent-assigned but not Annotation-assigned) for a specified intent. Working within the selected intent, Auto-annotation attempts to identify any instances of entities associated with the intent and labels them accordingly.

Run Auto-annotation

To run Auto-annotation:

  1. Choose Automate data from the actions bar above the table.
  2. Select the Auto-annotation option.
  3. Choose an intent from the list of existing intents on which to apply Auto-annotation.
  4. Click Automate (or Train and automate, if applicable) to continue.

This initiates the Auto-annotation process. When the run is finished, it returns suggested entity annotations for each previously un-annotated sample.

Accept or discard Auto-annotation suggestions

When the Auto-annotation operation completes, you can view the suggestions. Initially, these suggestions are tentative, and from a verification perspective, they are in the status Annotations-suggested. No annotations are yet assigned.

You can next choose to accept or discard the suggestions. Clicking the checkmark icon accepts the suggestion, while clicking the trash icon discards it. A toast icon will be displayed to confirm your choice has been applied.

Add multiple samples to an intent

A Samples editor provides an interface to create and add multiple new samples in one shot. This serves as a faster way to create new samples.

Samples are added as plain text without annotations. Individual samples can have up to a maximum of 500 characters. You can add up to 100 samples at one time using this editor.

To add samples:

  1. Select Sample from the actions bar above the table. An editor will launch with multiple lines to type in samples.
    add samples
  2. Use the Select Intent dropdown to choose the intent to which you want to add new samples.
    choose intent
    You can also select instead to apply Auto-intent to the new samples.
  3. Enter samples in the editor. There are a few ways to do this:
    • Type in a sample and press the Tab or Enter key or click the next line to enter another sample.
    • Copy-paste a list of samples from a word processor or other text editor. The samples need to be separated with hard or soft returns in the source for the editor in Mix to correctly divide them into separate samples. The samples will appear in the editor on separate lines.
      choose intent
  4. Repeat as needed until you have entered all the samples you want to add.
  5. Once you have added your samples, click Submit to add the samples.
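If you prepare a list of samples outside Mix before pasting it into the editor, a quick pre-check against the limits stated above (100 samples per batch, 500 characters per sample) can save a round trip. The sketch below is not part of Mix; the function name and the choice to skip blank lines are assumptions made for illustration.

```python
MAX_SAMPLES_PER_BATCH = 100  # limit stated for the Samples editor
MAX_SAMPLE_LENGTH = 500      # maximum characters per sample


def split_pasted_samples(pasted: str) -> list:
    """Split pasted text into one sample per line and check the editor limits.

    str.splitlines() breaks on hard returns (\\n, \\r\\n) and also on the
    vertical tab (\\x0b) that some word processors use for soft returns.
    Blank lines are skipped (an assumption, not documented behavior).
    """
    samples = [line.strip() for line in pasted.splitlines() if line.strip()]
    if len(samples) > MAX_SAMPLES_PER_BATCH:
        raise ValueError(
            f"{len(samples)} samples exceed the limit of {MAX_SAMPLES_PER_BATCH}"
        )
    for sample in samples:
        if len(sample) > MAX_SAMPLE_LENGTH:
            raise ValueError(
                f"sample longer than {MAX_SAMPLE_LENGTH} characters: {sample[:40]}..."
            )
    return samples


print(split_pasted_samples("a large coffee\r\nan espresso\x0b\n  a latte "))
```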

If you chose an intent for the samples, the new samples should now appear in Optimize and in Develop under the intent. You can annotate the samples in either of these tabs.

If you chose to apply Auto-intent to the samples, the samples will appear in the table of samples with intent suggestions. You can then proceed to rename any newly detected intents, accept or discard the suggested intents, and annotate the samples.

Upload samples with text file import

The file upload feature in Optimize is similar to that in Develop, allowing you to upload a text file with samples. The file upload in Optimize offers additional functionality, however. To add multiple samples at once via a text file upload:

  1. In the top bar, click the file upload icon. An Upload a file dialog will open.
  2. Use the file picker to select a .txt file containing samples or drag a text file onto the dialog window. You will then be given two options on how to handle the file:
    • Upload to a specific intent: Import samples under one existing intent.
    • Auto-Intent: Import a set of samples and apply Auto-intent to suggest an intent for each sample. Auto-intent can either look only for existing intents or search for both existing intents and newly detected intents.
  3. Select the desired option.
    • If uploading to a specific intent, select an intent as well.
    • If you want to apply Auto-intent, select whether or not to try to identify new intents in the uploaded samples.
  4. Click to proceed.
    • If you are uploading the samples to a specific intent, click Upload to initiate the upload, and you are done.
      Upload to specific intent
    • If you are applying Auto-intent, click Next step and proceed to step 5.
      Upload to specific intent
  5. If you choose to apply Auto-intent to the uploaded samples, this will trigger a Health check of samples and model. The health check results may give you guidance on how to improve the performance of the Auto-intent. In some cases, particularly if your training set does not have a sufficient number of Intent-assigned samples, you will be blocked from proceeding until you remedy the issue. If you do not yet have a trained model in your project or the model is out of date, Mix.nlu will train a new model before proceeding with the Auto-intent. If there are no blocking issues, click Automate or Train and Automate as the case may be to proceed.
  6. When the upload and processing of the file are complete, a pop-up View Report window appears. This gives summary information about how successful the upload was. For more details, you can click Download logs to download a CSV log file.
    Download logs

Samples uploaded to a specific intent are attached to that intent. You will want to go in and add annotations after uploading.

Samples uploaded with Auto-intent applied are added initially as UNASSIGNED_SAMPLES with the identified intents initially only suggestions. You will want to view suggested intents in Optimize and accept or discard those suggestions. See Auto intent for more details on Auto-intent.

Find and replace

Find and Replace fields in the samples actions bar above the table allow you to do a substring search or a search and replace on the entire training set. Regex patterns can also be used for the search.

Perform find and replace

To find samples matching a string or pattern (and, if desired, do a replace):

  1. Click the Find field and type a search string or a regex pattern.
  2. If you want to do a replace on samples that match, click the Replace field and type in replacement text.
  3. Press Enter.

Samples containing the search substring or matching the regex pattern will be displayed, and if replace was selected, the matches will be replaced with the replacement text.
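To preview what a pattern would match before applying it to your training set, you can try it on a few samples offline. The following is a minimal sketch using Python's `re` module, assuming the search behaves like an ordinary regex search within each sample. The function name and sample texts are illustrative, and details such as case sensitivity in Mix.nlu may differ.

```python
import re
from typing import List, Optional


def find_and_replace(samples: List[str], pattern: str,
                     replacement: Optional[str] = None) -> List[str]:
    """Return the samples that match, with matches replaced if requested."""
    regex = re.compile(pattern)
    matching = [sample for sample in samples if regex.search(sample)]
    if replacement is None:
        return matching  # find only
    return [regex.sub(replacement, sample) for sample in matching]


samples = [
    "I'd like a large t-shirt",
    "do you have a big tshirt",
    "cancel my order",
]
print(find_and_replace(samples, r"t-?shirt"))
print(find_and_replace(samples, r"t-?shirt", "tee"))
```

For a plain substring search that contains regex metacharacters, `re.escape` can be applied to the search string first.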

Update individual samples

You can perform several actions on individual samples:

The controls and behavior for individual sample operations are mostly the same as those in the Develop tab.

Change sample intent in intent menu

An intent menu available in the Intent column of each sample allows an alternate means to change the intent for a sample.

To change the sample intent to an existing intent:

  1. Click to open the intent menu.
    change-intent-dropdown
  2. Select a new intent for the sample. There are multiple ways to do this:

    • Scroll through the list of existing intents, and find the intent you want.
    • If there are a lot of intents in your project, you can also use the search field to track down the intent you want more quickly.
  3. Click on the intent name to select the intent.

Sometimes, you may realize that the sample does not fit any of the existing intents. In this case, you can create a new intent directly in the menu. With the intent menu open:

  1. Type in a new intent name in the search field. You will see no results in the search field and will be prompted to add the intent.
    name-new-intent
  2. Click the add icon in the intent menu.

In both cases, the Move Samples menu will open to allow you to move the sample to the new intent and decide how you want to deal with any entities in the sample.

Perform bulk operations

As in the Develop tab, you can perform bulk operations on a selected subset of multiple samples at the same time. The behavior for bulk operations in the Optimize tab is similar to that for bulk operations in the Develop tab, as described in Bulk operations. The key differences are that in the Optimize tab:

bulk-operations

As in the Develop tab, you can select:

As described in the Develop tab bulk operations discussion, making any changes to the samples will deselect any selected samples. This includes all the types of sample changes mentioned under Develop. For the Optimize tab specifically, this also includes:

Once you have selected the subset of samples, click an icon on the header bar to apply one of the available operations:

Bulk accept and discard suggested intents

The icons for accepting and discarding suggested intents on selected samples will only be active if at least one of the selected samples has a pending auto-intent suggestion. In addition, bulk accept/discard can only be chosen if the selected samples are on the same page in the current filter view. If you want to more efficiently perform bulk accept/discard, it is a good idea to filter by Automation result first to aggregate and see only those samples.

Clicking the bulk accept icon opens a window summarizing the selected samples with samples grouped by suggested intent. For newly identified intents, you need to choose a global rename for the intent. Only once all newly identified intents have been renamed can you click to accept the suggestions.

Other future bulk operations: Auto-intent (applies to unassigned samples) and Auto-annotate (applies to samples that are Intent-assigned but not Annotation-assigned).

Coming attractions

Additional functionality will be added to the Optimize tab in future releases. This includes:

Handling sensitive information

Sometimes when building an NLU model for your application, you will need to handle user inputs that contain sensitive personally identifiable information (PII). Sensitive PII is personal data, not generally easily accessible from public sources, that alone or in conjunction with other data can identify an individual.

Sensitive PII includes data such as:

When collecting such information during an interaction with a user, it is important to mask this data in logs to protect the users.

Mix.nlu allows you to mark any entity as Sensitive in the Entities panel. Once an entity has been marked as sensitive, user input interpreted by the model as relating to the entity at runtime will be masked in call logs.

Mark sensitive

Similarly, entities and contents of variables can be marked as Sensitive in Mix.dialog and are then handled the same at runtime.

Ontology

In natural language understanding, an ontology is a formal definition of entities, ideas, events, and the relationships between them, for some knowledge area or domain. The existence of an ontology enables mapping natural language utterances to precise intended meanings within that domain.

In the context of Mix.nlu, an ontology refers to the schema of intents, entities, and their relationships that you specify and that are used when annotating your samples, and interpreting user queries.

Intents

An intent identifies an intended action. For example, an utterance or query spoken by a user expresses an intent to order a drink. As you develop an NLU model, you define intents based on what you want your users to be able to do in your application. You then link intents to functions or methods in your client application logic.

Here are some examples of intents you might define:

Intents are often associated with entities to further specify particulars about the intended action.

Entities

An entity is a language construct for a property, or particular detail, related to the user's intent. For example, if the user's intent is to order an espresso drink, entities might include COFFEE_TYPE, FLAVOR, TEMPERATURE, and so on. You can link entities and their values to the parameters of the functions and methods in your client application logic.

If an entity applies to a particular intent, it is referred to as a relevant entity for that intent. The idea of relevant entities is important:

Mix.nlu supports the following user-defined entity collection methods:

Mix.nlu also supports two classes of predefined types:

Mix.nlu also provides some mechanisms to modify, combine and refer to the existing types:

Collection method and data type

Your options for collection method will depend on your chosen data type for the entity. For more details see Data type and collection method compatibility.

List entities

An entity with list collection method has possible values that can be enumerated in a list. For example, if you have defined an intent called ORDER_COFFEE, the entity COFFEE_TYPE would have a list of drink types that can be ordered. Other examples of entities using list collection might include song titles, states of a light bulb (on or off), names of people, names of cities, and so on.

Literals and values

A literal is the range of tokens in a user's utterance or query that corresponds to a certain entity. The literal is the exact literal written or transcribed spoken text. For example, in the query "I'd like a large t-shirt", the literal corresponding to the entity SHIRT_SIZE is "large". Other literals might be "small", "medium", "large", "big", and "extra large". When you annotate samples, you select a range of text to tag with an entity. For list-type entities, you can then add the text to the list for the entity. Lists of literals can also be uploaded in .list or .nmlist files. For more information, see Importing entity literals.

Literals can be paired with values. In comparison to literals, values are the canonical semantic meaning associated to a literal. A value specifies the entity and allows the system to act on the user's intent. For example, "small", "medium", and "large" can be paired with values "S", "M", and "L", respectively. Multiple literals can have the same value, which makes it easy to map different ways a user might say an entity into a single common meaning. For example, "large", "big", "very big" could all be given the same value "L".
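The many-to-one relationship between literals and values can be pictured as a simple lookup table. The sketch below uses the shirt-size pairs from the text; the dictionary and function names are hypothetical and only illustrate the idea, not how Mix.nlu stores or resolves entities.

```python
from typing import Optional

# Literal-to-value pairs for a SHIRT_SIZE list entity (illustrative).
SHIRT_SIZE_VALUES = {
    "small": "S",
    "medium": "M",
    "large": "L",
    "big": "L",       # several literals can share one value
    "very big": "L",
}


def resolve_value(literal: str) -> Optional[str]:
    """Return the canonical value for a literal, or None if it is not listed."""
    return SHIRT_SIZE_VALUES.get(literal.strip().lower())


print(resolve_value("large"))     # L
print(resolve_value("very big"))  # L
print(resolve_value("huge"))      # None
```

Client logic then only needs to handle the values "S", "M", and "L", regardless of which wording the user chose.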

Defining literal-value pairs per language

If your project includes multiple languages, you will want to support the various ways that users might ask for an item in their language of choice. List-based entities created in a project are shared across languages. The values and associated literals connected to the entity, however, are created and managed separately by language. This gives flexibility to handle situations where the value options vary by language and location.

When you add a value-literal pair, this pair will apply to the entity only in the currently selected language. The same value name can be used in multiple languages for the same list-based entity, but the value and its literals need to be added separately in each language.

To add a new value and a literal for a list-based entity within the currently selected language, enter the literal and value in the Entity list pane where indicated and then click the plus (+) icon. The new value appears in the list along with the first literal. You can also click there to add new literals that map to the same entity value. Again, the literal-value pairs added will not be automatically added to the other languages in the project.

To remove a literal, click the delete icon next to the literal. You are asked to confirm the deletion. This removes the literal from the currently selected language.

entity-edit-literal

Dynamic list entities

It is not always feasible to know all possible literals when you create a model, and you may need the ability to interpret values at runtime. For example, each user will have a different set of contacts on their phone. It is not practical to add every possible set of contact names to your entity when you are building your model in Mix.nlu.

Dynamic list entities allow you to upload data dynamically in a client application at runtime. The data is uploaded in the form of a wordset using the Mix NLUaaS or ASRaaS API. Wordsets can either be uploaded and compiled ahead of time or uploaded at runtime. The ASRaaS or NLUaaS runtime can then use this data to provide personalization and to improve spoken language recognition and natural language understanding accuracy.
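As a rough sketch of how a client application might assemble wordset data for a dynamic list entity, the snippet below builds a JSON document from a user's contacts. The layout used here (an object keyed by entity name, holding entries with `literal` and `canonical` fields) is an assumption for illustration, as are the entity name `CONTACT_NAME` and the contact data. Check the NLUaaS or ASRaaS API documentation for the exact wordset schema and upload calls.

```python
import json
from typing import Iterable, Tuple


def build_wordset(entity_name: str, entries: Iterable[Tuple[str, str]]) -> str:
    """Serialize (literal, canonical value) pairs as a JSON wordset.

    The JSON layout is assumed for illustration, not taken from this document.
    """
    items = [{"literal": literal, "canonical": value} for literal, value in entries]
    return json.dumps({entity_name: items}, indent=2)


# Hypothetical contacts pulled from the user's device at runtime.
contacts = [("Anna", "contact-001"), ("Bob Smith", "contact-002")]
print(build_wordset("CONTACT_NAME", contacts))
```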

Defining dynamic entities

To define an entity with list collection method as dynamic, check the Dynamic box for this entity.

While the values for dynamic data are uploaded in the form of wordsets, it is still important to define a representative subset of literal and value pairs for dynamic list entities. This ensures that the model is trained properly and improves the accuracy of the ASR. Using our contact example, this means that you should include a representative subset of what you expect contact names to look like, and ensure that you have samples with the proper annotation.

When naming your dynamic entities in each model, keep in mind that they are global per application ID (across languages and deployed model versions).

Relationship entities: isA and hasA

An entity with relationship collection method has a specific relationship to one or more existing entities, either an "isA" or a "hasA" relationship.

isA relationship entities

An isA relationship states that ENTITY_X is a type of ENTITY_Y. The definition of Y is inherited by X, such as Y's list of literals, as well as any applicable grammars and relationships. Note that while the definition of the child entity is the same as the parent entity, the child entity picks up differences because of its different role in your samples.

For example, say you have a train schedule app and you want to accept queries such as "When is the next train from Boston to New York." Both "Boston" and "New York" are instances of the STATION entity. If you annotated the query using STATION for both cases, then you would have no way of determining which is the origin and which is the destination. To resolve this, you could instead define two list-type entities, FROM_STATION and TO_STATION, and associate each with the same list of literals. This would, of course, be time consuming and difficult to manage. The better solution is to define one list-type entity STATION with an associated list of cities/stations, and then define FROM_STATION isA STATION, and TO_STATION isA STATION. Now, you only have one list of stations to manage. The model interprets queries and returns FROM_STATION or TO_STATION as appropriate for the roles they play in the query, and returns literals and values from the list associated with the STATION entity.
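The benefit of the isA approach, a single list to maintain, can be sketched with a small lookup that follows isA links to the parent's literals. This is a conceptual illustration only; the dictionaries and the `literals_for` function are hypothetical and do not reflect how Mix.nlu represents entities internally.

```python
# One list of literals, defined once on the parent entity.
ENTITY_LITERALS = {"STATION": ["Boston", "New York"]}

# Child entities that inherit their definition through isA.
IS_A = {"FROM_STATION": "STATION", "TO_STATION": "STATION"}


def literals_for(entity: str) -> list:
    """Follow isA links until an entity with its own literal list is found."""
    seen = set()
    while entity not in ENTITY_LITERALS:
        if entity in seen or entity not in IS_A:
            return []  # unknown entity or circular definition
        seen.add(entity)
        entity = IS_A[entity]
    return ENTITY_LITERALS[entity]


print(literals_for("FROM_STATION"))  # ['Boston', 'New York']
print(literals_for("TO_STATION"))    # ['Boston', 'New York']
```

Adding a station to the STATION list makes it available to both roles at once.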

You can also make isA relationships to predefined entities. For example, AGE is a nuance_CARDINAL_NUMBER.

In Mix, an entity can only have an isA relationship with one entity.

hasA relationship entities

A hasA relationship states that ENTITY_Y is a property or a part of ENTITY_X. That is, ENTITY_X has an ENTITY_Y. For example, the entity FULL_NAME might have the sub-entities GIVEN_NAME and FAMILY_NAME as part of it. The entity DRINK might have COFFEE_TYPE and SIZE as part of it. Note that unlike an isA relationship, an entity can have multiple hasA relationships.

You would use hasA relationships if the entities in your queries have structure. However, Nuance recommends that you use hasA relationships only if you have a definite need, since they can be tricky to work with, and the complexity means the NLU models may be less accurate than desired. An example of a definite need is to be able to interpret a query like "put the red block into the green box".

In this case you need a way to associate the color red with the block and the color green with the box. Without using hasA relationships the JSON object returned would be flat and you would not know which color went with which object. Using hasA, you would define an OBJECT that has a COLOR and SHAPE. Then the following annotation becomes possible: "put the [OBJECT][COLOR]red[/][SHAPE]block[/][/] into the [OBJECT][COLOR]green[/][SHAPE]box[/][/]".
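To see how the nested annotation keeps each color attached to its object, the following sketch parses the bracket notation shown above into a tree. It is an illustrative parser written for this example, not a Mix tool, and the node layout (`entity`, `text`, `children`) is an assumption.

```python
import re

# Matches an opening tag such as [OBJECT] or the closing tag [/].
TAG = re.compile(r"\[(/|[A-Za-z0-9_]+)\]")


def parse_annotation(annotated: str) -> dict:
    """Parse bracket-annotated text into a tree of nested entities."""
    root = {"entity": None, "text": "", "children": []}
    stack = [root]
    position = 0
    for match in TAG.finditer(annotated):
        chunk = annotated[position:match.start()]
        for node in stack:  # text belongs to every currently open entity
            node["text"] += chunk
        position = match.end()
        if match.group(1) == "/":
            if len(stack) == 1:
                raise ValueError("closing tag without an open entity")
            stack.pop()
        else:
            node = {"entity": match.group(1), "text": "", "children": []}
            stack[-1]["children"].append(node)
            stack.append(node)
    if len(stack) != 1:
        raise ValueError("unclosed entity")
    root["text"] += annotated[position:]
    return root


tree = parse_annotation(
    "put the [OBJECT][COLOR]red[/][SHAPE]block[/][/] "
    "into the [OBJECT][COLOR]green[/][SHAPE]box[/][/]"
)
for obj in tree["children"]:
    print(obj["entity"], [(c["entity"], c["text"]) for c in obj["children"]])
```

Each OBJECT node carries its own COLOR and SHAPE children, which is the association a flat result would lose.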

Essentially, isA creates a subclass sort of relationship, while hasA creates a relationship of composition.

Note that hasA relationships are not supported in Mix.dialog, so you should avoid using hasA if you will be building a dialog project.

Create a new relationship entity

  1. Create a new entity and give it a name.
  2. Click on the new entity in the Entities panel to open the editor.
  3. Set the data type and set the collection type to Relationship.
    select relationships entity type
    A relationships definition editor appears underneath.
    relationship editor
  4. Click the + icon for the type of relationship entity you want to create, isA or hasA. A dropdown will open allowing you to pick from the existing custom and predefined entities. For isA, you can only select one entity here, while for hasA you can select multiple entities.
    choose related entity
  5. Select one of the sub-entities to which your new entity is related.
  6. Repeat steps 4 and 5 for any other sub-entities in the relationship definition.

The relationship is now defined.

relationship-defined

  1. Go to the Develop tab and open the intent containing the sentence.
  2. Click to select the portion of the sentence containing the (outer) hasA entity.
    select-outer-entity
    In the entity selection menu that appears, you can see both the outer, hasA entity, as well as the sub-entities to which it is related by a hasA relationship.
  3. Select the hasA entity from the menu. The outer entity will be annotated.
    annotate-outer-entity
  4. For each of the inner sub-entities, select the portion of the sentence containing the entity, and select the entity from the menu.
    annotate-inner-entities

The sentence is now fully annotated.

Relationship entities and sensitive flag

Note that an entity defined in relationship to custom entities via isA or hasA does not automatically inherit the sensitive flag from the original entities. You need to separately mark the new entity as sensitive.

Regex-based

An entity with regex-based collection method defines a set of values using regular expressions. For example, product or order values are typically alphanumeric sequences with a regular format, such as gro-456 or ABC 967. Both of these examples, and many more codes with the same general pattern, can be described with the regex pattern:
[A-Za-z]{3}\s?-?\s?[0-9]{3}
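Before saving a pattern like this, it can help to try it against a few candidate codes. The sketch below does this with Python's `re` module. Mix.nlu expects JavaScript regex syntax, but this particular pattern uses only constructs that behave the same way in both. The helper name `is_order_number` and the non-matching test strings are illustrative.

```python
import re

# The pattern from the text: three letters, an optional space and/or hyphen,
# then three digits.
ORDER_NUMBER = re.compile(r"[A-Za-z]{3}\s?-?\s?[0-9]{3}")


def is_order_number(candidate: str) -> bool:
    """Return True if the whole candidate string fits the pattern."""
    return ORDER_NUMBER.fullmatch(candidate) is not None


for code in ["gro-456", "ABC 967", "abc967", "ab-12", "abcd-123"]:
    print(code, is_order_number(code))
```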

Similarly, you might use entities with regex-based collection to match account numbers, postal (zip) codes, confirmation codes, PINs, or driver's license numbers, and other pattern-based formats.

Creating regex-based entities

To use a regular expression to validate the value of an entity (for example, an order number as shown below), enter the expression as valid JavaScript.

In this example the user is creating a regex-based entity called ORDER_NUMBER, which will match order numbers in the form gro-456, COF-123, sla 889, and so on (three characters + an optional hyphen and/or space + three digits).

To save the pattern, click Download project and save regex-based entity.

Before the entity-type is created (or modified), Mix.nlu exports your existing NLU model to a ZIP file containing a TRSX file so that you have a backup. Creating (or modifying) a regex-based entity requires your NLU model to be re-tokenized, which may take some time and impact your existing annotations. You receive a message when the entity is saved successfully.

Mix.nlu validates the search pattern as you enter it and alerts you if it is invalid. Invalid expressions (including empty values) are not saved.

Notes and cautions

Note the following points when creating regular expressions for entities with regex-based collection method:

Capture groups

Be careful when using parentheses in a regular expression, for example to quantify a sub-pattern with +, *, ?, or {m,n}. Enclosing in parentheses creates a capture group. In general programming, matching a regex pattern with capture groups on a string returns both the full pattern, and the individual capture groups, in order, packaged as an array.
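The general regex behavior described here can be seen in a short sketch using Python's `re` module. It shows what a match exposes when the pattern contains capture groups, and how the non-capturing syntax `(?:...)` groups a sub-pattern without creating a capture group. This illustrates ordinary regex engines only; the example pattern and input are assumptions, and the snippet makes no claim about how Mix.nlu processes the same expressions.

```python
import re

text = "COF-123"

# Parentheses create capture groups in addition to the full match.
with_groups = re.fullmatch(r"([A-Za-z]{3})-?([0-9]{3})", text)
print(with_groups.group(0))   # COF-123 (the full pattern match)
print(with_groups.groups())   # ('COF', '123') (the capture groups, in order)

# (?:...) groups a sub-pattern without capturing it.
without_groups = re.fullmatch(r"(?:[A-Za-z]{3})-?(?:[0-9]{3})", text)
print(without_groups.group(0))  # COF-123
print(without_groups.groups())  # ()
```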

With Mix.nlu specifically, however, an entity expects a single value. When you use a regex with capture groups, Mix.nlu will return the result from the first capture group only rather than the full pattern. This is to allow extra flexibility for developers; for example if you want to recognize a date pattern, but only need the month to fulfill the user's intent. If you need to use a parenthetical group, but want the full pattern match as the value returned for the entity, there are two options:

Anchors

Avoid using a caret (^) to denote the beginning of a regular expression, or a dollar sign ($) to denote the end, as doing so will cause the NLU engine to expect the expression at the beginning, or end, of a sentence. Consider this phone number regex-based entity (any phone number of format 123-456-7890):
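As a minimal sketch of this effect, the snippet below compares an anchored and an unanchored expression on a full sentence. It assumes `\d{3}-\d{3}-\d{4}` as a stand-in pattern for the 123-456-7890 format and uses Python's `re` module in place of the NLU engine, so it shows only the general regex behavior behind the recommendation. The utterance is illustrative.

```python
import re

# Assumed stand-in patterns for a phone number of the form 123-456-7890.
UNANCHORED = re.compile(r"\d{3}-\d{3}-\d{4}")
ANCHORED = re.compile(r"^\d{3}-\d{3}-\d{4}$")

utterance = "call me at 123-456-7890 tomorrow"

# Without anchors, the number is found anywhere in the sentence.
print(UNANCHORED.search(utterance).group(0))  # 123-456-7890

# With ^ and $, the number must be the entire input, so nothing is found.
print(ANCHORED.search(utterance))  # None

# The anchored form only matches when the input is the number alone.
print(ANCHORED.search("123-456-7890") is not None)  # True
```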

Annotating with regex-based entities

Annotating with regex-based entities means identifying the tokens to be captured by the regex-defined value. At runtime the model tries to match user words with the regular expression.

For example:

What's the status of order [ORDER_NUMBER]COF-123[/]

Rule-based

An entity with rule-based collection method defines a set of values based on a GrXML grammar file.

While regular expressions can be useful for matching short alphanumeric patterns in text-based input, grammars are useful for matching multi-word patterns in spoken user inputs. A grammar uses rules to systematically describe all the ways users could express values for an entity.

Creating rule-based entities

To create an entity using the rule-based collection method:

  1. Prepare the grammar file. See Understanding grammar files and GrXML file rules below for more details on filename conventions and the required format of the file.
  2. (As required) In Mix.nlu select the language from the menu near the name of the project. (GrXML files are language-specific.)
  3. Create a new entity and name it appropriately, keeping in mind the requirements described in the sections referenced in step 1.
  4. Select a data type for the entity.
  5. Under How you are collecting from the user, select: Rule-based.
  6. Browse to upload the grammar file that you have prepared.
  7. Click Download project and save rule-based entity.
  8. If your project includes multiple languages, upload separate grammar files, one for each language. See the note below.

Before the new entity is saved (or modified), Mix.nlu exports your existing NLU model to a ZIP file (one ZIP file per language) so that you have a backup of your NLU model. Creating (or modifying) a rule-based entity requires your NLU model to be retokenized, which may take some time and impact your existing annotations. You receive a message when the entity is saved successfully.

At any time you can use the download button to view the contents of the GrXML file.

Note the following additional points when creating entities using rule-based collection method:

Annotating with rule-based entities

Annotating with rule-based entities means identifying the words to be captured using a rule grammar (GrXML file). At runtime, the model tries to match user words with the grammar file.

For example:

I'd like to pay with my [CARD_TYPE]Visa[/]

Understanding grammar files

Example GrXML file:

<?xml version='1.0' encoding='utf-8'?>
<grammar xml:lang="en-US" version="1.0" root="DP_NUMBER" xmlns="http://www.w3.org/2001/06/grammar">
   <meta name="swirec_normalize_to_probabilities" content="1"/>
   <meta name="swirec_enable_robust_compile" content="1"/>

   <rule id="DP_NUMBER" scope="public">
      <one-of>
         <item>
            <ruleref uri="#S"/>
            <tag>DP_NUMBER = S.V</tag>
         </item>
         <item>
            <ruleref uri="#EMIR"/>
            <tag>DP_NUMBER = EMIR.V</tag>
         </item>
      </one-of>
   </rule>

   <rule id="S">
      <item repeat="1-16">
        <one-of>
            <item>
              <ruleref uri="#DIGIT"/>
              <tag>V = V ? V + DIGIT.V : DIGIT.V</tag>
            </item>
            <item> <ruleref uri="#dash"/> </item>
        </one-of>
      </item>
   </rule>

   <rule id="EMIR">
         seven eight four <tag><![CDATA[V = "784"]]> </tag>
         <item repeat="0-1"> <ruleref uri="#dash"/> </item>
         <one-of>
            <item> nineteen <tag><![CDATA[V=V+"19"]]></tag> </item>
            <item> twenty <tag><![CDATA[V=V+"20"]]></tag>  </item>
         </one-of>
         <one-of>
            <item>
               <ruleref uri="#DIGIT"/> <tag>V = V ? V + DIGIT.V : DIGIT.V</tag>
               <ruleref uri="#DIGIT"/> <tag>V = V ? V + DIGIT.V : DIGIT.V</tag>
            </item>
            <item> eighty <tag><![CDATA[V=V+"80"]]></tag> </item>
            <item> eighty one <tag><![CDATA[V=V+"81"]]></tag>  </item>
            <item> eighty two <tag><![CDATA[V=V+"82"]]></tag>  </item>
            <item> eighty three <tag><![CDATA[V=V+"83"]]></tag>  </item>
            <item> eighty four <tag><![CDATA[V=V+"84"]]></tag>  </item>
            <item> eighty five <tag><![CDATA[V=V+"85"]]></tag>  </item>
            <item> eighty six <tag><![CDATA[V=V+"86"]]></tag>  </item>
            <item> eighty seven <tag><![CDATA[V=V+"87"]]></tag>  </item>
            <item> eighty eight <tag><![CDATA[V=V+"88"]]></tag>  </item>
            <item> eighty nine <tag><![CDATA[V=V+"89"]]></tag>  </item>
         </one-of>
         <item repeat="0-1"> <ruleref uri="#dash"/> </item>
         <ruleref uri="#DIGIT"/> <tag>V = V ? V + DIGIT.V : DIGIT.V</tag>
         <ruleref uri="#DIGIT"/> <tag>V = V ? V + DIGIT.V : DIGIT.V</tag>
         <ruleref uri="#DIGIT"/> <tag>V = V ? V + DIGIT.V : DIGIT.V</tag>
         <ruleref uri="#DIGIT"/> <tag>V = V ? V + DIGIT.V : DIGIT.V</tag>
         <ruleref uri="#DIGIT"/> <tag>V = V ? V + DIGIT.V : DIGIT.V</tag>
         <ruleref uri="#DIGIT"/> <tag>V = V ? V + DIGIT.V : DIGIT.V</tag>
         <ruleref uri="#DIGIT"/> <tag>V = V ? V + DIGIT.V : DIGIT.V</tag>
         <item repeat="0-1"> <ruleref uri="#dash"/> </item>
         <ruleref uri="#DIGIT"/> <tag>V = V ? V + DIGIT.V : DIGIT.V</tag>
   </rule>

   <rule id="DIGIT" scope="private">
      <one-of>
         <item> <ruleref uri="#zero"/> <tag><![CDATA[V="0"]]></tag> </item>
         <item> <item>one</item>  <tag><![CDATA[V="1"]]></tag> </item>
         <item> <item>two</item>  <tag><![CDATA[V="2"]]></tag> </item>
         <item> <item>three</item>  <tag><![CDATA[V="3"]]></tag> </item>
         <item> <item>four</item>  <tag><![CDATA[V="4"]]></tag> </item>
         <item> <item>five</item>  <tag><![CDATA[V="5"]]></tag> </item>
         <item> <item>six</item>  <tag><![CDATA[V="6"]]></tag> </item>
         <item> <item>seven</item>  <tag><![CDATA[V="7"]]></tag> </item>
         <item> <item>eight</item>  <tag><![CDATA[V="8"]]></tag> </item>
         <item> <item>nine</item>  <tag><![CDATA[V="9"]]></tag> </item>
         <item> double <ruleref uri="#zero"/> <tag><![CDATA[V="00"]]></tag> </item>
         <item> double <item>one</item>  <tag><![CDATA[V="11"]]></tag> </item>
         <item> double <item>two</item>  <tag><![CDATA[V="22"]]></tag> </item>
         <item> double <item>three</item>  <tag><![CDATA[V="33"]]></tag> </item>
         <item> double <item>four</item>  <tag><![CDATA[V="44"]]></tag> </item>
         <item> double <item>five</item>  <tag><![CDATA[V="55"]]></tag> </item>
         <item> double <item>six</item>  <tag><![CDATA[V="66"]]></tag> </item>
         <item> double <item>seven</item>  <tag><![CDATA[V="77"]]></tag> </item>
         <item> double <item>eight</item>  <tag><![CDATA[V="88"]]></tag> </item>
         <item> double <item>nine</item>  <tag><![CDATA[V="99"]]></tag> </item>
         <item> triple <ruleref uri="#zero"/> <tag><![CDATA[V="000"]]></tag> </item>
         <item> triple <item>one</item>  <tag><![CDATA[V="111"]]></tag> </item>
         <item> triple <item>two</item>  <tag><![CDATA[V="222"]]></tag> </item>
         <item> triple <item>three</item>  <tag><![CDATA[V="333"]]></tag> </item>
         <item> triple <item>four</item>  <tag><![CDATA[V="444"]]></tag> </item>
         <item> triple <item>five</item>  <tag><![CDATA[V="555"]]></tag> </item>
         <item> triple <item>six</item>  <tag><![CDATA[V="666"]]></tag> </item>
         <item> triple <item>seven</item>  <tag><![CDATA[V="777"]]></tag> </item>
         <item> triple <item>eight</item>  <tag><![CDATA[V="888"]]></tag> </item>
         <item> triple <item>nine</item>  <tag><![CDATA[V="999"]]></tag> </item>
      </one-of>
   </rule>

   <rule id="dash" scope="private">
      <one-of>
         <item> dash </item>
         <item> minus </item>
      </one-of>
   </rule>

   <rule id="zero" scope="private">
      <one-of>
         <item> zero </item>
         <item> null </item>
         <item> oh </item>
      </one-of>
   </rule>

</grammar>

Shown here is an example GrXML file. This grammar file is designed to recognize a specific account number type in conjunction with a rule-based entity called DP_NUMBER.

From the attributes of the grammar element, we know that the language for the grammar is United States English (xml:lang="en-US").

Notice that the header of the file identifies "DP_NUMBER" (the same name as the rule-based entity) as the root rule (root="DP_NUMBER").

Below this, we see the root rule definition (<rule id="DP_NUMBER" scope="public">).

This rule itself consists of a one-of list with two options representing two possible formats for the account number. Each of these options refers to a sub-rule appearing further on in the file via a ruleref element. The first option refers to a rule named "S" (<ruleref uri="#S"/>). The second option refers to another rule named "EMIR" (<ruleref uri="#EMIR"/>). Both sub-rules in turn reference the helper rules "DIGIT" and "dash", and "DIGIT" itself references "zero".

At runtime, Mix.nlu compares what the user says with the patterns defined in the different sub-rule branches. If the user utterance matches a pattern, this activates that branch. The code in the tag element of the branch assigns the appropriate value to the DP_NUMBER variable and returns this value.

If the user utterance doesn’t match an option from any of the rules with reasonable accuracy, the rule-based entity and any intents using the entity will not match with significant confidence.

A rule includes some number of items, which represent parts of possible matches for the rule. A rule can look for any one of a set of items matching the rule. For example, this rule looks for different ways to say the same digit zero:

<rule id="zero" scope="private">
  <one-of>
   <item> zero </item>
   <item> null </item>
   <item> oh </item>
  </one-of>
</rule>

A rule or item can also look for a specified number or range of repetitions of some pattern. For example, the following looks for zero or one matches to a rule that recognizes a dash:

<item repeat="0-1"> <ruleref uri="#dash"/> </item>

These can also be combined. For example, the following rule looks for a sequence of between one and sixteen digits and dashes:

<rule id="S">
  <item repeat="1-16">
    <one-of>
      <item>
        <ruleref uri="#DIGIT"/>
        <tag>V = V ? V + DIGIT.V : DIGIT.V</tag>
      </item>
      <item><ruleref uri="#dash"/></item>
    </one-of>
  </item>
</rule>
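
To make the tag logic more concrete, the following Python sketch imitates the value that the S, DIGIT, dash, and zero rules from the example file would build for a spoken utterance. It is an illustration only, not how the grammar engine works: the word tables are copied from the example file, while the function names and the simple left-to-right matching are assumptions.

```python
# Word tables copied from the DIGIT, zero, and dash rules of the example grammar.
ZERO_WORDS = {"zero", "null", "oh"}
DIGIT_WORDS = {"one": "1", "two": "2", "three": "3", "four": "4", "five": "5",
               "six": "6", "seven": "7", "eight": "8", "nine": "9"}
DASH_WORDS = {"dash", "minus"}
MULTIPLIERS = {"double": 2, "triple": 3}

def digit_value(word):
    """Return the digit for a spoken word, or None if it is not a digit word."""
    if word in ZERO_WORDS:
        return "0"
    return DIGIT_WORDS.get(word)

def interpret_s(utterance):
    """Return the digit string that rule S would build, or None if nothing matches."""
    words = utterance.lower().split()
    value = ""
    items = 0  # matched items; the rule allows between 1 and 16
    i = 0
    while i < len(words):
        word = words[i]
        if word in DASH_WORDS:
            i += 1  # a dash matches but adds nothing to the value
        elif word in MULTIPLIERS and i + 1 < len(words) and digit_value(words[i + 1]):
            value += digit_value(words[i + 1]) * MULTIPLIERS[word]
            i += 2  # "double five" counts as a single DIGIT item
        elif digit_value(word):
            value += digit_value(word)
            i += 1
        else:
            return None  # a word outside the grammar: no match
        items += 1
    return value if 1 <= items <= 16 else None

print(interpret_s("five five dash one two"))  # 5512
print(interpret_s("double oh seven"))         # 007
print(interpret_s("five apples"))             # None
```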

For more information on GrXML, refer to the W3C Speech Recognition Grammar Specification (SRGS).

GrXML file rules

The filename for the GrXML file must be 1 to 128 characters long, and may include upper and lowercase letters, 0-9, - (hyphen), and _ (underscore).

A rule grammar file has this format:

Troubleshooting GrXML errors

Here are some notes that may help if you encounter problems creating rule-based entities.

| Issue | Description |
| --- | --- |
| Invalid file extension | The file is not a GrXML file. If you are creating a rule-based entity, you must upload a GrXML file with the *.grxml extension. |
| Invalid file name | The filename must not exceed 128 characters and is limited to upper and lowercase letters, 0-9, - (hyphen), and _ (underscore). |
| Grammar root value | The grammar root in the GrXML file must be the entity name. For example: `<grammar ... root="DP_NUMBER" ...>` |
| File contains GrXML errors | There are format errors in the file’s GrXML markup. For example, check that the grammar root, the rule ID, and the return tag all use the entity name: `<grammar ... root="DP_NUMBER" ...>`, `<rule id="DP_NUMBER" ...>`, `<tag>DP_NUMBER = S.V</tag>` |
| Grammars may not reference other files | The grammar file may not include references to other files; for example, this is not supported: `<ruleref uri="acct_num.grxml#emir"/>`. Any related rules required by the grammar must be included in the file being uploaded. |
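
Several of these checks can be run locally before uploading. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the function `check_grxml` is hypothetical, it does not reproduce Mix.nlu's own validation, and it assumes that the filename rules apply to the name without the .grxml extension.

```python
import re
import xml.etree.ElementTree as ET

NS = "{http://www.w3.org/2001/06/grammar}"        # namespace used by the example file
NAME_OK = re.compile(r"[A-Za-z0-9_-]{1,128}")     # assumed to apply to the name without extension

def check_grxml(filename, content, entity_name):
    """Return a list of detected issues; an empty list means none were found."""
    stem, dot, ext = filename.rpartition(".")
    if not dot or ext != "grxml":
        return ["Invalid file extension"]
    problems = []
    if not NAME_OK.fullmatch(stem):
        problems.append("Invalid file name")
    try:
        root = ET.fromstring(content)
    except ET.ParseError:
        return problems + ["File contains GrXML errors"]
    if root.get("root") != entity_name:
        problems.append("Grammar root value")
    rule_ids = {rule.get("id") for rule in root.iter(NS + "rule")}
    if entity_name not in rule_ids:
        problems.append("File contains GrXML errors")
    # Only local references of the form "#rule" are accepted.
    if any(not ref.get("uri", "").startswith("#") for ref in root.iter(NS + "ruleref")):
        problems.append("Grammars may not reference other files")
    return problems

# A tiny made-up grammar, used only to exercise the checks.
sample = """<grammar xml:lang="en-US" version="1.0" root="DP_NUMBER"
         xmlns="http://www.w3.org/2001/06/grammar">
  <rule id="DP_NUMBER" scope="public">
    <ruleref uri="#dash"/>
    <tag>DP_NUMBER = "-"</tag>
  </rule>
  <rule id="dash" scope="private"><item> dash </item></rule>
</grammar>"""

print(check_grxml("DP_NUMBER.grxml", sample, "DP_NUMBER"))  # []
print(check_grxml("DP_NUMBER.txt", sample, "DP_NUMBER"))    # ['Invalid file extension']
```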

Freeform entities

An entity with freeform collection method is used to capture, as a single block, user input that you cannot:

Take the example of an intent for sending a text message to a specified user. A text message body could be any sequence of words of any length. In the query "send a message to Adam hey I'm going to be ten minutes late", the phrase "hey I'm going to be ten minutes late" becomes associated with a freeform entity MESSAGE_BODY.

An important aspect of an entity with freeform collection method is that the meaning of the literal corresponding to the entity is not important or necessary for fulfilling the intent. In the example of sending a text message, the application does not need to understand the meaning of the message; it just needs to send the literal text as a string to the intended recipient.

Having difficulty determining which type to use? See the examples below.

Example sports application – List type

Consider a sports application, where your samples would include many ways of referring to one sports team, for example, the Montreal Canadiens:

Since you could enumerate each option, you would make this a list type and annotate it accordingly. Additionally, the NLU engine would learn about the entity from these different ways of referring to the Canadiens. You would not have to enumerate every possible sports team or every possible way to refer to the Canadiens.

Example SMS app message recipient – regex or rule-based type

Consider an SMS messaging application, where samples include the destination phone number. There are billions of possible phone number combinations, so clearly you could not enumerate all the possibilities, nor would it really make sense to try. However, phone numbers would not be considered freeform input, since there is a fixed, systematic structure to phone numbers that falls under a small set of pattern formats. These patterns can be recognized either with a regex pattern (for typed in phone numbers) or a grammar (for spoken numbers). Another problem with handling a phone number as a freeform entity is that understanding the phone number contents will be necessary to properly direct the message.

Example SMS app message contents – Freeform type

When your sample entity includes text that does not have well-defined many-to-one relationships and that cannot be fully enumerated or described with rules or patterns, use the freeform entity type. Consider an SMS app, where it is impossible to list or specify every way that a user may say something to your app. The body of an SMS message could be literally anything. Here is an example of what those annotations might look like:

MESSAGE_BODY would be a freeform entity because the contents of a message are unpredictable and cannot be fully enumerated. Moreover, understanding the contents is not necessary to send the message to its destination.

Notes on freeform entity annotation

Some important points to remember about annotating freeform entities:

Notes on freeform entity recognition

Some important points to remember about recognition of entities using freeform collection method:

Best practice

Be careful not to overuse freeform entities, especially when a large base grammar already exists for the information you want to collect, such as SONGS or CITIES. Avoid using a freeform entity to collect this type of information—the NLU engine has already been trained on a huge number of values, and you won't benefit from this if you use a freeform entity.

Predefined entities

Mix.nlu includes a set of predefined entities that can be useful as you develop your own NLU models. Predefined entities save you the trouble of defining entities that are generally useful in a number of different applications, such as monetary amounts, Boolean values, calendar items (dates, times, or both), cardinal and ordinal numbers, and so on.

A predefined entity is not limited to a flat list of values, but instead can contain a complete grammar that defines the various ways that values for that entity can be expressed. A grammar is a compact way of expressing a vast range of possible constructions.

For example, within the nuance_DURATION entity, there is a grammar that defines expressions such as "3.5 hours", "25 mins", "for 33 minutes and 19 seconds", and so on. It would simply not make sense to try to capture the possible expressions for this entity in a list.

Some notes:

For more information, including on specific predefined entities, see Predefined entities.

Dialog predefined entities

Mix.nlu adds a default set of entities to simplify your Mix.dialog applications. These dialog entities are isA entities that refer to predefined entities. Dialog entities have shorter, more descriptive names than predefined entities. This can make it easier to develop and maintain your Mix.dialog application while taking advantage of the convenience of predefined entities.

For example, DATE is a dialog predefined entity that is defined as an isA entity for nuance_CALENDARX. If your Mix.dialog application processes dates, use the DATE entity instead of nuance_CALENDARX.

Like the predefined entities prefaced with nuance_, you cannot rename dialog predefined entities, delete them, or edit them.

Dialog entities appear in the Predefined Entities section of the Entities area. Mix adds them when you create your project.

This table briefly describes the purpose of each dialog predefined entity.

| Dialog entity | isA predefined entity | Description |
| --- | --- | --- |
| DATE | nuance_CALENDARX | Calendar date |
| TIME | nuance_CALENDARX | Time of day |
| YES_NO | nuance_BOOLEAN | Yes or no |

Note: The following dialog entities are deprecated and, therefore, may appear in the Custom Entities list. These dialog entities can be edited, renamed, and deleted.

| Dialog entity | isA predefined entity | Description |
| --- | --- | --- |
| CC_EXP_DATE | nuance_EXPIRY_DATE | Credit card expiry date |
| CREDIT_CARD | nuance_CARDINAL_NUMBER | Credit card number |
| CURRENCY | nuance_AMOUNT | Monetary amount |
| DIGITS | nuance_CARDINAL_NUMBER | String of digits |
| NATURAL_NUMBER | nuance_CARDINAL_NUMBER | Round number with no decimal point |
| PHONE | nuance_CARDINAL_NUMBER | Telephone number |
| SSN | nuance_CARDINAL_NUMBER | Social Security Number |
| ZIP_CODE | nuance_CARDINAL_NUMBER | Postal zip code |

Tag modifiers

A tag modifier modifies or combines entities in a sample by adding a logical operator: AND, OR, or NOT. You specify tag modifiers by annotating samples.

Your Mix.nlu model can use the AND and OR modifiers to connect multiple entities. It can use the NOT modifier to negate the meaning of a single entity.

For example, "a cappuccino and a latte" would be annotated as [AND][COFFEE_TYPE]cappuccino[/] and a [COFFEE_TYPE]latte[/][/]. The AND modifier applies to the two COFFEE_TYPE annotations.

The literal "no cinnamon" would be annotated as [NOT]no [SPRINKLE_TYPE]cinnamon[/][/]. The NOT modifier applies to the SPRINKLE_TYPE annotation.

Note how you do not simply annotate the literals "and" and "no" as an entity or tag modifier. Instead, tag modifiers are the parents of the annotations that they connect or negate.
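
The parent-child structure can be made explicit by parsing the bracket notation used in these examples. The parser below is a small sketch written for illustration; `parse_annotation` is a hypothetical helper, not part of Mix.nlu, and it assumes that every annotation opens with `[NAME]` and closes with `[/]`.

```python
import re

TAG = re.compile(r"\[(/|[A-Za-z_][A-Za-z0-9_]*)\]")

def parse_annotation(sample):
    """Parse [NAME]...[/] markup into (plain_text, spans).

    Each span is (name, start, end, parent_name), with offsets into plain_text.
    Spans are listed in the order in which their closing tags appear.
    """
    plain = []   # pieces of text with the markup removed
    length = 0   # length of the plain text collected so far
    stack = []   # open annotations: (name, start_offset, parent_name)
    spans = []
    pos = 0
    for m in TAG.finditer(sample):
        piece = sample[pos:m.start()]
        plain.append(piece)
        length += len(piece)
        pos = m.end()
        if m.group(1) == "/":
            if not stack:
                raise ValueError("closing tag without an opening tag")
            name, start, parent = stack.pop()
            spans.append((name, start, length, parent))
        else:
            parent = stack[-1][0] if stack else None
            stack.append((m.group(1), length, parent))
    if stack:
        raise ValueError("unclosed annotation: " + stack[-1][0])
    plain.append(sample[pos:])
    return "".join(plain), spans

text, spans = parse_annotation(
    "[AND][COFFEE_TYPE]cappuccino[/] and a [COFFEE_TYPE]latte[/][/]")
for name, start, end, parent in spans:
    print(name, repr(text[start:end]), "parent:", parent)
```

In the output, both COFFEE_TYPE spans report AND as their parent, and the AND span covers the whole phrase, which matches the description of tag modifiers as parents of the annotations they connect.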

Anaphoras

An anaphora is defined as "the use of a word referring back to a word used earlier in a text or conversation, to avoid repetition" (from Lexico/Oxford dictionary).

Anaphoras often occur in dialogs and can make it difficult to understand what the user means. For example, consider the following phrases:

In this example, "there" is an anaphora for "Montreal".

In this example, "him" is an anaphora for "Bob".

An ellipsis (intent anaphora) occurs when a user references an intent that was identified in a previous request. The dialog recognizes when the wording of the new request refers to the intent of the previous request, including its entities. For example:

* User: “What is the weather in Boston this weekend?”
* System: “This weekend in Boston the weather will be …”
* User: “What about Montreal?”
* The system understands the intent is to find the weather and includes the entity weekend: “This weekend in Montreal, the weather will be …”

Note: Ellipses are supported only in the context of the most recent intent; the system cannot refer back to earlier intents.

Tagging anaphoras

In Mix.nlu, you can:

This will help your dialog application determine to which entity the anaphora refers, based on the data it has, and internally replace the anaphora with the value to which it refers. For example, "Drive there" would be interpreted as "Drive to Montreal".
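
The substitution idea can be sketched in a few lines of Python. This is not how Mix.dialog is implemented; it only illustrates the principle of remembering the most recent referable value per anaphora type. The entity name CITY, the mapping table, and the function `resolve_anaphoras` are all hypothetical.

```python
# Hypothetical "Referenced as" settings: entity name -> anaphora type.
REFERENCED_AS = {"CITY": "REF_PLACE"}

def resolve_anaphoras(turns):
    """Replace anaphora entities with the most recent referable value.

    turns is a list of dicts mapping an entity (or anaphora type) name to its literal.
    """
    last_seen = {}  # anaphora type -> (entity name, literal) most recently mentioned
    resolved_turns = []
    for entities in turns:
        resolved = {}
        for name, literal in entities.items():
            ref_type = REFERENCED_AS.get(name)
            if ref_type is not None:
                # A referable entity: remember it for later anaphoras.
                last_seen[ref_type] = (name, literal)
                resolved[name] = literal
            elif name in last_seen:
                # An anaphora with a known antecedent: substitute it.
                entity_name, value = last_seen[name]
                resolved[entity_name] = value
            else:
                # Anything else, including an anaphora with no antecedent, is kept as is.
                resolved[name] = literal
        resolved_turns.append(resolved)
    return resolved_turns

# "How far is Montreal?" followed by "Drive there"
print(resolve_anaphoras([{"CITY": "Montreal"}, {"REF_PLACE": "there"}]))
# [{'CITY': 'Montreal'}, {'CITY': 'Montreal'}]
```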

The four types of anaphora entities are:

Identify an entity as referable

First, you want to identify the entity as referable.

  1. In the Entities area of the Develop tab, select the entity.
  2. In the Referenced as field, select the correct anaphora type for this entity.
    For example, for a location, select REF_PLACE.

Annotate a sample containing an anaphora

Once the entity has been identified as referable, you can annotate a sample containing an anaphora reference to that entity.

  1. In the Develop tab, open the intent containing the sample.
  2. Locate the sample containing an anaphora reference to the referable entity, and click the reference word.
  3. An entity selector menu will open. You should see as options both the referable entity, as well as the corresponding anaphora entity type (REF_xxxx) to which the entity is referable. Select the anaphora entity type from the menu.

The sentence is now annotated as containing an anaphora reference.

Language support

The Nuance Mix Platform offers a growing number of languages. To determine the languages (locales) available to your project, go to Mix.dashboard, select your project, and click the Targets tab. For more information, see Build resources.

For the complete list of supported languages, see Languages.

Change log

2022-11-16

Adding notice about relationship collection entities and sensitive data status. For more details, see Handling sensitive information.

2022-10-26

Minor updates to content in Rule-based. A new Expert organization role opens up permissions to access rule-based entity functionality in Mix. Previously this was only available to Nuance Professional Service users.

2022-10-19

Minor updates to content in Discover what your users say to clarify behavior of download Discover data functionality in relation to source selectors and filters.

2022-09-28

Adding ability to set a data type for entities indicating the type of contents the entity will contain. Data types form a contract between Mix.nlu and Mix.dialog, allowing dialog designers to use methods and formatting appropriate to the data type of the entity in messages and conditions. For more details see Add entities to your model.

2022-08-25

Updates to Train your model. The format of the CSV log produced when there are issues in training has been updated. The log now includes warnings as well as errors, and contains clearer messages about the sources of any issues.

2022-06-22

Updates to Bulk operations under both Develop and Optimize. When the number of samples is large and samples are displayed in pages, you can now select all samples on all pages to apply bulk operations.

2022-05-04

Minor updates to Roll out your model.

2022-03-23

The Develop tab file upload module has been re-skinned, and a new file upload option has been added to the Optimize tab. The Develop tab file upload gives a simplified interface to upload samples under a single intent via a text file. The Optimize tab file upload offers the same, with additional functionality for power users: auto-detection of sample intents, including detection of previously unseen intents.

2021-11-11

Updates to Apply automation.

2021-11-03

Updates to Freeform entities to reflect conventions for values for freeform entities.

2021-10-27

Adding new section Handling sensitive information.

2021-09-29

Updates to Change intent to reflect changes to the move sample intents flow.

2021-09-15

2021-08-25

2021-08-04

2021-06-09

2021-04-21

2021-03-31

2021-03-03

Updates to Optimize tab.

2021-02-03

2021-01-27

2020-12-14

2020-12-02

2020-11-25

2020-10-14

Update to Discover tab enabling export of data as .csv.

2020-09-03

Update to Verify samples to enable bulk operations changing the verification state of multiple samples at the same time.

2020-09-02

Adding new Discover tab. The Mix.nlu Discover tab allows you to see what users are saying to your deployed application, giving you the opportunity to refine your NLU models based on actual data. For now the data is read-only; additional functionality will be added in future releases, such as ability to export data, assign intents, annotate the data, and add selected samples to your training set.

2020-08-30

Update and refactoring of Modify samples and Verify samples sections to reflect updates to the UI of the Develop tab samples view and changes in functionality.

2020-08-11

2020-07-17

Added additional information to Verify samples to explain the impact of the new "intent verified" and "fully verified" states.

Note that action is required to approve (fully verify) entity annotations. This crucial step ensures that models are built with the correct data.

2020-07-14

2020-06-11

2020-05-04

Updated screenshots.

2020-03-31

2020-02-19

2020-01-22

Updated predefined entities section.

2019-12-18

2019-12-02

Replaced occurrences of the term "concept" with "entity".

2019-11-15

Below are changes made to the Mix.nlu documentation since the initial Beta release: