Does a Bear Relieve Itself in the Woods? (クマは森で用を足しますか?)

Output matters.

What is the best way to get the complete user utterance with Alexa?

(The Japanese version of this article is here)

cheerio-the-bear.hatenablog.com

I am looking for the best way to get the complete user utterance with Amazon Alexa, so that I can port the following Google Home application to Amazon Echo.

assistant.google.com

As far as I could find, there are three possible options for this purpose.

Option 1. Custom slot type

To get straight to the point, the solution I chose is to create a custom slot type with a meaningless sample utterance and synonym. The following screenshot shows that my custom slot type 'Utterance' has only one sample utterance, 'a', and that the only synonym registered for it is 'a a'.

[Screenshot: Custom slot type with sample utterance 'a' and its synonym 'a a']
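For readers who prefer the JSON editor to the console screens, the same configuration can be sketched as an interaction model. This is only a minimal sketch under assumptions: the intent name `CatchAllIntent` and the invocation name are hypothetical and do not come from my actual skill, and the built-in intents a real skill also needs (help, stop, cancel) are left out for brevity.

```python
import json


def build_language_model(invocation_name: str) -> dict:
    """Return an interaction model with a catch-all custom slot type.

    'CatchAllIntent' is a hypothetical name chosen for illustration only.
    """
    return {
        "interactionModel": {
            "languageModel": {
                "invocationName": invocation_name,
                "intents": [
                    {
                        "name": "CatchAllIntent",  # hypothetical intent name
                        "slots": [{"name": "utterance", "type": "Utterance"}],
                        # The only sample is the bare slot, with no other words.
                        "samples": ["{utterance}"],
                    },
                    {"name": "AMAZON.FallbackIntent", "samples": []},
                ],
                "types": [
                    {
                        "name": "Utterance",
                        "values": [
                            # One meaningless value 'a' with the two-word synonym 'a a'.
                            {"name": {"value": "a", "synonyms": ["a a"]}}
                        ],
                    }
                ],
            }
        }
    }


if __name__ == "__main__":
    # 'my sample skill' is a placeholder invocation name.
    print(json.dumps(build_language_model("my sample skill"), indent=2))
```

The printed JSON could be pasted into the JSON editor of the developer console as a starting point, after adding the remaining intents.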

In my first attempt, I did not add the strange synonym 'a a'. Even without the synonym, the user utterance 'book' was stored, as expected, in the slot 'utterance' to which I had applied my custom slot type.

[Screenshot: User utterance 'book' with no synonym registered]

However, it worked only for a single word; it did not work when the user utterance contained more than one word. Please see the following screenshot. The user utterance 'red book' did not match my custom slot type, so AMAZON.FallbackIntent was selected instead.

[Screenshot: User utterance 'red book' with no synonym registered]

Then I added the synonym 'a a'. The added synonym seems to tell the system that the slot can contain more than one word. The following screenshot shows that a user utterance consisting of multiple words can now be stored in the slot 'utterance' to which my custom slot type is applied.

[Screenshot: User utterance with the strange synonym 'a a']
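On the skill side, the captured text arrives as the slot value inside the intent request. The following is a minimal sketch of reading it from the raw request JSON without any SDK. The function name `extract_utterance` and the intent name `CatchAllIntent` are hypothetical, and the sample envelopes are trimmed down to the fields the function reads.

```python
from typing import Optional


def extract_utterance(envelope: dict, slot_name: str = "utterance") -> Optional[str]:
    """Return the raw slot value from an Alexa request envelope, or None.

    None is returned for non-intent requests, for AMAZON.FallbackIntent,
    and when the slot is missing or empty.
    """
    request = envelope.get("request", {})
    if request.get("type") != "IntentRequest":
        return None
    intent = request.get("intent", {})
    if intent.get("name") == "AMAZON.FallbackIntent":
        # This is what happened to 'red book' before the synonym was added.
        return None
    slot = (intent.get("slots") or {}).get(slot_name) or {}
    return slot.get("value")


# Trimmed sample envelopes; 'CatchAllIntent' is a hypothetical intent name.
matched = {
    "request": {
        "type": "IntentRequest",
        "intent": {
            "name": "CatchAllIntent",
            "slots": {"utterance": {"name": "utterance", "value": "red book"}},
        },
    }
}
fallback = {
    "request": {
        "type": "IntentRequest",
        "intent": {"name": "AMAZON.FallbackIntent"},
    }
}

if __name__ == "__main__":
    print(extract_utterance(matched))
    print(extract_utterance(fallback))
```

Returning `None` for the fallback case lets the calling code decide whether to re-prompt the user or give up.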

This is not a perfect solution, but it might be the best one available at the moment. I describe why I think it is not perfect in the next article.

cheerio-the-bear.hatenablog.com

Option 2. AMAZON.Color

I found the following page and actually used AMAZON.Color to get the complete user utterance until just before the end of development. It works as expected in most cases.

stackoverflow.com

So you may be wondering why I switched from this solution to the one described above. I simply thought that the recognition rate for user utterances might get worse if I kept using AMAZON.Color, because it must contain many sample utterances such as "red" and "green". The voice recognition system might match a user's unclear utterance to one of those color words even when the utterance has nothing to do with color. Please note that this is only my guess and I have no evidence for it.

Option 3. AMAZON.SearchQuery

At the time of writing, AMAZON.SearchQuery is the only slot type intended for phrases. I thought it was the one I was looking for and believed it would work. However, I could not use it even for a trial, because AMAZON.SearchQuery requires a carrier phrase. The following screenshot shows the error message displayed when I attempted to save the sample utterance I had defined.

[Screenshot: AMAZON.SearchQuery with no carrier phrase]

In my case, the sample utterance must be just "{slot_name}" with no additional words, because I have to get the complete user utterance. This slot type was therefore not an option.
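To make the carrier-phrase constraint concrete, here is a rough local check that flags sample utterances consisting of nothing but a slot placeholder. It is only my approximation of the rule for illustration, not the validation logic the developer console actually runs, and the helper name `has_carrier_phrase` is hypothetical.

```python
import re

# Matches a slot placeholder such as {slot_name}.
SLOT_PATTERN = re.compile(r"\{[^{}]+\}")


def has_carrier_phrase(sample: str) -> bool:
    """Return True if the sample has at least one word outside its slots."""
    remainder = SLOT_PATTERN.sub(" ", sample)
    # Any letter or digit left over counts as part of a carrier phrase.
    return re.search(r"\w", remainder) is not None


if __name__ == "__main__":
    for sample in ["{slot_name}", "search for {slot_name}"]:
        print(sample, has_carrier_phrase(sample))
```

With this check, the bare "{slot_name}" sample that I need is reported as having no carrier phrase, while a sample such as "search for {slot_name}" passes.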