Add VPE to voice product docs (#37513)

* Draft
* simplest way to get started
* Re working local
* Small line break
* typo
* removing "Questions" section that makes not a lot of sense.
* tiny rephrase
* markdown
* use heading instead of bold
* Update source/voice_control/voice_remote_local_assistant.markdown
* Update source/voice_control/index.markdown

Co-authored-by: Paulus Schoutsen <[email protected]>

* Remove one sentence
* Update source/voice_control/voice_remote_local_assistant.markdown

Co-authored-by: c0ffeeca7 <[email protected]>

* Update source/voice_control/index.markdown

Co-authored-by: Paulus Schoutsen <[email protected]>

* Remove wizard picture
* Update source/voice_control/voice_remote_local_assistant.markdown

---------

Co-authored-by: c0ffeeca7 <[email protected]>
Co-authored-by: Paulus Schoutsen <[email protected]>
home-assistant · Feb 21, 2025 · 54a1dd2
1 parent f05da50

Showing 3 changed files with 54 additions and 43 deletions.
diff --git a/source/voice_control/builtin_sentences.markdown b/source/voice_control/builtin_sentences.markdown
@@ -147,27 +147,6 @@ Unlike regular voice timers, delayed commands cannot be canceled or modified.
 - *Pause TV in 10 minutes*
 - *Open the blinds in 5 minutes*
 
-## Questions
-
-### Get information about a state
-
-- *What is the amount of energy from solar production?*
-- *what is the heat pump co2 sensor's co2 level?*
-- *what is the battery level of my phone?*
-
-### Ask about the weather
-
-- *What is the weather*
-- Struggling with this one? Check the [troubleshooting section](/voice_control/troubleshooting/).
-
-### Ask about people
-
-(that have device tracking activated in Home Assistant)
-
-- *How many people are in the kitchen*
-- *Who is in the garage*
-- *Where is Anne*
-
 ## Aborting
 
 - *Nevermind*: If you triggered the wake word by mistake and want to stop Home Assistant from listening

diff --git a/source/voice_control/index.markdown b/source/voice_control/index.markdown
@@ -19,19 +19,26 @@ This section will help you set up Assist, which is Home Assistant voice assistan
 
 Assist allows you to control Home Assistant using natural language. It is built on top of an open voice foundation and powered by knowledge provided by our community.
 
-Assist is available to use on most platforms that can interface with Home Assistant. Look for the Assist icon <img src='/images/assist/assist-icon.svg' alt='Assist icon' style='height: 32px' class='no-shadow'>:
+The simplest way to try out Assist is inside our companion app. Look for the Assist icon <img src='/images/assist/assist-icon.svg' alt='Assist icon' style='height: 32px' class='no-shadow'> at the top right of your dashboard.
+
+The simplest way to get started with Assist is with our recommended voice assistant hardware, the [Home Assistant Voice Preview Edition](/voice-pe/).
 
 As for the rest of Home Assistant core functionalities, Assist can be personalized and extended to fit your needs.
+
 - It can work locally or leverage the greatest LLMs of the moment.
 - It can work on your phone or tablet or other custom voice devices.
 
 <lite-youtube videoid="XF53wUbeLxA" videotitle="Voice at Home Assistant"></lite-youtube>
 
-Although adding voice to your smart home configuration is exciting, it will require you to check your existing setup of Home Assistant, especially if you made a lot of customization. But we have prepared a guide of steps and best practices to help you out, as well as our [Troubleshooting](/voice_control/troubleshooting/) guides.
+## Getting Started
+
+When you configure voice assistant hardware made for Home Assistant, a wizard will help you configure your system and get started using voice.
+
+Our recommended voice assistant hardware is the [Home Assistant Voice Preview Edition](/voice-pe/).
 
-Ready? Now let's get started
+In case your hardware does not support our wizard, do not worry. Here are two detailed guides based on how you plan to process your voice (locally, or using Home Assistant Cloud voice services):
 
-- [I plan to use a local speech-to-text/text-to-speech setup](/voice_control/voice_remote_local_assistant/)
+- [I plan to process my voice locally](/voice_control/voice_remote_local_assistant/)
 - [I plan to use Home Assistant Cloud](/voice_control/voice_remote_cloud_assistant/) (recommended as it is the simplest)
 
 ## Expand and Experiment

diff --git a/source/voice_control/voice_remote_local_assistant.markdown b/source/voice_control/voice_remote_local_assistant.markdown
@@ -30,9 +30,37 @@ In Home Assistant, the Assist pipelines are made up of various components that t
 
 ## Some options for speech-to-text and text-to-speech
 
-There is a speech-to-text and text-to-speech option that runs entirely local. No data is sent to external servers for processing.
+There are speech-to-text and text-to-speech options that run entirely locally. No data is sent to external servers for processing.
+
+### Speech-to-text engines
+
+There are currently two options to run speech-to-text locally: **Speech-to-Phrase** and **Whisper**.
+
+#### Speech-to-Phrase
+[Speech-to-Phrase](https://github.com/OHF-voice/speech-to-phrase) is a close-ended speech model. 
+
+- It transcribes what it knows.
+- Extremely fast transcription even on a Home Assistant Green or Raspberry Pi 4 (under one second).
+- Only supports a subset of Assist’s voice commands.
+   - More open-ended items such as shopping lists, naming a timer, and broadcasts are *not* usable out of the box.
+- Speech-to-Phrase supports [various languages](https://github.com/OHF-voice/speech-to-phrase?tab=readme-ov-file#supported-languages).
+- These qualities make it a great option for Home control!
+
+#### Whisper
+
+[Whisper](https://github.com/openai/whisper) is an open-ended speech model.
+
+- It will try to transcribe everything.
+- The cost is slower processing speed: 
+    - On a Raspberry Pi 4, it takes around 8 seconds to process incoming voice commands. 
+    - On an Intel NUC, it is done in under a second.
+- Supports [various languages](https://github.com/openai/whisper#available-models-and-languages).
+- Whisper is only a great option in the following case: 
+    1. You have powerful hardware at home.
+    2. You plan to extend your voice set-up beyond simple home control. For example, by pairing your assistant with an LLM-based agent.
+
+### Text-to-speech engine
 
-The speech-to-text option is [Whisper](https://github.com/openai/whisper). It’s an open source AI model that supports [various languages](https://github.com/openai/whisper#available-models-and-languages). We use a forked version called [faster-whisper](https://github.com/SYSTRAN/faster-whisper). On a Raspberry Pi 4, it takes around 8 seconds to process incoming voice commands. On an Intel NUC, it is done in under a second.
 For text-to-speech, we have developed [Piper](https://github.com/rhasspy/piper). Piper is a fast, local neural text-to-speech system that sounds great and is optimized for the Raspberry Pi 4. It supports [many languages](https://rhasspy.github.io/piper-samples/). On a Raspberry Pi, using medium quality models, it can generate 1.6s of voice in a second.
 
 Please be sure to check how either option will work in your language, since quality can change quite a bit.
@@ -42,20 +70,15 @@ Please be sure to check how either option will work in your language, since qual
 For the quickest way to get your local Assist pipeline started, follow these steps:
 
 1. Install the add-ons to convert text into speech and vice versa.
-   - Install the {% my supervisor_addon addon="core_whisper" title="**Whisper**" %} and the {% my supervisor_addon addon="core_piper" title="**Piper**" %} add-ons.
-     ![Install the Whisper and Piper add-ons](/images/assist/piper-whisper-install-01.png)
-   - If you want to use a wake word, also install the {% my supervisor_addon addon="core_openwakeword" title="**openWakeWord**" %} add-on.
+   - Install the speech-to-text add-on of your choice, either {% my supervisor_addon addon="core_speech-to-phrase" title="**Speech-to-Phrase**" %} or {% my supervisor_addon addon="core_whisper" title="**Whisper**" %}.
+   - Install {% my supervisor_addon addon="core_piper" title="**Piper**" %} for text-to-speech.
    - Start the add-ons.
    - Once the add-ons are started, head over to the integrations under {% my integrations title="**Settings** > **Devices & Services**" %}.
-     - You should now see Piper and Whisper being discovered by the [Wyoming integration](/integrations/wyoming/).
+     - You should now see both services being discovered by the [Wyoming integration](/integrations/wyoming/).
        ![Whisper and Piper integrations](/images/assist/piper-whisper-install-new-02.png)
    - For each integration, select **Add**.
-     - Once the setup is complete, you should see both Piper and Whisper (and, optionally, also openWakeword) in one integration.
-
-       ![Whisper and Piper integration](/images/assist/piper-whisper-install-new-03.png)
-       - **Whisper** converts speech into text.
-       - **Piper** converts text into speech.
-       - **Wyoming** is the protocol they are both using to communicate.
+   - You now have integrated a local speech-to-text engine of your choice (either {% my supervisor_addon addon="core_speech-to-phrase" title="**Speech-to-Phrase**" %} or {% my supervisor_addon addon="core_whisper" title="**Whisper**" %}) and a text-to-speech engine ({% my supervisor_addon addon="core_piper" title="**Piper**" %}).
+
 2. Setup your assistant.
 
    - Go to {% my voice_assistants title="**Settings** > **Voice assistants**" %} and select **Add assistant**.
@@ -71,13 +94,9 @@ For the quickest way to get your local Assist pipeline started, follow these ste
    - Enter a name. You can pick any name that is meaningful to you.
    - Select the language that you want to speak.
    - Under **Conversation agent**, select **Home Assistant**.
-   - Under **Speech-to-text**, select **faster-whisper**. Select the language.
-   - Under **Text-to-speech**, select **piper**. Select the language.
+   - Under **Speech-to-text**, select the speech-to-text engine you chose in the previous step (either **Whisper** or **Speech-to-Phrase**). Select the language.
+   - Under **Text-to-speech**, select **Piper**. Select the language.
      - Depending on your language, you may be able to select different language variants.
-   - If you like, pick one of the predefined wake words.
-     ![Select wake word](/images/assist/assist_predefined_wakeword.png)
-     - You can even [define your own wake word](/voice_control/create_wake_word/). This is not difficult to do, but you will need to set aside a bit of time for this.
-     - Once you defined your own wake word, it will show in this pick list.
 
 3. That's it. You ensured your voice commands can be processed locally on your device.
 4. If you haven't done so yet, [expose your devices to Assist](/voice_control/voice_remote_expose_devices/#exposing-your-devices).
@@ -94,6 +113,12 @@ The options are also documented in the add-on itself. Go to the {% my supervisor
 
 Also be sure to check the specific tutorial for [using Piper in Automations](voice_control/using_tts_in_automation/)
 
+## Learning more about Speech-to-Phrase
+
+You can check out [Voice Chapter 9](/blog/2025/02/13/voice-chapter-9-speech-to-phrase/) to learn more about why we introduced Speech-to-Phrase, and why it's a great option for home control.
+
+<lite-youtube videoid="k6VvzDSI8RU" videotitle="Voice Chapter 9"></lite-youtube>
+
 ## Next steps
 Once Assist is configured, now can now start using it. You can now talk through your device ([Android](/voice_control/android/), [iOS](/voice_control/apple/) or [Voice Preview edition](https://voice-pe.home-assistant.io/getting-started/)).
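
As a reading aid for the speech-to-text guidance added in this commit, the choice between the two local engines can be sketched as a small decision helper. This is only an illustration of the documented trade-off (Whisper is recommended only when you have powerful hardware and plan to go beyond simple home control); the function name `pick_stt_engine`, its parameters, and the `SttChoice` type are hypothetical and not part of Home Assistant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SttChoice:
    """Result of the (hypothetical) engine selection helper."""
    engine: str
    reason: str


def pick_stt_engine(powerful_hardware: bool, beyond_home_control: bool) -> SttChoice:
    """Mirror the guidance in the changed docs.

    Whisper is suggested only when BOTH conditions hold; otherwise the
    closed-ended Speech-to-Phrase engine is the suggested default.
    """
    if powerful_hardware and beyond_home_control:
        return SttChoice(
            engine="Whisper",
            reason="open-ended transcription, e.g. for pairing with an LLM-based agent",
        )
    return SttChoice(
        engine="Speech-to-Phrase",
        reason="fast, closed-ended transcription suited to home control",
    )


if __name__ == "__main__":
    # Example: a Raspberry Pi 4 used only for home control.
    choice = pick_stt_engine(powerful_hardware=False, beyond_home_control=False)
    print(f"{choice.engine}: {choice.reason}")
```

The helper deliberately returns Speech-to-Phrase whenever either condition is missing, because the docs present Whisper as a good fit only when both apply.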