@speech action
A custom Datastar action plugin that uses the Web Speech API to generate text-to-speech output.
<button data-on:click="@speech('hello world')">
  Listen
</button>
#Getting started
The plugin expects you to provide an import map that specifies the location of the datastar module. Once that is in place, include a <script type="module"> element for the plugin. For example:
<script type="importmap">
  {
    "imports": {
      "datastar": "https://cdn.jsdelivr.net/gh/starfederation/datastar@1.0.0-RC.6/bundles/datastar.js"
    }
  }
</script>
<script type="module" src="https://cdn.jsdelivr.net/gh/regaez/datastar-speech@main/datastar-speech.min.js"></script>
You can view the source code on GitHub.
#Why?
The Web Speech API is generally quite clunky to use. Considering its imperative nature, asynchronous callback functions, poor queue management support, and user-agent inconsistencies, it is not particularly suited for use within HTML attributes / Datastar expressions, where one ideally wants to write concise, functional code.
The plugin attempts to smooth over some of the rough edges, simplify its use, and expose the capabilities of the Web Speech API via a declarative API, enabling you to harness text-to-speech functionality more easily with Datastar.
#Examples
For each of these, you can inspect the button element to see how @speech is used.
It accepts string inputs, e.g.
hello world
It accepts number inputs, e.g.
1,928,374.65
It accepts boolean expressions and values, e.g.
true
It accepts HTML elements and will read their text content.
It can also accept a signal created by the data-ref attribute.
By default, @speech will enqueue the text, so that any existing queued text finishes speaking before the next begins. You may override this behaviour by setting the queue option. Available values are immediate, next, replace, and append (default). Note: replace will remove all other text in the queue.
Try clicking this button with different queue options selected. It helps to queue up some of the other speech examples first to notice the difference.
Queue:
You can also customise the rate, pitch, volume, and voice options. These values can be controlled via the @speechCtrl('configure', { rate, pitch, volume, voice }) action, with signals bound to inputs, if desired. Inspect the <fieldset> element below to see how the data-on-signal-patch attribute is used to keep the properties in sync with signals.
Due to limitations of the Web Speech API, changing any options during playback will cause the current utterance to restart, as SpeechSynthesisUtterance does not allow these settings to be adjusted on-the-fly.
If you manually specify a voice option when using @speech, then it will be preferred over one set by the @speechCtrl action. This allows you to assign specific voices to utterances that will be retained in the queue history.
For example: click 'Listen' above, change the voice selection, click 'Listen' again, then use the Queue from the previous example to play back both entries. Notice how they retain the voice that was assigned to them when originally queued to play? Compare that with the earlier examples; those should use whichever voice is currently selected (because they did not specify any voice parameter when queued).
You can also specify a lang option. This is useful when the text's language does not match that of the main page, as it indicates to the user agent which type of voice to prefer.
If this option is not specified, it will try to use the <html lang="..."> attribute value, then fall back to the user-agent's default language preference, similar to SpeechSynthesisUtterance's behaviour.
An important note on the behaviour here is that, if you have previously selected a custom voice, then that voice will be preferred, regardless of lang. In such cases, in order to let the user agent automatically pick a language-appropriate voice, you must also specify voice: 'prefer-lang' as an option.
The following, for example, should use a German voice, if your user-agent supports it:
Ich glaube, dass mein Schwein pfeift.
You can try changing the language and see how it affects the voice used, and thus the pronunciation, when clicking the button above:
You can also create custom playback controls using the @speechCtrl action, which accepts the following commands: play, pause, reset, next, previous, and remove.
You can also listen for the datastar-speech-status event, which is fired during initialisation, and after every playback status change. From the event details, you can extract properties such as isPlaying, canPlay, canPause, canReset, hasNext, hasPrevious, etc., and store them as signals. Inspect the HTML of the following controls to see how they're used.
Signal state:
#API
The plugin adds two new actions that you can use within Datastar expressions:
- @speech
- @speechCtrl
#Action @speech
This action is the primary means to start speech, and the only way to enqueue text to be spoken by the Web Speech API. By default, playback will start immediately if the queue is empty or all existing items in the queue have already finished playing.
@speech(input: string|number|boolean|HTMLElement, opts?: SpeechOptions)
input
The input must be a string, number, boolean, or HTMLElement. The input text length cannot exceed 32767 characters (this is a limitation of the Web Speech API).
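The input constraints above can be checked before calling the action. The following is only a sketch: `toSpeechText` and `MAX_SPEECH_LENGTH` are hypothetical names that do not belong to the plugin, and reading elements via `textContent` and converting primitives with `String()` are assumptions about how the input might be turned into text.

```javascript
// Hypothetical helper; not part of the datastar-speech plugin.
const MAX_SPEECH_LENGTH = 32767; // character limit stated in the input description

function toSpeechText(input) {
  // HTMLElement only exists in browsers, so guard the check for other runtimes.
  const isElement =
    typeof HTMLElement !== "undefined" && input instanceof HTMLElement;
  const isAllowedPrimitive = ["string", "number", "boolean"].includes(typeof input);
  if (!isElement && !isAllowedPrimitive) {
    throw new TypeError("input must be a string, number, boolean, or HTMLElement");
  }
  // Assumption: elements are read via textContent; the plugin may read them differently.
  const text = isElement ? (input.textContent ?? "") : String(input);
  if (text.length > MAX_SPEECH_LENGTH) {
    throw new RangeError(`input exceeds ${MAX_SPEECH_LENGTH} characters`);
  }
  return text;
}
```

A guard like this is mostly useful when the text comes from user-supplied or server-generated content whose length is not known in advance.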
opts: SpeechOptions
These are not required, and all SpeechOptions properties are also optional. If not set, they will fall back to the SpeechSynthesisUtterance default values.
Properties
- queue
'append' | 'immediate' | 'next' | 'replace'; when replace, it cancels current playback, clears the existing speech queue, and plays the given text input immediately.
- lang
string; a string representing a BCP 47 language tag. This should match the language of the input text, so that an appropriate SpeechSynthesisVoice is used. If not set, it will use the value of the HTML lang attribute, or the user-agent's language.
- voice
string; the name of an available SpeechSynthesisVoice. Note: the list of available voices is different for every user agent, and many sound terrible, thus it is probably best to not specify a voice.
In order to reliably use this option, you should almost always refer to the list of available voices, returned by window.speechSynthesis.getVoices() or the datastar-speech-voices-loaded custom event, rather than hardcoding a value. An exception to this: when you have manually specified the lang option, you should also specify voice: 'prefer-lang'; this custom value will override voice selections configured via the @speechCtrl action and let the user-agent pick a language-appropriate voice automatically.
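The fallback order described for the lang option (explicit option, then the HTML lang attribute, then the user-agent's language) can be illustrated with a small function. This is a sketch of the documented order only; `resolveLang` is a hypothetical name, and treating blank strings as "not set" is an assumption rather than confirmed plugin behaviour.

```javascript
// Hypothetical helper mirroring the documented fallback order; not the plugin's code.
function resolveLang(optionLang, htmlLang, userAgentLang) {
  const candidates = [optionLang, htmlLang, userAgentLang];
  for (const candidate of candidates) {
    // Assumption: empty or whitespace-only strings count as "not set".
    if (typeof candidate === "string" && candidate.trim() !== "") {
      return candidate.trim();
    }
  }
  return undefined;
}

// In a browser, the last two arguments could come from the page and the user agent:
// resolveLang(opts.lang, document.documentElement.lang, navigator.language);
```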
#Action @speechCtrl
This action enables you to control playback, and is the primary means to adjust the playback options of the SpeechSynthesisUtterance.
When invoked, it will automatically apply the new configuration to the existing utterance, and any future utterances. If an utterance is currently being played, it will trigger the utterance to restart immediately with the new settings; the Web Speech API does not expose the capability to change these properties on-the-fly.
Properties
@speechCtrl(command: string, opts?: SpeechCtrlOptions)
command
The command must be one of the following string values:
- play
This will start playback, or resume (if paused). You may also, optionally, pass an index property within the SpeechCtrlOptions argument to play a specific utterance stored in the queue. If the index is invalid for the current queue, playback will resume as if no index had been provided (i.e. it will restart the current item, or play the next item in the queue).
- pause
This will pause playback. If the user-agent supports it, playback will be able to resume from the exact moment it was paused; otherwise, the utterance will be restarted. If nothing is playing, this command will have no effect.
- reset
This will stop current playback immediately and clear the playback queue. The queue state cannot be restored once reset.
- next
This will trigger the next item in the queue to play immediately. If there is no next item, this command will have no effect.
- previous
This will trigger the previous item in the queue to play immediately. If there is no previous item, this command will have no effect.
- remove
Can be used to remove an item from the queue. It requires an index property to be provided with the SpeechCtrlOptions argument. If the index is invalid for the current queue, this command will have no effect.
- configure
Can be used to adjust playback settings of the SpeechSynthesisUtterance. You may, optionally, pass the following properties within the SpeechCtrlOptions argument: rate, pitch, volume, and voice. If not specified for any given invocation of the @speechCtrl action, these properties will retain their previously configured value, or fall back to their defaults.
opts: SpeechCtrlOptions
This argument is generally not required, unless the command being invoked expects additional parameters, i.e. play, remove, or configure. All SpeechCtrlOptions properties are optional; any property that is not set leaves the corresponding setting unchanged when the action is invoked.
- index
number; zero-based, targets a specific item in the queue. Only applicable to the play and remove command parameters. It will otherwise be ignored.
- rate
number; adjusts the speed of the utterance. Valid values range from 0.1 to 10. Defaults to 1. Only applicable to the configure command parameter.
- pitch
number; adjusts the pitch of the utterance. Valid values range from 0 to 2. Defaults to 1. Only applicable to the configure command parameter.
- volume
number; adjusts the volume of the utterance. Valid values range from 0 to 1. Defaults to 1. Only applicable to the configure command parameter.
- voice
string; the name of an available SpeechSynthesisVoice. Only applicable to the configure command parameter. Note: the list of available voices is different for every user agent, and many sound terrible, thus it is probably best to not specify a voice.
In order to reliably use this option, you should almost always refer to the list of available voices, returned by window.speechSynthesis.getVoices() or the datastar-speech-voices-loaded custom event, rather than hardcoding a value.
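Because rate, pitch, and volume each have a documented valid range, it may help to clamp values (for example, from range inputs bound to signals) before passing them to the configure command. The sketch below assumes nothing about how the plugin itself handles out-of-range values; `clampSpeechSettings` and `SETTING_RANGES` are hypothetical names introduced for illustration.

```javascript
// Hypothetical helper; ranges copied from the rate, pitch, and volume descriptions.
const SETTING_RANGES = {
  rate: [0.1, 10],
  pitch: [0, 2],
  volume: [0, 1],
};

function clampSpeechSettings(settings) {
  const result = {};
  for (const [name, [min, max]] of Object.entries(SETTING_RANGES)) {
    const raw = settings[name];
    // Leave out settings that were not provided, so they are not included in the result.
    if (raw === undefined || raw === null || raw === "") continue;
    // Input elements often yield strings, so convert before comparing.
    const value = Number(raw);
    if (Number.isNaN(value)) continue;
    result[name] = Math.min(max, Math.max(min, value));
  }
  return result;
}
```

If the function is exposed globally, an expression along the lines of @speechCtrl('configure', clampSpeechSettings({ rate: $rate })) could use it, where $rate is a hypothetical signal name.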
#Custom Events
The plugin will emit the following events to notify you of any internal state changes within the plugin/Web Speech API:
- datastar-speech-status
- datastar-speech-voices-loaded
You can hook onto these using the standard data-on attribute and, if you wish, store any relevant details into your own signals to enable your UI to react to them accordingly.
#Event datastar-speech-status
This custom event is dispatched on the window once the plugin has initialised and each time the playback state of any queued speech has changed.
Properties
The following fields can be found within the evt.detail object:
- isPlaying
boolean; indicates whether an utterance is currently being played.
- canPlay
boolean; indicates whether it is possible to start/resume playback, i.e. an utterance is currently paused, or there is at least one utterance remaining in the queue.
- canPause
boolean; indicates whether it is possible to pause an utterance, i.e. an utterance is currently playing.
- canReset
boolean; indicates whether it is possible to stop playback and clear the utterance queue.
- hasNext
boolean; indicates whether there is an item in the queue after the utterance currently being played.
- hasPrevious
boolean; indicates whether there is an item in the queue before the utterance currently being played.
- queue
string[]; a list containing the text of all utterances that have been queued to play. This also includes previously-ended utterances, unless the queue has since been replaced or reset.
- index
number; the position within the playback queue, i.e. which utterance is currently being played, or was last played.
Example
Here is an example using object destructuring, with property assignment, to store each event property in its own local signal, updating them each time a status event is emitted:
<div data-on:datastar-speech-status__window="({
  isPlaying: $_isPlaying,
  canPlay: $_canPlay,
  canPause: $_canPause,
  canReset: $_canReset,
  hasNext: $_hasNext,
  hasPrevious: $_hasPrevious,
  queue: $_queue,
  index: $_index
} = evt.detail)"></div>
Pay particular attention to the surrounding parentheses, i.e. ({...} = obj), as these are necessary to evaluate the destructuring assignment as a valid expression. It is worth noting that you do not have to destructure all properties from evt.detail, only those you actually want to use; you have complete flexibility here to use as many, or as few, signals as you desire.
It is also important to understand that the signal binding is one-way; changing the values for any of these signals yourself will not affect the equivalent internal state of the datastar-speech plugin.
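The same event can also be consumed from plain JavaScript rather than a data-on attribute. The sketch below turns the event detail into a short status label; `describeStatus` is a hypothetical helper, and it assumes the index field is zero-based like the index option of @speechCtrl, which the event description does not state explicitly.

```javascript
// Hypothetical helper; field names are taken from the evt.detail property list.
function describeStatus(detail) {
  const { isPlaying = false, queue = [], index = 0 } = detail;
  if (queue.length === 0) return "Queue is empty";
  // Assumption: index is zero-based, so add 1 for a human-readable position.
  const position = `${index + 1} of ${queue.length}`;
  return isPlaying ? `Playing ${position}` : `Stopped at ${position}`;
}

// The event is dispatched on the window, so listen there when running in a browser.
if (typeof window !== "undefined") {
  window.addEventListener("datastar-speech-status", (evt) => {
    console.log(describeStatus(evt.detail));
  });
}
```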
#Event datastar-speech-voices-loaded
This custom event is dispatched on the window when the plugin has detected that the Web Speech API has finished loading the SpeechSynthesisVoice list.
Properties
- detail
SpeechSynthesisVoice[]; the list of voices that are available for this user agent. Please refer to the SpeechSynthesisVoice documentation for more information.
Example
You could use this event with data-on, data-bind, and a custom function, in order to populate a select input's options and assign the value to a signal:
<select
  data-bind:voice
  data-on:datastar-speech-voices-loaded__window="appendVoiceOptions(el, evt.detail);"
></select>
To see this page's implementation of the appendVoiceOptions() function, open up the developer tools console and enter: exampleJS.
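For readers without access to the live page, here is a minimal sketch of what such a function might look like. It is not the page's actual implementation: the option label format and the optional `doc` parameter (added so the function can run outside a browser) are assumptions. It uses the voice name as the option value, since the voice option expects the name of a SpeechSynthesisVoice.

```javascript
// Hypothetical sketch; not the implementation used on the documentation page.
function appendVoiceOptions(el, voices, doc = globalThis.document) {
  // Clear previously rendered options so a repeated event does not duplicate them.
  el.replaceChildren();
  for (const voice of voices) {
    const option = doc.createElement("option");
    // The voice option expects a voice name, so use it as the option value.
    option.value = voice.name;
    option.textContent = `${voice.name} (${voice.lang})`;
    el.appendChild(option);
  }
  return el;
}
```

When loaded from a classic script, the function declaration becomes a global, which is what allows the expression in the data-on attribute to call it by name.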