How to create a pronunciation assessment App (Part 2)
Jérôme M
2024-08-26T15:36:59Z
The purpose of this tutorial is to build an application that assesses the user's pronunciation.
To follow it, you should have some knowledge of JavaScript and, ideally, Vue.js 3.
What we are going to do
In the previous article we set up the skeleton of the application with the Quasar framework, which is based on Vue.js.
We will now set up the first of the two main components of the application:
- The component responsible for entering the word or sentence to be pronounced. This component will also be able to read the text aloud, so the user can listen to what they have to pronounce.
- The voice acquisition component: a button to record the user's voice and then send the recording to the API.
We will start by creating the skeleton of the component and positioning it in the main page of the application.
To do this, create the SentenceInputField.vue file in the /src/components directory. You can take this opportunity to delete the EssentialLink.vue file, which is no longer useful.
Creating a Vue component
<template>
  <q-input
    class="input-text q-mt-lg"
    v-model="sentence"
    type="textarea"
    :lines="2"
    autogrow
    hint="Input a word or a sentence"
    clearable
  />
</template>

<script setup>
import { ref } from 'vue'

// Reference to the word or sentence to be pronounced
const sentence = ref('')
</script>

<style scoped>
.input-text {
  width: 80vw;
  max-width: 400px;
}
</style>
Once this is done, we can use the component in the main page, /src/pages/IndexPage.vue:
<template>
  <q-page class="column wrap content-center items-center">
    <sentence-input-field />
    <div>
      <q-btn
        class="q-mt-lg"
        icon="mic"
        color="primary"
        round
        size="30px"
        @click="record"
      />
    </div>
  </q-page>
</template>

<script setup>
import SentenceInputField from '../components/SentenceInputField.vue'

function record () {
  console.log('Record')
}
</script>
As you can see, moving the input into its own component lets us keep the main page free of the style section and of the script code declaring the sentence reference.
We can now focus on the design of this component.
Component design
We will implement the following component.
With its q-input component, Quasar has already done almost all the work for us!
Here is how we modify our HTML template:
<template>
  <q-input
    class="input-text q-mt-lg"
    v-model="sentence"
    type="textarea"
    :lines="2"
    autogrow
    hint="Input a word or a sentence"
    clearable
  >
    <template v-slot:after>
      <q-btn
        round
        dense
        flat
        icon="record_voice_over"
      >
        <q-tooltip>Listen to the word or the sentence</q-tooltip>
      </q-btn>
    </template>
  </q-input>
</template>
Speech synthesis
Now we have to pronounce what was entered, that is, the content of the sentence reference.
We will use the SpeechSynthesis API. To simplify its use, we will rely on a wrapper around it, Artyom.js, which is available as an npm module.
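For reference, here is roughly what the wrapper saves us from writing. This is a minimal sketch using the browser's native SpeechSynthesis API directly, shown for illustration only; the rest of the tutorial goes through Artyom.js:

// Minimal native usage, without any wrapper
function nativeSpeak (text) {
  // Build an utterance and set the target language
  const utterance = new SpeechSynthesisUtterance(text)
  utterance.lang = 'de-DE' // same language as we will use below
  // Queue it on the browser's speech synthesis engine
  window.speechSynthesis.speak(utterance)
}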
In a console, install the module:
npm i --save artyom.js
Then add the initialization code in the JavaScript part of the component:
<script setup>
import { ref } from 'vue'
import Artyom from 'artyom.js'

// Reference to the word or sentence to be pronounced
const sentence = ref('')

const artyom = new Artyom()
artyom.initialize({
  debug: false,
  continuous: false,
  listen: false,
  lang: 'de-DE'
})

function speak () {
  artyom.say(sentence.value)
}
</script>
Note that we are using German by initializing Artyom with lang: 'de-DE'.
We could use another language, for example:
- French: 'fr-FR'
- Spanish: 'es-ES'
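Which languages actually work depends on the voices installed in the browser and operating system. As a quick check, here is a small sketch (illustration only, using the native API that Artyom wraps) that lists the languages your device supports; note that getVoices() can return an empty array until the browser fires its voiceschanged event:

// List the languages for which the device has a voice installed
function listAvailableLanguages () {
  const voices = window.speechSynthesis.getVoices()
  console.log([...new Set(voices.map(voice => voice.lang))])
}

// Voices are often loaded asynchronously by the browser
window.speechSynthesis.addEventListener('voiceschanged', listAvailableLanguages)
listAvailableLanguages()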
All that's left is to bind our speak function to the button:
<template v-slot:after>
  <q-btn
    round
    dense
    flat
    icon="record_voice_over"
    @click="speak"
  >
    <q-tooltip>Listen to the word or the sentence</q-tooltip>
  </q-btn>
</template>
Small improvements to the component
We will make some improvements to our component.
First of all, if the API is not available, the button should not appear.
To do this, we will simply create a reference, isSpeechSynthesisAvailable, that tracks the availability of the API, and condition the button on it:
<script setup>
import { ref } from 'vue'
import Artyom from 'artyom.js'

// Reference to the word or sentence to be pronounced
const sentence = ref('')
const isSpeechSynthesisAvailable = ref(false)

const artyom = new Artyom()
artyom.initialize({
  debug: false,
  continuous: false,
  listen: false,
  lang: 'de-DE'
}).then(() => {
  isSpeechSynthesisAvailable.value = artyom.speechSupported()
})

function speak () {
  artyom.say(sentence.value)
}
</script>
In this implementation we wait for the end of the initialization to know whether speech synthesis is available, so we must only enable the button once the initialization is done. To do this we will add an isArtyomReady reference and a computed, cannotSpeak, that tells us whether the button should be disabled. (We prefer cannotSpeak over canSpeak because we will bind it to the button's disable state, so it is better for it to be true when the button is disabled.)
<script setup>
import { ref, computed } from 'vue'
import Artyom from 'artyom.js'

// Reference to the word or sentence to be pronounced
const sentence = ref('')
const isArtyomReady = ref(false)
const isSpeechSynthesisAvailable = ref(false)

// The button is disabled while anything is missing
const cannotSpeak = computed(() => {
  return !sentence.value || !isArtyomReady.value || !isSpeechSynthesisAvailable.value
})

const artyom = new Artyom()
artyom.initialize({
  debug: false,
  continuous: false,
  listen: false,
  lang: 'de-DE'
}).then(() => {
  isSpeechSynthesisAvailable.value = artyom.speechSupported()
  isArtyomReady.value = true
})

function speak () {
  artyom.say(sentence.value)
}
</script>
This must be taken into account in our template:
<template>
  <q-input
    class="input-text q-mt-lg"
    v-model="sentence"
    type="textarea"
    :lines="2"
    autogrow
    hint="Input a word or a sentence"
    clearable
  >
    <template v-slot:after>
      <q-btn
        round
        dense
        flat
        icon="record_voice_over"
        :disable="cannotSpeak"
        @click="speak"
      >
        <q-tooltip>Listen to the word or the sentence</q-tooltip>
      </q-btn>
    </template>
  </q-input>
</template>
All that remains is to prevent the user from triggering the synthesis twice in a row. For this we will use a new ref, isSpeaking, to know whether speech synthesis is in progress, and add its state to the calculation of the cannotSpeak computed:
<script setup>
import { ref, computed } from 'vue'
import Artyom from 'artyom.js'

// Reference to the word or sentence to be pronounced
const sentence = ref('')
const isArtyomReady = ref(false)
const isSpeechSynthesisAvailable = ref(false)
const isSpeaking = ref(false)

// The button is disabled while speaking or while anything is missing
const cannotSpeak = computed(() => {
  return isSpeaking.value || !sentence.value || !isArtyomReady.value || !isSpeechSynthesisAvailable.value
})

const artyom = new Artyom()
artyom.initialize({
  debug: false,
  continuous: false,
  listen: false,
  lang: 'de-DE'
}).then(() => {
  isSpeechSynthesisAvailable.value = artyom.speechSupported()
  isArtyomReady.value = true
})

function speak () {
  isSpeaking.value = true
  artyom.say(sentence.value, {
    onEnd: function () {
      isSpeaking.value = false
    }
  })
}
</script>
We now have an application that lets the user enter a sentence and listen to it in the chosen language. To achieve this, we isolated the feature in an independent Vue component.
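One possible refinement, sketched here as an assumption (Part 3 will need the entered sentence in order to send it to the scoring API along with the recording), is to expose sentence to the parent page through v-model instead of keeping it private to the component:

<script setup>
import { computed } from 'vue'

// Hypothetical v-model support for SentenceInputField.vue:
// a modelValue prop plus the matching update event (standard Vue 3 contract)
const props = defineProps({ modelValue: { type: String, default: '' } })
const emit = defineEmits(['update:modelValue'])

// Writable computed bridging the q-input to the parent's v-model
const sentence = computed({
  get: () => props.modelValue,
  set: value => emit('update:modelValue', value)
})
</script>

The parent page would then declare its own reference and use <sentence-input-field v-model="sentence" />. This is only one way to share the state between the component and the page.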
What is left to do?
In the upcoming parts we will see how to acquire the audio and obtain a score via the SpeechSuper API:
- Part 3: Acquiring the audio and the score via the SpeechSuper API
- Part 4: Packaging the application
Conclusion
Feel free to comment on the article! Part 3 is coming soon!