Currently there is a lag of roughly 2 seconds between pressing the sentence audio button and the audio starting to play. The wait is jarring during study, and reducing it would be a great improvement. Perhaps consider an option to trigger TTS generation by default when the card is loaded?