Introducing the new Transcription Editor for Podcasts and long voice recordings
23 February 2023 | 3 min read
Navigating and understanding the structure of long voice recordings can be challenging, especially for those not experienced in reading waveforms.
At Altered we aim to make voice editing as intuitive and accessible as possible and let users focus on content and creativity.
Following feedback from our users, Altered Studio has undergone a major redesign that makes the Transcription Editor more intuitive and user-friendly. The new design centers around speech blocks, allowing the user to immediately visualize the pattern of speech in the audio file. It also greatly improves access to basic operations like Speech-To-Speech Morphing, Text-To-Speech, Speech-To-Text, and Translation, which are now one click away.
Block-Based Editing
By introducing speech blocks, we made navigating long voice recordings much simpler. The blocks offer an overview of the speech structure, allowing the user to:
- navigate the recording without using the waveform;
- focus on the most important parts of the voice performance at a glance;
- edit, cut and apply Voice AI and audio effects directly on whole blocks in 1 click;
- edit (increase/decrease/remove) silences with just a mouse drag.
Quick access to Voice AI
The new Transcription Editor interface allows users to perform the most essential actions directly on the text or the audio elements.
Users can easily add new elements to their audio file directly from the Transcription Editor. For example, inserting a recording, a silence break, or Text-To-Speech in-between speech blocks can be done with 1 click.
Moreover, all the Voice AI effects, including Speech-To-Speech Morph, Text-To-Speech, and Speech-To-Text are now available directly from the speech blocks, without the need to select individual words or to navigate the waveform. Audio effects like reverb, chorus, or EQ can also be directly applied to a block.
Simple Transcription Edits
All transcriptions carry a modest word error rate that depends on the quality of the input audio and the articulation of the speakers. Fixing transcription errors is critical in many use cases, such as creating subtitles.
In this release we improved simple transcription edits, such as editing text and moving or splitting word boundaries, to speed up the correction workflow.
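For readers unfamiliar with the metric, word error rate is commonly computed as the word-level edit distance between a reference transcript and the recognizer output, divided by the number of words in the reference. The following is a minimal Python sketch of that standard definition. It is not Altered Studio's implementation, and the function name word_error_rate is only illustrative.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the"): 2 errors in 6 words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Each corrected word in the editor removes one such substitution, insertion, or deletion, which is why quick word-level edits matter most for noisy recordings.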
Moving Word Boundaries
Word boundaries can now be edited directly on the waveform by dragging the boundary lines of each word. This allows users to quickly fix any alignment issues between the transcription and the audio.
Splitting Words
We made it easier to split words, which can now be done directly on the waveform with CTRL+CLICK.
Editing Text
Editing text is now quicker and more straightforward. Any word can be edited with SHIFT+CLICK.
Voice Activity Detection
One of the most useful features of the new Transcription Panel is automatic Voice Activity Detection, which breaks the audio down into speech and silence regions. Speech regions can be used as a cost-free replacement for the transcription tool, allowing users to conveniently navigate and interact with their recordings without consuming Speech-To-Text quota. This is particularly useful for users who navigate large audio files on a daily basis.
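To illustrate the general idea of splitting audio into speech and silence regions, here is a minimal energy-based sketch in Python. It is only an assumption-laden illustration, not the detector used in Altered Studio: the function name detect_regions, the 20 ms frame size, and the RMS threshold of 0.02 are all hypothetical choices, and production detectors are typically more sophisticated.

```python
from typing import List, Sequence, Tuple

Region = Tuple[str, float, float]  # (label, start_seconds, end_seconds)


def detect_regions(
    samples: Sequence[float],
    sample_rate: int,
    frame_ms: int = 20,        # hypothetical frame size
    threshold: float = 0.02,   # hypothetical RMS threshold for "speech"
) -> List[Region]:
    """Label fixed-size frames by RMS energy and merge equal neighbours."""
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    regions: List[Region] = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        rms = (sum(x * x for x in frame) / len(frame)) ** 0.5
        label = "speech" if rms >= threshold else "silence"
        t0 = start / sample_rate
        t1 = min(start + frame_len, len(samples)) / sample_rate
        if regions and regions[-1][0] == label:
            # Same label as the previous frame: extend that region.
            regions[-1] = (label, regions[-1][1], t1)
        else:
            regions.append((label, t0, t1))
    return regions


if __name__ == "__main__":
    # Synthetic signal: 0.1 s of silence, 0.2 s of "speech", 0.1 s of silence.
    signal = [0.0] * 100 + [0.5] * 200 + [0.0] * 100
    for region in detect_regions(signal, sample_rate=1000):
        print(region)
```

The merged regions correspond loosely to the speech blocks and silences shown in the editor, which is what makes navigation possible without running a full transcription.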
The new Transcription Panel in Altered Studio reflects our aim to make speech editing as accessible and intuitive as possible, so that users can focus on productivity and creativity rather than being bogged down in technicalities and housekeeping.
It is another step in tackling the user experience challenges of merging audio and speech editing with traditional effects and cutting-edge Voice AI. Stay tuned by subscribing to our newsletter, and please let us know your thoughts and wishes.