Disclosure for Voice Talent

This document aims to help voice talents understand the technology their voices will be used for, so that they can assess the potential risks and make an informed decision about whether to provide their voice.

Key Terms

Singing Voice Synthesis: Abbreviated as SVS, this is a technology that uses artificial intelligence to generate realistic singing voices. SVS works by analyzing a large number of singing samples from a voice talent to learn their acoustic characteristics, such as pitch, timbre, and pronunciation habits, and then using these characteristics to synthesize new vocals.
Singing Voice Model: A computer model that can simulate the unique vocal characteristics of a target singer and convert MIDI and lyrics into singing voices. A singing voice model is a set of binary-format parameters that is not human-readable and does not contain audio recordings. It cannot be reverse engineered to derive or reconstruct the original recordings of the voice talent.
Voice Talent: An individual or target singer whose voice is recorded and used to create a singing voice model that captures their acoustic characteristics, so that SVS can generate synthesized singing similar to their own vocals.

SVS Service of ACE AI Voice Engine

How it works

ACE AI Voice Engine, hereinafter referred to as "ACE", uses deep neural networks to synthesize singing voices. Unlike concatenative singing voice synthesis, which relies on classical programming or statistical methods, deep neural networks can "learn" the combination of expressive elements in human singing voices, resulting in synthesized vocals that sound more natural and closer to the target singer.
During the training of a singing voice model, in addition to recordings from the voice talent (hereinafter referred to as "voice talent data"), ACE also uses a source library containing recordings from multiple target singers in various languages. This enables the singing voice model to synthesize languages or styles that may not have been present in the voice talent data.


Language Transfer: SVS can perform vocals in languages other than those in the voice talent data.
Style Transfer: SVS can perform vocals in styles other than those in the voice talent data.
VoiceMix: In some cases, SVS can blend the acoustic characteristics of the voice talent with those of other target singers to create "another" voice.
Editable AI Parameters: Users of the ACE Studio software can intervene in the synthesis by adjusting the pitch and AI emotion parameters that SVS generates, which allows them to create synthesized vocals with unique expressiveness.

Approach to Responsible Use of SVS

SVS is an emerging technology that is advancing rapidly. On one hand, it has tremendous potential to help music creators overcome the limits of their vocal abilities and unleash greater musical imagination. On the other hand, it allows people to manipulate a singing voice model to produce new content that emulates specific human vocal traits, and its misuse can cause harm.
To be candid, we have not found a perfect method to completely prevent the misuse of SVS. However, we will make every effort to uphold the responsible use of ACE SVS services.
  • ACE requires customers who request a Custom Voice Model to make a commitment regarding the legality of the source of the voice talent data.
  • ACE requires customers who request a Custom Voice Model to provide this document to the voice talent, so that the voice talent has a clear understanding of the purpose for which their recordings will be used.
  • ACE requires customers who request a Custom Voice Model to commit to using the voice models in a manner that complies with all applicable laws and regulations and aligns with existing societal norms. Otherwise, ACE reserves the right to terminate the provision of SVS services to them.
  • ACE will not use voice talent data for any purpose other than training singing voice models. Customers who customize singing voice models using the "Custom Voice Model" feature in ACE Studio have exclusive rights to use their customized singing voice models.
The public's understanding and perception of SVS technology will continue to evolve over time. In the future, we may periodically update or upgrade our measures to ensure that the use of SVS aligns with positive and reasonable public expectations. If you have any related questions, please contact us at support@acestudio.ai.
