# Bark
Bark is a transformer-based text-to-audio model created by Suno. It can generate highly realistic, multilingual speech as well as other audio, including music, background noise, and simple sound effects. The model can also produce nonverbal communication such as laughing, sighing, and crying. To support the research community, Suno provides access to pretrained model checkpoints that are ready for inference.
## Prerequisites
- Node.js environment
- `aonweb` library installed
- Valid Aonet APPID
## Basic Usage
### 1. Import Required Modules
```js
import { AI, AIOptions } from 'aonweb';
```
### 2. Initialize AI Instance
```js
const ai_options = new AIOptions({
    appId: 'your_app_id_here',
    dev_mode: true
});

const aonweb = new AI(ai_options);
```
### 3. Prepare Input Data
```js
const data = {
    input: {
        "prompt": "Hello, my name is Suno. And, uh — and I like pizza. [laughs] But I also have other interests such as playing tic tac toe."
    }
};
```
### Parameter Description
- `prompt`: String, the text content to be converted into speech. Nonverbal cues such as `[laughs]` can be embedded directly in the text.
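### 4. Call the API
The sketch below shows one way to submit the request. The method name `prediction` and the endpoint path `/predictions/ai/bark` are assumptions made for illustration, not confirmed parts of the aonweb API; check the library's reference for the exact call signature.

```js
// Hypothetical call: the method name and endpoint path are assumptions;
// consult the aonweb documentation for the real signature.
try {
    const response = await aonweb.prediction("/predictions/ai/bark", data);
    // The response format depends on the API; it typically includes a URL
    // pointing at the generated audio.
    console.log("Bark response:", response);
} catch (err) {
    console.error("Bark request failed:", err);
}
```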
### Notes
- The API may take some time to process the input and generate the result; consider implementing appropriate wait or loading states.
- Handle possible errors, such as network issues, invalid input, or API limitations.
- Adhere to the terms of use and privacy regulations when generating synthetic speech.
### Example Response
The API response will contain the URL of the generated audio or other relevant information. Parse and use the response data according to the actual API documentation.
## Advanced Usage
- Implement batch text-to-speech conversion by processing multiple text segments in a loop or as concurrent requests (see the sketch after this list).
- Add a user interface that allows users to input custom text.
- Implement audio post-processing features, such as adjusting volume, adding background music, or applying audio effects.
- Integrate an audio storage solution to save and manage the generated files.
- Consider adding a speech recognition step to convert the generated audio back to text for verification.
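As a sketch of the batch idea above, the snippet below fires several requests concurrently with `Promise.all`. It reuses the assumed `prediction` method and endpoint path from the basic example.

```js
// Batch text-to-speech: convert several segments concurrently.
// aonweb.prediction and the endpoint path are the same assumptions
// as in the basic example above.
const segments = [
    "First paragraph of the article.",
    "Second paragraph of the article.",
    "Third paragraph of the article."
];

const results = await Promise.all(
    segments.map((prompt) =>
        aonweb.prediction("/predictions/ai/bark", { input: { prompt } })
    )
);
console.log(results);
```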
# Blip-2
This document explains how to use the aonweb library to call the Blip-2 API, which is used for visual question answering and image captioning.
### Parameter Description
- `image`: String, the URL of the input image.
- `caption`: Boolean, whether to generate an image caption instead of answering a question.
- `question`: String, the question to ask about the image.
- `temperature`: Number, the sampling temperature controlling the randomness of the generated text.
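For reference, an input object matching the parameters above might look like the following; the image URL is a placeholder.

```js
// Example Blip-2 input; the image URL is a placeholder.
const data = {
    input: {
        "image": "https://example.com/photo.jpg",
        "caption": false,
        "question": "What is shown in this picture?",
        "temperature": 1
    }
};
```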
### Notes
- Ensure that the provided image URL is publicly accessible and of good quality to achieve the best results.
- The API may take some time to process the input and generate the result; consider implementing appropriate wait or loading states.
- Handle possible errors, such as network issues, invalid input, or API limitations.
- Adhere to the terms of use and privacy regulations, especially when handling images of other people.
### Example Response
The API response will contain the generated answer or caption, or other relevant information. Parse and use the response data according to the actual API documentation.
## Advanced Usage
- Implement batch visual question answering by sending multiple images or questions in a loop or as concurrent requests.
- Add a user interface that allows users to upload their own images and enter custom questions.
- Store the generated answers and captions for later retrieval.
# Chattts
This document explains how to use the aonweb library to call the Chattts API, which is used for voice cloning and text-to-speech conversion.
## Prerequisites
- Node.js environment
- `aonweb` library installed
- Valid Aonet APPID
## Basic Usage
### 1. Import Required Modules
```js
import { AI, AIOptions } from 'aonweb';
```
### 2. Initialize AI Instance
```js
const ai_options = new AIOptions({
    appId: 'your_app_id_here',
    dev_mode: true
});

const aonweb = new AI(ai_options);
```
### 3. Prepare Input Data
```js
const data = {
    input: {
        // Chinese demo text (it describes ChatTTS's mixed Chinese-English,
        // multi-speaker abilities) with control tokens such as [laugh] and [uv_break]
        "text": "chat T T S 是一款强大的对话式文本转语音模型。它有中英混读和多说话人的能力。\nchat T T S 不仅能够生成自然流畅的语音,还能控制[laugh]笑声啊[laugh],\n停顿啊[uv_break]语气词啊等副语言现象[uv_break]。这个韵律超越了许多开源模型[uv_break]。\n请注意,chat T T S 的使用应遵守法律和伦理准则,避免滥用的安全风险。[uv_break]"
    }
};
```
### Parameter Description
- `text`: String, the text content to be converted into speech.
- `speaker`: String, the URL of the audio file used as the voice sample for cloning.
- `language`: String, specifies the language of the text, with "en" indicating English.
- `cleanup_voice`: Boolean, whether to perform cleanup processing on the generated voice.
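### 4. Call the API
Submitting the request follows the same pattern as the other models. As before, the `prediction` method and the endpoint path are assumptions, so verify them against the aonweb reference.

```js
// Hypothetical call: method name and endpoint path are assumptions.
const response = await aonweb.prediction("/predictions/ai/chattts", data);
console.log("Chattts response:", response);
```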
### Notes
- Ensure that the provided audio URL is publicly accessible and of good quality to achieve the best cloning effect.
- The API may take some time to process the input and generate the result; consider implementing appropriate wait or loading states.
- Handle possible errors, such as network issues, invalid input, or API limitations.
- Adhere to the terms of use and privacy regulations, especially when handling voice samples of others.
### Example Response
The API response will contain the URL of the generated cloned voice or other relevant information. Parse and use the response data according to the actual API documentation.
## Advanced Usage
- Implement batch text-to-speech conversion by processing multiple text segments in a loop or as concurrent requests.
- Add a user interface that allows users to upload their own voice samples and input custom text.
- Implement voice post-processing features, such as adjusting volume, adding background music, or applying audio effects.
- Integrate a voice storage solution to save and manage the generated voice files.
- Consider adding a speech recognition step to convert the generated voice back to text for verification.
# Codeformer
This document explains how to use the aonweb library to call the Codeformer API, a robust face restoration model for old, degraded, or AI-generated photos.
### Parameter Description
- `image`: String, the URL of the input image containing the face(s) to restore.
- `upscale`: Integer, the factor by which to upscale the final output image.
- `face_upsample`: Boolean, whether to additionally upsample the restored faces.
- `background_enhance`: Boolean, whether to enhance the image background as well.
- `codeformer_fidelity`: Number between 0 and 1, trading visual quality (lower values) against fidelity to the original face (higher values).
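An input object matching the parameters above might look like the following; the image URL is a placeholder.

```js
// Example Codeformer input; the image URL is a placeholder.
const data = {
    input: {
        "image": "https://example.com/old_photo.jpg",
        "upscale": 2,
        "face_upsample": true,
        "background_enhance": true,
        "codeformer_fidelity": 0.7
    }
};
```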
### Notes
- Ensure that the provided image URL is publicly accessible and of good quality to achieve the best restoration results.
- The API may take some time to process the input and generate the result; consider implementing appropriate wait or loading states.
- Handle possible errors, such as network issues, invalid input, or API limitations.
- Adhere to the terms of use and privacy regulations, especially when handling photos of other people.
### Example Response
The API response will contain the URL of the restored image or other relevant information. Parse and use the response data according to the actual API documentation.
## Advanced Usage
- Implement batch restoration by processing multiple images in a loop or as concurrent requests.
- Add a user interface that allows users to upload their own photos.
- Integrate a storage solution to save and manage the restored images.
# Controlnet
This document explains how to use the aonweb library to call the Controlnet API, which adds conditional control inputs (such as edges, depth maps, or human pose) to diffusion-based image generation.
### Example Response
The API response will contain the URL of the generated image or other relevant information. Parse and use the response data according to the actual API documentation.
## Advanced Usage
- Implement batch generation by processing multiple prompts or control images in a loop or as concurrent requests.
- Add a user interface that allows users to upload their own control images and enter custom prompts.
- Integrate a storage solution to save and manage the generated images.
# DreamBooth
This document explains how to use the aonweb library to call the DreamBooth API, which fine-tunes a text-to-image diffusion model on a few photos of a subject so that it can generate personalized images of that subject. The parameter list below reflects typical DreamBooth inputs and is an assumption; consult the actual API documentation for the exact fields.
### Parameter Description
- `instance_prompt`: String, a prompt containing a unique identifier for the subject (typical DreamBooth input).
- `class_prompt`: String, a generic prompt describing the subject's class, used for regularization (typical DreamBooth input).
- `instance_data`: String, the URL of an archive of subject images to train on (typical DreamBooth input).
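Under those assumptions, an input object might look like this; the field names and the archive URL are placeholders to be checked against the real API.

```js
// Hypothetical DreamBooth input: field names follow common DreamBooth
// deployments and are NOT confirmed for this API; the URL is a placeholder.
const data = {
    input: {
        "instance_prompt": "a photo of sks dog",
        "class_prompt": "a photo of a dog",
        "instance_data": "https://example.com/subject_photos.zip"
    }
};
```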
### Notes
- Ensure that the provided subject images are publicly accessible and of good quality to achieve the best personalization results.
- The API may take some time to process the input and generate the result; consider implementing appropriate wait or loading states.
- Handle possible errors, such as network issues, invalid input, or API limitations.
- Adhere to the terms of use and privacy regulations, especially when training on photos of other people.
### Example Response
The API response will contain the URL of the generated images or the fine-tuned model, or other relevant information. Parse and use the response data according to the actual API documentation.
## Advanced Usage
- Implement batch generation by processing multiple prompts in a loop or as concurrent requests.
- Add a user interface that allows users to upload their own subject photos and enter custom prompts.
- Integrate a storage solution to save and manage the generated images.
# Gfpgan
This document explains how to use the aonweb library to call the Gfpgan API, a practical face restoration model for old photos and AI-generated faces.
### Parameter Description
- `img`: String, the URL of the input image containing the face(s) to restore.
- `scale`: Number, the factor by which to rescale the output image.
- `version`: String, the GFPGAN model version to use.
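An input object matching the parameters above might look like the following; the image URL is a placeholder and the version string is an assumption based on common GFPGAN releases.

```js
// Example Gfpgan input; the image URL is a placeholder and "v1.4" is an
// assumed version string.
const data = {
    input: {
        "img": "https://example.com/portrait.jpg",
        "scale": 2,
        "version": "v1.4"
    }
};
```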
### Notes
- Ensure that the provided image URL is publicly accessible and of good quality to achieve the best restoration results.
- The API may take some time to process the input and generate the result; consider implementing appropriate wait or loading states.
- Handle possible errors, such as network issues, invalid input, or API limitations.
- Adhere to the terms of use and privacy regulations, especially when handling photos of other people.
### Example Response
The API response will contain the URL of the restored image or other relevant information. Parse and use the response data according to the actual API documentation.
## Advanced Usage
- Implement batch restoration by processing multiple images in a loop or as concurrent requests.
- Add a user interface that allows users to upload their own photos.
- Integrate a storage solution to save and manage the restored images.
# MiniGpt-4
This document explains how to use the aonweb library to call the MiniGpt-4 API, a vision-language model that can answer questions about and describe images. The parameter list below reflects typical MiniGPT-4 inputs and is an assumption; consult the actual API documentation for the exact fields.
### Parameter Description
- `image`: String, the URL of the input image (typical MiniGPT-4 input).
- `prompt`: String, the question or instruction about the image (typical MiniGPT-4 input).
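Under those assumptions, an input object might look like this; the field names and image URL are placeholders to be checked against the real API.

```js
// Hypothetical MiniGpt-4 input: field names follow common MiniGPT-4
// deployments and are NOT confirmed for this API; the URL is a placeholder.
const data = {
    input: {
        "image": "https://example.com/photo.jpg",
        "prompt": "Describe what is happening in this picture."
    }
};
```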
### Notes
- Ensure that the provided image URL is publicly accessible and of good quality to achieve the best results.
- The API may take some time to process the input and generate the result; consider implementing appropriate wait or loading states.
- Handle possible errors, such as network issues, invalid input, or API limitations.
- Adhere to the terms of use and privacy regulations, especially when handling images of other people.
### Example Response
The API response will contain the generated answer or description, or other relevant information. Parse and use the response data according to the actual API documentation.
## Advanced Usage
- Implement batch processing by sending multiple images or prompts in a loop or as concurrent requests.
- Add a user interface that allows users to upload their own images and enter custom prompts.
- Store the generated answers and descriptions for later retrieval.
# Real-Esrgan
This document explains how to use the aonweb library to call the Real-Esrgan API, an image super-resolution model for upscaling photos. The parameter list below reflects typical Real-ESRGAN inputs and is an assumption; consult the actual API documentation for the exact fields.
### Parameter Description
- `image`: String, the URL of the image to upscale (typical Real-ESRGAN input).
- `scale`: Number, the upscaling factor (typical Real-ESRGAN input).
- `face_enhance`: Boolean, whether to run an additional face enhancement pass (typical Real-ESRGAN input).
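Under those assumptions, an input object might look like this; the field names and image URL are placeholders to be checked against the real API.

```js
// Hypothetical Real-Esrgan input: field names follow common Real-ESRGAN
// deployments and are NOT confirmed for this API; the URL is a placeholder.
const data = {
    input: {
        "image": "https://example.com/low_res.jpg",
        "scale": 4,
        "face_enhance": false
    }
};
```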
### Notes
- Ensure that the provided image URL is publicly accessible and of good quality to achieve the best upscaling results.
- The API may take some time to process the input and generate the result; consider implementing appropriate wait or loading states.
- Handle possible errors, such as network issues, invalid input, or API limitations.
- Adhere to the terms of use and privacy regulations, especially when handling images of other people.
### Example Response
The API response will contain the URL of the upscaled image or other relevant information. Parse and use the response data according to the actual API documentation.
## Advanced Usage
- Implement batch upscaling by processing multiple images in a loop or as concurrent requests.
- Add a user interface that allows users to upload their own images and choose the upscaling factor.
- Integrate a storage solution to save and manage the upscaled images.
# Sdxl
This document explains how to use the aonweb library to call the Sdxl API, a text-to-image model that generates images from natural-language prompts.
## Prerequisites
- Node.js environment
- `aonweb` library installed
- Valid Aonet APPID
## Basic Usage
### 1. Import Required Modules
```js
import { AI, AIOptions } from 'aonweb';
```
### 2. Initialize AI Instance
```js
const ai_options = new AIOptions({
    appId: 'your_app_id_here',
    dev_mode: true
});

const aonweb = new AI(ai_options);
```
### 3. Prepare Input Data
```js
const data = {
    input: {
        "width": 768,
        "height": 768,
        "prompt": "An astronaut riding a rainbow unicorn, cinematic, dramatic"
    }
};
```
### Parameter Description
- `width`: Integer, the width of the generated image in pixels.
- `height`: Integer, the height of the generated image in pixels.
- `prompt`: String, the text description of the image to generate.
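### 4. Call the API
As in the other sections, the `prediction` method and endpoint path below are assumptions; verify them against the aonweb reference.

```js
// Hypothetical call: method name and endpoint path are assumptions.
const response = await aonweb.prediction("/predictions/ai/sdxl", data);
console.log("Sdxl response:", response);
```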
### Notes
- The API may take some time to process the input and generate the result; consider implementing appropriate wait or loading states.
- Handle possible errors, such as network issues, invalid input, or API limitations.
- Adhere to the terms of use, especially when generating images of real people or copyrighted subjects.
### Example Response
The API response will contain the URL of the generated image or other relevant information. Parse and use the response data according to the actual API documentation.
## Advanced Usage
- Implement batch generation by processing multiple prompts in a loop or as concurrent requests.
- Add a user interface that allows users to enter custom prompts and image dimensions.
- Integrate a storage solution to save and manage the generated images.
# Stable-Diffusion
This document explains how to use the aonweb library to call the Stable-Diffusion API, a text-to-image model that generates images from natural-language prompts.
## Prerequisites
- Node.js environment
- `aonweb` library installed
- Valid Aonet APPID
## Basic Usage
### 1. Import Required Modules
```js
import { AI, AIOptions } from 'aonweb';
```
### 2. Initialize AI Instance
```js
const ai_options = new AIOptions({
    appId: 'your_app_id_here',
    dev_mode: true
});

const aonweb = new AI(ai_options);
```
### 3. Prepare Input Data
```js
const data = {
    input: {
        "prompt": "an astronaut riding a horse on mars, hd, dramatic lighting"
    }
};
```
### Parameter Description
- `prompt`: String, the text description of the image to generate.
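### 4. Call the API
As before, the `prediction` method and endpoint path are assumptions; verify them against the aonweb reference.

```js
// Hypothetical call: method name and endpoint path are assumptions.
const response = await aonweb.prediction("/predictions/ai/stable-diffusion", data);
console.log("Stable-Diffusion response:", response);
```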
### Notes
- The API may take some time to process the input and generate the result; consider implementing appropriate wait or loading states.
- Handle possible errors, such as network issues, invalid input, or API limitations.
- Adhere to the terms of use, especially when generating images of real people or copyrighted subjects.
### Example Response
The API response will contain the URL of the generated image or other relevant information. Parse and use the response data according to the actual API documentation.
## Advanced Usage
- Implement batch generation by processing multiple prompts in a loop or as concurrent requests.
- Add a user interface that allows users to enter custom prompts.
- Integrate a storage solution to save and manage the generated images.
# Whisper
This document explains how to use the aonweb library to call the Whisper API, an automatic speech recognition model that transcribes and translates audio. The parameter list below reflects typical Whisper inputs and is an assumption; consult the actual API documentation for the exact fields.
### Parameter Description
- `audio`: String, the URL of the audio file to transcribe (typical Whisper input).
- `language`: String, the language of the audio, with "en" indicating English (typical Whisper input).
- `translate`: Boolean, whether to translate the transcription into English (typical Whisper input).
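Under those assumptions, an input object might look like this; the field names and audio URL are placeholders to be checked against the real API.

```js
// Hypothetical Whisper input: field names follow common Whisper deployments
// and are NOT confirmed for this API; the URL is a placeholder.
const data = {
    input: {
        "audio": "https://example.com/recording.mp3",
        "language": "en",
        "translate": false
    }
};
```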
### Notes
- Ensure that the provided audio URL is publicly accessible and of good quality to achieve the best transcription accuracy.
- The API may take some time to process the input and generate the result; consider implementing appropriate wait or loading states.
- Handle possible errors, such as network issues, invalid input, or API limitations.
- Adhere to the terms of use and privacy regulations, especially when handling recordings of other people.
### Example Response
The API response will contain the transcribed text or other relevant information. Parse and use the response data according to the actual API documentation.
## Advanced Usage
- Implement batch transcription by processing multiple audio files in a loop or as concurrent requests.
- Add a user interface that allows users to upload their own recordings.
- Integrate a storage solution to save and manage the transcripts.