update document

Commit 8fd1fb38 authored Aug 07, 2024 by duanjinfei
19 changed files
--- a/docs/publicModelsAPI/Bark API Usage Guide.md
+++ b/docs/publicModelsAPI/Bark API Usage Guide.md
@@ -75,11 +75,3 @@ try {
 The API response will contain the URL of the generated cloned voice or other relevant information. Parse and use the response data according to the actual API documentation.
-## Advanced Usage
- Implement batch text-to-speech conversion by processing multiple text segments in a loop or concurrent requests.
- Add a user interface that allows users to upload their own voice samples and input custom text.
- Implement voice post-processing features, such as adjusting volume, adding background music, or applying audio effects.
- Integrate a voice storage solution to save and manage the generated voice files.
- Consider implementing a voice recognition feature to convert the generated voice back to text for verification or other purposes.
--- a/docs/publicModelsAPI/Blip API Usage Guide.md
+++ b/docs/publicModelsAPI/Blip API Usage Guide.md
@@ -105,12 +105,3 @@ try {
 The API response will contain the results of the image recognition or other relevant information. Parse and use the response data according to the actual API documentation.
-## Advanced Usage
- Unimodal encoders, which separately encode image and text. The image encoder is a vision transformer. The text encoder is the same as BERT. A token is appended to the beginning of the text input to summarize the sentence.
- Image-grounded text encoder, which injects visual information by inserting a cross-attention layer between the self-attention layer and the feed forward network for each transformer block of the text encoder. A task-specific token is appended to the text, and the output embedding of  is used as the multimodal representation of the image-text pair.
- Image-grounded text decoder, which replaces the bi-directional self-attention layers in the text encoder with causal self-attention layers. A special token is used to signal the beginning of a sequence.
- Image-Text Contrastive Loss (ITC) activates the unimodal encoder. It aims to align the feature space of the visual transformer and the text transformer by encouraging positive image-text pairs to have similar representations in contrast to the negative pairs.
- Image-Text Matching Loss (ITM) activates the image-grounded text encoder. ITM is a binary classification task, where the model is asked to predict whether an image-text pair is positive (matched) or negative (unmatched) given their multimodal feature.
- Language Modeling Loss (LM) activates the image-grounded text decoder, which aims to generate textual descriptions conditioned on the images.
--- a/docs/publicModelsAPI/Blip-2 API Usage Guide.md
+++ b/docs/publicModelsAPI/Blip-2 API Usage Guide.md
@@ -112,11 +112,3 @@ try {
 ### Example Response
 The API response will contain the results of the image recognition or other relevant information. Parse and use the response data according to the actual API documentation.
-## Advanced Usage
- Implement batch text-to-speech conversion by processing multiple text segments in a loop or concurrent requests.
- Add a user interface that allows users to upload their own voice samples and input custom text.
- Implement voice post-processing features, such as adjusting volume, adding background music, or applying audio effects.
- Integrate a voice storage solution to save and manage the generated voice files.
- Consider implementing a voice recognition feature to convert the generated voice back to text for verification or other purposes.
\ No newline at end of file
--- a/docs/publicModelsAPI/Chattts API Usage Guide.md
+++ b/docs/publicModelsAPI/Chattts API Usage Guide.md
@@ -104,11 +104,3 @@ try {
 The API response will contain the URL of the generated text-to-speech output or other relevant information. Parse and use the response data according to the actual API documentation.
-## Advanced Usage
- Implement batch text-to-speech conversion by processing multiple text segments in a loop or concurrent requests.
- Add a user interface that allows users to upload their own voice samples and input custom text.
- Implement voice post-processing features, such as adjusting volume, adding background music, or applying audio effects.
- Integrate a voice storage solution to save and manage the generated voice files.
- Consider implementing a voice recognition feature to convert the generated voice back to text for verification or other purposes.
--- a/docs/publicModelsAPI/Codeformer API Usage Guide.md
+++ b/docs/publicModelsAPI/Codeformer API Usage Guide.md
@@ -83,11 +83,3 @@ try {
 The API response will contain the results of the image recognition or other relevant information. Parse and use the response data according to the actual API documentation.
-## Advanced Usage
- Implement batch text-to-speech conversion by processing multiple text segments in a loop or concurrent requests.
- Add a user interface that allows users to upload their own voice samples and input custom text.
- Implement voice post-processing features, such as adjusting volume, adding background music, or applying audio effects.
- Integrate a voice storage solution to save and manage the generated voice files.
- Consider implementing a voice recognition feature to convert the generated voice back to text for verification or other purposes.
--- a/docs/publicModelsAPI/Controlnet API Usage Guide.md
+++ b/docs/publicModelsAPI/Controlnet API Usage Guide.md
@@ -126,11 +126,4 @@ try {
 The API response will contain the results of the image recognition or other relevant information. Parse and use the response data according to the actual API documentation.
-## Advanced Usage
- Implement batch text-to-speech conversion by processing multiple text segments in a loop or concurrent requests.
- Add a user interface that allows users to upload their own voice samples and input custom text.
- Implement voice post-processing features, such as adjusting volume, adding background music, or applying audio effects.
- Integrate a voice storage solution to save and manage the generated voice files.
- Consider implementing a voice recognition feature to convert the generated voice back to text for verification or other purposes.
--- a/docs/publicModelsAPI/DreamBooth API Usage Guide.md
+++ b/docs/publicModelsAPI/DreamBooth API Usage Guide.md
@@ -98,12 +98,3 @@ try {
 ### Example Response
 The API response will contain the URL of the generated cloned voice or other relevant information. Parse and use the response data according to the actual API documentation.
-## Advanced Usage
- Implement batch text-to-speech conversion by processing multiple text segments in a loop or concurrent requests.
- Add a user interface that allows users to upload their own voice samples and input custom text.
- Implement voice post-processing features, such as adjusting volume, adding background music, or applying audio effects.
- Integrate a voice storage solution to save and manage the generated voice files.
- Consider implementing a voice recognition feature to convert the generated voice back to text for verification or other purposes.
--- a/docs/publicModelsAPI/FunASR API Usage Guide.md
+++ b/docs/publicModelsAPI/FunASR API Usage Guide.md
@@ -17,62 +17,48 @@ This document will guide developers on how to use the `aonweb` library to invoke
 - `aonweb` library installed
 - Valid Aonet APPID
-## Installation
+## Basic Usage
-Ensure the `aonweb` library is installed. If not, you can install it using npm:
+### 1. Import Required Modules
-```bash
+```js
-npm install aonweb
+import { AI, AIOptions } from 'aonweb';
 ```
-## Usage Instructions
+### 2. Initialize AI Instance
-### 1. Import the `aonweb` Library
+```js
+const ai_options = new AIOptions({
+    appId: 'your_app_id_here',
+    dev_mode: true
+});
-```javascript
+const aonweb = new AI(ai_options);
-const AI = require("aonweb");
 ```
-### 2. Configure Options
+### 3. Prepare Input Data Example
-Create an `options` object containing your APPID:
+```js
+const data = {
-```javascript
-const options = {
-    appid: "your_APPID"
-};
-```
-Make sure to replace `"your_APPID"` with your actual Aonet APPID.
-### 3. Initialize AI Instance
-Initialize the AI instance using the configuration options:
-```javascript
-const aonweb = new AI(options);
-```
-### 4. Invoke FunASR API
-Use the `prediction` method to call the FunASR API:
-```javascript
-async function performSpeechRecognition() {
-    try {
-        let response = await aonweb.prediction("/predictions/ai/funasr", {
    input: {
        "awv": "https://aonweb.ai/mgxm/d9fa255c-4c47-4fec-99ce-f190539f10c4/olle.mp3",
        "batch_size": 300
    }
-        });
+};
-        console.log("FunASR result:", response);
+```
-    } catch (error) {
-        console.error("Error performing speech recognition:", error);
-    }
-}
-performSpeechRecognition();
+### 4. Call the AI Model
+```js
+const price = 8; // Cost of the AI call
+try {
+    const response = await aonweb.prediction("/predictions/ai/funasr", data, price);
+    // Handle response
+    console.log("FunASR result:", response);
+} catch (error) {
+    // Error handling
+    console.error("Error performing speech recognition:", error);
+}
 ```
 ### Parameter Description
@@ -91,13 +77,6 @@ performSpeechRecognition();
 The API response will contain the recognized text content. Parse and use the response data according to the actual API documentation.
-## Advanced Usage
- Implement batch audio processing by processing multiple audio files in a loop or concurrently.
- Add a user interface to allow users to upload their audio files or provide audio URLs.
- Implement real-time speech recognition by integrating the API into live audio streams.
- Integrate post-processing features for text, such as punctuation addition, semantic analysis, or sentiment analysis.
- Consider implementing multi-language support to handle audio in different languages as needed.
 By following this guide, you should be able to effectively use the FunASR API for automatic speech recognition in your applications. If you have any questions or need further clarification, feel free to ask.
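The revised FunASR guide above splits the call into a prepared `data` object and a `prediction(path, data, price)` call. A minimal sketch of how those two steps might be wrapped follows. The helper names `buildFunasrInput` and `recognizeSpeech` are hypothetical and not part of `aonweb`; the `"awv"` and `"batch_size"` keys, the endpoint path, and the price of 8 are copied as-is from the diff, and `client` is assumed to be any object exposing the same `prediction` method as the `aonweb` instance.

```js
// Hypothetical helper: builds the request body from step 3 of the guide.
// The "awv" and "batch_size" keys are copied verbatim from the diff.
function buildFunasrInput(audioUrl, batchSize = 300) {
    if (typeof audioUrl !== "string" || !/^https?:\/\//.test(audioUrl)) {
        throw new TypeError("audioUrl must be an http(s) URL");
    }
    if (!Number.isInteger(batchSize) || batchSize <= 0) {
        throw new RangeError("batchSize must be a positive integer");
    }
    return { input: { "awv": audioUrl, "batch_size": batchSize } };
}

// Hypothetical wrapper: `client` is assumed to expose
// prediction(path, data, price), like the `aonweb` instance from step 2.
async function recognizeSpeech(client, audioUrl, price = 8) {
    const data = buildFunasrInput(audioUrl);
    try {
        const response = await client.prediction("/predictions/ai/funasr", data, price);
        console.log("FunASR result:", response);
        return response;
    } catch (error) {
        console.error("Error performing speech recognition:", error);
        throw error; // let the caller decide how to recover
    }
}
```

Validating the input before the call keeps malformed URLs from reaching a priced request; whether the service performs its own validation is not stated in the guide.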
--- a/docs/publicModelsAPI/Gfpgan API Usage Guide.md
+++ b/docs/publicModelsAPI/Gfpgan API Usage Guide.md
@@ -79,11 +79,4 @@ try {
 The API response will contain the URL of the generated cloned voice or other relevant information. Parse and use the response data according to the actual API documentation.
-## Advanced Usage
- Implement batch text-to-speech conversion by processing multiple text segments in a loop or concurrent requests.
- Add a user interface that allows users to upload their own voice samples and input custom text.
- Implement voice post-processing features, such as adjusting volume, adding background music, or applying audio effects.
- Integrate a voice storage solution to save and manage the generated voice files.
- Consider implementing a voice recognition feature to convert the generated voice back to text for verification or other purposes.
--- a/docs/publicModelsAPI/IDM-VTON AI Model Usage Guide.md
+++ b/docs/publicModelsAPI/IDM-VTON AI Model Usage Guide.md
@@ -94,37 +94,6 @@ Use try-catch blocks to catch and handle possible errors.
 - Ensure the validity and accessibility of image URLs.
 - Adhere to the API provider's terms of use and restrictions.
-## Example Code
-```js
-const AI = require("aonweb");
-async function runIDMVTON() {
-    const options = {
-        auth: process.env.AONET_API_KEY // Store API key in environment variable
-    };
-    const aonweb = new AI(options);
-    try {
-        const response = await aonweb.prediction("/predictions/ai/idm-vton", {
-            input: {
-                "seed": 42,
-                "steps": 30,
-                "garm_img": "https://example.com/garment.jpg",
-                "human_img": "https://example.com/person.jpg",
-                "garment_des": "elegant blue dress"
-            }
-        });
-        console.log("IDM-VTON Result:", response);
-        // Further processing of the response...
-    } catch (error) {
-        // Error handling...
-    }
-}
-runIDMVTON();
-```
 ## Conclusion

--- a/docs/publicModelsAPI/LLaMA 3 API Usage Guide.md
+++ b/docs/publicModelsAPI/LLaMA 3 API Usage Guide.md
@@ -40,20 +40,7 @@ const aonweb = new AI(ai_options);
 ```js
 const data = {
-   input:
+   input:{
-};
-```
-### 4. Call the LLaMA 3 API
-Use the `prediction` method to call the LLaMA 3 API:
-```js
-async function generateText() {
-    try {
-        let response = await aonweb.prediction("/predictions/ai/lllama3:0.0.8",
-        {
-            input: {
        "top_p": 1,
        "prompt": "Plan a day of sightseeing for me in San Francisco.",
        "temperature": 0.75,
@@ -61,14 +48,22 @@ async function generateText() {
        "max_new_tokens": 800,
        "repetition_penalty": 1
    }
-        });
+};
-        console.log("LLaMA 3 result:", response);
+```
-    } catch (error) {
-        console.error("Error generating text:", error);
-    }
-}
-generateText();
+### 4. Call the AI Model
+```js
+const price = 8; // Cost of the AI call
+try {
+    const response = await aonweb.prediction("/predictions/ai/lllama3:0.0.8", data, price);
+    // Handle response
+    console.log("LLaMA 3 result:", response);
+} catch (error) {
+    // Error handling
+    console.error("Error generating text:", error);
+}
 ```
 ### Parameter Description
@@ -90,12 +85,3 @@ generateText();
 ### Example Response
 The API response will contain the generated text. Parse and use the response data according to the actual API documentation.
\ No newline at end of file
-## Advanced Usage
- Implement a conversation system by maintaining conversation history to generate coherent multi-turn dialogues.
- Add a user interface that allows users to customize system prompts and other parameters.
- Implement text post-processing features, such as summary generation, keyword extraction, or sentiment analysis.
- Integrate a content filtering system to ensure the generated content complies with usage policies.
- Consider implementing a caching mechanism to improve response times for frequent queries.
--- a/docs/publicModelsAPI/MiniGpt-4 API Usage Guide.md
+++ b/docs/publicModelsAPI/MiniGpt-4 API Usage Guide.md
@@ -81,12 +81,3 @@ try {
 ### Example Response
 The API response will contain the URL of the generated cloned voice or other relevant information. Parse and use the response data according to the actual API documentation.
-## Advanced Usage
- Implement batch text-to-speech conversion by processing multiple text segments in a loop or concurrent requests.
- Add a user interface that allows users to upload their own voice samples and input custom text.
- Implement voice post-processing features, such as adjusting volume, adding background music, or applying audio effects.
- Integrate a voice storage solution to save and manage the generated voice files.
- Consider implementing a voice recognition feature to convert the generated voice back to text for verification or other purposes.
--- a/docs/publicModelsAPI/Real-Esrgan API Usage Guide.md
+++ b/docs/publicModelsAPI/Real-Esrgan API Usage Guide.md
@@ -76,12 +76,3 @@ try {
 ### Example Response
 The API response will contain the URL of the generated cloned voice or other relevant information. Parse and use the response data according to the actual API documentation.
-## Advanced Usage
- Implement batch text-to-speech conversion by processing multiple text segments in a loop or concurrent requests.
- Add a user interface that allows users to upload their own voice samples and input custom text.
- Implement voice post-processing features, such as adjusting volume, adding background music, or applying audio effects.
- Integrate a voice storage solution to save and manage the generated voice files.
- Consider implementing a voice recognition feature to convert the generated voice back to text for verification or other purposes.
--- a/docs/publicModelsAPI/SadTalker API Usage Guide.md
+++ b/docs/publicModelsAPI/SadTalker API Usage Guide.md
@@ -80,11 +80,3 @@ try {
 ### Example Response
 The API response will contain the URL of the generated talking avatar or other relevant information. Parse and use the response data according to the actual API documentation.
-## Advanced Usage
- Implement error retry mechanisms, especially for long-running tasks.
- Add input validation logic to ensure the provided URLs point to valid image and audio files.
- Consider implementing progress tracking, especially for generating video output.
- For production environments, implement rate limiting and caching mechanisms to optimize API usage.
--- a/docs/publicModelsAPI/Sdxl API Usage Guide.md
+++ b/docs/publicModelsAPI/Sdxl API Usage Guide.md
@@ -86,12 +86,3 @@ try {
 ### Example Response
 The API response will contain the URL of the generated cloned voice or other relevant information. Parse and use the response data according to the actual API documentation.
-## Advanced Usage
- Implement batch text-to-speech conversion by processing multiple text segments in a loop or concurrent requests.
- Add a user interface that allows users to upload their own voice samples and input custom text.
- Implement voice post-processing features, such as adjusting volume, adding background music, or applying audio effects.
- Integrate a voice storage solution to save and manage the generated voice files.
- Consider implementing a voice recognition feature to convert the generated voice back to text for verification or other purposes.
--- a/docs/publicModelsAPI/Stable Diffusion 3 API Usage Guide.md
+++ b/docs/publicModelsAPI/Stable Diffusion 3 API Usage Guide.md
@@ -83,11 +83,3 @@ try {
 ### Example Response
 The API response will contain the URL of the generated image or other relevant information. Parse and use the response data according to the actual API documentation.
-## Advanced Usage
- Implement batch image generation by processing multiple prompts in a loop or concurrent requests.
- Add a user interface that allows users to input custom prompts and parameters.
- Implement image post-processing features, such as resizing, cropping, or applying filters.
- Integrate an image storage solution to save and manage the generated images.
--- a/docs/publicModelsAPI/Stable-Diffusion API Usage Guide.md
+++ b/docs/publicModelsAPI/Stable-Diffusion API Usage Guide.md
@@ -79,12 +79,3 @@ try {
 ### Example Response
 The API response will contain the URL of the generated cloned voice or other relevant information. Parse and use the response data according to the actual API documentation.
-## Advanced Usage
- Implement batch text-to-speech conversion by processing multiple text segments in a loop or concurrent requests.
- Add a user interface that allows users to upload their own voice samples and input custom text.
- Implement voice post-processing features, such as adjusting volume, adding background music, or applying audio effects.
- Integrate a voice storage solution to save and manage the generated voice files.
- Consider implementing a voice recognition feature to convert the generated voice back to text for verification or other purposes.
--- a/docs/publicModelsAPI/Whisper API Usage Guide.md
+++ b/docs/publicModelsAPI/Whisper API Usage Guide.md
@@ -73,12 +73,3 @@ try {
 ### Example Response
 The API response will contain the URL of the generated cloned voice or other relevant information. Parse and use the response data according to the actual API documentation.
-## Advanced Usage
- Implement batch text-to-speech conversion by processing multiple text segments in a loop or concurrent requests.
- Add a user interface that allows users to upload their own voice samples and input custom text.
- Implement voice post-processing features, such as adjusting volume, adding background music, or applying audio effects.
- Integrate a voice storage solution to save and manage the generated voice files.
- Consider implementing a voice recognition feature to convert the generated voice back to text for verification or other purposes.
--- a/docs/publicModelsAPI/XTTS-V2 API Usage Guide.md
+++ b/docs/publicModelsAPI/XTTS-V2 API Usage Guide.md
@@ -80,12 +80,3 @@ try {
 ### Example Response
 The API response will contain the URL of the generated cloned voice or other relevant information. Parse and use the response data according to the actual API documentation.
-## Advanced Usage
- Implement batch text-to-speech conversion by processing multiple text segments in a loop or concurrent requests.
- Add a user interface that allows users to upload their own voice samples and input custom text.
- Implement voice post-processing features, such as adjusting volume, adding background music, or applying audio effects.
- Integrate a voice storage solution to save and manage the generated voice files.
- Consider implementing a voice recognition feature to convert the generated voice back to text for verification or other purposes.