update document

Commit d0543d3d authored Aug 08, 2024 by duanjinfei
8 changed files
--- a/docs/publicModelsAPI/Gfpgan API Usage Guide.md
+++ b/docs/publicModelsAPI/Gfpgan API Usage Guide.md
@@ -64,16 +64,16 @@ try {

 ### Parameter Description

- `img`: String, the text content to be converted into speech.
- `scale`: String, the URL of the audio file used as the voice sample for cloning.
- `version`: String, specifies the language of the text, with "en" indicating English.
+- `img`: String. URL of the image to be restored.
+- `scale`: Number. Rescaling factor.
+- `version`: String. GFPGAN version: `v1.3` gives better quality; `v1.4` gives more detail and better identity.

 ### Notes

- Ensure that the provided audio URL is publicly accessible and of good quality to achieve the best cloning effect.
- The API may take some time to process the input and generate the result, consider implementing appropriate wait or loading states.
+- Ensure that the provided image URL is publicly accessible and of good quality to achieve the best restoration results.
+- The API may take some time to process the input and generate the result; consider implementing appropriate wait or loading states.
 - Handle possible errors, such as network issues, invalid input, or API limitations.
- Adhere to the terms of use and privacy regulations, especially when handling voice samples of others.
+- Adhere to the terms of use and privacy regulations, especially when handling image samples of others.

 ### Example Response


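As a reviewer-style sketch for the Gfpgan parameter list above: a small pre-flight check could catch type mistakes before the paid call is made. The helper name `buildGfpganData`, the example URL, the `{ input: ... }` wrapper (borrowed from the other guides in this commit), and the "positive number" rule for `scale` are all assumptions, not part of the documented API.

```js
// Versions named in the Gfpgan parameter description.
const GFPGAN_VERSIONS = ["v1.3", "v1.4"];

// Hypothetical helper: validates the three documented fields and returns
// a payload shaped like the `data` objects in the other guides.
function buildGfpganData(img, scale, version) {
  if (typeof img !== "string" || img.length === 0) {
    throw new TypeError("img must be a non-empty string (image URL)");
  }
  // Assumption: a rescaling factor only makes sense as a positive number.
  if (typeof scale !== "number" || !Number.isFinite(scale) || scale <= 0) {
    throw new TypeError("scale must be a positive number");
  }
  if (!GFPGAN_VERSIONS.includes(version)) {
    throw new RangeError(`version must be one of ${GFPGAN_VERSIONS.join(", ")}`);
  }
  return { input: { img, scale, version } };
}

// Example with a placeholder URL.
const gfpganData = buildGfpganData("https://example.com/face.png", 2, "v1.4");
```

Passing `"2"` instead of `2` for `scale` would throw here, which matches the type change this commit makes to the description (String to Number).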
--- a/docs/publicModelsAPI/IDM-VTON AI Model Usage Guide.md
+++ b/docs/publicModelsAPI/IDM-VTON AI Model Usage Guide.md
@@ -41,12 +41,16 @@ const aonweb = new AI(ai_options);
 ```js
 const data = {
    input: {
-    "seed": 42,
-    "steps": 30,
-    "garm_img": "https://replicate.delivery/pbxt/KgwTlZyFx5aUU3gc5gMiKuD5nNPTgliMlLUWx160G4z99YjO/sweater.webp",
-    "human_img": "https://replicate.delivery/pbxt/KgwTlhCMvDagRrcVzZJbuozNJ8esPqiNAIJS3eMgHrYuHmW4/KakaoTalk_Photo_2024-04-04-21-44-45.png",
-    "garment_des": "cute pink top"
-  }
+      "crop": false,
+      "seed": 42,
+      "steps": 30,
+      "category": "upper_body",
+      "force_dc": false,
+      "garm_img": "https://replicate.delivery/pbxt/KgwTlZyFx5aUU3gc5gMiKuD5nNPTgliMlLUWx160G4z99YjO/sweater.webp",
+      "mask_img": "https://replicate.delivery/pbxt/KnaDKqnN0h1DDF5CnK7iRSSkFnJrk9kyRiQlcc5gBcy8gpPA/replicate-prediction-wfj8g6sgmxrgp0cf1gnv7btfh8.jpg",
+      "human_img": "https://replicate.delivery/pbxt/KgwTlhCMvDagRrcVzZJbuozNJ8esPqiNAIJS3eMgHrYuHmW4/KakaoTalk_Photo_2024-04-04-21-44-45.png",
+      "garment_des": "cute pink top"
+    }
 };
 ```

@@ -66,11 +70,15 @@ try {

 ### Parameter Description

- `seed`: Random seed for generating reproducible results
- `steps`: Number of processing steps
- `garm_img`: URL of the garment image
- `human_img`: URL of the human image
- `garment_des`: Description of the garment
+- `seed`: Number. Random seed for generating reproducible results.
+- `steps`: Number. Number of inference steps.
+- `garm_img`: String. URL of the garment image; it should match `category` and can be a product image or even a photo of someone.
+- `human_img`: String. URL of the model (person) image; if it is not 3:4, enable `crop`.
+- `garment_des`: String. Description of the garment, e.g. "Short Sleeve Round Neck T-shirt".
+- `crop`: Boolean. Whether to crop the human image; enable this if the image is not 3:4.
+- `category`: String. Category of the garment, e.g. `upper_body`.
+- `force_dc`: Boolean. Use the DressCode version of IDM-VTON. Defaults to `false`, except when `category` is `dresses`.
+- `mask_img`: String. URL of a mask image. Optional, but supplying one is faster.

 ### Handling the Response


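The IDM-VTON parameter list above ties `crop` to the 3:4 aspect ratio of the human image and `force_dc` to the `dresses` category. A minimal sketch of deriving both flags follows; the helper names, the camelCase argument names, the tolerance value, and the assumption that the caller already knows the image dimensions are all hypothetical.

```js
// True when width:height is 3:4 within a small tolerance.
// The tolerance value is an assumption, not something the guide specifies.
function isThreeByFour(width, height, tolerance = 0.01) {
  return Math.abs(width / height - 3 / 4) <= tolerance;
}

// Hypothetical helper: fills `crop` and `force_dc` from the rules in the
// parameter list and passes the remaining fields through unchanged.
function buildVtonData({ humanImg, humanWidth, humanHeight, garmImg, garmentDes, category }) {
  return {
    input: {
      crop: !isThreeByFour(humanWidth, humanHeight), // crop only when not 3:4
      category,
      force_dc: category === "dresses", // mirrors the documented default
      garm_img: garmImg,
      human_img: humanImg,
      garment_des: garmentDes,
    },
  };
}
```

With a 768x1024 human image and `category: "upper_body"`, this yields `crop: false` and `force_dc: false`; a square image flips `crop` to `true`.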
--- a/docs/publicModelsAPI/MiniGpt-4 API Usage Guide.md
+++ b/docs/publicModelsAPI/MiniGpt-4 API Usage Guide.md
@@ -69,7 +69,14 @@ try {

 ### Parameter Description

- 
+- `image`: String. URL of the image to run inference on.
+- `top_p`: Number. Nucleus sampling threshold: sample only from the most likely tokens whose cumulative probability reaches `top_p`.
+- `prompt`: String. Prompt for MiniGPT-4 about the input image.
+- `num_beams`: Number. Number of beams for beam search decoding.
+- `max_length`: Number. Total length of prompt and output, in tokens.
+- `temperature`: Number. Temperature for generating tokens; lower values give more predictable results.
+- `max_new_tokens`: Number. Maximum number of new tokens to generate.
+- `repetition_penalty`: Number. Penalty for repeated words in generated text: 1 is no penalty, values greater than 1 discourage repetition, and values less than 1 encourage it.

 ### Notes


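One consequence of the MiniGpt-4 descriptions above: `max_length` counts prompt plus output, so a `max_new_tokens` larger than `max_length` can never be honoured. The sketch below checks that relationship before the call. The function name is hypothetical, and treating `image` and `prompt` as required is an assumption.

```js
// Hypothetical pre-flight check for a MiniGpt-4 `input` object.
// Returns a list of human-readable problems; an empty list means "looks fine".
function checkMiniGptInput(input) {
  const problems = [];
  // Assumption: both image and prompt are required by the endpoint.
  if (typeof input.image !== "string" || input.image === "") {
    problems.push("image must be a non-empty URL string");
  }
  if (typeof input.prompt !== "string" || input.prompt === "") {
    problems.push("prompt must be a non-empty string");
  }
  // max_length covers prompt + output, so new tokens cannot exceed it.
  if (
    input.max_length !== undefined &&
    input.max_new_tokens !== undefined &&
    input.max_new_tokens > input.max_length
  ) {
    problems.push("max_new_tokens exceeds max_length");
  }
  return problems;
}
```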
--- a/docs/publicModelsAPI/Real-Esrgan API Usage Guide.md
+++ b/docs/publicModelsAPI/Real-Esrgan API Usage Guide.md
@@ -64,7 +64,9 @@ try {

 ### Parameter Description

- 
+- `image`: String. The input image to upscale.
+- `scale`: Number. Factor to scale the image by.
+- `face_enhance`: Boolean. Run GFPGAN face enhancement along with upscaling.

 ### Notes


--- a/docs/publicModelsAPI/SadTalker API Usage Guide.md
+++ b/docs/publicModelsAPI/SadTalker API Usage Guide.md
@@ -57,8 +57,10 @@ const price = 8; // Cost of the AI call
 try {
    const response = await aonweb.prediction("/predictions/ai/sadtalker", data, price);
    // Handle response
+    console.log("sadtalker result:", response);
 } catch (error) {
    // Error handling
+    console.error("Error generating SadTalker result:", error);
 }
 ```


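The SadTalker call above awaits the prediction with no upper bound on waiting time. One possible way to bound it is a generic timeout wrapper built on `Promise.race`; the name `withTimeout` is hypothetical and nothing here is part of the `aonweb` SDK. The wrapper only stops waiting; it does not cancel the underlying request.

```js
// Rejects if `promise` has not settled within `ms` milliseconds.
// The timer is cleared either way so it cannot keep the process alive.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Possible use inside the existing try block (the 120000 ms limit is arbitrary):
//   const response = await withTimeout(
//     aonweb.prediction("/predictions/ai/sadtalker", data, price),
//     120000
//   );
```

A timeout rejection would land in the same `catch` branch as other errors, so the `console.error` line added in this commit would report it too.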
--- a/docs/publicModelsAPI/Sdxl API Usage Guide.md
+++ b/docs/publicModelsAPI/Sdxl API Usage Guide.md
@@ -58,6 +58,46 @@ const data = {
 };
 ```

+```js
+const data = {
+   input:{
+      "width": 1024,
+      "height": 1024,
+      "prompt": "A studio photo of a rainbow coloured cat",
+      "refine": "expert_ensemble_refiner",
+      "scheduler": "KarrasDPM",
+      "num_outputs": 1,
+      "guidance_scale": 7.5,
+      "high_noise_frac": 0.8,
+      "prompt_strength": 0.8,
+      "num_inference_steps": 50
+    }
+};
+```
+
+
+```js
+const data = {
+   input:{
+      "mask": "https://replicate.delivery/pbxt/JF3Ld3yPLVA3JIELHx1uaAV5CQOyr4AoiOfo6mJZn2fofGaT/dog-mask.png",
+      "image": "https://replicate.delivery/pbxt/JF3LddQgRiMM9Q4Smyfw7q7BR9Gn0PwkSWvJjKDPxyvr8Ru0/cool-dog.png",
+      "width": 1024,
+      "height": 1024,
+      "prompt": "An orange cat sitting on a bench",
+      "refine": "base_image_refiner",
+      "scheduler": "KarrasDPM",
+      "num_outputs": 1,
+      "guidance_scale": 7.5,
+      "high_noise_frac": 0.8,
+      "prompt_strength": 0.8,
+      "num_inference_steps": 25
+    }
+};
+```
+
+
+
+
 ### 4. Call the AI Model

 ```js
@@ -74,14 +114,30 @@ try {

 ### Parameter Description

- 
+- `width`: Number. Width of the output image.
+- `height`: Number. Height of the output image.
+- `image`: String. Input image for img2img or inpaint mode.
+- `prompt`: String. Text prompt describing the image to generate.
+- `refine`: String. Which refine style to use. One of `no_refiner`, `expert_ensemble_refiner`, `base_image_refiner`.
+- `refine_steps`: Number. For `base_image_refiner`, the number of steps to refine; defaults to `num_inference_steps`.
+- `scheduler`: String. Scheduler to use. One of `DDIM`, `DPMSolverMultistep`, `HeunDiscrete`, `KarrasDPM`, `K_EULER_ANCESTRAL`, `K_EULER`, `PNDM`.
+- `lora_scale`: Number. LoRA additive scale. Only applicable on trained models.
+- `num_outputs`: Number. Number of images to output.
+- `guidance_scale`: Number. Scale for classifier-free guidance.
+- `apply_watermark`: Boolean. Applies a watermark so that downstream applications can determine whether an image is generated. If you have other provisions for generating or deploying images safely, you can use this to disable watermarking.
+- `high_noise_frac`: Number. For `expert_ensemble_refiner`, the fraction of noise to use.
+- `negative_prompt`: String. Negative prompt.
+- `seed`: Number. Random seed. Leave blank to randomize the seed.
+- `prompt_strength`: Number. Prompt strength when using img2img or inpaint; 1.0 corresponds to full destruction of information in the image.
+- `num_inference_steps`: Number. Number of denoising steps.
+- `mask`: String. Input mask for inpaint mode. Black areas will be preserved; white areas will be inpainted.

 ### Notes

- Ensure that the provided audio URL is publicly accessible and of good quality to achieve the best cloning effect.
- The API may take some time to process the input and generate the result, consider implementing appropriate wait or loading states.
+- Ensure that any provided image URL is publicly accessible and of good quality to achieve the best results.
+- The API may take some time to process the input and generate the result; consider implementing appropriate wait or loading states.
 - Handle possible errors, such as network issues, invalid input, or API limitations.
- Adhere to the terms of use and privacy regulations, especially when handling voice samples of others.
+- Adhere to the terms of use and privacy regulations, especially when handling image samples of others.

 ### Example Response


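The Sdxl parameter list above implies three ways of calling the model, depending on whether `image` and `mask` are set. The sketch below makes that explicit and checks `refine` against the listed enum. The function name, the label `txt2img` for the prompt-only case, and the rule that a `mask` needs an `image` are assumptions.

```js
// Refine styles listed in the Sdxl parameter description.
const REFINE_STYLES = ["no_refiner", "expert_ensemble_refiner", "base_image_refiner"];

// Hypothetical helper: infers the mode from which optional inputs are present.
function sdxlMode(input) {
  if (input.refine !== undefined && !REFINE_STYLES.includes(input.refine)) {
    throw new RangeError(`refine must be one of ${REFINE_STYLES.join(", ")}`);
  }
  if (input.mask !== undefined) {
    // Assumption: inpainting needs an image for the mask to apply to.
    if (input.image === undefined) {
      throw new Error("mask was given without image");
    }
    return "inpaint";
  }
  return input.image !== undefined ? "img2img" : "txt2img";
}
```

Applied to the two example payloads added in this commit, the prompt-only one would be classified as `txt2img` and the one with `mask` and `image` as `inpaint`.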
--- a/docs/publicModelsAPI/Stable-Diffusion API Usage Guide.md
+++ b/docs/publicModelsAPI/Stable-Diffusion API Usage Guide.md
@@ -45,12 +45,46 @@ const data = {
    "scheduler": "K_EULER",
    "num_outputs": 1,
    "guidance_scale": 7.5,
-    "image_dimensions": "512x512",
    "num_inference_steps": 50
  }
 };
 ```

+```js
+const data = {
+   input:{
+      "prompt": "multicolor hyperspace",
+      "num_outputs": 1,
+      "guidance_scale": 7.5,
+      "num_inference_steps": 50
+    }
+};
+```
+
+```js
+const data = {
+   input: {
+      "prompt": "a gentleman otter in a 19th century portrait",
+      "num_outputs": 1,
+      "guidance_scale": 7.5,
+      "num_inference_steps": 100
+    }
+};
+```
+
+```js
+const data = {
+   input:{
+      "width": 512,
+      "height": 512,
+      "prompt": "eye",
+      "num_outputs": 1,
+      "guidance_scale": 7.5,
+      "num_inference_steps": 50
+    }
+};
+```
+
 ### 4. Call the AI Model

 ```js
@@ -67,14 +101,22 @@ try {

 ### Parameter Description

- 
+- `prompt`: String. Text prompt describing the image to generate.
+- `scheduler`: String. Scheduler to use, e.g. `K_EULER`; the list of allowed values is not given here.
+- `num_outputs`: Number. Number of images to generate.
+- `guidance_scale`: Number. Scale for classifier-free guidance.
+- `num_inference_steps`: Number. Number of denoising steps.
+- `height`: Number. Height of the generated image in pixels. Must be a multiple of 64.
+- `width`: Number. Width of the generated image in pixels. Must be a multiple of 64.
+- `negative_prompt`: String. Things you do not want to see in the output.
+- `seed`: Number. Random seed. Leave blank to randomize the seed.

 ### Notes

- Ensure that the provided audio URL is publicly accessible and of good quality to achieve the best cloning effect.
- The API may take some time to process the input and generate the result, consider implementing appropriate wait or loading states.
+- Ensure that any provided image URL is publicly accessible and of good quality to achieve the best results.
+- The API may take some time to process the input and generate the result; consider implementing appropriate wait or loading states.
 - Handle possible errors, such as network issues, invalid input, or API limitations.
- Adhere to the terms of use and privacy regulations, especially when handling voice samples of others.
+- Adhere to the terms of use and privacy regulations, especially when handling image samples of others.

 ### Example Response


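The Stable-Diffusion parameter list above requires `width` and `height` to be multiples of 64. A small sketch of checking and correcting a requested size follows; both helper names are hypothetical, and rounding to the nearest multiple (rather than rejecting the value) is a design assumption.

```js
// True when `pixels` is a positive multiple of 64.
function isMultipleOf64(pixels) {
  return Number.isInteger(pixels) && pixels > 0 && pixels % 64 === 0;
}

// Rounds a requested size to the nearest multiple of 64, never below 64.
function snapTo64(pixels) {
  return Math.max(64, Math.round(pixels / 64) * 64);
}

// A requested 500x700 image becomes 512x704.
const size = { width: snapTo64(500), height: snapTo64(700) };
```

Snapping silently changes the aspect ratio slightly, so a stricter caller might prefer to reject invalid sizes with `isMultipleOf64` instead.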
--- a/docs/publicModelsAPI/Whisper API Usage Guide.md
+++ b/docs/publicModelsAPI/Whisper API Usage Guide.md
@@ -40,7 +40,79 @@ const aonweb = new AI(ai_options);

 ```js
 const data = {
-   input:
+   input:{
+    "audio": "https://replicate.delivery/mgxm/e5159b1b-508a-4be4-b892-e1eb47850bdc/OSR_uk_000_0050_8k.wav",
+    "model": "large-v3",
+    "translate": false,
+    "temperature": 0,
+    "transcription": "plain text",
+    "suppress_tokens": "-1",
+    "logprob_threshold": -1,
+    "no_speech_threshold": 0.6,
+    "condition_on_previous_text": true,
+    "compression_ratio_threshold": 2.4,
+    "temperature_increment_on_fallback": 0.2
+  }
+};
+```
+
+
+```js
+const data = {
+   input:{
+      "audio": "https://replicate.delivery/pbxt/LJr3aqYueyyKOKkIwWWIH67SyvzrAKfCm5tNVYc3uSt7oWy4/4th-dimension-explained-by-a-high-school-student.mp3",
+      "model": "large-v3",
+      "language": "auto",
+      "translate": false,
+      "temperature": 0,
+      "transcription": "plain text",
+      "suppress_tokens": "-1",
+      "logprob_threshold": -1,
+      "no_speech_threshold": 0.6,
+      "condition_on_previous_text": true,
+      "compression_ratio_threshold": 2.4,
+      "temperature_increment_on_fallback": 0.2
+    }
 };
 ```

@@ -61,7 +133,18 @@ try {

 ### Parameter Description

- 
+- `audio`: String. URL of the audio file to transcribe.
+- `model`: String. Whisper model size (currently only `large-v3` is supported).
+- `translate`: Boolean. Translate the text to English when set to `true`.
+- `patience`: Number. Optional patience value to use in beam decoding, as in https://arxiv.org/abs/2204.05424; the default (1.0) is equivalent to conventional beam search.
+- `temperature`: Number. Temperature to use for sampling.
+- `transcription`: String. Output format for the transcription, e.g. `"plain text"`.
+- `suppress_tokens`: String. Comma-separated list of token IDs to suppress during sampling; `"-1"` suppresses most special characters except common punctuation.
+- `logprob_threshold`: Number. If the average log probability is lower than this value, treat the decoding as failed.
+- `no_speech_threshold`: Number. If the probability of the `<|nospeech|>` token is higher than this value and the decoding has failed due to `logprob_threshold`, consider the segment as silence.
+- `condition_on_previous_text`: Boolean. If `true`, provide the previous output of the model as a prompt for the next window; disabling may make the text inconsistent across windows, but the model becomes less prone to getting stuck in a failure loop.
+- `compression_ratio_threshold`: Number. If the gzip compression ratio is higher than this value, treat the decoding as failed.
+- `temperature_increment_on_fallback`: Number. Amount by which to increase the temperature when falling back, that is, when the decoding fails to meet either `compression_ratio_threshold` or `logprob_threshold`.

 ### Notes
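To make `temperature` and `temperature_increment_on_fallback` from the Whisper parameter list concrete: in the reference Whisper command-line tool, the two values expand into a schedule of temperatures that are tried in order, capped at 1.0. The sketch below reproduces that expansion. Whether the hosted endpoint behaves identically is an assumption, and the function name is hypothetical.

```js
// Builds the list of temperatures tried in order: start at `temperature`,
// add `increment` after each failed decode, and stop at 1.0.
function fallbackTemperatures(temperature, increment) {
  if (!(increment > 0)) return [temperature]; // no fallback configured
  const steps = Math.max(0, Math.floor((1 - temperature + 1e-6) / increment));
  const schedule = [];
  for (let i = 0; i <= steps; i++) {
    // Round to 6 decimals to hide floating-point noise such as 0.6000000000000001.
    schedule.push(Math.round((temperature + i * increment) * 1e6) / 1e6);
  }
  return schedule;
}

// Values from the example payloads in this commit: temperature 0, increment 0.2.
console.log(fallbackTemperatures(0, 0.2)); // → [ 0, 0.2, 0.4, 0.6, 0.8, 1 ]
```

With the example values, a segment that keeps failing the `compression_ratio_threshold` or `logprob_threshold` check would be decoded up to six times under this scheme.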