Audio-to-Text

Speech-to-Text

curl --request POST \
  --url https://api.vivgrid.com/v1/audio/transcriptions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: multipart/form-data' \
  --form file='@example-file' \
  --form 'model=<string>' \
  --form 'prompt=<string>' \
  --form 'response_format=<string>' \
  --form temperature=123

{
  "text": "<string>"
}

POST /v1/audio/transcriptions

Authorizations

Authorization (string, header, required)

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
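
As a minimal sketch of the header format described above, the Python helper below builds the Authorization value from a token. The helper name auth_header and the environment variable VIVGRID_TOKEN are hypothetical and chosen for illustration only; they are not part of the API.

```python
import os


def auth_header(token):
    """Return the Authorization header in the form `Bearer <token>`."""
    token = token.strip()  # drop stray whitespace or a trailing newline
    if not token:
        raise ValueError("auth token is empty")
    return {"Authorization": f"Bearer {token}"}


if __name__ == "__main__":
    # VIVGRID_TOKEN is an assumed variable name; read the token from
    # wherever you actually store it.
    print(auth_header(os.environ.get("VIVGRID_TOKEN", "<token>")))
```

Keeping the token outside the source code (for example in the environment) avoids committing it by accident.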

Body

multipart/form-data

file (required)

Audio file to transcribe. Supported formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm.

model (string)

Model to use for transcription, such as whisper-1.

prompt (string)

Optional prompt to guide the transcription output.

response_format (string)

Optional output format, for example "json" or "text".

temperature (number<float>)

Optional sampling temperature that controls the randomness of the output.
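
The body fields above can be assembled into a multipart/form-data request without third-party libraries. The sketch below only builds the request object and does not send it. The function name build_transcription_request is hypothetical, and defaulting model to whisper-1 is an assumption based on the only model named on this page.

```python
import mimetypes
import urllib.request
import uuid

API_URL = "https://api.vivgrid.com/v1/audio/transcriptions"


def build_transcription_request(token, filename, audio_bytes, model="whisper-1",
                                prompt=None, response_format=None,
                                temperature=None):
    """Build (but do not send) a multipart/form-data POST for the endpoint."""
    boundary = uuid.uuid4().hex
    fields = {"model": model, "prompt": prompt,
              "response_format": response_format, "temperature": temperature}
    parts = []
    for name, value in fields.items():
        if value is None:
            continue  # optional fields are omitted unless the caller sets them
        parts.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f"{value}\r\n").encode()
        )
    # Guess the audio MIME type from the file name, with a generic fallback.
    content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    parts.append(
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
         f"Content-Type: {content_type}\r\n\r\n").encode()
        + audio_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())  # closing boundary
    return urllib.request.Request(
        API_URL,
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the returned object to urllib.request.urlopen would send the request. The sketch does not escape quotes in filename, so it suits simple file names only.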

Response

Successful transcription

text (string)

Transcribed text of the uploaded audio.
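
A short sketch of reading that field from the response body in Python follows. It assumes the JSON shape shown in the sample response; other response_format values may return a different body, which this helper does not handle. The name extract_text is hypothetical.

```python
import json


def extract_text(body):
    """Return the `text` field from a JSON transcription response body."""
    payload = json.loads(body)
    if not isinstance(payload, dict) or not isinstance(payload.get("text"), str):
        raise ValueError("response has no string 'text' field")
    return payload["text"]


if __name__ == "__main__":
    # Illustrative payload only, not real API output.
    sample = b'{"text": "hello world"}'
    print(extract_text(sample))
```

Validating the field before use gives a clear error when the body is an error object or an unexpected format.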
