Upload and process files
Upload and process files with advanced extraction capabilities
POST
File Upload API
The File Upload API processes various document formats, extracts text and data, and provides specialized processing options for AI workflows. It handles documents, images, and audio files with different extraction modes.
Base URLs
Gately AI offers two upload endpoints:
- Standard Upload
- Advanced Processing
Supported File Types
Documents
PDF (.pdf), Word (.docx), PowerPoint (.pptx), Excel (.xlsx, .xls), Text (.txt)
Images
JPG, PNG, WEBP, GIF, SVG, BMP, TIFF
Audio
MP3, WAV, OGG, M4A (requires Azure Speech config)
Data
JSON, XML, CSV
Size Limits
Maximum file size: 50 MB per file
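Checking the file type and size on the client avoids a round trip for uploads the API would reject anyway. The following is a minimal sketch, not part of the API: the function names are hypothetical, the extensions for images, audio, and data are assumed from the format names listed above, and 50 MB is assumed to mean 50 × 1024 × 1024 bytes.

```python
from pathlib import Path
from typing import List, Optional

# Extensions derived from the supported file types listed above.
# Only the document extensions are spelled out in the text; the rest are assumed.
SUPPORTED_EXTENSIONS = {
    "documents": {".pdf", ".docx", ".pptx", ".xlsx", ".xls", ".txt"},
    "images": {".jpg", ".png", ".webp", ".gif", ".svg", ".bmp", ".tiff"},
    "audio": {".mp3", ".wav", ".ogg", ".m4a"},
    "data": {".json", ".xml", ".csv"},
}

# Assumption: "50 MB" is interpreted as 50 * 1024 * 1024 bytes.
MAX_FILE_SIZE_BYTES = 50 * 1024 * 1024


def classify_file(filename: str) -> Optional[str]:
    """Return the category for a filename, or None if the extension is not listed."""
    extension = Path(filename).suffix.lower()
    for category, extensions in SUPPORTED_EXTENSIONS.items():
        if extension in extensions:
            return category
    return None


def check_upload(filename: str, size_bytes: int) -> List[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    if classify_file(filename) is None:
        problems.append("Unsupported file type")
    if size_bytes > MAX_FILE_SIZE_BYTES:
        problems.append("File size limit exceeded")
    return problems
```

Audio files pass this check, but transcription still depends on the Azure Speech configuration mentioned above.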
Processing Options
The OpenAPI playground above allows you to test file uploads with various processing options. Use the parameters described below.
Common Parameters
Enable Optical Character Recognition for image-based documents
Enable Vision processing for images and visually-rich content
Save raw files along with processed output to storage
Advanced Parameters
Extract only text content from the document
Process with vision models only (for images)
Return content organized by pages
Extract only images from documents
Extraction mode: ‘default’ or ‘embeddings’
Remove headers/footers from documents
Example Usage
Basic File Upload
With OCR Processing
Extract for AI Embeddings
Page-Based Extraction
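The four cases above differ only in which form fields accompany the file. The sketch below builds the `multipart/form-data` request with the Python standard library and does not send it. Everything specific in it is an assumption: `UPLOAD_URL` is a placeholder rather than a real Gately AI base URL, and the form field names (`file`, `ocr`, `mode`, `pages`) are hypothetical because the actual parameter names are not given in this text. Only the `Bearer` prefix, the `true`/`false` values, and the `default`/`embeddings` modes come from this page.

```python
import urllib.request
import uuid
from typing import Dict, Tuple

# Hypothetical placeholder; substitute the real Standard or Advanced endpoint URL.
UPLOAD_URL = "https://api.example.com/upload"


def encode_multipart(fields: Dict[str, str], filename: str, file_bytes: bytes) -> Tuple[bytes, str]:
    """Encode text fields plus one file part; returns (body, Content-Type header value)."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(f"--{boundary}\r\n".encode())
        parts.append(f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode())
        parts.append(f"{value}\r\n".encode())
    # The file part; "file" is an assumed field name.
    parts.append(f"--{boundary}\r\n".encode())
    parts.append(f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode())
    parts.append(b"Content-Type: application/octet-stream\r\n\r\n")
    parts.append(file_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_upload_request(api_key: str, filename: str, file_bytes: bytes, **options) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the file and processing options."""
    # Booleans become the strings "true"/"false", matching the listed available options.
    fields = {
        name: (str(value).lower() if isinstance(value, bool) else str(value))
        for name, value in options.items()
    }
    body, content_type = encode_multipart(fields, filename, file_bytes)
    return urllib.request.Request(
        UPLOAD_URL,
        data=body,
        method="POST",
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": content_type},
    )


sample = b"%PDF-1.4 sample bytes"
basic = build_upload_request("YOUR_API_KEY", "report.pdf", sample)
with_ocr = build_upload_request("YOUR_API_KEY", "scan.pdf", sample, ocr=True)
for_embeddings = build_upload_request("YOUR_API_KEY", "report.pdf", sample, mode="embeddings")
page_based = build_upload_request("YOUR_API_KEY", "report.pdf", sample, pages=True)
# To send a request: urllib.request.urlopen(basic)
```

Building the request separately from sending it makes the field encoding easy to inspect before any upload happens.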
Response Formats
Standard Processing
Success status of the upload and processing
Extracted text or data URL
Detected file type (pdf, docx, image, etc.)
Error message (if any)
Embeddings Mode
Unique identifier for the request
Object type (e.g., ‘chunks’)
Unix timestamp when the request was created
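A client usually needs to tell the two response shapes apart before using the result. The sketch below shows one way to do that. The field names are not given in this text, so the keys used here (`success`, `content`, `fileType`, `error`, `id`, `object`, `created`) are hypothetical stand-ins for the fields described above and should be replaced with the actual names.

```python
from datetime import datetime, timezone
from typing import Any, Dict


def summarize_response(payload: Dict[str, Any]) -> str:
    """Return a one-line summary of a standard or embeddings-mode response."""
    # Assumption: embeddings-mode responses carry an object type and a Unix timestamp.
    if "object" in payload and "created" in payload:
        created = datetime.fromtimestamp(payload["created"], tz=timezone.utc)
        return f"{payload['object']} request {payload.get('id')} created {created.isoformat()}"
    # Standard responses: check the error message before the success flag.
    if payload.get("error"):
        return f"failed: {payload['error']}"
    if payload.get("success"):
        content = payload.get("content", "")
        return f"{payload.get('fileType')} processed, {len(content)} characters"
    return "unrecognized response"
```

Converting the Unix timestamp with an explicit UTC timezone avoids results that depend on the machine's local time.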
Example Response (Standard)
Example Response (Embeddings Mode)
Special Features
Document Structure Preservation
The API preserves document structure including headers, sections, lists, and tables during extraction.
PDF Processing
Advanced PDF handling including form extraction, tabular data processing, and header/footer removal.
Image Processing
Extract text from images using OCR or process them with vision models for content understanding.
Audio Transcription
Convert audio files to text transcripts (requires Azure Speech configuration).
Embeddings Preparation
Special formatting for AI embeddings generation with optimized chunking and token counting.
Error Handling
If an error occurs, the API returns a JSON object with the error message:
- File size limit exceeded
- Unsupported file type
- OCR service unavailable
- Processing timeout
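One way to handle these errors on the client is to turn the JSON error object into an exception that records whether a retry is worthwhile. This is a sketch under stated assumptions: the `error` key is a hypothetical name for the error message field, and treating "OCR service unavailable" and "Processing timeout" as transient is a judgment call that the documentation does not confirm.

```python
import json
from typing import Any, Dict


class UploadError(Exception):
    """Raised when the API response carries an error message."""

    def __init__(self, message: str, retryable: bool) -> None:
        super().__init__(message)
        self.retryable = retryable


# Assumption: these two conditions are transient; size and type errors are not.
RETRYABLE_MESSAGES = ("OCR service unavailable", "Processing timeout")


def raise_for_error(raw_body: str) -> Dict[str, Any]:
    """Parse a JSON response body; raise UploadError if it reports an error."""
    payload = json.loads(raw_body)
    message = payload.get("error")  # hypothetical key name
    if message:
        retryable = any(known.lower() in message.lower() for known in RETRYABLE_MESSAGES)
        raise UploadError(message, retryable)
    return payload
```

A caller can catch `UploadError` and retry with a delay only when `retryable` is true, since resending an oversized or unsupported file will fail the same way.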
Integration with AI Services
The extracted content can be used directly with Gately AI’s AI models.
For large documents, use the embeddings mode and work with the chunked output for better AI processing.
Authorizations
Enter your API key prefixed with 'Bearer '
Body
multipart/form-data
File to upload
Enable OCR for image processing
Available options: true, false
Enable vision-based processing for images
Available options: true, false
Save file to configured storage
Available options: true, false
Extract only text content
Available options: true, false
Process with vision only
Available options: true, false
Return page-based structured response
Available options: true, false
Extract only images from documents
Available options: true, false
Extraction mode: 'default' or 'embeddings'
Available options: default, embeddings
Remove headers/footers from documents
Available options: true, false
