feat: add image input support for Gemini and Vertex providers #421
Description
This pull request adds image input support to the Google Gemini and Vertex providers in any-llm. The changes include:
Extending the input interface to accept images as base64-encoded strings, URLs, or raw bytes in the `messages` parameter.
Updating the payload construction and validation logic to handle image data according to Gemini and Vertex API requirements.
Implementing error handling for invalid image formats, corrupt data, and unsupported types.
Updating documentation (README and API docs) with clear instructions and examples for sending images.
Adding unit and integration tests for valid and invalid image input scenarios.
Ensuring backward compatibility so existing text-only workflows remain unaffected.
These changes enable multimodal (text + image) support for Gemini and Vertex, making any-llm more flexible for developers.
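To make the input handling described above concrete, here is a minimal sketch of how the three accepted image forms (base64 string, URL, raw bytes) might be normalized and validated before a provider payload is built. This is not the code in this PR: `normalize_image`, `sniff_mime`, `InvalidImageError`, and the returned dict shape are hypothetical names chosen for illustration, and the format check covers only a small subset of image types.

```python
import base64
import binascii
from typing import Union

# Magic-number prefixes for a few common formats (illustrative subset only).
_MAGIC = {
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"\xff\xd8\xff": "image/jpeg",
    b"GIF87a": "image/gif",
    b"GIF89a": "image/gif",
}


class InvalidImageError(ValueError):
    """Hypothetical error type for unusable image input."""


def sniff_mime(data: bytes) -> str:
    """Guess the MIME type from leading bytes, or reject the data."""
    for magic, mime in _MAGIC.items():
        if data.startswith(magic):
            return mime
    raise InvalidImageError("unsupported or corrupt image data")


def normalize_image(image: Union[str, bytes]) -> dict:
    """Turn a URL, base64 string, or raw bytes into one neutral dict shape.

    The dict layout is an assumption for this sketch, not a Gemini or
    Vertex wire format.
    """
    if isinstance(image, bytes):
        return {"kind": "inline", "mime_type": sniff_mime(image), "data": image}
    if not isinstance(image, str):
        raise InvalidImageError(f"unsupported image type: {type(image).__name__}")
    if image.startswith(("http://", "https://")):
        return {"kind": "url", "url": image}
    try:
        # validate=True rejects characters outside the base64 alphabet.
        raw = base64.b64decode(image, validate=True)
    except (binascii.Error, ValueError) as exc:
        raise InvalidImageError(
            "image string is neither a URL nor valid base64"
        ) from exc
    return {"kind": "inline", "mime_type": sniff_mime(raw), "data": raw}
```

Text-only messages never reach a helper like this, which is one way the backward-compatibility goal could be kept: only message parts that carry an image go through normalization, and every failure surfaces as a single error type that callers can catch.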
Checklist