[FEATURE] Support passing image as input to LLM #496

@badmonster0

Description

Currently our LLM interface and the ExtractByLlm builtin function only support text prompts. We want them to support image inputs too.

  • Extend LlmGenerateRequest to accept image bytes as input:
    pub struct LlmGenerateRequest<'a> {
  • Update the existing LLM clients to pass the image to their respective APIs (usually base64-encoded)
  • Extend the ExtractByLlm function to accept an optional image as input
  • This can be tested with the existing image_search_example example, which currently uses custom code to send images to the LLM. We should be able to replace that code with the ExtractByLlm function.
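
The first two items above can be sketched roughly as follows. This is only an illustration under stated assumptions: the real fields of LlmGenerateRequest are not shown in this issue, so `GenerateRequestSketch`, its field names, and the helpers `base64_encode` and `image_data_url` are all hypothetical. Base64 is written out by hand here purely to keep the sketch free of external crates; a real client would more likely use an existing base64 crate.

```rust
use std::borrow::Cow;

/// Hypothetical stand-in for the request type; the field names are
/// assumptions, not the project's actual definition.
pub struct GenerateRequestSketch<'a> {
    pub user_prompt: Cow<'a, str>,
    /// Optional raw image bytes; `None` keeps the text-only behavior.
    pub image: Option<Cow<'a, [u8]>>,
}

const B64: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Standard base64 (RFC 4648 alphabet) with `=` padding.
pub fn base64_encode(bytes: &[u8]) -> String {
    let mut out = String::with_capacity((bytes.len() + 2) / 3 * 4);
    for chunk in bytes.chunks(3) {
        // Pack up to three bytes into a 24-bit group, zero-filling the tail.
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;
        out.push(B64[(n >> 18) as usize & 63] as char);
        out.push(B64[(n >> 12) as usize & 63] as char);
        // Emit padding for the positions that had no input byte.
        out.push(if chunk.len() > 1 { B64[(n >> 6) as usize & 63] as char } else { '=' });
        out.push(if chunk.len() > 2 { B64[n as usize & 63] as char } else { '=' });
    }
    out
}

/// Renders the optional image as a `data:` URL, one form in which an
/// API client might pass a base64-encoded image.
pub fn image_data_url(req: &GenerateRequestSketch<'_>, mime: &str) -> Option<String> {
    req.image
        .as_deref()
        .map(|bytes| format!("data:{};base64,{}", mime, base64_encode(bytes)))
}
```

Making the image an `Option` means existing text-only callers would only need to pass `None`, and each client could decide how to embed the encoded bytes for its own API.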

❤️ Contributors, please refer to 📙Contributing Guide.
Unless the PR can be sent immediately (e.g. just a few lines of code), we recommend leaving a comment on the issue, such as "I'm working on it" or "Can I work on this issue?", to avoid duplicated work. Our Discord server is always open and friendly.

Metadata

Labels

enhancement (New feature or request), help wanted (Extra attention is needed)

Projects

Status

Done
