Add option (`autoselect_llm`) to automatically select the LLM #118

dilithjay · 2025-07-29T00:27:58Z

When parsing in AUTO-routing mode with `autoselect_llm=True` (set as a kwarg to the parse function), the LLM is selected automatically.

A ranked list of models is obtained from the similarity of the input doc to the documents in the benchmark: the model ranking of the most similar benchmark doc is used for the input doc. The router then uses the highest-ranked model for which API keys have been provided.
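The selection logic described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the bag-of-words cosine similarity, the `select_model` function, the benchmark record layout, and the model and environment variable names are all assumptions made for the example.

```python
import math
import os
from collections import Counter


def bag_of_words(text):
    """Lower-cased token counts. A stand-in for whatever features the PR uses."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_model(input_text, benchmark, key_env_vars, env=None):
    """Return the best-ranked model with an API key set, or None.

    benchmark: list of {"text": str, "ranking": [model names, best first]}
    key_env_vars: hypothetical mapping of model name -> API key variable name
    """
    env = os.environ if env is None else env
    query = bag_of_words(input_text)
    # Take the ranking of the most similar benchmark document.
    best_doc = max(
        benchmark,
        key=lambda doc: cosine_similarity(query, bag_of_words(doc["text"])),
    )
    # Walk the ranking and stop at the first model whose key is available.
    for model in best_doc["ranking"]:
        if env.get(key_env_vars.get(model, "")):
            return model
    return None


# Hypothetical benchmark entries and key names, for illustration only.
benchmark = [
    {"text": "invoice total amount due tax", "ranking": ["model-a", "model-b"]},
    {"text": "abstract introduction related work", "ranking": ["model-b", "model-c"]},
]
key_env_vars = {
    "model-a": "MODEL_A_API_KEY",
    "model-b": "MODEL_B_API_KEY",
    "model-c": "MODEL_C_API_KEY",
}
chosen = select_model(
    "invoice with tax and total amount",
    benchmark,
    key_env_vars,
    env={"MODEL_B_API_KEY": "secret"},
)
```

Here the input is closest to the first benchmark entry, whose top-ranked model has no key configured, so the sketch falls through to the next model in that ranking.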

This PR also:

- Updates the benchmark to run with autoselect_llm=True
- Refactors utility functions, moving the conversion utils into a separate file


Copilot

Pull Request Overview

This PR adds an automatic LLM selection feature for the parsing system. When autoselect_llm=True is set in AUTO-routing mode, the system selects the best-performing LLM based on document similarity to benchmark data.

Introduces a new DocumentRankedLLMSelector that ranks models based on document similarity
Refactors conversion utilities into a separate module for better organization
Updates benchmark testing to support the new autoselect feature with comparative results

Reviewed Changes

Copilot reviewed 10 out of 15 changed files in this pull request and generated 2 comments.


| File | Description |
| --- | --- |
| lexoid/core/llm_selector.py | New module implementing document similarity-based LLM selection |
| lexoid/core/conversion_utils.py | Refactored conversion utilities moved from utils.py |
| lexoid/core/utils.py | Updated router function to support auto LLM selection and removed conversion functions |
| lexoid/core/parse_type/llm_parser.py | Updated imports and removed OpenAI parameter restrictions |
| lexoid/api.py | Modified parse function to handle auto-selected models |
| tests/benchmark.py | Added autoselect_llm parameter support and model name handling |
| tests/results.csv | Updated benchmark results including auto-selected model performance |
| tests/api_cost_mapping.json | Added cost mappings for GPT-5 models |
| docs/benchmark.rst | Updated documentation with new benchmark results |
| README.md | Updated benchmark table with latest performance data |



Add option (autoselect_llm) to automatically select the LLM to pars…

702ee1d

…e a page based on page content - Update benchmark with option - Refactor utility functions - create separate file for conversion utils

dilithjay requested a review from pramitchoudhary July 29, 2025 00:27

pramitchoudhary assigned dilithjay Jul 29, 2025

pramitchoudhary added the enhancement (New feature or request) label Jul 29, 2025

dilithjay added 3 commits August 8, 2025 19:05

Merge branch 'main' into dj/router-model

6eef769

Add support for GPT-5

e096ab0

Update benchmark with 2 additional docs and GPT-5 results

713c8af

pramitchoudhary requested a review from Copilot August 24, 2025 21:36

Copilot AI reviewed Aug 24, 2025

lexoid/core/parse_type/llm_parser.py (resolved)

lexoid/core/conversion_utils.py (outdated, resolved)

dilithjay and others added 4 commits August 25, 2025 14:05

Update lexoid/core/conversion_utils.py

7b8aafd

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update benchmark

106912f

- remove together ai llama models as they are no longer supported

Merge branch 'main' into dj/router-model

c207955

Only delete unsupported args for specific models

26cc1a8
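The idea behind the last commit, dropping arguments only for models that reject them, might look something like the sketch below. The function name, the prefix-based lookup, and the model and argument names are hypothetical and are not taken from the PR.

```python
def drop_unsupported_args(model, kwargs, unsupported_by_prefix):
    """Return a copy of kwargs without the args the given model rejects.

    unsupported_by_prefix: hypothetical mapping of model-name prefix to the
    set of argument names to remove for models matching that prefix.
    """
    cleaned = dict(kwargs)  # leave the caller's dict untouched
    for prefix, arg_names in unsupported_by_prefix.items():
        if model.startswith(prefix):
            for name in arg_names:
                cleaned.pop(name, None)
    return cleaned


# Illustrative values only; the real restrictions are not stated in this PR page.
restrictions = {"restricted-model": {"temperature"}}
request_args = {"temperature": 0.2, "max_tokens": 100}

restricted = drop_unsupported_args("restricted-model-mini", request_args, restrictions)
unrestricted = drop_unsupported_args("other-model", request_args, restrictions)
```

Models that match no prefix keep their arguments unchanged, which is the behaviour the commit title suggests.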

dilithjay merged commit 693181c into main Sep 2, 2025

dilithjay deleted the dj/router-model branch September 2, 2025 17:31
