Qwen 3 1.7B Offline tool calling Android #165
base: main
Signed-off-by: Varun Khare <varun.khare@nimbledgehq.ai>
add tokenizer-cpp add jinja template for qwen and dict support for tokenizer:from_json
Signed-off-by: Varun Khare <varun.khare@nimbleedgehq.ai>
Signed-off-by: Varun Khare <varunkhare1234@gmail.com>
f"{library_stubs_dir}/src_gen",
coreruntime_dir,
],
["cp", "-r", f"{library_stubs_dir}/src_template", f"{library_stubs_dir}/src_gen"],
Is this an accidental change? `cp -R` is the portable form, compared to `cp -r`.
I agree
   * Compares two data types and returns the one with higher precedence
   * for automatic type promotion in operations. The precedence order is:
-  * BOOLEAN (0) < INT32 (3) < INT64 (4) < FLOAT (5) < DOUBLE (6)
+  * BOOLEAN (0) < INT32 (3) < INT64 (4) < FLOAT16 (4.5) < FLOAT (5) < DOUBLE (6)
   *
   * @param dataType1 First data type to compare
   * @param dataType2 Second data type to compare
   * @return The data type with higher precedence
   */
  inline int get_max_dataType(int dataType1, int dataType2) {
    std::map<int, int> _typeScore = {
-       {DATATYPE::BOOLEAN, 0}, {DATATYPE::INT32, 3}, {DATATYPE::INT64, 4},
-       {DATATYPE::FLOAT, 5}, {DATATYPE::DOUBLE, 6},
+       {DATATYPE::BOOLEAN, 0}, {DATATYPE::INT32, 3}, {DATATYPE::INT64, 4},
+       {DATATYPE::FLOAT16, 45}, {DATATYPE::FLOAT, 5}, {DATATYPE::DOUBLE, 6},
    };
This doesn't look correct.
yup will update this
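The thread does not say what is wrong with the map. One plausible reading is that the score 45 ranks FLOAT16 above DOUBLE (6), which contradicts the 4.5 stated in the doc comment. A minimal sketch of one possible fix follows, keeping integer scores but scaling them all by 10. The enum values below are hypothetical stand-ins, since the project's real `DATATYPE` constants are not shown in this excerpt.

```cpp
#include <map>

// Hypothetical stand-in for the project's DATATYPE constants; the real
// numeric values are not visible in this PR excerpt.
enum DATATYPE : int { BOOLEAN = 1, INT32 = 2, INT64 = 3, FLOAT16 = 4, FLOAT = 5, DOUBLE = 6 };

// Scores are scaled by 10 so that FLOAT16 (4.5 in the doc comment) fits in
// an int while still sitting between INT64 and FLOAT.
inline int get_max_dataType(int dataType1, int dataType2) {
  static const std::map<int, int> typeScore = {
      {DATATYPE::BOOLEAN, 0},  {DATATYPE::INT32, 30}, {DATATYPE::INT64, 40},
      {DATATYPE::FLOAT16, 45}, {DATATYPE::FLOAT, 50}, {DATATYPE::DOUBLE, 60},
  };
  // Assumption: unknown types are not handled here; at() throws
  // std::out_of_range for a key that is missing from the map.
  return typeScore.at(dataType1) >= typeScore.at(dataType2) ? dataType1 : dataType2;
}
```

With these scores, promoting FLOAT16 against INT64 yields FLOAT16, and promoting FLOAT16 against FLOAT or DOUBLE yields the wider type, which matches the order in the comment.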
# This is the 1st commit message:
add support for dictionary indexing in onnx executor
Signed-off-by: Varun Khare <varun.khare@nimbleedgehq.ai>
# This is the commit message #2:
add dictionary input support to model.run() for kv_cache
Signed-off-by: Varun Khare <varun.khare@nimbleedgehq.ai>
# This is the commit message #3:
add fp16 support in delitepy
Signed-off-by: Varun Khare <varun.khare@nimbleedgehq.ai>
# This is the commit message #4:
Qwen with tool calling functional in delitePy
Signed-off-by: Varun Khare <varun.khare@nimbleedgehq.ai>
# This is the commit message #5:
Implemented enumerate and next in DelitePy (NimbleEdge#162)
* Implemented enumerate and next in DelitePy
Signed-off-by: Atul Jain <atul.jain@nimbleedgehq.ai>
* Cosmetics
Signed-off-by: Puneet Jindal <puneet.jindal@nimbleedgehq.ai>
---------
Signed-off-by: Puneet Jindal <puneet.jindal@nimbleedgehq.ai>
Co-authored-by: Atul Jain <atul.jain@nimbleedgehq.ai>
Co-authored-by: Puneet Jindal <puneet.jindal@nimbleedgehq.ai>
Signed-off-by: Puneet Jindal <puneet.jindal@nimbleedgehq.ai>
Signed-off-by: Puneet Jindal <puneet.jindal@nimbleedgehq.ai>
Signed-off-by: Varun Khare <varun.khare@nimbledgehq.ai>
modular qwen demo structure
Signed-off-by: Varun Khare <varun.khare@nimbleedgehq.ai>
wip handle attention cache
Signed-off-by: Varun Khare <varun.khare@nimbleedgehq.ai>
resume from last postion for multi-step run
Signed-off-by: Varun Khare <varun.khare@nimbleedgehq.ai>
Signed-off-by: Varun Khare <varun.khare@nimbledgehq.ai>
Signed-off-by: Varun Khare <varun.khare@nimbleedgehq.ai>
Signed-off-by: Varun Khare <varun.khare@nimbleedgehq.ai>
Here's a comprehensive PR description for all the changes:
Description
This PR adds support for the Qwen 3 1.7B model with tool calling, uses the native ONNX Runtime instead of onnxruntime_genai, and adds the export script for model enhancements.
Key Features Added
Cpp bindings
Delitepy Bindings
FP16 Support
Binary operations now support the FP16 data type, represented as uint16_t:
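The PR text does not show how the FP16 values held in `uint16_t` are evaluated. As a rough illustration only, the sketch below decodes IEEE 754 half-precision bits into a `float` and performs an addition in `float`. Both function names are hypothetical, and widening to `float` before the operation is an assumption, not something the PR confirms.

```cpp
#include <cstdint>
#include <cstring>

// Decode IEEE 754 binary16 bits (stored in a uint16_t) into a float.
// Hypothetical helper; not taken from the PR.
inline float half_bits_to_float(std::uint16_t h) {
  const std::uint32_t sign = static_cast<std::uint32_t>(h & 0x8000u) << 16;
  std::uint32_t exp = (h >> 10) & 0x1Fu;
  std::uint32_t mant = h & 0x3FFu;
  std::uint32_t bits = 0;
  if (exp == 0x1Fu) {
    // Infinity or NaN: all exponent bits set in the float as well.
    bits = sign | 0x7F800000u | (mant << 13);
  } else if (exp != 0) {
    // Normal number: rebias the exponent from 15 to 127.
    bits = sign | ((exp + 112u) << 23) | (mant << 13);
  } else if (mant == 0) {
    // Signed zero.
    bits = sign;
  } else {
    // Subnormal half: shift the mantissa until its implicit bit appears.
    exp = 113u;  // 127 - 15 + 1
    while ((mant & 0x400u) == 0) {
      mant <<= 1;
      --exp;
    }
    bits = sign | (exp << 23) | ((mant & 0x3FFu) << 13);
  }
  float out;
  std::memcpy(&out, &bits, sizeof out);  // bit copy without aliasing issues
  return out;
}

// Assumed approach: widen both operands to float, then add.
inline float add_half_as_float(std::uint16_t a, std::uint16_t b) {
  return half_bits_to_float(a) + half_bits_to_float(b);
}
```

Converting the result back to half precision needs rounding logic and is left out here to keep the sketch short.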
Kotlin Interface
Generation output is streamed back from Python and subscribed to in Kotlin through flows.
Qwen Demo Setup
The Qwen demo uses zip-based modules in delitePy.
Tool Calling Features
`<tool_call>` XML tags
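The description mentions `<tool_call>` XML tags without showing how they are handled. The sketch below is one minimal way to pull the payloads out of generated text. It assumes each call is wrapped in a `<tool_call>` ... `</tool_call>` pair and that the payload (typically JSON) is parsed elsewhere. The function name `extract_tool_calls` is hypothetical and does not come from the PR.

```cpp
#include <string>
#include <vector>

// Collect the raw text between each <tool_call> ... </tool_call> pair.
// Hypothetical helper; payload parsing (for example as JSON) is out of scope.
inline std::vector<std::string> extract_tool_calls(const std::string& text) {
  static const std::string open_tag = "<tool_call>";
  static const std::string close_tag = "</tool_call>";
  std::vector<std::string> calls;
  std::size_t pos = 0;
  while (true) {
    const std::size_t start = text.find(open_tag, pos);
    if (start == std::string::npos) break;
    const std::size_t body = start + open_tag.size();
    const std::size_t end = text.find(close_tag, body);
    if (end == std::string::npos) break;  // unterminated tag: ignore the tail
    calls.push_back(text.substr(body, end - body));
    pos = end + close_tag.size();
  }
  return calls;
}
```

An unterminated tag is skipped here rather than reported, which suits streamed output where the closing tag may not have arrived yet; a caller that needs strict validation would want to surface that case instead.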