A zero-shot NLP toolkit (powered by Instructor) #1485
rmitsch started this conversation in Show and tell
Hey all!
I'm working on https://github.com/mantisai/sieves, a tool that makes it easy to build a pipeline of NLP tasks using only zero-shot, generative, decoder-only models, i.e. with no model training.
My motivation is that most of my NLP projects can be done better and faster by breaking them down into a pipeline of tasks. Very often these tasks are the same (the classic NLP tasks + information extraction + question answering + summarization + ...), so the library comes with a bunch of those already implemented.
The idea is to have a library that jumpstarts a modern NLP project by providing a document- and pipeline-based NLP tool (similar to spaCy) that doesn't require any model training to get a quick prototype off the ground. It guarantees correctly structured outputs from generative models by leveraging the structured output functionality of libraries like Instructor, Outlines, DSPy, LangChain, etc. It also comes with some useful utilities for NLP pipelines, like OCR or exporting model predictions to a HF dataset for fine-tuning.
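To make the document-and-pipeline idea concrete, here is a minimal, library-free sketch. It is not the sieves API: `Doc`, `classify`, `run_pipeline` and `make_stub_model` are hypothetical names made up for illustration, and the "model" is a stub that returns canned strings instead of calling an LLM. It only shows the general pattern of running tasks over documents and rejecting model replies that do not match the expected structure.

```python
import json
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical types for illustration only; not part of sieves.
Model = Callable[[str], str]  # maps a prompt to the model's raw text reply


@dataclass
class Doc:
    text: str
    results: dict = field(default_factory=dict)


def classify(doc: Doc, model: Model, labels: list[str], max_retries: int = 2) -> Doc:
    """Ask the model for a JSON label; retry until the reply parses and is allowed."""
    prompt = (
        f"Classify the text into one of {labels}. "
        'Answer as JSON like {"label": "..."}.\n\n' + doc.text
    )
    for _ in range(max_retries + 1):
        raw = model(prompt)
        try:
            label = json.loads(raw)["label"]
        except (json.JSONDecodeError, KeyError, TypeError):
            continue  # malformed reply: ask again
        if label in labels:
            doc.results["classification"] = label
            return doc
    raise ValueError("model never produced a valid label")


def run_pipeline(docs: list[Doc], tasks: list[Callable[[Doc], Doc]]) -> list[Doc]:
    """Apply each task to every document, in order."""
    for task in tasks:
        docs = [task(doc) for doc in docs]
    return docs


def make_stub_model(replies: list[str]) -> Model:
    """Stand-in for a real LLM call: returns the canned replies one by one."""
    it = iter(replies)
    return lambda prompt: next(it)


# The first reply is malformed, so the task retries and accepts the second one.
model = make_stub_model(["not json", '{"label": "science"}'])
task = lambda doc: classify(doc, model, ["science", "politics"])
docs = run_pipeline([Doc("Black holes bend light.")], [task])
print(docs[0].results)  # {'classification': 'science'}
```

In a real setup the validation and retry step would be delegated to a structured output library rather than hand-written like this.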
I'd be excited if you checked it out, and I'd especially appreciate any feedback 🙂
If you're interested in what this looks like, here's a simple snippet to run zero-shot classification (you have to run `pip install sieves` first):