Add polyfactory framework #2
base: main
Conversation
Add a framework that generates mock responses using `polyfactory`.
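For readers unfamiliar with `polyfactory`: its core mechanism is a factory class that builds instances of a Pydantic model populated with random values satisfying the schema. A minimal sketch of that mechanism (the `Person` model and field names here are hypothetical, not taken from this repo):

```python
from polyfactory.factories.pydantic_factory import ModelFactory
from pydantic import BaseModel


# Hypothetical response model, just to show the mechanism; the real
# response models are defined elsewhere in this repo.
class Person(BaseModel):
    name: str
    age: int
    hobbies: list[str]


class PersonFactory(ModelFactory[Person]):
    __model__ = Person


# build() fills every field with random values that satisfy the schema.
print(PersonFactory.build())
```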
Reviewer's Guide by Sourcery
This pull request introduces a new framework, PolyfactoryFramework, which generates mock responses using the polyfactory library.
File-Level Changes
Tips
Hey @adrianeboyd - I've reviewed your changes and they look great!
Here's what I looked at during the review
- 🟡 General issues: 3 issues found
- 🟢 Security: all looks good
- 🟢 Testing: all looks good
- 🟢 Complexity: all looks good
- 🟢 Documentation: all looks good
Help me be more useful! Please click 👍 or 👎 on each comment to tell me if it was helpful.
(This Sourcery thing seems noisier and less capable than a linter.)
Hey, I ran your branch locally and have some comments: …
So for now, I'll keep this PR open and revisit it again once I have …
I kind of disagree, because I think it's more reasonable to have it available as a comparison for classification tasks that have a sensible random baseline. I think it would be much less interesting for a synthetic generation task (I guess unless the task is boring enough that you should be using faker instead). It doesn't refer to the input text because it's just generating a random list of labels from the provided schema. (In a few tests where I limited the number of possible labels to match the sampling setup and then sampled more data, it had something like 0.1-0.3% accuracy. I would also guess that a majority baseline might be better than a random baseline, but I didn't try that.)

And this was all a bit facetious: I didn't necessarily expect it to be merged, since adding any accuracy metrics to the table would make you immediately want to eliminate it. The point was just that you can easily generate the structure and a random baseline from the response model.
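To illustrate the random-baseline point: because the factory never sees the input text, its accuracy against gold labels converges to roughly 1/n for a single-label task with n equally likely labels. A hedged sketch with a made-up label set (not the benchmark's actual schema or data):

```python
import random
from typing import Literal

from polyfactory.factories.pydantic_factory import ModelFactory
from pydantic import BaseModel

LABELS = ("positive", "negative", "neutral")


class Prediction(BaseModel):
    label: Literal["positive", "negative", "neutral"]


class PredictionFactory(ModelFactory[Prediction]):
    __model__ = Prediction


# The generated labels are independent of the (nonexistent) input text,
# so accuracy against random gold labels hovers around 1 / len(LABELS).
gold = [random.choice(LABELS) for _ in range(1000)]
preds = [PredictionFactory.build().label for _ in gold]
accuracy = sum(p == g for p, g in zip(preds, gold)) / len(gold)
print(f"random-baseline accuracy: {accuracy:.2f}")
```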
Ah, got it. Good point! Yep, I'm definitely on the lookout for a suitable dataset to include an accuracy metric that will immediately flag a random label generator as inaccurate. I'm currently prioritizing getting the code up first so that datasets can be easily swapped in and out.
Force-pushed from edba4c1 to e58f3df.
Use an equivalent six-digit postal code field definition that is supported by polyfactory rather than a separate validator method.
Force-pushed from e58f3df to 8ebeae2.
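The diff behind the postal code change above isn't included in this excerpt, but the idea is that a declarative field constraint is something polyfactory can introspect and satisfy, whereas a custom validator method is opaque to it. A rough sketch of what such a field definition could look like, assuming Pydantic v2 (the actual model and constraint in the repo may differ):

```python
from polyfactory.factories.pydantic_factory import ModelFactory
from pydantic import BaseModel, Field


class Address(BaseModel):
    # Declarative constraint that polyfactory can read and generate values
    # for, instead of a separate @field_validator it cannot introspect.
    postal_code: str = Field(pattern=r"^[0-9]{6}$")


class AddressFactory(ModelFactory[Address]):
    __model__ = Address


print(AddressFactory.build().postal_code)  # e.g. "482915"
```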
Add a framework that generates mock responses using `polyfactory`. Related to #1.
Summary by Sourcery
This pull request adds a new framework, PolyfactoryFramework, which generates mock responses using the polyfactory library. Configuration for this framework has been added to config.yaml, and the framework is imported in frameworks/__init__.py.
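The PolyfactoryFramework class itself isn't shown in this excerpt. Based on the discussion above, its core is presumably just building a factory for the task's response model and ignoring the input entirely; a hypothetical sketch (the constructor and method names of the benchmark's framework interface are assumptions, not the repo's actual API):

```python
from polyfactory.factories.pydantic_factory import ModelFactory
from pydantic import BaseModel


class PolyfactoryFramework:
    """Hypothetical sketch: return schema-valid random responses while
    ignoring the prompt and input text entirely."""

    def __init__(self, response_model: type[BaseModel]) -> None:
        # create_factory() builds a ModelFactory subclass for an arbitrary
        # pydantic model at runtime.
        self.factory = ModelFactory.create_factory(model=response_model)

    def run(self, prompt: str) -> BaseModel:
        # The prompt is accepted for interface compatibility but unused.
        return self.factory.build()
```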