Measure jailer startup performance #5282
Conversation
Now the jailer will not complain if the executable does not contain `firecracker` in its name. This restriction was unnecessary and its removal is not a breaking change. Signed-off-by: Egor Lazarchuk <yegorlz@amazon.co.uk>
Codecov Report

@@ Coverage Diff @@
## main #5282 +/- ##
==========================================
+ Coverage   82.86%   82.92%   +0.05%
==========================================
  Files         250      250
  Lines       26902    26903      +1
==========================================
+ Hits        22292    22308     +16
+ Misses       4610     4595     -15
The executable passed to the jailer must have exec permissions set. Update the jailer executable check unit tests with `match!`. Signed-off-by: Egor Lazarchuk <yegorlz@amazon.co.uk>
Add note about new jailer executable requirements. Signed-off-by: Egor Lazarchuk <yegorlz@amazon.co.uk>
Remove explicit panic on error. Let the error be returned from the main. Signed-off-by: Egor Lazarchuk <yegorlz@amazon.co.uk>
This binary just outputs the start and end time of the jailer startup. Signed-off-by: Egor Lazarchuk <yegorlz@amazon.co.uk>
let metadata = std::fs::metadata(&exec_file_path).unwrap();

// Check that it is executable
if metadata.permissions().mode() & 0o111 == 0 {
isn't this requiring that it's executable by all users, while in reality it just needs to be executable by the jailer? Are there any issues with just letting the execve fail with a permission error?
This checks that there is at least some executable bit set.
Personally, I added this because I thought it was already a requirement, just not explicitly checked, and failing with a slightly nicer message is better. I can drop it if people find it unnecessary.
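For illustration, the "at least some executable bit set" check from the Rust diff above can be mirrored in Python with `os.stat`; the demo file path and helper name here are hypothetical, not part of the PR:

```python
import os

def has_any_exec_bit(path: str) -> bool:
    """True if any of the user/group/other execute bits are set."""
    # 0o111 is S_IXUSR | S_IXGRP | S_IXOTH, the same mask as the
    # `mode & 0o111 == 0` check in the jailer diff above.
    return os.stat(path).st_mode & 0o111 != 0

# Demo on a scratch file (path is arbitrary, not part of the PR).
demo = "/tmp/demo_exec_check"
with open(demo, "w") as f:
    f.write("#!/bin/sh\n")
os.chmod(demo, 0o644)   # rw-r--r--: no execute bits set
not_exec = has_any_exec_bit(demo)
os.chmod(demo, 0o744)   # rwxr--r--: owner-only execute is enough to pass
owner_exec = has_any_exec_bit(demo)
os.remove(demo)
```

Note that owner-only execute (0o744) passes this check, which relates to the reviewer's point: the mask accepts any execute bit, not just world-executable files.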
@pytest.mark.parametrize("jailers", [1, 100, 200, 300, 400])
@pytest.mark.parametrize("mounts", [0, 100, 300, 500])
how long does it take to run? There are a bunch of combinations here, and I doubt we'll be plotting all of them in our graphs. Maybe it's best to identify a handful of combinations we're interested in (1/many jailers x no/many mounts).
running the test gives me:
4.40s call integration_tests/performance/test_jailer.py::test_jailer_startup[500-300]
4.20s call integration_tests/performance/test_jailer.py::test_jailer_startup[500-200]
4.00s call integration_tests/performance/test_jailer.py::test_jailer_startup[500-100]
3.84s call integration_tests/performance/test_jailer.py::test_jailer_startup[500-1]
2.49s call integration_tests/performance/test_jailer.py::test_jailer_startup[300-400]
2.33s call integration_tests/performance/test_jailer.py::test_jailer_startup[300-300]
2.17s call integration_tests/performance/test_jailer.py::test_jailer_startup[300-200]
2.01s call integration_tests/performance/test_jailer.py::test_jailer_startup[300-100]
1.86s call integration_tests/performance/test_jailer.py::test_jailer_startup[300-1]
So in total the test takes ~30 seconds.
# Create bind mount points. The exact location of them
# does not matter, they just need to exist.
mounts_paths = "/tmp/mounts"
doesn't really matter as perf tests are run sequentially, but you could use a tempdir from mktemp
(iirc we have a util for that already that also takes care of cleaning up).
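The in-tree cleanup util the reviewer mentions isn't named in this thread, so as a sketch of the suggestion, the stdlib `tempfile.TemporaryDirectory` gives a unique, auto-cleaned location instead of the hard-coded `/tmp/mounts`:

```python
import os
import tempfile

# `mounts` mirrors the test parameter; 3 is just for this demo.
mounts = 3
with tempfile.TemporaryDirectory(prefix="mounts_") as mounts_path:
    # Create one mount-point directory per bind mount, as in the test.
    for i in range(mounts):
        os.makedirs(os.path.join(mounts_path, str(i)))
    created = sorted(os.listdir(mounts_path))
# On exiting the `with` block the directory and its contents are removed.
removed_path = mounts_path
```

This also avoids collisions if the test ever runs in parallel, since each run gets its own prefix-randomized directory.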
for i, p in enumerate(processes):
    r = p.communicate()[0]
    e, s = r.split()
    deltas.append(int(e) - int(s))
can't we just have the binary spit out the delta?
yeah, I can do this, but I don't see the point. I mean, does it matter who subtracts the 2 numbers?
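A minimal sketch of the parsing under discussion, assuming the helper binary prints two whitespace-separated integer timestamps on stdout with the end time first (inferred from the unpacking order in the diff; the output format itself is an assumption):

```python
def startup_delta(output: bytes) -> int:
    """Parse one jailer helper's stdout into a startup delta.

    Assumes `output` is b"<end> <start>", matching the `e, s = r.split()`
    unpacking in the diff above.
    """
    e, s = output.split()
    return int(e) - int(s)

# Example: end=2000300, start=2000100 gives a delta of 200.
delta = startup_delta(b"2000300 2000100")
```

Whether the binary or the test does the subtraction is equivalent arithmetically; keeping both timestamps in the output does preserve the raw data for later debugging.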
)

processes = []
for i in range(jailers):
we should probably do a few rounds for when we have a single jailer. Maybe do 100 rounds? or maybe do 500 sequential in one test and 500 at the same time in another.
added 100 iterations for the 1-jailer case
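The spawn-then-collect pattern being discussed (start all processes first so they run concurrently, then gather output) can be sketched as follows; `echo` stands in for the real jailer command line, which is not shown in this thread:

```python
import subprocess

def run_round(jailers: int):
    """Launch `jailers` processes at once, then collect their stdout."""
    # Start every process before reading any output, so they all
    # run concurrently rather than sequentially.
    processes = [
        subprocess.Popen(["echo", str(i)], stdout=subprocess.PIPE)
        for i in range(jailers)
    ]
    return [p.communicate()[0].strip() for p in processes]

outputs = run_round(4)
```

For the single-jailer case, the "100 iterations" variant is the same loop run 100 times with `jailers=1`, collecting one delta per round.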
    }
)
metrics.put_metric(
    "p50",
I think we should emit the datapoints directly as we do in all other tests and let the consumer compute the statistics.
makes sense, fixed
do you have a link where I could see the metrics?
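To illustrate the reviewer's point that the consumer can derive statistics from raw datapoints, here is a stdlib sketch (the `deltas` values are hypothetical, and `statistics.quantiles` is used in place of whatever the actual metrics consumer runs):

```python
import statistics

# Hypothetical per-jailer startup deltas (e.g. in microseconds).
deltas = [120, 100, 180, 110, 150, 130, 170, 140, 160, 190]

# If the test emits each delta as its own datapoint, the consumer can
# compute any percentile later; n=100 yields 99 cut points, so index 49
# is p50 and index 89 is p90.
cuts = statistics.quantiles(deltas, n=100)
p50 = cuts[49]
p90 = cuts[89]
```

Emitting raw datapoints keeps the full distribution available, whereas precomputed p50/p90 cannot be re-aggregated across runs.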
The test measures p50 and p90 of jailer startup time. It is parametrized by the number of jailers starting up and the number of bind mount points present in the system. Signed-off-by: Egor Lazarchuk <yegorlz@amazon.co.uk>
Changes

- Update the jailer requirements on the executable name and permissions. It no longer needs to contain `firecracker` in its name, but must be marked as executable.
- Add a jailer startup performance test. The test is parametrized on the number of jailers started and the number of bind mounts present in the system.
Reason
We want to know the overhead of a jailer in isolation. Since there are usually multiple jailed VMs on a system, the test is parametrized by the number of jailers started.
Measurements with bind mounts are included because bind mounts slow down jailer startup: the more bind mounts are present, the more contention there is in the kernel. At the same time, bind mounts are frequently created per VM in production environments.
License Acceptance

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check CONTRIBUTING.md.