-
Notifications
You must be signed in to change notification settings - Fork 2.3k
Use thread_local for loader_life_support to improve performance #5830
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 1 commit
a912648
728a8d9
efc678f
f40c22b
6bdc29f
fb13fdd
508f408
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -45,17 +45,48 @@ class loader_life_support { | |
| loader_life_support *parent = nullptr; | ||
| std::unordered_set<PyObject *> keep_alive; | ||
|
|
||
| #if defined(PYBIND11_CAN_USE_THREAD_LOCAL) | ||
| struct fake_thread_specific_storage { | ||
| loader_life_support *instance = nullptr; | ||
| loader_life_support *get() const { return instance; } | ||
|
|
||
| fake_thread_specific_storage &operator=(loader_life_support *pval) { | ||
| instance = pval; | ||
| return *this; | ||
| } | ||
| }; | ||
| using loader_storage = fake_thread_specific_storage; | ||
| #else | ||
| using loader_storage = thread_specific_storage<loader_life_support>; | ||
| #endif | ||
|
|
||
| static loader_storage &get_stack_top() { | ||
| #if defined(PYBIND11_CAN_USE_THREAD_LOCAL) | ||
| // Without this branch, loader_life_support destruction is a | ||
| // significant cost per function call. | ||
| // | ||
| // Observation: loader_life_support needs to be thread-local, but | ||
| // we don't need to go to extra effort to keep it per-interpreter | ||
| // (i.e., by putting it in internals) since function calls are | ||
| // already isolated to a single interpreter. | ||
|
swolchok marked this conversation as resolved.
Outdated
|
||
| static thread_local fake_thread_specific_storage storage; | ||
| return storage; | ||
| #else | ||
| return get_internals().loader_life_support_tls; | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. pybind11's So, why not just use
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I don't think this is correct, but I'll get clearer data about it and come back.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. here's the code generated for
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fair enough.... Then, I think then someone should fix the
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Wouldn't that break ABI? Is that OK?
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Two uses of thread_specific_storage are members of I suppose one could fix the other usages of thread_specific_storage and leave these two until its time for an ABI break.... I'm willing to refactor that in a separate PR using whatever method is decided here for conditionally compiling these thread_locals out.
It might make sense to add some mechanism to accumulate ABI breaks within a pp flag like that, rather than forget about them. Maybe out of scope for this PR.
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
What we had was pretty basic, easy to see in one view here (when support for old internals versions was removed): https://github.com/pybind/pybind11/pull/5530/files Not pretty, but it served the purpose; for ~2 years actually. We'll have to be careful about test coverage, in ci.yml. I think it would fit in this PR. Another route is to get #5800 reviewed and merged. It's an XXL review job though, and we have to determine who's taking on maintaining that code long term. @oremanj for viz (btw @oremanj: did you see my direct email from Aug 21?)
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
This was misleading; adding -fPIC (as you would generally do to build a shared library such Python extension IIUC, though pybind11_benchmark doesn't seem to do it) causes this code to use
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @rwgk I did not see such an email, nor have I been able to find it searching just now - can you resend it and/or tell me what subject to search for?
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @oremanj I sent another email, from my corp address to your corp address, subject "pybind11 hello". |
||
| #endif | ||
| } | ||
|
|
||
| public: | ||
| /// A new patient frame is created when a function is entered | ||
| loader_life_support() { | ||
| auto &stack_top = get_internals().loader_life_support_tls; | ||
| auto &stack_top = get_stack_top(); | ||
| parent = stack_top.get(); | ||
| stack_top = this; | ||
| } | ||
|
|
||
| /// ... and destroyed after it returns | ||
| ~loader_life_support() { | ||
| auto &stack_top = get_internals().loader_life_support_tls; | ||
| auto &stack_top = get_stack_top(); | ||
| if (stack_top.get() != this) { | ||
| pybind11_fail("loader_life_support: internal error"); | ||
| } | ||
|
|
@@ -68,7 +99,7 @@ class loader_life_support { | |
| /// This can only be used inside a pybind11-bound function, either by `argument_loader` | ||
| /// at argument preparation time or by `py::cast()` at execution time. | ||
| PYBIND11_NOINLINE static void add_patient(handle h) { | ||
| loader_life_support *frame = get_internals().loader_life_support_tls.get(); | ||
| loader_life_support *frame = get_stack_top().get(); | ||
| if (!frame) { | ||
| // NOTE: It would be nice to include the stack frames here, as this indicates | ||
| // use of pybind11::cast<> outside the normal call framework, finding such | ||
|
|
||
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought it was one of the iOS platforms. But that should've failed one of the PR Checks (since this is unconditionally 1 right now) and didn't. Meaning all of the Checks pybind11 currently has support
thread_localanyway.... can that be right?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we might have fixed the issue with iOS, you have to set some minimum version to be able to use it and it wasn't letting us set the minimum version. But now I think it's working.