Respect kernel.arguments in non-executable targets' lowered code #650
kaushikcfd wants to merge 3 commits into main
inducer left a comment
Thanks! Some thoughts below, mostly about naming, philosophy, and documentation. Good to go once those are addressed.
loopy/target/c/__init__.py
Outdated
skai = get_subkernel_arg_info(kernel, subkernel_name)
passed_names = skai.passed_names
passed_names = (skai.passed_names
        if self.target.is_executable
I feel that the real issue is a bit different from what's being handled here:
For GPU-ish targets, a kernel is a collection of "subkernel" enqueues. The interface between the "entrypoint/invoker" and the "subkernels" is internal. Meanwhile, for the C target as used by Firedrake, this distinction is not fully realized at the moment: Users will typically directly call the subkernels, and the "entrypoint/invoker" level just doesn't exist. That makes sense, because there's (always?) only one subkernel, and the invoker, if it existed, would just pass through arguments to the subkernel and call it a day. The flag you're introducing here effectively codifies the ability to skip the "entrypoint/invoker" level.
Could you also add some semblance of this discussion to the docs of single_subkernel_is_entrypoint?
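To make the distinction concrete, here is a toy sketch of the idea, not loopy's implementation. `ToyKernel` and `subkernel_signature` are hypothetical names invented for illustration; only the flag name `single_subkernel_is_entrypoint` comes from this thread. It assumes that, with the flag set, the lone subkernel takes the user-declared `kernel.arguments` as its signature, and that more than one subkernel is an error.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyKernel:
    """Hypothetical stand-in for a kernel; not a loopy class."""
    arguments: tuple   # user-declared argument names, in declared order
    subkernels: dict   # subkernel name -> set of names that subkernel uses


def subkernel_signature(kernel, subkernel_name, single_subkernel_is_entrypoint):
    """Return the argument names a subkernel's generated function would take."""
    if not single_subkernel_is_entrypoint:
        # GPU-style: the invoker/subkernel interface is internal, so the
        # code generator may pass only what the subkernel actually uses.
        return tuple(sorted(kernel.subkernels[subkernel_name]))

    # C-style with no invoker level: the subkernel is what users call,
    # so its signature must match the declared kernel arguments.
    if len(kernel.subkernels) != 1:
        raise ValueError(
            "single_subkernel_is_entrypoint requires exactly one subkernel")
    return tuple(kernel.arguments)


knl = ToyKernel(
    arguments=("n", "out", "a", "unused"),
    subkernels={"knl": {"a", "out", "n"}},
)
internal_sig = subkernel_signature(knl, "knl", False)
entrypoint_sig = subkernel_signature(knl, "knl", True)
```

In this toy model the internal interface drops `unused` and reorders names, which is harmless when only a generated invoker calls the subkernel but breaks callers that invoke the subkernel directly.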
Added some docs along these lines in 9bfef04.
I addressed the code review in https://github.com/inducer/loopy/pull/701/files. Could you please re-review, @inducer?
…_entrypoint, add docs and error when there is more than one subkernel. Update how a subkernel is detected.
Close #648