Add env DeviceTopK without temp_storage args#8982
Conversation
|
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughSummary by CodeRabbit
important: WalkthroughAdded environment-dispatched DeviceTopK overloads (Max/Min Keys and Pairs, decomposer and non-decomposer variants) that allocate/manage temporary storage via the execution environment; added Catch2 tests validating these env-based overloads with built-in and custom types. ChangesEnvironment-Based TopK API
Assessment against linked issues
Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
cub/test/catch2_test_device_topk_env_api.cu (1)
4-4: 💤 Low valuesuggestion: The include
insert_nested_NVTX_range_guard.his not used in this file. Consider removing it.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 0a4f69d0-88d3-41d1-bc72-f0a079b4fc3e
📒 Files selected for processing (3)
cub/cub/device/device_topk.cuhcub/test/catch2_test_device_topk_api.cucub/test/catch2_test_device_topk_env_api.cu
There was a problem hiding this comment.
Actionable comments posted: 1
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 1c53e6de-7677-45e1-9484-0d81da4be3ae
📒 Files selected for processing (2)
cub/cub/device/device_topk.cuhcub/test/catch2_test_device_topk_env.cu
🚧 Files skipped from review as they are similar to previous changes (1)
- cub/cub/device/device_topk.cuh
This comment has been minimized.
This comment has been minimized.
🥳 CI Workflow Results🟩 Finished in 3h 05m: Pass: 100%/283 | Total: 3d 09h | Max: 47m 02s | Hits: 90%/214854See results here. |
| [[nodiscard]] CUB_RUNTIME_FUNCTION static // | ||
| ::cuda::std::enable_if_t<detail::radix::is_valid_decomposer<detail::it_value_t<KeyInputIteratorT>, DecomposerT>, | ||
| cudaError_t> |
There was a problem hiding this comment.
Suggestion: Please put requirements (enable_if) into the template head instead of using the return type.
| [[nodiscard]] CUB_RUNTIME_FUNCTION static // | ||
| ::cuda::std::enable_if_t<detail::radix::is_valid_decomposer<detail::it_value_t<KeyInputIteratorT>, DecomposerT>, | ||
| cudaError_t> |
| thrust::host_vector<T> h_in(num_items); | ||
| for (int i = 0; i < num_items; ++i) | ||
| { | ||
| h_in[i] = static_cast<T>((i * 1664525 + 1013904223) % 251); | ||
| } | ||
| thrust::device_vector<T> d_in = h_in; |
There was a problem hiding this comment.
Important: in ordinary unit tests, please use c2h::host_vector etc.
| using topk_element_types = c2h::type_list<int8_t, int16_t, int32_t, uint32_t, int64_t, float, double>; | ||
|
|
||
| C2H_TEST("DeviceTopK::MaxKeys env-alloc returns correct top K", "[topk][env]", topk_element_types) |
There was a problem hiding this comment.
Q: Why do we need to cover different element types when just testing whether the env-overload works correctly? We should not test the topk implementation itself, those tests already exist elsewhere. The current state adds a lot of compile-time for little gain IMO.
| auto error = cub::DeviceTopK::MaxKeys(d_in.begin(), d_out.begin(), static_cast<int>(d_in.size()), k, env); | ||
| if (error != cudaSuccess) | ||
| { | ||
| std::cerr << "cub::DeviceTopK::MaxKeys failed with status: " << error << '\n'; | ||
| } | ||
| // example-end topk-max-keys-env |
There was a problem hiding this comment.
Important: we should list at least one possible expected outcome in the documentation. Like:
| auto error = cub::DeviceTopK::MaxKeys(d_in.begin(), d_out.begin(), static_cast<int>(d_in.size()), k, env); | |
| if (error != cudaSuccess) | |
| { | |
| std::cerr << "cub::DeviceTopK::MaxKeys failed with status: " << error << '\n'; | |
| } | |
| // example-end topk-max-keys-env | |
| auto error = cub::DeviceTopK::MaxKeys(d_in.begin(), d_out.begin(), static_cast<int>(d_in.size()), k, env); | |
| if (error != cudaSuccess) | |
| { | |
| std::cerr << "cub::DeviceTopK::MaxKeys failed with status: " << error << '\n'; | |
| } | |
| thrust::device_vector<int> expected{9, 8, 7}; // possibly in different order | |
| // example-end topk-max-keys-env |
fixes #8969