Hardcode DeviceTransform offset type to I64 or U64 #9040

Draft
bernhardmgruber wants to merge 2 commits into NVIDIA:main from bernhardmgruber:transform64

Contributor

bernhardmgruber commented May 15, 2026

Fixes: #8805

Contributor

copy-pr-bot Bot commented May 15, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.


bernhardmgruber changed the title from "Hardcode DeviceTransform offset type to I64" to "Hardcode DeviceTransform offset type to I64 or U64" on May 15, 2026
cccl-authenticator-app (Bot) moved this from Todo to In Progress in CCCL on May 15, 2026
Contributor

miscco commented May 17, 2026

Should this go into 3.4?

Comment on lines +559 to +561
// static_assert(
// ::cuda::std::is_same_v<Offset, ::cuda::std::int32_t> || ::cuda::std::is_same_v<Offset, ::cuda::std::int64_t>,
// "cub::DeviceTransform is only tested and tuned for 32-bit or 64-bit signed offset types");
Contributor


Shouldn't this still be checking that we only use int64?

Contributor Author


It will eventually just go away, since I want to replace the template parameter Offset by just the hardcoded 64-bit type that wins the race. I still need data for Blackwell and probably Ampere.
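
For illustration, the two stages discussed in this thread could be sketched as follows. This is a host-only sketch under stated assumptions, not CUB's actual code: it uses `std::` instead of `::cuda::std::` so that it compiles without CUDA, and `transform_offset_t`, `is_64bit_offset_v`, `transform_with_offset`, and `transform_hardcoded` are hypothetical names. The choice of `std::int64_t` for the alias is a placeholder, since the signed versus unsigned decision is still waiting on benchmark data.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical alias for the single 64-bit offset type that would replace
// the Offset template parameter. int64_t is only a placeholder here; the
// thread leaves the signed vs. unsigned choice open.
using transform_offset_t = std::int64_t;

// Interim check while Offset is still a template parameter:
// accept only the two 64-bit candidates.
template <typename Offset>
inline constexpr bool is_64bit_offset_v =
  std::is_same_v<Offset, std::int64_t> || std::is_same_v<Offset, std::uint64_t>;

// Stage 1 (assumed shape): Offset is still a template parameter, but a
// static_assert rejects anything other than a 64-bit offset type.
template <typename Offset, typename T, typename F>
void transform_with_offset(const T* in, T* out, Offset num_items, F op)
{
  static_assert(is_64bit_offset_v<Offset>,
                "this sketch only accepts 64-bit signed or unsigned offset types");
  for (Offset i = 0; i < num_items; ++i)
  {
    out[i] = op(in[i]);
  }
}

// Stage 2 (assumed shape): the template parameter is gone from the public
// entry point, and the item count is converted to the hardcoded type once.
template <typename T, typename F>
void transform_hardcoded(const T* in, T* out, std::size_t num_items, F op)
{
  transform_with_offset(in, out, static_cast<transform_offset_t>(num_items), op);
}
```

Once the alias is the only offset type in use, the `static_assert` becomes redundant, which matches the remark that the check would eventually go away.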

github-project-automation (Bot) moved this from In Progress to In Review in CCCL on May 17, 2026
Contributor Author

bernhardmgruber commented

> Should this go into 3.4?

It's fine to ship later.

Labels: None yet
Project status: In Review
Linked issue (may be closed by merging this pull request): Consider hardcoding a 64-bit offset type for DeviceTransform
Participants: 2