RFC: accscalar_t for float on CPU #20053

@t-vi

Description

Issue

Currently accscalar_t for float is double on the CPU:

https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/AccumulateType.h#L34

I suggest we have a short discussion on whether it would be better to switch to float.

Motivation

This is prompted by @wanchaol asking about an UndefinedBehaviorSanitizer float-cast-overflow error, but I think there are three possible reasons to consider changing this:

  • consistency with the GPU,
  • support for platforms on which float is faster (e.g. arm32),
  • getting rid of UB.

I seem to recall @apaszke preferring the current behaviour about a year ago (in the context of #6855, which, if I remember correctly, always dispatched to a double auxiliary function on CPU).

Pitch

Switch to float generally.

Alternatives

Switch to float only on specific platforms (e.g. arm32), or somehow suppress the UB warning.

Additional context

We (e.g. @ljk53 and I) might be interested in changing this for Android specifically if we don't change it generally.

Metadata

Labels

feature: A request for a proper, new feature.
module: android: Related to Android support.
module: cpu: CPU-specific problem (e.g., perf, algorithm).
triaged: This issue has been looked at by a team member, and triaged and prioritized into an appropriate module.
