
Conversation

@shehzan10 (Member) commented Apr 26, 2016

The CPU and OpenCL backends use a single source file that compiles into separate object files. CUDA requires generating files with CMake from a template file, which are then added as sources.

@shehzan10 shehzan10 added the CUDA label Apr 26, 2016
@shehzan10 shehzan10 added this to the 3.4.0 milestone Apr 26, 2016
@umar456 (Member) commented Apr 26, 2016

There HAS to be a better way to do this.

@shehzan10 (Member, Author)

I'm open to ANY other way you may have for this. I don't like it either, but right now this is the only way it's going to compile well on all architectures.

@umar456 (Member) commented Apr 26, 2016

You can make multiple .o targets from the same source file and pass a different -D flag to each one. It will require some changes to the CMake files, but it will be a hell of a lot better than adding 20 files.
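The suggestion above could be sketched in CMake roughly as follows. This is a hypothetical sketch, not ArrayFire's actual build scripts; the file path, target names, and the SBK_KEY_TYPE macro are assumptions. The idea is one object target per key type, all compiled from the same source, each with its own preprocessor definition:

```cmake
# Hypothetical sketch of the per-type object-target idea, not ArrayFire's
# actual CMake. Each target compiles the same source file with a different
# -D definition selecting the key type to instantiate.
set(SBK_TYPES float double int uint intl uintl short ushort char uchar)

foreach(SBK_TYPE ${SBK_TYPES})
    add_library(sort_by_key_${SBK_TYPE} OBJECT kernel/sort_by_key_impl.cu)
    target_compile_definitions(sort_by_key_${SBK_TYPE}
                               PRIVATE SBK_KEY_TYPE=${SBK_TYPE})
endforeach()
```

The resulting objects (via `$<TARGET_OBJECTS:...>` generator expressions) would then be added to the backend library's sources, so NVCC only ever sees one key type's worth of instantiations per translation unit.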

@shehzan10 (Member, Author)

Ok, that might work. I'll look into it.

@shehzan10 shehzan10 changed the title Split CUDA sort_by_key instantiations into 2 files for each type Improves sort_by_key instantiations Apr 26, 2016
@shehzan10 (Member, Author)

build arrayfire tegrak1 ci
build arrayfire tegrax1 ci

@shehzan10 shehzan10 added build and removed CUDA labels Apr 27, 2016

#include <kernel/sort_by_key_impl.hpp>

// SBK_TYPES:float double int uint intl uintl short ushort char uchar
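The excerpt above is from one of the generated per-type files. A minimal C++ sketch of the pattern (all names here are hypothetical stand-ins, not ArrayFire's actual headers or kernels): the implementation header defines a single template, and each generated translation unit pins one key type via a macro, so the compiler only processes a slice of the full key-type/value-type matrix at a time.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// --- stand-in for kernel/sort_by_key_impl.hpp (hypothetical names) ---
// One template definition shared by every generated translation unit.
template<typename Tk, typename Tv>
void sortByKey(std::vector<Tk>& keys, std::vector<Tv>& vals) {
    // Zip keys and values, sort by key, then unzip -- a CPU sketch of
    // what the real backend does with thrust::sort_by_key.
    std::vector<std::pair<Tk, Tv>> zipped(keys.size());
    for (size_t i = 0; i < keys.size(); ++i) zipped[i] = {keys[i], vals[i]};
    std::sort(zipped.begin(), zipped.end(),
              [](const std::pair<Tk, Tv>& a, const std::pair<Tk, Tv>& b) {
                  return a.first < b.first;
              });
    for (size_t i = 0; i < zipped.size(); ++i) {
        keys[i] = zipped[i].first;
        vals[i] = zipped[i].second;
    }
}

// --- stand-in for one generated file: the key type arrives via -D ---
#ifndef SBK_KEY_TYPE
#define SBK_KEY_TYPE float  // default so this sketch compiles standalone
#endif

// Explicitly instantiate this key type against each value type
// (abbreviated to two value types here; the real matrix is all-to-all).
template void sortByKey<SBK_KEY_TYPE, int>(std::vector<SBK_KEY_TYPE>&,
                                           std::vector<int>&);
template void sortByKey<SBK_KEY_TYPE, float>(std::vector<SBK_KEY_TYPE>&,
                                             std::vector<float>&);
```

Because each object file covers one key type, no single NVCC invocation has to instantiate the entire all-to-all matrix at once.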
A Member commented on this diff:

Very nice

@pavanky pavanky merged commit 4041410 into arrayfire:devel Apr 27, 2016
@FilipeMaia (Contributor)

I don't know if there's anything that can be done, but it became extremely slow to compile arrayfire after this merge.

@shehzan10 (Member, Author) commented Apr 30, 2016

Well, actually it became slow after #1373, because it's like adding a bunch of new kernels: sort_by_key requires all-to-all mapping between types. With #1373, the number of instantiations per file was so large that NVCC was crashing even on the best machines. This PR fixed the crashing part.

I have been thinking about this, and it might make a difference if we remove the dim template from the batched function. The dim template is not actually used in any of the actual sort calls. I'm looking into the time saved; it's about a minute and a half of reduction (mostly on the CUDA backend).
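The dim-template point can be illustrated with a toy sketch (hypothetical names and signatures, not ArrayFire's real API). When dim is a template parameter, every supported dim multiplies the instantiation count for each key type; when dim is a runtime argument, one instantiation per key type suffices, since dim only selects how batches are traversed, not what code is generated:

```cpp
#include <algorithm>
#include <vector>

// Hypothetical: dim as a template parameter. Supporting dims 0..3 means
// four instantiations of this function per key type.
template<typename Tk, int Dim>
void sort0Templated(std::vector<Tk>& keys) {
    // The sort itself never uses Dim, so these instantiations are
    // near-duplicates of each other.
    std::sort(keys.begin(), keys.end());
}

// Hypothetical: dim as a runtime argument. One instantiation per key
// type covers every dim.
template<typename Tk>
void sort0(std::vector<Tk>& keys, int /*dim*/) {
    std::sort(keys.begin(), keys.end());
}
```

If the real batched wrapper likewise never uses dim inside the sort, dropping the template parameter would cut the instantiation count by the number of supported dims, which is consistent with the compile-time reduction described above.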

@pavanky (Member) commented Apr 30, 2016

@FilipeMaia Can you verify if it is after this PR ? We had other PRs with similar names.

@FilipeMaia (Contributor)

@shehzan10 is probably correct. I haven't tried just after #1373 but it's the new sort kernels that cause the slowness, so the reason must be #1373.
I have no solution, just pointing out that the slowness is at a level where it becomes a practical concern.

@pavanky (Member) commented Apr 30, 2016

@FilipeMaia that is a problem we have to deal with because of using a template-based library like Thrust... :-/ Too many instantiations.

Need to figure out if there is any easier way to deal with this.


4 participants