Refactor data structures to reduce memory usage by ufownl · Pull Request #142 · google/gemma.cpp

ufownl · 2024-04-10T11:56:24Z

This PR moves the Gemma attention weights and the Griffin weights into a union so that they share memory, which reduces the additional overhead when the Griffin model is not used.
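A minimal sketch of the idea, not the actual gemma.cpp code: the dimensions, the struct names, and every field except the Griffin prefix discussed in the review are assumptions made for illustration, and `std::array` stands in for the project's `ArrayT`. It compares a layer that stores both weight sets side by side with one that overlays them in a union.

```cpp
#include <array>
#include <cstddef>

// Hypothetical dimensions, chosen only for illustration.
constexpr std::size_t kModelDim = 8;
constexpr std::size_t kGriffinDim = 8;

// Weights needed only by attention layers (field names are illustrative).
struct AttentionWeights {
  std::array<float, kModelDim * kModelDim> out_w;
  std::array<float, 3 * kModelDim * kModelDim> qkv_w;
};

// Weights needed only by Griffin (recurrent) layers (illustrative).
struct GriffinWeights {
  std::array<float, kGriffinDim * kGriffinDim> linear_x_w;
  std::array<float, kGriffinDim * kGriffinDim> linear_y_w;
};

// Before: every layer pays for both sets, even though it uses only one.
struct SeparateLayer {
  AttentionWeights attention;
  GriffinWeights griffin;
};

// After: the two sets overlap in memory. Only one member may be in use
// for a given layer; the layer type decides which one.
struct SharedLayer {
  union {
    AttentionWeights attention;
    GriffinWeights griffin;
  };
};

// The shared layout is smaller than storing both sets side by side.
static_assert(sizeof(SharedLayer) < sizeof(SeparateLayer),
              "union should reduce the per-layer footprint");
```

The trade-off is that the code must know from the layer type which union member is valid; reading the other member is not meaningful.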

jan-wassenberg

Nice, thanks for resolving this TODO :) Some suggestions:

jan-wassenberg · 2024-04-10T12:18:33Z

gemma/gemma.cc

  static constexpr size_t kGatingEinsumWSize = 2 * kFFHiddenDim * kModelDim;
+  static constexpr size_t kConv1dWidth = TConfig::kConv1dWidth;
+  static constexpr bool kFFBiases = TConfig::kFFBiases;
+  static constexpr size_t kAOBiaseDim =


Typo (extra e), maybe instead kAttnBiasDim?

Oops, my bad.

jan-wassenberg · 2024-04-10T12:20:47Z

gemma/gemma.cc

+    };
+
+    struct {
+      ArrayT<float, kGriffinDim * kGriffinDim> griffin_linear_x_w;


Consider naming the struct member griffin, so we can remove the griffin_ prefix from its names?

I agree, your solution should be better.
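The rename suggested above could look roughly like the following sketch. It assumes a stand-in `ArrayT` alias, a hypothetical `kGriffinDim`, and a hypothetical attention field; only `griffin_linear_x_w` and the member name `griffin` come from the review.

```cpp
#include <array>
#include <cstddef>

// Stand-in for the project's ArrayT; the real type may differ.
template <typename T, std::size_t N>
using ArrayT = std::array<T, N>;

constexpr std::size_t kGriffinDim = 4;  // hypothetical size

struct Layer {
  struct Griffin {
    // Previously griffin_linear_x_w; the prefix moves to the member name.
    ArrayT<float, kGriffinDim * kGriffinDim> linear_x_w;
  };
  struct Attention {
    // Hypothetical field, present only so the union has two members.
    ArrayT<float, kGriffinDim * kGriffinDim> qkv_w;
  };
  union {
    Attention attn;
    Griffin griffin;
  };
};

// Call sites then read layer.griffin.linear_x_w instead of
// layer.griffin_linear_x_w.
inline float FirstLinearX(const Layer& layer) {
  return layer.griffin.linear_x_w[0];
}
```

Naming the member keeps the grouping visible at every call site and avoids relying on anonymous structs, which are a compiler extension in C++.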

jan-wassenberg · 2024-04-10T12:22:20Z

gemma/gemma.cc

+  static constexpr size_t kGatingEinsumWSize = TLayer::kGatingEinsumWSize;
+  static constexpr size_t kConv1dWidth = TLayer::kConv1dWidth;
+  static constexpr bool kFFBiases = TLayer::kFFBiases;
+  static constexpr size_t kAOBiaseDim = TLayer::kAOBiaseDim;


Here also rename?

jan-wassenberg

Nice, thanks for making the change :)

Refactor data structures to reduce memory usage

809bd07

jan-wassenberg requested changes Apr 10, 2024

ufownl added 2 commits April 10, 2024 20:38

Fix typos

4e960d6

Rename the fields of Griffin weights

e541707

jan-wassenberg approved these changes Apr 10, 2024

jan-wassenberg added the copybara-import label (Trigger Copybara for merging pull requests) Apr 10, 2024

copybara-service bot merged commit 342e998 into google:dev Apr 10, 2024

copybara-service[bot] merged 3 commits into google:dev from ufownl:refactor/data_structures

2 participants