Intro to data-oriented design
Stoyan Nikolov
@stoyannk
stoyannk.wordpress.com
github.com/stoyannk
What do we do?
● Game UI middleware
○ Coherent UI 2.x
○ Coherent GT 1.x
○ Unannounced project
● Stoyan Nikolov - Co-Founder & Software Architect
Quick OOP overview
● Introduces “real-world” abstractions
● Couples “data” with the operations (code) on
them
● Treats objects as black boxes
● Promises easier code reuse and
maintenance
But..
● Was born in an era when machines were
different from the ones we have now
● Tries to hide the data instead of embracing
it
● Reuse and maintainability are often hurt
by excessive coupling
Cache misses.. ouch!
Image by Chris Adkin - http://chrisadkin.org/
Data-oriented design
● Relatively recent trend primarily in game
development (but the idea is old)
● Think about the data & flow first
● Think about the hardware your code runs on
● Build software around data & transforms on
it, instead of an artificial abstraction
Goals
● Higher performance
● Easier to parallelise
● Easier to maintain
Sounds good, but..
● Although simple in essence, data-oriented
design can be difficult to achieve
● We probably need more time to shake off
years of OOP indoctrination
● Many “text-book” examples in presentations
are too obvious
Classic examples
● Breaking classes into pieces for better cache
utilization
● AoS -> SoA
● Arrays of Components in a game engine
● “Where there’s one - there are many”
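As an illustration of the AoS -> SoA item, here is a minimal sketch. The particle type, its fields and the `Integrate` functions are hypothetical and do not come from the talk; they only show how the same update touches memory differently in the two layouts.

```cpp
#include <cstddef>
#include <vector>

// Array of Structures (AoS): all fields of one particle sit together.
// Hypothetical example type, not taken from the talk.
struct ParticleAoS {
    float x, y, z;     // position (read and written on every update)
    float vx, vy, vz;  // velocity (read on every update)
    int   textureId;   // not needed by the update
    char  name[32];    // not needed by the update
};

// Structure of Arrays (SoA): each field lives in its own contiguous array.
struct ParticlesSoA {
    std::vector<float> x, y, z;
    std::vector<float> vx, vy, vz;
    std::vector<int>   textureId;

    void Add(float px, float py, float pz,
             float pvx, float pvy, float pvz, int tex) {
        x.push_back(px);   y.push_back(py);   z.push_back(pz);
        vx.push_back(pvx); vy.push_back(pvy); vz.push_back(pvz);
        textureId.push_back(tex);
    }

    std::size_t size() const { return x.size(); }
};

// AoS update: each iteration also drags textureId and name through the
// cache, although the loop never uses them.
void Integrate(std::vector<ParticleAoS>& particles, float dt) {
    for (auto& p : particles) {
        p.x += p.vx * dt;
        p.y += p.vy * dt;
        p.z += p.vz * dt;
    }
}

// SoA update: only the six arrays the loop needs are streamed through.
void Integrate(ParticlesSoA& particles, float dt) {
    const std::size_t n = particles.size();
    for (std::size_t i = 0; i < n; ++i) {
        particles.x[i] += particles.vx[i] * dt;
        particles.y[i] += particles.vy[i] * dt;
        particles.z[i] += particles.vz[i] * dt;
    }
}
```

Both versions compute the same result. Whether the SoA layout is actually faster depends on how many unused bytes each element carries and on the access pattern, so it is worth measuring on the target hardware.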
An example from practice
● The rendering backend in our products is a
custom class for every API we support (9
graphics platforms)
● Library calls methods of an interface that is
implemented for every API
Classic approach
Diagram: Library -> IRenderer -> Dx 11 Renderer | OpenGL Renderer | ...
class IRenderer
{
public:
    virtual bool CreateVertexBuffer(...) = 0;
    // .. other res. management
    virtual void SetRenderTarget(...) = 0;
    virtual void SetVertexBuffer(...) = 0;
    virtual void SetTexture(...) = 0;
    virtual void DrawIndexed(...) = 0;
    // .. etc.
};
Classic flow
void LibraryDrawScene(const RenderingOpsVec& operations)
{
    m_Renderer->SetRenderTarget(...);
    for (const auto& op : operations) {
        auto& mesh = FindMesh(op);
        EnsureBuffers(mesh);
        // Every call below is a virtual jump into the Renderer
        m_Renderer->SetVertexBuffer(...);
        m_Renderer->SetIndexBuffer(...);
        DecideShaders(op);
        m_Renderer->UpdateCB(.., op->GetTransform());
        // .. etc.
        m_Renderer->DrawIndexed(..);
    }
}
Likely cache misses :(
What is happening?
● We have interleaved the processing of the Library data with that of the
Renderer
● Both have a large data footprint
● We jump into the Renderer but the cache is full of Library data (cache miss)
● The Renderer does a lot of computation and populates the cache with its
data
● We jump back into the Library but the cache is full of Renderer data -> again
a cache miss
● … and so on ...
Data-oriented version
class IRenderer
{
public:
    virtual bool CreateVertexBuffer(...) = 0;
    // .. other res. management
    virtual void ExecuteRenderingCommands(
        const void* commands, unsigned count) = 0;
    // .. etc.
};
Data-oriented flow
void LibraryDrawScene(const RenderingOpsVec& operations)
{
    CommandBuffer buffer;
    buffer.AddCommand(SetRenderTarget{..});
    for (const auto& op : operations) {
        auto& mesh = FindMesh(op);
        EnsureBuffers(mesh);
        // Commands are only recorded here; nothing runs in the Renderer yet
        buffer.AddCommand(SetVertexBuffer{});
        buffer.AddCommand(SetIndexBuffer{});
        DecideShaders(op);
        buffer.AddCommand(UpdateCB{.., op->GetTransform()});
        // .. etc.
        buffer.AddCommand(DrawIndexed{});
    }
    // A single jump into the Renderer with all the recorded commands
    m_Renderer->ExecuteRenderingCommands(buffer.data(), buffer.size());
}
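The slides do not show how CommandBuffer stores commands or how a backend reads the untyped `const void* commands` pointer. One possible arrangement, sketched below, packs each command as a type tag followed by its raw bytes and lets the backend walk that stream. The encoding, the command fields (`targetId`, `bufferId`, `indexCount`) and `FakeRenderer` are assumptions made for illustration only, not the actual implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical command set; the real payloads are elided in the slides.
enum class CommandType : std::uint32_t { SetRenderTarget, SetVertexBuffer, DrawIndexed };

struct SetRenderTarget {
    static constexpr CommandType Type = CommandType::SetRenderTarget;
    std::uint32_t targetId;
};
struct SetVertexBuffer {
    static constexpr CommandType Type = CommandType::SetVertexBuffer;
    std::uint32_t bufferId;
};
struct DrawIndexed {
    static constexpr CommandType Type = CommandType::DrawIndexed;
    std::uint32_t indexCount;
};

// Records commands back to back as [type tag][payload bytes].
class CommandBuffer {
public:
    template <typename Cmd>
    void AddCommand(const Cmd& cmd) {
        static_assert(std::is_trivially_copyable<Cmd>::value,
                      "commands are copied as raw bytes");
        const CommandType type = Cmd::Type;
        Append(&type, sizeof(type));
        Append(&cmd, sizeof(cmd));
        ++m_Count;
    }

    const void* data() const { return m_Bytes.data(); }
    unsigned size() const { return m_Count; }  // number of commands, not bytes

private:
    void Append(const void* src, std::size_t bytes) {
        const auto* p = static_cast<const unsigned char*>(src);
        m_Bytes.insert(m_Bytes.end(), p, p + bytes);
    }

    std::vector<unsigned char> m_Bytes;
    unsigned m_Count = 0;
};

// Stand-in for one backend: it only records what it was asked to do.
struct FakeRenderer {
    std::uint32_t boundTarget = 0;
    std::uint32_t boundVertexBuffer = 0;
    std::uint32_t indicesDrawn = 0;
    unsigned draws = 0;

    void ExecuteRenderingCommands(const void* commands, unsigned count) {
        const auto* cursor = static_cast<const unsigned char*>(commands);
        for (unsigned i = 0; i < count; ++i) {
            CommandType type;
            std::memcpy(&type, cursor, sizeof(type));
            cursor += sizeof(type);
            switch (type) {
            case CommandType::SetRenderTarget:
                boundTarget = Read<SetRenderTarget>(cursor).targetId;
                break;
            case CommandType::SetVertexBuffer:
                boundVertexBuffer = Read<SetVertexBuffer>(cursor).bufferId;
                break;
            case CommandType::DrawIndexed:
                indicesDrawn += Read<DrawIndexed>(cursor).indexCount;
                ++draws;
                break;
            }
        }
    }

private:
    // memcpy avoids unaligned reads from the packed byte stream.
    template <typename Cmd>
    static Cmd Read(const unsigned char*& cursor) {
        Cmd cmd;
        std::memcpy(&cmd, cursor, sizeof(cmd));
        cursor += sizeof(cmd);
        return cmd;
    }
};
```

With this layout the Library only appends bytes to one contiguous vector while it iterates over its own data, and the backend later reads that vector front to back in a single call.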
Why does it work?
● We stay in the Library - good chance that its
data stays in cache
● Control is transferred to Renderer once with
all the commands
● “Where there’s one, there are many”
● ~15% improvement JUST by changing the
API!
Key take-away
Think about what is happening on the machine
when it executes your code.
Algorithmic complexity is rarely the problem -
constant factors often hurt performance!
Thank you!
@stoyannk
stoyannk.wordpress.com
github.com/stoyannk
Come talk to us if you are interested in what we do!
References
● Data-Oriented design, Richard Fabian, http://www.dataorienteddesign.com/dodmain/
● Pitfalls of Object Oriented Programming, Tony Albrecht,
http://harmful.cat-v.org/software/OO_programming/_pdf/Pitfalls_of_Object_Oriented_Programming_GCAP_09.pdf
● Data-Oriented Design in C++, Mike Acton, https://www.youtube.com/watch?v=rX0ItVEVjHc
● Typical C++ Bullshit, Mike Acton,
http://macton.smugmug.com/gallery/8936708_T6zQX#!i=593426709&k=BrHWXdJ
● Introduction to Data-Oriented Design, DICE,
http://www.dice.se/wp-content/uploads/2014/12/Introduction_to_Data-Oriented_Design.pdf
● Culling the Battlefield: Data Oriented Design in Practice, Daniel Collin,
http://www.slideshare.net/DICEStudio/culling-the-battlefield-data-oriented-design-in-practice
● Adventures in data-oriented design, Stefan Reinalter,
https://molecularmusings.wordpress.com/2011/11/03/adventures-in-data-oriented-design-part-1-mesh-data-3/
● Building a Data-Oriented Entity System, Niklas Frykholm,
http://bitsquid.blogspot.com/2014/08/building-data-oriented-entity-system.html
