Advice
0 votes
1 reply
71 views

While analyzing the Spectre vulnerability, I ran into a question about how branch prediction training works. My understanding is that the CPU accumulates prediction history for a specific conditional ...
Nikolay Isaev's user avatar
Tooling
0 votes
1 reply
63 views

I am working on a microarchitectural tooling project, and as part of a heuristic I need the ability to observe and manipulate the internal state of a branch predictor. Specifically, I am looking for ...
Gal Kaptsenel's user avatar
2 votes
0 answers
290 views

There is a well-known question that I assume most C++ developers are familiar with: Why is processing a sorted array faster than processing an unsorted array? Answer: branch prediction. Then I tried ...
OopsUser's user avatar
  • 4,894
25 votes
2 answers
4k views

The C++ standard [dcl.attr.likelihood] says: [Note 2: Excessive usage of either of these attributes is liable to result in performance degradation. — end note] I’m trying to understand what “...
Artyom Fedosov's user avatar
3 votes
0 answers
119 views

I have a performance-critical C++ code base, and I want to improve (or at least measure if it's worth improving) the likelihood that clang assigns to branches, and in general understand what it's ...
meisel's user avatar
  • 2,625
3 votes
1 answer
140 views

How do jumps and call-ret pairs affect the CPU front-end decoder in the best-case scenario, when there are few instructions, they are well cached, and branches are well predicted? For example, I run a ...
xealits's user avatar
  • 4,818
4 votes
1 answer
120 views

Sometimes we purposefully leave NOPs in a function for later runtime patching. Instead of: .nops 16 Why not: jmp 0f .nops 14 0: Or, if the amount that you need to patch in, varies up to a maximum: ....
Joseph Garvin's user avatar
7 votes
1 answer
216 views

While trying to measure the impact of branch misprediction, I've noticed that there is no penalty at all for a mispredicted branch. Based on the famous Stack Overflow question: Why is processing a ...
OopsUser's user avatar
  • 4,894
2 votes
1 answer
104 views

I have the following code in nanoMips: loop: lw $t1, A($t0) lw $t2, B($t0) sub $t3, $t1, $t2 beq $t3, $r0, else sw $t2, A($t0) b end The exercise asks me to implement the not-taken branch prediction ...
papitas's user avatar
  • 21
5 votes
0 answers
372 views

There was recently news/benchmarks showing that Ryzen processors (Ryzen 4 and 5) benefit in games from the Windows update. AMD in their blog wrote this is because of branch-prediction changes to ...
NoSenseEtAl's user avatar
  • 30.9k
0 votes
1 answer
247 views

When learning about the basic 5-stage pipelined processor that does in-order execution, the number of wasted cycles per branch misprediction is a constant when the pipeline is flushed. But what ...
Gehaktmolen's user avatar
3 votes
1 answer
387 views

As noted in the Intel optimization manual: The default predicted target for indirect branches and calls is the fall-through path. Fall-through prediction is overridden if and when a hardware ...
Changbin Du's user avatar
0 votes
0 answers
34 views

There are many questions on finding a GUID in a list, etc., but I could not find any on just determining whether a message was seen before or not. I have an API which receives requests with a ...
pooya13's user avatar
  • 2,859
2 votes
1 answer
152 views

This is my code. I can totally understand that benchIfAndSwitch is faster than benchSiwtch because of branch prediction, but why is benchEnum not the fastest one? There is no if or switch statement. ...
FressMonster's user avatar
0 votes
1 answer
244 views

I have performance-critical code which calculates an interatomic force field. It is controlled by variables like bPBC, shifts, doBonds, doPiSigma, doPiPiI, which can be switched on and off by the user, which ...
Prokop Hapala's user avatar
1 vote
0 answers
590 views

I don't think this is a duplicate, as this question is regarding how to write optimal code to cater to the branch predictor, as well as validating my personal understanding of how it works in general. ...
Jam's user avatar
  • 594
0 votes
0 answers
56 views

https://developers.google.com/admob/ios/privacy class ViewController: UIViewController { // Use a boolean to initialize the Google Mobile Ads SDK and load ads once. private var ...
Gargo's user avatar
  • 1,378
1 vote
0 answers
121 views

I'm studying the MIPS pipeline in the Patterson and Hennessy textbook. The picture below shows the changes for the beq instruction: the idea is to calculate the branch target and detect whether the branch is taken in the decode stage,...
Mr.Robot's user avatar
0 votes
1 answer
136 views

I am currently implementing selection sort, using the code below and the driver file below to test it. I am trying micro-optimizations to see what speeds it up. public static void ...
Ooh Ben's user avatar
  • 11
1 vote
1 answer
163 views

In O3, only one algorithm, bpred_unit, is used, but gem5 also provides several other branch prediction algorithms. I want to compare the prediction accuracy of the different algorithms. What should I do?...
Gdnxn Dhfjc's user avatar
1 vote
4 answers
401 views

I know a little something about branch prediction. This happens at the CPU and has nothing to do with compilation. Although you might be able to tell the compiler if one branch is more likely than the ...
Joel's user avatar
  • 1,777
1 vote
1 answer
97 views

In computer architecture class, I learned that when the "if" statement is executed in assembly language, it involves the use of branch prediction strategies. Furthermore, it was emphasized ...
이윤수's user avatar
1 vote
3 answers
241 views

I have the following logic: struct Range { int start; int end; }; bool prev = false; Range range; std::vector<Range> result; for (int i = 0; i < n; i++) { bool curr = ...; // this is ...
ra1nsq's user avatar
  • 11
2 votes
1 answer
218 views

There are cases where (logically at least) it makes no difference if I leave out the else keyword, for example: int func(int num) { if(num == 10) return 99999; **else** return -1; } Question ...
penguin2213's user avatar
1 vote
1 answer
435 views

EDIT x 2 Added a more comprehensive function which returns an abstract register class: the function outputs a register full of floats. I don't care about the actual length - SSE, AVX... - because Google ...
stuckoverlow's user avatar
0 votes
0 answers
19 views

The GCC built-in __builtin_expect_with_probability is used for a condition check with a probability, as in the example below: __builtin_expect_with_probability(!!(x),1,1.0). Can someone tell me what is the ...
Naval's user avatar
  • 1
2 votes
1 answer
1k views

On x86-64 (whatever the microarchitecture) and ARM64 devices, how many clock cycles does a mispredicted conditional branch cost? And I suppose I should also ask what the figure is for a successfully ...
Cecil Ward's user avatar
1 vote
1 answer
337 views

Let's say I have a function that accepts a callback argument (examples given in Rust and C) void foo(void (*bar)(int)) { // lots of computation bar(3); } fn foo(bar: fn(u32)) { // lots of ...
ajp's user avatar
  • 2,575
1 vote
0 answers
186 views

Modern CPUs since at least the 486 ¹) have a tightly-pipelined design, so conditional branches can cause "stalls" in which the pipeline has to be flushed and the code restarted on a ...
Coder's user avatar
  • 247
0 votes
1 answer
548 views

I need to get the prediction details of a batch prediction job, which are stored on Google Cloud Storage. However, to get them I need the job ID from BatchPredictionJob. I tried to write the results to ...
krissy's user avatar
  • 11
1 vote
0 answers
284 views

For branch prediction, the BHT (branch history table) is indexed by the branch's virtual address. An aliasing problem occurs when two or more branches hash to the same entry in the BHT, ...
Changbin Du's user avatar
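To make the aliasing described in the excerpt above concrete, here is a minimal sketch. The index function is an assumption made purely for illustration (drop the two low address bits, keep the next `index_bits` bits); real predictors use undocumented and usually more elaborate hashes, often mixing in branch history. The names `bht_index` and `branches_alias` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical BHT index: discard the 2 low address bits, then keep
// `index_bits` bits. This is an illustrative assumption, not the hash
// used by any particular CPU.
std::uint32_t bht_index(std::uint64_t branch_address, unsigned index_bits) {
    assert(index_bits > 0 && index_bits < 32);
    const std::uint64_t mask = (std::uint64_t{1} << index_bits) - 1;
    return static_cast<std::uint32_t>((branch_address >> 2) & mask);
}

// Two distinct branches alias when they select the same table entry.
bool branches_alias(std::uint64_t a, std::uint64_t b, unsigned index_bits) {
    return a != b && bht_index(a, index_bits) == bht_index(b, index_bits);
}
```

Under this assumed scheme, a table with 10 index bits makes branches whose addresses differ by a multiple of 4096 bytes share an entry, so `branches_alias(0x401000, 0x402000, 10)` is true while neighbouring branches at `0x401000` and `0x401004` map to different entries.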
0 votes
2 answers
1k views

Is 2-bit prediction always better than 1-bit? And, from Wikipedia, how is it that ‘a loop-closing conditional jump is mispredicted once rather than twice’ with 2-bit prediction? According to this answer, 2-bit ...
An5Drama's user avatar
  • 774
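The Wikipedia claim quoted in the excerpt above can be checked with a small simulation. This is only a sketch under stated assumptions: a textbook 1-bit "last outcome" predictor and a textbook 2-bit saturating counter, both initialised to predict taken, applied to one loop-closing branch in isolation. The function names are hypothetical, and real predictors are considerably more complex.

```cpp
#include <cstddef>
#include <vector>

// 1-bit predictor: predict whatever the branch did last time.
std::size_t mispredicts_1bit(const std::vector<bool>& outcomes) {
    bool last_taken = true;  // assumed initial state: predict taken
    std::size_t misses = 0;
    for (bool taken : outcomes) {
        if (taken != last_taken) ++misses;
        last_taken = taken;
    }
    return misses;
}

// 2-bit saturating counter: states 0 and 1 predict not taken,
// states 2 and 3 predict taken.
std::size_t mispredicts_2bit(const std::vector<bool>& outcomes) {
    int counter = 3;  // assumed initial state: strongly taken
    std::size_t misses = 0;
    for (bool taken : outcomes) {
        const bool predicted_taken = counter >= 2;
        if (taken != predicted_taken) ++misses;
        if (taken) {
            if (counter < 3) ++counter;
        } else {
            if (counter > 0) --counter;
        }
    }
    return misses;
}

// Outcomes of a loop-closing branch: taken (iterations - 1) times, then
// not taken once on loop exit, with the whole loop executed `runs` times.
std::vector<bool> loop_branch_trace(std::size_t iterations, std::size_t runs) {
    std::vector<bool> trace;
    for (std::size_t r = 0; r < runs; ++r) {
        for (std::size_t i = 0; i + 1 < iterations; ++i) trace.push_back(true);
        trace.push_back(false);
    }
    return trace;
}
```

For a 10-iteration loop executed 5 times, the 1-bit version mispredicts 9 times (the exit of every run plus the first iteration of each later run), while the 2-bit version mispredicts 5 times (only the exits). The single not-taken outcome moves the counter from strongly taken to weakly taken, which still predicts taken on re-entry. This covers the loop case only and does not show that 2-bit prediction wins on every pattern.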
3 votes
0 answers
186 views

I am currently looking for answers to why gcc generates strange instructions like "rep ret" in the generated assembly code. I came across a question on Stack Overflow where someone raised a ...
Michael Coleman's user avatar
3 votes
1 answer
894 views

I am learning about pipelining and was reading about control hazards from the book Computer Organization and Design: The Hardware/Software Interface (MIPS Edition). There is a paragraph in the book (...
Prithvidiamond's user avatar
0 votes
0 answers
39 views

If the CPU is already speculatively executing the path of branch A, will it continue to speculatively execute the next branch B, or wait until branch A retires? if (A) { /* body of branch A */ if(B) { ...
Changbin Du's user avatar
0 votes
0 answers
141 views

I was wondering if I have a branch bool condition = x > y; // just an example if(condition) { // do the thing... } else { // do the other thing... } It can be optimized to something like this ...
M.kazem Akhgary's user avatar
3 votes
0 answers
160 views

I have an AVL tree and I need to traverse it in ascending and descending order. I implemented a simple algorithm, where knowing the tree size in advance, I allocate an array and assign 0 to a counter, ...
Serge Rogatch's user avatar
1 vote
0 answers
137 views

Before I begin, yes, I'm aware of the compiler built-ins __builtin_expect and __builtin_unpredictable (Clang). They do solve the issue to some extent, but my question is about something neither ...
Mona the Monad's user avatar
3 votes
1 answer
237 views

I know that most modern processors maintain a branch prediction table (BPT). I have read the gdb documentation but I could not find any command that gives the desired results. Based on this, I have ...
Taimoor Zaeem's user avatar
7 votes
3 answers
2k views

I came across this very nice infographic which gives a rough estimate of the CPU cycles used for certain operations. While studying it I noticed an entry "Right branch of if" which I ...
glades's user avatar
  • 5,392
-1 votes
1 answer
484 views

In the Go standard library file src/sync/once.go, a recent revision changed the snippet if atomic.LoadUint32(&o.done) == 1 { return } //otherwise ... to: //if atomic.LoadUint32(&o.done) == ...
agnes's user avatar
  • 21
7 votes
1 answer
735 views

While trying to benchmark implementations of a simple sparse unit lower triangular backward solve in CSC format, I observe strange behavior. The performance seems to vary drastically, depending on ...
mjacobse's user avatar
  • 367
0 votes
1 answer
546 views

I know that modern CPUs do OoO execution and have advanced branch predictors that may mispredict. How does the debugger deal with that? So, if the CPU fails to predict a branch, how does the debugger know ...
Ahmed Ehab's user avatar
5 votes
1 answer
2k views

Using C++ templates and if constexpr, I found a trick that I like a lot: suppose you have a function with some tunable options that are known at compile time; I can write something like template <bool ...
MaPo's user avatar
  • 887
0 votes
3 answers
553 views

Here is some C++ pseudo-code as an example: bool importantFlag = false; for (SomeObject obj : arr) { if (obj.someBool) { importantFlag = true; } obj.doSomethingUnrelated(); } ...
Greg's user avatar
  • 63
3 votes
0 answers
3k views

I want to understand the branch prediction behavior of a program I work on. For this, I use the perf tool. I recorded with: perf record -e branches,branch-misses and am visualizing it with perf report --...
Konstantin Solomatov's user avatar
5 votes
0 answers
150 views

Consider this code: .globl _non_tail, _tail .text .code32 _non_tail: lcall $0x33, $_non_tail.heavensgate ret .code64 _non_tail.heavensgate: # do stuff. there's 12 bytes on the stack ...
Joseph Sible-Reinstate Monica's user avatar
6 votes
2 answers
969 views

I looked at the wiki article on branch target predictor; it's somewhat confusing: I thought the branch target predictor comes into play when a CPU decides which instruction(s) to fetch next (into the ...
ledonter's user avatar
  • 1,769
0 votes
0 answers
332 views

So I have this code snippet in C int unit_test_case08(int a, int b) { int success = 1336; if(a != b) { success = 1337; } else { success = -1; } return ...
BBBBBBBBBBBBBBBBBBBBBBBBB's user avatar
0 votes
1 answer
198 views

Is there a tool available to profile Java applications regarding branch (mis)prediction statistics for if statements? I know VisualVM and JDK Mission Control but did not find such functionality.
Mahatma_Fatal_Error's user avatar
