21 questions
1
vote
2
answers
251
views
Can ARM exclusive load-store implement lock-free atomics?
Even though exclusions are cleared on trap boundaries, they appear atomic to application code in the functional sense.
But I'm not certain about the progress guarantee that'd otherwise be made by ...
1
vote
3
answers
253
views
Will reads by other cores clear the exclusive status (LDREX) on ARM SMP?
I am writing an assembly program in the SMP kernel, which may run on the ARMv7-A or AArch64 architecture.
This program is run with irq_disabled, so if I ldrex a memory address, the exclusive status will ...
1
vote
1
answer
243
views
LDAXRB and STXRB instructions - what is the "exclusive access to the memory address" in ARM64?
I'm trying to better understand atomic instructions on the ARM64 architecture.
So I'm testing this simple C code, using a Microsoft intrinsic (compiled with Visual Studio C++ 2022):
long v = 0;
...
0
votes
1
answer
129
views
Can you snoop cache coherence traffic to implement load-linked and store-conditional?
I kind of want to implement a form of LL/SC for x86-64 (Sapphire/Emerald Rapids most likely). It seems that the cache has all the info needed to do this, but I need to know when a cache line is ...
0
votes
0
answers
277
views
What are the costs of disabling interrupts vs. LDREX/STREX on ARM Cortex-M?
On ARM Cortex-M, I'm aware of only two ways to achieve atomicity:
LDREX/STREX
Disable interrupts
Both can be used in a very similar way: For example, define volatile bool is_locked, and check / set ...
1
vote
1
answer
84
views
__sync_add_and_fetch triggers an SError interrupt on Raspberry Pi 4B
When I use GCC's __sync_add_and_fetch to atomically increment an integer on my Raspberry Pi 4B, the following code is generated:
172e4: c85f7e60 ldxr x0, [x19]
172e8: 91000400 ...
1
vote
1
answer
167
views
Can a lock avoid LR/SC 'spurious failure'?
I am studying 'Computer Organization and Design', RISC-V edition, by David A. Patterson, and the Elaboration on page 254 has the code below.
Below are the book contents and the related code:
While the code above implemented ...
1
vote
1
answer
833
views
Implementing global monitor for exclusive access
I am implementing a global monitor for exclusive access (for ARM cores).
Question: if a particular exclusive transaction is successful, should I signal a clear on the global monitor?
In the case above is ...
2
votes
0
answers
1k
views
In the RISC-V architecture, how does the store-conditional instruction detect that the memory has been modified?
The following code snippet is from Computer Organization and Design, RISC-V edition, 2nd edition.
Suppose that the memory location addressed by register x20 is modified after execution of lr.w ...
4
votes
2
answers
4k
views
What's 'reservation' in RISC-V's 'lr' instruction?
From 8.2 Load-Reserved/Store-Conditional Instructions chapter in RISC-V's unprivileged ISA Manual,
LR.W loads a word from the address in rs1, places the sign-extended value in rd, and registers a ...
2
votes
1
answer
775
views
How does this guarantee a value has been atomically updated on ARM?
ARM provides LDREX/STREX to atomically load/store values, but I feel like I'm missing something in how this is still an atomic operation. The following is generally how an increment by one would ...
2
votes
1
answer
550
views
atomic linked-list LIFO in AArch64 assembly, using load or store between ldxr / stxr
I have implemented a LIFO for a shared-memory context using assembly for 64-bit ARMv8.
The LIFO inserts a node at the beginning, and the first attribute of each node structure must be the next pointer.
Is this correct ...
9
votes
2
answers
3k
views
When is CLREX actually needed on ARM Cortex M7?
I found a couple of places online which state that CLREX "must" be called whenever an interrupt routine is entered, which I don't understand. The docs for CLREX state (added the numbering ...
1
vote
0
answers
695
views
ARM Cortex-M4/7: Do regular memory accesses between LDREX/STREX invalidate the exclusive monitor
I am trying to rewrite a code section that currently works by disabling/enabling interrupts so that it uses LDREX/STREX instead, on an STM32F7 (single-core microcontroller).
May sound like a straightforward question, ...
6
votes
2
answers
3k
views
How is a spin lock woken up in Linux/ARM64?
In the Linux kernel, arch_spin_lock() is implemented as follows:
static inline void arch_spin_lock(arch_spinlock_t *lock)
{
unsigned int tmp;
arch_spinlock_t lockval, newval;
asm ...
5
votes
1
answer
3k
views
What's the advantage of LL/SC when compared with CAS (compare-and-swap)?
What's the advantage of LL/SC compared with CAS (compare-and-swap) in computer architecture? I think LL/SC can cause livelock in many-core systems and cause the ABA problem, but CAS does not. I cannot ...
3
votes
1
answer
2k
views
Lock-free C++11 example using Load-link/store-conditional to prevent ABA?
When writing lock-free code using the Compare-and-Swap (CAS) technique there is a problem called the ABA problem:
http://en.wikipedia.org/wiki/ABA_problem
whereby comparing just on the value "A" is ...
1
vote
2
answers
4k
views
How do ldrex / strex make atomic_add in ARM an atomic operation?
As per http://lxr.free-electrons.com/source/arch/arm/include/asm/atomic.h#L31
static inline void atomic_add(int i, atomic_t *v)
{
unsigned long tmp;
int result;

...
4
votes
2
answers
3k
views
ARM LL/SC exclusive access by register width or cache line width?
I'm working on the next release of my lock-free data structure library, using LL/SC on ARM.
For my use-case of LL/SC, I need to use it with a single STR between the LDREX and STREX. (Rather than ...
11
votes
2
answers
8k
views
compare-and-swap atomic operation vs Load-link/store-conditional operation
On an x86 processor, I am not sure of the difference between the compare-and-swap atomic operation and the load-link/store-conditional operation. Is the latter safer than the former? Is it the case that the ...
1
vote
3
answers
2k
views
How does x86 handle store-conditional instructions?
I am trying to find out what an x86 processor does when it encounters a store-conditional instruction. For instance, does it stall the front end of the pipeline and wait for the ROB to become ...