
Monday, September 12, 2011

Atomic read-modify-write operation on multicore

Recently, while working on our Shotgun large-scale sparse logistic regression code, I learned a cool programming trick from Aapo Kyrola, who in turn learned it from Prof. Guy Blelloch from CMU.

The problem arises when you have an array, and you want multiple cores to add values to the same array position concurrently. Without the needed precautions this is a data race: two cores can read the same old value, and one of the two additions is then silently lost.
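To make the failure concrete, here is a small sketch that replays one unlucky interleaving of two unsynchronized additions in a single thread. The function name and the thread labels A and B are made up for illustration; a real race would interleave unpredictably.

```cpp
// Single-threaded replay of one unlucky interleaving of two unsynchronized
// "cell += 1.0" updates. The thread labels A and B are hypothetical.
inline double simulate_lost_update() {
    double cell = 0.0;        // the shared array entry
    double seen_by_a = cell;  // thread A reads 0.0
    double seen_by_b = cell;  // thread B reads 0.0 before A writes back
    cell = seen_by_a + 1.0;   // A writes 1.0
    cell = seen_by_b + 1.0;   // B also writes 1.0, overwriting A's update
    return cell;              // 1.0, although two increments were issued
}
```

Two additions were issued, but the array entry only reflects one of them.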

A nice way to solve this problem is to define the following scary assembler procedure:

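bool CAS(long *ptr, long oldv, long newv) {
    unsigned char ret;
    /* Note that sete sets a 'byte' not the word */
    /* cmpxchgq compares rax (oldv) with *ptr: if they are equal it stores
       newv into *ptr, otherwise it loads *ptr into rax. Both *ptr and rax
       can therefore be modified, so they are read-write ("+") operands. */
    __asm__ __volatile__ (
        "  lock\n"
        "  cmpxchgq %3,%1\n"
        "  sete %0\n"
        : "=q" (ret), "+m" (*ptr), "+a" (oldv)
        : "r" (newv)
        : "memory", "cc");
    return ret;
}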
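The above procedure does not take a lock. It wraps the x86 compare-and-swap instruction, which atomically writes newv to *ptr only if *ptr still holds oldv, and returns whether the write happened. When several threads race on the same array entry, only one of them succeeds; the others see a failure and must retry.
The way to use this procedure is as follows: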
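// Assumes arr is a shared array of double, and that long and double
// are both 64 bits wide (true on 64-bit Linux).
void add(int idx, double fact) {
    volatile double prev;
    volatile double newval;
    do {
        prev = arr[idx];       // read the current value
        newval = prev + fact;  // compute the updated value
        // write newval back only if arr[idx] still holds prev; otherwise retry
    } while (!CAS(reinterpret_cast<long *>(&arr[idx]),
                  *reinterpret_cast<volatile long *>(&prev),
                  *reinterpret_cast<volatile long *>(&newval)));
}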
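For readers who would rather avoid inline assembly, the same retry loop can be sketched with C++11 std::atomic. This is not the code we use in Shotgun, just a minimal illustration; the names atomic_add and run_demo are made up, and it assumes a compiler with C++11 thread support.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical helper: atomically add `fact` to `cell` with a CAS retry loop.
inline void atomic_add(std::atomic<double>& cell, double fact) {
    double expected = cell.load();
    // compare_exchange_weak stores expected + fact only if cell still holds
    // `expected`; on failure it reloads `expected` with the current value,
    // so the next iteration recomputes the sum from fresh data.
    while (!cell.compare_exchange_weak(expected, expected + fact)) {
    }
}

// Hypothetical demo: num_threads threads each add 1.0 to one shared cell
// adds_per_thread times; the final value of the cell is returned.
inline double run_demo(int num_threads, int adds_per_thread) {
    std::atomic<double> cell(0.0);
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&cell, adds_per_thread] {
            for (int i = 0; i < adds_per_thread; ++i) {
                atomic_add(cell, 1.0);
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return cell.load();
}
```

Calling run_demo(4, 100000) should return exactly 400000, since no addition can be lost. With GCC or Clang this typically needs the -pthread flag.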
And here is a more detailed explanation from Guy Blelloch:
The CAS instruction is one of the machine instructions on the x86 processors (the first function is just calling the instruction, which has name cmpxchgq). Probably the best book that describes its various applications is Herlihy and Shavit's book titled "The Art of Multiprocessor Programming". The general use is for implementing atomic read-modify-write operations. The idea is to read the value, make some modification to it (e.g. increment it) and then write it back if the value has not changed in the meantime. The CAS(ptr, a, b) function conditionally writes a value b into ptr if the current value equals a.
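The conditional write described above can be seen in a tiny single-threaded sketch. It uses std::atomic<long> instead of the assembler routine, and the names cas and demo_cas are made up for illustration.

```cpp
#include <atomic>

// Hypothetical wrapper mirroring the CAS(ptr, a, b) signature described above.
inline bool cas(std::atomic<long>& cell, long a, long b) {
    // compare_exchange_strong overwrites its first argument on failure,
    // so work on a local copy of `a`.
    long expected = a;
    return cell.compare_exchange_strong(expected, b);
}

// Tries one CAS with a fresh expected value and one with a stale value,
// then returns the final content of the cell (or -1 if either CAS
// behaved differently from what the comments below describe).
inline long demo_cas() {
    std::atomic<long> cell(5);
    bool first = cas(cell, 5, 6);   // succeeds: cell held 5, now holds 6
    bool second = cas(cell, 5, 7);  // fails: cell holds 6, not 5; unchanged
    return (first && !second) ? cell.load() : -1;
}
```

The second call fails because the value changed in the meantime, which is exactly the signal that tells a caller such as add to re-read and retry.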