3

I have a function that takes as input an array of unsigned char (16 values exactly) and creates a GUID (from GUID struct) by parsing all values to hexadecimal and passing a guid-formatted string to UuidFromStringA().

My code is as follows:

GUID CreateGuid(const uint8_t* data)
{
    createGuidFromBufferData(hexValues, data);
    int uuidCreationReturnCode = UuidFromStringA((RPC_CSTR)hexValues, &guid);
    return guid;
}
            
inline void createGuidFromBufferData(char* hexValues, const uint8_t* data)
{
    decimalToHexadecimal(data[3], hexValues, 0);
    decimalToHexadecimal(data[2], hexValues, 2);
    decimalToHexadecimal(data[1], hexValues, 4);
    decimalToHexadecimal(data[0], hexValues, 6);
    hexValues[8] = '-';
    decimalToHexadecimal(data[5], hexValues, 9);
    decimalToHexadecimal(data[4], hexValues, 11);
    hexValues[13] = '-';
    decimalToHexadecimal(data[6], hexValues, 14);
    decimalToHexadecimal(data[7], hexValues, 16);
    hexValues[18] = '-';
    decimalToHexadecimal(data[8], hexValues, 19);
    decimalToHexadecimal(data[9], hexValues, 21);
    hexValues[23] = '-';
    decimalToHexadecimal(data[10], hexValues, 24);
    decimalToHexadecimal(data[11], hexValues, 26);
    decimalToHexadecimal(data[12], hexValues, 28);
    decimalToHexadecimal(data[13], hexValues, 30);
    decimalToHexadecimal(data[14], hexValues, 32);
    decimalToHexadecimal(data[15], hexValues, 34);
}

inline void decimalToHexadecimal(uint8_t decimalValue, char* outputBuffer, int currentIndex)
{
    const char hexValues[] = "0123456789abcdef";
    outputBuffer[currentIndex] = hexValues[decimalValue >> 4];
    outputBuffer[currentIndex + 1] = hexValues[decimalValue & 0xf];
}

This works fine, but I would like to do something a bit more efficient, and create the GUID directly using my input char array with something like this:

GUID CreateGuid(const uint8_t* data)
{
    GUID guid = { 
        *reinterpret_cast<const unsigned long*>(data), 
        *reinterpret_cast<const unsigned short*>(data + 4), 
        *reinterpret_cast<const unsigned short*>(data + 6), 
        *reinterpret_cast<const unsigned char*>(data + 8)
    };
    return guid;
}

When doing so, only one of the last 8 bytes is set and the rest are 0; For example using unsigned char array [38, 150, 233, 16, 43, 188, 117, 76, 187, 62, 254, 96, 109, 226, 87, 0]

when I should be getting:

10e99626-bc2b-754c-bb3e-fe606de25700

I'm getting instead:

10e99626-bc2b-75dc-bb00-000000000000}

3
  • 2
    Please show a minimal reproducible example Commented Aug 25, 2021 at 7:54
  • maybe use sprintf Commented Aug 25, 2021 at 8:11
  • @Bodo that would still be creating a string from my unsigned char array instead of using the unsigned char array directly Commented Aug 25, 2021 at 8:15

2 Answers 2

3

GUID is an aggregate with a trivial copy assignment. Because of that, you should be able to do this directly.

GUID CreateGuid(const uint8_t* data)
{
    return *reinterpret_cast<GUID*>(data)
}
Sign up to request clarification or add additional context in comments.

2 Comments

Isn't that UB? In C++20 you could use std::bit_cast instead of reinterpret_cast.
Also note that, although the GUID structure is well-designed (long, short, short, char[]) in regards to padding, this method is not guaranteed to work on any 'trivial' aggregate, because of potential padding between elements.
1

You can't fill an 8-character array simply by dereferencing a cast pointer to your source data and assigning that value to its first element, as you are attempting. As you have noticed, doing so will only assign a value to that first element.

For the 8-character array at the end of the GUID structure, you can initialise each element individually, extending the cast + offset approach:

GUID CreateGuid(const uint8_t* data)
{
    GUID guid = {
        *reinterpret_cast<const unsigned long*>(data),
        *reinterpret_cast<const unsigned short*>(data + 4),
        *reinterpret_cast<const unsigned short*>(data + 6),
        { // We need to assign each of the 8 elements of the array ...
            *reinterpret_cast<const unsigned char*>(data + 8),
            *reinterpret_cast<const unsigned char*>(data + 9),
            *reinterpret_cast<const unsigned char*>(data + 10),
            *reinterpret_cast<const unsigned char*>(data + 11),
            *reinterpret_cast<const unsigned char*>(data + 12),
            *reinterpret_cast<const unsigned char*>(data + 13),
            *reinterpret_cast<const unsigned char*>(data + 14),
            *reinterpret_cast<const unsigned char*>(data + 15),
        }
    };
    return guid;
}

Note: As pointed out in the comments (thanks, Timo), using this direct initialization approach risks invoking undefined behaviour, especially if the passed source data buffer isn't suitably aligned for the corresponding destination types, because of the Strict Aliasing Rule. To avoid this, you should use memcpy to transfer data from the source buffer to the destination structure's elements:

GUID CreateGuid(const uint8_t* data)
{
    GUID guid;
    std::memcpy(&guid.Data1, data, sizeof(long));
    std::memcpy(&guid.Data2, data + 4, sizeof(short));
    std::memcpy(&guid.Data3, data + 6, sizeof(short));
    std::memcpy(guid.Data4, data + 8, 8); // sizeof(char) == 1
    return guid;
}

Further, as the GUID structure is defined in such a way that there should be no padding bytes added between its elements, you could copy the data in one fell swoop:

GUID CreateGuid(const uint8_t* data)
{
    GUID guid;
    std::memcpy(&guid, data, sizeof(GUID)); // Assuming no padding!
    return guid;
}

4 Comments

Aren't those reinterpret_casts UB because of the type aliasing rule? At least the non-char ones
@Timo Possibly. But, if the passed data as a whole are properly aligned (i.e. the 0 element is suitably aligned for a long), then there should be no issues. But you're right - using memcpy() rather than direct initialisation would be better/safer ...
"if the passed source data buffer isn't suitably aligned". I think, strictly speaking, this doesn't matter for the UB case because you're always violating the strict aliasing rule (unless data points to a GUID object). Even float f = 1.0; int32_t x = *reinterpret_cast<int32_t*>(&f) is UB according to strict aliasing, even though they have the same alignment.
@Timo Edited again ... and now I'm going for some much-needed coffee! :-)

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.