Skip to content

add simdutf::validate_base64#568

Open
anonrig wants to merge 3 commits intomasterfrom
yagiz/add-validate-base64
Open

add simdutf::validate_base64#568
anonrig wants to merge 3 commits intomasterfrom
yagiz/add-validate-base64

Conversation

@anonrig
Copy link
Member

@anonrig anonrig commented Sep 27, 2024

Let's add a simplistic implementation where we can iterate for performance later.

Fixes #566

@anonrig anonrig requested a review from lemire September 27, 2024 16:10
@anonrig anonrig force-pushed the yagiz/add-validate-base64 branch 6 times, most recently from 1c4b717 to 2e82bc5 Compare September 27, 2024 16:23
@anonrig anonrig force-pushed the yagiz/add-validate-base64 branch from 2e82bc5 to f27ca67 Compare September 27, 2024 16:27
@anonrig
Copy link
Member Author

anonrig commented Sep 29, 2024

@lemire any objections/concerns for landing this?

@lemire
Copy link
Member

lemire commented Sep 30, 2024

@anonrig Happy to merge if you recommend it.

  • Do we want to pass a parameter to enable (for example) base64url?
  • Should we update the README to include a reference to this function?

@lemire
Copy link
Member

lemire commented Sep 30, 2024

Also, please do make sure that you want to exact functionality. :-) As once we merge it in, we might be stuck supporting it as it is. :-)

I am just careful about adding new functionality.

@lemire
Copy link
Member

lemire commented Oct 1, 2024

@anonrig

There is a trivial conflict.

@anonrig
Copy link
Member Author

anonrig commented Oct 1, 2024

There is a trivial conflict.

I'll take a look @lemire

@WojciechMula
Copy link
Collaborator

@anonrig @lemire is there anything that prevents us from merging? (apart from conflicts)

@lemire
Copy link
Member

lemire commented Feb 19, 2025

@WojciechMula

There is no objection on my part. See my comments. I am mostly just concerned about making sure that we offer the right functionality.

@anonrig
Copy link
Member Author

anonrig commented Apr 26, 2025

I can rebase this pull-request, if we're ok with it? Can you respond with a thumbs up or a thumbs down?


simdutf_warn_unused bool validate_base64(const char* input, size_t size) noexcept {
if (simdutf_unlikely(size == 0)) {
return false;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't this return true?

if (padding > 2 || i < size - 2) {
return false;
}
} else if (base64_lookup[c] == 0 && c != 'A') {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why the 'A' check? if c is 'A' (0x41) the table gives 0? please explain.

}
}

void validate_base64(std::span<const char> chardata) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this file implements a differential fuzzer which does not make sense when all implementations are using the same fallback function.
I would rather see unit tests added.

* @param len the length of the string in bytes.
* @return true if and only if the string is valid base64.
*/
simdutf_warn_unused bool validate_base64(const char *buf, size_t len) noexcept;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be good with a span overload for this function!

return (length + 2)/3 * 4; // We use padding to make the length a multiple of 4.
}

// Lookup table for valid base64 characters.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this be moved inside the function, to reduce scope and visibility?

@anonrig
Copy link
Member Author

anonrig commented Apr 26, 2025

Thanks for the review @pauldreik. I'll address them soon.

@lemire
Copy link
Member

lemire commented Apr 28, 2025

I can rebase this pull-request, if we're ok with it? Can you respond with a thumbs up or a thumbs down?

Sure, sure.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Adding is_base64(const char*, size_t) API

5 participants