I'm trying to calculate a hash from a string for best-effort ordering and partioning purposes in Athena. There is no String to hashCode() similar in Athena, so as a best effort, I try to get the 2nd character and calculate its codepoint and get the modulus. (As I said, best effort, maybe a nice effort)
Consider the query:
SELECT
doc_id,
substring(doc_id, 2, 1),
typeof(split(substring(doc_id, 2, 1)))
FROM events LIMIT 100
The 3rd row returns a varchar but the codepoint function expects a varchar(1) and casting it does not work as cast(substring(doc_id, 2, 1) as varchar(1)).
FUNCTION_NOT_FOUND: line 6:5: Unexpected parameters (varchar) for function codepoint. Expected: codepoint(varchar(1))
How can I accomplish this task without modifiying the data source? I'm open to ideas.