Java Efficient Conversion of LocalDateTime to byte[] [closed]

Question

Closed. This question is opinion-based. It is not currently accepting answers.

I need to read a database table (sorted rows) with 300+ million TIMESTAMP values as LocalDateTime in Java and compute a single hash over all of them. Then I need to compute the same hash from the migrated database (different brand and all) and compare the two.

I think I can use LocalDateTime.toString() to get a String, then get their bytes and use these to update the hash.

However, that's 300 million values... twice. I'll run this during the database migration, so it needs to be fast.

What's a good efficient way of getting a byte representation of a LocalDateTime?

By efficient I mean that I can compare both databases in a short time frame, to avoid delaying the whole migration.

OK. Maybe something like MessageDigest m = MessageDigest.getInstance("SHA-256"); ByteBuffer bb = ByteBuffer.allocate(8); bb.putLong(LocalDateTime.now().toEpochSecond(ZoneOffset.UTC)); bb.flip(); m.update(bb); bb.clear(); // repeat n million times
– g00se, Commented Jul 29, 2024 at 17:43
A lot depends on the DB engine. You should really update this question to include the exact DB engine and version you use; you should always do that when asking about optimization, as most abstractions (such as SQL) no longer hold at that point. Until the exact DB engine is known, a directly workable answer of the form 'do THIS, because it will be faster than all alternatives' is not possible to provide.
– rzwitserloot, Commented Jul 29, 2024 at 17:49
@user85421 I'm not sure how solid that would be, but I'm guessing probably not solid enough
– g00se, Commented Jul 29, 2024 at 18:06

rzwitserloot · Accepted Answer · 2024-07-29 18:04:36Z

3

A lot depends on your DB engine. That's generally the case when asking 'I need to do operation X a few million times, twice a day, with a DB engine'. SQL is a standard for a syntax, not for a performance profile.

For the rest of this answer I shall assume PostgreSQL.

PostgreSQL adheres to the SQL standard on this and treats the type timestamp as short for timestamp without time zone, which is indeed the type that the Java class java.time.LocalDateTime matches best.

However, a ton of conversion has to happen even if you just invoke rs.getObject(1, java.time.LocalDateTime.class) on your JDBC ResultSet. Yes, the JDBC 4.2 spec guarantees this works, and, yes, the conversion is lossless. However, Java's LDT type has a boatload of fields (one for year, one for month, and so on), whereas PostgreSQL packs the value into an 8-byte sequence. Hence, if you so much as ask JDBC to give you a LocalDateTime object, you've already pretty much lost the game then and there: that's a boatload of work that isn't required. In fact, it's actively painful if the goal is to produce a hash.

So, don't, if you can. Let the DB do the work:

SELECT EXTRACT(epoch FROM TIMESTAMP '1999-01-08 04:05:06')

You can then read that value with rs.getLong(1) in JDBC.

This gets you 915768306, which is the number of seconds that have passed since the epoch (midnight, Jan 1st, 1970), with the timestamp interpreted as UTC. If the milliseconds matter to you, you'd have to select 1000 * EXTRACT instead:

try (var stmt = con.createStatement()) {
  try (var rs = stmt.executeQuery("SELECT 1000 * EXTRACT(epoch FROM TIMESTAMP '1999-01-08 04:05:06')")) {
    rs.next();
    long a = rs.getLong(1);
    long b = LocalDateTime.of(1999, 1, 8, 4, 5, 6).toInstant(ZoneOffset.UTC).toEpochMilli();
    assertEquals(a, b); // this will hold.
  }
}

Is that faster? Probably. Converting an LDT to a string and hashing that certainly makes it worse. Just call .hashCode() on your LDT if you must.
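To make the hashing side concrete, here is a minimal sketch of feeding fixed-width binary values into a single running SHA-256 digest instead of going through toString(). The class name TimestampHasher and its methods are hypothetical, invented for this illustration. It covers both routes: epoch milliseconds read with rs.getLong(1), and, if you are stuck with LocalDateTime objects, 12 bytes per value (epoch seconds plus nanoseconds). It assumes the column contains no NULLs and that both databases deliver the rows in the same order.

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.time.LocalDateTime;
import java.time.ZoneOffset;

// Hypothetical helper: accumulates one hash over many timestamps.
final class TimestampHasher {
    private final MessageDigest digest;
    // 8 bytes for a long + 4 bytes for nanos; allocated once and reused per row.
    private final ByteBuffer buf = ByteBuffer.allocate(Long.BYTES + Integer.BYTES);

    TimestampHasher() {
        try {
            digest = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform implementation is required to support SHA-256.
            throw new IllegalStateException(e);
        }
    }

    // Route 1: the DB already produced epoch milliseconds (e.g. via rs.getLong(1)).
    void addEpochMillis(long epochMillis) {
        buf.clear();
        buf.putLong(epochMillis);
        buf.flip(); // switch from writing to reading, or update() sees 0 bytes
        digest.update(buf);
    }

    // Route 2: a LocalDateTime, packed as epoch seconds (read as UTC) + nanos.
    void add(LocalDateTime ldt) {
        buf.clear();
        buf.putLong(ldt.toEpochSecond(ZoneOffset.UTC));
        buf.putInt(ldt.getNano());
        buf.flip();
        digest.update(buf);
    }

    // Completes the hash; the underlying digest is reset afterwards.
    byte[] finish() {
        return digest.digest();
    }
}
```

You would call addEpochMillis (or add) once per row while iterating the ResultSet, then compare the two finish() results with Arrays.equals. Whichever route you pick, use the same one, with the same precision, against both databases; the two routes produce different bytes and therefore different hashes.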

4 Comments

Joe DiNottra Over a year ago

That's not a bad idea, actually. I would need to do the same thing in the source database (DB2), making sure both sides implement the same logic so the results are comparable.

g00se Over a year ago

That's OK if you can be sure that both RDBMSs are going to respond similarly to such a function call. I would guess that "give me a LocalDateTime" is going to be a lot safer when asked of both.

David Conrad Over a year ago

If you really want to let the DB do the work remove Java and JDBC from the picture and generate the hash on the DB side.

rzwitserloot Over a year ago

What would be vastly superior is if you can ask postgres to just give you those 8 bytes verbatim as a long, possibly endian-adjusted but no other conversion applied to it. I scoured the manuals and couldn't find it. But, hey, that's specifically for psql, maybe for other DB engines that is possible. And maybe there's some obscure way, possibly with stored procedures, to get that number.
