I'm working on a database importer tool. It reads from a Postgres DB with the JDBC fetch size set; note that I've set it to 1 for testing purposes. This is the snippet I use to read from the underlying Postgres DB:

try (final Connection connection = dataSource.getConnection()) {
    // The PostgreSQL driver only fetches in batches when autocommit is off.
    connection.setAutoCommit(false);
    // try-with-resources closes the statement and result set even if processing throws.
    try (final PreparedStatement statement = connection.prepareStatement(
            "SELECT * FROM \"public\".\"table\" ORDER BY \"cursor\" ASC")) {
        statement.setFetchSize(1);
        try (final ResultSet rs = statement.executeQuery()) {
            while (rs.next()) {
                processResultSet(rs);
            }
        }
    }
}

I wanted to understand what happens while running this tool on a database with live updates.

To do this, I set a breakpoint while processing the result set and concurrently updated values in the source database. My initial theory was that the fetch size is enforced by a combination of LIMIT and OFFSET clauses. I expected my code to pick up the new changes, since setting the fetch size causes the query to be executed in streaming mode. I also verified that each call to rs.next() goes over the network to fetch data from the Postgres DB.

However, this assumption was wrong. The new values are not read by my simple JDBC iterator. Is that because the SQL query executes in a single transaction? If so, how does this work? Does Postgres keep track of stale data until the transaction ends? Does this apply even to large queries, e.g. select * from table? Thanks in advance for the help; I'm trying to get a better understanding of this.

  • Reading dirty or changed data usually depends on the type of cursor you're using. I don't know why you're using fetchSize or what you want to achieve, but I don't think fetchSize is the right tool for it. (Commented Feb 28, 2023 at 23:42)

1 Answer

The JDBC driver speaks the PostgreSQL Frontend/Backend Protocol directly. The message that is sent is Execute:

Execute (F)

     Byte1('E')

          Identifies the message as an Execute command.

     Int32

          Length of message contents in bytes, including self.

     String

          The name of the portal to execute (an empty string selects the unnamed portal).

     Int32

          Maximum number of rows to return, if portal contains a query that returns rows
          (ignored otherwise). Zero denotes “no limit”.

The fetch size is simply sent as the final 4-byte integer, and the server returns only that many result rows. The next Execute on the same portal returns the next batch, and so on.
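
To make the layout above concrete, here is a minimal sketch that encodes an Execute message by hand. It is not the driver's actual code: the class name ExecuteMessage and the method encode are hypothetical, and it only assumes the field layout quoted above (big-endian integers, a NUL-terminated portal name).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of the JDBC driver.
class ExecuteMessage {

    /** Encodes an Execute message for the given portal name and row limit. */
    static byte[] encode(String portalName, int maxRows) {
        byte[] portal = portalName.getBytes(StandardCharsets.UTF_8);
        // The length field counts itself (4 bytes), the portal name plus
        // its NUL terminator, and the row limit (4 bytes), but not the tag byte.
        int length = 4 + portal.length + 1 + 4;
        ByteBuffer buf = ByteBuffer.allocate(1 + length); // big-endian by default
        buf.put((byte) 'E');   // Byte1('E'): message type
        buf.putInt(length);    // Int32: message length, including self
        buf.put(portal);       // String: portal name ...
        buf.put((byte) 0);     // ... terminated by a NUL byte
        buf.putInt(maxRows);   // Int32: maximum rows to return, 0 = no limit
        return buf.array();
    }

    public static void main(String[] args) {
        // Unnamed portal, fetch size 1.
        byte[] msg = encode("", 1);
        StringBuilder hex = new StringBuilder();
        for (byte b : msg) {
            hex.append(String.format("%02x ", b));
        }
        System.out.println(hex.toString().trim());
        // prints: 45 00 00 00 09 00 00 00 00 01
    }
}
```

With setFetchSize(1), the last four bytes carry the value 1, so each Execute asks the server for one more row from the same portal.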

2 Comments

Is the fetch size always honored? I think I've seen cases where the database ignored my setting and used a different [better] value.
@TheImpaler I'd say that would be a bug.
