1

I have local data with id's from an external database. I want to then select data from that external database using those id so I do something like:

SELECT * FROM table WHERE id IN (:listofids)

However I've noticed (I'm not sure if it's a DB by DB setting or the type of DB or what) there can be a max number of values allowed in that IN statement. When possible of course I do WHERE id IN (SELECT id FROM ...) but sometimes with external data it's not possible.

My questions:

  1. What are my options here besides what I am doing in the above?
  2. Are the limitations of max items set at the DB level, the type of DB, what? Right now I am specifically concerned with an external RedShift DB but I looked at their docs and they don't mention a limit. Our local DB has a limit of 2100 but another one I use is 9999. Also, once the list gets large it can get slow, so i'm looking for a performance boost as well.
  3. Should I do in () OR in () OR in(). That doesn't seem too slick. Are all of these options viable?
  4. I've read a bit about possibly using temp tables to do this but without any examples. How would I load data into a temp table to then join into the table I'm wanting to get the data from?

2 Answers 2

1

A pretty simple solution, that avoids temporary tables, is to do something like this:

SELECT 
    t.* 
FROM 
    [table] t
    INNER JOIN (
        SELECT 'id1' AS id
        UNION ALL
        SELECT 'id2'
        UNION ALL
        SELECT 'id3') i ON i.Id = t.Id;

it's obviously a little bit more work to create a sub-query using lots of UNIONs, but this effectively creates a temporary table that you can use without having to worry about materialising it.

Sign up to request clarification or add additional context in comments.

Comments

0

What are my options here besides what I am doing in the above?

They are limited - you could chain a LONG sequence of OR id = X calls but you may have SIGNIFICANT performance problems compared to IN.

The best option performance-wise is to join to a table (parameter, temporary, CTE, inline UNION, or static) on the server side. However that's not always an option depending on how you're executing queries and what permissions you have.

Of course, you can also execute a separate query for each value, which would be easier to code but might have performance problems as well.

Are the limitations of max items set at the DB level, the type of DB, what?

That is platform-specific so there's not a universal answer

Should I do in () OR in () OR in(). That doesn't seem too slick. Are all of these options viable?

That is one option to add to the others above - you'd have to try it to see what the performance impact is.

I've read a bit about possibly using temp tables to do this but without any examples. How would I load data into a temp table to then join into the table I'm wanting to get the data from?

That depends on the platform, your permissions, your API available, etc.

2 Comments

Thank you for your comments. So when you say the max items is platform-specific, do you mean MSSQL will always be 2100 max, oracle will always be 9999, etc or something else?
possibly - or the max could depend on the overall size (in characters) of the query, or something else altogether.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.