3

Suppose I use DuckDB with python, for querying an Apache parquet file test.pq with a table containing two columns f1 and f2.

r1 = duckdb.query("""
SELECT f1 FROM parquet_scan('test.pq') WHERE f2 > 1
""")

Now I would like to use r1 result in another query, like:

duckdb.query("""SELECT * FROM r1 WHERE f1 > 10""")

However, last instruction gives: RuntimeError: Catalog Error: Table with name r1 does not exist!

Am I missing a DuckDB method equivalent to Apache Spark registerTempTable() ?

1 Answer 1

4
r1 = duckdb.query("""
SELECT f1 FROM parquet_scan('test.pq') WHERE f2 > 1
""") 

Does not create a table called r1, but actually creates a relation that is nothing else than an execution plan. Hence if you call an execute on that, would actually execute a query that scans the parquet file

result = r1.execute()

If you want to query it as a table, you basically have two options.

  1. You create a view from your relation
r1.create_view('table_name')
  1. You change your SQL query to create a duckdb table
conn = duckdb.connect()
conn.execute("create table t as SELECT f1 FROM parquet_scan('test.pq') where f2 > 1 ")

Note that in 1 you will actually load the parquet data to a Duck table, while with 2 you will be constantly reading the parquet data itself.

Finally, if you just want to stack up filters, then you could do:

r2 = r1.filter("f1>10")

There is more info on the Python Relational Api on Duckdb's website, more specifically at: https://duckdb.org/docs/api/python

https://github.com/duckdb/duckdb/blob/master/examples/python/duckdb-python.py

Hopefully, that was helpful! ;-)

Sign up to request clarification or add additional context in comments.

1 Comment

Isn't it exactly the other way around, creating a view as in 1 will read the parquet again and again for each query, and 2 will create and in-memory duckdb table and query that? Otherwise, this would completely counter to the intuition I have for a "view".

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.