py: implement SQLContext.wait_for_completion
#1872
Merged
7 commits
e1d8333
py: implement `SQLContext.wait_for_completion`
abhizer 6711340
py: remove run_to_completion
abhizer b240040
py: check pipeline state before starting pipeline
abhizer 10940d8
py: refactor(connect_input_pandas) -> input_pandas
abhizer f89d127
suggested fixes to docs
abhizer 7fb7507
py: Improve doc comments.
3ae2484
clippy
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,17 +1,17 @@ | ||
| Examples | ||
| ======== | ||
|
|
||
| Pandas | ||
| ******* | ||
| Using Pandas DataFrames as Input / Output | ||
| ******************************************* | ||
|
|
||
|
|
||
| Working wth pandas DataFrames in Feldera is fairly straight forward. | ||
| You can use :meth:`.SQLContext.connect_source_pandas` to connect a | ||
| You can use :meth:`.SQLContext.input_pandas` to connect a | ||
| DataFrame to a feldera table as the data source. | ||
|
|
||
| To listen for response from feldera, in the form of DataFrames, it is necessary | ||
| to to call :meth:`.SQLContext.listen` before you call | ||
| :meth:`.SQLContext.start` or :meth:`.SQLContext.run_to_completion`. | ||
| To listen for response from feldera, in the form of DataFrames | ||
| call :meth:`.SQLContext.listen`. | ||
| To ensure all data is received start listening before calling | ||
| :meth:`.SQLContext.start`. | ||
|
|
||
| .. highlight:: python | ||
| .. code-block:: python | ||
|
|
@@ -47,17 +47,20 @@ to to call :meth:`.SQLContext.listen` before you call | |
| query = f"SELECT name, ((science + maths + art) / 3) as average FROM {TBL_NAMES[0]} JOIN {TBL_NAMES[1]} on id = student_id ORDER BY average DESC" | ||
| sql.register_output_view(view_name, query) | ||
|
|
||
| # connect the source (a pandas Dataframe in this case) to the tables | ||
| sql.connect_source_pandas(TBL_NAMES[0], df_students) | ||
| sql.connect_source_pandas(TBL_NAMES[1], df_grades) | ||
|
|
||
| # listen for the output of the view here in the notebook | ||
| # you do not need to call this if you are forwarding the data to a sink | ||
| out = sql.listen(view_name) | ||
|
|
||
| # run this to completion | ||
| # start the pipeline | ||
| sql.start() | ||
|
|
||
| # connect the source (a pandas Dataframe in this case) to the tables | ||
| sql.input_pandas(TBL_NAMES[0], df_students) | ||
| sql.input_pandas(TBL_NAMES[1], df_grades) | ||
|
|
||
| # wait for the pipeline to complete | ||
| # note that if the source is a stream, this will run indefinitely | ||
| sql.run_to_completion() | ||
| sql.wait_for_completion(shutdown=True) | ||
|
|
||
| # finally, convert the output to a pandas Dataframe | ||
| df = out.to_pandas() | ||
|
|
@@ -66,8 +69,8 @@ to to call :meth:`.SQLContext.listen` before you call | |
| print(df) | ||
|
|
||
|
|
||
| Kafka | ||
| ****** | ||
| Using Kafka as Data Source / Sink | ||
| *********************************** | ||
|
|
||
| To setup Kafka as the source use :meth:`.SQLContext.connect_source_kafka` and as the sink use | ||
| :meth:`.SQLContext.connect_sink_kafka`. | ||
|
|
@@ -115,7 +118,7 @@ Here the only notable difference is: | |
| More on Kafka as the output connector at: https://www.feldera.com/docs/connectors/sinks/kafka | ||
|
|
||
| .. warning:: | ||
| Kafka is a streaming data source, therefore running: :meth:`.SQLContext.run_to_completion` will run forever. | ||
| Kafka is a streaming data source, therefore running: :meth:`.SQLContext.wait_for_completion` will block forever. | ||
|
Contributor
Ideally (though not in this PR), an error should be thrown if a streaming data source is defined and the user tries to call
||
|
|
||
| .. highlight:: python | ||
| .. code-block:: python | ||
|
|
@@ -156,8 +159,8 @@ More on Kafka as the output connector at: https://www.feldera.com/docs/connector | |
| df = out.to_pandas() | ||
|
|
||
|
|
||
| HTTP GET | ||
| ********* | ||
| Ingesting data from a URL | ||
| ************************** | ||
|
|
||
|
|
||
| Feldera can ingest data from a user-provided URL into a SQL table. | ||
|
|
@@ -192,7 +195,8 @@ More on the HTTP GET connector at: https://www.feldera.com/docs/connectors/sourc | |
|
|
||
| out = sql.listen(VIEW_NAME) | ||
|
|
||
| sql.run_to_completion() | ||
| sql.start() | ||
| sql.wait_for_completion(shutdown=True) | ||
|
|
||
| df = out.to_pandas() | ||
|
|
||
This could also be enforced via a state check in the `listen()` command?
You can also start listening at some arbitrary point after starting a pipeline.
Or listen to an already running pipeline.
True, that's possible. My question is more whether a Python user with `SQLContext` ever wants specifically that, as there are no guarantees at what point of the stream listening will start. Probably out of scope for this PR.
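For reference, the two guards discussed in this thread (a state check in `listen()` and an error when waiting on a streaming source) could look roughly like the sketch below. Everything in it is hypothetical: `ToyContext`, `PipelineState`, the `has_streaming_source` flag, and the `allow_running` parameter are stand-ins invented for illustration and are not part of the Feldera Python API. It only shows the shape of the checks, not how `SQLContext` is implemented.

```python
from enum import Enum


class PipelineState(Enum):
    # Hypothetical states; the real pipeline may expose more of them.
    SHUTDOWN = "shutdown"
    RUNNING = "running"


class ToyContext:
    """Hypothetical stand-in for SQLContext, used only to illustrate the checks."""

    def __init__(self, has_streaming_source: bool = False):
        self.state = PipelineState.SHUTDOWN
        self.has_streaming_source = has_streaming_source
        self.listeners: list[str] = []

    def listen(self, view_name: str, allow_running: bool = False) -> None:
        # Reject late listeners unless the caller explicitly accepts that
        # output produced before this call will be missed.
        if self.state is PipelineState.RUNNING and not allow_running:
            raise RuntimeError(
                f"listen({view_name!r}) called after start(); "
                "earlier output would be missed"
            )
        self.listeners.append(view_name)

    def start(self) -> None:
        self.state = PipelineState.RUNNING

    def wait_for_completion(self, shutdown: bool = False) -> None:
        # A streaming source never reaches end-of-input, so fail fast
        # instead of blocking forever.
        if self.has_streaming_source:
            raise RuntimeError(
                "wait_for_completion() would block forever on a streaming source"
            )
        if shutdown:
            self.state = PipelineState.SHUTDOWN
```

The `allow_running` opt-in is one way to keep the "listen to an already running pipeline" use case available while making the default safe; whether that trade-off is wanted is the open question above.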