feat: support automatic per-cell execution history filtering and isolated callbacks#17144
feat: support automatic per-cell execution history filtering and isolated callbacks#17144shuoweil wants to merge 25 commits into
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces a callback parameter to the _read_gbq_colab function, allowing users to receive query execution events. The changes include updates to the public API, the session implementation, and the addition of a unit test to verify the callback functionality. A review comment suggests refactoring the _read_gbq_colab method to eliminate code duplication in the query execution logic by using a local helper function.
96d6693 to
d3bf48c
Compare
22ff695 to
b9c9336
Compare
ec7a56a to
5f9d824
Compare
| class EventEnvelope: | ||
| event: Event | ||
| progress_bar: ProgressBarType = _DEFAULT | ||
| cell_execution_count: Optional[int] = None |
There was a problem hiding this comment.
Can't figure out what does it mean by just reading the params. Add doc strings to explain this and other parameters.
There was a problem hiding this comment.
Added a clear, comprehensive docstring to the EventEnvelope class in packages/bigframes/bigframes/core/events.py. It now explicitly documents what the wrapper does and describes each parameter, including how cell_execution_count is captured and used to filter and scope execution history on a per-cell basis.
| job_ids: Optional[Iterable[str]] = None, | ||
| all_cells: bool = True, | ||
| ) -> "bigframes.session._ExecutionHistory": | ||
| import pandas # noqa: F401 |
There was a problem hiding this comment.
Removed the unused and redundant import pandas # noqa: F401 statement in packages/bigframes/bigframes/core/global_session.py under the execution_history function.
| def _read_gbq_colab( # type: ignore[overload-overlap] | ||
| query_or_table: str, | ||
| *, | ||
| callback: Optional[Callable[[bigframes.core.events.EventEnvelope], None]] = ..., |
There was a problem hiding this comment.
why default is ... ? Doesn't comply with type hint.
There was a problem hiding this comment.
Standardized the overloads of _read_gbq_colab in packages/bigframes/bigframes/pandas/io/api.py. The default values in the overloads are now None and False to explicitly match their type annotations (Optional[...] and Literal[False]) and prevent any potential type checker warnings.
|
don't include internal http links in PRs. |
|
|
||
| ipy = IPython.get_ipython() | ||
| if ipy is not None and hasattr(ipy, "execution_count"): | ||
| return ipy.execution_count |
There was a problem hiding this comment.
Don't rely on monkey patching, may be consider some existing settings or properties.
| progress_bar: | ||
| Specifies the style of progress bar to display during execution. | ||
| cell_execution_count: | ||
| The 1-indexed execution count of the notebook cell that triggered the event. |
There was a problem hiding this comment.
Is it job count? Please be explicit.
This change introduces scoped query tracking and event callback management for BigQuery DataFrames within interactive notebook environments (Jupyter/Colab).
Key Changes
Jupyter Cell Scoping: Resolves and carries the active IPython cell execution count (cell_execution_count) through TableWidget, ExecutionSpec, query executors, and final JobMetadata.
Execution History Filtering: Adds events, job_ids, and all_cells parameters to session.execution_history(). When all_cells=False, it filters query logs down to the current active notebook cell.
Scoped Callback Support: Adds a callback parameter to _read_gbq_colab that automatically subscribes to the query progress publisher during execution and automatically unsubscribes upon completion.
Robustness Fixes:
Verified at:
go/scrcast/NjQzOTAzMTUwMzA2MDk5MnwzZWQ2MTMzYS0xYg
Colab notebook test: screen/7d6Yt3C28BUAKEH
Fixes #<513337964> 🦕