Skip to content

Conversation

@hbcarlos
Copy link
Member

@hbcarlos hbcarlos commented Sep 28, 2022

Even though the notebook format doesn't specify that a notebook should have at least one cell, JupyterLab imposes this restriction for a better UX by adding a new empty cell every time the Notebook is blank.

This restriction conflicts with RTC because it is done in the front-end. When a client detects the Notebook is empty, it adds a new cell. We could end up in a situation where we add multiple empty cells because, after a removal operation, multiple clients detect that the Notebook is empty. To prevent this situation, at the moment, we check if the Notebook is empty after each operation which could lead to an empty notebook and creating an empty cell in the same transaction. For example, the action deleteCells:

sharedModel.transact(() => {
// Delete cells in reverse order to maintain the correct indices.
toDelete.reverse().forEach(index => {
sharedModel.deleteCell(index);
});
// Add a new cell if the notebook is empty. This is done
// within the compound operation to make the deletion of
// a notebook's last cell undoable.
if (!sharedModel.cells.length) {
sharedModel.insertCell(
sharedModel.cells.length,
sharedModels.createCell({
cell_type: notebook.notebookConfig.defaultCell
})
);
}
});

Another example is the YNotebook, where when instantiating the Notebook, we need to create the initial cell as a Yjs update message to prevent multiple clients create the same initial cell.
const yCodeCellTemplate =
'AAAWo+3Yxg2IqMqfBePt2MYNBMioyp8FBQIBCgEAD0cAJwAoAycABwAnACgDJ3Fjc291cmNlbWV0YWRhdGFjZWxsX3R5cGVpZGV4ZWN1dGlvbl9jb3VudG91dHB1dHNjZWxsc3NvdXJjZW1ldGFkYXRhY2VsbF90eXBlaWRleGVjdXRpb25fY291bnRvdXRwdXRzBggJAg8HBQYICQIPBwUABQEAAAYBAgABAgACQQYCBwB2AHcEY29kZXckNTZhODQwMTItZWRlNS00MmVjLTg1OTAtMTQyMWVlNjE3NDk3fQAHAHYAdwRjb2RldyRlNTEzZTc5Ni0zODZiLTQ4N2YtYTRhYy00OTI2ZDU3MmRiYjV9AAGIlOXPAgEABg==';
const yMdCellTemplate =
'AAAF+8KuWAMCAQIABQcAJwAoJB5jZWxsc3NvdXJjZW1ldGFkYXRhY2VsbF90eXBlaWQFBggJAgMBAAACAQICQQEBBQB2AHcIbWFya2Rvd253JDE2MWUyNDFjLTI3ZWEtNDQ3NC1hMTljLWI5NzZkZDJhN2YxZQA=';
const yRawCellTemplate =
'AAAGyp/SnR8DAgECAAUHACcAKCQeY2VsbHNzb3VyY2VtZXRhZGF0YWNlbGxfdHlwZWlkBQYICQIDAQAAAgECAkEBAQUAdgB3A3Jhd3ckN2M4MGMyMWYtNjIxZi00MWFlLTg5YTAtZGE2YzAzOGE2NjQzAA==';

let template: string | null = null;
switch (options.initialCellType ?? 'code') {
case 'raw':
template = yRawCellTemplate;
break;
case 'markdown':
template = yMdCellTemplate;
break;
case 'code':
template = yCodeCellTemplate;
break;
}
if (template) {
Y.applyUpdateV2(this.ydoc, buffer.fromBase64(template));
}

By forcing the Notebook to have at least one cell, we are spreading this logic to multiple places, NotebookActions, StaticNotebook, NotebookModel, and YNotebook, plus adding the complexity of dealing with Yjs messages like in the case of the YNotebook.

References

Code changes

1- To make sure the Notebook is not empty when we open it in non-collaborative mode, we add an empty cell during the initialization of the Notebook here:

if (!collab && !cells.length) {
newValue.sharedModel.insertCell(
0,
sharedModels.createCell({
cell_type: this.notebookConfig.defaultCell
})
);
}

because at this point, the model is not initialized yet (it is empty) and here:
if (!ycells.length) {
// Create cell when notebook is empty
// (non collaborative)
ycells.push(sharedModels.createCell({ cell_type: 'code' }));
}

in case we are initializing the model without cells.

2- In collaborative mode, we add the cell from the backend while loading the document here:
https://github.com/jupyter-server/jupyter_ydoc/blob/7d67a75a386521445794e3958747c07cb0e79ebb/jupyter_ydoc/ydoc.py#L169-L178

3- Once the Notebook is initialized, to improve re-adding the cell, in both cases (collaborative mode or not), we re-add the cell in the same transaction we are removing all the cells, here:
https://github.com/hbcarlos/jupyterlab/blob/edef2f45afd83bc057eb2af739791db600fae8b3/packages/notebook/src/actions.tsx#L2451-L2462
this way, re-adding the cell is an atomic operation with the deletion, and we mitigate propagating an update where the Notebook is empty. This will prevent the user from removing all the cells, but extensions will be able to remove them by interacting directly with the shared model.

The reason to re-add the cell in the notebook action "deleteCells" instead of the shared model is that re-adding the cell in the shared model is an unexpected behavior since extension developers might use the delete operation in conjunction with other operations like inserting a cell. For example, the notebook action "changeCellType" doesn't modify the cell, it removes the cell and creates a new one of the desired type (markdown, raw, or code). By re-adding the cell in the delete operation of the shared model, the developer needs to be aware that deleting the last cell will create a new one. Otherwise, they will end with two cells. This is weird behavior for a list that breaks multiple notebook actions, "changeCellType", "mergeCells", "splitCells", etc., and I'm sure it will break some extensions.

4- There is one corner case where the Notebook could end in an empty state. If we have N clients and N cells, and each client removes one cell simultaneously, we can have a blank notebook once all the updates are synced. We can solve this corner case by adding a cell from the backend (here jupyter-server/jupyter_ydoc#62).

5- In addition, in case the notebook end empty, we are adding a message to the Notebook here:

if (!this.model!.sharedModel.cells.length) {
this.addHeader();
}

In my opinion, we should not implement 4 because modifying a document automatically by reacting to a change in the document is dangerous and could create future issues. At the same time, if we don't implement 4, we should not implement 3. Otherwise, if the situation described in 4 occurs, it will look like a bug. Instead, we should allow the user to remove all the cells and add a button to the Notebook so the user knows the Notebook is empty and can easily add a new cell (this is the approach used by collab and vscode).

User-facing changes

Backwards-incompatible changes

@jupyterlab-probot
Copy link

Thanks for making a pull request to jupyterlab!
To try out this branch on binder, follow this link: Binder

@davidbrochart
Copy link
Contributor

We also create an empty cell if there is no cell in jupyter_ydoc. Maybe the solution is to only keep the logic there and remove it from the front-end in collaborative mode? There would be no need for the button.

@hbcarlos
Copy link
Member Author

hbcarlos commented Sep 28, 2022

Yes, but we add it only when opening/creating a new notebook. We don't check if the client removed all the cells. Besides that, now that we have the ystore, we only add the cell from the backend when loading the document from the disk and not when the client opens a document that was already on the ystore.
In addition, we can not keep distinguishing between collaborative mode or not, the code is getting more complex and challenging to maintain.

@krassowski
Copy link
Member

JupyterLab imposes this restriction for a better UX by adding a new empty cell every time the notebook is empty.

Can we just change the implementation of the restriction in JupyterLab? What if instead we disallow deleting the last cell and if user asks us to delete it we would just clean it rather than re-insert an empty cell?

I slightly prefer keeping the old behaviour over adding a button, as in my view it will add a new user interaction that we need to maintain with no clear user benefit (mind you some users are already annoyed by changes which subtly broke their workflows due to the way RTC was introduced, and only a subset of users will take advantage of RTC).

@davidbrochart
Copy link
Contributor

I also think that we should not make the notebook one more click away from entering some code. It would be like in a text editor, you needed to click a button to add a first line.

@davidbrochart
Copy link
Contributor

Note that Observable notebooks appear to not allow empty notebooks either (you cannot delete the cell if it's the only one).

@GabrielaVives
Copy link

Another solution could be to

  • keep the current behaviour of new notebooks: a default code cell is available when you open a new notebook
  • add the empty notebook state (as illustrated in UI design for an empty notebook #13142 ), that is displayed only when users delete all cells (intentionally or not)

However, this would work if most users start by using a code cell first. Which I am not sure of. It can be annoying to have to change the type of your cell every time you create a notebook because the default one does not fit your need.

I think the root question here is: does having a code cell by default meet our users' needs? If yes, which users? Do they represent a majority?

The answers to these questions should drive the decision of the solution to this issue.

@hbcarlos
Copy link
Member Author

What if instead we disallow deleting the last cell and if user asks us to delete it we would just clean it rather than re-insert an empty cell?

It is better than re-inserting an empty cell, but we will still deal with specific corner cases, which we are trying to avoid.
We would need to add a special case on every operation that may end in a blank notebook and make sure that the cleaning of the last cell is done in the same transaction to make sure other clients don't modify a cell that will be cleaned (kind of what we have right now). The only difference would be that every other client won't need to add a cell when detecting an empty notebook so that we won't end on a notebook with multiple blank cells.

I slightly prefer keeping the old behaviour over adding a button, as in my view it will add a new user interaction that we need to maintain with no clear user benefit

I believe we do it for selecting a kernel. When opening/creating a new notebook, Instead of defaulting to the first available kernel, we show a dialog to the user to choose a kernel.

How many notebooks are out there that start with a code cell? Most notebooks I have seen start with a markdown cell with a title or a description of the work done in that notebook. In my case, when I create a new notebook, I first change the cell to a markdown cell and write down a description of what I'm going to do in that notebook.

@hbcarlos
Copy link
Member Author

  • keep the current behaviour of new notebooks: a default code cell is available when you open a new notebook
  • add the empty notebook state (as illustrated in UI design for an empty notebook #13142 ), that is displayed only when users delete all cells (intentionally or not)

That will work.

@davidbrochart
Copy link
Contributor

However, this would work if most users start by using a code cell first.

The default cell type could be configurable in the settings. Observable starts with a Markdown cell, probably because it should be a title for your notebook. But it's true that in Jupyter it has been a code cell for a long time, and people are used to it.

I think the root question here is: does having a code cell by default meet our users' needs? If yes, which users? Do they represent a majority?

If it's justified yes, why not changing this behavior. But I don't recall people complaining about that. And if other successful notebooks like Observable also disable empty notebooks, maybe we should not make that change?

@davidbrochart
Copy link
Contributor

@hbcarlos Clearing a cell is not different from anything we already do in RTC, it's just equivalent to replacing a content with a new one.

@jasongrout
Copy link
Contributor

FYI, there probably are assumptions through the code that assume there is at least on cell. For example, the notebook active cell I think is always assumed to point to some cell.

@jasongrout
Copy link
Contributor

@williamstein mentioned in the dev call:

cocalc solves this empty cell problem by making so ONLY the backend does the "add 1 cell if empty". It works.

@fcollonval
Copy link
Member

What if instead we disallow deleting the last cell and if user asks us to delete it we would just clean it rather than re-insert an empty cell?

This will be hard because there is gonna be edge cases like if a notebook has two cells that get deleted simultaneously by two users (like user A delete cell 1 and user B delete cell 2). You still end up in an empty notebook.

@davidbrochart
Copy link
Contributor

You still end up in an empty notebook.

No because we can make sure that in the back-end we never end up with no cell.

@jasongrout
Copy link
Contributor

If there is a way to preserve the user experience of always having at least one cell in a notebook (e.g., creating a cell automatically so that the user perception is that a cell always exists), I think that will make a huge difference to users.

If I ever am in the situation where all the cells are deleted, I think there is one reasonable path forward, and making me click something to get back to a working state is annoying friction.

@krassowski
Copy link
Member

"New View for Notebook" seems broken with this PR (TypeError: Cannot read properties of null (reading 'open'), just leaving a note to check it manually once it comes out of draft).

@fcollonval
Copy link
Member

"New View for Notebook" seems broken with this PR (TypeError: Cannot read properties of null (reading 'open'), just leaving a note to check it manually once it comes out of draft).

This is not related to this PR: #12965

@ellisonbg
Copy link
Contributor

Reading up on this PR a bit more. I am trying to wrap my head around the different cases. Seems like there are two situations:

  • New notebook. I think the UX for this should be the same as it has always been: that all users see a single, empty code cell that has keyboard focus. This enables them to immediately start typing.
  • Existing notebook that may, through some combination of user actions, has no cells. For this case, I am having a hard time finding anything that is better what William is doing for cocalc (backend has logic that inserts a single cell when the cell count drops to zero).

In both cases, it seems like the backend should own this decision and handle the edge case. The frontend would need to be updated to allow for no cells for a short period or time, but otherwise, things would remain the same from the user's perspective.

@davidbrochart
Copy link
Contributor

Note that @dmonad suggested not inserting a new cell from the backend:

By all means, it should be avoided to rely on a central server that "cleans up" the document. It might make some things easier but will hurt in the long-run and lead to issues that are hard to debug.

Copy link
Member

@fcollonval fcollonval left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @hbcarlos

I left some comments. And the test failures in js-notebook are relevant.

@ellisonbg
Copy link
Contributor

I believe that if we want to preserve the UX of Jupyter, then this amounts to a leader election problem. @dmonad do you have recommendations on how to do leader election using only Yjs data structures? Picking the server as the leader always is one way of solving the problem, but there may be others.

@hbcarlos hbcarlos marked this pull request as draft October 12, 2022 11:06
@hbcarlos hbcarlos marked this pull request as ready for review October 12, 2022 17:54
@hbcarlos hbcarlos marked this pull request as draft October 12, 2022 21:21
@hbcarlos hbcarlos marked this pull request as ready for review October 13, 2022 10:52
Copy link
Member

@fcollonval fcollonval left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @hbcarlos

Let's move forward with this PR and see what testers think of this curated handling of the last cell deletion.

@fcollonval fcollonval merged commit 011ddb8 into jupyterlab:master Oct 13, 2022
@hbcarlos hbcarlos deleted the rtc branch October 13, 2022 13:21
martinRenou pushed a commit to martinRenou/jupyterlab that referenced this pull request Oct 24, 2022
* Allows empty notebook

* Modify command insertBelow and insertAbove, allow activeCell None in NotebookAdapter and NotebookTrustStatus

* Fixes integrity check and linting

* Add cell when the notebook is empty in non collaborative mode

* Fixes notebook tests

* Improves adding a new cell when the notebook is empty in non collaborative mode

* Adds a message to the notebook, just in case we end with an empty notebook

* Review

* Fixes
martinRenou pushed a commit to martinRenou/jupyterlab that referenced this pull request Oct 25, 2022
* Allows empty notebook

* Modify command insertBelow and insertAbove, allow activeCell None in NotebookAdapter and NotebookTrustStatus

* Fixes integrity check and linting

* Add cell when the notebook is empty in non collaborative mode

* Fixes notebook tests

* Improves adding a new cell when the notebook is empty in non collaborative mode

* Adds a message to the notebook, just in case we end with an empty notebook

* Review

* Fixes
fcollonval pushed a commit that referenced this pull request Oct 26, 2022
* Allow empty notebook (#13141)

* Allows empty notebook

* Modify command insertBelow and insertAbove, allow activeCell None in NotebookAdapter and NotebookTrustStatus

* Fixes integrity check and linting

* Add cell when the notebook is empty in non collaborative mode

* Fixes notebook tests

* Improves adding a new cell when the notebook is empty in non collaborative mode

* Adds a message to the notebook, just in case we end with an empty notebook

* Review

* Fixes

* Linter

* Remove unused import

Co-authored-by: Carlos Herrero <contact@carloshb.com>
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 14, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants