Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add async contents manager support #1328

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

Darshan808
Copy link

Fixes #1316

This PR introduces support for an async contents manager. As suggested by @mwouts in #1316, a separate function, build_jupytext_async_contents_manager, has been implemented in a new file, async_contentsmanager.py.

To ensure full test coverage, the existing tests have been duplicated into test_async_contentsmanager.py. Currently, all tests are passing! ✅

Copy link

Thank you for making this pull request.

Did you know? You can try it on Binder: Binder:lab or Binder:notebook.

Also, the version of Jupytext developed in this PR can be installed with pip:

HATCH_BUILD_HOOKS_ENABLE=true pip install git+https://github.com/Darshan808/jupytext.git@async_support

(this requires nodejs, see more at Developing Jupytext)

@Darshan808
Copy link
Author

@mwouts
Could you please review this? I’d appreciate your feedback before proceeding further.

As we begin merging these two files, should we wrap all asynchronous functions (that might be synchronous in the synchronous contents manager) using ensure_async(), similar to what was done in #1021?

Looking forward to your thoughts!

@mwouts
Copy link
Owner

mwouts commented Feb 20, 2025

Fixes #1316

This PR introduces support for an async contents manager. As suggested by @mwouts in #1316, a separate function, build_jupytext_async_contents_manager, has been implemented in a new file, async_contentsmanager.py.

To ensure full test coverage, the existing tests have been duplicated into test_async_contentsmanager.py. Currently, all tests are passing! ✅

Wow, thank you @Darshan808 , this is truly amazing!! Let me have a look!

Copy link
Owner

@mwouts mwouts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @Darshan808 , I love what I see so far!

As we begin merging these two files, should we wrap all asynchronous functions (that might be synchronous in the synchronous contents manager) using ensure_async(), similar to what was done in #1021?

Do you think it would be possible to do so? It would be great indeed if we had a more local change - easier to review, and just one file to maintain later on. But I must admit that it's precisely what I was not able to do two years ago so no obligation at all!

try:
from jupytext.async_contentsmanager import build_jupytext_async_contents_manager_class
except ImportError as err:
build_jupytext_async_contents_manager = reraise(err)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that's the most prudent approach, thank you! The import problem was years ago though, so maybe it's gone, but it's safer that I follow-up on that later on.

if asyncio.iscoroutinefunction(base_class.get):
app.contents_manager_class = build_jupytext_async_contents_manager_class(base_class)
else:
app.contents_manager_class = build_jupytext_contents_manager_class(base_class)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes exactly what I had in mind! And we can delete the paragraph above.

@mwouts
Copy link
Owner

mwouts commented Feb 20, 2025

Re the failing pre-commit hook on the CI you probably just need pre-commit install - see also https://jupytext.readthedocs.io/en/latest/developing.html#install-and-develop-jupytext-locally

@Darshan808 Darshan808 marked this pull request as ready for review February 21, 2025 16:31
@Darshan808
Copy link
Author

Darshan808 commented Feb 21, 2025

@mwouts
I merged the two files, wrapping methods with ensure_async(), and merged the tests as well. To support both synchronous and asynchronous ContentsManager, I added a pytest fixture that provides both managers to all tests. All tests are passing successfully.
Let me know what you think about this change!

@Darshan808 Darshan808 requested a review from mwouts February 22, 2025 02:23
@Darshan808
Copy link
Author

Locally I checked only test_contentsmanager.py, but it looks like other tests are affected as well. I'll look into it.

@mwouts
Copy link
Owner

mwouts commented Feb 22, 2025

Hi @Darshan808 , thanks for the update, and no worries for the remaining failures, that's a big change you know! Also I hope you're fine with me pushing a few commits here.

One thing that we need to confirm is the following: are we fine with build_jupytext_contents_manager_class always returning an async contents manager?

Async contents manager are a bit more complicated to use in tests, so that could be an incentive to provide a sync version, but I think that it has been a while since async contents manager were introduced so it's probably fine to have just them.

In case we wanted both, do you think we could write a second function e.g. build_sync_contents_manager_class(async_base_class) to convert the async cm to a sync one? Do you see any other way to avoid duplicating the code?

which is derivated from AsyncLargeFileManager
@mwouts
Copy link
Owner

mwouts commented Feb 22, 2025

Locally I checked only test_contentsmanager.py, but it looks like other tests are affected as well. I'll look into it.

Yes - I think that relates to my comment above, now every contents manager class returned by build_jupytext_contents_manager_class is async so we might need to update all the tests that involve a contents manager. Or alternatively figure a way to provide a sync contents manager.

I'd recommend to not worry about the remaining tests for now, but rather chat a bit more about whether we want/can provide a sync contents manager (assuming e.g. the base class is sync).

@mwouts
Copy link
Owner

mwouts commented Feb 22, 2025

Re creating a class with synchronous methods I have tried this: e198928:

def build_jupytext_contents_manager_class(base_contents_manager_class):
    """
    Derives an sync TextFileContentsManager class from the given base class.
    The base class can either be sync or async.
    """

    AsyncJupytextContentsManager = build_async_jupytext_contents_manager_class(
        base_contents_manager_class
    )

    class SyncJupytextContentsManager(AsyncJupytextContentsManager):
        pass

    for method_name in dir(AsyncJupytextContentsManager):
        if method_name.startswith("_"):
            continue
        async_method = getattr(AsyncJupytextContentsManager, method_name)
        if inspect.iscoroutinefunction(async_method):
            async_method_name = "_async_" + method_name
            original_signature = inspect.signature(async_method)
            setattr(SyncJupytextContentsManager, async_method_name, async_method)

            def get_sync_method(async_method_name, original_signature):
                def sync_method(self, *args, **kwargs):
                    loop = asyncio.get_event_loop()
                    return loop.run_until_complete(
                        getattr(self, async_method_name)(*args, **kwargs)
                    )

                sync_method.__signature__ = original_signature
                return sync_method

            setattr(
                SyncJupytextContentsManager,
                method_name,
                get_sync_method(async_method_name, original_signature),
            )

    return SyncJupytextContentsManager

but then I run into RuntimeError: This event loop is already running. Do you think that's a direction worth exploring?

@mwouts
Copy link
Owner

mwouts commented Feb 23, 2025

I've been thinking about this and I am tempted to try the following:

  • keep your current build_jupytext_contents_manager_class and rename it to build_async_jupytext_contents_manager_class, move it to an async_contents_manager.py file. That function will take as input an async contents manager, and return an async text contents manager
  • write a test that copies async_contents_manager.py to contents_manager.py and removes everything relating to async - this way we get a second function build_jupytext_contents_manager_class that takes as input a synchronous contents manager, and returns a synchronous text contents manager. While this approach does duplicate the code, at least we make sure that the duplicated version is up-to-date (the test will check that).

Let me know what you think!

=> I've created that PR: #1329 (did not have the time to run all tests yet)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support AsyncContentsManager?
2 participants