Our use case is running SparkMagic wrapper kernels in a PaperMill job.
Most functions work as expected, except the %%sql magic, which gets stuck during execution. SparkMagic works properly in interactive mode in JupyterLab; the issue only occurs for the %%sql magic when running with PaperMill.
From the debugging log (attached), I can see that the %%sql logic was executed and the response was retrieved. The execution state returned to idle at the end. However, the output of the %%sql cell was not updated, and the following cells were not executed.
The following content was printed by PaperMill, which shows that %%sql was executed properly. This content was not rendered into the cell output.
Reproducing steps
conda create --name py310 python=3.10
conda activate py310
pip install sparkmagic
pip install papermill
# install kernelspecs
SITE_PACKAGES_LOC=$(pip show sparkmagic | grep Location | awk '{print $2}')
cd $SITE_PACKAGES_LOC
jupyter-kernelspec install sparkmagic/kernels/sparkkernel --user
jupyter-kernelspec install sparkmagic/kernels/pysparkkernel --user
jupyter-kernelspec install sparkmagic/kernels/sparkrkernel --user
jupyter nbextension enable --py --sys-prefix widgetsnbextension
# Downgrade notebook from 7.0.3 to 6.5.1 due to ModuleNotFoundError: No module named 'notebook.utils'
pip install notebook==6.5.1
# Run the papermill job (the notebook is also uploaded).
# Before running this, an EMR cluster and the sparkmagic configuration are required.
# If creating one is not possible or easy, please comment with any testing or verification you need and I can help. You can also check the uploaded papermill debugging log.
papermill pm_sparkmagic_test.ipynb output1.ipynb --kernel pysparkkernel --log-level DEBUG
The following is the list of packages that are most likely relevant. I have also attached a text file containing all the packages.
log and other files.zip contains:
my_test_env_requirements.txt - full list of packages in the conda env
pm_sparkmagic_test.ipynb - the notebook executed in JupyterLab, which is also the input of the papermill job
output1.ipynb - output notebook from the papermill job
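For anyone triaging this without access to an EMR cluster, here is a minimal sketch (not part of the original run) for checking which code cells in output1.ipynb were actually executed and which received output. It assumes the standard nbformat v4 JSON layout, and the function name `summarize_cells` is hypothetical. It relies on the assumption that a cell that never completed keeps an `execution_count` of null in the saved notebook.

```python
import json
import sys
from pathlib import Path


def summarize_cells(notebook_path):
    """Return one summary dict per code cell of an nbformat v4 notebook."""
    nb = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    summaries = []
    for index, cell in enumerate(nb.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # nbformat stores source either as one string or as a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        stripped = source.strip()
        summaries.append({
            "index": index,
            "first_line": stripped.splitlines()[0] if stripped else "",
            # Assumption: a cell that never finished has execution_count null.
            "executed": cell.get("execution_count") is not None,
            "n_outputs": len(cell.get("outputs", [])),
        })
    return summaries


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python summarize_cells.py output1.ipynb
    for s in summarize_cells(sys.argv[1]):
        print(f"cell {s['index']}: executed={s['executed']} "
              f"outputs={s['n_outputs']} | {s['first_line']}")
```

Running it against both output1.ipynb and the notebook saved from JupyterLab should make it easy to see whether the %%sql cell is the first one without an execution count or outputs.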
I am not sure whether this is an issue in papermill or in sparkmagic. I will also copy this issue to SparkMagic in case an expert there can provide advice. If it is confirmed to be caused by SparkMagic, please feel free to close this issue.
Thanks in advance.