[Bug]: The submission_environment_dependencies.txt file does not get staged when running with Flink runner on Dataproc #32743
Comments
Do we crash? It seems like we should just print something: beam/sdks/python/container/boot.go, line 416 in 2049e6b
Flink on Dataproc returns this: |
OK, then the problem is in the materialization of staged artifacts: a file is added to the manifest but is not available when we try to materialize it. It should either not be staged (and not included in the manifest) or be available in the staging location.
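The invariant described above (every manifest entry must exist in the staging location) can be checked before materialization. The sketch below is not Beam code: `find_missing_artifacts`, the manifest modeled as a plain list of file names, and the flat staging directory layout are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Iterable, List


def find_missing_artifacts(manifest: Iterable[str], staging_dir: str) -> List[str]:
    """Return manifest entries that have no regular file in staging_dir.

    Hypothetical helper: the manifest is modeled as file names relative
    to staging_dir, which is an assumption and not Beam's actual format.
    """
    root = Path(staging_dir)
    # Keep the manifest order so the report is easy to compare by eye.
    return [name for name in manifest if not (root / name).is_file()]
```

A non-empty result would point at exactly the kind of mismatch discussed in this thread: an artifact listed in the manifest that cannot be materialized.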
Workaround: supply --experiments=disable_logging_submission_environment
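A minimal sketch of applying this workaround when the pipeline arguments are assembled in Python. Only the flag itself comes from this thread; the helper name `with_workaround` and the example base argument are assumptions for illustration.

```python
from typing import List, Sequence

# Flag quoted in the workaround above.
WORKAROUND_FLAG = "--experiments=disable_logging_submission_environment"


def with_workaround(argv: Sequence[str]) -> List[str]:
    """Return a copy of argv that contains the workaround flag exactly once."""
    args = list(argv)
    if WORKAROUND_FLAG not in args:
        args.append(WORKAROUND_FLAG)
    return args


if __name__ == "__main__":
    # Hypothetical base arguments; replace them with your own pipeline flags.
    print(with_workaround(["--runner=FlinkRunner"]))
```

The resulting list can then be handed to the pipeline as its options, for example via `apache_beam.options.pipeline_options.PipelineOptions`.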
Thanks, @tvalentyn. Let me investigate this later when I have time.
I also ran into this bug when trying to run Beam 2.59.0 using PortableRunner on Kubernetes with Apache Flink Operator. When looking into task manager pods then the |
You can disable this by using |
I solved the issue for myself, though I am not sure how relevant it is to the issue at hand. In my case, when using PortableRunner with Flink via the Apache Flink Operator, the staging volume was not shared between the job manager and the task managers/workers. For some reason this causes empty files for [...]. My issue was solved once I was able to create a working staging volume shared across pods.
Fun side note that might be helpful for someone: when you create a host-mounted PersistentVolume with the ReadWriteMany access mode on Google's GKE and use it as a volume, it never actually tells you that you can't do that; it simply mounts random (different) volumes across the pods. The docs mention that it is not supposed to be supported :D. I went with the FUSE CSI driver, which solved the issue on GKE.
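The empty-file symptom described above can be checked from inside a pod. This is a minimal sketch, assuming the staging volume is mounted at some local path; `find_empty_files` is a hypothetical helper, not a Beam or Flink API.

```python
from pathlib import Path
from typing import List


def find_empty_files(staging_dir: str) -> List[str]:
    """Return sorted relative paths of zero-byte regular files under staging_dir."""
    root = Path(staging_dir)
    return sorted(
        str(p.relative_to(root))
        for p in root.rglob("*")
        if p.is_file() and p.stat().st_size == 0
    )
```

Running it in both the job manager and a task manager pod and comparing the results is one way to notice that the pods do not actually see the same volume.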
Some notes: Line 114 in 53080f1
--semi_persist_dir -> https://github.com/apache/beam/blob/master/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/environment/DockerCommand.java#L111
Another workaround for my case: https://github.com/liferoad/beam-ml-flink/blob/main/Makefile#L159
I was just trying to understand this, but I am really naive here: is this file not just transmitted over the artifact API? We should really never rely on shared storage, sharing with Docker containers, etc.
What is "semi persist dir" anyhow?
What happened?
In some cases, "submission_environment_dependencies.txt" might not be staged.
#32752 added a workaround that ignores the error for a missing artifact, but we should root-cause why it didn't get staged.
Issue Priority
Priority: 2 (default / most bugs should be filed as P2)
Issue Components