This repository has been archived by the owner on Dec 9, 2022. It is now read-only.

build_all_targets: limit concurrent jobs #11

Open

Akira25 wants to merge 1 commit into master

Conversation

@Akira25 (Member) commented Aug 5, 2021

This commit adds some code that limits the number of Docker containers that run in parallel. This will hopefully fix the nondeterministically missing files that we have seen in the package feed.

Signed-off-by: Martin Hübner [email protected]

@PolynomialDivision (Contributor) commented Aug 5, 2021

> This will hopefully fix the nondeterministically missing files that we have seen in the package feed.

Why?

@Akira25 (Member, Author) commented Aug 5, 2021

I think that spawning thousands of containers at the same time causes some strange races on the buildbot workers. @simzon already mentioned that when we had this code pre-merge. I suppose that limiting the number of parallel containers to something reasonable will avoid the issue of packages missing from the finished builds.

Even without that, it's a good idea to limit the number of parallel containers. An unknowing user might start the script in parallel mode without realizing how much processing power it will take. When I tested the parallel option on my machine, it became unresponsive due to the excessive workload.
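(For illustration only: a cap like that can be sketched with plain bash job control. MAX_JOBS, the loop body, and the docker invocation below are assumptions, not the code in this PR.)

```sh
# Hypothetical sketch of capping concurrent containers with bash job
# control; MAX_JOBS and the docker command are assumptions.
MAX_JOBS=4

for key in "${!archs[@]}"; do
    # Block while MAX_JOBS build jobs are already running.
    while [ "$(jobs -rp | wc -l)" -ge "$MAX_JOBS" ]; do
        wait -n    # wait for any one background job (bash >= 4.3)
    done
    docker run --rm "openwrt/sdk:${archs[$key]}" &
done
wait    # let the remaining jobs finish
```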

@PolynomialDivision (Contributor)

> Even without that, it's a good idea to limit the number of parallel containers. An unknowing user might start the script in parallel mode without realizing how much processing power it will take. When I tested the parallel option on my machine, it became unresponsive due to the excessive workload.

That is a very good reason.

> I suppose that limiting the number of parallel containers to something reasonable will avoid the issue of packages missing from the finished builds.

I would argue that a race condition can still happen, since we have concurrent build processes. But I don't have any better solution for this problem. Maybe our architecture list is also wrong?

@Akira25 (Member, Author) commented Aug 5, 2021

> I would argue that a race condition can still happen, since we have concurrent build processes. But I don't have any better solution for this problem. Maybe our architecture list is also wrong?

That could apply too; I'm not sure about it, though. Maybe this will fix that problem as well, maybe not. We should probably keep observing that topic anyway.

@PolynomialDivision (Contributor)

> I would argue that a race condition can still happen, since we have concurrent build processes. But I don't have any better solution for this problem. Maybe our architecture list is also wrong?

> That could apply too; I'm not sure about it, though. Maybe this will fix that problem as well, maybe not. We should probably keep observing that topic anyway.

But I don't understand why we have a race condition at all. :/ Maybe we have some architecture doubled? I will look into the build script again.
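(One quick way to test the "doubled architecture" suspicion; a hypothetical one-liner, not something from this PR.)

```sh
# Keys of an associative array are unique by construction, but two
# keys could map to the same SDK value and then clobber each other's
# build output. Print any SDK value that occurs more than once:
printf '%s\n' "${archs[@]}" | sort | uniq -d
```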

@@ -53,9 +56,13 @@ archs+=( ["x86_64"]=x86-64 )
 for key in ${!archs[@]}; do
     SDK="${archs[${key}]}"
     [ "$OPENWRT_VERSION" != "snapshot" ] && SDK+="-$OPENWRT_VERSION"
-    if [ -n "$BUILD_PARALLEL" ]; then
+    if [ -n "$BUILD_PARALLEL" ] ; then
+        CUR_JOBS=$(( $CUR_JOBS + 1 ))
Contributor:

CUR_JOBS is only increased? When is CUR_JOBS decreased again?

Member (Author):

Right, I had totally forgotten this. I added that in the new version: every process decreases the counter in the temp file whenever it finishes.
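(A minimal sketch of the decrement described above, assuming the counter lives in a file named by COUNT_FILE; the names and the trap placement are assumptions, not the revised PR code.)

```sh
# Hypothetical: decrement the shared job counter when this build
# process exits. Note that this read-modify-write is still
# unsynchronized between concurrent processes.
decrement_jobs() {
    CUR_JOBS=$(cat "$COUNT_FILE")
    echo $(( CUR_JOBS - 1 )) > "$COUNT_FILE"
}
trap decrement_jobs EXIT
```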

@pmelange (Contributor) commented Oct 3, 2021

I'm not sure that this is the correct way to go. Since there will be multiple simultaneous executions of build_feed accessing the file /tmp/count.tmp, there could be race conditions. One option would be to lock the file before each use and unlock it afterwards. Without a lock, NACK.
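(For illustration, a sketch of that locking with flock(1); the lock-file path and the surrounding shell are assumptions, not a patch to this PR.)

```sh
# Hypothetical flock(1)-protected update of the shared counter. The
# subshell holds an exclusive lock on fd 9 for the whole
# read-modify-write, so concurrent build_feed runs cannot interleave.
COUNT_FILE=/tmp/count.tmp
(
    flock -x 9
    CUR=$(cat "$COUNT_FILE")
    echo $(( CUR + 1 )) > "$COUNT_FILE"
) 9>"${COUNT_FILE}.lock"
```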

@pmelange (Contributor) commented Oct 3, 2021

Also, I just noticed that COUNT_FILE is not defined anywhere in build_feed.

Also, calling build_feed automatically will create a senseless temp file.

Also, if two or more build_all_targets runs are active on the same machine, they will be competing for the same temp file (a per-run counter file, as sketched below, would avoid this).
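(One hypothetical way to address that last point: give each build_all_targets run its own counter file via mktemp and export it to the build_feed children. Entirely an assumption, not a proposed patch.)

```sh
# Hypothetical: a per-run counter file avoids collisions between
# independent build_all_targets runs on the same machine.
COUNT_FILE="$(mktemp /tmp/build_all_targets.XXXXXX)"
export COUNT_FILE
echo 0 > "$COUNT_FILE"
trap 'rm -f "$COUNT_FILE" "${COUNT_FILE}.lock"' EXIT
```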
