Cache - excluding files or folders with ! not working #713

Open
dhadka opened this issue Feb 8, 2021 · 7 comments · May be fixed by #1854
Labels
bug Something isn't working cache

Comments

@dhadka
Member

dhadka commented Feb 8, 2021

Describe the bug

Excluding a file or folder does not work as expected with caching. For example,

~/cache
!~/cache/subfolder

is matching:

../../../cache

I think this is because caching uses implicitDescendants: false. Since we're not traversing any descendants, the negation pattern does not match anything.

One option is to just set implicitDescendants: true, but this may be inefficient, because we would need to enumerate all the files and generate a large "includes file" for tar. What we essentially need is to scan the descendants but stop descending whenever an entire folder is included. For example, the following files:

~/cache/foo.txt
~/cache/bar.txt
~/cache/src/main.c
~/cache/src/lib/baz.dll
~/cache/bin/main.exe

with the glob patterns of ~/cache and !~/cache/bin should produce the list:

~/cache/foo.txt
~/cache/bar.txt
~/cache/src
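
The "scan but stop when a whole folder is included" idea can be sketched roughly as below. This is only an illustration under stated assumptions, not the toolkit's implementation: the `Entry` type, the `collapse` function, and the in-memory tree are all hypothetical, exclusions are treated as literal paths rather than globs, and a real version would read directories from disk.

```typescript
// Hypothetical in-memory stand-in for the file system (an assumption for
// illustration); a real implementation would read directories lazily.
interface Entry {
  name: string;
  children?: Entry[]; // present for directories, absent for files
}

function hasPrefix(text: string, prefix: string): boolean {
  return text.slice(0, prefix.length) === prefix;
}

// True when `path` is an excluded path or lies beneath one.
function isExcluded(path: string, excludes: string[]): boolean {
  return excludes.some(e => path === e || hasPrefix(path, e + "/"));
}

// True when some excluded path lies strictly beneath `path`.
function hasExcludedDescendant(path: string, excludes: string[]): boolean {
  return excludes.some(e => hasPrefix(e, path + "/"));
}

// Emit the shortest list of paths that covers everything except the excludes.
function collapse(path: string, entry: Entry, excludes: string[]): string[] {
  if (isExcluded(path, excludes)) {
    return [];
  }
  // A file, or a directory with nothing excluded beneath it, is listed whole
  // and we stop descending.
  if (entry.children === undefined || !hasExcludedDescendant(path, excludes)) {
    return [path];
  }
  const result: string[] = [];
  for (const child of entry.children) {
    result.push(...collapse(path + "/" + child.name, child, excludes));
  }
  return result;
}

const cacheDir: Entry = {
  name: "cache",
  children: [
    { name: "foo.txt" },
    { name: "bar.txt" },
    { name: "src", children: [{ name: "main.c" }, { name: "lib", children: [{ name: "baz.dll" }] }] },
    { name: "bin", children: [{ name: "main.exe" }] },
  ],
};

const includeList = collapse("~/cache", cacheDir, ["~/cache/bin"]);
console.log(includeList.join("\n"));
// ~/cache/foo.txt
// ~/cache/bar.txt
// ~/cache/src
```

With no exclusions at all, the same function returns just `~/cache`, so the common case would stay as cheap as it is today.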

To Reproduce

See https://github.com/dhadka/cache-test-exclusion

Expected behavior

Negated files / folders are excluded from the cache.

@mustaphazorgati

Any updates here?

@lhotari

lhotari commented May 28, 2021

There is a workaround. Excluding works if you have all patterns in the same "level", for example

          path: |
            ~/.m2/repository/*/*/*
            !~/.m2/repository/org/apache/pulsar

this is what we use in apache/pulsar, full example here:

https://github.com/apache/pulsar/blob/master/.github/workflows/ci-integration-process.yaml#L55-L65

@swarajpeppermint

swarajpeppermint commented Jul 1, 2021

> There is a workaround. Excluding works if you have all patterns in the same "level", for example
>
>           path: |
>             ~/.m2/repository/*/*/*
>             !~/.m2/repository/org/apache/pulsar
>
> this is what we use in apache/pulsar, full example here:
>
> https://github.com/apache/pulsar/blob/master/.github/workflows/ci-integration-process.yaml#L55-L65

I was trying to cache apt using:

path: |
    /var/cache/apt/archives/*
    !/var/cache/apt/archives/lock
    !/var/cache/apt/archives/partial

But it still gives me a permission error when it tries to access the partial folder in the Post Cache Dependencies step:
EACCES: permission denied, scandir '/var/cache/apt/archives/partial'

@motss

motss commented Feb 5, 2022

Thanks for the workaround in #713 (comment). It works like a charm.

Here are my findings:

# 1. This will cache everything recursively inside `/a` excluding `/a/b`
path: |
    /a/*
    !/a/b

# 2. This will cache everything recursively inside `/a` including `/a/b`
path: |
    /a
    !/a/b

# 3. This will cache everything recursively inside `/a` including `/a/b`
#    with an unexpectedly huge total cache size (~6x, your mileage might vary)
#    and I don't see any visible differences when compared to 1. and 2.
path: |
    /a/**
    !/a/b

# Conclusion
# Just use 1. for the best of both worlds until the glob issue is fixed.
# Note that changing the value of `path` also matters for caches saved under the exact same `key`:
# each cache is identified by both its `key` and its `path`.
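
Findings 1 and 2 are consistent with a simple mental model, sketched below. It assumes (this is not confirmed by the toolkit source) that the glob step only lists paths that a positive pattern matches literally, that a negation can only remove something already in that list, and that tar then archives every listed entry recursively. All function names and the sample paths are hypothetical, and `*` is simplified to "exactly one path segment"; finding 3 (`/a/**`) is not modelled.

```typescript
// `*` matches exactly one path segment in this simplified model.
function matchesPattern(pattern: string, path: string): boolean {
  const patternSegments = pattern.split("/");
  const pathSegments = path.split("/");
  return (
    patternSegments.length === pathSegments.length &&
    patternSegments.every((seg, i) => seg === "*" || seg === pathSegments[i])
  );
}

// List only the paths that the patterns match literally (no descendants).
function listWithoutDescendants(patterns: string[], allPaths: string[]): string[] {
  return allPaths.filter(path => {
    let included = false;
    for (const pattern of patterns) {
      if (pattern.charAt(0) === "!") {
        if (matchesPattern(pattern.substring(1), path)) included = false;
      } else if (matchesPattern(pattern, path)) {
        included = true;
      }
    }
    return included;
  });
}

// What tar ends up archiving: every listed entry plus everything beneath it.
function archived(listed: string[], allPaths: string[]): string[] {
  return allPaths.filter(path =>
    listed.some(entry => path === entry || path.slice(0, entry.length + 1) === entry + "/")
  );
}

const allPaths = ["/a", "/a/b", "/a/b/x.txt", "/a/c", "/a/c/y.txt"];

// Finding 1: `/a/*` lists /a/b and /a/c, so `!/a/b` has something to remove.
const case1 = archived(listWithoutDescendants(["/a/*", "!/a/b"], allPaths), allPaths);
console.log(case1); // /a/c and /a/c/y.txt

// Finding 2: `/a` lists only /a, so `!/a/b` removes nothing and tar takes it all.
const case2 = archived(listWithoutDescendants(["/a", "!/a/b"], allPaths), allPaths);
console.log(case2); // all five paths, including /a/b
```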

@robinp

robinp commented Feb 5, 2022

Hi guys. Was hitting this as well. Some digging in the code about the likely reason:

(Note: can observe this by enabling step debug logging, see https://docs.github.com/en/actions/monitoring-and-troubleshooting-workflows/enabling-debug-logging)

Other observation: the negative patterns must come after the positive one, otherwise a last positive match will always win (see the `for (const pattern of patterns) {` loop).
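
A minimal sketch of that last-match-wins behaviour, assuming the loop simply flips an "included" flag as it walks the patterns in order. The `isIncluded` and `matchesPattern` helpers are hypothetical and `*` is simplified to one path segment; this is not the toolkit's code.

```typescript
// `*` matches exactly one path segment in this simplified matcher.
function matchesPattern(pattern: string, path: string): boolean {
  const patternSegments = pattern.split("/");
  const pathSegments = path.split("/");
  return (
    patternSegments.length === pathSegments.length &&
    patternSegments.every((seg, i) => seg === "*" || seg === pathSegments[i])
  );
}

// Walk the patterns in order; whichever matching pattern comes last decides.
function isIncluded(path: string, patterns: string[]): boolean {
  let included = false;
  for (const pattern of patterns) {
    if (pattern.charAt(0) === "!") {
      if (matchesPattern(pattern.substring(1), path)) included = false;
    } else if (matchesPattern(pattern, path)) {
      included = true;
    }
  }
  return included;
}

console.log(isIncluded("/a/b", ["/a/*", "!/a/b"])); // false: the negation comes last
console.log(isIncluded("/a/b", ["!/a/b", "/a/*"])); // true: the positive match comes last
```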

Fix suggestion: pass matchDirectories as false where the glob options currently set implicitDescendants: false. The worst downside is that empty directories won't be tarred up, which sounds fine for the caching use case.
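
To make the trade-off concrete, here is a rough sketch of what a files-only listing would look like. It is one reading of the suggestion, not the behaviour of the real glob options: it assumes descendants are enumerated and only files are listed, and the `PathInfo` type, `listFilesOnly` function, and sample tree are hypothetical.

```typescript
// Hypothetical flat description of a directory tree.
interface PathInfo {
  path: string;
  isDirectory: boolean;
}

function isUnder(path: string, dir: string): boolean {
  return path === dir || path.slice(0, dir.length + 1) === dir + "/";
}

// List every file beneath `root`, skipping anything beneath an excluded path.
// Directories are never listed, so empty ones disappear from the result.
function listFilesOnly(root: string, excludes: string[], all: PathInfo[]): string[] {
  return all
    .filter(info => !info.isDirectory)
    .filter(info => isUnder(info.path, root))
    .filter(info => !excludes.some(e => isUnder(info.path, e)))
    .map(info => info.path);
}

const tree: PathInfo[] = [
  { path: "/a", isDirectory: true },
  { path: "/a/b", isDirectory: true },
  { path: "/a/b/x.txt", isDirectory: false },
  { path: "/a/c", isDirectory: true },
  { path: "/a/c/y.txt", isDirectory: false },
  { path: "/a/empty", isDirectory: true },
];

const files = listFilesOnly("/a", ["/a/b"], tree);
console.log(files); // only /a/c/y.txt: /a/b is excluded and /a/empty is dropped
```

Exclusion becomes straightforward because every candidate is a file, at the cost of a longer list and the loss of empty directories.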

In the meantime, the workaround that others helpfully suggested, expanding the paths with /* down to the level of the excludes, is doable, but it is inconvenient and has some hard-to-cover edge cases.

knapsu added a commit to knapsu/plex-media-player-flatpak that referenced this issue Feb 11, 2022
EagleoutIce added a commit to EagleoutIce/uulm-eidi-tut-ss2022-slides that referenced this issue Aug 12, 2022
ChanTsune added a commit to ChanTsune/Portable-Network-Archive that referenced this issue Aug 11, 2024
@kamzil

kamzil commented Oct 3, 2024

Just wasted way too much time trying to figure this out. GitHub should really pay more attention to the inconsistencies of glob patterns between different parts of Actions.

@jonkoops jonkoops linked a pull request Oct 16, 2024 that will close this issue
@jonkoops

Note that none of the fixes here work if the levels are deeper and there are multiple paths that need to be matched, for example:

# Works
path: |
    /a/*
    !/a/b

# Doesn't work
path: |
    /a/*
    !/a/b/c
    
# Works
path: |
    /a/*/*
    !/a/b/c
    
path: |
    /a/*/*
    !/a/b/c
    !/a/b/c/d # No longer works
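
The four cases above fit the "same level" rule of thumb from earlier in the thread: an exclude seems to take effect only when some include pattern has the same number of path segments. A small check along those lines is sketched below. It is a heuristic derived from the reports in this thread, not from the toolkit's code, and the `depth` and `excludesAtWrongLevel` names are hypothetical.

```typescript
// Number of non-empty path segments, ignoring a leading `!`.
function depth(pattern: string): number {
  return pattern
    .replace(/^!/, "")
    .split("/")
    .filter(segment => segment.length > 0).length;
}

// Return the excludes that have no include pattern at the same depth.
function excludesAtWrongLevel(patterns: string[]): string[] {
  const includeDepths = patterns.filter(p => p.charAt(0) !== "!").map(depth);
  return patterns.filter(
    p => p.charAt(0) === "!" && includeDepths.indexOf(depth(p)) === -1
  );
}

console.log(excludesAtWrongLevel(["/a/*", "!/a/b"]));                    // none flagged
console.log(excludesAtWrongLevel(["/a/*", "!/a/b/c"]));                  // flags !/a/b/c
console.log(excludesAtWrongLevel(["/a/*/*", "!/a/b/c"]));                // none flagged
console.log(excludesAtWrongLevel(["/a/*/*", "!/a/b/c", "!/a/b/c/d"]));   // flags !/a/b/c/d
```

A check like this could run in a lint step to warn about excludes that are likely to be ignored; it says nothing about whether the remaining patterns behave as intended.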


9 participants