Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cache gives relative paths all the way to root due to erroneous path resolution #1831

Open
deitch opened this issue Sep 24, 2024 · 0 comments
Labels
bug Something isn't working

Comments

@deitch
Copy link

deitch commented Sep 24, 2024

Thank you 🙇‍♀ for wanting to create an issue in this repository. Before you do, please ensure you are filing the issue in the right place. Issues should only be opened on if the issue relates to code in this repository.

If your issue is relevant to this repository, please include the information below:

Describe the bug

There are several issues reported in actions/cache related to saving and restoring caches between runners that do not have the exact same directory path, e.g. container to hosted runner on one OS; hosted runner to self-hosted runner; runners between OSes; etc.

When you try to actions/restore, you end up with things in the wrong directory in the best of cases; in the worst, something like:

/usr/bin/tar: ../../../../foo: Cannot mkdir: Permission denied
/usr/bin/tar: ../../../../foo/file: Cannot open: No such file or directory

Essentially, all of the files in the tar as loaded into cache do not have paths relative to the workspace root, which would allow it to work cross-runner. Instead, they have relative path structures to get from workspace root all the way to / and then down again as needed. This obviously fails across runners with even a slightly different directory structure.

I believe that this is due to how the paths are resolved in this library here:

  const workspace = process.env['GITHUB_WORKSPACE'] ?? process.cwd()
  const globber = await glob.create(patterns.join('\n'), {
    implicitDescendants: false
  })

  for await (const file of globber.globGenerator()) {
    const relativeFile = path
      .relative(workspace, file)
      .replace(new RegExp(`\\${path.sep}`, 'g'), '/')
    core.debug(`Matched: ${relativeFile}`)

It uses path.relative() to resolve file relative to workspace, but that is not exactly what path.relative does. The docs state:

The path.relative() method returns the relative path from from to to based on the current working directory. If from and to each resolve to the same path (after calling path.resolve() on each), a zero-length string is returned.

It is doing the relative path to something based on the current working directory. Here is a sample.

$ mkdir /tmp/foo && cd /tmp/foo
$ node
> path.relative("/a/b/c", "../q")
'../../../tmp/q'

That is useful if I always will have /a/b/c, but not if I want to extract my tar file in a place where I have /foo/bar. It is brittle.

I think what this should do is resolve it relative to the workspace, but such that if path is below workspace, it is just a relative path. Only bother with absolute path if path is absolute.

  • foo/bar -> foo/bar
  • ../foo -> ../foo
  • /a/b -> /a/b

Which would mean replacing:

    const relativeFile = path
      .relative(workspace, file)
      .replace(new RegExp(`\\${path.sep}`, 'g'), '/');

with something like:

    var fullPathFile = file;
    if (!path.isAbsolute(file)) {
        fullPathFile = path.relative(workspace, path.join(workspace, path))
    }   
    const relativeFile = path.normalize(fullPathFile).replace(new RegExp(`\\${path.sep}`, 'g'), '/')

The problem is that I think we were using path.relative() to resolve relative paths.

  • If the file is absolute, e.g. /foo/bar, we shouldn't try doing relative. The config gave an absolute path, let's not second-guess it (and break things)
  • If the file is relative, then we need to join it to the workspace, and then we can call path.relative on it

To Reproduce

Several issues on actions/cache, here is a good one

Expected behavior

Running cache save to a directory relative to workspace, e.g. path: mydir on one kind of runner, and then cache restore to a directory relative to workspace, same or otherwise, e.g. path: foo or even path: mydir on a runner with a different setup to get to the workspace, should work. It only should fail if it explicitly goes to places it might fail.

Also, expect that usage of absolute paths should work, e.g. /tmp/foo, when it does not, because the referenced code in here will turn that into ../../../../tmp/foo which may or may not be the right path on a different kind of runner.

Additional context

I hope I got everything there. I am happy to open a PR for it.

@deitch deitch added the bug Something isn't working label Sep 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

1 participant