Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: implement retina shell CLI command #962

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

wedaly
Copy link
Member

@wedaly wedaly commented Nov 6, 2024

Description

Implement retina shell CLI command for adhoc network debugging of nodes and pods.

Related Issue

#910

Checklist

  • I have read the contributing documentation.
  • I signed and signed-off the commits (git commit -S -s ...). See this documentation on signing commits.
  • I have correctly attributed the author(s) of the code.
  • I have tested the changes locally.
  • I have followed the project's style guidelines.
  • I have updated the documentation, if necessary.
  • I have added tests, if applicable.

Screenshots (if applicable) or Testing Completed

image

Additional Notes

For testing, need to set RETINA_SHELL_IMAGE_REPO and RETINA_SHELL_IMAGE_VERSION until the retina-shell image is published.


Please refer to the CONTRIBUTING.md file for more information on how to contribute to this project.

@wedaly wedaly requested a review from a team as a code owner November 6, 2024 21:29
timeout time.Duration
)

const defaultRetinaShellImageRepo = "ghcr.io/microsoft/retina/retina-shell"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few questions about this part:

  • I'd prefer to use MCR instead of ghcr, since AKS clusters are required to have access to MCR. Everything else in this repo seems to be using ghcr, but is it possible to use MCR instead? (can make sure the image is replicated there before merging this).
  • Defaulting to the Retina CLI version for the image tag. This works when the version is set in release builds, but will fail otherwise since we're defaulting to 0.0.5 here
    // This variable is used by the "version" command and is set during build.
    // Defaults to a safe value if not set.
    var Version = "v0.0.5"
    could we bump this later to a version that includes retina shell? (once it exists)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could make it a var, have it set by default to GHCR and then patch this var at link time with MCR if needed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, that's an option. Do we do any patching like that today when building the retina kubectl plugin? (Ideally the official plugin release would use MCR so it will "just work" in any AKS cluster)

shellCmd.Flags().BoolVarP(&allowHostFilesystemWrite, "allow-host-filesystem-write", "w", false, "Allow write access to the host filesystem. Implies --mount-host-filesystem. Applies only to nodes, not pods.")
shellCmd.Flags().BoolVar(&hostPID, "host-pid", false, "Set HostPID on the shell container. Applies only to nodes, not pods.")
shellCmd.Flags().StringSliceVarP(&capabilities, "capabilities", "c", []string{}, "Add capabilities to the shell container")
shellCmd.Flags().DurationVar(&timeout, "timeout", 30*time.Second, "The maximum time to wait for the shell container to start")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking for feedback about how to structure these flags. This was the simplest way I could think of that supported all the use cases I wanted to support (see docs/06-Troubleshooting/shell.md)

@@ -0,0 +1,91 @@
package shell
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wasn't sure how to test the other parts of this, as it requires a k8s cluster and interactive shell / tty, so I added unit tests for just the manifest generation. Open to suggestions about how to test this better, maybe adding E2E in a followup PR.

}

osLabel := node.Labels["kubernetes.io/os"]
if osLabel != "linux" { // Only Linux supported for now.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried supporting Windows, but got confused with how host process containers work, so I'm leaving this as a followup.

@wedaly wedaly force-pushed the retina-shell-feature-branch branch from 663ed7f to c20afa6 Compare November 6, 2024 21:40
return err
}

config := shell.Config{
Copy link
Member Author

@wedaly wedaly Nov 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

golang-ci lint is complaining that config isn't used, but it's definitely used on lines 106 and 108 below. Maybe it's getting confused because it's used in a closure?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably upset because it can't prove that the closure gets invoked, and it's trying to call r.Visit a liar. I'd just silence it since it's trying to be too smart and failing at it.

@wedaly wedaly force-pushed the retina-shell-feature-branch branch from c20afa6 to 0c17452 Compare November 6, 2024 21:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants