Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix deadlocks and process username issue #1738

Merged
merged 3 commits into from
Nov 12, 2024
Merged

Fix deadlocks and process username issue #1738

merged 3 commits into from
Nov 12, 2024

Conversation

dhaavi
Copy link
Member

@dhaavi dhaavi commented Nov 12, 2024

  • Warn instead of failing when process username cannot be found
  • Notify packet issues asynchronously
  • Make saving IP and CNAMEs more defensive

Summary by CodeRabbit

  • New Features

    • Enhanced asynchronous reporting for DNS bypass and multi-peer tunnel issues.
    • Improved handling of open DNS requests by saving them regardless of connection verdicts.
  • Bug Fixes

    • Resolved potential infinite loops in CNAME resolution by limiting iterations to 50.
  • Refactor

    • Streamlined error handling in process loading to log warnings instead of returning errors for username retrieval.
  • Documentation

    • Updated comments to reflect new limits in CNAME resolution logic.

@dhaavi dhaavi requested a review from vlabo November 12, 2024 14:14
Copy link
Contributor

coderabbitai bot commented Nov 12, 2024

📝 Walkthrough

Walkthrough

The changes in this pull request involve modifications to several functions across different files. Key updates include transitioning the reporting mechanisms in callbacks.go to an asynchronous model using module.mgr.Go, implementing a limit on CNAME resolution iterations in dns.go, ensuring open DNS requests are saved regardless of verdict in nameserver.go, and adjusting error handling in process.go to log warnings instead of returning errors for specific cases. These changes collectively enhance the functionality and reliability of the service.

Changes

File Path Change Summary
service/compat/callbacks.go Modified ReportSecureDNSBypassIssue and ReportMultiPeerUDPTunnelIssue to use module.mgr.Go for asynchronous reporting; added import for mgr.
service/firewall/dns.go Updated UpdateIPsAndCNAMEs to limit CNAME resolution to a maximum of 50 iterations; comments updated to reflect this limit.
service/nameserver/nameserver.go Added a call to network.SaveOpenDNSRequest in handleRequest to save open DNS requests regardless of verdict.
service/process/process.go Changed error handling in loadProcess to log warnings instead of returning errors when fetching the username fails.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Service
    participant Worker
    User->>Service: ReportSecureDNSBypassIssue(p)
    Service->>Worker: Go("Report Secure DNS Bypass", function)
    Worker->>Service: Notify Secure DNS Bypass
    Service->>User: Acknowledgment

    User->>Service: ReportMultiPeerUDPTunnelIssue(p)
    Service->>Worker: Go("Report Multi Peer UDP Tunnel", function)
    Worker->>Service: Notify Multi Peer UDP Tunnel
    Service->>User: Acknowledgment
Loading

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (4)
service/compat/callbacks.go (2)

35-38: Consider adding error handling and context cancellation

While the async execution helps prevent deadlocks, consider these improvements for better robustness:

  1. Handle potential errors from notify()
  2. Respect context cancellation from the worker
 module.mgr.Go("report secure dns bypass issue", func(w *mgr.WorkerCtx) error {
+    select {
+    case <-w.Ctx.Done():
+        return w.Ctx.Err()
+    default:
+    }
     secureDNSBypassIssue.notify(p)
     return nil
 })

35-46: Consider implementing structured notification handling

The move to async notifications is good for preventing deadlocks, but consider these architectural improvements:

  1. Add metrics/monitoring for async notifications
  2. Implement a notification queue with backpressure handling
  3. Define a consistent error handling strategy across all notifications

Would you like me to provide a detailed design proposal for these improvements?

service/process/process.go (1)

259-261: Consider enhancing error handling further.

While the current change improves robustness, consider these additional enhancements:

  1. Set a default/fallback username (e.g., "unknown") when retrieval fails
  2. Add structured error details to help diagnose username retrieval issues

Here's a suggested implementation:

-		log.Tracer(ctx).Warningf("process: failed to get username (PID %d): %s", pInfo.Pid, err)
+		process.UserName = "unknown"  // Set fallback value
+		log.Tracer(ctx).Warningf("process: failed to get username (PID %d): %s (using fallback)", pInfo.Pid, err)
+		// Optionally store error details for debugging
+		process.Error = fmt.Sprintf("username retrieval error: %v", err)
service/firewall/dns.go (1)

305-309: Good defensive programming! Consider a few enhancements.

The introduction of a limit on CNAME resolution iterations is a good defensive measure against potential infinite loops from circular references. However, we can improve it further:

Consider these enhancements:

  1. Define the magic number as a named constant
  2. Add a warning log when the 50-iteration limit is hit
+const maxCNAMEResolutionDepth = 50
+
 // UpdateIPsAndCNAMEs saves all the IP->Name mappings to the cache database and
 // updates the CNAMEs in the Connection's Entity.
 func UpdateIPsAndCNAMEs(q *resolver.Query, rrCache *resolver.RRCache, conn *network.Connection) {
     // ... existing code ...
     domain := q.FQDN
-    for range 50 {
+    for i := 0; i < maxCNAMEResolutionDepth; i++ {
         nextDomain, isCNAME := cnames[domain]
         if !isCNAME || nextDomain == domain {
             break
         }
+        if i == maxCNAMEResolutionDepth-1 {
+            log.Warningf("CNAME resolution depth exceeded %d levels for domain %s", maxCNAMEResolutionDepth, q.FQDN)
+        }
         record.CNAMEs = append(record.CNAMEs, nextDomain)
         domain = nextDomain
     }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between af3bb80 and f4b96e1.

📒 Files selected for processing (4)
  • service/compat/callbacks.go (2 hunks)
  • service/firewall/dns.go (1 hunks)
  • service/nameserver/nameserver.go (1 hunks)
  • service/process/process.go (1 hunks)
🔇 Additional comments (5)
service/compat/callbacks.go (2)

6-6: LGTM: Required import for async execution

The added import is necessary for the async execution changes and is correctly placed.


43-46: Apply same improvements as ReportSecureDNSBypassIssue

The implementation follows the same pattern as ReportSecureDNSBypassIssue. Please apply the same error handling and context cancellation improvements suggested above.

service/process/process.go (1)

259-261: LGTM: Improved error handling for username retrieval.

The change from returning an error to logging a warning aligns with the PR objective of handling missing usernames gracefully. This should help prevent potential deadlocks in the process loading chain.

Let's verify the impact on features that might depend on the username:

service/nameserver/nameserver.go (2)

228-228: LGTM! Consistent DNS request tracking improves reliability.

The addition of SaveOpenDNSRequest ensures that DNS requests are consistently tracked regardless of their verdict state, which aligns with the PR's objective of making the process more defensive. The placement after UpdateIPsAndCNAMEs is correct as it ensures all metadata is properly updated before saving.


228-228: Verify thread-safety of SaveOpenDNSRequest implementation.

Let's ensure the implementation of SaveOpenDNSRequest is thread-safe and properly manages memory to prevent potential resource leaks.

✅ Verification successful

SaveOpenDNSRequest is thread-safe and manages resources properly.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check SaveOpenDNSRequest implementation for thread-safety mechanisms
# and resource management.

# Search for the implementation
ast-grep --pattern 'func SaveOpenDNSRequest($_, $_, $_) {
  $$$
}'

# Look for mutex usage and cleanup mechanisms
rg -A 10 "SaveOpenDNSRequest" 

Length of output: 3093

@dhaavi dhaavi merged commit ee58324 into develop Nov 12, 2024
6 checks passed
@dhaavi dhaavi deleted the fix/core-deadlocks branch November 12, 2024 14:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants