Optimised the Gaussian sampling function. #2336

Merged

Conversation

vedantdalimkar
Contributor

@vedantdalimkar vedantdalimkar commented Feb 9, 2025

Tries to address #2295

The "sample_gaussian" function was using a numpy function to sample Gaussian noise. Changed it to the equivalent cv2 operation.

Also noticed that the "mean_range" and "std_range" parameters for GaussNoise are always tuples of 2 equal values. The "sample_gaussian" function randomly samples the mean and standard deviation from these 2 tuples. This is redundant, as sampling from an interval containing a single point always returns that point. It is also relatively expensive compared to simply selecting that point, as sampling uses a numpy generator object.
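
To illustrate the point about degenerate ranges, here is a minimal sketch. It is not the code in this PR: `pick_value` is a hypothetical helper name, and the generator is assumed to be a `np.random.Generator` like the one the transforms already hold.

```python
import numpy as np


def pick_value(value_range, rng):
    """Hypothetical helper: return a value from (low, high).

    When both endpoints are equal the interval holds a single point,
    so the generator call is skipped and that point is returned directly.
    """
    low, high = value_range
    if low == high:
        return float(low)
    return float(rng.uniform(low, high))


rng = np.random.default_rng(0)
constant = pick_value((0.2, 0.2), rng)  # no random draw is made here
sampled = pick_value((0.1, 0.3), rng)   # falls back to a uniform draw
```

One design consequence to keep in mind: skipping the draw means the generator state is not advanced for degenerate ranges, so later random values under a fixed seed may differ from those produced by the previous behaviour.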

Summary by Sourcery

Replace the NumPy-based Gaussian noise sampling with an equivalent OpenCV (cv2) implementation. Simplify the mean and standard deviation selection process by directly using the provided values when the ranges consist of identical elements, improving efficiency.

New Features:

  • Add a seed parameter to control the random number generation in cv2 for Gaussian noise sampling.

Enhancements:

  • Improve performance of Gaussian noise sampling by simplifying mean and standard deviation selection.

Contributor

sourcery-ai bot commented Feb 9, 2025

Reviewer's Guide by Sourcery

This PR optimises the Gaussian noise sampling process by replacing the inefficient numpy operations with cv2 operations, and by simplifying the sampling of constant mean and standard deviation values. Additionally, a seed parameter has been integrated and propagated across various functions to ensure consistent random number generation using cv2's RNG.

Sequence diagram for Gaussian Noise Sampling Process

sequenceDiagram
    participant Caller
    participant generate_noise
    participant sample_noise
    participant sample_gaussian
    participant CV2

    Caller->>generate_noise: call generate_noise(seed)
    generate_noise->>sample_noise: call sample_noise(seed)
    sample_noise->>sample_gaussian: call sample_gaussian(seed)
    sample_gaussian->>CV2: cv2.setRNGSeed(seed)
    sample_gaussian->>CV2: cv2.randn(dst, mean_vector, std_dev_vector)
    CV2-->>sample_gaussian: noise array
    sample_gaussian-->>sample_noise: gaussian noise array
    sample_noise-->>generate_noise: noise array
    generate_noise-->>Caller: noise array

Class diagram for Seed Propagation in Augmentation Transforms

classDiagram
    class Composition {
        -seed: int | None
        -random_generator: np.random.Generator
        -py_random: Random
        +__init__(seed)
        +set_random_seed(seed)
    }
    class TransformsInterface {
        -seed: int | None
        -random_generator: np.random.Generator
        -py_random: Random
        +__init__(seed)
        +set_random_seed(seed)
    }
    note for Composition "Calls cv2.setRNGSeed(seed) in __init__ and set_random_seed"
    note for TransformsInterface "Calls cv2.setRNGSeed(seed) during initialization and seed updates"

File-Level Changes

Change: Refactored Gaussian noise sampling.
Details:
  • Replaced numpy-based sampling in sample_gaussian with cv2.randn for improved performance.
  • Implemented cv2.setRNGSeed(seed) to correctly initialize the RNG for cv2 operations.
  • Modified the logic to directly assign constant mean and standard deviation values when the provided range has identical endpoints.
  • Created mean and standard deviation vectors to support multi-channel noise generation.
Files:
  • albumentations/augmentations/functional.py

Change: Integrated and propagated seed parameter across noise generation and transforms.
Details:
  • Added an optional seed parameter to noise generator functions (generate_noise, generate_constant_noise, generate_per_pixel_noise, sample_noise, sample_gaussian, and generate_shared_noise).
  • Propagated the seed in transform functions (e.g., get_params_dependent_on_data) to ensure consistency.
  • Set cv2 RNG seed in core initialization methods in composition and transforms interface to maintain deterministic behaviour across modules.
Files:
  • albumentations/augmentations/functional.py
  • albumentations/augmentations/transforms.py
  • albumentations/core/composition.py
  • albumentations/core/transforms_interface.py

Contributor

@sourcery-ai sourcery-ai bot left a comment

Hey @vedantdalimkar - I've reviewed your changes - here's some feedback:

Overall Comments:

  • Consider checking if 'seed' is not None before calling cv2.setRNGSeed to prevent potential errors when seed is None.
  • Centralize or abstract the repeated cv2.setRNGSeed calls to reduce duplication across functions and classes.
Here's what I looked at during the review
  • 🟢 General issues: all looks good
  • 🟢 Security: all looks good
  • 🟢 Testing: all looks good
  • 🟢 Complexity: all looks good
  • 🟢 Documentation: all looks good

@ternaus ternaus merged commit ef505c6 into albumentations-team:main Feb 10, 2025
14 checks passed