Releases: albumentations-team/albumentations
Albumentations 2.0.4 Release Notes
- Support Our Work
- Transforms
- Bug fixes and speedups
Support Our Work
- Help Us Grow - If you find value in Albumentations, consider becoming a sponsor. Every contribution, no matter the size, helps us maintain and improve the library for everyone.
- Show Your Support - If you enjoy using Albumentations, consider giving us a ⭐ on GitHub. It helps others discover the library and motivates our team.
- Join Our Community - Have suggestions or run into issues? We welcome your input! Share your experience in our GitHub issues or connect with us on Discord.
Transforms
Added HEStain transform
Applies H&E (Hematoxylin and Eosin) stain augmentation to histopathology images.
This transform simulates different H&E staining conditions using either:
1. Predefined stain matrices (8 standard references)
2. Vahadane method for stain extraction
3. Macenko method for stain extraction
4. Custom stain matrices
![Screenshot 2025-02-09 at 6 15 50 PM](https://private-user-images.githubusercontent.com/5481618/412177426-cdb9d823-aef3-4045-8f79-75a0b3334b1b.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk1OTI0NDYsIm5iZiI6MTczOTU5MjE0NiwicGF0aCI6Ii81NDgxNjE4LzQxMjE3NzQyNi1jZGI5ZDgyMy1hZWYzLTQwNDUtOGY3OS03NWEwYjMzMzRiMWIucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI1MDIxNSUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNTAyMTVUMDQwMjI2WiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9NzZjMjE5MDY4MWM2YTFiMGRmZWJiOTY1MzljODA3YjM1ZGQ0Mjg0MjYwMjkwZjcyN2E0NzIzZjgwZDUxMTMzOSZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QifQ.jJYz86hGJ6fIMaEGegZq6FfO6-DEgmb-QmjJ93Jf9eE)
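As a rough illustration of stain augmentation with a fixed stain matrix, the sketch below decomposes an RGB image into hematoxylin and eosin concentrations in optical density space, rescales them, and reconstructs the image. The helper name `augment_he`, its `scale` argument, and the stain vectors (values commonly attributed to Ruifrok and Johnston) are assumptions made for this example; this is not the HEStain implementation.

```python
import numpy as np

# Assumed reference stain vectors (rows: hematoxylin, eosin) in RGB optical density.
STAIN_MATRIX = np.array([[0.65, 0.70, 0.29],
                         [0.07, 0.99, 0.11]])


def augment_he(image, scale=(1.0, 1.0), stain_matrix=STAIN_MATRIX):
    """Rescale the H and E concentrations of a uint8 RGB image (hypothetical helper)."""
    stains = stain_matrix / np.linalg.norm(stain_matrix, axis=1, keepdims=True)
    # Beer-Lambert law: optical density is the negative log of transmitted light.
    od = -np.log((image.astype(np.float64) + 1.0) / 256.0)
    flat = od.reshape(-1, 3)
    # Least-squares estimate of per-pixel concentrations: flat ~= conc.T @ stains
    conc, *_ = np.linalg.lstsq(stains.T, flat.T, rcond=None)  # shape (2, N)
    conc = conc * np.asarray(scale, dtype=np.float64)[:, None]
    od_new = (conc.T @ stains).reshape(image.shape)
    out = 256.0 * np.exp(-od_new) - 1.0
    return np.clip(out, 0, 255).astype(np.uint8)
```

With `scale=(1.0, 1.0)` a pure white image is returned unchanged, and `scale=(0.0, 0.0)` removes both stains and yields a white image.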
Bug fixes and speedups
- Fix in docstring in Rotate by @MalteEbner
- Extended and clarified the docstring in Compose on how to fix the random seed; the same information was added to the FAQ
- Bugfix in RandomRain
- 1.5 times speedup in GaussNoise by @vedantdalimkar
Albumentations 2.0.3 Release Notes
- Core
- Bug fixes and speedups
Core
Extended the functionality of the `strict` parameter in Compose.
Now, if `strict=True` and you pass incorrect arguments to transforms in Compose, you will get an error. If `strict=False`, you will get only a warning.
There have been a lot of deprecations in the last year. Your augmentation pipeline may not behave as expected if parameters that you pass to transforms are ignored and default values are used instead.
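The difference between the two modes can be pictured with a small stand-alone sketch. The function `check_kwargs`, the stand-in `Blur` class, and the choice of `ValueError` are all hypothetical and only mirror the behavior described above; they are not the library's code.

```python
import inspect
import warnings


def check_kwargs(transform_cls, kwargs, strict):
    """Raise (strict=True) or warn (strict=False) on arguments the constructor does not accept."""
    accepted = set(inspect.signature(transform_cls.__init__).parameters) - {"self"}
    unknown = sorted(set(kwargs) - accepted)
    if not unknown:
        return
    message = f"{transform_cls.__name__} got unexpected arguments: {unknown}"
    if strict:
        raise ValueError(message)
    # Non-strict mode: the unknown arguments are reported but otherwise ignored.
    warnings.warn(message, UserWarning, stacklevel=2)


class Blur:
    """Stand-in transform used only for this example, not the library class."""

    def __init__(self, blur_limit=7, p=0.5):
        self.blur_limit = blur_limit
        self.p = p
```

Calling `check_kwargs(Blur, {"blur_limt": 3}, strict=False)` emits a warning about the misspelled argument, while the same call with `strict=True` raises.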
Bug fixes and speedups
- Bugfix in filtering of bounding boxes based on aspect ratio by @CristoJV
- Speedup in SaltAndPepper
- Speedup in AutoContrast
- Speedup in Illumination
- Speedup in ElasticTransform
- Speedup in RandomRain
- Bugfix in passing an int `np.array` as labels in BboxParams
Albumentations 2.0.2 Release Notes
- Core
- Bug fixes and speedups
Core
Added parameter `max_accept_ratio` to BboxParams.
max_accept_ratio (float | None): Maximum allowed aspect ratio for bounding boxes.
The aspect ratio is calculated as max(width/height, height/width), so it's always >= 1.
Boxes with aspect ratio greater than this value will be filtered out.
For example, if `max_accept_ratio=3.0`, boxes with width:height or height:width ratios
greater than 3:1 will be removed. Set to None to disable aspect ratio filtering. Default: None.
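The filtering rule in the docstring can be sketched in plain Python. The function name `filter_by_aspect_ratio` and the `(x_min, y_min, x_max, y_max)` box layout are assumptions for this example, as is the choice to drop degenerate boxes.

```python
def filter_by_aspect_ratio(bboxes, max_accept_ratio=None):
    """Keep boxes (x_min, y_min, x_max, y_max) whose aspect ratio is <= max_accept_ratio."""
    if max_accept_ratio is None:
        return list(bboxes)  # filtering disabled
    kept = []
    for x_min, y_min, x_max, y_max in bboxes:
        width, height = x_max - x_min, y_max - y_min
        if width <= 0 or height <= 0:
            continue  # assumption: degenerate boxes are dropped to avoid division by zero
        ratio = max(width / height, height / width)  # always >= 1
        if ratio <= max_accept_ratio:
            kept.append((x_min, y_min, x_max, y_max))
    return kept


boxes = [(0, 0, 10, 10), (0, 0, 40, 10), (0, 0, 10, 30)]
print(filter_by_aspect_ratio(boxes, max_accept_ratio=3.0))
# [(0, 0, 10, 10), (0, 0, 10, 30)]
```

The 4:1 box is removed, while the 1:3 box stays because its ratio is not greater than 3.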
Bugfixes and Speedups
- Bugfix in `clip=True` in BboxParams: it clipped not only boxes but also class labels when they were passed as a numpy array
- Bugfix in keypoints in all distortions: Elastic, Optical, Grid, ThinPlateSpline
- Speedup in `PlasmaShadow`, `PlasmaBrightnessContrast`, `ChannelShuffle`
Albumentations 2.0.1 Release Notes
- Core
- Bugfixes and speedups
Core
Added parameter `filter_invalid_bboxes` to BboxParams.
If True, filters out invalid bounding boxes (e.g., boxes with negative dimensions or boxes where `x_max < x_min` or `y_max < y_min`) at the beginning of the pipeline. If `clip=True`, filtering is applied after clipping. Default: False.
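The order of operations (clip first, then filter) can be sketched as follows. The helper names are hypothetical, boxes are assumed to be `(x_min, y_min, x_max, y_max)` in pixels, and treating zero-area boxes as invalid is an assumption of this example rather than something stated above.

```python
def clip_box(box, width, height):
    """Clamp a box to the image rectangle [0, width] x [0, height]."""
    x_min, y_min, x_max, y_max = box
    return (
        min(max(x_min, 0), width),
        min(max(y_min, 0), height),
        min(max(x_max, 0), width),
        min(max(y_max, 0), height),
    )


def is_valid(box):
    """Assumption: a box needs strictly positive width and height to be valid."""
    x_min, y_min, x_max, y_max = box
    return x_max > x_min and y_max > y_min


def prepare_bboxes(bboxes, width, height, clip=False, filter_invalid_bboxes=False):
    """Apply clipping first, then drop invalid boxes, mirroring the described order."""
    if clip:
        bboxes = [clip_box(b, width, height) for b in bboxes]
    if filter_invalid_bboxes:
        bboxes = [b for b in bboxes if is_valid(b)]
    return list(bboxes)


boxes = [(10, 10, 50, 50), (60, 20, 30, 40), (120, 10, 150, 40)]
print(prepare_bboxes(boxes, 100, 100, clip=True, filter_invalid_bboxes=True))
# [(10, 10, 50, 50)]
```

The second box is dropped because `x_max < x_min`; the third lies outside the image, collapses to zero width after clipping, and is then filtered.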
Bugfixes and speedups:
- Speedup in CubicSymmetry by @ternaus
- Speedup in `FromFloat` when applied to `images`, `volume`, `volumes`
- Bugfix in PixelDropout: it did not support a sequence as `fill`
- Bugfix in GaussianBlur: it did not preserve brightness and did not scale properly at large `sigma`. Fixed. The behavior now also closely matches `PIL`
- Bugfix in distortions: OpticalDistortion, GridDistortion, ElasticTransform, ThinPlateSpline. Only bounding boxes and keypoints were affected. By @RhysEvan
Albumentations 2.0.0 Release Notes
This is a major release, meaning:
- only one new transform
- a lot of changes
- all parameter renaming went through deprecations, so you have been getting deprecation warnings for months
- A few transforms have changed default parameters. If you always specify parameters for each augmentation, this will not affect you.
If you have questions or proposals:
If you have complaints:
- Will be happy to see you as one of our sponsors at https://github.com/sponsors/albumentations-team
New transform
![Screenshot 2025-01-03 at 5 58 27 PM](https://private-user-images.githubusercontent.com/5481618/401298644-71669a29-fb4d-43a9-9e77-d0746cfdcfb2.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk1OTI0NDYsIm5iZiI6MTczOTU5MjE0NiwicGF0aCI6Ii81NDgxNjE4LzQwMTI5ODY0NC03MTY2OWEyOS1mYjRkLTQzYTktOWU3Ny1kMDc0NmNmZGNmYjIucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI1MDIxNSUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNTAyMTVUMDQwMjI2WiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9NzZhNzZkMWYzY2RmNDkxM2JiZWY4ZjM1Yjg5NjQxMjUwN2I2ZDY2NzZhZjJlZjdiNWZlYTMyMjljNzExMGIwZCZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QifQ.63QUCTurQZl1id3ms2UtXaZG-TrmdDkgtHnim-YhZlc)
Core
- Deleted `always_apply` => use `p=1` to always apply and `p=0` for not applying.
- Deleted `update_params`, `get_params_dependent_on_targets` => use `get_params_dependent_on_data`
Transforms
GaussNoise
- Deleted: `var_limit`, `mean`
- Use: `std_range`, `mean_range`
This is not just a renaming: `var_limit` and `std_range` sample from different distributions. Sampling from `std_range` matches other libraries such as torchvision.
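Why the two parameterizations differ can be seen with a quick simulation: drawing a variance uniformly and taking its square root does not give a uniformly distributed standard deviation. The range `(0, 1)` below is an arbitrary choice for the illustration, not a library default.

```python
import random
import statistics

rng = random.Random(0)
n = 100_000

# Variance-based sampling: draw a variance uniformly, then take the square root.
std_from_var = [rng.uniform(0.0, 1.0) ** 0.5 for _ in range(n)]
# Std-based sampling: draw the standard deviation uniformly.
std_direct = [rng.uniform(0.0, 1.0) for _ in range(n)]

mean_from_var = statistics.fmean(std_from_var)
mean_direct = statistics.fmean(std_direct)
print(round(mean_from_var, 2), round(mean_direct, 2))
```

The first mean comes out near 2/3 and the second near 1/2, so the same numeric range produces noticeably stronger noise on average under variance-based sampling.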
AdvancedBlur
- Deleted: `sigmaX_limit`, `sigmaY_limit`
- Use: `sigma_x_limit`, `sigma_y_limit`
RandomCrop
- Deleted: `pad_mode`, `pad_val_mask`, `pad_cvl`
- Use: `border_mode`, `fill_mask`, `fill`
CenterCrop
- Deleted: `pad_mode`, `pad_val_mask`, `pad_cvl`
- Use: `border_mode`, `fill_mask`, `fill`
Crop
- Deleted: `pad_mode`, `pad_val_mask`, `pad_cvl`
- Use: `border_mode`, `fill_mask`, `fill`
RandomResizedCrop
- Deleted: `height`, `width`
- Use: `size`
RandomSizedCrop
- Deleted: `height`, `width`
- Use: `size`
RandomCropNearBBox
- Deleted: `cropping_box_key`
- Use: `cropping_bbox_key`
CropAndPad
- Deleted: `pad_mode`, `pad_val_mask`, `pad_cvl`
- Use: `border_mode`, `fill_mask`, `fill`
TemplateTransform
- Deleted: `template_weight`
ChannelDropout
- Deleted: `fill_value`
- Use: `fill`
CoarseDropout
- Deleted: `min_holes`, `max_holes`, `min_height`, `max_height`, `min_width`, `max_width`, `mask_fill_value`, `fill_value`
- Use: `num_holes_range`, `hole_height_range`, `hole_width_range`, `fill`, `fill_mask`
Default parameters also changed:
`hole_height_range = (8, 8)` => `hole_height_range = (0.1, 0.2)`
`hole_width_range = (8, 8)` => `hole_width_range = (0.1, 0.2)`
GridDropout
- Deleted: `unit_size_min`, `unit_size_max`, `holes_number_x`, `holes_number_y`, `shift_x`, `shift_y`, `fill_value`, `mask_fill_value`
- Use: `unit_size_range`, `holes_number_xy`, `fill`, `fill_mask`
MaskDropout
- Deleted: `image_fill_value`, `mask_fill_value`
- Use: `fill`, `fill_mask`
XYMasking
- Deleted: `mask_fill_value`, `fill_value`
- Use: `fill`, `fill_mask`
Rotate
- Deleted: `value`, `mask_value`
- Use: `fill`, `fill_mask`
Changed default value for `border_mode` from `cv2.BORDER_REFLECT_101` to `cv2.BORDER_CONSTANT`
SafeRotate
- Deleted: `value`, `mask_value`
- Use: `fill`, `fill_mask`
Changed default value for `border_mode` from `cv2.BORDER_REFLECT_101` to `cv2.BORDER_CONSTANT`
ElasticTransform
- Deleted: `border_mode`, `value`, `mask_value`
Perspective
- Deleted: `pad_mode`, `pad_val`, `mask_pad_val`
Affine
- Deleted: `cval`, `cval_mask`, `mode`
- Use: `fill`, `fill_mask`, `border_mode`
ShiftScaleRotate
- Deleted: `value`, `mask_value`
- Use: `fill`, `fill_mask`
Changed default `border_mode` from `cv2.BORDER_REFLECT_101` to `cv2.BORDER_CONSTANT`
PiecewiseAffine
- Deleted: `cval`, `cval_mask`, `mode`, `keypoints_threshold`
OpticalDistortion
- Deleted: `shift_limit`, `value`, `mask_value`, `border_mode`
GridDistortion
- Deleted: `value`, `mask_value`, `border_mode`
RandomRotate90
Changed default probability from `p=0.5` to `p=1`
PadIfNeeded
- Deleted: `value`, `mask_value`
- Use: `fill`, `fill_mask`
Changed default value for `border_mode` from `cv2.BORDER_REFLECT_101` to `cv2.BORDER_CONSTANT`
ImageCompression
- Deleted: `quality_lower`, `quality_upper`
- Use: `quality_range`
RandomSnow
- Deleted: `snow_point_lower`, `snow_point_upper`
- Use: `snow_point_range`
RandomRain
- Deleted: `slant_lower`, `slant_upper`
- Use: `slant_range`
RandomFog
- Deleted: `fog_coef_lower`, `fog_coef_upper`
- Use: `fog_coef_range`
RandomSunFlare
- Deleted: `angle_lower`, `angle_upper`, `num_flare_circles_lower`, `num_flare_circles_upper`
- Use: `num_flare_circles_range`, `angle_range`
RandomShadow
- Deleted: `num_shadows_lower`, `num_shadows_upper`
- Use: `num_shadows_limit`
Solarize
- Deleted: `threshold`
- Use: `threshold_range`
Downscale
- Deleted: `interpolation`, `scale_min`, `scale_max`
- Use: `interpolation_pair`, `scale_range`
by @ternaus
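When migrating stored configurations, some of the renames listed above can be applied mechanically. The helper below is a hypothetical sketch that covers only a subset of them. Renames are transform-specific (for example, `value` was renamed for Rotate but simply deleted for ElasticTransform), so a real migration should be checked per transform.

```python
# Simple one-to-one renames taken from the lists above (a subset, not exhaustive).
RENAMED = {
    "value": "fill",
    "mask_value": "fill_mask",
    "fill_value": "fill",
    "mask_fill_value": "fill_mask",
}
# (lower, upper) pairs that were merged into a single range parameter.
MERGED = {
    ("quality_lower", "quality_upper"): "quality_range",
    ("snow_point_lower", "snow_point_upper"): "snow_point_range",
    ("slant_lower", "slant_upper"): "slant_range",
    ("fog_coef_lower", "fog_coef_upper"): "fog_coef_range",
}


def migrate_kwargs(old_kwargs):
    """Return a copy of old-style keyword arguments rewritten to the new names."""
    new = dict(old_kwargs)
    for (lower, upper), target in MERGED.items():
        if lower in new and upper in new:
            new[target] = (new.pop(lower), new.pop(upper))
    for old_name, new_name in RENAMED.items():
        if old_name in new:
            new[new_name] = new.pop(old_name)
    # always_apply was deleted; p=1 expresses the same intent.
    if new.pop("always_apply", False):
        new["p"] = 1
    return new


print(migrate_kwargs({"quality_lower": 50, "quality_upper": 90, "always_apply": True}))
# {'quality_range': (50, 90), 'p': 1}
```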
Small improvements
- Fixed links in readme by @guspan-tanadi
- Better bounding box processing in Dropouts
Albumentations 1.4.24 Release Notes
- Core
- Transforms
- Bugfixes
Core
- Added new keypoints format `xyz` for ImageOnly and Dual transforms (the z coordinate stays unchanged)
Transforms
New transform AtLeastOneBBoxRandomCrop
Crop an area from image while ensuring at least one bounding box is present in the crop.
![Screenshot 2024-12-24 at 1 46 24 PM](https://private-user-images.githubusercontent.com/5481618/398484744-21fa3971-3aa2-442e-bdfa-a076541c1096.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk1OTI0NDYsIm5iZiI6MTczOTU5MjE0NiwicGF0aCI6Ii81NDgxNjE4LzM5ODQ4NDc0NC0yMWZhMzk3MS0zYWEyLTQ0MmUtYmRmYS1hMDc2NTQxYzEwOTYucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI1MDIxNSUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNTAyMTVUMDQwMjI2WiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9OWZmZWQ1ZTNmNmMwYzM1NTg1MzlmMDMwMzY3YmFlNzk4OTUwNWQwNjRlOTljOTIxMjFhYWYwNDA5MTAwNDgzOCZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QifQ.uhXPLqK-LAFPNcHtXlaRpadvzpzqRtBz7XCV_Cd5jRM)
Improvements
- SmallestMaxSize: Added option for separate max_size for height/width
- LongestMaxSize: Added option for separate max_size for height/width
- Added keypoints support to: `CenterCrop3D`, `CoarseDropout3D`, `CubicSymmetry`, `Pad3D`, `PadIfNeeded3D`, `RandomCrop3D` (by @ternaus)
Bugfixes
- Do not import `eval-type-backport` for python 3.10 and older. By @PerchunPak
- Bugfix in `ToTensorV2` by @matejpekar
Albumentations 1.4.23 Release Notes
- Core
- Transforms
- Bugfixes
Core
Target `images` as numpy array
Now supports numpy arrays with shape `(num_images, height, width, num_channels)` or `(num_images, height, width)` as `images` in `Compose`
- Ideal for video processing applications
- Same transform applies to all images in the array
New 3D Data Support
- volume: `(depth, height, width)` or `(depth, height, width, num_channels)`
- mask3d: `(depth, height, width)` or `(depth, height, width, num_channels)`
- volumes: `(num_volumes, depth, height, width)` for batch processing
- masks3d: `(num_volumes, depth, height, width)` for batch processing
import numpy as np

# `transform` is an A.Compose pipeline built from 3D transforms
volume = np.random.rand(96, 256, 256)  # Your 3D medical volume
mask = np.zeros((96, 256, 256))  # Your 3D segmentation mask
transformed = transform(volume=volume, mask3d=mask)
transformed_volume = transformed['volume']
transformed_mask = transformed['mask3d']
Transforms
Added 3D transforms by @ternaus
Padding & Cropping
- Pad3D: Pad 3D volumes with flexible padding options
- PadIfNeeded3D: Conditional padding to meet minimum dimensions or divisibility requirements
- CenterCrop3D: Center cropping for 3D volumes
- RandomCrop3D: Random cropping of 3D volumes
import albumentations as A
import numpy as np

transform = A.Compose([
    # Crop volume to a fixed size for memory efficiency
    A.RandomCrop3D(size=(64, 128, 128), p=1.0),
    # Randomly remove cubic regions to simulate occlusions
    A.CoarseDropout3D(
        num_holes_range=(2, 6),
        hole_depth_range=(0.1, 0.3),
        hole_height_range=(0.1, 0.3),
        hole_width_range=(0.1, 0.3),
        p=0.5,
    ),
])

volume = np.random.rand(96, 256, 256)  # Your 3D medical volume
mask = np.zeros((96, 256, 256))  # Your 3D segmentation mask
transformed = transform(volume=volume, mask3d=mask)
transformed_volume = transformed['volume']
transformed_mask = transformed['mask3d']
Augmentation
- CoarseDropout3D: Random cuboid dropout regions for occlusion simulation
- CubicSymmetry: 48 possible cube symmetry transformations (24 rotations + 24 rotoreflections)
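The count of 48 can be checked independently of the library. Every symmetry of a cube corresponds to a signed permutation of the three axes, and its determinant tells a rotation (+1) from a rotoreflection (-1). The enumeration below is a stand-alone sketch and says nothing about how CubicSymmetry is implemented.

```python
from itertools import permutations, product


def permutation_sign(perm):
    """Return +1 for an even permutation and -1 for an odd one (by counting inversions)."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


symmetries = []
for perm in permutations(range(3)):          # which axis goes where
    for signs in product((1, -1), repeat=3):  # whether each axis is flipped
        # Determinant of a signed permutation matrix.
        det = permutation_sign(perm) * signs[0] * signs[1] * signs[2]
        symmetries.append((perm, signs, det))

rotations = [s for s in symmetries if s[2] == 1]
rotoreflections = [s for s in symmetries if s[2] == -1]
print(len(symmetries), len(rotations), len(rotoreflections))  # 48 24 24
```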
Fixes
- Added flexible brightness in RandomSunFlare by @momincks
- Bugfix in CenterCrop, RandomCrop by @iRyoka
- Fix in Normalize docstring by @mennohofste
Albumentations 1.4.22 Release Notes
- Transforms
- Core
- Bugfixes
Transforms
Elastic Transform
- Added argument `noise_distribution` that allows sampling displacement fields from `gaussian` and from `uniform` distributions.
- Deprecated parameters `border_mode`, `value`, `mask_value`: you can still specify them, but they will have no effect.
New transform ShotNoise
![Screenshot 2024-12-06 at 10 34 34](https://private-user-images.githubusercontent.com/5481618/393365867-b1fd6ffc-ed35-4065-bafa-9ea679eea176.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk1OTI0NDYsIm5iZiI6MTczOTU5MjE0NiwicGF0aCI6Ii81NDgxNjE4LzM5MzM2NTg2Ny1iMWZkNmZmYy1lZDM1LTQwNjUtYmFmYS05ZWE2NzllZWExNzYucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI1MDIxNSUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNTAyMTVUMDQwMjI2WiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9OWIyOTIxOTFiZGZlNjk3OWVhM2RlYjUwODc1ZDA4MDdkNWI2OTBiNjZiZWU1OWM2ZDFhODkyYzc4ZjVkYmMwNiZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QifQ.iXYYCxKCgjvFGd9L9pmE8Ch65GTd7LPTipBY-YpziKQ)
Apply shot noise to the image by modeling photon counting as a Poisson process.
Shot noise (also known as Poisson noise) occurs in imaging due to the quantum nature of light.
When photons hit an imaging sensor, they arrive at random times following Poisson statistics.
This transform simulates this physical process in linear light space by:
1. Converting to linear space (removing gamma)
2. Treating each pixel value as an expected photon count
3. Sampling actual photon counts from a Poisson distribution
4. Converting back to display space (reapplying gamma)
The noise characteristics follow real camera behavior:
- Noise variance equals signal mean in linear space (Poisson statistics)
- Brighter regions have more absolute noise but less relative noise
- Darker regions have less absolute noise but more relative noise
- Noise is generated independently for each pixel and color channel
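The four steps above can be sketched with NumPy. The function name `shot_noise`, the gamma of 2.2, and the `photons_at_white` scale (expected photon count for a fully white pixel) are assumptions made for this illustration and are not taken from the ShotNoise transform itself.

```python
import numpy as np


def shot_noise(image, photons_at_white=100.0, gamma=2.2, rng=None):
    """Apply Poisson (shot) noise to a uint8 image in linear light space."""
    rng = np.random.default_rng() if rng is None else rng
    # 1. Convert to linear space (remove gamma)
    linear = (image.astype(np.float64) / 255.0) ** gamma
    # 2. Treat each pixel value as an expected photon count
    expected = linear * photons_at_white
    # 3. Sample actual photon counts, independently per pixel and channel
    counts = rng.poisson(expected)
    # 4. Convert back to display space (reapply gamma)
    noisy = np.clip(counts / photons_at_white, 0.0, 1.0) ** (1.0 / gamma)
    return np.round(noisy * 255.0).astype(np.uint8)
```

Lowering `photons_at_white` makes the noise stronger, and a pure black image stays black because a Poisson variable with mean zero is always zero.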
RandomGridShuffle
Added support for bounding boxes
![Screenshot 2024-12-06 at 10 38 44](https://private-user-images.githubusercontent.com/5481618/393366833-e7fbeac0-f92b-4097-838f-d5ddaab9c68f.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk1OTI0NDYsIm5iZiI6MTczOTU5MjE0NiwicGF0aCI6Ii81NDgxNjE4LzM5MzM2NjgzMy1lN2ZiZWFjMC1mOTJiLTQwOTctODM4Zi1kNWRkYWFiOWM2OGYucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI1MDIxNSUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNTAyMTVUMDQwMjI2WiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9OGExOWMwNTQ5NDQzOGJlYjhkMjRjYWI2YzhhZWQ2YzgxOWIzZjA1ODQ3NTFhNDk0ODU2YjI2NGVmMzRhNmU4NiZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QifQ.X_eqQGpkETnnB-z-5nSLuHy6yCv1ClfjFVSozyIntp4)
CoarseDropout
Added an option to inpaint holes using `inpaint_ns` and `inpaint_telea` from OpenCV
GridDropout
Added an option to inpaint holes using `inpaint_ns` and `inpaint_telea` from OpenCV
MaskDropout
Added an option to inpaint holes using `inpaint_ns` and `inpaint_telea` from OpenCV
XYMasking
Added an option to inpaint holes using `inpaint_ns` and `inpaint_telea` from OpenCV
New transform TimeReverse
Reverse the time axis of a spectrogram image, also known as time inversion.
Time inversion of a spectrogram is analogous to the random flip of an image,
an augmentation technique widely used in the visual domain. This can be relevant
in the context of audio classification tasks when working with spectrograms.
The technique was successfully applied in the AudioCLIP paper, which extended
CLIP to handle image, text, and audio inputs.
This transform is implemented as a subclass of HorizontalFlip since reversing
time in a spectrogram is equivalent to flipping the image horizontally.
New transform TimeMasking
Apply masking to a spectrogram in the time domain.
This transform masks random segments along the time axis of a spectrogram,
implementing the time masking technique proposed in the SpecAugment paper.
Time masking helps in training models to be robust against temporal variations
and missing information in audio signals.
This is a specialized version of XYMasking configured for time masking only.
For more advanced use cases (e.g., multiple masks, frequency masking, or custom
fill values), consider using XYMasking directly.
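A minimal sketch of time masking on a `(frequency, time)` array is shown below. The function name `time_mask` and its parameters are hypothetical and chosen for this example; the library transform is configured through its own arguments.

```python
import numpy as np


def time_mask(spectrogram, max_width, fill=0.0, rng=None):
    """Mask one random segment along the time axis (axis 1) of a (freq, time) array."""
    rng = np.random.default_rng() if rng is None else rng
    n_time = spectrogram.shape[1]
    # Segment width in [0, max_width], never longer than the spectrogram itself.
    width = int(rng.integers(0, min(max_width, n_time) + 1))
    # Start position chosen so that the segment fits entirely inside the time axis.
    start = int(rng.integers(0, n_time - width + 1))
    out = spectrogram.copy()
    out[:, start:start + width] = fill  # every frequency bin is masked in this interval
    return out
```

Masking along axis 0 instead of axis 1 would give the frequency-masking counterpart.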
New transform FrequencyMasking
Apply masking to a spectrogram in the frequency domain.
This transform masks random segments along the frequency axis of a spectrogram,
implementing the frequency masking technique proposed in the SpecAugment paper.
Frequency masking helps in training models to be robust against frequency variations
and missing spectral information in audio signals.
This is a specialized version of XYMasking configured for frequency masking only.
For more advanced use cases (e.g., multiple masks, time masking, or custom
fill values), consider using XYMasking directly.
It has a similar API to FrequencyMasking from torchaudio.
New Transform Pad
![Screenshot 2024-12-06 at 11 19 42](https://private-user-images.githubusercontent.com/5481618/393376646-60d597ac-9c3a-4324-9b30-d66c37c6dd18.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk1OTI0NDYsIm5iZiI6MTczOTU5MjE0NiwicGF0aCI6Ii81NDgxNjE4LzM5MzM3NjY0Ni02MGQ1OTdhYy05YzNhLTQzMjQtOWIzMC1kNjZjMzdjNmRkMTgucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI1MDIxNSUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNTAyMTVUMDQwMjI2WiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9MTM1NzU1ZGQ5ODI5ODZhNTM2N2U0M2MyZGMzYjYyODc4MzJmYmVhMDExM2I5YzBhYmVjOGMwMGQ3ODlmZjJlNiZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QifQ.TAL9C3w6Bb7hXxeHA7SUSaLKyfNeqCPuDw4ogSyBUIo)
Pad the sides of an image by specified number of pixels.
Args:
padding (int, tuple[int, int] or tuple[int, int, int, int]): Padding values. Can be:
* int - pad all sides by this value
* tuple[int, int] - (pad_x, pad_y) to pad left/right by pad_x and top/bottom by pad_y
* tuple[int, int, int, int] - (left, top, right, bottom) specific padding per side
This is the generalization of the torchvision transform with the same name
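The three accepted forms of `padding` can be normalized to a single `(left, top, right, bottom)` tuple. The helpers `normalize_padding` and `padded_size` below are hypothetical and only illustrate the semantics described in the docstring.

```python
def normalize_padding(padding):
    """Return (left, top, right, bottom) for any of the three accepted padding forms."""
    if isinstance(padding, int):
        return (padding, padding, padding, padding)
    if len(padding) == 2:
        pad_x, pad_y = padding  # left/right use pad_x, top/bottom use pad_y
        return (pad_x, pad_y, pad_x, pad_y)
    if len(padding) == 4:
        return tuple(padding)
    raise ValueError("padding must be an int, a 2-tuple or a 4-tuple")


def padded_size(height, width, padding):
    """Return the (height, width) of an image after padding."""
    left, top, right, bottom = normalize_padding(padding)
    return (height + top + bottom, width + left + right)


print(padded_size(100, 200, (2, 3)))  # (106, 204)
```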
New Transform Erasing
![Screenshot 2024-12-06 at 11 23 25](https://private-user-images.githubusercontent.com/5481618/393377569-8bf42b14-7c09-4bb3-8e61-2ea7b1ea16e7.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk1OTI0NDYsIm5iZiI6MTczOTU5MjE0NiwicGF0aCI6Ii81NDgxNjE4LzM5MzM3NzU2OS04YmY0MmIxNC03YzA5LTRiYjMtOGU2MS0yZWE3YjFlYTE2ZTcucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI1MDIxNSUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNTAyMTVUMDQwMjI2WiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9NzBhMWQ5MDZjOThkOTM2M2UyNzk3YjllNDhmOTM3OWJiZWVkMTFiOTQxODdjYjNiNDVkODliZTgyOTUyMjM1MCZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QifQ.5SB6JLmTDcR-bsbyWJRDaok9hmeorW5HWR-PyA7HZWE)
This is the generalization of the similar torchvision transform
Randomly erases rectangular regions in an image, following the Random Erasing Data Augmentation technique.
This augmentation helps improve model robustness by randomly masking out rectangular regions in the image,
simulating occlusions and encouraging the model to learn from partial information. It's particularly
effective for image classification and person re-identification tasks.
New Transform AdditiveNoise
![Screenshot 2024-12-06 at 11 26 17](https://private-user-images.githubusercontent.com/5481618/393378388-557c6dff-01a7-4fe2-a0fd-9073568bcd87.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk1OTI0NDYsIm5iZiI6MTczOTU5MjE0NiwicGF0aCI6Ii81NDgxNjE4LzM5MzM3ODM4OC01NTdjNmRmZi0wMWE3LTRmZTItYTBmZC05MDczNTY4YmNkODcucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI1MDIxNSUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNTAyMTVUMDQwMjI2WiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9YWE0NTI5MmY2M2Q2MDE0ZGU4NjY5MzlhYWJmNzNmNGIzNmI4MmQyZGUyOTE5MDM0ZWYwODhhZTVjYTE2NTYyYSZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QifQ.UzkGiGClbvuXxlZBF8UuzCQduwM34hqft4u6F3xDgJg)
Apply random noise to image channels using various noise distributions.
This transform generates noise using different probability distributions and applies it
to image channels. The noise can be generated in three spatial modes and supports
multiple noise distributions, each with configurable parameters.
Args:
noise_type: Type of noise distribution to use. Options:
- "uniform": Uniform distribution, good for simple random perturbations
- "gaussian": Normal distribution, models natural random processes
- "laplace": Similar to Gaussian but with heavier tails, good for outliers
- "beta": Flexible bounded distribution, can be symmetric or skewed
spatial_mode: How to generate and apply the noise. Options:
- "constant": One noise value per channel, fastest
- "per_pixel": Independent noise value for each pixel and channel, slowest
- "shared": One noise map shared across all channels, medium speed
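The three spatial modes differ only in the shape of the sampled noise array, which is then broadcast over the image. The sketch below uses uniform noise on a float image in `[0, 1]`. The function names and the `low`/`high` parameters are assumptions for this example, not the transform's actual arguments.

```python
import numpy as np


def noise_shape(image_shape, spatial_mode):
    """Shape of the noise array for an (H, W, C) image in each spatial mode."""
    height, width, channels = image_shape
    if spatial_mode == "constant":
        return (1, 1, channels)           # one value per channel
    if spatial_mode == "per_pixel":
        return (height, width, channels)  # independent value per pixel and channel
    if spatial_mode == "shared":
        return (height, width, 1)         # one map shared across channels
    raise ValueError(f"unknown spatial_mode: {spatial_mode}")


def add_uniform_noise(image, spatial_mode, low=-0.1, high=0.1, rng=None):
    """Add uniform noise to a float image in [0, 1]; broadcasting expands the noise."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.uniform(low, high, size=noise_shape(image.shape, spatial_mode))
    return np.clip(image + noise, 0.0, 1.0)
```

The speed ordering quoted above follows from the number of random values drawn: `C` for "constant", `H*W` for "shared", and `H*W*C` for "per_pixel".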
Sharpen
Added 'gaussian' method for image sharpening.
New transform SaltAndPepper
![Screenshot 2024-12-06 at 11 52 54](https://private-user-images.githubusercontent.com/5481618/393384908-b93c1863-7db1-4aac-ba18-97cefad43dad.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk1OTI0NDYsIm5iZiI6MTczOTU5MjE0NiwicGF0aCI6Ii81NDgxNjE4LzM5MzM4NDkwOC1iOTNjMTg2My03ZGIxLTRhYWMtYmExOC05N2NlZmFkNDNkYWQucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI1MDIxNSUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNTAyMTVUMDQwMjI2WiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9NzE5YjNhNmMyNzZlZDYyMGI3NzNkNzBmZjY0OGJiMGZiMmExNzg4NmJlNzVmMGY4YTUxNmI5ZGEzMzQ0Y2UwMiZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QifQ.8iTsVmnayORoHByJ9_NZQJVBda22ooZ9bZEpPR-yF5w)
Apply salt and pepper noise to the input image.
Salt and pepper noise is a form of impulse noise that randomly sets pixels to either maximum value (salt)
or minimum value (pepper). The amount and proportion of salt vs pepper noise can be controlled.
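A minimal NumPy sketch of the idea for uint8 images is shown below. The parameter names `amount` (fraction of affected pixels) and `salt_vs_pepper` (share of salt among them) are hypothetical and chosen for this example.

```python
import numpy as np


def salt_and_pepper(image, amount=0.05, salt_vs_pepper=0.5, rng=None):
    """Set a random fraction of pixels to 255 (salt) or 0 (pepper)."""
    rng = np.random.default_rng() if rng is None else rng
    out = image.copy()
    height, width = image.shape[:2]
    # One uniform draw per pixel decides whether it becomes salt, pepper, or stays as is.
    draw = rng.random((height, width))
    salt = draw < amount * salt_vs_pepper
    pepper = (draw >= amount * salt_vs_pepper) & (draw < amount)
    out[salt] = 255   # for (H, W, C) images all channels of the pixel are set
    out[pepper] = 0
    return out
```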
New transform PlasmaBrightnessContrast
![Screenshot 2024-12-06 at 11 54 34](https://private-user-images.githubusercontent.com/5481618/393385429-b783a2ad-3757-401d-8964-29728d829dd3.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk1OTI0NDYsIm5iZiI6MTczOTU5MjE0NiwicGF0aCI6Ii81NDgxNjE4LzM5MzM4NTQyOS1iNzgzYTJhZC0zNzU3LTQwMWQtODk2NC0yOTcyOGQ4MjlkZDMucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI1MDIxNSUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNTAyMTVUMDQwMjI2WiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9NzYwOGNmMTM2ZTBhMGY2MmIzNDlkYmRjNGZmMjE2MDcxODMxMDYwMjE2MjAzMWE0ZWFhMjFjZThlY2I1OTk4NSZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QifQ.LRQDt68qaA18Ru1p6jLGAMBvj-5CjZd71f8dUjEcy6c)
Apply plasma fractal pattern to modify image brightness and contrast.
This transform uses the Diamond-Square algorithm to generate organic-looking fractal patterns
that are then used to create spatially-varying brightness and contrast adjustments.
The result is a natural-looking, non-uniform modification of the image.
New Transform PlasmaShadow
<img width="118...
Albumentations 1.4.21 Release Notes
- Support Our Work
- Transforms
- Core
- Benchmark
- Speedups
Support Our Work
- Love the library? You can contribute to its development by becoming a sponsor. Your support is invaluable, and every contribution makes a difference.
- Haven't starred our repo yet? Show your support with a ⭐! It's just one mouse click away.
- Got ideas or facing issues? We'd love to hear from you. Share your thoughts in our issues or join the conversation on our Discord server.
Transforms
Auto padding in crops
Added an option to pad the image if the crop size is larger than the image size
Old way
[
    A.PadIfNeeded(min_height=1024, min_width=1024, p=1),
    A.RandomCrop(height=1024, width=1024, p=1)
]
New way:
A.RandomCrop(height=1024, width=1024, p=1, pad_if_needed=True)
Works for:
You may also use it to pad image to a desired size.
Core
Random state
Now random state for the pipeline does not depend on the global random state
Before
random.seed(seed)
np.random.seed(seed)
transform = A.Compose(...)
Now
transform = A.Compose(seed=seed, ...)
or
transform = A.Compose(...)
transform.set_random_seed(seed)
Saving used parameters
Now you can get exact parameters that were used in the pipeline on a given sample with
transform = A.Compose(save_applied_params=True, ...)
result = transform(image=image, bboxes=bboxes, mask=mask, keypoints=keypoints)
print(result["applied_transforms"])
Benchmark
Moved benchmark to a separate repo
https://github.com/albumentations-team/benchmark/
Current result for uint8 images:
Transform | albumentations 1.4.20 | augly 1.0.0 | imgaug 0.4.0 | kornia 0.7.3 | torchvision 0.20.0 |
---|---|---|---|---|---|
HorizontalFlip | 8325 ± 955 | 4807 ± 818 | 6042 ± 788 | 390 ± 106 | 914 ± 67 |
VerticalFlip | 20493 ± 1134 | 9153 ± 1291 | 10931 ± 1844 | 1212 ± 402 | 3198 ± 200 |
Rotate | 1272 ± 12 | 1119 ± 41 | 1136 ± 218 | 143 ± 11 | 181 ± 11 |
Affine | 967 ± 3 | - | 774 ± 97 | 147 ± 9 | 130 ± 12 |
Equalize | 961 ± 4 | - | 581 ± 54 | 152 ± 19 | 479 ± 12 |
RandomCrop80 | 118946 ± 741 | 25272 ± 1822 | 11503 ± 441 | 1510 ± 230 | 32109 ± 1241 |
ShiftRGB | 1873 ± 252 | - | 1582 ± 65 | - | - |
Resize | 2365 ± 153 | 611 ± 78 | 1806 ± 63 | 232 ± 24 | 195 ± 4 |
RandomGamma | 8608 ± 220 | - | 2318 ± 269 | 108 ± 13 | - |
Grayscale | 3050 ± 597 | 2720 ± 932 | 1681 ± 156 | 289 ± 75 | 1838 ± 130 |
RandomPerspective | 410 ± 20 | - | 554 ± 22 | 86 ± 11 | 96 ± 5 |
GaussianBlur | 1734 ± 204 | 242 ± 4 | 1090 ± 65 | 176 ± 18 | 79 ± 3 |
MedianBlur | 862 ± 30 | - | 813 ± 30 | 5 ± 0 | - |
MotionBlur | 2975 ± 52 | - | 612 ± 18 | 73 ± 2 | - |
Posterize | 5214 ± 101 | - | 2097 ± 68 | 430 ± 49 | 3196 ± 185 |
JpegCompression | 845 ± 61 | 778 ± 5 | 459 ± 35 | 71 ± 3 | 625 ± 17 |
GaussianNoise | 147 ± 10 | 67 ± 2 | 206 ± 11 | 75 ± 1 | - |
Elastic | 171 ± 15 | - | 235 ± 20 | 1 ± 0 | 2 ± 0 |
Clahe | 423 ± 10 | - | 335 ± 43 | 94 ± 9 | - |
CoarseDropout | 11288 ± 609 | - | 671 ± 38 | 536 ± 87 | - |
Blur | 4816 ± 59 | 246 ± 3 | 3807 ± 325 | - | - |
ColorJitter | 536 ± 41 | 255 ± 13 | - | 55 ± 18 | 46 ± 2 |
Brightness | 4443 ± 84 | 1163 ± 86 | - | 472 ± 101 | 429 ± 20 |
Contrast | 4398 ± 143 | 736 ± 79 | - | 425 ± 52 | 335 ± 35 |
RandomResizedCrop | 2952 ± 24 | - | - | 287 ± 58 | 511 ± 10 |
Normalize | 1016 ± 84 | - | - | 626 ± 40 | 519 ± 12 |
PlanckianJitter | 1844 ± 208 | - | - | 813 ± 211 | - |
Speedups
- Speedup in PlanckianJitter in uint8 mode
- Replaced `cv2.addWeighted` with `wsum` from the simsimd package
Albumentations 1.4.20 Release Notes
Hotfix version.
- Fix in check_version
- Fix in PiecewiseAffine
- Fix in RandomSizedCrop and RandomResizedCrop
- Fix in `RandomOrder`