Error in keypoints when using large optical distortions #2285

Open
ThomasieDK opened this issue Jan 21, 2025 · 11 comments · Fixed by #2286
Labels
bug Something isn't working

Comments


ThomasieDK commented Jan 21, 2025

Describe the bug

When applying large deformations with A.OpticalDistortion, the keypoints are not transformed correctly.

To Reproduce

OS: Windows
Python version 3.11
Albumentations V2.0.0

Steps to reproduce the behavior:

import cv2
import numpy as np
import albumentations as A
from albumentations.core.composition import KeypointParams
import matplotlib.pyplot as plt

# ---------------------------
# A) Generate a Checkerboard
# ---------------------------
def generate_checkerboard(rows=8, cols=8, square_size=50):
    """
    Single-channel checkerboard: 0=black squares, 255=white squares.
    'rows' and 'cols' = number of squares in each dimension.
    """
    height = rows * square_size
    width = cols * square_size
    cb = np.ones((height, width), dtype=np.uint8) * 255

    for r in range(rows):
        for c in range(cols):
            if (r + c) % 2 == 0:
                y1 = r * square_size
                x1 = c * square_size
                cb[y1:y1 + square_size, x1:x1 + square_size] = 0
    return cb


def get_internal_checker_corners(rows, cols, square_size):
    """
    Return the (rows-1)*(cols-1) 'internal' corners typically used for calibration.
    E.g. for 8x8 squares, we get 7x7=49 points.
    """
    points = []
    for r in range(1, rows):
        for c in range(1, cols):
            x = c * square_size
            y = r * square_size
            points.append((x, y))
    return points


transform = A.Compose([
    # A.RandomCrop(width=330, height=330),
    # A.RandomBrightnessContrast(p=0.2),
    A.OpticalDistortion(p=1, mode='fisheye', distort_limit=(-1.05, -1.05),
                        interpolation=4, mask_interpolation=4)
], keypoint_params=A.KeypointParams(format='xy', remove_invisible=True,
                                    angle_in_degrees=True,
                                    check_each_transform=True))

image = generate_checkerboard(rows=16, cols=16, square_size=25)
keypoints = get_internal_checker_corners(rows=16, cols=16, square_size=25)
keypoints_orig = [(x, y, 0, 1) for (x, y) in keypoints]
transformed = transform(image=image, keypoints=keypoints)
transformed_image = transformed['image']
transformed_keypoints = transformed['keypoints']
corners_distorted = [(x, y) for (x, y) in transformed_keypoints]
corners_distorted = np.array(corners_distorted)
keypoints = np.array(keypoints_orig)
fig, axs = plt.subplots(1, 2)
axs[0].imshow(image)
axs[0].scatter(keypoints[:, 0], keypoints[:, 1])
axs[1].imshow(transformed_image)
axs[1].scatter(corners_distorted[:, 0], corners_distorted[:, 1])
plt.show()

Expected behavior

The keypoints should be aligned with the internal checkers.

Actual behavior

The transformed keypoints are not aligned with the transformed image.

Screenshots

Image

@ThomasieDK ThomasieDK added the bug Something isn't working label Jan 21, 2025
@RhysEvan
Contributor

I have found the origin of the issue but am sadly not able to think of a way to correct it...

To track it down, I went into albumentations > augmentations > geometric > functional.py > remap_keypoints.
The original code:

@handle_empty_array("keypoints")
def remap_keypoints(
    keypoints: np.ndarray,
    map_x: np.ndarray,
    map_y: np.ndarray,
    image_shape: tuple[int, int],
) -> np.ndarray:
    height, width = image_shape[:2]

    # Create inverse mappings
    x_inv = np.arange(width).reshape(1, -1).repeat(height, axis=0)
    y_inv = np.arange(height).reshape(-1, 1).repeat(width, axis=1)

    # Extract x and y coordinates
    x, y = keypoints[:, 0], keypoints[:, 1]

    # Clip coordinates to image boundaries
    x = np.clip(x, 0, width - 1, out=x)
    y = np.clip(y, 0, height - 1, out=y)

    # Convert to integer indices
    x_idx, y_idx = x.astype(int), y.astype(int)

    # Apply the inverse mapping
    new_x = x_inv[y_idx, x_idx] + (x - map_x[y_idx, x_idx])
    new_y = y_inv[y_idx, x_idx] + (y - map_y[y_idx, x_idx])

    # Clip the new coordinates to ensure they're within the image bounds
    new_x = np.clip(new_x, 0, width - 1, out=new_x)
    new_y = np.clip(new_y, 0, height - 1, out=new_y)

    # Create the transformed keypoints array
    return np.column_stack([new_x, new_y, keypoints[:, 2:]])

After "extensive" testing and math checking, I simply decided to remove all the extras and use only map_x and map_y for the reprojection:

@handle_empty_array("keypoints")
def remap_keypoints(
    keypoints: np.ndarray,
    map_x: np.ndarray,
    map_y: np.ndarray,
    image_shape: tuple[int, int],
) -> np.ndarray:
    height, width = image_shape[:2]

    # Create inverse mappings
    x_inv = np.arange(width).reshape(1, -1).repeat(height, axis=0)
    y_inv = np.arange(height).reshape(-1, 1).repeat(width, axis=1)
    # Extract x and y coordinates
    x, y = keypoints[:, 0], keypoints[:, 1]
    print("original; ", str(x[0]))
    # Clip coordinates to image boundaries
    x = np.clip(x, 0, width - 1, out=x)
    y = np.clip(y, 0, height - 1, out=y)
    print("clipped; ", str(x[0]))
    # Convert to integer indices
    x_idx, y_idx = x.astype(int), y.astype(int)
    print("inted; ", str(x[0]))
    # Apply the inverse mapping
    new_x = map_x[y_idx, x_idx]
    new_y = map_y[y_idx, x_idx]
    # new_x = x_inv[y_idx, x_idx] + (x - map_x[y_idx, x_idx])
    # new_y = y_inv[y_idx, x_idx] + (y - map_y[y_idx, x_idx])
    print("inv_x; ",str(x_inv[y_idx[0], x_idx[0]]))
    print("map_x; ", str(map_x[y_idx, x_idx][0]))
    print("new_x; ", str(new_x[0]))
    # Clip the new coordinates to ensure they're within the image bounds
    new_x = np.clip(new_x, 0, width - 1, out=new_x)
    new_y = np.clip(new_y, 0, height - 1, out=new_y)
    print("clipped_new_x; ", str(new_x[0]))

    # Create the transformed keypoints array
    return np.column_stack([new_x, new_y, keypoints[:, 2:]])

This produced the inverse of the expected distortion for all the checkerboard keypoints, e.g.:

Image

Comparing that directly with the remap that is actually applied to the image, it is clear that cv2.remap must be doing something that inverts the xy map, which is what causes the inverted keypoints.
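
The inversion described above can be illustrated with a small NumPy-only sketch. It assumes the documented cv2.remap convention, dst(x, y) = src(map_x(x, y), map_y(x, y)), and uses a hypothetical nearest-neighbour helper, remap_nearest, as a stand-in for cv2.remap so the example has no OpenCV dependency. It is not the library code.

```python
import numpy as np

def remap_nearest(src, map_x, map_y):
    """Hypothetical nearest-neighbour stand-in for cv2.remap.

    Implements dst[y, x] = src[map_y[y, x], map_x[y, x]] with clamped borders.
    """
    h, w = src.shape[:2]
    xs = np.clip(np.rint(map_x).astype(int), 0, w - 1)
    ys = np.clip(np.rint(map_y).astype(int), 0, h - 1)
    return src[ys, xs]

h, w = 5, 8
src = np.zeros((h, w), dtype=np.uint8)
src[2, 1] = 255  # one bright pixel at (x=1, y=2), playing the role of a keypoint

# "Output pixel x reads from source pixel x - 3": the content moves 3 px right.
grid_x, grid_y = np.meshgrid(np.arange(w, dtype=np.float32),
                             np.arange(h, dtype=np.float32))
map_x, map_y = grid_x - 3, grid_y

dst = remap_nearest(src, map_x, map_y)
moved_y, moved_x = np.argwhere(dst == 255)[0]  # where the pixel really ended up
looked_up_x = map_x[2, 1]                      # what indexing the map at the keypoint gives

print("pixel moved to x =", moved_x)        # forward direction
print("map lookup gives x =", looked_up_x)  # opposite direction
```

The bright pixel moves right, while reading map_x at the keypoint location points left, so using the maps directly on keypoints applies the opposite of what happens to the image.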

Collaborator

ternaus commented Jan 21, 2025

@ThomasieDK

Thanks, I love bug reports like this. It shows the reporter's deep domain knowledge and the effort he put into debugging.

Will look into it and fix.


CometManAtGitHub commented Jan 22, 2025

Hi,

I can add something for A.ThinPlateSpline. Keypoints are also not correct when using A.ThinPlateSpline.

OS: Linux via Windows WSL-2
Python 3.9.21
Albumentations 2.0.0 (via Conda)

Example:

Image

Albumentations_keypoints.zip

@ternaus ternaus mentioned this issue Jan 22, 2025
Collaborator

ternaus commented Jan 22, 2025

This issue was affecting all distortions: Optical, Elastic, ThinPlateSpline, Grid.

Pushed a fix. Not ideal for points far from the center, but should work better than before.

Two main issues:

  1. OpenCV remap uses the inverse transform, while we need the forward transform for keypoints.
  2. At large distortions, a point can be split into several keypoints that are pretty far from each other. Right now we pick only one of them.

If you have ideas on how to handle this better, I am all ears.
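
As an illustration of the first point, here is a minimal brute-force sketch of turning the inverse maps into a forward mapping for keypoints. The helper name forward_map_keypoints is hypothetical and this is not the merged fix; it only assumes that map_x and map_y follow the cv2.remap convention (each output pixel stores the source coordinate it reads from).

```python
import numpy as np

def forward_map_keypoints(keypoints, map_x, map_y):
    """For each source keypoint (x, y), find the output pixel whose stored
    source coordinate (map_x, map_y) is closest to it.

    Brute force: O(num_keypoints * height * width). Illustrative only.
    """
    h, w = map_x.shape
    out = []
    for x, y in keypoints:
        dist2 = (map_x - x) ** 2 + (map_y - y) ** 2
        # argmin returns a single location even if several output pixels
        # read from the same source point (the "split" case in point 2).
        iy, ix = np.unravel_index(np.argmin(dist2), (h, w))
        out.append((float(ix), float(iy)))
    return np.array(out)

# Toy inverse map: every output pixel reads from 3 px to its left,
# so image content (and keypoints) should move 3 px to the right.
h, w = 6, 10
grid_x, grid_y = np.meshgrid(np.arange(w, dtype=np.float32),
                             np.arange(h, dtype=np.float32))
map_x, map_y = grid_x - 3, grid_y

moved = forward_map_keypoints([(1.0, 2.0), (4.0, 5.0)], map_x, map_y)
print(moved)
```

Because argmin keeps only one candidate, a keypoint whose source location appears in several places of a folded map is silently reduced to one of them, which matches the limitation described in the second point.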

Image

@ternaus ternaus reopened this Jan 23, 2025

CometManAtGitHub commented Jan 23, 2025

Hi.

OK, so there is no earlier version where this worked for points far from the center (some unoptimized version, maybe)?

Thank you very much.

Collaborator

ternaus commented Jan 23, 2025

@RhysEvan created an improved version for the issue. Just merged it.

Collaborator

ternaus commented Jan 23, 2025

@CometManAtGitHub Yes, most likely distortions always had issues with keypoints. And right now it finally looks correct.

@ternaus ternaus closed this as completed Jan 23, 2025

CometManAtGitHub commented Jan 24, 2025

Hi,

The output of the augmentations

  • GridDistortion
  • ThinPlateSpline

for both

  • keypoints_params = A.KeypointParams(format='xy', remove_invisible=remove_invisible) # 'xy'
  • keypoints_params = A.KeypointParams(format='yx', remove_invisible=remove_invisible) # 'yx'

are in (y, x) order.

This is not the case for rotation and affine, for example.

Tested the main branch via pip install and also release 2.0.1.

It is detectable via:
A.GridDistortion(num_steps=1, distort_limit=0.3, normalized = True, interpolation=interpolation, p=p)
A.ThinPlateSpline(scale_range=[0.2,0.4], num_control_points=4, interpolation=interpolation, p=p)

Image

Also the augmentations are a lot slower than before, just to note.

@ternaus ternaus reopened this Jan 24, 2025
Collaborator

ternaus commented Jan 24, 2025

Thanks, will take a look.
Tricky issue. It is more in the scientific domain than in engineering.
We need better tests for this as well.

Collaborator

ternaus commented Jan 26, 2025

@CometManAtGitHub Updated. Fixed the issue with x and y being switched (the example with the cat).

There are now two methods for dealing with keypoints. Neither is ideal, but the mask method is quite fast and works decently well on large distortions.
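
For readers unfamiliar with the idea, the following is a rough sketch of what a mask-based approach could look like. It assumes "mask method" means rasterising each keypoint as a small blob, warping that mask with the same maps as the image, and taking the centroid of the result. The helpers remap_nearest and keypoint_via_mask are hypothetical names and a NumPy stand-in for cv2.remap, not the code that was merged.

```python
import numpy as np

def remap_nearest(mask, map_x, map_y, fill=0):
    """Nearest-neighbour remap with a constant border:
    out[y, x] = mask[map_y[y, x], map_x[y, x]], or `fill` if that is out of frame."""
    h, w = mask.shape
    xs = np.rint(map_x).astype(int)
    ys = np.rint(map_y).astype(int)
    inside = (xs >= 0) & (xs < w) & (ys >= 0) & (ys < h)
    out = np.full(mask.shape, fill, dtype=mask.dtype)
    out[inside] = mask[ys[inside], xs[inside]]
    return out

def keypoint_via_mask(x, y, map_x, map_y, radius=1):
    """Warp a small blob around (x, y) and return the blob centroid, or None."""
    h, w = map_x.shape
    mask = np.zeros((h, w), dtype=np.uint8)
    x0, y0 = int(round(x)), int(round(y))
    mask[max(y0 - radius, 0):y0 + radius + 1,
         max(x0 - radius, 0):x0 + radius + 1] = 1
    warped = remap_nearest(mask, map_x, map_y)
    ys, xs = np.nonzero(warped)
    if xs.size == 0:
        return None  # the whole blob left the frame, so the keypoint vanishes
    return float(xs.mean()), float(ys.mean())

# Toy inverse map: content moves 3 px to the right.
h, w = 8, 12
grid_x, grid_y = np.meshgrid(np.arange(w, dtype=np.float32),
                             np.arange(h, dtype=np.float32))
map_x, map_y = grid_x - 3, grid_y

print(keypoint_via_mask(4.0, 3.0, map_x, map_y))   # blob stays in frame
print(keypoint_via_mask(10.0, 3.0, map_x, map_y))  # blob is pushed out of frame
```

In this sketch the cost is one remap per keypoint, and a keypoint is lost whenever its blob is warped out of the frame or shrinks below one pixel, so the blob radius is a trade-off between precision and robustness.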

@ternaus ternaus closed this as completed Feb 7, 2025

CometManAtGitHub commented Feb 14, 2025

Thank you very much. It works ok and is fast.

Some smaller offsets exist.

But I also noticed a severe error, where landmarks disappeared.

Image

I created a replay to reproduce it; it can be used together with the landmark extraction code for the cat from above. But it is 50 MB large, so here is the augmentation as code instead. The error should appear in about 1 out of 10 cases.

def apply_augmentations_image_keypoints(image, keypoints, width, height, with_replay=False):
    """Apply augmentations to an image and keypoints"""

    interpolation = cv2.INTER_LANCZOS4
    border_mode = cv2.BORDER_CONSTANT

    value = (0, 0, 0)
    fill_image = 0
    rotate_limit = 30

    remove_invisible = True # Remove keypoints which are not visible in the image due to cropping or rotation e.g.
    keypoints_params = A.KeypointParams(format='xy', remove_invisible=remove_invisible) # , label_fields=['class_labels']

    p = 1.0

    # Simple augmentations
    a_rotation = A.Rotate(limit=rotate_limit, interpolation=interpolation, border_mode=border_mode, fill=fill_image, p=p)
    a_affine = A.Affine(scale=(0.9, 1.1), translate_percent=(0.1, 0.1), rotate=(-rotate_limit, rotate_limit), shear=(-10, 10), interpolation=interpolation, border_mode=border_mode, fill=fill_image, p=p)
    a_padifneeded = A.PadIfNeeded(min_height=height, min_width=width, border_mode=border_mode, p=p)
    a_centercrop = A.CenterCrop(height=height, width=width, p=1.0)

    # Complex augmentations
    alpha = 1
    sigma = 50
    a_elastic = A.ElasticTransform(alpha=alpha, sigma=sigma, interpolation=interpolation, p=p)
    num_steps = 2 # 1 ... 5
    distort_limit = 0.3 # 0.1 ... 0.5
    a_griddistortion = A.GridDistortion(num_steps=num_steps, distort_limit=distort_limit, normalized = False, interpolation=interpolation, p=p)
    distort_limit = [0.0, 0.95]
    mode = "fisheye"
    a_opticaldistortion = A.OpticalDistortion(distort_limit=distort_limit, mode=mode, interpolation=interpolation, p=p)
    scale_range = [0.2, 0.4]
    num_control_points = 2
    a_thinplatespline = A.ThinPlateSpline(scale_range=scale_range, num_control_points=num_control_points, interpolation=interpolation, p=p)

    # Sometimes we want repeatable augmentations
    Composer = A.Compose
    if with_replay:
        Composer = A.ReplayCompose

    # Define augmentation pipeline
    augmentations = Composer([
        a_rotation,                 # correct
        a_affine,                   # correct
        a_elastic,                # =============> ERRORS with albumentations < 2.0.4 <=============
        a_griddistortion,         # =============> ERRORS with albumentations < 2.0.4 <=============
        a_opticaldistortion,      # =============> ERRORS with albumentations < 2.0.4 <=============
        a_thinplatespline,        # =============> ERRORS with albumentations < 2.0.4 <=============
        # a_padifneeded,
        # a_centercrop
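    ], keypoint_params=keypoints_params)

    # Apply the pipeline; the result dict holds 'image' and 'keypoints'
    return augmentations(image=image, keypoints=keypoints)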

As you already stated, it will not work for extreme cases, such as when landmarks are duplicated.
I think one can experiment with one's own data and tune the parameters to avoid such cases.

Update: I can reproduce the problem with vanished keypoints by applying only GridDistortion.

@ternaus ternaus reopened this Feb 14, 2025
4 participants