Misleading noise when using geom_jitter in combination with facet_wrap #3722

Open
hkosm opened this issue Jan 10, 2020 · 1 comment · May be fixed by #6330
Labels: feature (a feature request or enhancement), layers 📈


hkosm commented Jan 10, 2020

Hello,

I mostly use geom_jitter to plot discrete data and avoid overplotting.

When combining geom_jitter with facet_wrap and free scales, I ran into a problem: because the scales differ between panels, the jittering looks misleading.

In the example below, I would like to see approximately equal noise in both facets, as I use jittering purely to make the data points visible. Because of the free scales in facet_wrap, however, the jittering looks different in the two facets.
How could this be approached? Could this be an issue inside facet_wrap?

library(ggplot2)

set.seed(123)
df <- data.frame(
  y = rnorm(100, 50, 10),
  x = c(rep(1:5, 10), rep(seq(0, 100, length.out = 5), 10)),
  group = c(rep("a", times = 50), rep("b", times = 50))
)

ggplot(df, aes(x = x, y = y)) +
  geom_point(color = "orange", alpha = 0.7) +
  geom_point(position = "jitter", color = "blue", alpha = 0.7) +
  facet_wrap( ~ group, scales = "free")

Created on 2020-01-10 by the reprex package (v0.3.0)

hkosm changed the title from "Misleading noise when using geom_jitter in combination with face_wrap" to "Misleading noise when using geom_jitter in combination with facet_wrap" on Jan 10, 2020
yutannihilation (Member) commented

Thanks. The default jitter width and height are calculated from the whole data, not from the data per PANEL. It seems possible to delay the calculation to compute_layer() so that the PANEL column in data can be taken into account. (But I'm not yet sure which behaviour is semantically correct.)

setup_params = function(self, data) {
  list(
    width = self$width %||% (resolution(data$x, zero = FALSE) * 0.4),
    height = self$height %||% (resolution(data$y, zero = FALSE) * 0.4),
    seed = self$seed
  )
},
compute_layer = function(self, data, params, layout) {
  trans_x <- if (params$width > 0) function(x) jitter(x, amount = params$width)
  trans_y <- if (params$height > 0) function(x) jitter(x, amount = params$height)
  with_seed_null(params$seed, transform_position(data, trans_x, trans_y))
}
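
Until this is settled, one possible workaround is to jitter the data per facet group before plotting. The sketch below is only an illustration of that idea, not the fix proposed for ggplot2: it reuses the `resolution(x, zero = FALSE) * 0.4` formula from the excerpt above, but evaluates it separately for each group. `jitter_by_group()` and the `x_jit`/`y_jit` columns are hypothetical names introduced here, and it assumes each facet corresponds to exactly one value of `group`.

```r
library(ggplot2)

set.seed(123)
df <- data.frame(
  y = rnorm(100, 50, 10),
  x = c(rep(1:5, 10), rep(seq(0, 100, length.out = 5), 10)),
  group = c(rep("a", times = 50), rep("b", times = 50))
)

# Default width as in setup_params(): a single value for the whole data.
# The smallest gap between x values is 1, so this is 0.4.
width_all <- resolution(df$x, zero = FALSE) * 0.4

# The same formula evaluated per facet group: 0.4 for "a", 10 for "b".
width_by_group <- sapply(split(df$x, df$group), function(x) {
  resolution(x, zero = FALSE) * 0.4
})

# Hypothetical helper (not part of ggplot2): jitter x and y within each
# group, using an amount derived from that group's own resolution.
jitter_by_group <- function(data, group) {
  pieces <- lapply(split(data, data[[group]]), function(d) {
    d$x_jit <- jitter(d$x, amount = resolution(d$x, zero = FALSE) * 0.4)
    d$y_jit <- jitter(d$y, amount = resolution(d$y, zero = FALSE) * 0.4)
    d
  })
  do.call(rbind, pieces)
}

df_jit <- jitter_by_group(df, "group")

# Plot the pre-jittered columns with plain geom_point(), so no further
# jitter is applied by the layer. Use print(p) to draw the plot.
p <- ggplot(df_jit, aes(y = y)) +
  geom_point(aes(x = x), color = "orange", alpha = 0.7) +
  geom_point(aes(x = x_jit, y = y_jit), color = "blue", alpha = 0.7) +
  facet_wrap(~ group, scales = "free")
```

With the whole-data width of 0.4, the noise is visible in facet "a" (x from 1 to 5) but almost invisible in facet "b" (x from 0 to 100). With per-group widths the horizontal spread is about the same fraction of the spacing between x values in both facets. Whether that is the semantically correct default is exactly the open question above.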

thomasp85 added the feature (a feature request or enhancement) and layers 📈 labels on Jan 21, 2020
On Feb 12, 2025, teunbrand linked a pull request that will close this issue