Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize the binaryToDecimal function in the DecimalUtils class #3146

Closed
qian0817 opened this issue Feb 6, 2025 · 0 comments
Closed

Optimize the binaryToDecimal function in the DecimalUtils class #3146

qian0817 opened this issue Feb 6, 2025 · 0 comments

Comments

@qian0817
Copy link

qian0817 commented Feb 6, 2025

Describe the enhancement requested

https://github.com/apache/parquet-java/blob/master/parquet-pig/src/main/java/org/apache/parquet/pig/convert/DecimalUtils.java

  public static BigDecimal binaryToDecimal(Binary value, int precision, int scale) {
    /*
     * Precision <= 18 checks for the max number of digits for an unscaled long,
     * else treat with big integer conversion
     */
    if (precision <= 18) {
      ByteBuffer buffer = value.toByteBuffer();
      byte[] bytes = buffer.array();
      int start = buffer.arrayOffset() + buffer.position();
      int end = buffer.arrayOffset() + buffer.limit();
      long unscaled = 0L;
      int i = start;
      while (i < end) {
        unscaled = (unscaled << 8 | bytes[i] & 0xff);
        i++;
      }
      int bits = 8 * (end - start);
      long unscaledNew = (unscaled << (64 - bits)) >> (64 - bits);
      if (unscaledNew <= -pow(10, 18) || unscaledNew >= pow(10, 18)) {
        return new BigDecimal(unscaledNew);
      } else {
        return BigDecimal.valueOf(unscaledNew / pow(10, scale));
      }
    } else {
      return new BigDecimal(new BigInteger(value.getBytes()), scale);
    }
  }

If precision is less than 18, the condition unscaledNew <= -pow(10, 18) || unscaledNew >= pow(10, 18) can not be true, so we can remove the judgment logic here. Additionally, using BigDecimal.valueOf(unscaledNew, scale) is preferable over using BigDecimal.valueOf(unscaledNew / pow(10, scale)), as it does not convert the unscaled value to double.

Component(s)

No response

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant