
[HUDI-8934] Claim of RFC-87. Avro elimination for Flink writer #12729

Open · wants to merge 1 commit into base: master
Conversation

@Alowator (Contributor) commented Jan 29, 2025

Change Logs

Inspired by RFC-84 (HUDI-8920): there is an opinion that Avro is not the best choice for Hudi. It requires extra ser/de operations, and not only between Flink operators (those will be fixed by RFC-84).

I decided to benchmark a POC version of a native Flink RowData writer for Hudi. It was simple enough, because Hudi already has a native RowData-to-Parquet writer used by append mode; I reused this writer to create log blocks, and two bottlenecks were found:

  1. Hudi performs a lot of Avro ser/de operations at writer runtime.

  2. Hudi stores Avro records as a List, which causes GC pressure at writer runtime; in my benchmarks garbage collection accounts for about 30% of the total Hudi writer runtime (a sketch of this per-record conversion follows the profile below).

[profiler screenshot]
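
To illustrate bottleneck 2, here is a minimal sketch (my illustration, not Hudi's actual code) of the kind of per-record RowData-to-Avro conversion the current write path performs. The two-field schema and class name are hypothetical; the point is that every record allocates a fresh GenericRecord plus boxed field values, which is what shows up as GC time in the profile:

    import org.apache.avro.Schema;
    import org.apache.avro.generic.GenericData;
    import org.apache.avro.generic.GenericRecord;
    import org.apache.flink.table.data.GenericRowData;
    import org.apache.flink.table.data.RowData;
    import org.apache.flink.table.data.StringData;

    public class AvroHotPathSketch {

        // Hypothetical two-field schema, for illustration only.
        private static final Schema SCHEMA = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"lineitem\",\"fields\":["
                + "{\"name\":\"l_orderkey\",\"type\":\"long\"},"
                + "{\"name\":\"l_comment\",\"type\":\"string\"}]}");

        // Called once per incoming record on the write path: every call
        // allocates a new GenericRecord and boxes every field value.
        static GenericRecord toAvro(RowData row) {
            GenericRecord rec = new GenericData.Record(SCHEMA); // fresh allocation per record
            rec.put("l_orderkey", row.getLong(0));              // long -> Long boxing
            rec.put("l_comment", row.getString(1).toString());  // StringData -> String copy
            return rec;
        }

        public static void main(String[] args) {
            RowData row = GenericRowData.of(1L, StringData.fromString("quickly final deposits"));
            System.out.println(toAvro(row));
        }
    }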

Impact

As a result, I reduced write time from ~4 min to ~1 min 20 sec (a ~3x write performance boost):

[benchmark results screenshot]

I have a POC version that we are already testing in our cloud environment. Key improvements:

  1. This POC is based on RFC-84 and already includes the POC from RFC-84.
  2. Native RowData is written to Parquet log blocks (eliminates unnecessary ser/de).
  3. Records are stored in a BinaryInMemorySortBuffer (reduces GC pressure).
  4. Records are sorted via new QuickSort().sort(sortBuffer) before a log block is written (maybe it is possible to perform data compaction without sorting? In any case the sort is fast enough that it does not affect write performance). See the sketch after this list.
  5. RowDataStreamWriteFunction flushes buckets asynchronously (reduces backpressure on preceding operators).
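
To make items 3 and 4 concrete, a minimal sketch of the flush path, assuming Flink's internal BinaryInMemorySortBuffer and QuickSort classes. The ParquetLogBlockWriter interface is hypothetical (standing in for Hudi's RowData Parquet writer), and the sort buffer is passed in as a parameter because its factory needs code-generated comparators and a memory segment pool; this is a sketch of the flow the POC describes, not the POC itself:

    import java.io.IOException;

    import org.apache.flink.runtime.operators.sort.QuickSort;
    import org.apache.flink.table.data.RowData;
    import org.apache.flink.table.data.binary.BinaryRowData;
    import org.apache.flink.table.runtime.operators.sort.BinaryInMemorySortBuffer;
    import org.apache.flink.util.MutableObjectIterator;

    public class SortedLogBlockFlushSketch {

        // Hypothetical writer interface standing in for Hudi's RowData Parquet writer.
        interface ParquetLogBlockWriter {
            int fieldCount();
            void write(RowData row) throws IOException;
        }

        // Buffers records in Flink-managed binary memory, sorts them in place,
        // then streams them to the log-block writer.
        static void flushBucket(BinaryInMemorySortBuffer sortBuffer,
                                Iterable<RowData> bucketRecords,
                                ParquetLogBlockWriter writer) throws IOException {
            for (RowData record : bucketRecords) {
                // write() serializes into managed MemorySegments, so no per-record
                // object graph stays on the heap: this is the GC-pressure win.
                if (!sortBuffer.write(record)) {
                    throw new IOException("sort buffer is full; flush earlier");
                }
            }

            // In-place binary sort, as described in item 4.
            new QuickSort().sort(sortBuffer);

            MutableObjectIterator<BinaryRowData> it = sortBuffer.getIterator();
            BinaryRowData reuse = new BinaryRowData(writer.fieldCount());
            BinaryRowData row;
            while ((row = it.next(reuse)) != null) {
                writer.write(row);
            }
        }
    }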

My config

PC: 32 CPUs, 128 GiB
Data: 60 million records of the TPC-H lineitem table
Java: OpenJDK 17
Flink: 1.20, single JM + single TM, standalone, taskmanager.memory.process.size: 8G
Write: Hadoop HDFS 3.3.1, 9-node cluster
Read: Kafka 2.8, 3-node cluster, 8 partitions
Hudi table:
    'connector' = 'hudi',
    'path' = '<hdfs_path>',
    'table.type' = 'MERGE_ON_READ',
    'metadata.enabled' = 'false',
    'index.type'='BUCKET',
    'hoodie.bucket.index.hash.field'='l_orderkey,l_linenumber',
    'hoodie.bucket.index.num.buckets'='8',
    'hoodie.parquet.compression.codec' = 'snappy',
    'hoodie.logfile.data.block.format' = 'parquet',
    'hoodie.enable.fast.sort.write' = 'true',

    'write.operation' = 'upsert',
    'write.batch.size'='256',
    'write.tasks'='8',
    'compaction.async.enabled' = 'false',
    'clean.async.enabled' = 'false',
    'hoodie.archive.automatic' = 'false',
    'hoodie.clean.automatic' = 'false'
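
For reproducibility, the same options as a complete DDL sketch via the Java Table API. The column list is abbreviated (the benchmark used the full TPC-H lineitem schema), <hdfs_path> is left as a placeholder, and 'hoodie.enable.fast.sort.write' is the POC-only flag introduced by this work, not an option that exists in Hudi master:

    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.TableEnvironment;

    public class BenchmarkTableSetup {
        public static void main(String[] args) {
            // Streaming mode, matching the Kafka -> Hudi upsert pipeline.
            TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inStreamingMode().build());

            // Columns abbreviated for the sketch.
            tEnv.executeSql(
                "CREATE TABLE lineitem_hudi (\n"
              + "  l_orderkey BIGINT,\n"
              + "  l_linenumber INT,\n"
              + "  l_comment STRING\n"
              + ") WITH (\n"
              + "  'connector' = 'hudi',\n"
              + "  'path' = '<hdfs_path>',\n"
              + "  'table.type' = 'MERGE_ON_READ',\n"
              + "  'metadata.enabled' = 'false',\n"
              + "  'index.type' = 'BUCKET',\n"
              + "  'hoodie.bucket.index.hash.field' = 'l_orderkey,l_linenumber',\n"
              + "  'hoodie.bucket.index.num.buckets' = '8',\n"
              + "  'hoodie.parquet.compression.codec' = 'snappy',\n"
              + "  'hoodie.logfile.data.block.format' = 'parquet',\n"
              + "  'hoodie.enable.fast.sort.write' = 'true',\n"
              + "  'write.operation' = 'upsert',\n"
              + "  'write.batch.size' = '256',\n"
              + "  'write.tasks' = '8',\n"
              + "  'compaction.async.enabled' = 'false',\n"
              + "  'clean.async.enabled' = 'false',\n"
              + "  'hoodie.archive.automatic' = 'false',\n"
              + "  'hoodie.clean.automatic' = 'false'\n"
              + ")");
        }
    }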

Risk level (write none, low, medium or high below)

None

Documentation Update

None

Contributor's checklist

  • Read through contributor's guide
  • Change Logs and Impact were stated clearly
  • Adequate tests were added if applicable
  • CI passed

@github-actions github-actions bot added the size:S PR with lines of changes in (10, 100] label Jan 29, 2025
@hudi-bot

CI report:

Bot commands: @hudi-bot supports the following commands:
  • @hudi-bot run azure: re-runs the last Azure build

@danny0405 (Contributor) commented:

@cshuo is working on an RFC to introduce basic infrastructure to Hudi, such as schemas/data types/expressions. One of the goals is to integrate with the engine-native "row" for both the reader and the writer. You are welcome to contribute to it; I will cc you once the RFC comes out.

@Alowator (Contributor, Author) commented Jan 30, 2025

> @cshuo is working on an RFC to introduce basic infrastructure to Hudi, such as schemas/data types/expressions. One of the goals is to integrate with the engine-native "row" for both the reader and the writer. You are welcome to contribute to it; I will cc you once the RFC comes out.

So do you think it's better to do this as part of another RFC? I think creating an engine-native row for Flink is too big a task to make it part of another RFC, so it's better to clearly separate this work.

@danny0405 (Contributor) commented:

> > @cshuo is working on an RFC to introduce basic infrastructure to Hudi, such as schemas/data types/expressions. One of the goals is to integrate with the engine-native "row" for both the reader and the writer. You are welcome to contribute to it; I will cc you once the RFC comes out.
>
> So do you think it's better to do this as part of another RFC? I think creating an engine-native row for Flink is too big a task to make it part of another RFC, so it's better to clearly separate this work.

That's my thought. @cshuo's RFC is huge, and it would be great if you and @geserdugarov could get involved.

@Alowator (Contributor, Author) commented:

> > > @cshuo is working on an RFC to introduce basic infrastructure to Hudi, such as schemas/data types/expressions. One of the goals is to integrate with the engine-native "row" for both the reader and the writer. You are welcome to contribute to it; I will cc you once the RFC comes out.
> >
> > So do you think it's better to do this as part of another RFC? I think creating an engine-native row for Flink is too big a task to make it part of another RFC, so it's better to clearly separate this work.
>
> That's my thought. @cshuo's RFC is huge, and it would be great if you and @geserdugarov could get involved.

OK, I will look forward to @cshuo's RFC. This work requires scrutiny with regard to Flink write performance, including the points I presented in this ticket.
