Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement ART(Adaptive Radix Tree) on sequential #11

Open
wants to merge 20 commits into
base: main
Choose a base branch
from
Open

Conversation

codingskynet
Copy link
Owner

#10 에서 연구한, sequential에서 BTree보다 빠르고, HashMap보다도 우수한 캐시히트를 가지는 자료구조를 구현해봤습니다.
간략하게 PoC 정도로만 만들었기 때문에, PR 생성 이후 K를 무조건 String(for ascii)으로 고정시키고, Node256을 128로 바꾸는 패치 등을 진행해보는 것도 괜찮아 보입니다.

@codingskynet
Copy link
Owner Author

codingskynet commented Nov 25, 2021

Benchmark: cargo criterion --bench sequential (only run bench_string_vs_btreemap)
Envirnment:

  • SW: Windows 10 Pro 10.0.19043 Build 19043, WSL 2 Ubuntu 20.04(5.10.16.3-microsoft-standard-WSL2), w/ other minor programs running (Chrome, etc...)
  • HW: Ryzen 5 3600 (6c 12t, static clock 4.2GHz), DDR4 3600MHz 16GiB

The branch art(b3f3b53) result:

[String] Inserted +5e5, Ops (I: 100%, L: 0%, R: 0%, total: +1.92e5)/std::BTreeMap                                                                          
                        time:   [146.66 ms 147.66 ms 148.73 ms]
                        thrpt:  [1.2910 Melem/s 1.3003 Melem/s 1.3092 Melem/s]
[String] Inserted +5e5, Ops (I: 100%, L: 0%, R: 0%, total: +1.92e5)/ART                                                                          
                        time:   [85.830 ms 88.368 ms 91.018 ms]
                        thrpt:  [2.1095 Melem/s 2.1727 Melem/s 2.2370 Melem/s]

[String] Inserted +5e5, Ops (I: 0%, L: 100%, R: 0%, total: +1.92e5)/std::BTreeMap                                                                          
                        time:   [136.45 ms 138.53 ms 140.82 ms]
                        thrpt:  [1.3634 Melem/s 1.3860 Melem/s 1.4071 Melem/s]
[String] Inserted +5e5, Ops (I: 0%, L: 100%, R: 0%, total: +1.92e5)/ART                                                                          
                        time:   [82.525 ms 83.332 ms 84.109 ms]
                        thrpt:  [2.2828 Melem/s 2.3040 Melem/s 2.3266 Melem/s]

[String] Inserted +5e5, Ops (I: 0%, L: 0%, R: 100%, total: +1.92e5)/std::BTreeMap                                                                          
                        time:   [157.13 ms 158.36 ms 159.78 ms]
                        thrpt:  [1.2016 Melem/s 1.2124 Melem/s 1.2220 Melem/s]
[String] Inserted +5e5, Ops (I: 0%, L: 0%, R: 100%, total: +1.92e5)/ART                                                                          
                        time:   [116.99 ms 121.40 ms 125.79 ms]
                        thrpt:  [1.5264 Melem/s 1.5815 Melem/s 1.6412 Melem/s]

[String] Inserted +5e5, Ops (I: 5%, L: 90%, R: 5%, total: +1.92e5)/std::BTreeMap                                                                          
                        time:   [174.70 ms 183.48 ms 191.66 ms]
                        thrpt:  [1.0018 Melem/s 1.0464 Melem/s 1.0990 Melem/s]
[String] Inserted +5e5, Ops (I: 5%, L: 90%, R: 5%, total: +1.92e5)/ART                                                                          
                        time:   [93.524 ms 98.472 ms 103.38 ms]
                        thrpt:  [1.8572 Melem/s 1.9498 Melem/s 2.0530 Melem/s]

[String] Inserted +5e5, Ops (I: 30%, L: 50%, R: 20%, total: +1.92e5)/std::BTreeMap                                                                          
                        time:   [191.75 ms 199.14 ms 205.85 ms]
                        thrpt:  [932.74 Kelem/s 964.16 Kelem/s 1.0013 Melem/s]
[String] Inserted +5e5, Ops (I: 30%, L: 50%, R: 20%, total: +1.92e5)/ART                                                                          
                        time:   [98.705 ms 103.46 ms 107.96 ms]
                        thrpt:  [1.7784 Melem/s 1.8557 Melem/s 1.9452 Melem/s]

[String] Inserted +5e5, Ops (I: 40%, L: 20%, R: 40%, total: +1.92e5)/std::BTreeMap                                                                          
                        time:   [215.38 ms 219.76 ms 223.66 ms]
                        thrpt:  [858.44 Kelem/s 873.69 Kelem/s 891.43 Kelem/s]
[String] Inserted +5e5, Ops (I: 40%, L: 20%, R: 40%, total: +1.92e5)/ART                                                                          
                        time:   [112.48 ms 116.99 ms 120.88 ms]
                        thrpt:  [1.5883 Melem/s 1.6412 Melem/s 1.7070 Melem/s]

[String] Inserted +5e5, Ops (I: 50%, L: 0%, R: 50%, total: +1.92e5)/std::BTreeMap                                                                          
                        time:   [230.13 ms 233.84 ms 237.02 ms]
                        thrpt:  [810.07 Kelem/s 821.09 Kelem/s 834.31 Kelem/s]
[String] Inserted +5e5, Ops (I: 50%, L: 0%, R: 50%, total: +1.92e5)/ART                                                                          
                        time:   [116.39 ms 118.84 ms 121.25 ms]
                        thrpt:  [1.5835 Melem/s 1.6156 Melem/s 1.6496 Melem/s]

On geometric mean, ART is about 1.739x faster than std::BTreeMap.

@codingskynet
Copy link
Owner Author

MacBook Air (M1, 2020)에서 debug_art를 돌려서 각 struct의 size를 구해보면 다음과 같다.

V          is     8 byte.
NodeV<V>   is    24 byte.
NodeHeader is    16 byte.
Node<V>    is     8 byte.
Node4<V>   is    64 byte.
Node16<V>  is   168 byte.
Node48<V>  is   664 byte.
Node256<V> is  2072 byte.

부정확하지만, PREFIX_LEN을 12에서 13으로 바꾸고 stress_art를 돌리면 꽤나 느린 거 같다. 따라서, 각 struct의 size를 줄여보는(가령, Header의 크기를 줄이거나, len을 usize 대신 u8을 쓴다는 등) 시도를 해서, 캐시히트 및 퍼포먼스를 체크해봐도 좋을 듯 하다. 또한, HashMap과의 비교도 bench에 추가해서 보는 것이 좋을 듯하다.

@codingskynet
Copy link
Owner Author

와 이거 언제 뜯어 고치냐...

@codingskynet
Copy link
Owner Author

조금 고민해봤는데, 어차피 실제로 DB 짤 때는 이거 안 쓸 거 같아서, ascii(or base64급)에 맞게 최적화한 ART 만드는 거름으로 쓰는 게 좋을 듯? 이거는 그냥 legacy로 두면 뭐 기념비적일 듯

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant