integrate zmq #403

aniketmaurya · 2025-01-08T18:34:12Z

What does this PR do?

Before submitting

Was this discussed/agreed via a Github issue? (no need for typos and docs improvements)
Did you read the contributor guideline, Pull Request section?
Did you make sure to update the docs?
Did you write any new necessary tests?

Faster process communication with zmq.

TODO: A follow-up PR will add a proxy and support multiple inference worker processes and multiple uvicorn processes.

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

bhimrazy · 2025-01-08T19:00:21Z

Seems like a great addition, @aniketmaurya 🎉!
I remember llamastack also has zmq under the hood.
I'm looking forward to learning more about it and its uses over multiprocessing queues.
😊

aniketmaurya · 2025-01-09T16:31:07Z

Thanks for taking a look, @bhimrazy! Yeah, zmq seems to reduce the time it takes to send prediction results to the main process. This is gonna be very useful, especially when streaming tokens from LLMs.
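For readers new to zmq, here is a minimal sketch of the pattern described above, assuming pyzmq is installed. It is not the code from this PR: the function names, the PUSH/PULL socket pair, the TCP loopback address, and the use of a thread as a stand-in for a worker process are all illustrative assumptions.

```python
import threading

import zmq


def worker(address: str, tokens: list[str]) -> None:
    """Stand-in for an inference worker: pushes each token as soon as it is ready."""
    ctx = zmq.Context()
    sock = ctx.socket(zmq.PUSH)
    sock.connect(address)
    for tok in tokens:
        sock.send_pyobj(("token", tok))
    sock.send_pyobj(("done", None))  # sentinel so the receiver knows when to stop
    sock.close()
    ctx.term()


def collect(tokens: list[str]) -> list[str]:
    """Stand-in for the main process: binds a PULL socket and drains the stream."""
    ctx = zmq.Context()
    pull = ctx.socket(zmq.PULL)
    port = pull.bind_to_random_port("tcp://127.0.0.1")
    t = threading.Thread(target=worker, args=(f"tcp://127.0.0.1:{port}", tokens))
    t.start()
    received = []
    while True:
        kind, payload = pull.recv_pyobj()
        if kind == "done":
            break
        received.append(payload)
    t.join()
    pull.close()
    ctx.term()
    return received
```

Calling `collect(["Hello", " ", "world"])` returns the tokens in the order the worker sent them, since a single PUSH-to-PULL connection preserves message order.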

src/litserve/loops/base.py

codecov · 2025-01-09T21:19:19Z

Codecov Report

Attention: Patch coverage is 61.06195% with 44 lines in your changes missing coverage. Please review.

Project coverage is 88%. Comparing base (747308a) to head (bf6969c).
Report is 2 commits behind head on main.

Additional details and impacted files

@@         Coverage Diff         @@
##           main   #403   +/-   ##
===================================
- Coverage    89%    88%   -1%     
===================================
  Files        30     30           
  Lines      1893   1976   +83     
===================================
+ Hits       1683   1734   +51     
- Misses      210    242   +32

lantiga

The PR looks good in general, but not abstracting away the interprocess communication mechanism makes the code more complex (you now have a socket rather than a queue, and a socket has more opaque semantics). Also, a lot of details are guarded by if/else conditions.

I would take this opportunity to create a base class that takes care of the communication, plus two concrete classes. You can then instantiate the one you want in LitServer, so you don't have to deal with conditionals and the code stays easily consumable.

Clear semantics are particularly important for people implementing loops; we need to keep it simple.
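One possible reading of this suggestion, as a hedged sketch rather than the design that was eventually implemented: a small base class exposes queue-like semantics, and two concrete classes hide whether a queue or a zmq socket sits underneath. All names here (`ResponseTransport`, `QueueTransport`, `ZMQTransport`, `make_transport`) are hypothetical and not part of LitServe; the zmq variant assumes pyzmq and imports it lazily so the queue variant has no extra dependency.

```python
import queue
from abc import ABC, abstractmethod
from typing import Any


class ResponseTransport(ABC):
    """Hypothetical base class: loops only call put_response / get_response."""

    @abstractmethod
    def put_response(self, uid: str, response: Any) -> None: ...

    @abstractmethod
    def get_response(self, timeout: float) -> tuple[str, Any]: ...

    def close(self) -> None:
        """Release resources; a no-op unless a subclass overrides it."""


class QueueTransport(ResponseTransport):
    """Queue-backed variant; a multiprocessing.Queue could be passed in instead."""

    def __init__(self, q: Any = None) -> None:
        self._q = q if q is not None else queue.Queue()

    def put_response(self, uid: str, response: Any) -> None:
        self._q.put((uid, response))

    def get_response(self, timeout: float) -> tuple[str, Any]:
        return self._q.get(timeout=timeout)  # raises queue.Empty on timeout


class ZMQTransport(ResponseTransport):
    """zmq-backed variant: the consumer binds a PULL socket, producers connect PUSH."""

    def __init__(self, address: str, bind: bool) -> None:
        import zmq  # optional dependency, imported only when this variant is used

        self._ctx = zmq.Context()
        self._sock = self._ctx.socket(zmq.PULL if bind else zmq.PUSH)
        if bind:
            self._sock.bind(address)
        else:
            self._sock.connect(address)

    def put_response(self, uid: str, response: Any) -> None:
        self._sock.send_pyobj((uid, response))

    def get_response(self, timeout: float) -> tuple[str, Any]:
        # Mirror the queue semantics: raise queue.Empty if nothing arrives in time.
        if not self._sock.poll(int(timeout * 1000)):
            raise queue.Empty
        return self._sock.recv_pyobj()

    def close(self) -> None:
        self._sock.close(linger=1000)  # give queued messages up to 1 s to flush
        self._ctx.term()


def make_transport(kind: str, **kwargs: Any) -> ResponseTransport:
    """The single place where the mechanism is chosen; no conditionals elsewhere."""
    if kind == "queue":
        return QueueTransport(**kwargs)
    if kind == "zmq":
        return ZMQTransport(**kwargs)
    raise ValueError(f"unknown transport: {kind!r}")
```

With something like this, a loop would call `transport.put_response(uid, out)` regardless of the mechanism, and the server would pick the implementation once, for example `make_transport("zmq", address="ipc:///tmp/responses", bind=True)` (the address is a made-up example).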

aniketmaurya · 2025-01-10T13:52:08Z

Agree with your points, @lantiga. I would put these inside the put_response method, which can be used everywhere else. I will also create an abstraction to hide these socket details in the next PR, which will enable zmq for multiple workers.

aniketmaurya · 2025-01-10T14:50:16Z

Creating a follow-up as per the suggestion above.

integrate zmq

6bd199f

aniketmaurya requested review from williamFalcon, lantiga, ethanwharris, Andrei-Aksionov and Borda as code owners January 8, 2025 18:34

aniketmaurya added 6 commits January 9, 2025 11:27

backward compatilbity

580ee4a

fix typing

b41042f

fix tests

74771fc

resolve conflict

caecfa8

update tests

c62b2e5

update

18e5db3

ethanwharris approved these changes Jan 9, 2025

aniketmaurya and others added 10 commits January 9, 2025 16:50

delete socket

a4fc585

terminate context

105fa5a

fix random port

4e0b4d8

add todo

3643fbf

disable zmq

593af5f

use ipc

e75f7b2

fix windows CI

9096628

clean up

ae2911e

add tests

49a6748

omit windows

bf6969c

aniketmaurya commented Jan 9, 2025

src/litserve/loops/base.py

lantiga reviewed Jan 10, 2025

williamFalcon approved these changes Jan 10, 2025

aniketmaurya merged commit 43692d4 into main Jan 10, 2025
20 of 21 checks passed

aniketmaurya deleted the integrate-zmq branch January 10, 2025 14:00
