Releases · VikParuchuri/marker

Inline math

Marker will handle inline math if --use_llm is set. This makes reading scientific papers a lot nicer! The feature has been optimized for speed.

Local LLMs

We now support Ollama - when you're passing the --use_llm flag, you can select the Ollama inference service like this:

marker_single FILEPATH --use_llm --llm_service marker.services.ollama.OllamaService

You can set the options --ollama_base_url and --ollama_model. By default, it will use llama3.2-vision.
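As a rough illustration of how these flags combine, the sketch below assembles the command line in Python. The helper name `build_marker_command` and the file name `paper.pdf` are hypothetical; the flags and the service path come from the notes above, and `http://localhost:11434` is Ollama's usual local address, assumed here for the example.

```python
import shlex

# Service path taken from the release notes above.
OLLAMA_SERVICE = "marker.services.ollama.OllamaService"


def build_marker_command(filepath, base_url=None, model=None):
    """Return the marker_single argument list for the Ollama service.

    Hypothetical helper: it only assembles arguments and does not run marker.
    """
    cmd = ["marker_single", filepath, "--use_llm", "--llm_service", OLLAMA_SERVICE]
    if base_url is not None:
        cmd += ["--ollama_base_url", base_url]
    if model is not None:
        cmd += ["--ollama_model", model]
    return cmd


if __name__ == "__main__":
    # Assumed values: a local Ollama server and the default model named above.
    cmd = build_marker_command(
        "paper.pdf",
        base_url="http://localhost:11434",
        model="llama3.2-vision",
    )
    print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run` as-is, which avoids shell quoting problems with file paths that contain spaces.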

Batch LLM calls

LLM calls are now batched across processors for a significant speedup if you're passing --use_llm.
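The general idea of batching across processors can be sketched as follows. This is not marker's implementation: `run_batched`, the processor callables, and `call_llm` are all hypothetical stand-ins, and the sketch assumes each processor can list its pending prompts up front.

```python
from concurrent.futures import ThreadPoolExecutor


def run_batched(processors, call_llm, max_workers=4):
    """Collect prompts from every processor, then run them in one shared pool.

    processors: callables that each return a list of prompts (hypothetical).
    call_llm:   callable taking one prompt and returning one response.
    Returns one list of responses per processor, in the original order.
    """
    # Gather all requests first, remembering which processor each came from.
    requests = [
        (index, prompt)
        for index, processor in enumerate(processors)
        for prompt in processor()
    ]

    # Issue every request through a single pool instead of one processor at a time.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        responses = list(pool.map(lambda item: call_llm(item[1]), requests))

    # Route each response back to the processor that asked for it.
    results = [[] for _ in processors]
    for (index, _), response in zip(requests, responses):
        results[index].append(response)
    return results
```

Because the requests are network-bound, a single shared pool keeps several in flight at once rather than waiting for each processor to finish before the next one starts.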

Misc fixes

Biology PDFs now work a lot better - leading line numbers are stripped
Improved OCR heuristics
Updated the examples
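To illustrate the kind of line-number cleanup mentioned above, here is a minimal sketch. It is not the heuristic marker uses: the function name, the regular expression, and the `min_fraction` threshold are assumptions chosen for the example.

```python
import re

# A short integer at the start of a line, followed by whitespace and more text.
LEADING_NUMBER = re.compile(r"^\s*\d{1,4}\s+(?=\S)")


def strip_line_numbers(text, min_fraction=0.8):
    """Remove leading line numbers when most non-blank lines carry one.

    The threshold guards against stripping numbers from ordinary prose
    that merely happens to start a line with a digit.
    """
    lines = text.splitlines()
    nonblank = [line for line in lines if line.strip()]
    if not nonblank:
        return text

    numbered = sum(1 for line in nonblank if LEADING_NUMBER.match(line))
    if numbered / len(nonblank) < min_fraction:
        return text  # Probably not a line-numbered manuscript.

    return "\n".join(LEADING_NUMBER.sub("", line) for line in lines)
```

Checking the fraction of numbered lines before stripping is the main design choice here: a per-line rule alone would also delete legitimate numbers such as list items or years at the start of a sentence.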

What's Changed

Batch together llm inference requests by @VikParuchuri in #536
Add another heuristic to clean up line numbers by @iammosespaulr in #538
Add Inline Math Support by @tarun-menta in #517
Factor out llm services, enable local models by @VikParuchuri in #544
Improve LLM speed; handle inline math; allow local models by @VikParuchuri in #537

Full Changelog: v1.4.0...v1.5.0

Improved LaTeX OCR

We trained a new LaTeX OCR model that works a lot better overall. It will reliably output KaTeX-compatible math. It also operates on longer sequences than before.

The rendered output is on the right, original document on the left:

Block visualization

You can now visualize blocks in the Streamlit app, thanks to @jazzido. By selecting JSON output and checking "show blocks", you get a visualization of how marker parsed the page. Clicking on a block shows its HTML.

Links and references

We fixed a bug with links and references; they now render as one block. You can see the extracted references here:

Misc bugfixes

Fixed some bugs with tables and row splitting
Escaped $ inside text and tables so we don't accidentally render things as equations
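The dollar-sign escaping described above can be sketched like this. It is an illustration under simple assumptions, not marker's code: the function name is hypothetical, and the pattern only checks for a single preceding backslash.

```python
import re

# A dollar sign that is not already preceded by a backslash.
UNESCAPED_DOLLAR = re.compile(r"(?<!\\)\$")


def escape_dollars(text):
    """Backslash-escape literal dollar signs so they are not read as math delimiters."""
    return UNESCAPED_DOLLAR.sub(r"\\$", text)


if __name__ == "__main__":
    print(escape_dollars("The kit costs $5 and the reagent $10."))
```

The negative lookbehind keeps the function idempotent, so text that has already been escaped is left unchanged on a second pass.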

What's Changed

[streamlit_app] Visualize extracted blocks by @jazzido in #502
Texify by @VikParuchuri in #513
Update texify by @VikParuchuri in #514

New Contributors

@jazzido made their first contribution in #502

Full Changelog: v1.3.2...v1.3.3


Releases: VikParuchuri/marker

v1.5.3
Fix LLM service issue
Fix OCR issue
Inline math; speed up LLM calls; allow local models
LLM fixes; new benchmarks
Bump gemini version
Fix pytorch bug
New LaTeX OCR model; block visualizer; better links/references
Fix table bugs
Improved equations, bugfixes