v0.14.0

@angeloskath released this 24 May 01:33
9f9cb7a

Highlights

  • Small-size build that JIT-compiles kernels and omits the CPU backend, resulting in a binary smaller than 4 MB
    • Series of PRs 1, 2, 3, 4, 5
  • mx.gather_qmm: a quantized equivalent of mx.gather_mm, which speeds up MoE inference by ~2x
  • Grouped 2D convolutions
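
To make the mx.gather_qmm highlight more concrete, here is a plain-Python reference sketch of the underlying idea: each token selects one expert (the gather), and its output is a matrix-vector product against that expert's dequantized weights. This is not the MLX implementation or its API. The helper names (`quantize_row`, `dequantize_row`, `gather_qmv`), the group size, and the bit width are hypothetical, and the per-group scale-plus-bias quantization scheme is an assumption chosen for illustration.

```python
# Plain-Python reference sketch (hypothetical helpers, not the MLX API).

def quantize_row(row, group_size=4, bits=4):
    """Quantize a row in groups: each group keeps small ints plus a scale and bias."""
    levels = (1 << bits) - 1
    groups = []
    for start in range(0, len(row), group_size):
        chunk = row[start:start + group_size]
        lo, hi = min(chunk), max(chunk)
        scale = (hi - lo) / levels or 1.0  # fall back to 1.0 for constant groups
        q = [round((v - lo) / scale) for v in chunk]
        groups.append((q, scale, lo))
    return groups


def dequantize_row(groups):
    """Rebuild approximate float values from (ints, scale, bias) groups."""
    out = []
    for q, scale, bias in groups:
        out.extend(qi * scale + bias for qi in q)
    return out


def gather_qmv(tokens, expert_weights_q, expert_ids):
    """For token i, multiply by the dequantized weights of expert expert_ids[i]."""
    outputs = []
    for x, e in zip(tokens, expert_ids):
        rows = [dequantize_row(r) for r in expert_weights_q[e]]
        outputs.append([sum(w * xi for w, xi in zip(row, x)) for row in rows])
    return outputs


# Two experts, each a 2x4 weight matrix (output_dim x input_dim).
experts = [
    [[0.0, 1.0, 2.0, 3.0], [3.0, 2.0, 1.0, 0.0]],
    [[1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 3.0, 3.0]],
]
experts_q = [[quantize_row(r) for r in w] for w in experts]

tokens = [[1.0, 0.0, 0.0, 1.0], [1.0, 1.0, 1.0, 1.0]]
expert_ids = [0, 1]  # token 0 is routed to expert 0, token 1 to expert 1
result = gather_qmv(tokens, experts_q, expert_ids)
```

The sketch dequantizes whole rows before multiplying, which is the simplest way to show the arithmetic. A fused kernel would presumably avoid materializing the dequantized weights, but how MLX does this is not described in these notes.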

Core

  • mx.conjugate
  • mx.conv3d and nn.Conv3d
  • List based indexing
  • Started mx.distributed, which uses MPI (if installed) for communication across machines
    • mx.distributed.init
    • mx.distributed.all_gather
    • mx.distributed.all_reduce_sum
  • Support conversion to and from DLPack
  • mx.linalg.cholesky on CPU
  • mx.quantized_matmul sped up for vector-matrix products
  • mx.trace
  • mx.block_masked_mm now supports floating-point masks!
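
The two collectives listed under mx.distributed can be illustrated with a small simulation. The sketch below uses plain Python over a list of per-rank buffers to show what every rank would hold afterwards. The function names `simulate_all_gather` and `simulate_all_reduce_sum` are hypothetical, no MPI or MLX is involved, and it assumes the MLX ops follow the standard MPI all-gather and all-reduce meanings, which these notes do not spell out.

```python
# Simulation only: a list of buffers stands in for the processes ("ranks").

def simulate_all_gather(per_rank):
    """Every rank receives the concatenation of all ranks' buffers, in rank order."""
    gathered = [v for buf in per_rank for v in buf]
    return [list(gathered) for _ in per_rank]


def simulate_all_reduce_sum(per_rank):
    """Every rank receives the elementwise sum of all ranks' buffers."""
    total = [sum(vals) for vals in zip(*per_rank)]
    return [list(total) for _ in per_rank]


# Three simulated ranks, each holding a length-2 buffer.
buffers = [[1, 2], [10, 20], [100, 200]]
gathered = simulate_all_gather(buffers)
reduced = simulate_all_reduce_sum(buffers)
```

In a real multi-process run, each process would call the collective once on its own array rather than passing a list of buffers.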

Fixes

  • Error messaging in eval
  • Add some missing docs
  • Scatter index bug
  • The extensions example now compiles and runs
  • CPU copy bug with many dimensions