mlx.MLXPatches

Idempotent runtime patches for mlx_lm.

Usage

mlx.MLXPatches()

MlxEngine applies these patches on its worker thread before the first batch_generate call. There is currently one patch: it replaces BatchGenerator.stats with a version that clamps the elapsed-time denominators used to compute tokens-per-second, which prevents a ZeroDivisionError when an elapsed time is zero.
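The clamping idea can be illustrated with a minimal, self-contained sketch. It does not use mlx_lm: FakeBatchGenerator, FakeCounters, the field names, the returned dictionary, and the MIN_ELAPSED floor are all hypothetical stand-ins chosen for illustration, and the real BatchGenerator.stats may be shaped differently.

```python
from dataclasses import dataclass

# Hypothetical floor (seconds) for elapsed-time denominators.
MIN_ELAPSED = 1e-9


@dataclass
class FakeCounters:
    """Illustrative counters; not the fields mlx_lm actually tracks."""
    prompt_tokens: int
    prompt_time: float
    generation_tokens: int
    generation_time: float


class FakeBatchGenerator:
    """Stand-in for the patched class; not mlx_lm's BatchGenerator."""

    def __init__(self, counters: FakeCounters):
        self.counters = counters

    def stats(self):
        # Unpatched behaviour: raises ZeroDivisionError if a time is 0.0.
        c = self.counters
        return {
            "prompt_tps": c.prompt_tokens / c.prompt_time,
            "generation_tps": c.generation_tokens / c.generation_time,
        }


def clamped_stats(self):
    """Replacement that clamps each denominator to at least MIN_ELAPSED."""
    c = self.counters
    return {
        "prompt_tps": c.prompt_tokens / max(c.prompt_time, MIN_ELAPSED),
        "generation_tps": c.generation_tokens / max(c.generation_time, MIN_ELAPSED),
    }


# Runtime patch: swap the method on the class itself.
FakeBatchGenerator.stats = clamped_stats
```

With the replacement installed, a generator whose prompt time is exactly 0.0 returns a large but finite rate instead of raising.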

Methods

Name      Description
apply()   Apply all patches once; subsequent calls are no-ops.

apply()

Apply all patches once; subsequent calls are no-ops.

Usage

apply()
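One common way to get "apply once, then no-op" behaviour is a class-level flag guarded by a lock. The sketch below is an assumption about how such idempotency could be implemented, not the actual MLXPatches source; Target, OncePatches, and install_count are hypothetical names used only for this example.

```python
import threading


class Target:
    """Hypothetical class whose method gets patched."""

    def stats(self):
        return "original"


class OncePatches:
    """Sketch of an idempotent patch set (not the real MLXPatches)."""

    _applied = False
    _lock = threading.Lock()
    install_count = 0  # counts real installs, for demonstration only

    def apply(self):
        cls = type(self)
        with cls._lock:
            if cls._applied:
                return  # already patched: do nothing

            original = Target.stats

            def stats(instance):
                # Wrap the original so the effect of the patch is visible.
                return "patched(" + original(instance) + ")"

            Target.stats = stats
            cls.install_count += 1
            cls._applied = True


OncePatches().apply()
OncePatches().apply()  # second call is a no-op
```

Because the flag lives on the class rather than the instance, creating a second OncePatches object and calling apply() again does not wrap the method twice.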