MLX updated: Metal 4 support and distributed training across multiple Macs via RDMA over Thunderbolt

Apple's open-source ML research framework gains Metal 4 support and the ability to scale model training across multiple Macs connected via Thunderbolt, using RDMA to reduce data transfer latency.

MLX evolves with Apple silicon

At the WWDC 2026 Platforms State of the Union, Apple announced a significant update to MLX, its open-source machine learning research framework optimized for Apple silicon. The main addition is support for Metal 4, Apple's new graphics and compute API, which allows more efficient use of the GPU in M-series chips. The second addition, perhaps the more unusual one, concerns scalability: MLX can now distribute model training across multiple Macs connected via Thunderbolt. It uses RDMA (Remote Direct Memory Access) to move data directly between the machines' memory without going through the CPU, which reduces latency and increases throughput.
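To make the idea of distributed training concrete, here is a minimal sketch of data-parallel training, simulated in a single Python process. It does not use MLX or RDMA, and every name in it (`local_gradient`, `all_reduce_mean`, `train`) is hypothetical, chosen only for illustration. Each simulated node computes a gradient on its own shard of the data, then all nodes average their gradients before updating. That averaging step is the kind of exchange a fast interconnect is meant to speed up.

```python
# Illustrative only: simulates data-parallel training in one process.
# None of these names are MLX APIs; no real networking or RDMA is involved.

def local_gradient(w, shard):
    """Gradient of the mean squared error for the model y = w * x on one shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def all_reduce_mean(values):
    """Stand-in for the collective step: every node receives the mean of all values."""
    mean = sum(values) / len(values)
    return [mean] * len(values)

def train(shards, steps=200, lr=0.01):
    # Each simulated node keeps its own copy of the single weight.
    weights = [0.0] * len(shards)
    for _ in range(steps):
        # 1. Every node computes a gradient on its local shard only.
        grads = [local_gradient(w, s) for w, s in zip(weights, shards)]
        # 2. Nodes exchange gradients and average them.
        synced = all_reduce_mean(grads)
        # 3. Every node applies the same update, so the copies stay identical.
        weights = [w - lr * g for w, g in zip(weights, synced)]
    return weights

# Data generated from y = 3x, split across two simulated nodes.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
shards = [data[:4], data[4:]]
final_weights = train(shards)
```

In a real cluster, step 2 is where machines exchange data over the network on every iteration, which is why transfer latency between nodes matters so much for training speed.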

Context: Mac clusters for research

The update is particularly relevant for researchers working with Apple hardware: it lets them build clusters of Mac Studio or Mac Pro machines to train models that would otherwise require dedicated hardware. The announcement is not aimed at the general public, but it signals how Apple is positioning itself in the AI research landscape. Paradoxically, the closed ecosystem becomes an advantage here: Thunderbolt and unified memory enable RDMA transfers that would require far more complex infrastructure on heterogeneous hardware. MacRumors highlighted it as one of the key points of the Platforms State of the Union.
