This PR is an ongoing effort to add a CUDA backend to MLX. Very little works yet, but you can already run the tutorial example.
To build and test:
$ cmake . -Bbuild -DMLX_BUILD_CUDA=ON -DMLX_B...
A few guesses:
Training is still much faster on a high-end Nvidia GPU than on M-series processors, so the split would be CUDA for training and M- or A-series chips for inference through Apple’s Core ML framework.
It would let MLX models (like Apple’s foundation models) run inference on CUDA devices, a bit like when Apple released iTunes for Windows. That gives MLX a chance at becoming a universal format.
It would let Apple use Nvidia clusters in its own cloud for Private Cloud Compute.