Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!
List GPU Devices on Linux¶
You can list GPU devices on Linux using the following command.
:::bash
lspci -v | grep VGA

Machine Learning Frameworks for Managing GPU Resources¶
ZeRO + DeepSpeed: DeepSpeed is a deep learning optimization library (built around the ZeRO optimizer) that makes distributed training on GPU clusters easy, efficient, and effective.
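As a concrete illustration, a minimal DeepSpeed configuration enabling ZeRO stage-2 optimization might look like the sketch below. The numeric values are placeholders for illustration, not recommendations.

```python
# Minimal DeepSpeed config sketch enabling ZeRO stage 2.
# The batch size here is a placeholder value, not a recommendation.
ds_config = {
    "train_batch_size": 32,
    "fp16": {"enabled": True},          # mixed-precision training
    "zero_optimization": {"stage": 2},  # partition optimizer states + gradients
}
```

Such a dict (or an equivalent JSON file) is then passed to `deepspeed.initialize` together with the model and optimizer.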
High-level Scientific Libraries with Built-in GPU Support¶
TensorFlow
PyTorch
XGBoost
LightGBM
Numba
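A quick way to check which of these frameworks can actually see a GPU on the current machine is sketched below; frameworks that are not installed are reported as `None`. The helper name `detect_gpu_frameworks` is illustrative, but the per-framework calls (`torch.cuda.is_available`, `tf.config.list_physical_devices`, `numba.cuda.is_available`) are the real APIs.

```python
def detect_gpu_frameworks():
    """Report, per framework, whether it can see a usable GPU.

    Returns a dict mapping framework name to True/False,
    or None when the framework is not installed.
    """
    results = {}
    try:
        import torch
        results["pytorch"] = torch.cuda.is_available()
    except ImportError:
        results["pytorch"] = None
    try:
        import tensorflow as tf
        results["tensorflow"] = len(tf.config.list_physical_devices("GPU")) > 0
    except ImportError:
        results["tensorflow"] = None
    try:
        from numba import cuda
        results["numba"] = cuda.is_available()
    except ImportError:
        results["numba"] = None
    return results


print(detect_gpu_frameworks())
```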
Low-level Libraries for General Purpose GPU Computing¶
CUDA and Vulkan (successor to OpenCL) are the two most popular frameworks for GPU computing. CUDA is commercial and works on Nvidia GPUs only, while Vulkan is open source and supports GPUs from more vendors. Generally speaking, CUDA has slightly better performance than Vulkan on Nvidia GPUs, so it is suggested that you go with CUDA if you want to squeeze out the most performance.
Rust¶
wgpu wgpu is a cross-platform, safe, pure-Rust graphics API. It runs natively on Vulkan, Metal, D3D12, D3D11, and OpenGL ES, and on top of WebGPU on WASM.
rust-gpu rust-gpu aims at making Rust a first-class language and ecosystem for GPU shaders.
vulkano Vulkano is a Rust wrapper around the Vulkan graphics API. It follows the Rust philosophy: as long as you don't use unsafe code, you shouldn't be able to trigger any undefined behavior. In the case of Vulkan, this means that non-unsafe code should always conform to valid API usage. Vulkano is not as mature as ash.
pathfinder¶
pathfinder is a fast, practical, GPU-based rasterizer for fonts and vector graphics using OpenGL 3.0+, OpenGL ES 3.0+, WebGL 2, and Metal.
GPU Computing in Python¶
Please refer to GPU Computing in Python for more details.
wgpu-py wgpu-py is a next-generation GPU API for Python. It is a Python library wrapping wgpu-native and exposing it with a Pythonic API similar to the WebGPU spec.
vulkan-kompute A general-purpose GPU compute framework for cross-vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). Blazing fast, mobile-enabled, asynchronous, and optimized for advanced GPU data processing use cases.
Beyond CUDA: GPU Accelerated Python for Machine Learning on Cross-Vendor Graphics Cards Made Simple
C++¶
Thrust is a parallel algorithms library which resembles the C++ Standard Template Library (STL). Thrust's high-level interface greatly enhances programmer productivity while enabling performance portability between GPUs and multicore CPUs. Interoperability with established technologies (such as CUDA, TBB, and OpenMP) facilitates integration with existing software.
Java¶
Graphics Rendering¶
OpenGL
PyOpenGL
GLFW
External Graphics Card¶
References¶
Comparative Performance Analysis of Vulkan and CUDA Programming Model Implementations for GPUs
https://towardsdatascience.com/python-performance-and-gpus-1be860ffd58d
https://towardsdatascience.com/speed-up-your-algorithms-part-1-pytorch-56d8a4ae7051
https://towardsdatascience.com/speed-up-your-algorithms-part-2-numba-293e554c5cc1
https://towardsdatascience.com/speed-up-your-algorithms-part-3-parallelization-4d95c0888748
https://towardsdatascience.com/speeding-up-your-algorithms-part-4-dask-7c6ed79994ef
SpeedUpYourAlgorithms/1)%20PyTorch.ipynb
SpeedUpYourAlgorithms/2)%20Numba.ipynb
SpeedUpYourAlgorithms/3)%20Prallelization.ipynb
SpeedUpYourAlgorithms/4)%20Dask.ipynb