The Triton Inference Server provides an optimized cloud and edge inferencing solution.
Development repository for the Triton language and compiler
Efficient, scalable and enterprise-grade CPU/GPU inference server for Hugging Face transformer models 🚀
App Debugging & Inspection Tool for Android
Instant-ngp in PyTorch + CUDA, trained with PyTorch Lightning (high quality and high speed, in only a few lines of legible code)
NVIDIA GPU Operator creates/configures/manages GPUs atop Kubernetes
Intel I2C controller and slave device drivers for macOS
2-Mic Hat, 4-Mic Array, 6-Mic Circular Array Kit, and 4-Mic Linear Array Kit for Raspberry Pi
Parallelformers: An Efficient Model Parallelization Toolkit for Deployment