NVIDIA is pushing what’s possible on the AI PC platform further with its latest RTX technologies announced today.
NVIDIA is advancing the AI PC platform with several key technologies: the RTX AI Toolkit, RTX acceleration for Copilot, the AI Inference Manager SDK and more.
The difference between NVIDIA and others who are just beginning their AI PC journey is obvious from the start. While others mainly talk about how their NPU hardware is faster than the competition's, NVIDIA is making the AI PC platform dynamic by introducing several new features. The company already has a list of technologies available to AI PC consumers running its RTX platform, such as the industry-leading Deep Learning Super Sampling (DLSS) feature, which has received countless updates that build on its neural network to make games run and look better.
The company also offers Chat with RTX, a chatbot that runs locally on your PC and acts as an assistant. On Windows, NVIDIA also offers TensorRT and TensorRT-LLM, which accelerate GenAI and LLM models on client platforms without needing access to the cloud. Several upcoming gaming technologies will use AI enhancements such as ACE (Avatar Cloud Engine), which is also receiving a new update today.
NVIDIA also outlines the current AI computing power landscape, showing how its GeForce RTX 40 desktop GPUs scale from 242 TOPS at the entry level to 1,321 TOPS at the high end. That represents a 4.84x advantage at the lower end and a 26.42x advantage at the very top compared to the 45-50 AI TOPS NPUs we'll see on SoCs this year.
Even NVIDIA GeForce RTX 40 Laptop options like the RTX 4050 start at 194 TOPS, a 3.88x increase over the fastest NPU, while the RTX 4090 Laptop GPU delivers a 13.72x speedup with its 686 TOPS.
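The speedup figures above follow directly from the quoted TOPS numbers, assuming the ~50 TOPS NPU baseline the comparison implies:

```python
# Sanity-check the TOPS speedup ratios quoted above, assuming a
# 50 TOPS NPU baseline (the top of the 45-50 TOPS range NVIDIA cites).
NPU_TOPS = 50

rtx_tops = {
    "RTX 40 desktop (entry)": 242,
    "RTX 40 desktop (high-end)": 1321,
    "RTX 4050 Laptop": 194,
    "RTX 4090 Laptop": 686,
}

for name, tops in rtx_tops.items():
    # e.g. 242 / 50 = 4.84x, 1321 / 50 = 26.42x
    print(f"{name}: {tops} TOPS -> {tops / NPU_TOPS:.2f}x over a {NPU_TOPS} TOPS NPU")
```

Every ratio in the article (4.84x, 26.42x, 3.88x and 13.72x) matches this 50 TOPS baseline exactly.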
Microsoft Copilot Runtime adds RTX acceleration
First among today’s announcements is the Windows Copilot Runtime, which gains RTX acceleration for local PC SLMs (Small Language Models). Copilot is considered Microsoft’s next big thing in the AI PC landscape, and pretty much everyone is trying to jump on the bandwagon. Microsoft and NVIDIA are working together to let developers bring new GenAI features to Windows OS and web applications by providing easy API access to GPU-accelerated SLMs and RAG (retrieval-augmented generation) capabilities.
NVIDIA says RTX GPUs will accelerate these new AI capabilities, delivering fast and responsive AI experiences on Windows devices.
NVIDIA RTX AI Toolkit and NVIDIA AIM SDK help developers make AI experiences faster and better
The second update is the NVIDIA RTX AI Toolkit, which helps developers create application-specific AI models that can run on PCs. The RTX AI Toolkit will include a suite of tools and SDKs for model customization (QLoRA), optimization (TensorRT Model Optimizer), and deployment (TensorRT Cloud) on RTX AI PCs, and will be available in June.
With the new RTX AI Toolkit, developers will be able to deploy their models 4x faster and in 3x smaller packages, speeding up the deployment process and delivering new experiences to users sooner. NVIDIA also shows a comparison between a standard “general purpose” model and an RTX AI Toolkit-optimized model. The general-purpose model runs on an RTX 4090 and produces 48 tokens/second while requiring 17 GB of VRAM. Meanwhile, the optimized RTX AI Toolkit model running on an RTX 4050 GPU produces 187 tokens/second, a nearly 4x increase, while requiring only 5 GB of VRAM.
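The quoted comparison works out to roughly a 3.9x throughput gain alongside a 3.4x VRAM reduction:

```python
# Work out the gains from the article's quoted figures:
# general-purpose model on RTX 4090 vs. RTX AI Toolkit-optimized model on RTX 4050.
gp_tokens_per_s, gp_vram_gb = 48, 17     # general-purpose model (RTX 4090)
opt_tokens_per_s, opt_vram_gb = 187, 5   # optimized model (RTX 4050)

throughput_gain = opt_tokens_per_s / gp_tokens_per_s  # ~3.90x
vram_reduction = gp_vram_gb / opt_vram_gb             # 3.40x

print(f"Throughput: {throughput_gain:.2f}x faster")
print(f"VRAM footprint: {vram_reduction:.2f}x smaller")
```

Notably, the optimized model achieves this on a far weaker GPU, which is the real point of the comparison.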
The RTX AI Toolkit is also being adopted by software partners such as Adobe, Blackmagic Design and Topaz, who are integrating its components into some of the most popular creative applications.
Also rolling out is the new NVIDIA AI Inference Manager (AIM) SDK, which is a streamlined AI deployment toolkit for PC developers. AIM offers developers:
- Unified inference API for all backends (NIM, DML, TRT, etc.) and hardware (cloud, local GPU, etc.)
- Hybrid orchestration on PC and cloud inference with PC capability verification
- Download and configure models and runtime environment on PC
- Low-latency integration into the gaming pipeline
- Running CUDA and graphics simultaneously
The NVIDIA AIM SDK is now available in Early Access and supports all major inference backends such as TensorRT, DirectML, Llama.cpp and PyTorch CUDA on GPUs, CPUs and NPUs.
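The hybrid orchestration idea from the feature list above — verify what the local PC can handle, then dispatch to the local GPU or the cloud — can be sketched as follows. This is a hypothetical illustration of the pattern, not the actual AIM SDK API; `query_local_vram_gb`, `run_local` and `run_cloud` are made-up names.

```python
# Hypothetical sketch of hybrid PC/cloud inference orchestration.
# None of these names come from the AIM SDK; they only illustrate the
# capability-check-then-dispatch pattern described in the feature list.

def query_local_vram_gb() -> float:
    """Placeholder capability probe; a real toolkit would query the GPU."""
    return 8.0  # pretend the local GPU has 8 GB of VRAM

def run_local(prompt: str) -> str:
    return f"[local GPU] {prompt}"

def run_cloud(prompt: str) -> str:
    return f"[cloud] {prompt}"

def infer(prompt: str, required_vram_gb: float) -> str:
    # Verify the PC can host the model; otherwise fall back to the cloud.
    if query_local_vram_gb() >= required_vram_gb:
        return run_local(prompt)
    return run_cloud(prompt)

print(infer("hello", required_vram_gb=5))   # fits locally
print(infer("hello", required_vram_gb=24))  # too large, goes to the cloud
```

The value of such a layer for game developers is that the same call site works whether the player has an RTX 4090 or an NPU-only laptop.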
NVIDIA ACE NIMs are on full display at Computex, GenAI Digital Avatar microservices now available for RTX AI PCs
Finally, we have NVIDIA’s ACE NIMs debuting today. These new ACE inference microservices reduce ACE model deployment time from weeks to minutes and run locally on PCs for natural language understanding, speech synthesis, facial animation, and more.
NVIDIA will showcase the Covert Protocol tech demo, developed with Inworld AI, at Computex, while developers will also show their own ACE-powered work at the event, such as Aww Inc’s digital brand ambassador (Audio2Face), Code Z from OutPalm (Audio2Face), a multi-language demo (Audio2Face), a social engineering demo from Soulshell (Audio2Face) and Sophie from UneeQ (Audio2Face).
And it doesn’t stop there: NVIDIA also announced that ACE (Avatar Cloud Engine) is now generally available for the cloud, paving the way for the future of GenAI avatars. These digital human microservices provide the following technologies:
- NVIDIA Riva ASR, TTS and NMT — for automatic speech recognition, text-to-speech conversion and translation
- NVIDIA Nemotron LLM — for language understanding and contextual response generation
- NVIDIA Audio2Face — for realistic facial animation based on audio tracks
- NVIDIA Omniverse RTX — for realistic skin and hair, ray traced in real time
- NVIDIA Audio2Gesture — for generating body gestures from audio tracks, available soon
- NVIDIA Nemotron-3 4.5B — a new small language model (SLM) designed specifically for low-latency, on-device RTX AI PC inference
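Chained together, these microservices form a speech-in, animated-avatar-out loop. The sketch below uses made-up function names purely to illustrate the data flow (speech recognition, response generation, speech synthesis, facial animation) — it is not any real ACE interface.

```python
# Hypothetical data-flow sketch of an ACE-style digital human pipeline.
# Each function stands in for one microservice named in the list above;
# the names, signatures and return values are illustrative, not NVIDIA APIs.

def speech_to_text(audio: bytes) -> str:      # ASR step (Riva-like)
    return "user question"

def generate_reply(text: str) -> str:          # LLM step (Nemotron-like)
    return f"answer to: {text}"

def text_to_speech(text: str) -> bytes:        # TTS step (Riva-like)
    return text.encode()

def animate_face(audio: bytes) -> list:        # facial animation (Audio2Face-like)
    return [len(audio)]  # placeholder stand-in for blendshape data

def avatar_turn(mic_audio: bytes):
    """One conversational turn: hear, think, speak, animate."""
    text = speech_to_text(mic_audio)
    reply = generate_reply(text)
    speech = text_to_speech(reply)
    animation = animate_face(speech)
    return reply, speech, animation

reply, speech, animation = avatar_turn(b"\x00\x01")
print(reply)  # "answer to: user question"
```

The facial animation step consumes the synthesized audio, not the text — which is why Audio2Face is described as animating "based on audio tracks."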
As you can see, NVIDIA has revealed a host of exciting technologies and innovations in the AI PC segment, powered by its RTX GPUs and RTX platform. It is a clear display of NVIDIA’s leadership in the AI sector and why it remains unrivaled.