NVIDIA Helps You Build Lifelike Digital Humans to Transform Industries

NVIDIA ACE—a suite of technologies bringing digital humans to life with generative AI—is now generally available for developers. Packaged as NVIDIA NIM inference microservices, these technologies enable developers to deliver high-quality natural language understanding, speech synthesis, and facial animation for gaming, customer service, healthcare, and more.

NVIDIA is also introducing ACE PC NIM microservices, available in early access, for deployment across the installed base of 100 million RTX AI PCs and laptops.

ACE now available for production deployment

Leading game and platform developers are revolutionizing real-time character interactions across diverse industries. Companies such as Aww Inc, Dell Technologies, Gumption, Hippocratic AI, Inventec, OurPalm, Perfect World Games, Reallusion, ServiceNow, SoulBotix, SoulShell, and Uneeq are embracing and integrating ACE into their platforms and applications.

NVIDIA's suite of digital human technologies, including NVIDIA Riva, NVIDIA Audio2Face, and NVIDIA Omniverse RTX, is available through NVIDIA AI Enterprise.

The microservices are available on the NVIDIA NGC Catalog and in the NVIDIA ACE GitHub repository, and include:

Riva ASR 2.15.1 adds a new English model with higher quality and accuracy.
Riva TTS 2.15.1 improves the representation of German, European and Latin American Spanish, Mandarin, and Italian. Also included is a beta release of P-Flow, a fast and efficient flow-based model that can adapt to a new voice with very little data.
Riva NMT 2.15.1 adds a new 1.5B any-to-any translation model.
Audio2Face 1.011 adds more blendshape customization options at runtime, supports more audio sampling rates, and improves lip-sync and facial-performance quality with MetaHuman characters.
Omniverse Renderer Microservice 1.0.0 adds a new animation data protocol plus gRPC and HTTP endpoints.
Animation Graph Microservice 1.0.0 adds support for avatar position and facial expression animations.
ACE Agent 4.0.0 adds speech support for custom RAGs, Colang 2.0 support, and prebuilt support for example RAG workflows.
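Since the microservices above expose gRPC and HTTP endpoints, a client typically assembles a JSON request and posts it to a deployed NIM. The sketch below is purely illustrative: the endpoint URL, voice name, and payload field names are assumptions for the example, not NVIDIA's documented API.

```python
import json

# Hypothetical sketch of preparing a request for a Riva TTS-style
# HTTP endpoint. The base URL, field names, and voice identifier
# below are illustrative assumptions, not the documented API.
NIM_BASE_URL = "http://localhost:8000"  # assumed local NIM deployment

def build_tts_request(text, voice="English-US.Female-1", sample_rate_hz=44100):
    """Construct a JSON payload for a hypothetical TTS HTTP endpoint."""
    return {
        "text": text,
        "voice": voice,
        "sample_rate_hz": sample_rate_hz,
        "encoding": "LINEAR_PCM",
    }

payload = build_tts_request("Hello, digital human!")
body = json.dumps(payload)  # serialized request body to POST to the service
```

In a real deployment, `body` would be sent with an HTTP client to the service's synthesis route, and the audio bytes in the response streamed into an animation pipeline such as Audio2Face.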

Early-access microservices include the following:

Nemotron-3 4.5B SLM 0.1.0 is designed for on-device inference and includes INT4 quantization for minimal VRAM usage.
Speech Live Portrait 0.1.0 animates a person's portrait photo from audio, with support for lip sync, blinking, and head-pose animation.
VoiceFont 1.1.1 reduces latency for real-time use cases and supports 4 concurrent batches across all GPUs.
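To see why INT4 quantization matters for on-device models like the 4.5B-parameter Nemotron-3 SLM, a back-of-envelope weights-only estimate helps. The sketch below counts only weight storage; real deployments also need memory for activations and the KV cache, so treat these figures as a lower bound.

```python
def model_weight_bytes(num_params, bits_per_weight):
    """Approximate weight memory: parameter count times bits per weight, in bytes."""
    return num_params * bits_per_weight / 8

params = 4.5e9  # Nemotron-3 4.5B parameter count

# Weights-only footprint at FP16 vs. INT4, in GiB
fp16_gib = model_weight_bytes(params, 16) / 2**30  # ~8.4 GiB
int4_gib = model_weight_bytes(params, 4) / 2**30   # ~2.1 GiB
```

The 4x reduction is what brings a 4.5B-parameter model within reach of the VRAM budgets of consumer RTX AI PCs.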
