NVIDIA Introduces NIM Microservices to Enhance Generative AI in Digital Environments

Alvin Lang
Jul 30, 2024 07:08

NVIDIA unveils new NIM microservices and Metropolis reference workflow at SIGGRAPH to advance generative physical AI in various industries.

NVIDIA has announced significant advancements in generative physical AI, introducing new NIM microservices and the NVIDIA Metropolis reference workflow at SIGGRAPH. These offerings are designed to improve how physical machines are trained and to enhance their ability to handle complex tasks, according to the NVIDIA Blog.

Generative AI in Physical Environments

Generative AI technology, already widely used for writing and learning, is now poised to assist in navigating the physical world. NVIDIA’s new offerings include three fVDB NIM microservices that support deep learning frameworks for 3D worlds and several USD NIM microservices for working with Universal Scene Description (USD), also known as OpenUSD.

The newly developed OpenUSD NIM microservices work in tandem with generative AI models, enabling developers to integrate copilots and agents into USD workflows and expand the capabilities of 3D environments.
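
As a rough illustration of how such an integration might look, the sketch below calls a chat-style NIM endpoint and asks it to draft OpenUSD authoring code. The endpoint URL, model identifier, and environment variable are placeholders; the actual values depend on which OpenUSD NIM microservice is deployed or called from the NVIDIA API catalog.

import os
import requests

# Hypothetical OpenAI-compatible NIM endpoint and model name; substitute the
# actual OpenUSD NIM microservice you are using.
NIM_URL = "https://integrate.api.nvidia.com/v1/chat/completions"
MODEL = "nvidia/usd-code-assistant"  # placeholder identifier

payload = {
    "model": MODEL,
    "messages": [
        {
            "role": "user",
            "content": "Write USD Python code that creates a stage with a cube "
                       "named 'crate' placed 50 units above the ground plane.",
        }
    ],
    "temperature": 0.2,
    "max_tokens": 512,
}

response = requests.post(
    NIM_URL,
    headers={"Authorization": f"Bearer {os.environ['NVIDIA_API_KEY']}"},
    json=payload,
    timeout=60,
)
response.raise_for_status()

# The copilot's reply is ordinary chat-completion text containing USD code
# that can be reviewed and dropped into an OpenUSD pipeline.
print(response.json()["choices"][0]["message"]["content"])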

NVIDIA NIM Microservices Transform Physical AI Landscapes

Physical AI employs advanced simulations and learning methods to help robots and other automated systems perceive, reason, and navigate their surroundings more effectively. This technology is revolutionizing industries such as manufacturing and healthcare by advancing smart spaces and enhancing the functionality of robots, factory technologies, surgical AI agents, and autonomous vehicles.

NVIDIA provides a suite of NIM microservices tailored for specific models and industry applications, supporting capabilities in speech and translation, vision and intelligence, and realistic animation and behavior.

Turning Visual AI Agents Into Visionaries

Visual AI agents, which leverage computer vision capabilities, are designed to perceive and interact with the physical world. These agents are powered by vision language models (VLMs), a new class of generative AI models that bridge digital perception and real-world interaction. VLMs enhance decision-making, accuracy, interactivity, and performance, enabling visual AI agents to handle complex tasks more effectively.

Generative AI-powered visual AI agents are being rapidly deployed across various sectors, including hospitals, factories, warehouses, retail stores, airports, and traffic intersections. NVIDIA’s NIM microservices and reference workflows for physical AI provide developers with the tools needed to build and deploy high-performing visual AI agents.
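
A minimal sketch of that building block, assuming an OpenAI-style chat payload with an image passed inline, is shown below. The invoke URL, model, and exact payload shape vary by VLM microservice and are illustrative here, not a specific NVIDIA API.

import base64
import os
import requests

# Hypothetical VLM NIM endpoint; the exact invoke URL and payload format
# depend on the microservice published in the NVIDIA API catalog.
INVOKE_URL = "https://ai.api.nvidia.com/v1/vlm/nvidia/neva-22b"

with open("camera_frame.png", "rb") as f:
    frame_b64 = base64.b64encode(f.read()).decode()

payload = {
    "messages": [
        {
            "role": "user",
            # Many VLM services accept the image inline as a data URI in the prompt.
            "content": (
                "How many vehicles are waiting at the intersection, and is the "
                f'crosswalk clear? <img src="data:image/png;base64,{frame_b64}" />'
            ),
        }
    ],
    "max_tokens": 256,
    "temperature": 0.2,
}

response = requests.post(
    INVOKE_URL,
    headers={"Authorization": f"Bearer {os.environ['NVIDIA_API_KEY']}"},
    json=payload,
    timeout=60,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])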

Case Study: K2K Enhances Palermo’s Traffic Management

In Palermo, Italy, city traffic managers have deployed visual AI agents using NVIDIA NIM to gain physical insights and better manage roadways. K2K, an NVIDIA Metropolis partner, integrates NIM microservices and VLMs into AI agents that analyze live traffic camera feeds in real time. This allows city officials to ask questions in natural language and receive accurate insights and suggestions for improving city operations, such as adjusting traffic light timings.

Bridging the Simulation-to-Reality Gap

Many AI-driven businesses are adopting a “simulation-first” approach for generative physical AI projects. NVIDIA’s physical AI software, tools, and platforms, including NIM microservices and reference workflows, help streamline the creation of digital representations that accurately mimic real-world conditions. This approach is particularly beneficial for manufacturing, factory logistics, and robotics companies.

Vision language models (VLMs) are being adopted across industries for their ability to interpret real-world imagery, but they require immense volumes of visual data for training. Synthetic data generated from digital twins offers a powerful alternative, providing robust datasets for training physical AI models without the high costs and limitations of real-world data acquisition.

NVIDIA’s tools, such as NIM microservices and Omniverse Replicator, enable developers to build synthetic data pipelines for creating diverse datasets, enhancing the adaptability and performance of models like VLMs.
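
A minimal sketch of such a pipeline, written against the Omniverse Replicator Python API (omni.replicator.core), might look like the following. It assumes execution inside an Omniverse application (for example, via the Script Editor), and the scene content, randomization ranges, and output directory are illustrative.

import omni.replicator.core as rep

# Simple stand-ins for real digital-twin assets, tagged with semantic labels
# so annotations (e.g. bounding boxes) can be written alongside the images.
with rep.new_layer():
    camera = rep.create.camera(position=(0, 0, 1000))
    render_product = rep.create.render_product(camera, (1024, 1024))

    cone = rep.create.cone(semantics=[("class", "cone")], position=(0, -200, 100))
    cube = rep.create.cube(semantics=[("class", "cube")], position=(0, 200, 100))

    # Each triggered frame re-randomizes object poses to diversify the dataset.
    with rep.trigger.on_frame(num_frames=50):
        with rep.create.group([cone, cube]):
            rep.modify.pose(
                position=rep.distribution.uniform((-300, -300, 50), (300, 300, 300)),
                rotation=rep.distribution.uniform((0, 0, 0), (0, 0, 360)),
            )

    # BasicWriter saves RGB frames plus the requested annotations to disk.
    writer = rep.WriterRegistry.get("BasicWriter")
    writer.initialize(output_dir="_synthetic_out", rgb=True, bounding_box_2d_tight=True)
    writer.attach([render_product])

rep.orchestrator.run()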

Availability

Developers can access NVIDIA’s state-of-the-art AI models and NIM microservices at ai.nvidia.com. The Metropolis NIM reference workflow is available on GitHub, and Metropolis VIA microservices are available for download in developer preview. OpenUSD NIM microservices are also available in preview through the NVIDIA API catalog.

Image source: Shutterstock