At NVIDIA GTC 2024, Sterling introduced SkyWave™, a developer private-cloud service for research and innovation in advanced 5G and 6G wireless networks. In collaboration with NVIDIA, Sterling integrated the NVIDIA Aerial Commercial Testbed, also known as the Aerial RAN CoLab Over-the-Air (ARC-OTA), to provide a managed private 5G cloud service for developers. SkyWave combines ARC-OTA’s full-stack AI-RAN software and hardware components with a suite of Sterling technical services and support.
With SkyWave, researchers and developers can quickly get started on the first fully programmable cellular network testbed service, based on the NVIDIA Aerial Commercial Testbed. Users can rapidly experiment, simulate, prototype, validate, and benchmark new software on the over-the-air research network, accelerating commercial-grade development of AI-native wireless networks.
From Algorithm Innovation to OTA

SkyWave removes the burden of procuring, installing, and testing the NVIDIA ARC-OTA components, allowing researchers to focus on their specific 5G and 6G experiments rather than on deployment. SkyWave also provides remote support and quarterly updates to our research customers.
Our work with ARC-OTA extends beyond being the provider of SkyWave; we are also a developer on the platform. Sterling has developed and released two add-ons: SkyWave Service Management, an ARC-OTA Developer Extension, and SkyWave GPU Partition, an ARC-OTA Developer Plugin.
SkyWave Service Management
This developer extension delivers two fundamental features for ARC-OTA. The Service Orchestration feature provides a method to transition ARC-OTA from non-orchestrated containers running under Docker to an orchestrated platform running on Kubernetes. This year-long development effort was completed by GTC 2024.
The Service Monitoring feature uses a combination of Grafana, Loki, Promtail, and Prometheus to provide dashboards that display ARC-OTA service status and current KPIs. Because the feature is built on open-source, industry-standard tools, it provides a platform that each researcher can extend or modify to suit their needs.
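To illustrate how a monitoring stack of this kind can be extended, here is a minimal Python sketch that builds an instant query against the standard Prometheus HTTP API (`/api/v1/query`) and picks out scrape targets that report as down. It is not taken from SkyWave Service Management: the server address and the job names `arc-ota-gnb` and `arc-ota-core` are hypothetical, and the sample response is hand-written in the documented Prometheus response format.

```python
import json
from typing import List
from urllib.parse import urlencode


def build_query_url(base_url: str, promql: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def down_targets(response_text: str) -> List[str]:
    """Return the job names whose `up` sample is 0 in an instant-vector response."""
    payload = json.loads(response_text)
    if payload.get("status") != "success":
        raise ValueError("Prometheus query did not succeed")
    down = []
    for sample in payload["data"]["result"]:
        _timestamp, value = sample["value"]  # value arrives as a string
        if float(value) == 0.0:
            down.append(sample["metric"].get("job", "unknown"))
    return sorted(down)


# Hand-written sample in the Prometheus response format.
# The job names are hypothetical, not actual ARC-OTA service names.
SAMPLE_RESPONSE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "up", "job": "arc-ota-gnb"},
             "value": [1700000000.0, "1"]},
            {"metric": {"__name__": "up", "job": "arc-ota-core"},
             "value": [1700000000.0, "0"]},
        ],
    },
})

if __name__ == "__main__":
    # Assumed address; a real deployment would use its own Prometheus endpoint.
    print(build_query_url("http://localhost:9090", "up"))
    print(down_targets(SAMPLE_RESPONSE))
```

In a live setup, the URL would be fetched with any HTTP client and the response body passed to `down_targets`; the same pattern applies to whichever KPIs a given deployment exports.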
SkyWave GPU Partition Plugin
The plugin provides a guide for configuring the ARC-OTA gNB GPU in a Supermicro GH200 server into multiple MIG (Multi-Instance GPU) partitions when used in a Kubernetes deployment. It was developed to let the end researcher experiment with multiple simultaneous GPU-accelerated workloads on their ARC-OTA system.
As an initial test case for this feature, Sterling is working with NVIDIA to deploy the NVIDIA Multimodal O-RAN RAG Chatbot project on ARC-OTA. Sterling’s work has included optimizing a Llama 3.1 LLM using TensorRT-LLM and deploying it, along with an embeddings model, to Triton Inference Server running in the second GPU partition. Sterling also modified NVIDIA’s code base to migrate it from NVIDIA AI Endpoints to an on-premises instance of Triton.
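As a rough illustration of how a second workload can be pinned to its own MIG partition in Kubernetes, the Python sketch below generates a Pod manifest that requests one MIG slice as an extended resource. It is not the plugin's own configuration. It assumes the NVIDIA device plugin is running with the "mixed" MIG strategy, which exposes slices under names of the form `nvidia.com/mig-<profile>`. The profile `1g.12gb`, the pod name, and the image tag placeholder are hypothetical; the profiles actually available on a given GPU can be listed with `nvidia-smi mig -lgip`.

```python
import json


def mig_pod_manifest(name: str, image: str, mig_profile: str, count: int = 1) -> dict:
    """Build a Pod manifest whose container requests `count` MIG slices."""
    # Resource naming under the device plugin's "mixed" MIG strategy (assumed).
    resource = f"nvidia.com/mig-{mig_profile}"
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # Extended resources are requested through limits.
                    "resources": {"limits": {resource: count}},
                }
            ]
        },
    }


if __name__ == "__main__":
    manifest = mig_pod_manifest(
        name="triton-inference",                    # hypothetical pod name
        image="nvcr.io/nvidia/tritonserver:<tag>",  # placeholder tag, fill in
        mig_profile="1g.12gb",                      # hypothetical profile
    )
    # kubectl accepts JSON manifests as well as YAML.
    print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and applied with `kubectl apply -f`. Keeping the profile as a parameter makes it straightforward to try different partition sizes for different workloads.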
Future Development
Sterling is currently working with NVIDIA to develop an agentic solution for RAN systems. The agentic solution will monitor network traffic and will be able to modify specific configurations of the gNB. This work was presented at GTC 2025 as the DLI course “Automate 5G Network Configurations with NVIDIA AI LLM Agents and Kinetica Accelerated Database.”