HERMES - Hardened Edge Relay & Machine-intelligent Encrypted Speech

HERMES

HERMES

HERMES is the voice and VoIP communications suite of the LDM SDK. It delivers encrypted voice, rich media messaging, and AI-powered human-to-machine interfacing across tactical networks — from high-bandwidth LAN to degraded RF links — entirely offline.

The Problem

Vehicle crews depend on voice communications for coordination, yet military VoIP solutions are often tightly coupled to specific radio hardware or require cloud connectivity for features like conferencing and call routing. Adding AI assistants or natural-language vehicle control typically means streaming audio to external servers — unacceptable in classified or denied environments. The result is a patchwork of intercom, radio, and data systems with no unified communications layer.

How HERMES Solves It

HERMES provides a complete SIP-based communications stack that runs entirely on the vehicle. The VoIP Server acts as a local SIP registrar and proxy, handling encrypted calls (SRTP AES-256-CM), conferencing, and gateway trunks to external networks. The VoIP Agent adds an AI assistant that crew members can speak to naturally — speech recognition, LLM inference, and speech synthesis all execute locally with zero cloud dependency. The Model Context Protocol (MCP) gives the AI agent access to vehicle tools such as MGRS lookup, bearing/distance calculations, and system queries.

Capabilities

Encrypted Voice

End-to-end encrypted voice communications using SRTP (AES-256-CM) with military-grade codecs (MELPe, STANAG 4591). Supports crew intercom, vehicle-to-vehicle, and vehicle-to-command post calling over any IP bearer.

Rich Media Delivery

Beyond voice, HERMES enables rich media exchange between crew stations and command systems:

  • Text messaging — Encrypted SIP MESSAGE-based text chat between registered users and groups
  • File transfer — In-band delivery of mission files, overlays, and imagery via SIP/MSRP
  • Presence & status — Real-time online/offline/busy indicators for all registered users
  • Group notifications — Broadcast alerts and mission updates to conference groups
  • Call logs & CDR — Searchable history of all voice and text interactions with timestamps

Rich media is delivered using the same encrypted SIP infrastructure as voice, ensuring all communications maintain the same security posture without requiring separate data channels.

Human-to-Machine Interfacing

HERMES provides a natural language interface between crew members and vehicle systems through the AI-powered VoIP Agent:

  • Voice commands — Speak natural language queries and commands to the vehicle AI assistant
  • Speech-to-Text (STT) — Offline Vosk speech recognition converts voice to text for processing
  • Text-to-Speech (TTS) — Offline Piper TTS delivers spoken responses back to the operator
  • MCP Tool Integration — The AI agent can execute vehicle tools (MGRS lookup, bearing/distance calculations, system queries) via the Model Context Protocol
  • Conversational context — Multi-turn dialogue with memory for complex interactions
  • Hands-free operation — Operators interact with vehicle systems without leaving their tactical display
  • Automated responses — Auto-answer mode for intercom relay, status announcements, and unmanned stations

The human-to-machine pipeline runs entirely offline — no cloud connectivity required. LLM inference (Ollama/Gemma), speech recognition (Vosk), and speech synthesis (Piper) all execute locally on the vehicle compute platform.

Architecture

graph TB subgraph "HERMES Services" SRV[VoIP Server
SIP Registrar/Proxy] AGT[VoIP Agent
AI Assistant] MCP[MCP Tool Server
Vehicle Tools] end subgraph "Clients" CLI1[VoIP Client 1
Crew Station] CLI2[VoIP Client 2
Commander] CLI3[VoIP Client 3
Driver] end subgraph "External Networks" EXT[External SIP
Gateway/Trunk] AI[Ollama LLM
Local Inference] end subgraph "Media & Messaging" VOICE[Encrypted Voice
SRTP/RTP] MSG[Text & Media
SIP MESSAGE/MSRP] end CLI1 <-->|SIP/RTP| SRV CLI2 <-->|SIP/RTP| SRV CLI3 <-->|SIP/RTP| SRV AGT <-->|SIP/RTP| SRV SRV <-->|SIP Trunk| EXT AGT <-->|HTTP| AI AGT <-->|MCP| MCP SRV --- VOICE SRV --- MSG style SRV fill:#4CAF50 style AGT fill:#9C27B0 style MCP fill:#FF9800 style CLI1 fill:#2196F3 style CLI2 fill:#2196F3 style CLI3 fill:#2196F3 style VOICE fill:#00796B style MSG fill:#00796B

Components

VoIP Server

The VoIP Server acts as a SIP registrar and proxy, managing call routing between clients within the vehicle network and to external networks.

Key Features:

  • SIP user registration and authentication
  • Call routing and proxying
  • CDR (Call Detail Records) logging
  • DSCP EF (Expedited Forwarding) QoS marking
  • Military codec support (MELPe, STANAG 4591)

Learn more about VoIP Server →

VoIP Client (Standalone)

The standalone VoIP Client runs as its own desktop window with a tabbed interface for dialing, contacts, settings, and help. Ideal for development, testing, and crew stations without a GVA HMI.

Standalone VoIP Client

Key Features:

  • SIP registration and calling
  • Multiple codec support (G.711, G.729, MELPe, Opus)
  • SRTP encryption (RFC 3711)
  • Address book management
  • Audio input/output device selection
  • Headless mode for intercom/relay
  • Auto-answer with WAV playback

Learn more about VoIP Client (Standalone) →

VoIP Client (GVA External App)

The GVA VoIP Client runs as a Display Extension inside the GVA HMI's COM screen (F7), providing integrated communications without the crew leaving their tactical display.

GVA VoIP Client In-Call

Key Features:

  • Renders within GVA HMI safe area (bezel-aware)
  • Conference groups and online presence
  • Call logs with timestamps and duration
  • Integrated contacts and address book
  • Respects active HMI theme colours
  • DDS registration via GVA Registry Service

Learn more about VoIP Client (GVA) →

VoIP Agent

The VoIP Agent is an AI-powered voice assistant that provides the human-to-machine interface. It answers calls and responds using natural language, or operates in standalone mode with a local microphone and speaker.

Key Features:

  • Ollama/Gemma LLM integration (local inference, no cloud)
  • Piper TTS (offline text-to-speech)
  • Vosk STT (offline speech recognition)
  • MCP (Model Context Protocol) tool support
  • Military coordinate tools (MGRS, bearing, distance)
  • Standalone mode (local mic/speaker)
  • Multi-turn conversational context

Learn more about VoIP Agent →

MCP Tool Server

The MCP Tool Server exposes vehicle system tools to the AI agent via the Model Context Protocol, enabling natural language access to platform capabilities.

Key Features:

  • MGRS coordinate conversion
  • Bearing and distance calculations
  • Vehicle system status queries
  • Extensible tool plugin architecture

Learn more about MCP Tool Server →

Audio Codecs

The VoIP suite supports multiple audio codecs optimised for different bandwidth and quality requirements:

Codec Bitrate Sample Rate Description
G.711 μ-law 64 kbps 8 kHz Standard telephony, highest quality
G.711 A-law 64 kbps 8 kHz European telephony standard
G.729 8 kbps 8 kHz Low bandwidth, good quality
MELPe 2.4 kbps 8 kHz Military standard (MIL-STD-3005)
STANAG 4591 1.2-2.4 kbps 8 kHz NATO interoperability (TSVCIS)
Opus 6-510 kbps 48 kHz Modern, adaptive bitrate

Security

SRTP Encryption

All VoIP communications can be encrypted using SRTP (Secure Real-time Transport Protocol) per RFC 3711. Key exchange is performed via SDP during call setup.

Network QoS

VoIP traffic is marked with DSCP EF (Expedited Forwarding, 46) for priority handling on managed networks. This ensures voice packets receive preferential treatment over data traffic.

Supported RFCs & Standards

HERMES implements the following IETF RFCs and military standards:

SIP Signalling

RFC Title Usage
RFC 3261 SIP: Session Initiation Protocol Core call signalling
RFC 3262 Reliability of Provisional Responses PRACK for reliable 1xx
RFC 3263 Locating SIP Servers DNS SRV/NAPTR resolution
RFC 3264 Offer/Answer Model with SDP Codec negotiation
RFC 3265 SIP-Specific Event Notification Presence and subscriptions
RFC 3311 UPDATE Method Mid-dialog session updates
RFC 3323 Privacy Mechanism for SIP Caller ID privacy
RFC 3428 SIP Extension for Instant Messaging Text messaging (MESSAGE)
RFC 3515 SIP REFER Method Call transfer
RFC 3891 SIP Replaces Header Attended transfer

Media & Transport

RFC Title Usage
RFC 3550 RTP: Real-Time Transport Protocol Audio media delivery
RFC 3551 RTP Profile for Audio/Video Payload type assignments
RFC 3711 SRTP: Secure Real-time Transport Protocol Media encryption (AES-256-CM)
RFC 4568 SDP Security Descriptions SRTP key exchange via SDP
RFC 4733 DTMF Relay via RTP (telephone-event) In-band DTMF tones
RFC 5761 Multiplexing RTP and RTCP Single-port media

Codecs

RFC / Standard Title Usage
RFC 7587 RTP Payload Format for Opus Opus codec negotiation
RFC 8817 TSVCIS RTP Payload Format NATO STANAG 4591 wideband
MIL-STD-3005 MELPe (Mixed Excitation Linear Prediction) 2.4 kbps tactical voice
STANAG 4591 NATO Narrowband & Wideband Voice Coding NATO interoperability
ITU-T G.711 Pulse Code Modulation (μ-law / A-law) Standard telephony
ITU-T G.729 CS-ACELP 8 kbps Low bandwidth voice

Authentication & Security

RFC Title Usage
RFC 2617 HTTP Digest Authentication SIP digest auth
RFC 3261 §26 SIP Security Mechanisms TLS transport
RFC 5061 DNS SRV for SIP over TLS Secure trunk discovery
RFC 2474 Differentiated Services (DSCP) QoS packet marking
RFC 3246 Expedited Forwarding PHB EF (46) for voice

Session Description

RFC Title Usage
RFC 4566 SDP: Session Description Protocol Media negotiation
RFC 3605 RTCP Attribute in SDP RTCP port signalling
RFC 5245 ICE: Interactive Connectivity Establishment NAT traversal
RFC 8866 SDP (revised) Updated SDP syntax

Quick Start

Starting the VoIP Server

# Start server on default port (5060)
./build/bin/gva-voip-server --domain=0

# With recording enabled
./build/bin/gva-voip-server --domain=0 --record-calls

# With DSCP EF marking
./build/bin/gva-voip-server --domain=0 --dscp-ef

Starting a VoIP Client

# Register with local server
./build/bin/gva-voip-client --user=operator1 --server=127.0.0.1

# With secure calling
./build/bin/gva-voip-client --user=operator1 --server=127.0.0.1 --srtp

Starting the AI Agent

# VoIP mode (answers calls)
./build/bin/gva-voip-agent --user=agent.ai --server=127.0.0.1 --model=gemma2

# Standalone mode (local mic/speaker)
./build/bin/gva-voip-agent --standalone --model=gemma2