HERMES - Hardened Edge Relay & Machine-intelligent Encrypted Speech¶

HERMES is the voice and VoIP communications suite of the LDM SDK. It delivers encrypted voice, rich media messaging, and AI-powered human-to-machine interfacing across tactical networks — from high-bandwidth LAN to degraded RF links — entirely offline.
The Problem¶
Vehicle crews depend on voice communications for coordination, yet military VoIP solutions are often tightly coupled to specific radio hardware or require cloud connectivity for features like conferencing and call routing. Adding AI assistants or natural-language vehicle control typically means streaming audio to external servers — unacceptable in classified or denied environments. The result is a patchwork of intercom, radio, and data systems with no unified communications layer.
How HERMES Solves It¶
HERMES provides a complete SIP-based communications stack that runs entirely on the vehicle. The VoIP Server acts as a local SIP registrar and proxy, handling encrypted calls (SRTP AES-256-CM), conferencing, and gateway trunks to external networks. The VoIP Agent adds an AI assistant that crew members can speak to naturally — speech recognition, LLM inference, and speech synthesis all execute locally with zero cloud dependency. The Model Context Protocol (MCP) gives the AI agent access to vehicle tools such as MGRS lookup, bearing/distance calculations, and system queries.
Capabilities¶
Encrypted Voice¶
End-to-end encrypted voice communications using SRTP (AES-256-CM) with military-grade codecs (MELPe, STANAG 4591). Supports crew intercom, vehicle-to-vehicle, and vehicle-to-command post calling over any IP bearer.
Rich Media Delivery¶
Beyond voice, HERMES enables rich media exchange between crew stations and command systems:
- Text messaging — Encrypted SIP MESSAGE-based text chat between registered users and groups
- File transfer — In-band delivery of mission files, overlays, and imagery via SIP/MSRP
- Presence & status — Real-time online/offline/busy indicators for all registered users
- Group notifications — Broadcast alerts and mission updates to conference groups
- Call logs & CDR — Searchable history of all voice and text interactions with timestamps
Rich media is delivered using the same encrypted SIP infrastructure as voice, ensuring all communications maintain the same security posture without requiring separate data channels.
Human-to-Machine Interfacing¶
HERMES provides a natural language interface between crew members and vehicle systems through the AI-powered VoIP Agent:
- Voice commands — Speak natural language queries and commands to the vehicle AI assistant
- Speech-to-Text (STT) — Offline Vosk speech recognition converts voice to text for processing
- Text-to-Speech (TTS) — Offline Piper TTS delivers spoken responses back to the operator
- MCP Tool Integration — The AI agent can execute vehicle tools (MGRS lookup, bearing/distance calculations, system queries) via the Model Context Protocol
- Conversational context — Multi-turn dialogue with memory for complex interactions
- Hands-free operation — Operators interact with vehicle systems without leaving their tactical display
- Automated responses — Auto-answer mode for intercom relay, status announcements, and unmanned stations
The human-to-machine pipeline runs entirely offline — no cloud connectivity required. LLM inference (Ollama/Gemma), speech recognition (Vosk), and speech synthesis (Piper) all execute locally on the vehicle compute platform.
Architecture¶
SIP Registrar/Proxy] AGT[VoIP Agent
AI Assistant] MCP[MCP Tool Server
Vehicle Tools] end subgraph "Clients" CLI1[VoIP Client 1
Crew Station] CLI2[VoIP Client 2
Commander] CLI3[VoIP Client 3
Driver] end subgraph "External Networks" EXT[External SIP
Gateway/Trunk] AI[Ollama LLM
Local Inference] end subgraph "Media & Messaging" VOICE[Encrypted Voice
SRTP/RTP] MSG[Text & Media
SIP MESSAGE/MSRP] end CLI1 <-->|SIP/RTP| SRV CLI2 <-->|SIP/RTP| SRV CLI3 <-->|SIP/RTP| SRV AGT <-->|SIP/RTP| SRV SRV <-->|SIP Trunk| EXT AGT <-->|HTTP| AI AGT <-->|MCP| MCP SRV --- VOICE SRV --- MSG style SRV fill:#4CAF50 style AGT fill:#9C27B0 style MCP fill:#FF9800 style CLI1 fill:#2196F3 style CLI2 fill:#2196F3 style CLI3 fill:#2196F3 style VOICE fill:#00796B style MSG fill:#00796B
Components¶
VoIP Server¶
The VoIP Server acts as a SIP registrar and proxy, managing call routing between clients within the vehicle network and to external networks.
Key Features:
- SIP user registration and authentication
- Call routing and proxying
- CDR (Call Detail Records) logging
- DSCP EF (Expedited Forwarding) QoS marking
- Military codec support (MELPe, STANAG 4591)
Learn more about VoIP Server →
VoIP Client (Standalone)¶
The standalone VoIP Client runs as its own desktop window with a tabbed interface for dialing, contacts, settings, and help. Ideal for development, testing, and crew stations without a GVA HMI.

Key Features:
- SIP registration and calling
- Multiple codec support (G.711, G.729, MELPe, Opus)
- SRTP encryption (RFC 3711)
- Address book management
- Audio input/output device selection
- Headless mode for intercom/relay
- Auto-answer with WAV playback
Learn more about VoIP Client (Standalone) →
VoIP Client (GVA External App)¶
The GVA VoIP Client runs as a Display Extension inside the GVA HMI's COM screen (F7), providing integrated communications without the crew leaving their tactical display.

Key Features:
- Renders within GVA HMI safe area (bezel-aware)
- Conference groups and online presence
- Call logs with timestamps and duration
- Integrated contacts and address book
- Respects active HMI theme colours
- DDS registration via GVA Registry Service
Learn more about VoIP Client (GVA) →
VoIP Agent¶
The VoIP Agent is an AI-powered voice assistant that provides the human-to-machine interface. It answers calls and responds using natural language, or operates in standalone mode with a local microphone and speaker.
Key Features:
- Ollama/Gemma LLM integration (local inference, no cloud)
- Piper TTS (offline text-to-speech)
- Vosk STT (offline speech recognition)
- MCP (Model Context Protocol) tool support
- Military coordinate tools (MGRS, bearing, distance)
- Standalone mode (local mic/speaker)
- Multi-turn conversational context
MCP Tool Server¶
The MCP Tool Server exposes vehicle system tools to the AI agent via the Model Context Protocol, enabling natural language access to platform capabilities.
Key Features:
- MGRS coordinate conversion
- Bearing and distance calculations
- Vehicle system status queries
- Extensible tool plugin architecture
Learn more about MCP Tool Server →
Audio Codecs¶
The VoIP suite supports multiple audio codecs optimised for different bandwidth and quality requirements:
| Codec | Bitrate | Sample Rate | Description |
|---|---|---|---|
| G.711 μ-law | 64 kbps | 8 kHz | Standard telephony, highest quality |
| G.711 A-law | 64 kbps | 8 kHz | European telephony standard |
| G.729 | 8 kbps | 8 kHz | Low bandwidth, good quality |
| MELPe | 2.4 kbps | 8 kHz | Military standard (MIL-STD-3005) |
| STANAG 4591 | 1.2-2.4 kbps | 8 kHz | NATO interoperability (TSVCIS) |
| Opus | 6-510 kbps | 48 kHz | Modern, adaptive bitrate |
Security¶
SRTP Encryption¶
All VoIP communications can be encrypted using SRTP (Secure Real-time Transport Protocol) per RFC 3711. Key exchange is performed via SDP during call setup.
Network QoS¶
VoIP traffic is marked with DSCP EF (Expedited Forwarding, 46) for priority handling on managed networks. This ensures voice packets receive preferential treatment over data traffic.
Supported RFCs & Standards¶
HERMES implements the following IETF RFCs and military standards:
SIP Signalling¶
| RFC | Title | Usage |
|---|---|---|
| RFC 3261 | SIP: Session Initiation Protocol | Core call signalling |
| RFC 3262 | Reliability of Provisional Responses | PRACK for reliable 1xx |
| RFC 3263 | Locating SIP Servers | DNS SRV/NAPTR resolution |
| RFC 3264 | Offer/Answer Model with SDP | Codec negotiation |
| RFC 3265 | SIP-Specific Event Notification | Presence and subscriptions |
| RFC 3311 | UPDATE Method | Mid-dialog session updates |
| RFC 3323 | Privacy Mechanism for SIP | Caller ID privacy |
| RFC 3428 | SIP Extension for Instant Messaging | Text messaging (MESSAGE) |
| RFC 3515 | SIP REFER Method | Call transfer |
| RFC 3891 | SIP Replaces Header | Attended transfer |
Media & Transport¶
| RFC | Title | Usage |
|---|---|---|
| RFC 3550 | RTP: Real-Time Transport Protocol | Audio media delivery |
| RFC 3551 | RTP Profile for Audio/Video | Payload type assignments |
| RFC 3711 | SRTP: Secure Real-time Transport Protocol | Media encryption (AES-256-CM) |
| RFC 4568 | SDP Security Descriptions | SRTP key exchange via SDP |
| RFC 4733 | DTMF Relay via RTP (telephone-event) | In-band DTMF tones |
| RFC 5761 | Multiplexing RTP and RTCP | Single-port media |
Codecs¶
| RFC / Standard | Title | Usage |
|---|---|---|
| RFC 7587 | RTP Payload Format for Opus | Opus codec negotiation |
| RFC 8817 | TSVCIS RTP Payload Format | NATO STANAG 4591 wideband |
| MIL-STD-3005 | MELPe (Mixed Excitation Linear Prediction) | 2.4 kbps tactical voice |
| STANAG 4591 | NATO Narrowband & Wideband Voice Coding | NATO interoperability |
| ITU-T G.711 | Pulse Code Modulation (μ-law / A-law) | Standard telephony |
| ITU-T G.729 | CS-ACELP 8 kbps | Low bandwidth voice |
Authentication & Security¶
| RFC | Title | Usage |
|---|---|---|
| RFC 2617 | HTTP Digest Authentication | SIP digest auth |
| RFC 3261 §26 | SIP Security Mechanisms | TLS transport |
| RFC 5061 | DNS SRV for SIP over TLS | Secure trunk discovery |
| RFC 2474 | Differentiated Services (DSCP) | QoS packet marking |
| RFC 3246 | Expedited Forwarding PHB | EF (46) for voice |
Session Description¶
| RFC | Title | Usage |
|---|---|---|
| RFC 4566 | SDP: Session Description Protocol | Media negotiation |
| RFC 3605 | RTCP Attribute in SDP | RTCP port signalling |
| RFC 5245 | ICE: Interactive Connectivity Establishment | NAT traversal |
| RFC 8866 | SDP (revised) | Updated SDP syntax |
Quick Start¶
Starting the VoIP Server¶
# Start server on default port (5060)
./build/bin/gva-voip-server --domain=0
# With recording enabled
./build/bin/gva-voip-server --domain=0 --record-calls
# With DSCP EF marking
./build/bin/gva-voip-server --domain=0 --dscp-ef
Starting a VoIP Client¶
# Register with local server
./build/bin/gva-voip-client --user=operator1 --server=127.0.0.1
# With secure calling
./build/bin/gva-voip-client --user=operator1 --server=127.0.0.1 --srtp
Starting the AI Agent¶
# VoIP mode (answers calls)
./build/bin/gva-voip-agent --user=agent.ai --server=127.0.0.1 --model=gemma2
# Standalone mode (local mic/speaker)
./build/bin/gva-voip-agent --standalone --model=gemma2