5 Tips to enhance voice surveillance
With regulatory expectations around voice channels increasing, firms need to ensure that they’re not only capturing relevant voice channels, but using voice data to enhance their surveillance and compliance.
With so many different forms of digital communication available, the idea of speaking to a contact over the phone might feel ‘old school’. However, in business verticals like financial services, voice conversations are still a vital part of day-to-day operations – and as such are increasingly subject to regulatory scrutiny.
The EU’s MiFID II and the FCA’s SYSC 10A continue to require firms to retain records of voice calls related to trades. In the U.S., FINRA’s 2026 Annual Regulatory Oversight Report refers to recordkeeping lapses more than 50 times and, for the first time, includes a dedicated section on generative AI, with direct implications for how firms capture, monitor, and manage voice data. Meanwhile, since 2021, the Securities and Exchange Commission (SEC) and Commodity Futures Trading Commission (CFTC) have levied over $3.5 billion in fines related to off-channel communications failures – many of which have included voice communications.
Against this backdrop, and with regulators aware that voice solutions now exist and work, compliant voice capture is no longer a secondary consideration for firms.
While surveillance methods such as ‘dip sampling’ of recorded voice conversations may have sufficed in the past, firms are now required to capture and monitor voice communications at increasing scale, and to apply to audio the same kinds of monitoring, such as keyword analysis, that they already apply to digital channels. Firms looking to establish or strengthen an effective voice capture, archiving, and surveillance strategy can benefit from these five areas of focus.
1) Know which communications channels might have voice functionality
There are many channels where the need to capture voice will be obvious. Phone calls can be placed on ‘actual’ hardware like landline and mobile phones, but also via Voice Over Internet Protocol (VoIP) services that place calls over internet connections.
The rise of hybrid working has introduced a range of video meeting and collaboration platforms, with these video calls naturally including an audio component that should be captured. Familiar names include Zoom, Cisco Webex, and Microsoft Teams, and firms should ensure these channels are being considered when establishing a voice capture and surveillance program.
However, other channels present less obvious voice risk. Trading venues and turrets can include both messaging and voice chat functionalities to enable traders to make real-time decisions. Popular messaging and social media platforms like WhatsApp, X, Instagram, and LinkedIn all feature voice-note or messaging functionality, which gives users looking to hide misconduct through off-channel communications another means of doing so – and compliance teams another challenge to tackle.
AI-powered tools introduce a further layer of voice risk. Some firms have begun deploying AI agents that conduct, summarize, or assist with calls — and these tools generate their own voice-adjacent outputs (transcripts, summaries, action items) that may also carry recordkeeping obligations. Any deployment of AI in or around voice workflows should be included in your channel inventory.
2) Comprehensively review your voice systems and workflows – and keep reviewing them
Knowing which channels are capable of voice communications is one thing, but knowing which channels your teams actually use to conduct voice communications is vital in building an effective capture and surveillance program.
While your teams might use Microsoft Teams for digital messaging and collaboration, they may use Zoom for calls with external clients, and Cisco for internal calls. Understanding where your teams are conducting voice communications means you aren’t investing resource into implementing solutions in the wrong areas, and will allow you to research comprehensive end-to-end solutions that can capture and archive data from all the relevant voice channels. This workflow audit will help to establish where potential gaps in your posture are, what solutions might be required to plug them, and where you might need to work with third-party partners to meet regulatory requirements.
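At its simplest, the workflow audit described above is a comparison between the voice channels each team actually uses and the channels your capture solution covers. The following is a minimal sketch of that comparison; the function name `capture_gaps`, the team names, and the channel lists are hypothetical and stand in for whatever your own inventory contains.

```python
from typing import Dict, Set


def capture_gaps(
    channels_in_use: Dict[str, Set[str]],
    captured_channels: Set[str],
) -> Dict[str, Set[str]]:
    """Return, per team, the voice channels in use that are not being captured."""
    gaps: Dict[str, Set[str]] = {}
    for team, channels in channels_in_use.items():
        missing = channels - captured_channels  # set difference: used but not captured
        if missing:
            gaps[team] = missing
    return gaps


# Hypothetical audit results, for illustration only.
usage = {
    "sales": {"Zoom", "Microsoft Teams"},
    "trading": {"Turret", "Mobile"},
}
captured = {"Microsoft Teams", "Turret"}

for team, missing in capture_gaps(usage, captured).items():
    print(team, "is missing capture for:", sorted(missing))
```

Re-running a check like this whenever the inventory changes gives a repeatable way to see where capture coverage has fallen behind actual usage.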
Return-to-office mandates across many large financial institutions since 2024 have reshuffled voice channel usage in ways that aren’t always reflected in documentation. Firms that audited their voice workflows in 2022 or 2023 — when remote work was dominant — may find that actual usage patterns have shifted materially. A fresh audit is warranted, particularly for trading floor environments where turret and soft-phone usage has evolved.
Performing a compliance and workflow review isn’t ‘one and done’ – new voice channels come into the market constantly, and existing communications channels are adding voice capabilities. Compliance teams need to make sure their strategies are consistently reviewed to ensure they’re functioning as intended and can be adapted as required to mitigate new risks or challenges.
3) Ensure recordings are automatically archived – and that archive data is complete
Some voice and video communication applications will automatically save and store recordings on a user’s system in local files, or perhaps even on a shared cloud platform. However, this means that files are easily accessible for users, and can be subject to accidental loss, corruption, or even purposeful deletion should a user be trying to hide records of conversations. Voice files are often quite large compared to text files, and over-zealous system administrators looking to save space might look to delete them – without realizing the potential repercussions.
Where applications or workflows are not set up to automatically save recordings, such as traditional landline or mobile phones, firms run a real risk of conversations not being recorded at all, or data being lost or corrupted where teams attempt to move data to a separate archiving system.
By employing a solution that captures from source and automatically sends all recordings to a compliant archive, firms gain peace of mind knowing that their data will be captured and retained without needing to take additional steps, and risk of loss is eliminated. Regulators take an exceptionally dim view of firms being unable to provide data on request, and automatic archiving means that firms will never turn up empty-handed should regulators come knocking.
Storing data in a centralized, compliant archive also means that firms keep all of their compliance data in one highly accessible location. This makes it easier to build cases or notice signs of potential misconduct by reviewing relevant data from different channels side by side, which provides vital context and reduces the time and resource needed to locate files.
4) Treat AI-generated outputs from voice as records in their own right
AI-generated outputs for voice are fast-evolving and not yet widely understood. When AI tools transcribe a call, summarise a meeting, or generate a surveillance flag from a voice recording, those outputs are increasingly seen as records, not just processing artefacts.
One regulator leading the charge in this area is FINRA. In its 2026 Oversight Report, the U.S. regulator remains technology neutral but makes clear that firms remain fully accountable for how AI is used across communications, supervision, and documentation. That means the transcript sitting in your AI vendor’s system, the call summary emailed to a relationship manager, and the automated misconduct flag generated by a surveillance model may all be within the scope of your recordkeeping obligations. If they relate to ‘business as such’, they need to be retained.
Practically, this requires:
- A retention policy that explicitly covers AI-generated voice outputs, not just audio files
- Clarity on where AI-generated outputs are stored, including third-party vendor systems, and whether those locations meet WORM or equivalent retention standards
- Supervision frameworks that treat AI-generated communications (call summaries sent to clients, AI-drafted follow-ups) as communications subject to review
- Logging of prompts and AI outputs for accountability, consistent with FINRA’s guidance on GenAI governance
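One way to picture these requirements is as a record that carries its own governance metadata and can be checked for gaps. The sketch below is illustrative only: the `AIOutputRecord` class, its field names, the `governance_gaps` function, and the retention period used in the example are all assumptions, not taken from FINRA guidance or any vendor’s API.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass(frozen=True)
class AIOutputRecord:
    """Hypothetical metadata for one AI-generated output derived from a voice call."""
    record_id: str
    source_recording_id: str  # link back to the original audio file
    kind: str                 # e.g. "transcript", "summary", "surveillance_flag"
    storage_location: str     # may be a third-party vendor system
    worm_compliant: bool      # storage meets WORM or an equivalent standard
    prompt_logged: bool       # prompt and output were logged for accountability
    created_on: date
    retention_days: int       # placeholder; set from your own retention policy

    def retain_until(self) -> date:
        return self.created_on + timedelta(days=self.retention_days)


def governance_gaps(record: AIOutputRecord) -> List[str]:
    """List the governance checks this record fails."""
    gaps: List[str] = []
    if not record.worm_compliant:
        gaps.append("storage is not WORM or equivalent")
    if not record.prompt_logged:
        gaps.append("prompt and output not logged")
    if not record.source_recording_id:
        gaps.append("no link to source recording")
    return gaps


# Hypothetical call summary held in a vendor system.
summary = AIOutputRecord(
    record_id="sum-001",
    source_recording_id="call-123",
    kind="summary",
    storage_location="vendor-system",
    worm_compliant=False,
    prompt_logged=True,
    created_on=date(2026, 1, 5),
    retention_days=365,  # illustrative value, not a regulatory figure
)
print(governance_gaps(summary))
```

Keeping the link to the source recording on every output is a design choice that lets a reviewer move from a summary or flag back to the audio it was derived from.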
Firms that have invested in AI transcription and surveillance tools without extending their records governance to cover those tools’ outputs have an emerging gap, and one that examiners around the world are increasingly likely to identify.
5) Go beyond voice files and leverage accurate transcription and cross-channel correlation
Current ‘dip sampling’ practices, where calls are selected at random to be reviewed, are increasingly inadequate and present a very real risk of signs of misconduct being missed. Even the largest firms do not have the resources to manually review the hundreds of hours of voice conversations that take place daily, and sifting through ‘business as usual’ conversations presents other reviewing challenges, such as language barriers and trading jargon.
Accurate transcription of calls and voice files gives a full written record of conversations, with speaker separation and associated metadata generated automatically and stored in a compliant archive. From there, the transcription can be easily searched and will fall under a firm’s wider compliance policies.
So, for example, warning signs that would once have taken many hours of manual listening and transcription to find can be identified and flagged automatically by compliance policies. Potential warning signs of misconduct that can be monitored for include:
- Signs of non-financial misconduct, such as racist, threatening, or derogatory language
- Indications of insider trading or market manipulation, spoofed instructions, pressure selling on recorded lines, or off-channel referrals – even if keyword triggers are potentially obscured by trading slang or jargon
- Potential signs of off-channel communications, such as speakers being asked to “contact me on WhatsApp” or “let’s discuss offline”
By having access to accurately transcribed records of conversations, stored alongside the audio files, firms can more easily integrate voice channels into their existing compliance workflows and automatically flag potential risks for further review.
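As a rough illustration of how a transcript can be screened for the off-channel cues mentioned above, the sketch below applies a small set of regular expressions to speaker-separated transcript segments. The `Segment` structure, the phrase list, and the `flag_off_channel` function are assumptions made for illustration; a production lexicon would be far broader and would typically be paired with models that can handle slang and jargon.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Segment:
    """One speaker turn in a transcript (hypothetical structure)."""
    speaker: str
    start_seconds: float
    text: str


# Illustrative patterns only; not an exhaustive or recommended lexicon.
OFF_CHANNEL_PATTERNS = [
    re.compile(r"\b(contact|message|text|ping)\s+me\s+on\s+(whatsapp|signal|telegram)\b", re.I),
    re.compile(r"\blet'?s\s+(discuss|talk|take\s+this)\s+offline\b", re.I),
]


def flag_off_channel(segments: List[Segment]) -> List[Tuple[Segment, str]]:
    """Return each segment that matches a pattern, with the pattern that matched."""
    hits: List[Tuple[Segment, str]] = []
    for segment in segments:
        # Normalise curly apostrophes so "let’s" and "let's" match alike.
        text = segment.text.replace("’", "'")
        for pattern in OFF_CHANNEL_PATTERNS:
            if pattern.search(text):
                hits.append((segment, pattern.pattern))
                break  # one flag per segment is enough for review
    return hits


transcript = [
    Segment("Trader", 0.0, "Confirming the order for tomorrow."),
    Segment("Client", 4.2, "Sure, contact me on WhatsApp later."),
    Segment("Trader", 9.0, "Let’s discuss offline."),
]
for segment, matched in flag_off_channel(transcript):
    print(f"{segment.start_seconds:>5.1f}s {segment.speaker}: {segment.text}")
```

Because each flag keeps the speaker and timestamp, a reviewer can jump straight to the relevant point in the stored audio rather than listening to the whole call.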
Transcription alone is no longer the finish line; it’s the starting point. Voice data becomes most powerful when correlated across channels. A message saying “call me” followed by an unrecorded or unreviewed phone call is a pattern, not just two separate events. AI-enabled solutions will be critical here in detecting and flagging pattern anomalies. Surveillance systems that can join voice data with electronic communications for the same user, around the same time, provide a material advantage in identifying evasion and conduct risk. Firms should evaluate whether their surveillance tooling enables this kind of cross-channel view.
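The “call me” pattern described above can be expressed as a simple join between messages and recorded calls for the same user within a time window. The sketch below is a simplified illustration under stated assumptions: the `Message` and `RecordedCall` structures, the function name, and the 30-minute window are all hypothetical, and real surveillance tooling would use richer signals than a literal phrase match.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Message:
    user: str
    sent_at: datetime
    text: str


@dataclass
class RecordedCall:
    user: str
    started_at: datetime


def unmatched_call_requests(
    messages: List[Message],
    calls: List[RecordedCall],
    window: timedelta = timedelta(minutes=30),  # assumed window, tune to your policy
) -> List[Message]:
    """Return 'call me' messages with no recorded call for that user inside the window."""
    flagged: List[Message] = []
    for message in messages:
        if "call me" not in message.text.lower():
            continue
        followed_by_recorded_call = any(
            call.user == message.user
            and message.sent_at <= call.started_at <= message.sent_at + window
            for call in calls
        )
        if not followed_by_recorded_call:
            flagged.append(message)
    return flagged


start = datetime(2026, 1, 5, 9, 0)
messages = [
    Message("alice", start, "Call me when you are free"),
    Message("bob", start, "call me"),
]
calls = [RecordedCall("alice", start + timedelta(minutes=5))]

for message in unmatched_call_requests(messages, calls):
    print("No recorded call found after message from", message.user)
```

A flag from a check like this is a prompt for review rather than evidence of misconduct, since the call may simply never have taken place.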
Global Relay Voice is designed to address each of these challenges in a single, integrated platform. Its broad Connector library spans trading turrets, soft phones, mobile, and collaboration tools, with automatic data capture directed to a compliant, AI-enabled Archive. This removes the risk of data loss or gaps and ensures a consistent, searchable audit trail.
Critically for 2026, all voice data and AI-generated outputs — transcripts, translations, surveillance flags — are retained within the same governed environment, directly meeting emerging recordkeeping obligations. AI-powered transcription in 57 languages, speaker separation, and cross-channel surveillance then deliver at-scale, contextualised monitoring: moving firms beyond dip sampling toward genuine risk detection, with the metadata quality and audit trail that regulators expect to see when they come knocking.