Some customers would rather press the mic button than tap out a message — especially on WhatsApp, Messenger, and Instagram. Captivation Hub's Conversation AI bot can now meet them there. When a contact sends a voice note or audio file, the bot transcribes it, runs it through your existing training and settings, and replies the same way it would to text.
This article walks through what's supported, which channels work, how to switch the feature on, and the limits worth knowing about up front.
What is Audio Response in Conversation AI?
Audio Response gives your Conversation AI bot the ability to "hear" your customers. When a contact sends a voice note or audio attachment, Captivation Hub transcribes the audio to text, hands the transcript over to your bot, and the bot generates a context-aware reply — so the customer can talk naturally without ever switching to typing.
Supported Audio Types
Knowing which formats the bot can transcribe reliably will save you debugging time later.
- Platform-native voice notes — WhatsApp, Facebook, and Instagram voice notes recorded with each app's mic button. They're delivered into Captivation Hub as audio objects the bot can transcribe.
- File formats (uploads/attachments) — OGG, MP3, MP4 (audio-only), AAC, M4A, MPEG. Make sure the file is audio-only — video MP4s aren't supported as audio inputs.
- Multiple audio files in one interaction — Supported. When several audio files arrive close together, the bot treats them as one combined interaction.
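If you pre-screen attachments before they reach the bot (for example, in your own upload form or middleware), a check along these lines may help. This is only a sketch: the function name, the `has_video_track` flag, and the mapping from the formats listed above to file extensions are assumptions for illustration, not part of Captivation Hub.

```python
from pathlib import Path

# Extensions derived from the format list above.
# Assumption: each listed format maps to the file suffix of the same name.
SUPPORTED_AUDIO_EXTENSIONS = {".ogg", ".mp3", ".mp4", ".aac", ".m4a", ".mpeg"}


def is_supported_audio(filename: str, has_video_track: bool = False) -> bool:
    """Return True if the file looks like an audio input the bot could transcribe.

    `has_video_track` is a hypothetical flag supplied by the caller; this
    sketch does not inspect the file contents itself.
    """
    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_AUDIO_EXTENSIONS:
        return False
    # The article rules out video MP4s; rejecting any file with a video
    # track, whatever its container, is an assumption of this sketch.
    return not has_video_track
```

For example, `is_supported_audio("voice-note.ogg")` passes, while `is_supported_audio("clip.mp4", has_video_track=True)` and `is_supported_audio("invoice.pdf")` are rejected.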
Channel Compatibility
Audio Response only runs on channels where Conversation AI is already operating. Make sure each channel below is properly connected in Captivation Hub before expecting the bot to respond to audio.
- WhatsApp
- Facebook Messenger
- Instagram Direct Messages
- SMS (MMS)
How To Set Up Audio Response
The setup itself is short — most of it is just opening the right bot and flipping a toggle.
Step 1 — Open the bot you want to configure. From your account, go to AI Agents > Conversation AI > Agent List. Click the three dots (⋮) next to the bot, then choose Edit to open its settings.
Step 2 — Enable Audio Responses. In the bot's settings, toggle on "Also allow this bot to respond to: Voice Notes." and save your changes.
Step 3 — Test on a connected channel. Send a voice note from WhatsApp or one of the connected social channels and confirm the bot replies as expected.
Behavior & Limitations
A few things worth knowing about timing and message handling — they affect how you design flows for audio-first customers.
- Wait Time aggregation — The bot waits the configured Wait Time Before Responding so it can collect multiple inbound messages (audio + text together) and send back a single unified reply.
- Message limit — The bot follows your Maximum Message Limit. When the limit is reached, the bot sleeps until reset, just like in your standard text flow.
- Transcripts & transparency — Want to see what the bot heard and why it answered the way it did? Open the AI Response Info sidebar inside Conversations to review the transcript, prompt, training sources, and response info.
- Channel policies — Delivery on Meta channels has to comply with their policy windows (for example, the 24-hour window on Messenger and Instagram). Plan your flows around that.
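To make the Wait Time behavior concrete, here is a rough model of how messages that arrive close together could be folded into one interaction. It is a sketch under stated assumptions, not Captivation Hub's actual implementation: the `InboundMessage` type and both function names are hypothetical, and the window is assumed to start at the first message of a group rather than reset with each new message.

```python
from dataclasses import dataclass


@dataclass
class InboundMessage:
    received_at: float  # seconds, measured from any fixed reference point
    kind: str           # "audio" or "text"
    content: str        # transcript for audio, message body for text


def group_into_interactions(messages, wait_time_seconds):
    """Split messages into groups that would each get one unified reply.

    Assumption: a group collects everything that arrives within
    `wait_time_seconds` of the group's first message.
    """
    interactions = []
    current = []
    window_start = None
    for message in sorted(messages, key=lambda m: m.received_at):
        # Close the current group once a message falls outside its window.
        if current and message.received_at - window_start > wait_time_seconds:
            interactions.append(current)
            current = []
        if not current:
            window_start = message.received_at
        current.append(message)
    if current:
        interactions.append(current)
    return interactions


def combined_input(interaction):
    """Join transcripts and text bodies into the single input the bot answers."""
    return "\n".join(message.content for message in interaction)
```

With a 10-second wait time, a voice note at 0 seconds and a text at 3 seconds land in the same interaction, while a voice note at 20 seconds starts a new one. When you choose a Wait Time, the trade-off is the same as in this model: a longer window merges more messages into one reply but delays the response.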
Frequently Asked Questions
Q: Does Audio Response cost extra?
Usage rolls up under standard Conversation AI billing, plus whatever your channel charges for messaging (SMS/MMS, WhatsApp, etc.).
Q: Will the bot reply with audio or with text?
The bot replies with standard channel messages — generally text — for maximum compatibility across channels. Design your flows assuming text replies.
Q: Can I limit audio handling to specific channels?
Yes. In Bot Settings, assign only the channels you want the bot to use. The bot listens for and responds to audio only on the channels you've turned on.
Q: How are multiple audio files handled?
Multiple audio files that arrive close together get transcribed and processed during the bot's Wait Time window, so the bot can craft one cohesive, context-aware reply instead of firing off multiple responses.
Q: Where do I see what the bot "heard" and why it responded a certain way?
Open the AI Response Info sidebar inside the conversation. From there you can review the transcript, the prompt the bot used, and the training sources it pulled from.