Send an audio clip and the recipient sees the native iMessage audio
message balloon — waveform, play button, scrubber — not a generic file
pill.
Send an audio message
Two-step flow: upload the audio file via POST /v1/files, then send it
by file ID. Common formats are accepted — m4a, mp3, wav, caf, aiff.
import { readFileSync } from "node:fs";
import { createClient } from "@messages-dev/sdk";

const client = createClient();

// 1. Upload the audio file
const file = await client.uploadFile({
  file: readFileSync("clip.m4a"),
  mimeType: "audio/mp4",
  filename: "clip.m4a",
});

// 2. Send it as an audio message
await client.sendAudioMessage({
  from: "+15551234567",
  to: "+15559876543",
  audioMessage: file.id,
});
File IDs are reusable — upload a clip once and send it to many
recipients without re-uploading.
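Because a file ID can be reused, sending one clip to several recipients can be sketched as a simple loop. This is a minimal sketch rather than an SDK helper: `AudioSender` is a hypothetical interface that mirrors only the `sendAudioMessage` call from the example above, and `sendClipToMany` and `SendOutcome` are illustrative names.

```typescript
// Hypothetical minimal interface: only the sendAudioMessage call from the
// example above is assumed; the real client type may differ.
interface AudioSender {
  sendAudioMessage(params: {
    from: string;
    to: string;
    audioMessage: string;
  }): Promise<unknown>;
}

interface SendOutcome {
  to: string;
  ok: boolean;
  error?: unknown;
}

// Sends one already-uploaded clip to each recipient in turn, reusing the
// same file ID. A failure for one recipient does not stop the others.
async function sendClipToMany(
  client: AudioSender,
  from: string,
  recipients: string[],
  fileId: string,
): Promise<SendOutcome[]> {
  const outcomes: SendOutcome[] = [];
  for (const to of recipients) {
    try {
      await client.sendAudioMessage({ from, to, audioMessage: fileId });
      outcomes.push({ to, ok: true });
    } catch (error) {
      outcomes.push({ to, ok: false, error });
    }
  }
  return outcomes;
}
```

With the client from the example above, a call might look like `await sendClipToMany(client, "+15551234567", ["+15559876543"], file.id)`. Sending sequentially keeps the sketch simple; whether parallel sends are appropriate depends on rate limits this page does not describe.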
Parameters
| Field | Required | Description |
|---|---|---|
| from | Yes | Sender line handle. |
| to | Yes | Recipient phone number, Apple ID, or chat ID (cht_...). |
| audioMessage | Yes | File ID (file_...) of an audio file uploaded via POST /v1/files. |
| replyTo | No | Message ID or GUID to reply to. |
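A client-side pre-flight check can catch missing fields before a request is made. The sketch below takes its field names from the table above, but the checks themselves are assumptions for illustration (in particular, that every file ID starts with `file_`), and `checkAudioMessageParams` is a hypothetical helper, not part of the SDK.

```typescript
// Field names come from the parameters table; the validation rules are
// assumptions for illustration, not the API's own checks.
interface AudioMessageParams {
  from: string;
  to: string;
  audioMessage: string;
  replyTo?: string;
}

// Returns a list of problems; an empty list means the params look plausible.
function checkAudioMessageParams(
  params: Partial<AudioMessageParams>,
): string[] {
  const problems: string[] = [];
  if (!params.from) problems.push("from is required");
  if (!params.to) problems.push("to is required");
  if (!params.audioMessage) {
    problems.push("audioMessage is required");
  } else if (!params.audioMessage.startsWith("file_")) {
    // Assumes every file ID carries the file_ prefix shown in the table.
    problems.push("audioMessage should be a file ID (file_...)");
  }
  return problems;
}
```

The server remains the source of truth; a check like this only shortens the feedback loop for obvious mistakes such as passing a filename where a file ID is expected.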
Requirements
iMessage only — audio messages are not supported on SMS lines. Some
lines don’t support audio messages; sending from one of those returns
400 advanced_features_required.
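To branch on this error in application code, something like the following type guard could be used. It is only a sketch: this page does not document how the SDK surfaces HTTP errors, so the `status` and `code` properties on `ApiErrorLike` are assumptions, and `isAdvancedFeaturesRequired` is a hypothetical helper.

```typescript
// Assumed error shape: an HTTP status plus a machine-readable code.
// Check what the SDK actually throws before relying on this.
interface ApiErrorLike {
  status?: number;
  code?: string;
}

// True when the error looks like the 400 advanced_features_required
// response described above.
function isAdvancedFeaturesRequired(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  const e = err as ApiErrorLike;
  return e.status === 400 && e.code === "advanced_features_required";
}
```

In a `catch` block around `sendAudioMessage`, a `true` result indicates that retrying from the same line will not help and a different line is needed.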
Common audio formats are accepted:
m4a / aac
mp3
wav
caf
aiff
The audio is transcoded server-side, so you don’t need to pre-encode.
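The upload call still takes a `mimeType`, so a small lookup from file extension to MIME type can be handy. The sketch below uses commonly used MIME types for the formats listed above; `audio/mp4` and `audio/x-caf` appear on this page, while the other values are assumptions about what the API accepts. `audioMimeType` is a hypothetical helper.

```typescript
// Extension to MIME type for the listed formats. Only audio/mp4 and
// audio/x-caf appear on this page; the rest are common values, assumed here.
const AUDIO_MIME_TYPES = new Map<string, string>([
  ["m4a", "audio/mp4"],
  ["aac", "audio/aac"],
  ["mp3", "audio/mpeg"],
  ["wav", "audio/wav"],
  ["caf", "audio/x-caf"],
  ["aiff", "audio/aiff"],
]);

// Returns the MIME type for a filename, or undefined when the extension
// is missing or not in the list above.
function audioMimeType(filename: string): string | undefined {
  const dot = filename.lastIndexOf(".");
  if (dot < 0) return undefined;
  const ext = filename.slice(dot + 1).toLowerCase();
  return AUDIO_MIME_TYPES.get(ext);
}
```

An `undefined` result is a reasonable signal to reject the file locally rather than upload something the service may not recognize.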
Receive an audio message
Inbound voice memos are delivered through the standard message.received
webhook. The message payload sets is_audio_message: true
and the audio attachment includes a transcription field with the text that
iMessage auto-generates on-device:
{
  "event": "message.received",
  "data": {
    "id": "msg_abc123",
    "chat_id": "cht_def456",
    "sender": "+15559876543",
    "text": null,
    "is_from_me": false,
    "is_audio_message": true,
    "attachments": [
      {
        "filename": "Audio Message.caf",
        "mime_type": "audio/x-caf",
        "size": 24576,
        "url": "https://files.messages.dev/...",
        "transcription": "Hey, can you grab milk on the way home?"
      }
    ],
    "sent_at": 1710000000000
  },
  "timestamp": 1710000000123
}
Download the audio bytes from attachments[0].url and use transcription to
read what was said without running speech-to-text yourself.
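Downloading the bytes can be sketched with any fetch-compatible function. The example below assumes the attachment URL can be fetched with a plain GET; this page does not say whether authentication headers are required. `FetchLike` and `downloadAttachment` are hypothetical names, and the fetch function is passed in so the helper does not depend on a particular runtime.

```typescript
// Minimal slice of the fetch API that this sketch relies on.
type FetchLike = (url: string) => Promise<{
  ok: boolean;
  status: number;
  arrayBuffer(): Promise<ArrayBuffer>;
}>;

// Downloads an attachment and returns its raw bytes.
// Throws if the server answers with a non-2xx status.
async function downloadAttachment(
  url: string,
  fetchImpl: FetchLike,
): Promise<Uint8Array> {
  const res = await fetchImpl(url);
  if (!res.ok) {
    throw new Error(`Download failed with HTTP ${res.status}`);
  }
  return new Uint8Array(await res.arrayBuffer());
}
```

On runtimes with a global `fetch` (for example Node.js 18 or later), it can be passed as the second argument.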
transcription is generated by Apple on the receiving Mac. It may be null
on the first delivery if transcription hasn’t finished yet, which usually
takes a couple of seconds. Non-voice-memo audio attachments (e.g. a
drag-and-dropped MP3) arrive with is_audio_message: false and no transcription.
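The three cases above (voice memo with a transcript, voice memo whose transcript is still pending, and a plain audio attachment) can be told apart with a small function over the raw webhook payload. This is a sketch: the interfaces are inferred from the sample payload on this page rather than from a published schema, and `readVoiceMemo` is a hypothetical helper.

```typescript
// Shapes inferred from the sample payload; treat them as assumptions.
interface InboundAttachment {
  filename: string;
  mime_type: string;
  size: number;
  url: string;
  transcription?: string | null;
}

interface MessageReceivedData {
  sender: string;
  is_audio_message: boolean;
  attachments?: InboundAttachment[];
}

interface VoiceMemo {
  sender: string;
  url: string;
  // null means the transcript was not available in this delivery.
  transcription: string | null;
}

// Returns the voice memo details, or null when the message is not a voice
// memo (for example a plain MP3 attachment with is_audio_message: false).
function readVoiceMemo(data: MessageReceivedData): VoiceMemo | null {
  if (!data.is_audio_message) return null;
  const audio = data.attachments?.[0];
  if (!audio) return null;
  return {
    sender: data.sender,
    url: audio.url,
    transcription: audio.transcription ?? null,
  };
}
```

Normalizing a missing transcript to `null` lets callers handle "not a voice memo" and "transcript not ready" as two distinct outcomes.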
import express from "express";
import { verifyWebhook } from "@messages-dev/sdk";

// Assumes an Express app; set up body parsing in whatever form
// verifyWebhook expects for signature checking.
const app = express();

app.post("/webhooks", async (req, res) => {
  const event = await verifyWebhook(
    req.body,
    req.headers["x-webhook-signature"],
    process.env.WEBHOOK_SECRET!,
  );

  if (event.event === "message.received" && event.data.isAudioMessage) {
    const voice = event.data.attachments?.[0];
    console.log(`${event.data.sender} said: ${voice?.transcription ?? "(no transcript yet)"}`);
  }

  res.sendStatus(200);
});