What data does the AI collect during a call?
The AI collects the call audio, transcript, caller ID, and any details the caller provides during the conversation. Card data, sensitive health data, and anything outside the scope you've configured are not stored by default.
Written By Catherine Weir
Last updated About 2 hours ago
During a call, an AI voice agent collects four main types of data: the audio recording, the text transcript, the caller ID and call metadata, and any specific details the caller provides during the conversation (name, account number, reason for calling, appointment preferences, etc.). What it does not collect, by default: credit card numbers, sensitive health data outside the configured scope, biometric voice prints, or anything not needed for the specific purpose of the call.
Every piece of collected data should be tied to a specific business purpose, retained only as long as necessary, and governed by your platform's data processing agreement.
What's collected by default
Call audio: the full recording of the conversation (if recording is enabled, subject to state consent law)
Text transcript: time-stamped text of what was said, labeled by speaker
Caller ID: the phone number and, where available, the CNAM name of the calling party
Call metadata: timestamps, duration, routing decisions, connection quality signals
Structured fields: details the caller explicitly provided (name, address, appointment time, reason for visit)
Derived signals: detected intent, sentiment, confidence scores
Actions taken: appointments booked, messages sent, escalations initiated
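As a rough illustration, the categories above could be grouped into a single per-call record. This is a minimal sketch, not any vendor's actual schema: the class name CallRecord and every field name are assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass
class CallRecord:
    """Hypothetical per-call record; field names are illustrative only."""
    caller_number: str                    # caller ID
    started_at: datetime                  # call metadata
    duration_seconds: float               # call metadata
    caller_name: Optional[str] = None     # CNAM name, where available
    audio_uri: Optional[str] = None       # stays None when recording is disabled
    transcript: List[dict] = field(default_factory=list)             # time-stamped, speaker-labeled turns
    structured_fields: Dict[str, str] = field(default_factory=dict)  # details the caller provided
    derived_signals: Dict[str, float] = field(default_factory=dict)  # intent, sentiment, confidence
    actions_taken: List[str] = field(default_factory=list)           # bookings, messages, escalations


# Example: a call with recording disabled and one booking.
record = CallRecord(
    caller_number="+15555550100",
    started_at=datetime(2024, 1, 15, 14, 30, tzinfo=timezone.utc),
    duration_seconds=92.5,
)
record.transcript.append(
    {"t": 0.0, "speaker": "caller", "text": "I'd like to book an appointment."}
)
record.structured_fields["reason_for_visit"] = "annual checkup"
record.actions_taken.append("appointment_booked")
```

Keeping each category in its own field makes it easier to apply different retention or access rules to, say, audio versus structured fields.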
What's usually NOT collected or stored (and why)
Credit card numbers: on a PCI-compliant platform, card data is captured via DTMF masking or tokenized links. The digits never enter the AI's context, the transcript, or the recording.
Social Security numbers and other sensitive identifiers: redacted in real time unless the configured business purpose specifically requires them
Specific health details outside the configured scope: for HIPAA-covered deployments, only the PHI necessary for the task is collected
Biometric voice prints: the AI does not extract or store voice fingerprints of callers
Data about callers beyond what they share on the call: the AI doesn't look up callers' third-party records, purchase history, or personal data beyond what you've explicitly integrated
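To make the redaction idea concrete, here is a minimal sketch of scrubbing SSN-like numbers from transcript text. It is an assumption-laden illustration: real platforms typically redact on the speech and transcription side with more robust detection, and the pattern, the function name redact_transcript_text, and the ssn_required flag are all hypothetical.

```python
import re

# Assumed pattern: US SSNs written as 123-45-6789, 123 45 6789, or 123456789.
SSN_PATTERN = re.compile(r"\b\d{3}[- ]?\d{2}[- ]?\d{4}\b")


def redact_transcript_text(text: str, ssn_required: bool = False) -> str:
    """Replace SSN-like digit runs unless the configured purpose requires them."""
    if ssn_required:
        # The configured business purpose needs the identifier, so keep it.
        return text
    return SSN_PATTERN.sub("[REDACTED]", text)


redacted = redact_transcript_text("My social is 123-45-6789, thanks.")
kept = redact_transcript_text("My social is 123-45-6789, thanks.", ssn_required=True)
```

A simple pattern like this will miss numbers that are spoken as words or split across turns, which is one reason to ask a vendor how their redaction actually works.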
What happens to the collected data
Stored encrypted: AES-256 at rest, TLS 1.2+ in transit
Access-controlled: only authorized team members can view it, with all access logged
Retained for your configured period: typically 30 days to several years, set by you
Deleted per your policy: retention expiration triggers automated deletion across all storage tiers
Not used to train AI models: your call data is not used to improve the underlying AI model providers' systems unless you opt in
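The retention step can be pictured as a periodic job that compares each record's age with the configured period. The sketch below is a simplified assumption of how such a check might look; the function names and the record shape (a dict with id and stored_at) are hypothetical, and a real deletion job would also have to reach backups and every storage tier.

```python
from datetime import datetime, timedelta, timezone


def is_expired(stored_at: datetime, retention_days: int, now: datetime) -> bool:
    """True once the record has been kept for the full retention period."""
    return now >= stored_at + timedelta(days=retention_days)


def select_for_deletion(records, retention_days: int, now: datetime):
    """Return the ids of records whose retention period has run out."""
    return [
        r["id"] for r in records
        if is_expired(r["stored_at"], retention_days, now)
    ]


# Example with a 30-day retention period and a fixed "current" time.
now = datetime(2024, 3, 1, tzinfo=timezone.utc)
records = [
    {"id": "call-a", "stored_at": datetime(2024, 1, 1, tzinfo=timezone.utc)},
    {"id": "call-b", "stored_at": datetime(2024, 2, 20, tzinfo=timezone.utc)},
]
to_delete = select_for_deletion(records, retention_days=30, now=now)
```

Passing the current time in as an argument, rather than reading the clock inside the function, keeps the check easy to test and audit.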
What gets shared with third-party AI model providers
If the voice AI platform uses an LLM from OpenAI, Anthropic, Google, or another provider (most do), some data is shared during the call for the model to generate responses:
The relevant portion of the transcript
The agent instructions and knowledge base content being used
The conversation context needed to respond appropriately
Ephemeral reasoning context that isn't retained beyond the request
Good platforms have zero-retention agreements with their model providers, meaning the model provider doesn't store the data after generating the response. Ask your vendor whether they have such an arrangement with every model provider they use.
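One way to picture the data that leaves the platform during a call is as a request assembled from only the items listed above. The following is a sketch under stated assumptions, not any provider's real API: build_model_request, its parameters, and the payload keys are hypothetical, and it assumes the platform forwards only recent turns and strips everything except speaker and text.

```python
def build_model_request(instructions, knowledge_snippets, transcript_turns, max_turns=6):
    """Assemble a minimal, hypothetical payload for a model provider."""
    if max_turns <= 0:
        raise ValueError("max_turns must be positive")
    # Only the most recent turns: the "relevant portion" of the transcript.
    recent = transcript_turns[-max_turns:]
    return {
        "instructions": instructions,         # agent instructions
        "knowledge": list(knowledge_snippets),  # knowledge base content in use
        # Keep speaker and text only; timestamps and other keys are dropped.
        "messages": [{"speaker": t["speaker"], "text": t["text"]} for t in recent],
    }


turns = [
    {"speaker": "caller", "text": "Hi", "t": 0.0},
    {"speaker": "agent", "text": "Hello, how can I help?", "t": 1.2},
    {"speaker": "caller", "text": "I need an appointment.", "t": 3.4},
]
request = build_model_request("Be brief and polite.", ["Open 9-5 weekdays"], turns, max_turns=2)
```

Whatever the real payload looks like on your platform, the useful question for a vendor is the same: which fields are sent, and what the provider is contractually allowed to keep.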
Your rights and your callers' rights
Both you and your callers have rights over the data collected:
You, the business: full access to your own call data, with the ability to export, delete, or restrict it at will
Callers: rights under applicable privacy laws (CCPA/CPRA, VCDPA, etc.) to access their own data, delete it, correct it, or opt out of certain uses
Your voice AI platform should support both sets of rights: for you, through the admin dashboard; for callers, through a data subject request process.
Red flags in a vendor's data handling
No clear data processing agreement
Inability to say where data is stored, for how long, and by whom it can be accessed
Using customer call data to train their own AI models without explicit opt-in
No zero-retention agreement with underlying model providers
Vague "we may share with third parties" language
No data subject request process
See it in action
365agents publishes its detailed data handling practices in our Trust Center. The Receptionist Agent uses zero-retention arrangements with all underlying AI model providers, and our USDP framework ensures your callers have consistent privacy rights regardless of which state they're in.