What data does the AI collect during a call?

The AI collects the call audio, transcript, caller ID, and any details the caller provides during the conversation. Card data, sensitive health data, and anything outside the scope you've configured are not stored by default.

Written By Catherine Weir

Last updated About 2 hours ago

During a call, an AI voice agent collects four main types of data: the audio recording, the text transcript, the caller ID and call metadata, and any specific details the caller provides during the conversation (name, account number, reason for calling, appointment preferences, etc.). What it does not collect, by default: credit card numbers, sensitive health data outside the configured scope, biometric voice prints, or anything not needed for the specific purpose of the call.

Every piece of collected data should be tied to a specific business purpose, retained only as long as necessary, and governed by your platform's data processing agreement.

What's collected by default

  • Call audio: the full recording of the conversation (if recording is enabled, subject to state consent law)

  • Text transcript: time-stamped text of what was said, labeled by speaker

  • Caller ID: the phone number and, where available, the CNAM name of the calling party

  • Call metadata: timestamps, duration, routing decisions, connection quality signals

  • Structured fields: details the caller explicitly provided (name, address, appointment time, reason for visit)

  • Derived signals: detected intent, sentiment, confidence scores

  • Actions taken: appointments booked, messages sent, escalations initiated
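As a rough illustration, the categories above could be modeled as a single call record. This is a minimal sketch, not any platform's actual schema: the class names, field names, and example values are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class TranscriptTurn:
    speaker: str          # e.g. "caller" or "agent"
    started_at: datetime  # when this turn began
    text: str


@dataclass
class CallRecord:
    # Hypothetical field names; a real platform's schema will differ.
    caller_number: str                  # caller ID
    caller_cnam: Optional[str]          # CNAM name, where available
    started_at: datetime                # call metadata
    duration_seconds: float             # call metadata
    recording_uri: Optional[str]        # None when recording is disabled
    transcript: list[TranscriptTurn] = field(default_factory=list)
    structured_fields: dict[str, str] = field(default_factory=dict)
    detected_intent: Optional[str] = None       # derived signal
    sentiment_score: Optional[float] = None     # derived signal
    actions_taken: list[str] = field(default_factory=list)


# Example: a call with recording disabled and one booked appointment.
record = CallRecord(
    caller_number="+15555550123",
    caller_cnam=None,
    started_at=datetime(2024, 1, 15, 14, 30, tzinfo=timezone.utc),
    duration_seconds=92.5,
    recording_uri=None,
)
record.transcript.append(
    TranscriptTurn("caller", record.started_at, "I'd like to book a visit.")
)
record.structured_fields["reason_for_visit"] = "annual checkup"
record.actions_taken.append("appointment_booked")
```

Keeping every category in one explicit structure makes it easier to see which fields exist at all, which helps when tying each one to a business purpose and a retention period.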

What's usually NOT collected or stored (and why)

  • Credit card numbers: on a PCI-compliant platform, card data is captured via DTMF masking or tokenized links. The digits never enter the AI's context, the transcript, or the recording.

  • Social Security numbers and other sensitive identifiers: redacted in real time unless the configured business purpose specifically requires them

  • Specific health details outside the configured scope: for HIPAA-covered deployments, only the PHI necessary for the task is collected

  • Biometric voice prints: the AI does not extract or store voice fingerprints of callers

  • Data from outside the call itself: the AI doesn't look up callers' third-party records, purchase history, or other personal data beyond the sources you've explicitly integrated
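To make the idea of real-time redaction concrete, here is a minimal sketch that masks sensitive identifiers in a transcript line before it is stored. The patterns and the `redact` function are illustrative assumptions, not a description of how any particular platform does it; production redaction usually relies on more than regular expressions.

```python
import re

# Illustrative patterns only (assumptions, not a vendor's actual rules).
# SSN in the common 3-2-4 hyphenated form.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# A run of 13 to 19 digits, optionally separated by spaces or hyphens,
# used here as a fallback for card-like numbers that were spoken aloud.
CARD_LIKE_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def redact(text: str) -> str:
    """Replace SSN-like and card-like digit runs with placeholders."""
    text = CARD_LIKE_PATTERN.sub("[REDACTED_CARD]", text)
    text = SSN_PATTERN.sub("[REDACTED_SSN]", text)
    return text


print(redact("My SSN is 123-45-6789."))
# My SSN is [REDACTED_SSN].
```

A pattern-based pass like this is a backstop. For card numbers, the approach described above (DTMF masking or tokenized links) keeps the digits out of the transcript in the first place.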

What happens to the collected data

  • Stored encrypted: AES-256 at rest, TLS 1.2+ in transit

  • Access-controlled: only authorized team members can view it, with all access logged

  • Retained for your configured period: typically 30 days to several years, set by you

  • Deleted per your policy: retention expiration triggers automated deletion across all storage tiers

  • Not used to train AI models: your call data is not used to improve the underlying AI model providers' systems unless you opt in
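The retention rule above comes down to a date comparison. The sketch below shows one way an expiry check could select records for deletion; the function names and the record layout (`id`, `created_at`) are hypothetical, and a real deletion job would also have to reach every storage tier.

```python
from datetime import datetime, timedelta, timezone


def is_expired(created_at: datetime, retention_days: int, now: datetime) -> bool:
    """True once the record has been kept for the full retention period."""
    return now >= created_at + timedelta(days=retention_days)


def select_for_deletion(records: list[dict], retention_days: int, now: datetime) -> list[str]:
    """Return the ids of records whose retention period has elapsed."""
    return [
        r["id"]
        for r in records
        if is_expired(r["created_at"], retention_days, now)
    ]


# Example with a 30-day retention period (hypothetical records).
now = datetime(2024, 3, 1, tzinfo=timezone.utc)
records = [
    {"id": "call-001", "created_at": datetime(2024, 1, 15, tzinfo=timezone.utc)},
    {"id": "call-002", "created_at": datetime(2024, 2, 20, tzinfo=timezone.utc)},
]
expired_ids = select_for_deletion(records, retention_days=30, now=now)
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the check deterministic and easy to audit.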

What gets shared with third-party AI model providers

If the voice AI platform uses an LLM from OpenAI, Anthropic, Google, or another provider (most do), some data is shared during the call for the model to generate responses:

  • The relevant portion of the transcript

  • The agent instructions and knowledge base content being used

  • The conversation context needed to respond appropriately

  • Ephemeral reasoning context that isn't retained beyond the request

Good platforms have zero-retention agreements with their model providers, meaning the provider doesn't store the data after generating the response. Ask your vendor whether such an arrangement is in place for every model provider they use.
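One way to picture "the relevant portion of the transcript" is a request builder that forwards only the most recent turns plus the instructions and knowledge snippets in use. This is a sketch under assumptions: the payload shape, the `build_model_request` name, and the `max_turns` cutoff are hypothetical and do not correspond to any specific provider's API.

```python
def build_model_request(
    instructions: str,
    knowledge_snippets: list[str],
    transcript: list[tuple[str, str]],
    max_turns: int = 6,
) -> dict:
    """Assemble a minimal payload: instructions, knowledge, recent turns only."""
    # Guard against max_turns <= 0, since transcript[-0:] would return everything.
    recent = transcript[-max_turns:] if max_turns > 0 else []
    return {
        "instructions": instructions,
        "knowledge": list(knowledge_snippets),
        "conversation": [
            {"speaker": speaker, "text": text} for speaker, text in recent
        ],
    }


# Example: only the last two turns are included in the outgoing payload.
transcript = [
    ("agent", "Thanks for calling. How can I help?"),
    ("caller", "I need to move my appointment."),
    ("agent", "Sure. Which day works better?"),
    ("caller", "Thursday afternoon."),
]
payload = build_model_request(
    instructions="You are a scheduling assistant.",
    knowledge_snippets=["Office hours: Mon-Fri 9-5."],
    transcript=transcript,
    max_turns=2,
)
```

Caller ID, recordings, and stored structured fields do not appear in this payload at all, which is the point: whatever is not needed to generate the next response stays on the platform side.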

Your rights and your callers' rights

Both you and your callers have rights over the data collected:

  • You, the business: full access to your own call data, with the ability to export, delete, or restrict it at will

  • Callers: rights under applicable privacy laws (CCPA/CPRA, VCDPA, etc.) to access their own data, delete it, correct it, or opt out of certain uses

Your voice AI platform should support both sets of rights: for you, through the admin dashboard; for callers, through a data subject request process.
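A data subject request process can be thought of as a small dispatcher over the four caller rights listed above. The following is a simplified sketch against an in-memory list; the `RequestType` enum, the `handle_caller_request` function, and the record fields are hypothetical, and a real process would also verify the caller's identity and log each request.

```python
from enum import Enum
from typing import Optional


class RequestType(Enum):
    ACCESS = "access"
    DELETE = "delete"
    CORRECT = "correct"
    OPT_OUT = "opt_out"


def handle_caller_request(
    store: list[dict],
    caller_number: str,
    request_type: RequestType,
    corrections: Optional[dict] = None,
) -> list[dict]:
    """Apply one request to the store; return copies of the affected records."""
    matches = [r for r in store if r["caller_number"] == caller_number]
    if request_type is RequestType.DELETE:
        # Remove the caller's records in place; return what was removed.
        store[:] = [r for r in store if r["caller_number"] != caller_number]
    elif request_type is RequestType.CORRECT:
        for r in matches:
            r["structured_fields"].update(corrections or {})
    elif request_type is RequestType.OPT_OUT:
        for r in matches:
            r["opted_out"] = True
    # ACCESS needs no mutation: the copies below serve as the export.
    return [dict(r) for r in matches]


# Example: correct a misspelled name, then delete the caller's data.
store = [
    {"caller_number": "+15555550123", "structured_fields": {"name": "Jon"}},
    {"caller_number": "+15555550199", "structured_fields": {"name": "Ana"}},
]
corrected = handle_caller_request(
    store, "+15555550123", RequestType.CORRECT, {"name": "John"}
)
removed = handle_caller_request(store, "+15555550123", RequestType.DELETE)
```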

Red flags in a vendor's data handling

  • No clear data processing agreement

  • Inability to say where data is stored, for how long, and by whom it can be accessed

  • Using customer call data to train their own AI models without explicit opt-in

  • No zero-retention agreement with underlying model providers

  • Vague "we may share with third parties" language

  • No data subject request process

See it in action

365agents publishes detailed data handling practices in our Trust Center. The Receptionist Agent uses zero-retention arrangements with all underlying AI model providers, and our USDP framework ensures your callers have consistent privacy rights regardless of which state they're in.