Speaker Detection and PII Protection in Interview Processing

When processing interview transcripts in TheyDo, Journey AI automatically detects speakers and protects personally identifiable information (PII) while preserving the context of who said what. This guide explains how to use these features effectively.

What is Speaker Detection?

Speaker detection is an AI feature that automatically identifies different speakers in interview transcripts when you upload them to the Data Hub. This allows you to:

Assign personas to specific speakers
Choose which speakers to include in quote extraction
Maintain speaker context while protecting privacy
Focus insights on customer voices rather than researcher questions

How Speaker Detection Works

Step 1: Upload and Detection

When you upload an interview file (.txt format) or paste interview text:

Journey AI automatically detects the file type as "Interview"
The system identifies all unique speakers in the conversation
Speakers are presented for your review before processing
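
As a rough illustration of what this detection step involves, the sketch below collects unique speaker labels from a transcript that follows the "Speaker Name: dialogue" convention. It is not Journey AI's actual detection logic; the detect_speakers function and the SPEAKER_LINE pattern are hypothetical names used only for this example.

```python
import re

# Assumed line shape: a label of up to 40 characters, a colon, then the utterance.
SPEAKER_LINE = re.compile(r"^\s*([^:\n]{1,40}):\s+(.*\S)\s*$")

def detect_speakers(transcript: str) -> list[str]:
    """Return unique speaker labels in order of first appearance."""
    speakers = []
    for line in transcript.splitlines():
        match = SPEAKER_LINE.match(line)
        if match:
            name = match.group(1).strip()
            if name not in speakers:
                speakers.append(name)
    return speakers

transcript = """Tony: Thanks for joining. How do you usually book a trip?
Samantha: Mostly on my phone, but checkout is confusing.
Tony: What part is confusing?
Samantha: The payment step keeps resetting."""

print(detect_speakers(transcript))  # ['Tony', 'Samantha']
```

A simple pattern like this depends on consistent labels, which is why consistent transcript formatting matters for reliable detection.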

Step 2: Speaker Configuration

After detection, you'll see a speaker configuration screen showing:

All detected speakers (e.g., "Tony", "Samantha")
Include/Exclude toggles for each speaker
Persona assignment dropdown for each speaker

Step 3: Configure Your Speakers

Excluding Speakers

Toggle off speakers you want to exclude from quote extraction. Common use cases:

Exclude researchers/interviewers: Focus only on customer responses
Exclude observers: Remove non-participant comments
Exclude irrelevant speakers: Filter out administrative voices

Example: If "Tony" is the researcher and "Samantha" is the customer, exclude Tony to extract quotes only from Samantha.

Assigning Personas

For included speakers, you can:

Assign existing personas: Select from your workspace personas via dropdown
Leave unassigned: Process without persona attribution
Share a persona: Map multiple speakers to the same persona if they represent the same customer segment

Benefits of persona assignment:

All quotes from that speaker inherit the persona
Filter insights by persona in journey views
Track patterns across customer segments
Maintain context even after PII removal
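
To make the include/exclude and persona choices concrete, here is a minimal sketch of how a speaker configuration could be applied to transcript turns. The SpeakerConfig class, the extract_quotes function, and the "Frequent Traveler" persona are all hypothetical and exist only for illustration; TheyDo's internal data model may look quite different.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SpeakerConfig:
    name: str
    included: bool = True           # mirrors the Include/Exclude toggle
    persona: Optional[str] = None   # None means "leave unassigned"

def extract_quotes(turns, configs):
    """Keep turns from included speakers and attach each speaker's persona."""
    by_name = {c.name: c for c in configs}
    quotes = []
    for speaker, text in turns:
        config = by_name.get(speaker)
        if config is None or not config.included:
            continue  # excluded or unknown speakers contribute no quotes
        quotes.append({"speaker": speaker, "persona": config.persona, "text": text})
    return quotes

turns = [
    ("Tony", "How do you usually book a trip?"),
    ("Samantha", "Mostly on my phone, but checkout is confusing."),
]
configs = [
    SpeakerConfig("Tony", included=False),                   # researcher
    SpeakerConfig("Samantha", persona="Frequent Traveler"),  # customer
]

quotes = extract_quotes(turns, configs)
print(quotes)
```

With Tony excluded, only Samantha's turn becomes a quote, and it carries her assigned persona.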

PII Protection Process

How PII Obfuscation Works

TheyDo implements a two-stage approach to protect personal information:

During Upload (Configuration Stage):

You can see actual speaker names (e.g., "Samantha")
This visibility helps you correctly assign personas
Only the person uploading can see this information

After Processing (In Quotes and Insights):

Real names are replaced with generic identifiers (e.g., "Person 2")
Email addresses, phone numbers, and other PII are obfuscated
Persona assignments remain intact
Original speaker context is preserved without exposing identity

What Gets Obfuscated

Personal names
Email addresses
Phone numbers
Physical addresses
Social security numbers
Credit card information
Other identifiable personal data

What Gets Preserved

Assigned personas
Quote attribution to speakers (using anonymous identifiers)
The relationship between quotes from the same speaker
Sentiment and experience impact
All non-PII content
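
The sketch below illustrates the general idea of obfuscation: speaker names become stable anonymous identifiers, emails and phone numbers are masked, and persona assignments pass through untouched. It is a simplified assumption-based example, not TheyDo's implementation. The obfuscate function, the regular expressions, and the "[email]" and "[phone]" placeholders are hypothetical, and real PII detection covers far more cases than two patterns.

```python
import re

# Deliberately simple, assumed patterns for illustration only.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def obfuscate(quotes):
    """Replace speaker names with 'Person N' and mask emails and phone numbers."""
    aliases = {}
    for q in quotes:
        # Each speaker gets one alias, numbered by order of first appearance.
        aliases.setdefault(q["speaker"], f"Person {len(aliases) + 1}")
    result = []
    for q in quotes:
        text = EMAIL.sub("[email]", q["text"])
        text = PHONE.sub("[phone]", text)
        for name, alias in aliases.items():
            text = re.sub(rf"\b{re.escape(name)}\b", alias, text)
        result.append({"speaker": aliases[q["speaker"]],
                       "persona": q["persona"],
                       "text": text})
    return result

quotes = [
    {"speaker": "Tony", "persona": None,
     "text": "Samantha, what is the best way to reach you?"},
    {"speaker": "Samantha", "persona": "Frequent Traveler",
     "text": "Email sam@example.com or call 555-123-4567."},
]

masked = obfuscate(quotes)
for q in masked:
    print(q["speaker"], "|", q["persona"], "|", q["text"])
```

Because the same speaker always maps to the same identifier, quotes from one person stay linked to each other even though the name itself is gone.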

Best Practices

For Interview Preparation

Consistent speaker labels: Use the same clear label for each speaker throughout the transcript
Format consistency: Maintain consistent formatting (e.g., "Speaker Name: dialogue")
Clean transcripts: Remove timestamps or metadata that might interfere with detection
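
If your transcription tool adds timestamps, a small pre-processing script can strip them before upload. The following is a minimal sketch that assumes timestamps appear at the start of a line in shapes such as [00:00:05], (00:12), or 00:00:05; the clean_transcript function and the LEADING_TIMESTAMP pattern are hypothetical and should be adapted to your tool's actual output.

```python
import re

# Assumed timestamp shapes at the start of a line: [hh:mm:ss], (mm:ss), or bare hh:mm:ss.
LEADING_TIMESTAMP = re.compile(r"^\s*[\[(]?\d{1,2}:\d{2}(?::\d{2})?[\])]?\s*")

def clean_transcript(raw: str) -> str:
    """Strip leading timestamps and drop empty lines."""
    lines = []
    for line in raw.splitlines():
        line = LEADING_TIMESTAMP.sub("", line).strip()
        if line:
            lines.append(line)
    return "\n".join(lines)

raw = """[00:00:05] Tony: Thanks for joining.

(00:12) Samantha: Happy to help."""

cleaned = clean_transcript(raw)
print(cleaned)
```

The result keeps each line in the "Speaker Name: dialogue" form, with nothing before the speaker label.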

For Speaker Configuration

Exclude researchers by default: Focus on customer voices unless researcher insights are specifically needed
Use persona assignment: Connect speakers to existing personas for better filtering
Review before processing: Double-check speaker inclusion/exclusion before continuing

For Privacy Compliance

Automatic obfuscation: PII is obfuscated automatically when the uploaded interview is processed
No manual PII removal needed: The system handles this for you
Audit trail maintained: The system tracks who uploaded each file while protecting participants' identities

Working with Processed Interviews

After processing, your interview quotes will:

Show anonymous speaker identifiers (Person 1, Person 2, etc.)
Maintain all assigned personas
Be ready for insight mining with full context
Protect participant privacy while preserving research value

Filtering and Analysis

You can still:

Filter quotes by assigned persona
See which quotes came from the same speaker
Track sentiment patterns by persona
Build journeys based on specific customer segments
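
Conceptually, these operations are simple filters and groupings over processed quotes. The sketch below shows the idea on hand-written sample data; the quote dictionaries, the persona names, and the filter_by_persona and group_by_speaker helpers are hypothetical and do not reflect a TheyDo API or export format.

```python
from collections import defaultdict

# Sample processed quotes: names are already replaced by anonymous identifiers.
quotes = [
    {"speaker": "Person 2", "persona": "Frequent Traveler",
     "text": "Checkout is confusing."},
    {"speaker": "Person 2", "persona": "Frequent Traveler",
     "text": "The payment step keeps resetting."},
    {"speaker": "Person 3", "persona": "Occasional Traveler",
     "text": "I only book once a year."},
]

def filter_by_persona(quotes, persona):
    """Return only the quotes attributed to the given persona."""
    return [q for q in quotes if q["persona"] == persona]

def group_by_speaker(quotes):
    """Group quote texts under each anonymous speaker identifier."""
    grouped = defaultdict(list)
    for q in quotes:
        grouped[q["speaker"]].append(q["text"])
    return dict(grouped)

frequent = filter_by_persona(quotes, "Frequent Traveler")
by_speaker = group_by_speaker(quotes)
print(len(frequent), sorted(by_speaker))
```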

Tips for Success

Create personas first: Set up your personas before uploading interviews for smoother assignment
Batch similar interviews: Process interviews from the same customer segment together
Document your mapping: Keep a record of which personas you've assigned to which types of speakers
Use descriptive file names: Help yourself remember which interviews contain which customer types

Integration with Journey AI Features

Speaker detection and PII protection work seamlessly with other Journey AI features:

Insight Mining: Filtered quotes maintain speaker context
Journey Enrichment: Persona-tagged quotes enrich relevant journey steps
Experience Scoring: Sentiment analysis respects speaker filtering
Cross-source Analysis: Persona assignments enable pattern recognition across multiple interviews

By leveraging speaker detection and PII protection, you can maintain research integrity, protect participant privacy, and extract more targeted insights from your interview data.

Common Questions

Q: Can I change speaker assignments after processing?
No. Speaker configuration happens during upload, so to change assignments you'll need to re-upload the file.

Q: What if speaker detection misses a speaker?
The system is designed to catch all unique speaker identifiers. If a speaker is missed, check that your transcript uses consistent speaker labels, then re-upload the file.

Q: Can I see the original names after PII obfuscation?
No. Once an interview is processed, the original names are permanently replaced to ensure privacy protection.

Q: How does this work with multiple interviews?
Each interview file is processed independently. Assign the same persona across interviews to maintain consistency.
