General Audio Processing Flow

Initialization: The AudioReportService is initialized with parameters such as the audio file path, the user's email, the organization name, the number of speakers, and a flag indicating whether that number is known in advance.
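
As a rough sketch, the initialization parameters could be grouped into a single request object. The class name `AudioReportRequest`, its field names, and the validation rule are assumptions made for illustration; the flow above only names the parameters themselves.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AudioReportRequest:
    """Hypothetical container for the parameters AudioReportService starts with."""

    audio_path: str
    user_email: str
    organization_name: str
    speakers_known: bool = False
    num_speakers: Optional[int] = None  # only meaningful when speakers_known is True

    def __post_init__(self) -> None:
        # Assumed rule: a "known" speaker count must be a positive integer.
        if self.speakers_known and (self.num_speakers is None or self.num_speakers < 1):
            raise ValueError("num_speakers must be >= 1 when speakers_known is True")


request = AudioReportRequest(
    audio_path="meeting.mp3",
    user_email="user@example.com",
    organization_name="Example Org",
    speakers_known=True,
    num_speakers=3,
)
```

Validating the speaker count at construction time keeps an inconsistent request from reaching the diarization step.
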
Audio Preprocessing:
- Convert to WAV Mono: Convert the audio file to WAV format with a single audio channel.
- Measure Noise: Measure the background noise level in the audio file.
- Noise Reduction: Reduce background noise in the audio file.
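
The first two preprocessing steps can be sketched in pure Python on raw 16-bit PCM samples. This is a minimal illustration, not the service's implementation: the function names, the frame length, and the choice to estimate background noise from the quietest frames are all assumptions. Noise reduction itself is left out because it depends on the algorithm chosen.

```python
import math


def to_mono(samples, channels):
    """Downmix interleaved PCM samples by averaging the channels of each frame."""
    if channels < 1 or len(samples) % channels:
        raise ValueError("sample count must be a multiple of the channel count")
    return [sum(samples[i:i + channels]) / channels
            for i in range(0, len(samples), channels)]


def rms_dbfs(samples, full_scale=32768.0):
    """RMS level in dB relative to full scale (16-bit PCM assumed)."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms / full_scale) if rms > 0 else float("-inf")


def noise_floor_dbfs(samples, frame_len=400, quantile=0.1):
    """Estimate background noise as the level of the quietest frames (assumed heuristic)."""
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    if not frames:
        return rms_dbfs(samples)
    levels = sorted(rms_dbfs(frame) for frame in frames)
    return levels[int(quantile * (len(levels) - 1))]


stereo = [100, 300, -200, 200]      # two interleaved stereo frames
mono = to_mono(stereo, channels=2)  # [200.0, 0.0]
```
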
Speaker Diarization and Processing:
- Diarize Speakers: Identify and segment different speakers in the audio file.
- Fix Speaker Names: Standardize speaker names in the diarization output.
- Merge Short Files: Merge short audio segments if necessary.
- Split Audio: Split the audio file based on timestamps from the diarization output.
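
The post-diarization steps above can be sketched as follows, assuming the diarization output is a list of (speaker, start, end) segments in seconds. The `Segment` type, the "Speaker N" naming scheme, and the 0.5-second merge gap are hypothetical choices, and merging is shown here on timestamps rather than on audio files.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    speaker: str
    start: float  # seconds
    end: float    # seconds


def standardize_names(segments):
    """Rename raw labels to 'Speaker N' in order of first appearance (assumed scheme)."""
    mapping = {}
    renamed = []
    for seg in segments:
        if seg.speaker not in mapping:
            mapping[seg.speaker] = f"Speaker {len(mapping) + 1}"
        renamed.append(Segment(mapping[seg.speaker], seg.start, seg.end))
    return renamed


def merge_adjacent(segments, max_gap=0.5):
    """Merge consecutive segments of the same speaker separated by a short gap."""
    merged = []
    for seg in sorted(segments, key=lambda s: s.start):
        last = merged[-1] if merged else None
        if last and last.speaker == seg.speaker and seg.start - last.end <= max_gap:
            last.end = max(last.end, seg.end)
        else:
            merged.append(Segment(seg.speaker, seg.start, seg.end))
    return merged


def split_samples(samples, sample_rate, segments):
    """Slice a mono sample list into one chunk per segment using its timestamps."""
    return [samples[int(s.start * sample_rate):int(s.end * sample_rate)]
            for s in segments]


raw = [Segment("SPEAKER_01", 0.0, 1.0),
       Segment("SPEAKER_01", 1.2, 2.0),
       Segment("SPEAKER_00", 2.5, 3.0)]
segments = merge_adjacent(standardize_names(raw))
chunks = split_samples(list(range(40)), sample_rate=10, segments=segments)
```
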
Quality Adjustments:
- Adjust Decibels: Normalize audio volume levels across segments.
- Speaker Matching: Match speech segments that belong to the same speaker within a single meeting.
- Speaker Tracking: Track speakers across multiple meetings.
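
For the decibel adjustment, one common approach is to scale each segment so that its RMS level reaches a shared target. The sketch below assumes float samples in the range [-1.0, 1.0] and a target of -20 dBFS; both the target value and the function names are illustrative assumptions, and the service may normalize differently (for example by peak or perceived loudness).

```python
import math


def rms_dbfs(samples):
    """RMS level in dBFS for float samples in [-1.0, 1.0]."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms) if rms > 0 else float("-inf")


def normalize_to_dbfs(samples, target_dbfs=-20.0):
    """Apply one gain factor so the segment's RMS level matches target_dbfs."""
    current = rms_dbfs(samples)
    if current == float("-inf"):
        return list(samples)  # silence: nothing to scale
    gain = 10 ** ((target_dbfs - current) / 20)
    # Clip to the valid range in case the gain pushes peaks past full scale.
    return [max(-1.0, min(1.0, s * gain)) for s in samples]


quiet = [0.01, -0.01] * 50  # about -40 dBFS
loud = [0.5, -0.5] * 50     # about -6 dBFS
levelled = [normalize_to_dbfs(seg) for seg in (quiet, loud)]
```

After this step both segments sit at roughly the same level, so later analyses are not biased by how close each speaker was to the microphone.
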
Speech Analysis:
- Speech-to-Text: Convert speech segments to text.
- Emotion Detection: Detect emotions in the audio segments.
- Music Detection: Identify segments containing music.
- Microphone Quality: Assess the quality of the microphone used.
- Reverb Detection: Detect reverb in the audio segments.
- Lag Detection: Identify any lag in the speaker's audio.
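
Because each analysis in this stage runs on the same audio segments, one way to organize it is a registry of independent analyzer functions applied to every segment. The sketch below shows only that structure: the registry layout is an assumption, and the two analyzers are toy placeholders standing in for the real speech-to-text, emotion, music, microphone, reverb, and lag models.

```python
from typing import Callable, Dict, List

Analyzer = Callable[[List[float]], object]


def analyze_segments(segments: List[List[float]],
                     analyzers: Dict[str, Analyzer]) -> List[Dict[str, object]]:
    """Run every registered analyzer on every segment and collect results by name."""
    return [{name: fn(samples) for name, fn in analyzers.items()}
            for samples in segments]


# Placeholder analyzers: simple heuristics, NOT the models the service uses.
def peak_level(samples: List[float]) -> float:
    return max((abs(s) for s in samples), default=0.0)


def is_silent(samples: List[float]) -> bool:
    return peak_level(samples) < 0.01


analyzers = {"peak_level": peak_level, "is_silent": is_silent}
results = analyze_segments([[0.2, -0.4, 0.1], [0.0, 0.001]], analyzers)
```

Keeping the analyzers independent means a new check can be added by registering one more function, without touching the others.
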
Data Aggregation and Database Insertion:
- Speaker Statistics: Calculate statistics for each speaker.
- Insert Data: Insert processed data into the database.
- Auto-Rename Speakers: Automatically rename speakers based on detected names.
- Merge Fragments: Merge audio fragments into cohesive statements.
- Punctuate Statements: Apply punctuation and capitalization to statements.
- Split Statements: Split statements into individual sentences.
- Text Emotions: Detect emotions in the text of the statements.
- Generate Final Emotions: Map text emotions to speech emotions.
- Toxicity Detection: Detect toxic language in the transcriptions.
- Offensive Language Detection: Identify offensive language in the transcriptions.
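
The fragment-merging and statement-splitting steps can be illustrated with a small sketch. It assumes fragments arrive as (speaker, text) pairs in time order and that the text has already been punctuated; the function names are hypothetical. The regular expression is a naive splitter that would also break on abbreviations such as "Dr.", so a production system would likely use a dedicated sentence tokenizer.

```python
import re


def merge_fragments(fragments):
    """Join consecutive fragments from the same speaker into one statement."""
    statements = []
    for speaker, text in fragments:
        text = text.strip()
        if statements and statements[-1][0] == speaker:
            statements[-1] = (speaker, statements[-1][1] + " " + text)
        else:
            statements.append((speaker, text))
    return statements


# Split on whitespace that follows sentence-ending punctuation.
_SENTENCE_BREAK = re.compile(r"(?<=[.!?])\s+")


def split_sentences(statement):
    """Split a punctuated statement into individual sentences (naive rule)."""
    return [s for s in _SENTENCE_BREAK.split(statement.strip()) if s]


fragments = [("Speaker 1", "Hello there."),
             ("Speaker 1", "How are you?"),
             ("Speaker 2", "Fine.")]
statements = merge_fragments(fragments)
sentences = split_sentences(statements[0][1])
```
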
Report Generation and Delivery:
- Generate Report Data: Compile data for the final report.
- Convert to JSON: Format the report data as JSON.
- Send Report: Send the report via email or API, depending on the request source.
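
The last two steps amount to serializing the report and choosing a delivery channel from the request source. In the sketch below, the source labels "email" and "api", the report fields, and the sender callables are assumptions; the senders are stubs that record the payload rather than performing real delivery.

```python
import json
from typing import Callable, Dict


def report_to_json(report_data: dict) -> str:
    """Serialize compiled report data to a JSON string."""
    return json.dumps(report_data, ensure_ascii=False, sort_keys=True)


def send_report(payload: str, source: str,
                senders: Dict[str, Callable[[str], None]]) -> None:
    """Dispatch the payload to the sender registered for the request source."""
    if source not in senders:
        raise ValueError(f"unsupported request source: {source!r}")
    senders[source](payload)


# Stub senders that just collect payloads; real ones would email or call an API.
sent = {"email": [], "api": []}
senders = {"email": sent["email"].append, "api": sent["api"].append}

payload = report_to_json({
    "organization": "Example Org",
    "speakers": [{"name": "Speaker 1", "talk_time_s": 42.5}],
})
send_report(payload, "api", senders)
```
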

