Project Structure

This page provides an overview of the structure of the Meetra AI codebase.

/data

/raw - Original, immutable data dump.
/processed - Final, canonical data sets for modeling.
/external - Data from third-party sources.
/interim - Intermediate transformed data.
/profanity_data - Data for profanity detection model.
/test - Data for unit tests.

/db

db_init.sql - Script for initializing the database.
db_stream_init.sql - Script for initializing the streaming database.

/RabbitMQ

advanced_config - Configuration file for RabbitMQ service.
rabbitmq.dockerfile - Dockerfile for RabbitMQ container.

/gpu_utils

daemon.json - Configuration file to force GPU usage (not used for CPU-only devices).

/mysql

Deprecated MySQL database.

/tqai

tqai.dockerfile - Dockerfile for app container.
setup_docker.sh - Bash script for installing Docker & MySQL dependencies.

/docs

Directory for storing documentation configuration.
- /_templates
  - layout.html - Custom template for docs.
- conf.py - Documentation configuration file.

/models

Audio Quality Classification (binary saved models):

/audio_lag_detector - Lagging audio classification.
/audio_reverb_detector - Reverb audio classification.
/music_detector - Music vs. speech classification.

Speech Emotion Classification (binary saved models):

/speech_neg_emotion - Negative emotions.
/speech_neutral - Neutral vs. non-neutral (sadness, anger, fear, disgust).
/speech_pos_emotion - Positive emotions (calm, happiness, surprised).
/speech_posneg_emotion - Positive vs. negative.

/ntlk_data

Locally saved data for NLTK model.

/notebooks

Directory for storing research notebooks.

/reports

Directory for storing meeting reports.

/src

config.py - Script for loading settings from config.yaml.

/app

Flask app structure:
- /static - Static files for app.
- /templates - Templates for app.
- app.py - Flask app (API, report generation, etc.).
- default_settings.pkl - Default settings for endpoint /meeting/<meeting_id>/ (custom report generation).

/data

Data Processing Scripts:
- audio_splitter.py - Splits audio files into shorter fragments.
- audiodataframe.py - Class for processing audio files.
- basedataframe.py - Abstract base class for audio and text dataframes.
- textdataframe.py - Class for processing text data.
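
To illustrate what splitting audio into shorter fragments involves, here is a minimal sketch that uses only Python's standard `wave` module. The function name `split_wav`, its parameters, and the output file naming are assumptions made for this example; they are not taken from audio_splitter.py, which may work differently.

```python
import wave
from pathlib import Path
from typing import List


def split_wav(src: Path, out_dir: Path, fragment_seconds: float) -> List[Path]:
    """Split a WAV file into consecutive fragments of at most fragment_seconds.

    Hypothetical helper for illustration; not the project's actual API.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    fragments: List[Path] = []
    with wave.open(str(src), "rb") as reader:
        params = reader.getparams()
        frames_per_fragment = int(params.framerate * fragment_seconds)
        if frames_per_fragment <= 0:
            raise ValueError("fragment_seconds is too short for this sample rate")
        index = 0
        while True:
            # readframes returns an empty bytes object at end of file.
            frames = reader.readframes(frames_per_fragment)
            if not frames:
                break
            target = out_dir / f"{src.stem}_{index:04d}.wav"
            with wave.open(str(target), "wb") as writer:
                # Copy the audio format of the source file to each fragment.
                writer.setnchannels(params.nchannels)
                writer.setsampwidth(params.sampwidth)
                writer.setframerate(params.framerate)
                writer.writeframes(frames)
            fragments.append(target)
            index += 1
    return fragments
```

With this sketch, a 2.5-second recording split with `fragment_seconds=1.0` yields three files, the last one holding the remaining half second.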

/database

DB Models & Scripts:
- create_database.py - Script for creating database.
- /models/__init__.py - DB models and methods.

/features

fourier_t.py - Generates Fourier transform of audio file.
rolling_emotions.py - Calculates time-weighted moving average scores for emotions.
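
One way to read "time-weighted moving average" is sketched below: each emotion score is attached to a time segment, and the average over a trailing window weights every score by how long its segment overlaps that window. The segment layout, the function name `time_weighted_rolling_mean`, and the window definition are assumptions for this example; rolling_emotions.py may use a different weighting scheme.

```python
from typing import List, Tuple

# (start_seconds, end_seconds, score); a hypothetical layout for this sketch.
Segment = Tuple[float, float, float]


def time_weighted_rolling_mean(segments: List[Segment], window_s: float) -> List[float]:
    """Return one averaged score per segment, evaluated at that segment's end.

    Each average covers the trailing window of window_s seconds and weights
    every score by the duration of its overlap with the window.
    """
    results: List[float] = []
    for _, window_end, _ in segments:
        window_start = window_end - window_s
        weighted_sum = 0.0
        total_weight = 0.0
        for start, end, score in segments:
            overlap = min(end, window_end) - max(start, window_start)
            if overlap > 0:
                weighted_sum += overlap * score
                total_weight += overlap
        results.append(weighted_sum / total_weight if total_weight else 0.0)
    return results
```

A long segment therefore influences the average more than a short one with the same score, which is the point of weighting by time rather than by sample count.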

/models

Audio-Based Models:

basemodel.py - Base class for some audio models.
audio_lag_detector.py - Detects audio lag.
audio_reverb_detector.py - Detects audio reverb.
speech_emotion_recognition_model.py - Classifies emotions in audio (hybrid architecture).
music_detector.py - Detects music vs. speech.
diarizationmodel.py - Speaker diarization.
speaker_detection_model.py - Compares speaker recordings.

Text-Based Models:

speech_to_text.py - Speech-to-text processing.
key_point_finder.py - Finds key parts of speech.
rpunct.py - Local source code for rpunct.
text_emotion_recognition_model.py - Classifies emotions from text (hybrid architecture).
text_summarizer.py - Generates text summary.
toxicity_detection_model.py - Detects toxicity in text fragments.
profanitydetectionmodel.py - Detects profanity in text fragments.
name_recognition_model.py - Detects introductions (e.g., "Hello I’m George").
offensive_lang_detection_model.py - Detects offensive language in text.
restore_punctuation_model.py - Corrects punctuation in speech-to-text output.
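
As a toy illustration of what detecting an introduction such as "Hello I’m George" means, the sketch below uses a regular expression. This is an assumption-laden stand-in, not the approach of name_recognition_model.py, which is not described here; the pattern, the phrase list, and the function name `find_introduced_names` are all hypothetical.

```python
import re
from typing import List

# Introductory phrase (case-insensitive) followed by a capitalized word.
INTRO_PATTERN = re.compile(r"\b(?i:i['’]m|i am|my name is)\s+([A-Z][a-z]+)")


def find_introduced_names(text: str) -> List[str]:
    """Return capitalized words that directly follow an introductory phrase."""
    return INTRO_PATTERN.findall(text)
```

A rule like this misses many phrasings and flags any capitalized word after "I am", so it only conveys the idea of the task.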

Environmental Recommendations:

/sensorsmodels.py - Environmental recommendations.

/services

Streaming & Report Services:

audio_report_service.py - Service for audio meeting report generation.
services_bridge.py - Connects streaming service and audio report service.
meetings_api.py - Class for API endpoints.

/streaming_service

Service for streaming audio from Timeqube Hardware Prototype to server.

/tests

Unit tests directory.

/tools

GUI tool for labeling audio files.

/utils

Auxiliary Scripts:
- audio/ - Audio processing utilities (e.g., audio.py, audio_data.py).
- check_upsampling.py - Checks audio upsampling from low frequency.
- format_converter.py - Converts various audio/video formats.
- docsutils/ - Utilities for documentation (e.g., dictutils.py).
- logging_utils.py - Error logger for services.
- nlp_utils.py - Extracts topics from text.
- reportutils.py - Sends report/message via email.
- rttm_utils.py - Merges short RTTM files.
- speaker_statistics_model.py - Calculates speaker statistics.
- textutils.py - Processes text (e.g., splits text into statements).
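
For readers unfamiliar with RTTM, the sketch below shows one plausible way to merge per-fragment RTTM output into a single timeline: shift each turn's onset by the start offset of its fragment and sort the result. The function name `merge_rttm` and the offset-based input are assumptions for this example and are not taken from rttm_utils.py. Speaker labels are copied unchanged, so the sketch also assumes labels are already consistent across fragments.

```python
from typing import Iterable, List, Tuple


def merge_rttm(fragments: Iterable[Tuple[float, str]], file_id: str) -> str:
    """Merge RTTM texts of consecutive fragments into one RTTM text.

    fragments: pairs of (fragment start offset in seconds, RTTM text).
    Hypothetical helper for illustration; not the project's actual API.
    """
    turns: List[Tuple[float, str]] = []
    for offset, text in fragments:
        for line in text.splitlines():
            fields = line.split()
            # A SPEAKER record has 10 space-separated fields;
            # field 3 is the turn onset and field 4 the turn duration.
            if len(fields) < 10 or fields[0] != "SPEAKER":
                continue
            onset = float(fields[3]) + offset
            fields[1] = file_id
            fields[3] = f"{onset:.3f}"
            turns.append((onset, " ".join(fields)))
    turns.sort(key=lambda turn: turn[0])
    return "\n".join(line for _, line in turns)
```

For example, a turn starting at 0.250 s in a fragment that begins 30 s into the meeting ends up with an onset of 30.250 in the merged output.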

/visualization

Report generation and plot creation:
- /images - JPG files to include in reports.
- /templates - HTML templates for rendering reports.
- audio_feature_plot.py - Plots audio features (e.g., from AudioDataframe) for visual inspection.
- generate_business_report.py - Generates HTML report for meetings.
- meeting_visualizer.py - Class for generating various plots based on meeting data.

Other Files in Root

Database Management

add_user_to_db.py - Script for adding a new user to the database.
service_cleanup.py - Cleans the database and service files (e.g., reports, WAV fragment files) and creates default accounts.

Docker & Project Setup

docker-compose.yml - Docker Compose settings.
requirements.txt - External requirements for the project.
config.yaml - Project configuration parameters.

Pre-commit Hooks

install_precommit_hooks.bash - Script for installing pre-commit hooks requirements.
precommitutils.py - Configuration for hooks.
pyproject.toml - Configuration for the interrogate package.

Documentation

interrogate_badge.svg - Infographic of docstring coverage percentage.
show_docs.py - Creates and displays project documentation.

