Abstract:
In an increasingly interconnected world, effective communication across language
barriers is essential. Real-time transcription and translation systems have emerged as
solutions to facilitate seamless communication in multilingual settings. This thesis presents a
system designed to transcribe English speech, translate it into French, and address
challenges in capturing clear audio and handling noisy environments. The system
incorporates a machine-learning-based automatic speech recognition (ASR) model capable of
transcribing vocabulary typically used in meetings. It focuses on achieving real-time
performance with reasonable latency, even on low-performance hardware. The resulting
system captures clear audio in noisy environments and transcribes vocabulary commonly
used in meetings. Although it fails to recognize some words accurately and produces
occasional transcription errors, it delivers real-time performance with low latency on
modest hardware, and its translation component is robust and effective at capturing
semantics.