Please use this address to cite this document: https://di.univ-blida.dz/jspui/handle/123456789/41130
Full record
Dublin Core element | Value | Language
dc.contributor.author | Benlaoubi, Chaima Nour el Houda | -
dc.contributor.author | Khettal, Mounia | -
dc.contributor.author | Ykhlef, Hadjer (Supervisor) | -
dc.date.accessioned | 2025-12-10T13:25:42Z | -
dc.date.available | 2025-12-10T13:25:42Z | -
dc.date.issued | 2025 | -
dc.identifier.uri | https://di.univ-blida.dz/jspui/handle/123456789/41130 | -
dc.description | ill., bibliogr.; call number: MA-004-1055 | fr_FR
dc.description.abstract | Language-queried audio source separation (LASS) enables on-demand extraction of sound sources using natural-language queries, overcoming limitations of traditional audio source separation systems. In this work, we propose a LASS architecture that integrates two major innovations: (1) a cross-attention-driven ResUNet++ with multi-scale receptive fields (via Atrous Spatial Pyramid Pooling), channel-wise attention (Squeeze-and-Excitation blocks), and residual connections, which fuses FLAN-T5 text embeddings with audio features; and (2) cosine similarity filtering, which suppresses overly similar mixture-target pairs that might hinder training. We trained our model on mixtures derived from the Clotho dataset and evaluated it on the corresponding test set using state-of-the-art metrics. Our system achieves good separation quality, with an SDR of 2.41 and an SDRi of 8.37. Compared to current state-of-the-art models, this work presents a lightweight, efficient framework for language-queried audio source separation. Keywords: Language-queried audio source separation, Cross-Modal Attention, ResUNet++, Cosine similarity filtering, Phase-aware reconstruction, computational efficiency. | fr_FR
dc.language.iso | en | fr_FR
dc.publisher | Université Blida 1 | fr_FR
dc.subject | Language-queried audio source separation | fr_FR
dc.subject | Cross-Modal Attention | fr_FR
dc.subject | ResUNet++ | fr_FR
dc.subject | computational efficiency | fr_FR
dc.subject | Cosine similarity filtering | fr_FR
dc.subject | Phase-aware reconstruction | fr_FR
dc.title | Language-queried audio source separation | fr_FR
dc.type | Thesis | fr_FR
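
The abstract names cosine similarity filtering and reports SDR and SDRi without giving formulas. The sketch below is a minimal illustration under stated assumptions, not the thesis implementation. It assumes that similarity is computed on raw waveform samples, that a pair is dropped when its similarity reaches a hypothetical threshold of 0.95, and that SDRi is the SDR of the separated estimate minus the SDR of the unprocessed mixture, both measured against the target. All function names are illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length sequences of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for all-zero (silent) signals
    return dot / (norm_a * norm_b)


def keep_pair(mixture, target, threshold=0.95):
    """Return True if the (mixture, target) pair should stay in the training set.

    A mixture that is almost identical to its target carries little
    separation signal, so pairs at or above the threshold are dropped.
    The default threshold is an assumption, not a value from the thesis.
    """
    return cosine_similarity(mixture, target) < threshold


def sdr(estimate, target, eps=1e-10):
    """Signal-to-distortion ratio in dB: 10 * log10(|target|^2 / |target - estimate|^2)."""
    signal = sum(t * t for t in target)
    distortion = sum((t - e) ** 2 for t, e in zip(target, estimate))
    return 10.0 * math.log10((signal + eps) / (distortion + eps))


def sdr_improvement(estimate, mixture, target):
    """SDRi: how much the estimate improves on simply returning the mixture."""
    return sdr(estimate, target) - sdr(mixture, target)


# Toy signals (made up for illustration only).
target = [1.0, -1.0, 1.0, -1.0]
interference = [0.5, 0.5, -0.5, -0.5]
mixture = [t + n for t, n in zip(target, interference)]
near_duplicate = [1.01, -1.0, 1.0, -1.0]
estimate = [1.1, -0.9, 1.0, -1.0]

print(keep_pair(mixture, target))         # pair with real interference
print(keep_pair(near_duplicate, target))  # mixture nearly equal to target
print(round(sdr_improvement(estimate, mixture, target), 2))
```

Under this definition a positive SDRi means the separated output is closer to the target than the input mixture was. In practice the similarity could equally be computed on spectrograms or learned embeddings; the record does not say which representation the authors used.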
Collection(s): Mémoires de Master (Master's theses)

File(s) in this item:
File | Description | Size | Format
BENLAOUBI Chaima Nour el Houda & KHETTAL Mounia.pdf | | 3.93 MB | Adobe PDF


All documents in DSpace are protected by copyright, with all rights reserved.