DEEP CO-TRAINING FRAMEWORK FOR SEMI-SUPERVISED AUDIO TAGGING

Cheifa, Ikram; Yakhlef, Hadjer (Supervisor); Diffallah, Zhor (Supervisor)

Please use this identifier to cite or link to this item: http://localhost:8080/xmlui/handle/123456789/20463

Title:	DEEP CO-TRAINING FRAMEWORK FOR SEMI-SUPERVISED AUDIO TAGGING
Authors:	Cheifa, Ikram; Yakhlef, Hadjer (Supervisor); Diffallah, Zhor (Supervisor)
Keywords:	Audio Tagging; Semi-Supervised Learning; Deep Co-training; Feature Extraction; Statistical Tests
Issue Date:	2022
Publisher:	Université Blida 1
Abstract:	Audio tagging, also known as sound event recognition, is concerned with developing systems that can recognize sound events. A sound event is perceived as a distinct entity that we can name and recognize, such as a helicopter, glass breaking, a baby crying, or speech. Audio tagging has attracted considerable attention for applications such as information retrieval, music tagging, and acoustic monitoring. The general framework for audio tagging usually involves two major steps: feature extraction and classification. Obtaining well-annotated, strongly labeled data is an expensive and time-consuming process. Therefore, much recent work has been devoted to making effective use of weakly labeled data extracted from websites such as YouTube, Freesound, or Flickr. Various semi-supervised learning (SSL) approaches have been proposed in the literature, including Mean Teacher, Pseudo-Labeling, MixMatch, and, most recently, Deep Co-training (DCT). The purpose of this project is to devise an audio tagging system within the semi-supervised learning paradigm, specifically the Deep Co-training framework. Such systems use both labeled and unlabeled audio data. Our system is trained on two different datasets, Urban8k and Environmental Sound Classification, and is based on a deep residual neural network (ResNet) and a wide residual neural network (WideResNet). We support our analysis and discussion with numerous statistical tests used to compare our results. We investigated the impact of varying the supervised ratio on the system’s performance and tested several variants of DCT systems based on different adversarial attacks. The results demonstrate the efficacy of the Deep Co-training SSL strategy, which significantly boosts overall performance.
Description:	Ill., bibliogr. Call number: ma-004-879
URI:	https://di.univ-blida.dz/jspui/handle/123456789/20463
Appears in Collections:	Mémoires de Master

Files in This Item:

File	Description	Size	Format
Cheifa Ikram.pdf		5.58 MB	Adobe PDF
