Calibration approaches for language detection

To date, automatic spoken language detection research has largely been based on a closed-set paradigm, in which the languages to be detected are known prior to system application. In actual practice, such systems may face previously unseen languages (out-of-set (OOS) languages) which should be rejected, a common problem that has received limited attention from the research community.

Bibliographic Details
Main authors:	McLaren, M.; Ferrer, L.; Castan, D.; Lawson, A.; Lacerda, F.; Strombergsson, S.; Wlodarczak, M.; Heldner, M.; Gustafson, J.; House, D.
Format:	CONF
Subjects:	Bins; Calibration; Speech communication; Speech recognition; Language detection; Limited attentions; Objective functions; Research communities; Spoken languages; System applications; System constraints; Training data; Modeling languages
Online access:	http://hdl.handle.net/20.500.12110/paper_2308457X_v2017-August_n_p2804_McLaren
Contributed by:	Biblioteca Digital - Facultad de Ciencias Exactas y Naturales (UBA), Universidad de Buenos Aires

Access rights:	info:eu-repo/semantics/openAccess
License:	http://creativecommons.org/licenses/by/2.5/ar
Description:	To date, automatic spoken language detection research has largely been based on a closed-set paradigm, in which the languages to be detected are known prior to system application. In actual practice, such systems may face previously unseen languages (out-of-set (OOS) languages) which should be rejected, a common problem that has received limited attention from the research community. In this paper, we focus on situations in which either (1) the system-modeled languages are not observed during use or (2) the test data contains OOS languages that are unseen during modeling or calibration. In these situations, the common multi-class objective function for calibration of language-detection scores is problematic. We describe how the assumptions of multi-class calibration are not always fulfilled in a practical sense and explore applying global and language-dependent binary objective functions to relax system constraints. We contrast the benefits and sensitivities of the calibration approaches on practical scenarios by presenting results using both LRE09 data and 14 languages from the BABEL dataset. We show that the global binary approach is less sensitive to the characteristics of the training data and that OOS modeling with individual detectors is the best option when OOS test languages are not known to the system. Copyright © 2017 ISCA.
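
As a rough illustration of the contrast drawn in the description above, the following Python sketch compares a multi-class cross-entropy objective with a sum of binary cross-entropy objectives on affine-calibrated scores. It is not taken from the paper: the function names, the affine form a*s + b, the made-up scores, and the convention that label=None marks an out-of-set trial are all assumptions made for this example.

```python
import math

def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def multiclass_loss(scores, label, a=1.0, b=0.0):
    """Cross-entropy of a softmax over calibrated scores for one trial.

    Assumption of this sketch: the label must index a modeled language,
    so an out-of-set trial (label=None) cannot be expressed.
    """
    if label is None:
        raise ValueError("multi-class objective has no out-of-set target")
    z = [a * s + b for s in scores]
    m = max(z)
    log_norm = m + math.log(sum(math.exp(v - m) for v in z))
    return log_norm - z[label]

def binary_loss(scores, label, a, b):
    """Sum of per-language binary cross-entropies for one trial.

    a and b hold one scale and offset per language detector.
    label=None (hypothetical convention) marks an out-of-set trial,
    for which every detector's target is "not this language".
    """
    loss = 0.0
    for k, s in enumerate(scores):
        z = a[k] * s + b[k]
        # Target detector: -log(sigmoid(z)); non-target: -log(1 - sigmoid(z)).
        loss += softplus(-z) if k == label else softplus(z)
    return loss

scores = [2.0, -1.0, -3.0]        # made-up scores for three modeled languages
shared_a, shared_b = [1.0] * 3, [0.0] * 3

in_set_mc = multiclass_loss(scores, label=0)
in_set_bin = binary_loss(scores, 0, shared_a, shared_b)
oos_bin = binary_loss(scores, None, shared_a, shared_b)
print(in_set_mc, in_set_bin, oos_bin)
```

In this sketch, passing the same scale and offset for every language plays the role of a global binary calibration, while distinct values per language would correspond to a language-dependent one. The softmax-based loss has no way to assign a target to an out-of-set trial, which is one way to read why the multi-class objective is described as problematic in that setting.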
