DEVELOPMENT OF A REAL-TIME TRANSCRIPTION DEVICE BASED ON ON-DEVICE SPEECH RECOGNITION FOR USERS WITH HEARING IMPAIRMENT

Authors

  • Marlon Brando Layanto, Universitas Tarumanagara
  • Hadian Satria Utama
  • Wahidin Wahab

DOI:

https://doi.org/10.36595/jire.v9i1.1707

Keywords:

Automatic Speech Recognition, Hearing Impairment, Real-Time Transcription, On-Device ASR, Assistive Technology

Abstract

Hearing loss significantly affects sufferers' quality of life, particularly in social communication. To address limited access to technologies such as hearing aids and cochlear implants, this study designed an assistive device: a real-time transcription unit based on automatic speech recognition (ASR) technology. The device consists of a microphone, a microcontroller with a Bluetooth module, a heads-up display, and a power supply, all embedded in a unit that can be mounted on eyeglasses. Speech captured by the microphone is sent to a smartphone and transcribed using an on-device ASR model based on a Transducer architecture with a Zipformer encoder; the resulting text is then sent back to the device and displayed directly in the user's field of view through the heads-up display. Testing covered transcription accuracy using the word error rate (WER) metric and transcription speed using the real-time factor (RTF). The ASR model performed well, with a lowest average WER of 9.81% across four test datasets. Live trials in a controlled environment (40-60 dB) at distances of 20-100 cm yielded an average WER of 12.93% and an RTF of 0.015, while in an uncontrolled environment (60-70 dB) the device reached a WER of 27.53% and an RTF of 0.015. At a distance of 20 cm, the WER in the two environments differed by only 0.19%. These results show that the device can transcribe quickly and reasonably accurately when positioned close to the speaker.
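As a rough illustration of the two metrics reported above (this is a generic sketch, not the authors' evaluation code), WER is the word-level edit distance between a reference and a hypothesis transcript divided by the number of reference words, and RTF is processing time divided by audio duration:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate via Levenshtein distance over word sequences."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[len(ref)][len(hyp)] / len(ref)


def rtf(processing_seconds: float, audio_seconds: float) -> float:
    """Real-time factor: values well below 1.0 mean faster than real time."""
    return processing_seconds / audio_seconds
```

For example, an RTF of 0.015 means a 10-second utterance is transcribed in about 0.15 seconds, which is consistent with the near-instant display the abstract describes.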




Published

2026-04-24

How to Cite

1. Layanto MB, Utama HS, Wahab W. PENGEMBANGAN ALAT TRANSKRIPSI REAL-TIME BERBASIS ON-DEVICE SPEECH RECOGNITION BAGI PENGGUNA DENGAN GANGGUAN PENDENGARAN. JIRE [Internet]. 2026 Apr. 24 [cited 2026 May 16];9(1):1-12. Available from: http://e-journal.stmiklombok.ac.id/index.php/jire/article/view/1707