Artificial Intelligence Model for Dengue Diagnosis
Abstract
The differential diagnosis of dengue against other arboviral diseases is a significant clinical challenge in endemic regions. Beyond proposing an artificial intelligence (AI) model for dengue diagnosis, this study examines and prepares dengue datasets for training AI models, particularly machine learning models. It evaluated the effectiveness of supervised machine learning techniques for predicting dengue infection from clinical and demographic data. Several binary classification algorithms, both parametric and nonparametric, were evaluated using cross-validation and widely used performance metrics such as accuracy and F1-score. Data quality was found to affect model results: on the balanced dataset with better-treated data, the binary model performed better than on the imbalanced dataset or one with noisy records. Based on the quantitative evidence reviewed, the study concludes that dengue datasets require deeper study and experimentation to facilitate the training of machine learning models.
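As a minimal illustration of why the abstract pairs accuracy with F1-score and distinguishes balanced from imbalanced data, the sketch below computes both metrics from binary labels. The function name `binary_metrics` and all label lists are hypothetical and invented for this example; they are not the study's data, models, or results.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for binary labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with `positive` as the class of interest.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Undefined ratios are reported as 0.0 by convention in this sketch.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical imbalanced labels: 10 positive cases, 90 negative cases.
imbalanced_true = [1] * 10 + [0] * 90
# A trivial predictor that always answers "negative".
always_negative = [0] * 100

# Hypothetical balanced labels: 50 positive cases, 50 negative cases.
balanced_true = [1] * 50 + [0] * 50
# A made-up classifier: 40 of 50 positives and 45 of 50 negatives correct.
balanced_pred = [1] * 40 + [0] * 10 + [0] * 45 + [1] * 5

if __name__ == "__main__":
    print("imbalanced:", binary_metrics(imbalanced_true, always_negative))
    print("balanced:  ", binary_metrics(balanced_true, balanced_pred))
```

On the imbalanced labels, the always-negative predictor reaches an accuracy of 0.9 while its F1-score is 0.0, because it never identifies a positive case. This is one reason accuracy alone can be misleading when classes are imbalanced.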
Article details
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.