Abstract

Objectives
We applied text mining to Sasang constitution case reports to derive network analysis results, and designed a machine learning classification algorithm to select a model suitable for classifying Sasang constitution from text data.
Methods
We searched for case reports on Sasang constitution published from January 1, 2000, to December 31, 2022, and selected 343 papers, yielding 454 cases. Extracted texts were preprocessed and tokenized with the Python-based KoNLPy package, and each morpheme was vectorized using TF-IDF values. Word cloud visualization and centrality analysis identified the keywords mainly used to classify Sasang constitution in clinical practice. To select the most suitable classification model for diagnosing Sasang constitution, the performance of five models (XGBoost, LightGBM, SVC, Logistic Regression, and Random Forest Classifier) was evaluated using accuracy and F1-Score.
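As a rough illustration of the TF-IDF vectorization step, the sketch below weights tokens in already-tokenized documents using the classic definition (term frequency times the log of inverse document frequency). It is not the study's code: the function name `tf_idf`, the English placeholder tokens, and the unsmoothed formula are all assumptions made for illustration, and library implementations often apply a smoothed variant that gives different values.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {token: weight} dict per tokenized document.

    Assumes the classic, unsmoothed weighting:
    tf = count / document length, idf = ln(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each token.
    df = Counter(token for doc in docs for token in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            token: (count / total) * math.log(n_docs / df[token])
            for token, count in counts.items()
        })
    return weights


# Hypothetical, already-tokenized case texts (placeholders, not study data).
docs = [
    ["sweat", "cold", "digestion"],
    ["sweat", "heat", "constipation"],
    ["cold", "digestion", "fatigue"],
]
weights = tf_idf(docs)
```

In this toy corpus a token that appears in only one document (such as `heat`) receives a higher weight than one shared by two documents (such as `sweat`), and a token present in every document would receive a weight of zero. That is the property that lets TF-IDF surface terms distinctive to particular cases.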
Results
Word cloud visualization and centrality analysis identified specific keywords for each constitution. Logistic regression showed the highest accuracy (0.839416), while the random forest classifier showed the lowest (0.773723). By F1-Score, XGBoost scored the highest (0.739811) and the random forest classifier the lowest (0.643421).
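The two metrics can rank models differently, as they do here. The sketch below shows one way that can happen, assuming the F1-Score is macro-averaged across constitution types; the abstract does not state which averaging was used. The label sequences are invented for illustration, and the helper names `accuracy` and `macro_f1` are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of cases whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (macro averaging)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Invented, imbalanced example: the single minority-class case is misclassified.
y_true = ["Taeeumin"] * 6 + ["Soeumin"] * 3 + ["Soyangin"]
y_pred = ["Taeeumin"] * 6 + ["Soeumin"] * 3 + ["Taeeumin"]

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
```

Nine of the ten cases are classified correctly, so accuracy stays high, but the missed minority class contributes an F1 of zero and pulls the macro average well below the accuracy. A model that handles rare constitution types better can therefore lead on F1-Score without leading on accuracy.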
Conclusions
This is the first study to analyze constitution classification by applying text mining and machine learning to case reports, and it provides a concrete research model for follow-up research. The keywords selected through text mining were confirmed to effectively reflect the characteristics of each Sasang constitution type. Based on text data from case reports, the most suitable machine learning models for diagnosing Sasang constitution are logistic regression (highest accuracy) and XGBoost (highest F1-Score).