Chae, Lee, Kim, Kim, and Cho: Multifaceted validity analysis of clinical skills test in the educational field setting

Abstract

Introduction

The importance of clinical skills training in traditional Korean medicine education is increasingly emphasized. Because clinical skills tests are high-stakes assessments that determine success in the national licensing examination, reliable multifaceted methods for analyzing them in actual educational settings are essential. In this study, we applied multifaceted validity evaluation methods to the results of a cardiopulmonary resuscitation (CPR) module to confirm their applicability and effectiveness.

Methods

We analyzed the multifaceted validity of a CPR clinical skills test administered in clinical education settings over the past three years, using descriptive statistics, internal consistency analysis, factor analysis, generalizability-theory G- and D-studies, ANOVA, and Kendall's tau.
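As an illustration of the scale-level analyses named above, the sketch below computes Cronbach's α and Kendall's tau with standard open-source Python tools. The data layout and values are hypothetical; this is not the authors' analysis script.

```python
# Minimal sketch (hypothetical data, assumed wide format: one row per
# examinee, one column per rubric item scored 0-2).
import numpy as np
import pandas as pd
from scipy.stats import kendalltau

def cronbach_alpha(items: pd.DataFrame) -> float:
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
    k = items.shape[1]
    return k / (k - 1) * (1 - items.var(ddof=1).sum() / items.sum(axis=1).var(ddof=1))

rng = np.random.default_rng(0)
scores = pd.DataFrame(rng.integers(0, 3, size=(47, 16)),
                      columns=[f"item{i + 1}" for i in range(16)])
print(f"Cronbach's alpha = {cronbach_alpha(scores):.3f}")

# Rater concordance: Kendall's tau between two raters' total scores for
# the same examinees (the second rater is simulated for illustration).
rater_a = scores.sum(axis=1)
rater_b = rater_a + rng.integers(-3, 4, size=len(rater_a))
tau, p = kendalltau(rater_a, rater_b)
print(f"Kendall's tau = {tau:.3f} (p = {p:.4f})")
```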

Results

Factor analysis and internal consistency analysis showed that the evaluation rubric had an unstable factor structure and low internal consistency. The G-study showed that measurement error in the clinical skills assessment was large, driven by the raters and by unexplained (residual) error. The D-study showed that rater-related error variance must be reduced substantially for the evaluation to reach acceptable dependability. ANOVA and Kendall's tau confirmed that rater heterogeneity was a problem.

Discussion and Conclusion

The validity of clinical skills tests should be evaluated and managed continuously in two stages: before test production and during actual implementation. This study presents specific methods for analyzing the validity of clinical skills training and testing in actual educational settings, and it may serve as a foundation for competency-based, evidence-based education in practical clinical training.

Acknowledgement

This work was supported by a 2-Year Research Grant of Pusan National University.

Notes

Conflict of interest

The authors declare no potential conflict of interest relevant to this article.

Ethical statement

This study was approved by the Institutional Review Board (Consent No.: PNU IRB/2022_75_HR).

Data availability

The data of this study are available upon reasonable request.

Fig. 1
CPR total score according to the year and rater.
Table 1
Descriptive and scale reliability statistics of the CPR examination in 2019, 2020, and 2021

| Year | Group | n | Items | Cronbach's α | Mean | SD | Median | Variance | Skewness | Kurtosis | Min. | Max. |
|------|-------|---|-------|--------------|------|----|--------|----------|----------|----------|------|------|
| 2019 | All | 47 | 16 | 0.646 | 17.362 | 2.566 | 90.000 | 164.593 | −0.615 | −0.607 | 55 | 100 |
| | 1 | 6 | 16 | | 81.667 | 11.690 | 82.500 | 136.667 | 0.245 | 0.959 | 65 | 100 |
| | 2 | 6 | 16 | | 75.833 | 16.558 | 77.500 | 274.167 | 0.128 | −0.665 | 55 | 100 |
| | 3 | 6 | 16 | | 87.500 | 13.693 | 90.000 | 187.500 | −0.876 | −0.048 | 65 | 100 |
| | 4 | 6 | 16 | | 97.500 | 4.183 | 100.000 | 17.500 | −1.537 | 1.429 | 90 | 100 |
| | 5 | 6 | 16 | | 90.833 | 12.007 | 95.000 | 144.167 | −1.201 | 0.847 | 70 | 100 |
| | 6 | 6 | 16 | | 96.667 | 6.055 | 100.000 | 36.667 | −1.952 | 3.657 | 85 | 100 |
| | 7 | 6 | 16 | | 81.667 | 12.910 | 77.500 | 166.667 | 0.705 | −1.623 | 70 | 100 |
| | 8 | 5 | 16 | | 82.000 | 7.583 | 80.000 | 57.500 | 0.315 | −3.081 | 75 | 90 |
| 2020 | All | 49 | 16 | 0.457 | 15.667 | 1.589 | 95.000 | 63.223 | −1.104 | 0.188 | 70 | 100 |
| | 1 | 6 | 16 | | 95.000 | 6.325 | 97.500 | 40.000 | −0.889 | −0.781 | 85 | 100 |
| | 2 | 6 | 16 | | 90.833 | 8.612 | 90.000 | 74.167 | −0.026 | −2.367 | 80 | 100 |
| | 3 | 6 | 16 | | 88.333 | 10.328 | 90.000 | 106.667 | −1.172 | 1.970 | 70 | 100 |
| | 4 | 6 | 16 | | 90.000 | 7.746 | 90.000 | 60.000 | 0.000 | −1.875 | 80 | 100 |
| | 5 | 6 | 16 | | 96.667 | 6.055 | 100.000 | 36.667 | −1.952 | 3.657 | 85 | 100 |
| | 6 | 6 | 16 | | 97.500 | 6.124 | 100.000 | 37.500 | −2.449 | 6.000 | 85 | 100 |
| | 7 | 6 | 16 | | 92.500 | 8.216 | 95.000 | 67.500 | −0.811 | −1.029 | 80 | 100 |
| | 8 | 7 | 16 | | 94.286 | 8.864 | 95.000 | 78.571 | −2.215 | 5.299 | 75 | 100 |
| 2021 | All | 41 | 14 | 0.545 | 13.500 | 1.569 | 94.100 | 83.106 | −0.973 | 0.542 | 65 | 100 |
| | 1 | 6 | 14 | | 90.200 | 7.101 | 91.150 | 50.428 | 0.086 | −1.541 | 82 | 100 |
| | 2 | 6 | 14 | | 99.017 | 2.409 | 100.000 | 5.802 | −2.449 | 6.000 | 94 | 100 |
| | 3 | 6 | 14 | | 99.017 | 2.409 | 100.000 | 5.802 | −2.449 | 6.000 | 94 | 100 |
| | 4 | 6 | 14 | | 91.183 | 10.343 | 94.100 | 106.970 | −0.492 | −1.928 | 77 | 100 |
| | 5 | 6 | 14 | | 88.217 | 6.427 | 88.200 | 41.302 | −1.358 | 2.467 | 77 | 94 |
| | 6 | 6 | 14 | | 87.267 | 10.777 | 88.250 | 116.135 | −0.517 | −0.596 | 71 | 100 |
| | 7 | 5 | 14 | | 82.340 | 10.176 | 88.200 | 103.548 | −1.932 | 3.701 | 65 | 88 |
Table 2
Item reliability statistics of the CPR examination rubric in 2019, 2020, and 2021

| Item | Mean (2019) | SD (2019) | Item-rest r (2019) | α if deleted (2019) | Mean (2020) | SD (2020) | Item-rest r (2020) | α if deleted (2020) | Mean (2021) | SD (2021) | Item-rest r (2021) | α if deleted (2021) |
|------|------|------|------|------|------|------|------|------|------|------|------|------|
| 1 | 0.979 | 0.146 | 0.140 | 0.644 | 1.000 | 0.000 | – | – | 0.976 | 0.156 | 0.056 | 0.549 |
| 2 | 0.979 | 0.146 | 0.140 | 0.644 | 0.980 | 0.143 | 0.063 | 0.456 | 0.951 | 0.218 | 0.322 | 0.509 |
| 3 | 0.979 | 0.146 | 0.140 | 0.644 | 0.980 | 0.143 | 0.063 | 0.456 | 0.927 | 0.264 | 0.381 | 0.490 |
| 4 | 0.745 | 0.441 | 0.190 | 0.641 | 0.857 | 0.354 | 0.142 | 0.443 | 0.976 | 0.156 | 0.056 | 0.549 |
| 5 | 0.979 | 0.146 | 0.319 | 0.635 | 1.000 | 0.000 | – | – | 1.000 | 0.000 | – | – |
| 6 | 0.979 | 0.146 | −0.094 | 0.655 | 0.959 | 0.200 | −0.106 | 0.484 | 0.927 | 0.264 | 0.243 | 0.519 |
| 7 | 0.809 | 0.449 | 0.368 | 0.613 | 0.837 | 0.373 | 0.133 | 0.447 | 0.780 | 0.419 | 0.382 | 0.471 |
| 8 | 1.681 | 0.515 | 0.507 | 0.585 | 1.898 | 0.306 | 0.172 | 0.435 | 1.707 | 0.461 | 0.499 | 0.422 |
| 9 | 0.830 | 0.433 | 0.298 | 0.625 | 0.918 | 0.277 | 0.359 | 0.390 | 0.902 | 0.300 | 0.317 | 0.500 |
| 10 | 1.213 | 0.907 | 0.414 | 0.621 | 1.673 | 0.658 | 0.356 | 0.351 | 0.976 | 0.156 | 0.056 | 0.549 |
| 11 | 0.957 | 0.204 | 0.161 | 0.642 | 0.959 | 0.200 | 0.233 | 0.429 | 1.683 | 0.471 | 0.308 | 0.498 |
| 12 | 1.000 | 0.209 | 0.000 | 0.653 | 1.000 | 0.000 | – | – | 0.927 | 0.264 | 0.047 | 0.557 |
| 13 | 1.809 | 0.449 | 0.260 | 0.630 | 1.898 | 0.421 | −0.007 | 0.498 | 1.000 | 0.000 | – | – |
| 14 | 0.957 | 0.204 | 0.118 | 0.645 | 0.939 | 0.242 | 0.359 | 0.398 | 1.780 | 0.419 | −0.056 | 0.608 |
| 15 | 0.936 | 0.247 | 0.257 | 0.634 | 0.939 | 0.242 | 0.064 | 0.459 | | | | |
| 16 | 1.532 | 0.654 | 0.656 | 0.539 | 1.816 | 0.391 | 0.323 | 0.382 | | | | |

–, not computed (zero item variance); blank cells, item not included in the 14-item 2021 rubric.
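Continuing the hypothetical sketch from the Methods section (reusing `scores` and `cronbach_alpha`), the per-item statistics in Table 2 can be reproduced as follows; items with zero variance yield an undefined item-rest correlation, mirroring the "–" entries above.

```python
# Item-rest correlation and "alpha if item deleted" for each rubric item.
for col in scores.columns:
    rest = scores.drop(columns=col).sum(axis=1)   # total score without this item
    r_ir = scores[col].corr(rest)                 # item-rest correlation
    a_del = cronbach_alpha(scores.drop(columns=col))
    print(f"{col}: item-rest r = {r_ir:.3f}, alpha if deleted = {a_del:.3f}")
```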
Table 3
Model fit measures of the factor analysis of CPR rubric items in 2019, 2020, and 2021

| Year | RMSEA | TLI | BIC | χ² | df | p-value |
|------|-------|-----|-----|----|----|---------|
| 2019 | 0.583 | −0.047 | 987.208 | 1275.969 | 75 | <.001 |
| 2020 | 0.487 | −0.055 | 559.066 | 808.143 | 64 | <.001 |
| 2021 | 0.136 | 0.358 | −83.077 | 76.606 | 43 | 0.001 |

RMSEA, root mean square error of approximation; TLI, Tucker–Lewis index; BIC, Bayesian information criterion.
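For reference, commonly used sample-based forms of the two descriptive fit indices are as follows, where χ²₀ and df₀ denote the baseline model (exact sample-size conventions vary across estimators):

$$\mathrm{RMSEA}=\sqrt{\frac{\max(\chi^{2}-df,\,0)}{df\,(N-1)}},\qquad \mathrm{TLI}=\frac{\chi_{0}^{2}/df_{0}-\chi^{2}/df}{\chi_{0}^{2}/df_{0}-1}$$

For the 2021 solution (χ² = 76.606, df = 43, N = 41), the first formula gives RMSEA ≈ √(33.606/1720) ≈ 0.14, close to the reported 0.136.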

Table 4
Factor loadings of CPR rubric items in 2019, 2020, and 2021

| Item | Loadings (2019) | Uniqueness (2019) | Loadings (2020) | Uniqueness (2020) | Loadings (2021) | Uniqueness (2021) |
|------|------|------|------|------|------|------|
| 1 | 0.994 | 0.002 | – | 0.990 | – | 0.997 |
| 2 | 0.994 | 0.002 | 0.997 | 0.005 | 0.935 | 0.121 |
| 3 | 0.994 | 0.002 | 0.997 | 0.005 | 0.821 | 0.309 |
| 4 | 0.451 | 0.725 | – | 0.985 | – | 0.946 |
| 5 | 0.426 | 0.815 | – | – | – | – |
| 6 | 0.927 | 0.131 | – | 0.995 | 0.308 | 0.892 |
| 7 | 0.546 | 0.649 | – | 0.997 | 0.926 | 0.134 |
| 8 | 0.714 | 0.419 | – | 0.990 | 0.542 | 0.669 |
| 9 | 0.348, −0.464 | 0.663 | 0.967 | 0.065 | 0.364 | 0.839 |
| 10 | 0.503 | 0.740 | 0.725 | 0.467 | – | 0.952 |
| 11 | – | 0.951 | – | 0.997 | 0.460 | 0.779 |
| 12 | −0.599 | 0.627 | – | – | 0.386 | 0.839 |
| 13 | 0.633, 0.342 | 0.458 | – | 0.992 | – | – |
| 14 | 0.688 | 0.507 | 0.602 | 0.637 | – | 0.995 |
| 15 | – | 0.950 | – | 0.918 | | |
| 16 | 0.917 | 0.159 | 0.304 | 0.893 | | |

A three-factor solution was extracted in 2019 and two-factor solutions in 2020 and 2021. –, loading not displayed or item excluded from the analysis; blank cells, item not included in the 14-item 2021 rubric.
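A minimal sketch of how such an exploratory factor solution can be obtained with open-source tooling, reusing the hypothetical `scores` matrix from the Methods sketch; the `factor_analyzer` package is one option and is not assumed to be the authors' software.

```python
# Exploratory factor analysis sketch for Table 4: extract three factors
# and report loadings and uniqueness (1 - communality) per item.
from factor_analyzer import FactorAnalyzer

fa = FactorAnalyzer(n_factors=3, rotation="oblimin", method="minres")
fa.fit(scores)                         # scores: examinee-by-item matrix
print(fa.loadings_.round(3))           # item-by-factor loading matrix
print(fa.get_uniquenesses().round(3))  # uniqueness per item
```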
Table 5
Analysis results of the G-study in 2019, 2020, and 2021

| Year | Effect | df | T | Sum of squares | Mean square | Variance component |
|------|--------|----|---|----------------|-------------|--------------------|
| 2019 | p | 40 | 2843703.645 | 2214042.579 | 55351.064 | −3489.952 |
| | r | 3 | 651972.027 | 22310.961 | 7436.987 | −1509.119 |
| | p×r | 120 | 11183319.360 | 8317304.754 | 69310.873 | 69310.873 |
| | Total | 163 | | 10553658.294 | | |
| 2020 | p | 48 | 60633.713 | 45420.887 | 946.268 | 71.335 |
| | r | 3 | 15293.153 | 80.327 | 26.776 | −12.942 |
| | p×r | 144 | 155887.719 | 95173.679 | 660.928 | 660.928 |
| | Total | 195 | | 140674.893 | | |
| 2021 | p | 46 | 189166927.415 | 152602872.108 | 3317453.741 | −179236.126 |
| | r | 3 | 37879212.951 | 1315157.643 | 438385.881 | −76510.901 |
| | p×r | 138 | 747229042.693 | 556746957.635 | 4034398.244 | 4034398.244 |
| | Total | 187 | | 710664987.385 | | |

p, persons (students); r, raters; p×r, person-by-rater interaction (confounded with residual error).
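For a fully crossed person-by-rater (p × r) design, the variance components above follow from the ANOVA mean squares. A worked check against the 2020 rows (n_p = 49 students, n_r = 4 raters):

$$\hat{\sigma}^{2}_{pr}=MS_{pr}=660.928,\qquad \hat{\sigma}^{2}_{p}=\frac{MS_{p}-MS_{pr}}{n_{r}}=\frac{946.268-660.928}{4}=71.335,\qquad \hat{\sigma}^{2}_{r}=\frac{MS_{r}-MS_{pr}}{n_{p}}=\frac{26.776-660.928}{49}\approx-12.942$$

Negative estimates, such as the rater component here, indicate a near-zero component and are conventionally truncated to zero in subsequent D-study calculations.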
Table 6
Analysis results of the D-study in 2020

| Students | Raters | σ²(τ) | σ²(δ) | σ²(Δ) | Eρ² | Φ |
|----------|--------|-------|-------|-------|-----|---|
| 1 | 4 | 71.335 | 165.232 | 165.232 | 0.30 | 0.30 |
| 1 | 8 | 71.335 | 82.616 | 82.616 | 0.46 | 0.46 |
| 1 | 12 | 71.335 | 55.077 | 55.077 | 0.56 | 0.56 |
| 1 | 21 | 71.335 | 31.473 | 31.473 | 0.69 | 0.69 |
| 1 | 22 | 71.335 | 30.042 | 30.042 | 0.70 | 0.70 |
| 1 | 23 | 71.335 | 28.736 | 28.736 | 0.71 | 0.71 |
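The Table 6 coefficients follow directly from the 2020 G-study components, with the negative rater component treated as zero (which is why σ²(δ) = σ²(Δ) in every row). For example, with n′_r = 4 raters:

$$\sigma^{2}(\delta)=\frac{\sigma^{2}_{pr}}{n'_{r}}=\frac{660.928}{4}=165.232,\qquad E\rho^{2}=\frac{\sigma^{2}_{p}}{\sigma^{2}_{p}+\sigma^{2}(\delta)}=\frac{71.335}{71.335+165.232}\approx 0.30$$

Increasing the panel to 23 raters reduces σ²(δ) to 28.736 and raises both Eρ² and Φ to about 0.71, which quantifies how much rater-related error would have to shrink for the assessment to reach acceptable dependability.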

