TRANSLATION AND CROSS-CULTURAL ADAPTATION OF GENERIC SKILL SELF-ASSESSMENT INSTRUMENT FOR INDONESIAN UNDERGRADUATE MEDICAL STUDENTS

Background: Various educational strategies promote generic skills development in medical education; hence, there is a need for a valid and reliable instrument to assess these skills. This study aims to translate and adapt a generic skills self-assessment instrument developed by Groen et al.1 to assess Indonesian medical students’ generic skills in a classroom context. Methods: WHO's guidelines were used for the translation process, which consisted of: 1) forward translation, 2) expert panel review (using the Delphi method), 3) back translation, 4) pre-testing and cognitive interviews, and 5) the final version. Additional measures were employed to improve the translation accuracy, including proofreading (prior to step 2), expert panel reviews after steps 3 and 4, and pilot testing along with psychometric testing after step 5. Back translation was done by a professional translation service. Ten fourth-year students from Atma Jaya School of Medicine and Health Sciences were involved in step 4; meanwhile, we piloted the translated instrument with 35 other fourth-year students from the same sample pool. We also conducted an internal reliability test using Cronbach's alpha and construct validity tests, including corrected item-total correlation and principal component analysis. Results: Steps 1-3 produced an Indonesian version of the generic skills assessment instrument with good face and content validity. Quantitative data analysis showed high internal reliability (Cronbach’s Alpha = .955


INTRODUCTION
][5][6] Examples of essential generic skills include critical thinking, organizational skills, mental flexibility, communication, interpersonal and teamwork skills, self-leadership, and digital literacy. Due to its importance, generic skills development has become one of higher education's primary objectives.7

Employers today value a vast array of generic skills. Approximately 90% of employers identified critical thinking abilities as "very" or "somewhat" important, yet only 39% of employers think that higher education graduates adequately possess this skill.8 Teamwork is another skill desired by employers, as 93% of them viewed it as "very" or "somewhat" important, yet only 48% believed that fresh graduates are able to perform effectively in teams.8 Particularly in the healthcare field, teamwork, intra- and inter-collaboration, as well as critical and logic-systematic thinking, are often sought from healthcare workers to improve healthcare delivery quality, cost-effectiveness, and efficiency.9 Further, critical and logic-systematic thinking also improves the accuracy of diagnosis-making and disease classification, helps determine the best and most appropriate therapy regimen, and reduces medical error rates.10,11

Noble professionalism (defined as professionalism that is based on one's faith in God Almighty), self-reflection, lifelong learning, and effective communication are the essential characteristics expected from an Indonesian physician.12 As such, developing generic skills associated with these characteristics becomes an inherent process of Indonesian formal medical education. Personalized feedback, flipped classrooms, reflective writing, and other student-centered learning approaches such as problem-based learning (PBL) and skills laboratory (SL) are often used as educational strategies to develop the aforementioned generic skills during the undergraduate medical education phase.13,14 Unfortunately, those educational strategies are mainly used to facilitate students' medical knowledge and skills during the preclinical education phase, and thus the majority of available instruments to assess students' learning are not suitable for assessing Indonesian students' generic skills development.
Groen et al.1 proposed a self-assessment tool to observe the growth and development of medical students' generic skills. This instrument was adapted from the generic skills acquisition self-assessment questionnaire developed by Maastricht University for the Bachelor of European Studies program.15 Maastricht's self-assessment questionnaire was specifically designed to assess students' performance during problem-based learning. The items were all phrased as positive sentences, which may limit the questionnaire's applicability in a broader learning setting and weaken its construct validity due to the risk of response bias. The generic skills self-assessment instrument proposed by Groen et al. (2020) keeps most of the original statements while adding new items to assess soft skills beyond the PBL context. It also mixes in negatively phrased statements to stimulate students to be more critical in assessing their skills, in the hope of reducing the risk of response bias. The revised instrument consists of 33 statements referring to specific generic skills, each scored on a 5-point Likert scale. The items are grouped into five distinct domains deemed relevant in an active-learning context: "Communication Skills" (6 items), "Analytical Thinking Skills" (7 items), "Teamwork Skills" (8 items), "Time Management Skills" (6 items), and "Professional Attitude" (6 items).

METHODS
We referred to the WHO guideline on translating and adapting an instrument.16 The authors had acquired permission to use the instrument from Groen et al.1 Ethical clearance was obtained from the Research Ethics Committee at Atma Jaya Catholic University of Indonesia. The WHO process consists of five stages: 1) forward translation, 2) expert panel review, 3) back translation, 4) pre-testing and cognitive interviews, and 5) the final version. We added five stages to ascertain the accuracy of the translated instrument.

Step 1 -Forward Translation from English to Indonesian
Forward translation was done by GA, who had a background in medical education and at least 5 years of experience as a medical teacher.

Step 2 -First expert panel review
The initial translated instrument was then reviewed by an expert panel consisting of ER, NP, GA, and CC, who were all proficient in English and Indonesian. All members had a minimum of five years of experience as mentors and educators in medical education and were actively involved in the Medical Education Unit (MEU). They also had considerable experience in translating and developing instruments. The translated instrument was reviewed to consider the undergraduate medical students' perspective, the Indonesian sociocultural context, including the higher education learning culture in Indonesia, and the medical terminology used in Indonesian medical schools. The expert panel paid special attention to any potential conceptual discrepancies between the original instrument and the initial translation.
In particular, the panel considered the sociocultural discrepancies between Western and Eastern contexts in higher education and used the Content Validity Index (CVI) to decide the relevance of each item on a 4-point Likert scale: 1 (Not Relevant), 2 (Need Major Revision), 3 (Need Minor Revision), and 4 (Relevant).17 An item was considered valid if the interrater agreement was at least 0.8. Several items repeatedly failed to reach the minimum CVI score because one or more experts disagreed with the terminology or words used in the translated version, which might change the concept and/or meaning of the original statement. Hence, the first expert panel review was conducted three times to discuss these controversial items.17

Step 3 -Proofreading
The authors used a professional Indonesian translating and proofreading service with extensive experience working on scientific manuscripts to proofread the initial translation that had gone through the first expert panel review.

Step 4 -Second expert panel review
The proofread result from step 3 was reviewed by the same expert panel as in step 2. The purpose of this step was to ensure that the context remained unchanged during the translation process. All expert panel members agreed with the proofread result, and no further changes were made.

Step 5 -Back translation
The same translating and proofreading service was employed to do the back translation.

Step 6 -Third expert panel review
The back translation result from step 5 was reviewed and compared with the original instrument. This step was done to ensure translation accuracy.

Step 7 -Pre-testing and Cognitive Interviewing
We used an accidental sampling method to recruit the participants for this step: we contacted students who shared similar characteristics with the target population for the pilot study and invited them to participate. We asked ten fourth-year students who were waiting to enter clinical rotation to fill in the instrument and provide feedback on the translation quality. The purpose of this step was to find out whether the students could understand and appropriately fill out the translated self-assessment questionnaire. All participants signed an online informed consent form prior to answering all questions in the self-assessment questionnaire independently. They were interviewed via phone calls or online meeting platforms (Microsoft Teams). Each participant was asked to comment on the instrument's readability and provide suggestions, if any, to improve the translation. Participants were also asked to share their overall opinion on the generic skills assessment instrument.

Step 8 -Fourth expert panel review
The fourth expert panel review was conducted to examine the suggestions collected in step 7. The panel mainly discussed the sentence structure (negative or positive phrasing), the measurement scale, and wording preferences for several English expressions that are not often used in Bahasa Indonesia (for example, "questioning one's assumptions").

Step 9 -Psychometric Testing
Steps 1-8 were parts of the content validity process, in which we sought to ensure the accuracy and quality of our translation qualitatively based on expert judgement. We then proceeded to pilot the translated instrument with 35 (out of 36) fourth-year undergraduate medical students (5 male and 30 female students) who attended the Medical Education Elective Block at Atma Jaya Catholic University of Indonesia, School of Medicine and Health Sciences. This block was chosen for our pilot study because it employed active and collaborative learning using a project-based learning approach, which supported the development of students' generic skills. The block lasted for four weeks, from the last week of August 2022 until the third week of September 2022. All participants were given an explanation of the study at the beginning of the block and were asked to provide their consent if they agreed to participate. Students were told that participation in this study was voluntary and would not affect their academic performance during the block. They were also told there would be no consequence for students who refused to participate.
After obtaining informed consent, participants were asked to fill in the self-assessment instrument online at the beginning (pre-test) and the end (post-test) of the block. We combined the pre- and post-test data (n = 70) to obtain a more adequate sample size for the quantitative data analysis. We assumed that the pre- and post-test results would differ due to the impact of the block's learning activities; thus, we treated the pre- and post-test results as different data points for the purpose of the construct validity analysis and internal reliability test. We used Cronbach's Alpha to measure the instrument's internal reliability, along with corrected item-total correlation and confirmatory principal component analysis (PCA) to measure the construct validity.
The questionnaire would be considered to have good internal reliability if the Cronbach's Alpha was > .70, and each item would be considered to have a good corrected item-total correlation if its Pearson correlation coefficient (r value) was > .232. All quantitative analyses were done using IBM SPSS ver. 22.
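The two reliability statistics named above can be illustrated with a minimal Python sketch. It is not the SPSS procedure used in this study: the function names and the toy score table are hypothetical, and it assumes that negatively phrased items have already been reverse-scored.

```python
from statistics import mean, variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table of respondents (rows) by items (columns)."""
    k = len(scores[0])                      # number of items
    items = list(zip(*scores))              # one tuple of scores per item
    sum_item_var = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum_item_var / total_var)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def corrected_item_total(scores):
    """Correlate each item with the total of all OTHER items."""
    result = []
    for j in range(len(scores[0])):
        item = [row[j] for row in scores]
        rest = [sum(row) - row[j] for row in scores]
        result.append(pearson_r(item, rest))
    return result


# Hypothetical toy data: 4 respondents, 2 items (not data from this study).
toy_scores = [[1, 2], [2, 1], [3, 4], [4, 3]]
alpha = cronbach_alpha(toy_scores)
r_values = corrected_item_total(toy_scores)
# Items at or below the .232 cutoff stated in the text would be flagged.
flagged_items = [j + 1 for j, r in enumerate(r_values) if r <= 0.232]
```

The correction matters because correlating an item with a total that includes the item itself inflates the coefficient, especially in short scales.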
We hypothesized that the components of the generic skills assessment instrument are correlated with each other to some degree (for example, one's ability to communicate might affect one's ability to collaborate, gather the information needed for analysis in group learning, manage one's time, and carry out one's responsibilities professionally); therefore, we used oblimin (oblique) rather than orthogonal rotation during the principal component analysis. The Kaiser-Meyer-Olkin (KMO) measure and Bartlett's test of sphericity were computed prior to performing PCA to check whether the items were sufficiently correlated for the PCA to be meaningful. An eigenvalue of > 1 was used as the cutoff point in determining the number of factors in the translated questionnaire.
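The eigenvalue cutoff described above (often called the Kaiser criterion) can be sketched as follows. This is an illustration only, not the SPSS routine used in the study: the eigenvalues of a small correlation matrix are obtained with a plain Jacobi rotation method, and the function names and the 3 x 3 matrix are hypothetical.

```python
import math


def jacobi_eigenvalues(matrix, tol=1e-12, max_sweeps=100):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations."""
    n = len(matrix)
    a = [row[:] for row in matrix]          # work on a copy
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:                       # off-diagonal part has vanished
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                diff = a[q][q] - a[p][p]
                # Rotation angle that zeroes the (p, q) entry.
                if diff == 0:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):          # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):          # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


def components_to_retain(correlation_matrix, cutoff=1.0):
    """Count components whose eigenvalue exceeds the cutoff."""
    eigenvalues = jacobi_eigenvalues(correlation_matrix)
    return sum(1 for value in eigenvalues if value > cutoff)


# Hypothetical correlation matrix for three items (not data from this study).
toy_corr = [
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]
eigenvalues = jacobi_eigenvalues(toy_corr)
retained = components_to_retain(toy_corr)
```

Because the eigenvalues of a correlation matrix sum to the number of items, an eigenvalue above 1 marks a component that accounts for more variance than a single item would on its own.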

Step 10 -Final Version
The instrument incorporating any revisions made in step 9 became the final version of the Indonesian generic skills self-assessment for medical students. The validated instrument was then used to measure the generic skills of fourth-year medical students who participated in the Medical Education block for four weeks; the analysis and findings of that study will be published in a separate article. A schematic diagram depicting the translation and cross-cultural adaptation of the generic skills self-assessment instrument into Bahasa Indonesia is presented in Figure 1.

RESULTS AND DISCUSSIONS
The Indonesian translated instrument consisted of 33 items using a 5-point Likert scale, the same as the original. Most changes were made to better reflect the Indonesian undergraduate medical education context. We changed the term "Tutorial" from the original version into "group discussions" to expand the instrument's applicability to various educational activities that encourage active learning. The terms "speak up" and "fellow students" were also changed into "give opinion" and "group peers" to suit the group discussion context. The authors simplified some items to improve participants' understanding. Detailed changes in the Indonesian translated version are summarized in Table 1.

The second discussion focused on adapting the instrument to assess generic skills during mentor-led group discussions in a project-based learning approach instead of a problem-based learning approach. We revised 13 out of 33 items to better reflect the learning interactions and group dynamics in our setting while also deliberating on any possible cultural differences. Item no. 17, "I suggest an intervention to promote group dynamics", caused controversy during this second discussion. Two experts considered the influence of the Indonesian sociocultural context on students' learning tendencies and claimed that Indonesian students typically chose to be passive during teaching and learning activities. This situation might make the term "intervention" irrelevant in assessing students' teamwork skills. This argument was contested by the other two experts, who pointed out that there was not enough empirical evidence to support that claim. In the end, we kept the term "intervention" because it represented a higher and more specific form of group communication skill.
Still focusing on the Indonesian sociocultural context, the last discussion centered on items that were deemed irrelevant and in need of major revision to better reflect our sociocultural context. For example, item no. 32, "I sometimes show my frustrations about the learning process", was considered irrelevant in Indonesian culture. The expert panel expressed contradicting arguments: some panel members agreed with the original version, while the rest disagreed. The panel members who disagreed pointed out that in Asian culture, students/"juniors" rarely show frustration in public; hence, this would prevent participants from acknowledging this statement and propel them to choose a normative answer. Similar to the previous debate, the panel agreed to keep the original version since there was not enough empirical evidence to support the argument. At the end of the third discussion, we obtained a CVI of 0.99 and considered the translated instrument to have great content and face validity.
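The CVI reasoning used by the expert panel can be sketched in a few lines of Python. The exact formula applied in this study is not spelled out in the text, so the sketch assumes the commonly used item-level definition, in which the CVI of an item is the share of experts who rate it 3 or 4 on the 4-point scale; the function names and the ratings shown are hypothetical.

```python
def item_cvi(ratings, relevant_from=3):
    """Share of experts rating the item at or above `relevant_from` (assumed rule)."""
    return sum(1 for r in ratings if r >= relevant_from) / len(ratings)


def scale_cvi_average(rating_table):
    """Average of the item-level CVIs across all items."""
    return sum(item_cvi(r) for r in rating_table.values()) / len(rating_table)


def items_below_cutoff(rating_table, cutoff=0.8):
    """Items that would go back to the panel for another round of discussion."""
    return [item for item, r in rating_table.items() if item_cvi(r) < cutoff]


# Hypothetical ratings from a four-member panel (not data from this study).
panel_ratings = {
    "item 1": [4, 4, 4, 3],
    "item 2": [4, 2, 4, 4],
    "item 3": [3, 4, 4, 4],
}
overall_cvi = scale_cvi_average(panel_ratings)
to_rediscuss = items_below_cutoff(panel_ratings)
```

Under this assumption, a four-member panel with a cutoff of 0.8 passes an item only when all four experts rate it 3 or 4, since a single dissenting rating lowers the item-level CVI to 0.75. This is consistent with the repeated review rounds described in the Methods.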
In the proofreading step (step 3), 24 out of 33 items were adjusted to conform to Indonesian grammar. Most of the items were simplified by removing adjectives or verbs, such as "masalahnya masih belum jelas" (the issue is not yet clear) vs. "masalahnya tidak jelas" (the issue is unclear) and "saya merasa nyaman" (I am feeling comfortable) vs. "saya nyaman" (I am comfortable). At the third expert panel review (step 6), eight out of 33 items were discussed further due to slight deviations in meaning from the original version, such as the words "understand" vs. "appreciate", "draw" vs. "adapt", and "possess" vs. "master". These words have many synonyms in Bahasa Indonesia, so the expert panel focused on selecting the words with the most similar meaning in context.
After the third expert panel review, we asked selected students to provide feedback on the readability of the translated instrument (step 7). All the participants in step 7 agreed that the instrument was easy to understand. Seven out of ten participants suggested changing the scale to range from "strongly disagree" to "strongly agree" instead of from "This is a skill that definitely needs further training" to "I am fully capable of doing this". They argued that the original scale was confusing; hence, they proposed simplifying it to assist them in filling out the self-assessment. Further, the original scale caused two participants to misinterpret the items due to the negative phrasing (for example, item no. 4, "I face difficulties summarizing other students' contributions"). Aside from the negative phrasing, other items were misinterpreted by the participants because they contained expressions that are unusual in Bahasa Indonesia. For example, item no. 7, "I am comfortable questioning my assumptions and views", was misunderstood by two participants: one participant defined the item as reflecting on their assumptions, while the other did not understand what the item meant at all. Item no. 10, "I know whether or not the post-discussion has covered all issues raised during the pre-discussion", was also misinterpreted by two participants. One participant thought of this item as a skill to summarize the discussion, while the other defined it as a skill to know whether they had discussed all the Learning Objectives or not. Lastly, the word "intervention" was misinterpreted by one participant as "feedback", while the term "group dynamics" was misinterpreted by another participant as "differences within the group".
Despite the feedback obtained from the participants' interviews, the expert panel agreed not to turn the negative sentences into positive ones, in order to keep the translated version as close as possible to the original version. The expert panel also agreed not to change the rating scale for the same reason. The word "summarizing" was kept despite being misinterpreted by the participants because it was the formal translation in Bahasa Indonesia. The expert panel changed the item "questioning my assumptions and views" into "reviewing my assumptions and views" and the item "I know whether or not the post-discussion has covered all issues raised during the pre-discussion" into "I realize all the issues raised at the start of the discussion have been discussed" to eliminate the misinterpretations. Other words mentioned in the previous stage were still translated according to the dictionary.

Internal Reliability and Construct Validity
The Cronbach's Alpha of the 33 items is .955, indicating that this instrument has great internal consistency. We then checked the corrected item-total correlation of each item and found that the values range from .345 to .757. This result showed that each item had moderate to relatively good discriminant validity. The details of the reliability statistics are presented in Table 1.
The KMO measure and Bartlett's test indicated that the instrument's items were sufficiently correlated (KMO = .849, p < .001), and hence it was plausible to do confirmatory PCA using our dataset. We employed oblimin rotation during our PCA and found 6 components instead of the 5 suggested in the original instrument. Component 6 had a moderate negative correlation with components 1, 2, and 3 (r = -.502, r = -.413, and r = -.363, respectively) and a weak negative correlation with components 4 and 5 (r = -.134 and r = -.054, respectively). This negative correlation was also reflected in the factor loading for each item in component 6, which was indicative of a true negative correlation. On the other hand, component 5 had a weak negative correlation with components 1, 3, 4, and 6 (r = -.189, r = -.116, r = -.066, and r = -.054, respectively) and a weak positive correlation with component 2 (r = .187). However, the items grouped in component 5 had positive factor loading values; hence, we decided to treat component 5 as having a positive correlation. Component 1 had a moderate correlation (r = .515) with component 3 and a moderate reversed correlation with component 6 (r = -.502). Component 2 also had a moderate reversed correlation with component 6 (r = -.413). The details of the component correlation matrix are presented in Table 2.

This paper described the cross-cultural adaptation and psychometric testing process of the Indonesian version of the Generic Skills Self-Assessment for Medical Students that was proposed by Groen et al.1 Initially, we meant only to translate the instrument into Bahasa Indonesia, to keep the meaning of each item as close as possible to the original instrument as well as to simplify the process. Most of the changes we made during translation involved simplifying the English terms and expressions. Hence, finding suitable words that closely resembled the original meaning was the main focus of the expert panel reviews.
During the translation process, several items were carefully reviewed to make them more relevant to Indonesian students. Items no. 7 (challenging own assumption), 17 ("intervention"), and 32 (showing frustration) were the most challenging items and were discussed intensely during the expert panel reviews due to potential sociocultural discrepancies that might create additional challenges for students during their self-assessment. For example, item no. 17 asks about students' ability to suggest an intervention to improve the group's dynamics. Merriam-Webster defines "intervention" as an active act of interfering to improve a situation. This is not a familiar concept for most Indonesian students, who typically value creating balance and harmony within the social structure.18 On the contrary, creating an intervention might be considered an unwanted behavior as it may potentially disrupt social harmony;18 hence, students might provide a contradicting answer on this item.][21] This situation layered a different translation issue on item no. 32, "I sometimes show my frustrations about the learning process". All experts and cognitive interview participants agreed that the translation for this item was suitable and that it was easy to understand. However, there was a concern that students might consider this a weakness (unacceptable behavior), whereas the original statement referred to this as the student's ability to be self-conscious of their weaknesses (acceptable behavior). Students might give normative answers when filling in the self-assessment instrument, which might alter the professionalism construct of the original instrument.

A common issue with translating a document using the forward and backward translation technique is the risk of producing a literal translation that may impair the natural expression in the target language.22 An iterative process involving repeated reviews by experts (who are proficient in the original and target languages) and the target population is considered a superior quality control measure in translating an instrument.22 Particularly in the healthcare field, a cross-cultural adaptation of an instrument is preferred to translation to better reflect the value and meaning behind the construct being measured by the instrument.23 However, this process indirectly influences the validity and reliability of the adapted instrument, and hence psychometric testing is encouraged to ensure the instrument's quality in measuring the intended construct.23

The psychometric testing more or less confirmed the need for cross-cultural adaptation of the instrument. Despite a high internal consistency (Cronbach's Alpha > 0.8), there were items that had relatively low item-total correlations, including item no. 2, "My group peers often do not appreciate my contributions" (r = .345), item no. 5, "I am often nervous to give an opinion" (r = .374), and item no. 32, "I sometimes show my frustrations during the learning process" (r = .483). The low r values of these items may suggest that they were less relevant in measuring the generic skills construct for Indonesian students compared to other items in the instrument.24 Considering that factor analysis was not done on the original instrument, we conducted an exploratory instead of confirmatory factor analysis using principal component analysis. This analysis revealed a more interesting result for further discussion. We found 6 instead of 5 generic skills domains, resulting in the reclassification of item questions in this instrument. Further, we also found that teamwork, analytic, and communication skills seemed to be highly correlated. In particular, analytical skill has a moderate correlation with the teamwork domain (r = .515), which may suggest that Indonesian students' ability to analyze problems tends to be better when they are in a group situation, provided that the group functions properly.25

On the other hand, the communication skills domain has a weak to moderate-weak correlation with the rest of the generic skills domains (r-value ranging from -.134 -.167). However, items representing communication skills from the original instrument can be found in 4 other new generic skills domains. This finding may seem contradictory at first, but further analysis of Indonesian collectivist culture may provide a new perspective on this idiosyncrasy. Collectivist culture puts emphasis on high-context communication, where non-verbal cues and contextual and physical information play an important role in daily conversations.25 The importance of non-verbal communication is reflected in the social judgment domain, where students' ability to perform social judgment hinges on their ability to read non-verbal cues as well as contextual and physical information. Communication skills are also among the most highly regarded soft skills for facilitating efficient teamwork and thus have a significant impact on team performance.26,27

Another intriguing finding is that all of the items in the perseverance domain relate to students' reluctance to admit their weaknesses in public. In a collectivist country, such as Indonesia, people tend to 'save face' by avoiding making mistakes or going against the norm ('trespasses') in public.20,25 Further, students may feel obligated to do their best in a team situation to save their group's face. This might explain why items in this domain relate to one's ability to deal with problems in a group setting (i.e., speaking up, being unprepared during a discussion, showing frustration). In the original instrument, these items illustrate one's ability to acknowledge and confront one's weaknesses (professional behavior). It might not be too farfetched to conclude that professional behavior in Indonesia has two dimensions: the personal and the communal dimension.

Figure 1. Cross-Cultural Adaptation Process of Indonesian Generic Skills Self-Assessment for Medical Students

Table 1. Comparison between the Original and Indonesian Versions of the Generic Skills Self-Assessment Instrument (Cronbach's Alpha = .955)
Item number | Original version | Indonesian version | Corrected Item-Total Correlation | Cronbach's Alpha if Item Deleted
Communication Skills | Keterampilan Komunikasi
Saya kesulitan berefleksi secara kritis apa yang telah dibaca (I have difficulties to critically reflect on what I have read).