eprintid: 79402 rev_number: 9 eprint_status: archive userid: 1290 dir: disk0/00/07/94/02 datestamp: 2023-11-14 01:26:38 lastmod: 2023-11-14 01:26:38 status_changed: 2023-11-14 01:26:38 type: thesis metadata_visibility: show creators_name: Elvira, Melly creators_name: Retnawati, Heri title: Pengembangan Instrumen Penilaian Kinerja Praktikum Kimia. ispublished: pub subjects: D0 subjects: F3 divisions: pps_lit_evazdik full_text_status: restricted keywords: chemistry students, MFRM, performance assessment, chemistry practicum abstract: This study aimed to (1) produce a construct for a chemistry practicum performance assessment instrument, (2) test the feasibility of the developed performance assessment instrument, (3) produce a performance assessment instrument with characteristics good enough for use in chemistry practicum activities, and (4) determine the profile of students’ skills in carrying out practicum work based on the developed instrument. This quantitative study used an instrument-development design and proceeded in three stages: a preliminary study, instrument development, and a try-out. It involved three measurement facets: chemistry students of the Chemistry Study Program, chemistry lecturers as raters, and the chemistry practicum assessment criteria. Convenience sampling was used, treating the available sample, with its defining criteria, as drawn from a hypothetical population. Content validity was established through expert judgement and construct validity through CFA. The characteristics of the instrument’s facets were obtained with the Many-Facet Rasch Model (MFRM) using the Facets Rasch application. The results are as follows.
(1) The construct of the students’ chemistry practicum performance instrument, developed through a literature review, comprises five indicators: literacy skills, practical skills, data-analysis skills, communication skills, and attitude. (2) The developed instrument shows good validity and reliability that meet the criteria, supporting its use for measuring students’ chemistry practicum ability. (3) The instrument’s characteristics show that the three facets involved in measuring practicum ability (rater severity, difficulty of the assessment criteria, and student ability) generally fit the model; a few misfitting items remain but can still be revised. (4) The performance measurements show that student ability can be grouped into 11 levels. Abilities are spread rather widely, so some students were detected with extreme scores. Overall, about 52% of students performed the practicum at or above expectations. date: 2023-08-29 date_type: published institution: Sekolah Pascasarjana department: Penelitian dan Evaluasi Pendidikan thesis_type: disertasi referencetext: Adams, C. J. (2020). A constructively aligned first-year laboratory course. Journal of Chemical Education, 97(7), 1863–1873. https://doi.org/10.1021/acs.jchemed.0c00166 Addabbo, T., Ales, E., Curzi, Y., Fabbri, T., Rymkevich, O., & Senatori, I. (2021). Performance Appraisal in Modern Employment Relations: An Interdisciplinary Approach. Springer International Publishing. https://books.google.co.id/books?id=85IYzgEACAAJ Agustian, H. Y., Finne, L. T., Jørgensen, J. T., Pedersen, M. I., Christiansen, F. V., Gammelgaard, B., & Nielsen, J. A. (2022). Learning outcomes of university chemistry teaching in laboratories: A systematic review of empirical literature.
Review of Education, 10(2), e3360. https://doi.org/10.1002/rev3.3360 Albright, J. J. (2006). Confirmatory factor analysis using AMOS, LISREL, and MPLUS. In Confirmatory Factor Analysis. The Trustees of Indiana University. Aldiyah, E. (2021). Lembar kerja peserta didik (LKPD) pengembangan sebagai sarana peningkatan keterampilan proses pembelajaran IPA di SMP. TEACHING: Jurnal Inovasi Keguruan Dan Ilmu Pendidikan, 1(1), 67–76. https://doi.org/10.51878/teaching.v1i1.85 Ali, S. S. (2019). Problem based learning: A student-centered approach. English Language Teaching, 12(5), 73–78. Allen, M. J., & Yen, W. M. (1979). Introduction to measurement theory. Wadsworth. Ambusaidi, A., Al Musawi, A., Al-Balushi, S., & Al-Balushi, K. (2018). The impact of virtual lab learning experiences on 9th grade students’ achievement and their attitudes towards science and learning by virtual lab. Journal of Turkish Science Education, 15(2), 13–29. Andersen, E. B. (1973). A goodness of fit test for the Rasch model. Psychometrika, 38(1), 123–140. Andrich, D. (1978). Scaling attitude items constructed and scored in the Likert tradition. Educational and Psychological Measurement, 38(3), 665–680. https://doi.org/10.1177/001316447803800308 Andrich, D. (2010). Rasch models. In International Encyclopedia of Education. https://doi.org/10.1016/B978-0-08-044894-7.00258-X Andrich, D. (2019). Rasch Rating-Scale Model. In W. J. van der Linden (Ed.), Handbook of item response theory (1st ed., pp. 75–94). Chapman and Hall/CRC. https://doi.org/10.1201/9781315119144 Anggraini, W. D., Ramlawati, & Anwar, M. (2017). Pengembangan Perangkat Penilaian Kinerja dan Sikap Pada Praktikum Titrimetri dan Gravimetri SMK-SMTI Makassar. Chemistry Education Review, 1(1), 35–44. Anthony, C. J., Styck, K. M., Volpe, R. J., & Robert, C. R. (2023).
Using many-facet Rasch measurement and generalizability theory to explore rater effects for direct behavior rating–multi-item scales. In School Psychology (Vol. 38, pp. 119–128). Educational Publishing Foundation. https://doi.org/10.1037/spq0000518 Antrakusuma, B., Indriyanti, N. Y., & Sari, M. W. (2021). Preliminary Study: Chemistry Laboratory Virtual Innovation as An Optimization of Science Learning During the Covid-19 Pandemic. Jurnal Pena Sains, 8(2), 88–94. https://journal.trunojoyo.ac.id/penasains/article/view/12048 Armstrong, M. (2006). Performance management: key strategies and practical guidelines (3rd ed.). Kogan Page. Aurora Discovery, Inc. (2005). An Interview with Peter J. Coassin. Assay and Drug Development Technologies, 3(2), 125–131. Ausubel, D., Novak, J. D., & Hanesian, H. (1978). Educational Psychology: A Cognitive View. New York: Werbel & Peck. Aykaç, N., Ulubey, Ö., Çelik, Ö., & Korkut, P. (2019). The Effects of Drama on Pre-service Teachers’ Affective Traits about Teaching. International Journal of Contemporary Educational Research, 6(2), 338–351. https://doi.org/10.33200/ijcer.587566 Azwar, S. (2015). Reliabilitas dan Validitas (4th ed.). Pustaka Pelajar. Babin, B. J., & Zikmund, W. G. (2016). Exploring Marketing Research (11th ed.). Cengage Learning. Baharudin, H., Maskor, Z. M., & Matore, M. E. E. M. (2022). The raters’ differences in Arabic writing rubrics through the Many-Facet Rasch measurement model. Frontiers in Psychology, 13, 1–10. https://doi.org/10.3389/fpsyg.2022.988272 Bajpai, N. (2018). Business Research Methods (2nd ed.). Pearson India. http://www.amazon.com/Business-Research-Methods-2nd-Edition/dp/1741032539 Ballen, C. J., Wieman, C., Salehi, S., Searle, J. B., & Zamudio, K. R. (2017). Enhancing Diversity in Undergraduate Science: Self-Efficacy Drives Performance Gains with Active Learning. CBE—Life Sciences Education, 16(4), ar56.
https://doi.org/10.1187/cbe.16-12-0344 Barak, M., & Dori, Y. J. (2005). Enhancing undergraduate students’ chemistry understanding through project-based learning in an IT environment. Science Education, 89(1), 117–139. https://doi.org/10.1002/sce.20027 Barkaoui, K. (2013). Multifaceted Rasch Analysis for Test Evaluation. In The Companion to Language Assessment. https://doi.org/10.1002/9781118411360.wbcla070 Barrett, F. S., Robins, R. W., & Janata, P. (2013). A brief form of the Affective Neuroscience Personality Scales. Psychological Assessment, 25(3), 826–843. https://doi.org/10.1037/a0032576 Bashooir, K., & Supahar. (2018). Validitas dan reliabilitas instrumen asesmen kinerja literasi sains pelajaran Fisika berbasis STEM. Jurnal Penelitian Dan Evaluasi Pendidikan, 22(2), 168–181. https://doi.org/10.21831/pep.v22i2.20270 Bejar, I. I. (1983). Achievement Testing: Recent Advances. SAGE Publications, Inc. https://doi.org/10.2307/2287925 Bejar, I. I. (2012). Rater cognition: Implications for validity. Educational Measurement: Issues and Practice, 31(3), 2–9. https://doi.org/10.1111/j.1745-3992.2012.00238.x Bejar, I. I., Williamson, D. M., & Mislevy, R. J. (2006). Automated Scoring of Complex Tasks in Computer-Based Testing. https://doi.org/10.4324/9780415963572 Belford, R. E., & Gupta, T. (2019). Introduction: Technology Integration in Chemistry Education and Research. In Technology Integration in Chemistry Education and Research (TICER) (Vol. 1318, p. 1). American Chemical Society. https://doi.org/10.1021/bk-2019-1318.ch001 Berestneva, O., Marukhina, O., Benson, G., & Zharkova, O. (2015). Students’ Competence Assessment Methods. Procedia - Social and Behavioral Sciences, 166, 296–302. https://doi.org/10.1016/j.sbspro.2014.12.527 Billelo, E. D. (2022).
Criterion-Referenced Assessments. Routledge. Black, P., Harrison, C., Lee, C., Marshall, B., & Wiliam, D. (2003). Assessment for Learning: Putting it into practice. Open University Press. http://www.mcgraw-hill.co.uk/html/0335212972.html Black, P., & Wiliam, D. (2018). Classroom assessment and pedagogy. Assessment in Education: Principles, Policy & Practice, 25(6), 551–575. https://doi.org/10.1080/0969594X.2018.1441807 Blackburn, R. A. R., Villa-Marcos, B., & Williams, D. P. (2019). Preparing Students for Practical Sessions Using Laboratory Simulation Software. Journal of Chemical Education, 96(1), 153–158. https://doi.org/10.1021/acs.jchemed.8b00549 Bogo, M., Regehr, C., Katz, E., Logie, C., Tufford, L., & Litvack, A. (2012). Evaluating an Objective Structured Clinical Examination (OSCE) Adapted for Social Work. Research on Social Work Practice, 22(4), 428–436. https://doi.org/10.1177/1049731512437557 Bolden, B., DeLuca, C., Kukkonen, T., Roy, S., & Wearing, J. (2020). Assessment of Creativity in K-12 Education: A Scoping Review. Review of Education, 8(2), 343–376. https://doi.org/10.1002/rev3.3188 Bond, T. G., Yan, Z., & Heene, M. (2020). Applying the Rasch model: Fundamental measurement in the human sciences (4th ed.). Routledge. https://doi.org/10.4324/9780429030499 Boone, W. J., & Noltemeyer, A. (2017). Rasch analysis: A primer for school psychology researchers and practitioners. Cogent Education, 4(1). https://doi.org/10.1080/2331186X.2017.1416898 Boone, W. J., Townsend, J. S., & Staver, J. R. (2016). Utilizing Multifaceted Rasch Measurement Through FACETS to Evaluate Science Education Data Sets Composed of Judges, Respondents, and Rating Scale Items: An Exemplar Utilizing the Elementary Science Teaching Analysis Matrix Instrument. Science Education, 100(2), 221–238. https://doi.org/10.1002/sce.21210 Börner, K., Bueckle, A., & Ginda, M. (2019).
Data visualization literacy: Definitions, conceptual frameworks, exercises, and assessments. Proceedings of the National Academy of Sciences, 116(6), 1857–1864. https://doi.org/10.1073/pnas.1807180116 Boud, D., & Falchikov, N. (2007). Introduction: Assessment for the longer term. In Rethinking assessment in higher education (pp. 13–23). Routledge. Brazil, R. (2017). Curriculums for modern chemists. In Education in Chemistry. University of Leeds. Brennan, R. L. (2000). Performance assessments from the perspective of generalizability theory. Applied Psychological Measurement, 24(4), 339–353. https://doi.org/10.1177/01466210022031796 Brennan, R. L. (2001). Generalizability Theory. Springer. Brennan, R. L. (2011). Generalizability theory and classical test theory. Applied Measurement in Education, 24(1), 1–21. https://doi.org/10.1080/08957347.2011.532417 Breton, G., Lepage, S., & North, B. (2008). Cross-language benchmarking seminar to calibrate examples of spoken production in English, French, German, Italian and Spanish with regard to the six levels of the Common European Framework of Reference for Languages (CEFR). Bretz, S. L., Fay, M., Bruck, L. B., & Towns, M. H. (2013). What faculty interviews reveal about meaningful learning in the undergraduate chemistry laboratory. Journal of Chemical Education, 90(3), 281–288. https://doi.org/10.1021/ed300384r Briesch, A. M., Swaminathan, H., Welsh, M., & Chafouleas, S. M. (2014). Generalizability theory: A practical guide to study design, implementation, and interpretation. Journal of School Psychology, 52(1), 13–35. https://doi.org/10.1016/j.jsp.2013.11.008 Brinton, B., Fujiki, M., & Fujiki, R. B. (2021). Principles of Assessment and Intervention. In The Handbook of Language and Speech Disorders (pp. 110–127). https://doi.org/10.1002/9781119606987.ch6 Brockman, R. M., Taylor, J. M., Segars, L. W., Selke, V., & Taylor, T. A. H. (2020).
Student perceptions of online and in-person microbiology laboratory experiences in undergraduate medical education. Medical Education Online, 25(1), 1710324. https://doi.org/10.1080/10872981.2019.1710324 Brookhart, S. M. (2011). Educational Assessment Knowledge and Skills for Teachers. Educational Measurement: Issues and Practice, 30(1), 3–12. https://doi.org/10.1111/j.1745-3992.2010.00195.x Brookhart, S. M. (2013). How to create and use rubrics for formative assessment and grading. Association for Supervision & Curriculum Development. http://bit.ly/3JkVojK Brookhart, S. M., & McMillan, J. H. (2019). Classroom Assessment and Educational Measurement. Taylor & Francis. https://doi.org/10.4324/9780429507533 Brookhart, S. M., & Nitko, A. J. (2019). Educational assessment of students (8th ed.). Pearson. Brooks, C., Carroll, A., Gillies, R. M., & Hattie, J. (2019). A matrix of feedback for learning. Australian Journal of Teacher Education (Online), 44(4), 14–32. https://search.informit.org/doi/10.3316/ielapa.329362104776530 Bruck, A. D., & Towns, M. (2013). Development, implementation, and analysis of a national survey of faculty goals for undergraduate chemistry laboratory. Journal of Chemical Education, 90(6), 685–693. https://doi.org/10.1021/ed300371n Buchan, T. J. (2018a). Laboratory skills performance assessments in chemistry. Montana State University. Buchan, T. J. (2018b). Laboratory skills performance assessments in chemistry. Montana State University-Bozeman, College of Letters & Science. Budiasih, E., Sukarianingsih, D., Su’aidy, M., & Bagus, I. (2018). Peningkatan Kualitas Pembelajaran Matakuliah Pemisahan Kimia melalui Strategi Penugasan Secara Think Pair Share (TPS) dengan Pemberian Advance Organizer (AO). Seminar Nasional Kimia Dan Pembelajarannya (SNKP) 2018, November, 34–47. Burkett, T. (2018). Norm-Referenced Testing and Criterion-Referenced Testing. The TESOL Encyclopedia of English Language Teaching, 1–5.
https://doi.org/10.1002/9781118784235.eelt0351 Burrows, A. C., Breiner, J. M., Keiner, J., & Behm, C. (2014). Biodiesel and Integrated STEM: Vertical Alignment of High School Biology/Biochemistry and Chemistry. Journal of Chemical Education, 91(9), 1379–1389. https://doi.org/10.1021/ed500029t Byrnes, J. P., & Dunbar, K. N. (2014). The nature and development of critical-analytic thinking. Educational Psychology Review, 26(4), 477–493. https://doi.org/10.1007/s10648-014-9284-0 Caño de las Heras, S., Kensington-Miller, B., Young, B., Gonzalez, V., Krühne, U., Mansouri, S. S., & Baroutian, S. (2021). Benefits and Challenges of a Virtual Laboratory in Chemical and Biochemical Engineering: Students’ Experiences in Fermentation. Journal of Chemical Education, 98(3), 866–875. https://doi.org/10.1021/acs.jchemed.0c01227 Cappelli, P., & Conyon, M. J. (2018). What Do Performance Appraisals Do? ILR Review, 71(1), 88–116. https://doi.org/10.1177/0019793917698649 Carmel, J. H., Herrington, D. G., Posey, L. A., Ward, J. S., Pollock, A. M., & Cooper, M. M. (2019). Helping students to “do science”: Characterizing scientific practices in general chemistry laboratory curricula. Journal of Chemical Education, 96(3), 423–434. Carnduff, J., & Reid, N. (2003). Enhancing undergraduate chemistry laboratories: pre-laboratory and post-laboratory exercises. Royal Society of Chemistry. Carvalho-Knighton, K. M., & Keen-Rocha, L. (2007). Using Technology To Enhance the Effectiveness. Journal of Chemical Education, 84(4), 727. https://doi.org/10.1021/ed084p727 Cavinato, A. G. (2017). Challenges and successes in implementing active learning laboratory experiments for an undergraduate analytical chemistry course. Analytical and Bioanalytical Chemistry, 409(6), 1465–1470. https://doi.org/10.1007/s00216-016-0092-x Chairam, S., Klahan, N., & Coll, R. K. (2015). Exploring secondary students’ understanding of chemical kinetics through inquiry-based learning activities.
Eurasia Journal of Mathematics, Science and Technology Education, 11(5), 937–956. https://doi.org/10.12973/eurasia.2015.1365a Chan, C. K. Y., & Lee, K. K. W. (2021). Constructive alignment between holistic competency development and assessment in Hong Kong engineering education. Journal of Engineering Education, 110(2), 437–457. https://doi.org/10.1002/jee.20392 Chen, H. J., She, J. L., Chou, C. C., Tsai, Y. M., & Chiu, M. H. (2013). Development and application of a scoring rubric for evaluating students’ experimental skills in organic chemistry: An instructional guide for teaching assistants. Journal of Chemical Education, 90(10), 1296–1302. https://doi.org/10.1021/ed101111g Chetty, K., Qigui, L., Gcora, N., Josie, J., Wenwei, L., & Fang, C. (2018). Bridging the digital divide: measuring digital literacy. Economics, 12(1), 1–20. https://doi.org/10.5018/economics-ejournal.ja.2018-23 Cheung, D. (2008). Facilitating chemistry teachers to implement inquiry-based laboratory work. International Journal of Science and Mathematics Education, 6(1), 107–130. https://doi.org/10.1007/s10763-007-9102-y Cheung, D. (2011). Teacher beliefs about implementing guided-inquiry laboratory experiments for secondary school chemistry. Journal of Chemical Education, 88(11), 1462–1468. https://doi.org/10.1021/ed1008409 Chowdhury, F. (2019). Application of rubrics in the classroom: A vital tool for improvement in assessment, feedback and learning. International Education Studies, 12(1), 61–68. Christ, T. J., Riley-Tillman, T. C., Chafouleas, S. M., & Boice, C. H. (2010). Direct behavior rating (DBR): Generalizability and dependability across raters and observations. Educational and Psychological Measurement, 70(5), 825–843. https://doi.org/10.1177/0013164410366695 Chukwuere, J. E. (2021). The comparisons between the use of analytic and holistic rubrics in information systems discipline. Academia Letters, Article 3579. https://doi.org/10.20935/al3579 Chun, M., Kang, K. I., Kim, Y.
H., & Kim, Y. M. (2015). Theme-based Project Learning: Design and Application of Convergent Science Experiments. Universal Journal of Educational Research, 3(11), 937–942. https://doi.org/10.13189/ujer.2015.031120 Churchill, G. A. (1979). A Paradigm for Developing Better Measures of Marketing Constructs. Journal of Marketing Research, 16, 64–73. Churchill, G. A., Ford, N. M., & Walker, O. C. (1974). Measuring the Job Satisfaction of Industrial Salesmen. Journal of Marketing Research, 11(3), 254–260. https://doi.org/10.1177/002224377401100303 Çıbık, A. S., & Aka, E. I. (2021). Student Views on Attitudes towards Chemistry Laboratory Skills. Online Science Education Journal, 6(2), 100–113. Cohen, R. J., Schneider, W. J., & Tobin, R. M. (2022). Psychological Testing and Assessment: An Introduction to Tests and Measurement (10th ed.). McGraw-Hill. https://libgen.rs/book/bibtex.php?md5=C56F1106D69623C12BFA5481B940052C Company, P., Contero, M., Otey, J., Camba, J. D., Agost, M.-J., & Pérez-López, D. (2017). Web-based system for adaptable rubrics: case study on CAD assessment. Journal of Educational Technology & Society, 20(3), 24–41. Cooper, M. M., & Kerns, T. S. (2006). Changing the Laboratory: Effects of a Laboratory Course on Students’ Attitudes and Perceptions. Journal of Chemical Education, 83(9), 1356. https://doi.org/10.1021/ed083p1356 National Research Council. (2012). A framework for K-12 science education: Practices, crosscutting concepts, and core ideas. National Academy Press. Crank, V. (2012). From high school to college: Developing writing skills in the disciplines. The WAC Journal, 23(1), 49–63. Crawford, A. R., Johnson, E. S., Moylan, L. A., & Zheng, Y. (2019). Variance and Reliability in Special Educator Observation Rubrics. Assessment for Effective Intervention, 45(1), 27–37. https://doi.org/10.1177/1534508418781010 Crawford, G. L., Kloepper, K. D., Meyers, J. J., & Singiser, R. H. (2019). Communicating chemistry: An introduction.
ACS Symposium Series, 1327, 1–15. https://doi.org/10.1021/bk-2019-1327.ch001 Creswell, J. W. (2015). Educational Research: Planning, Conducting, and Evaluating Quantitative and Qualitative Research (5th ed.). Pearson. Criswell, B. A., & Rushton, G. T. (2012). Conceptual Change, Productive Practices, and Themata: Supporting Chemistry Classroom Talk. Journal of Chemical Education, 89(10), 1236–1242. Crocker, L., Algina, J., Staudt, M., Mercurio, S., Hintz, K., & Walker, R. A. (2008). Introduction to Classical and Modern Test Theory. Cengage Learning. Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16(3), 297–334. https://doi.org/10.1007/BF02310555 Cronbach, L. J., & Meehl, P. E. (2017). Construct validity in psychological tests. Research Design: The Logic of Social Inquiry, 52(4), 225–238. https://doi.org/10.4324/9781315128498 D’Souza, M. J., Roeske, K. P., & Neff, L. S. (2017). Free inventory platform manages chemical risks, addresses chemical accountability, and measures cost-effectiveness. International Journal of Advances in Science, Engineering and Technology, 5(3), 25–29. https://bit.ly/40v0HUC DeKorver, B. K., & Towns, M. H. (2015). General Chemistry Students’ Goals for Chemistry Laboratory Coursework. Journal of Chemical Education, 92(12), 2031–2037. https://doi.org/10.1021/acs.jchemed.5b00463 DeMars, C. E. (2018). Classical test theory and item response theory. In The Wiley handbook of psychometric testing: A multidisciplinary reference on survey, scale and test development (pp. 49–73). Wiley Online Library. https://doi.org/10.1002/9781118489772.ch2 Denisi, A. S., & Murphy, K. R. (2017). Performance appraisal and performance management: 100 years of progress? Journal of Applied Psychology, 102(3), 421–433. https://doi.org/10.1037/apl0000085 Depdiknas. (2003). Undang-Undang Republik Indonesia No. 20 Tahun 2003 Tentang Sistem Pendidikan Nasional (pp. 1–22).
Devedžić, V. (2016). E-Assessment With Open Badges. http://econference.metropolitan.ac.rs/wp-content/uploads/2016/10/01-Vladan-Devedzic-E-assessment-with-Open-Badges.pdf Dichev, C., & Dicheva, D. (2017). Towards Data Science Literacy. Procedia Computer Science, 108, 2151–2160. https://doi.org/10.1016/j.procs.2017.05.240 Dickinson, P., & Adams, J. (2017). Values in evaluation–The use of rubrics. Evaluation and Program Planning, 65, 113–116. Downing, S. M. (2005). Threats to the validity of clinical teaching assessments: What about rater error? Medical Education, 39(4), 353–355. https://doi.org/10.1111/j.1365-2929.2005.02138.x Drisko, J. W. (2014). Competencies and their assessment. Journal of Social Work Education, 50(3), 414–426. https://doi.org/10.1080/10437797.2014.917927 Duncan, R. G., & Rivet, A. E. (2013). Science learning progressions. Science, 339(6118), 396–397. https://doi.org/10.1126/science.1228692 Eckes, T. (2005). Examining rater effects in TestDaF writing and speaking performance assessments: A many-facet Rasch analysis. Language Assessment Quarterly, 2(3), 197–221. https://doi.org/10.1207/s15434311laq0203_2 Eckes, T. (2009). On common ground? How raters perceive scoring criteria in oral proficiency testing. Tasks and Criteria in Performance Assessment: Proceedings of the 28th Language Testing Research Colloquium, 43–73. Eckes, T. (2015). Introduction to many-facet Rasch measurement: Analyzing and evaluating rater-mediated assessments (2nd ed.). Peter Lang Verlag. https://doi.org/10.3726/978-3-653-04844-5 Eliyarti, E., & Zakirman, Z. (2020). Tinjauan Kontribusi Google Classroom Dalam Mendukung Perkuliahan Kimia Dasar.
Jurnal Pendidikan Kimia Indonesia, 4(1), 32–39. Elvira, M. (2020). Analisis Penggunaan Instrumen Penilaian Kinerja dalam Praktikum Kimia: Studi Awal melalui Wawancara dengan Dosen Kimia. Unpublished manuscript. Embretson, S. E., & Reise, S. P. (2000). Item Response Theory for Psychologists. Lawrence Erlbaum Associates, Inc. Englehard, G. (2013). Invariant measurement: Using Rasch models in the social, behavioral and health sciences. Routledge. Enneking, K. M., Breitenstein, G. R., Coleman, A. F., Reeves, J. H., Wang, Y., & Grove, N. P. (2019). The Evaluation of a Hybrid, General Chemistry Laboratory Curriculum: Impact on Students’ Cognitive, Affective, and Psychomotor Learning. Journal of Chemical Education, 96(6), 1058–1067. https://doi.org/10.1021/acs.jchemed.8b00637 Farah, Y. N., & Chandler, K. L. (2018). Structured Observation Instruments Assessing Instructional Practices With Gifted and Talented Students: A Review of the Literature. Gifted Child Quarterly, 62(3), 276–288. https://doi.org/10.1177/0016986218758439 Farida, I., Zahra, R. R., & Irwansyah, F. S. (2020). Experiment Optimization on the Reaction Rate Determination and Its Implementation in Chemistry Learning to Develop Science Process Skills. Jurnal Pendidikan Sains Indonesia, 8(1), 67–77. https://doi.org/10.24815/jpsi.v8i1.15608 Fisher, R. A. (1922). On the mathematical foundations of theoretical statistics. Philosophical Transactions of the Royal Society of London, Series A, 222, 309–368. Fisher, W. P. (2007). Rating scale instrument quality criteria. Rasch Measurement Transactions, 21(1), 1095. https://www.rasch.org/rmt/rmt211m.htm Fornell, C., Larcker, D., Perreault, W., & Anderson, C. (1988). Structural Equation Modeling in Practice: A Review and Recommended Two-Step Approach. Psychological Bulletin, 103(3), 411–423. Foster, J. C. (2013). The Promise of Digital Badges. Techniques: Connecting Education & Careers, 88(8), 30–34. http://www.thefreelibrary.com/The promise of digital badges.-a0349490088 Foster, J. C. (2014).
The practicality of digital badges. Techniques: Connecting Education & Careers, 89(6), 40–44. Frederick, A. R. (2013). A Case Study of a First-Grade Teacher Team Collaboratively Planning Literacy Instruction for English Learners. University of Minnesota. Gaertner, M. N. (2022). Norm-Referenced Assessment. Routledge. Galloway, K. R., Malakpa, Z., & Bretz, S. L. (2016). Investigating Affective Experiences in the Undergraduate Chemistry Laboratory: Students’ Perceptions of Control and Responsibility. Journal of Chemical Education, 93(2), 227–238. https://doi.org/10.1021/acs.jchemed.5b00737 Galti, A. M., Saidu, S., Yusuf, H., & Goni, A. A. (2018). Rating scale in writing assessment: Holistic vs. analytical scales: A review. International Journal of English Research, 4(6), 4–6. https://bit.ly/3nkLkA8 Gałuszka, A., Migaszewski, Z. M., Konieczka, P., & Namieśnik, J. (2012). Analytical Eco-Scale for assessing the greenness of analytical procedures. TrAC Trends in Analytical Chemistry, 37, 61–72. https://doi.org/10.1016/j.trac.2012.03.013 Gao, X., Li, P., Shen, J., & Sun, H. (2020). Reviewing assessment of student learning in interdisciplinary STEM education. International Journal of STEM Education, 7(1), 24. https://doi.org/10.1186/s40594-020-00225-4 Garcia-Martinez, J., & Serrano-Torregrosa, E. (2015). Chemistry Education: Best Practices, Opportunities and Trends. Wiley-VCH Verlag GmbH & Co. Garfolo, B. T., Kelpsh, E. P., Phelps, Y., & Kelpsh, L. (2016). The Use of Course Embedded Signature Assignments and Rubrics in Programmatic Assessment. Academy of Business Journal, 1. Ghaemi, R. V., & Potvin, G. (2021). Hands-on education without the hands-on? An approach to online delivery of a senior lab course in chemical engineering while maintaining key learning outcomes. Proceedings of the Canadian Engineering Education Association (CEEA), 18(1), Paper 014. https://doi.org/10.24908/pceea.vi0.14834 Giammatteo, L., & Obaya, A. V. (2015).
Assessing Chemistry Laboratory Skills Through a. Science Education International, 29(2), 103–110. Glazer, N. (2014). Formative Plus Summative Assessment in Large Undergraduate Courses: Why Both? International Journal of Teaching and Learning in Higher Education, 26(2), 276–286. http://www.isetl.org/ijtlhe/ Glazer, N. (2015). Student Perceptions of Learning Data-Creation and Data-Analysis Skills in an Introductory College-Level Chemistry Course. Chemistry Education Research and Practice, 16(2), 338–345. Glover, I., & Latif, F. (2013). Investigating perceptions and potential of open badges in formal higher education. Proceedings of World Conference on Educational Multimedia, Hypermedia and Telecommunications, 1398–1402. http://shura.shu.ac.uk/7173/1/Glover_-_Investigating_perceptions_and_potential_of_open_badges_in_formal_higher_education_-_proceeding_112141.pdf Goach, A. L., Croslow, S., McLaughlin, K., & Distin, S. (2022). Teaching the Use of Micropipettes through Direct Visualization of Data: An Experiment Focusing on Technique, Skill, and Accuracy. Journal of Chemical Education, 99(2), 886–891. https://doi.org/10.1021/acs.jchemed.0c01181 Gobaw, G. F., & Atagana, H. I. (2016). Assessing Laboratory Skills Performance in Undergraduate Biology Students. Academic Journal of Interdisciplinary Studies, 5(3), 113–122. https://doi.org/10.5901/ajis.2016.v5n3p113 Goodman, B. E., Barker, M. K., & Cooke, J. E. (2018). Best practices in active and student-centered learning in physiology classes. Advances in Physiology Education, 42(3), 417–423. https://doi.org/10.1152/advan.00064.2018 Guskey, T. R. (2003). Analyzing Lists of the Characteristics of Effective Professional Development to Promote Visionary Leadership. NASSP Bulletin, 87(637), 4–20. https://doi.org/10.1177/019263650308763702 Hadi, S., Ismara, I., & Tanumihardja, E. (2015). Pengembangan Sistem Tes Diagnostik Kesulitan Belajar Kompetensi Dasar Kejuruan Siswa SMK (Issue November).
Hager, P., Gonczi, A., & Athanasou, J. (1994). General issues about assessment of competence. Assessment & Evaluation in Higher Education, 19(1), 3–16. https://doi.org/10.1080/0260293940190101 Hair, J. F., & Gabriel, M. L. D. S. (2019). Development and validation of attitudes measurement scales: fundamental and practical aspects. RAUSP Management Journal, 54(4), 490–507. https://doi.org/10.1108/RAUSP-05-2019-0098 Hala, E.-S. (2020). How peer assessment could be interactive and effective. South African Journal of Education, 40(2), 1–14. https://doi.org/10.15700/saje.v40n2a1651 Haladyna, T. M., & Downing, S. M. (2004). Construct Irrelevant Variance in High-Stakes Testing. Educational Measurement: Issues and Practice, 23(1), 17–27. Haladyna, T. M., & Rodriguez, M. C. (2013). Developing and validating test items. Routledge. Halavais, A. M. C. (2012). A genealogy of badges: Inherited meaning and monstrous moral hybrids. Information Communication and Society, 15(3), 354–373. https://doi.org/10.1080/1369118X.2011.641992 Hall, W., & Saunders, J. (2004). Memahami Penilaian. Badan Nasional Sertifikasi Profesi. Hambleton, R. K., Swaminathan, H., & Rogers, D. J. (1991). Fundamentals of Item Response Theory. SAGE Publications, Inc. Hanafi, N. M., Rahman, A. A., Mukhtar, M. I., Ahmad, J., & Warman, S. (2014). Validity and Reliability of Competency Assessment Implementation (CAI) Instrument Using Rasch Model. International Journal of Social, Education, Economics and Management Engineering, 8(1), 162–167. Hancock, L. M., & Hollamby, M. J. (2020). Assessing the Practical Skills of Undergraduates: The Evolution of a Station-Based Practical Exam. Journal of Chemical Education, 97(4), 972–979. https://doi.org/10.1021/acs.jchemed.9b00733 Hanifah, S., Sari, S., & Irwansyah, F. S. (2021). Making of web-based chemical laboratory equipment and materials inventory application.
Seminar Nasional Tadris Kimiya 2020, 2(1), 97–110. http://bit.ly/3JNsVDB Harsh, J. A. (2016). Designing performance-based measures to assess the scientific thinking skills of chemistry undergraduate researchers. Chemistry Education Research and Practice, 17(4), 808–817. https://doi.org/10.1039/c6rp00057f Harwood, C. J., Hewett, S., & Towns, M. H. (2020). Rubrics for assessing hands- on laboratory skills. Journal of Chemical Education, 97(7), 2033–2035. https://doi.org/10.1021/acs.jchemed.0c00200 He, T. H., Gou, W. J., Chien, Y. C., Chen, I. S. J., & Chang, S. M. (2013). Multi- faceted Rasch measurement and bias patterns in EFL writing performance assessment. Psychological Reports, 112(2), 469–485. https://doi.org/10.2466/03.11.PR0.112.2.469-485 Hennah, N., & Seery, M. K. (2017). Using digital badges for developing high school chemistry laboratory skills. Journal of Chemical Education, 94(7), 844–848. https://doi.org/10.1021/acs.jchemed.7b00175 Henning, G. (1992). Dimensionality and construct validity. Language Testing, 9(1), 1–11. Hensiek, S., DeKorver, B., Harwood, C., Fish, J., O’Shea, K., & Towns, M. (2017). Digital Badges in Science: A Novel Approach to the Assessment of Student Learning. Journal of College Science Teaching, 046(03). https://doi.org/10.2505/4/jcst17_046_03_28 Hensiek, S., Dekorver, B. K., Harwood, C. J., Fish, J., O’Shea, K., & Towns, M. (2016). Improving and assessing student hands-on laboratory skills through digital badging. Journal of Chemical Education, 93(11), 1847–1854. https://doi.org/10.1021/acs.jchemed.6b00234 Hernández-de-Menéndez, M., Vallejo Guevara, A., Tudón Martínez, J. C., 322 Hernández Alcántara, D., & Morales-Menendez, R. (2019). Active learning in engineering education. A review of fundamentals, best practices and experiences. International Journal on Interactive Design and Manufacturing (IJIDeM), 13, 909–922. Hessel, V., Escribà-Gelonch, M., Bricout, J., Tran, N. 
N., Anastasopoulou, A., Ferlin, F., Valentini, F., Lanari, D., & Vaccaro, L. (2021). Quantitative Sustainability Assessment of Flow Chemistry–From Simple Metrics to Holistic Assessment. ACS Sustainable Chemistry & Engineering, 9(29), 9508–9540. https://doi.org/10.1021/acssuschemeng.1c02501 Hofstein, A. (2004). THE LABORATORY IN CHEMISTRY EDUCATION : THIRTY YEARS OF EXPERIENCE WITH DEVELOPMENTS , IMPLEMENTATION , AND RESEARCH Laboratory activities have long had a distinctive and central role in the science to quote from Ira Ramsen ( 1846-1927 ), who wrote his me. Chemistry Education, Research and Practice, 5(3), 247–264. Hofstein, A., & Lunetta, V. N. (2004). The Laboratory in Science Education: Foundations for the Twenty-First Century. Science Education, 88(1), 28–54. https://doi.org/10.1002/sce.10106 Hoidn, S., & Reusser, K. (2020). Foundations of student-centered learning and teaching. In The Routledge International Handbook of Student-Centered Learning and Teaching in Higher Education (pp. 17–46). Routledge. Hoque, M. E. (2016). Three Domains of Learning: Cognitive, Affective and Psychomotor. The Journal of EFL Education and Research, 2(February), 2520–5897. www.edrc-jefler.org Horst, S. J., & Prendergast, C. O. (2014). The Assessment Skills Framework: A Taxonomy of Assessment Knowledge, Skills and Attitudes. Journal of Research and Practice in Assesment, 15(1), 1–25. Houston, J. E., & Myford, C. M. (2009). Judges’ perception of candidates’ organization and communication, in relation to oral certification examination ratings. Academic Medicine, 84(11), 1603–1609. https://doi.org/10.1097/ACM.0b013e3181bb2227 Houston, W. M., Raymond, M. R., & Svec, J. C. (1991). Applied Psychological Measurement Adjustments for Rater Effects Performance Assessment in. Applied Psychological Measurement, 15(4), 409–421. Hoyt, W. T. (2000). Rater bias in psychological research: When is it a problem and what can we do about it? Psychological Methods, 5(1), 64. Hu, L. T., & Bentler, P. 
M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. A Multidisciplinary Journal, 6(1), 1–55. 323 https://doi.org/10.1080/10705519909540118 Hunter, C., Mccosh, R., & Wilkins, H. (2003). Integrating learning and assessment in laboratory work. Chemistry Education Research and Practice, 4(1), 67–75. https://doi.org/10.1039/b2rp90038f Hunter, R. A., & Kovarik, M. L. (2022). Leveraging the analytical chemistry primary literature for authentic, integrated content knowledge and process skill development. Journal of Chemical Education, 99(3), 1238–1245. https://doi.org/10.1021/acs.jchemed.1c00920 Istiyono, E. dkk. (2018). Pengembangan Tes. In Cakrawala Pendidikan (Edisi Ke- 2). UNY Press. Jansen, T., Vögelin, C., Machts, N., Keller, S., Köller, O., & Möller, J. (2021). Judgment accuracy in experienced versus student teachers: Assessing essays in English as a foreign language. Teaching and Teacher Education, 97, 103216. https://doi.org/https://doi.org/10.1016/j.tate.2020.103216 Janssen, G., Meier, V., & Trace, J. (2015). Building a better rubric: mixed methods rubric revision. Assessing Writing, 26(18), 51–66. https://doi.org/10.1016/j.asw.2015.07.002 Joanna Furtado. (2016). English Chemistry Curriculum Map. Royal Society of Chemistry, 0(May), 2016. Job, J. M., & Klassen, R. M. (2012). Predicting performance on academic and non- academic tasks: A comparison of adolescents with and without learning disabilities. Contemporary Educational Psychology, 37(2), 162–169. https://doi.org/10.1016/j.cedpsych.2011.05.001 Johnson, E. B. (2002). Contextual teaching and learning: What it is and why it’s here to stay. Corwin Press. Johnson, R. L., Penny, J. A., & Gordon, B. (2008). Assessing Performance: Designing, Scoring, and Validating Performance Tasks. The Guilford Press. Johnstone, A. H., & El-Banna, H. (2021). 
It depends on the problem and on the solver: An overview of the working memory overload hypothesis, its applicability and its limitations. Problems and Problem Solving in Chemistry Education: Analysing Data, Looking for Patterns and Making Deductions. Jolley, D. F., Wilson, S. R., Kelso, C., O’Brien, G., & Mason, C. E. (2016). Analytical Thinking, Analytical Action: Using Prelab Video Demonstrations and e-Quizzes To Improve Undergraduate Preparedness for Analytical Chemistry Practical Classes. Journal of Chemical Education, 93(11), 1855– 1862. https://doi.org/10.1021/acs.jchemed.6b00266 Jones, M. L. B., & Seybold, P. G. (2016a). Combining Chemical Information Literacy, Communication Skills, Career Preparation, Ethics, and Peer Review 324 in a Team-Taught Chemistry Course. Journal of Chemical Education, 93(3), 439–443. https://doi.org/10.1021/acs.jchemed.5b00416 Jones, M. L. B., & Seybold, P. G. (2016b). Combining Chemical Information Literacy, Communication Skills, Career Preparation, Ethics, and Peer Review in a Team-Taught Chemistry Course. Journal of Chemical Education, 93(3), 439–443. https://doi.org/10.1021/acs.jchemed.5b00416 Jonsson, A., & Svingby, G. (2007). The use of scoring rubrics: Reliability, validity and educational consequences. Educational Research Review, 2(2), 130–144. https://doi.org/10.1016/j.edurev.2007.05.002 Kadioglu-Akbulut, C., & Uzuntiryaki-Kondakci, E. (2021). Implementation of self- regulatory instruction to promote students’ achievement and learning strategies in the high school chemistry classroom. Chemistry Education Research and Practice, 22(1), 12–29. Kanatzidis, M. G., Poeppelmeier, K. R., Bobev, S., Guloy, A. M., Hwu, S.-J., Lachgar, A., Latturner, S. E., Raymond, Schaak, E., Seo, D.-K., Sevov, S. C., Stein, A., Dabrowski, B., Greedan, J. E., Greenblatt, M., Grey, C. P., Jacobson, A. J., Keszler, D. A., Li, J., ... Seshadri, R. (2008). 
Report from the third workshop on future directions of solid-state chemistry: The status of solid- state chemistry and its impact in the physical sciences. Progress in Solid State Chemistry, 36(1), 1–133. https://doi.org/https://doi.org/10.1016/j.progsolidstchem.2007.02.002 Karataş, F. (2016). Pre-service chemistry teachers’ competencies in the laboratory: A cross-grade study in solution preparation. Chemistry Education Research and Practice, 17(1), 100–110. https://doi.org/10.1039/c5rp00147a Kaydos, W. (2020). Operational performance measurement: increasing total productivity. CRC press. Keen, C., & Sevian, H. (2022). Qualifying domains of student struggle in undergraduate general chemistry laboratory. Chemistry Education Research and Practice, 23(1), 12–37. https://doi.org/10.1039/D1RP00051A Kim, H. (2020). Effects of rating criteria order on the halo effect in L2 writing assessment: a many-facet Rasch measurement analysis. Language Testing in Asia, 10(1), 16. https://doi.org/10.1186/s40468-020-00115-0 Kimberlin, C. L., & Winterstein, A. G. (2008). Validity and reliability of measurement instruments used in research. American Journal of Health- System Pharmacy, 65(23), 2276–2284. https://doi.org/10.2146/ajhp070364 Kirton, S. B., Al-Ahmad, A., & Fergus, S. (2014). Using structured chemistry examinations (SChemEs) as an assessment method to improve undergraduate students generic, practical, and laboratory-based skills. Journal of Chemical Education, 91(5), 648–654. https://doi.org/10.1021/ed300491c 325 Koizumi, R., Kaneko, E., Setoguchi, E., Innami, Y., & Naganuma, N. (2019). Examination of CEFR-J spoken interaction tasks using many-facet Rasch measurement and generalizability theory. Papers in Language Testing and Assessment, 8(2), 1–33. Kondo-Brown, K. (2002). A FACETS analysis of rater bias in measuring Japanese second language writing performance. Language Testing, 19(1), 3–31. https://doi.org/10.1191/0265532202lt218oa Kotz, J. C., Treichel, P. 
M., Townsend, J., & Treichel, D. (2014). Chemistry & chemical reactivity. Cengage Learning. Kozma, R., & Russell, J. (2005). Students Becoming Chemists: Developing Representationl Competence BT - Visualization in Science Education (J. K. Gilbert (ed.); pp. 121–145). Springer Netherlands. https://doi.org/10.1007/1- 4020-3613-2_8 Kulatunga-Moruzi, C., & Norman, G. R. (2002). Validity of Admissions Measures in Predicting Performance Outcomes: The Contribution of Cognitive and Non- Cognitive Dimensions. Teaching and Learning in Medicine, 14(1), 34–42. https://doi.org/10.1207/S15328015TLM1401_9 Lagowski, J. J. (2002). The role of the laboratory in chemical education. 化学教育 , 23(12), 4–10. Lane, S., & Stone, C. A. (2006). Performance assessment. In R. L. Brennan (Ed.), Educational measuremen (pp. 387–431). American Council on Education/Praeger. Latif, K. F., Latif, I., Farooq Sahibzada, U., & Ullah, M. (2019). In search of quality: measuring Higher Education Service Quality (HiEduQual). Total Quality Management and Business Excellence, 30(7–8), 768–791. https://doi.org/10.1080/14783363.2017.1338133 Lau, P. N., Teow, Y., Low, X. T. T., & Tan, S. T. B. (2023). Integrating chemistry laboratory–tutorial timetabling with instructional design and the impact on learner perceptions and outcomes. Chemistry Education Research and Practice, 24(1), 12–35. https://doi.org/10.1039/D2RP00055E Lawshe, C. H. (1975). a Quantitative Approach To Content Validity. Personnel Psychology, 28(4), 563–575. https://doi.org/10.1111/j.1744- 6570.1975.tb01393.x Lee, M., & Cha, D. (2016). A comparison of generalizability theory and many facet rasch measurement in an analysis of mathematics creative problem solving test. Journal of Curriculum Evaluation, 19(2), 251–279. https://doi.org/https://doi.org/10.29221/jce.2016.19.2.251 Lesmond, G., Mccahan, S., & Beach, D. (2017). Development of Analytic Rubrics for Competency Assessment. 326 Li, H., Xiong, Y., Hunter, C. V., Guo, X., & Tywoniw, R. (2020). 
Does peer assessment promote student learning? A meta-analysis. Assessment & Evaluation in Higher Education, 45(2), 193–211. https://doi.org/10.1080/02602938.2019.1620679 Liao, M., Lan, K., & Yao, Y. (2022). Sustainability implications of artificial intelligence in the chemical industry: A conceptual framework. Journal of Industrial Ecology, 26(1), 164–182. https://doi.org/10.1111/jiec.13214 Lichti, D., Mosley, P., & Callis-Duehl, K. (2021). Learning from the trees: using project budburst to enhance data literacy and scientific writing skills in an introductory biology laboratory during remote learning. Citizen Science: Theory and Practice, 6(1), 1–12. https://doi.org/10.5334/CSTP.432 Linacre, J. M. (1989). Many-faceted Rasch measurement [The University of Chicago]. http://dx.doi.org/10.1016/j.jaci.2012.05.050 Linacre, J. M. (1994a). FACET: Rasch Model (2nd ed.). Mesa Press. Linacre, J. M. (1994b). Many-facet rasch measurement (2nd ed.). Mesa Press. Linacre, J. M. (1998). Detecting Multidimensionality: Which Residual Data-type Works Best? John Michael Linacre University of Chicago. Journal of Outcome Measurement, 2, 266–283. Linacre, J. M. (2002a). Understanding Rasch measurement: Optimizing Rating Scale Category Effectiveness. Journal of Applied Measurement, 3(1), 85–106. Linacre, J. M. (2002b). What do infit and outfit, mean-square and standardized mean. Rasch Measurement Transactions, 16(2), 871–882. https://www.rasch.org/rmt/rmt162.pdf Linacre, J. M. (2010). Predicting responses from rasch measures. Journal of Applied Measurement, 11(1), 1–10. Linacre, J. M. (2021). A User Guides to FACET Rasch Model Computer Programs. In European University Institute (Issue 2). John M. Linacre. All rights reserved. Linacre, J. M., & Wright, B. D. (2014). Facets. Computer Program for Many- Faceted Rasch Measurement, 1–22. Little, J. W. (2012). Understanding data use practice among teachers: The contribution of micro-process studies. 
American Journal of Education, 118(2), 143–166. https://doi.org/10.1086/663271 Lo, C.-M., Han, J., Wong, E. S. W., & Tang, C.-C. (2021). Flexible learning with multicomponent blended learning mode for undergraduate chemistry courses in the pandemic of COVID-19. Interactive Technology and Smart Education, 18(2), 175–188. https://doi.org/10.1108/ITSE-05-2020-0061 327 Lubis, R. A. H. (2021). Pengembangan Lembar Penilaian Aspek Psikomotorik Mahasiswa Pada Kegiatan Praktikum Makromolekul ProdiI Pendidikan Kimia FTK UIN Ar Raniry Banda Aceh. UIN Ar-raniry. Lumley, T. (2005). Assessing second language writing. The rater’s perspective. 3. https://doi.org/10.1016/j.asw.2008.02.005 Lunardi, C. N., Gomes, A. J., Rocha, F. S., De Tommaso, J., & Patience, G. S. (2021). Experimental methods in chemical engineering: Zeta potential. Canadian Journal of Chemical Engineering, 99(3), 627–639. https://doi.org/10.1002/cjce.23914 Lunz, M. E., Wright, B. D., & Linacre, J. M. (1990). Measuring the Impact of Judge Severity on Examination Scores. Applied Measurement in Education, 3(4), 331–345. https://doi.org/10.1207/s15324818ame0304_3 Mahaffy, P. (2015). hemistry education and human activity. Chemistry Education. In Best Practices, Innovative Strategies and New Technologies. Wiley. Mahat, M. (2008). The Development of A Psychometrically-Sound Instrument to Measure Teachers’ Multidimensional Attitudes Toward Inclusive Education. International Journal of Special Education, 23(1), 82–92. Majid, A. (2014). Pembelajaran Tematik Terpadu. Remaja Rosdakarya. Maknun, D. (2015). Evaluasi keterampilan laboratorium mahasiswa menggunakan asesmen kegiatan laboratorium berbasis kompetensi pada pelaksanaan praktek pengalaman lapangan (PPL) [Evaluation of students’ laboratory skills using competency-based laboratory activity assessment. Jurnal Tarbiyah, 22(1), 21– 47. https://bit.ly/40iDjJT Mardapi, D. (2008). Teknik Penyusunan Instrumen Tes dan Non Tes. Mitra Cendikia. Mardapi, D. (2017). 
Pengukuran Penilaian dan Evaluasi Pendidikan [Educational Assessment and Evaluation Measurement] (Edisi 2). Parama Publishing. Maryati, Prasetyo, Z. K., Wilujeng, I., & Sumintono, B. (2019). Measuring teachers’ pedagogical content knowledge using many-facet rasch model. Cakrawala Pendidikan, 38(3), 452–464. https://doi.org/10.21831/cp.v38i3.26598 Masters, G. N. (2018). Partial credit model. In W. J. van der Linden (Ed.), Handbook of item response theory (pp. 109–126). Taylor and Francis Group. Matias, J. A. L. (2021). Materials Chemistry applied to bismuth cobaltite-rich nanocomposites with sillenite crystal structure. Universidade Federal do Rio Grande do Norte. Mc Donnell, C., O’Connor, C., & Seery, M. K. (2007). Developing practical chemistry skills by means of student-driven problem based learning mini- 328 projects. Chemistry Education Research and Practice, 8(2), 130–139. https://doi.org/10.1039/B6RP90026G McDavid, J. C., Huse, I., & Ingleson, L. R. L. (2018). Program Evaluation and Performance Measurement: An Introduction to Practice. SAGE Publications, Inc. https://libgen.rs/book/bibtex.php?md5=12E0BD559C967483243827D83DB 15895 McNamara, T. F. (1996). Measuring Second Language Performance (Applied Linguistics and Language Study). Addison Wesley Longman. McNamara, T. F. (2011). Applied linguistics and measurement: A dialogue. Language Testing, 28(4), 435–440. https://doi.org/10.1177/0265532211413446 McNamara, T. F., Knoch, U., Fan, J., & Rossner, R. (2019). Fairness, justice & language assessment. Oxford University Press. Millar, R. (2004). The role of practical work in the teaching and learning of science (Commissioned Paper-Committee on High School Science Laboratories: Role and Vision, Issue June). Miller, M. D., Linn, R. L., & Gronlund, N. E. (2013). Measurement and Assessment in Teaching (11th ed.). Pearson. Mistry, N., & Gorman, S. G. (2020). What laboratory skills do students think they possess at the start of University? 
Chemistry Education Research and Practice, 21(3), 823–838. https://doi.org/10.1039/c9rp00104b Mohajan, H. K. (2017). Two criteria for good measurements in research: Validity and reliability. Annals of Spiru Haret University. Economic Series, 17(4), 59– 82. https://doi.org/https://doi.org/10.26458/1746 Mokhtari, K., Delello, J., & Reichard, C. (2015). Connected yet distracted: Multitasking among college students. Journal of College Reading and Learning, 45(2), 164–180. https://doi.org/10.1080/10790195.2015.1021880 Muchtar, A., Madhakomala, & Abdullah, T. (2023). Evaluation of Implementation of Standard Laboratory Midwifery Diploma III Program in Jakarta 2017. American Journal of Educational Research, 6(3), 181–187. http://pubs.sciepub.com/ Mulaik, S. A. (1988). Confirmatory factor analysis. Handbook of Multivariate Experimental Psychology, 259–288. https://doi.org/10.1007/978-94-007- 0753-5_524 Muraki, E. (1992). A Generalized Partial Credit Model: Application of an EM Algorithm. Applied Psychological Measurement, 16(2), 159–176. https://doi.org/10.1177/014662169201600206 329 Murphy, K. R. (2020). Performance evaluation will not die, but it should. Human Resource Management Journal, 30(1), 13–31. https://doi.org/https://doi.org/10.1111/1748-8583.12259 Myers, M. J., & Burgess, A. B. (2003). Inquiry-Based Laboratory Course Improves Students’ Ability to Design Experiments and Interpret Data. Advances in Physiology Education, 27(1), 26–33. https://doi.org/10.1152/advan.00028.2002 Myford, C. M., & Wolfe, E. W. (2000). Strengthening the ties that bind: Improving the linking network in sparsely connected rating designs. In ETS Research Report Series (Vol. 2000, Issue 1). https://beta.unglobalpulse.org/wp- content/uploads/2017/05/3rd-Research-Dive.pdf#page=29 Myford, C. M., & Wolfe, E. W. (2004). Detecting and Measuring Rater Effects Using Many-Facet Rasch Measurement: Part II. Journal of Applied Measurement, 5(2), 189–227. Narayanan, S., Kommuri, V. S., Subramanian, S. 
N., & Bijlani, K. (2017). Question bank calibration using unsupervised learning of assessment performance metrics. 2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI), 19–25. https://doi.org/10.1109/ICACCI.2017.8125810 National Research Council. (2012). National Research Council. (2012). Discipline- based education research: Understanding and improving learning in undergraduate science and engineering. National Academies Press. Newby, T. J., & Cheng, Z. (2020). Instructional digital badges: effective learning tools. Educational Technology Research and Development, 68(3), 1053–1067. https://doi.org/10.1007/s11423-019-09719-7 Neyman, A. J., & Scott, E. L. (1948). Consistent Estimates Based on Partially Consistent Observations. Econometrica, 16(1), 1–32. Ng, S. B. (2019). Exploring STEM competences for the 21st century (C. Gallagher, L. Ji, & T. Kiyomi (eds.)). UNESCO International Bureau of Education. https://bit.ly/40dMwmE Noyes, J. A., Welch, P. M., Johnson, J. W., & Carbonneau, K. J. (2020). A systematic review of digital badges in health care education. Medical Education, 54(7), 600–615. https://doi.org/https://doi.org/10.1111/medu.14060 Nunnally, J. C., & Bernstein, I. H. (1994). Psychometric theory (3rd ed.). McGraw- Hili, Inc. All rights reserved. Ohta, R., Plakans, L. M., & Gebril, A. (2018). Integrated writing scores based on holistic and multi-trait scales: A generalizability analysis. Assessing Writing, 38(May), 21–36. https://doi.org/10.1016/j.asw.2018.08.001 330 Okolo, G. N., Neomagus, H. W., Everson, R. C., Roberts, M. J., Bunt, J. R., Sakurovs, R., & Mathews, J. P. (2015). Chemical–structural properties of South African bituminous coals: Insights from wide angle XRD–carbon fraction analysis, ATR–FTIR, solid state 13C NMR, and HRTEM techniques. Fuel, 158, 779–792. Opara, J. A., & Oguzor, N. S. (2011). Inquiry instructional method and the school science curriculum. 
Current Research Journal of Social Sciences, 3(3), 188– 198. Ostini, R., & Nering, M. L. (2006). Polytomous Item Response Theory Models. SAGE Publications. Pakpahan, D. N., Situmorang, M., Sitorus, M., & Silaban, S. (2021). The Development of Project-Based Innovative Learning Resources for Teaching Organic Analytical Chemistry. In Advances in Social Science, Education and Humanities Research. Atlantis Press. https://doi.org/10.2991/assehr.k.211110.180 Park, Y. (2017). Examining South Korea’s elementary physical education performance assessment using assessment literacy perspectives. International Electronic Journal of Elementary Education, 10(2), 207–213. https://doi.org/10.26822/iejee.2017236116 Parker, H. E. (2013). Digital Badges as Effective Assessment Tools. British Journal of Educational Technology, 30(3), 195. http://oro.open.ac.uk/40593/1/__userdata_documents_sc8457_Documents_A ssessment_Journal Paper 2014_Cross2014_UseRoleReceptionOfOpenBadges.pdf%5Cnhttp://link.spri nger.com/10.1007/s11423-015-9388- 3%5Cnhttp://ezproxy.lib.utexas.edu/login?url=http://search.ebs Permendiknas. (2007). Peraturan Menteri Pendidikan Nasional Republik Indonesia Nomer 16 Tahun 2007 tentang Standar Kualifikasi Akademik dan Kompetensi Guru. 1–31. Permenristekdikti. (2015). Peraturan Menteri Riset, Teknologi, dan Pendidikan Tinggi Nomor 44 Tahun 2015 tentang Standar Nasional Perguruan Tinggi (SNPT). Polat, M., Turhan, N. S., & Toraman, Ç. (2022). Comparison of Classical Test Theory vs. Multi-Facet Rasch Theory. Pegem Journal of Education and Instruction, 12(2), 213–225. https://doi.org/10.47750/pegegog.12.02.21 Prades, A., & Espinar, S. R. (2010). Laboratory assessment in chemistry: An analysis of the adequacy of the assessment process. Assessment and Evaluation in Higher Education, 35(4), 449–461. https://doi.org/10.1080/02602930902862867 331 Primi, R., Silvia, P. J., Benedek, M., & Jauk, E. (2019). Aplying Many-Facet Rasch Modeling in the Assessment of Creativity. 
Psychology of Aesthetics, Creativity, and the Arts, 13(2), 176. Pufpaff, L. A., Clarke, L., & Jones, R. E. (2015). The effects of rater training on inter-rater agreement. Mid-Western Educational Researcher, 27(2), 117–141. https://mwera.org/MWER/volumes/v27/issue2/v27n2-Pufpaff-FEA TURE- ARTICLE.pdf Punturat, S., Suwannoi, P., & Ketchatturat, J. (2014). Intellectual Skills Assessment for the Teacher Students at the Faculty of Education, Khon Kaen University. Procedia - Social and Behavioral Sciences, 116, 1704–1708. https://doi.org/10.1016/j.sbspro.2014.01.459 Puspitasari, N., & Haryani, S. (2015). Pengembangan Rubrik Performance Assessment Pada Praktikum Hidrolisis Garam. Jurnal Inovasi Pendidikan Kimia, 8(1), 1250–1259. Putri, F. S., & Istiyono, E. (2017). The Development of Performance Assessment of Stem- Based Critical Thinking Skill in the High School Physics Lessons. International Journal of Environmental And Science Education, 12(5), 1269– 1281. Qin, S. J., & Chiang, L. H. (2019). Advances and opportunities in machine learning for process data analytics. Computers & Chemical Engineering, 126, 465–473. Qonita, R., A’tourrohman, M., Ulwiyah, E. W., & Wijayanti, E. (2021). Student learning difficulties in online biochemistry practicum: An experiences during COVID-19. Bioeduscience, 5(1), 74–79. Qureshi, S., Bradley, K., Vishnumolakala, V. R., Treagust, D. F., Southam, D. C., Mocerino, M., & Ojeil, J. (2016). Educational Reforms and Implementation of Student-Centered Active Learning in Science at Secondary and University Levels in Qatar. Science Education International, 27(3), 437–456. Ragupathi, K., & Lee, A. (2020). Beyond fairness and consistency in grading: The role of rubrics in higher education. Diversity and Inclusion in Global Higher Education: Lessons from across Asia, 73–95. Rahayu, C., & Eliyarti, E. (2019). Deskripsi efektivitas kegiatan praktikum dalam perkuliahan kimia dasar mahasiswa teknik. 
Edu Sains: Jurnal Pendidikan Sains Dan Matematika, 7(2), 51–60. Rahmadani, S., Jamaluddin, & Zulkifli, L. (2015). Pengembangan petunjuk praktikum biologi dan instrumen penilaian kinerja praktikum berbasis model pembelajaran kooperatif dan efektivitasnya terhadap kemampuan berpikir kritis siswa SMA/MA Kelas XI. Jurnal Penelitian Pendidikan IPA, 1(2), 1– 12. Rahmi, A. (2020). Analisis Kesulitan Belajar dan Hubungannya dengan Hasil 332 Belajar Siswa pada Mata Pelajaran Kimia Materi Stoikiometri. UNIVERSITAS ISLAM NEGERI SULTAN SYARIF KASIM RIAU. Ramachandran, R., Bernier, N. A., Mavilian, C. M., Izad, T., Thomas, L., & Spokoyny, A. M. (2021). Imparting Scientific Literacy through an Online Materials Chemistry General Education Course. Journal of Chemical Education, 98(5), 1594–1601. https://doi.org/10.1021/acs.jchemed.1c00138 Rasch, G. (1960). Studies in mathematical psychology: I. Probabilistic models for some intelligence and attainment tests. Nielsen & Lydiche. Razali, S. N., & Shahbodin, F. (2016). Questionnaire on perception of online collaborative learning: Measuring validity and reliability using rasch model. 4th International Conference on User Science and Engineering, i-USEr 2016, August, 199–203. https://doi.org/10.1109/IUSER.2016.7857960 Reid, N., & Shah, I. (2007). The role of laboratory work in university chemistry. Chemistry Education Research and Practice, 8(2), 172–185. https://doi.org/10.1039/B5RP90026C Reigosa, C., & Jiménez‐Aleixandre, M. (2007). Scaffolded problem‐solving in the physics and chemistry laboratory: difficulties hindering students’ assumption of responsibility. International Journal of Science Education, 29(3), 307–329. https://doi.org/10.1080/09500690600702454 Retnawati, H. (2016a). Analisis kuantitatif Instrumen Penelitian. In Parama Publishing. Retnawati, H. (2016b). Proving content validity of self-regulated learning scale (The comparison of Aiken index and expanded Gregory index). Research and Evaluation in Education, 2(2), 155. 
https://doi.org/10.21831/reid.v2i2.11029 Reynders, G., Suh, E., Cole, R. S., & Sansom, R. L. (2019). Developing student process skills in a general chemistry laboratory. Journal of Chemical Education, 96(10), 2109–2119. https://doi.org/10.1021/acs.jchemed.9b00441 Rice, J. W., Thomas, S. M., & Toole, P. O. (2020). Science Education in the 21st Century. In Science Education in the 21st Century. https://doi.org/10.1007/978-981-15-5155-0 Riconscente, M. M., Kamarainen, A., & Honey, M. (2013). STEM Badges Current Terrain and the Road Ahead. https://badgesnysci.files.wordpress.com/2013/08/nsf_stembadges_final_repo rt.pdf Riscaputantri, A., & Wening, S. (2018). Pengembangan instrumen penilaian afektif siswa kelas IV sekolah dasar di Kabupaten Klaten. Jurnal Penelitian Dan Evaluasi Pendidikan, 22(2), 231–242. https://doi.org/10.21831/pep.v22i2.16885 333 Rocha, A. (2022). Science and Global Challenges of the 21st Century-Science and Technology. Springer Nature. Rohaeti, E., & Prodjosantoso, A. K. (2020). Oriented Collaborative Inquiry Learning Model: Improving Students’ Scientific Attitudes in General Chemistry. Journal of Baltic Science Education, 19(1), 108–120. Rost, J., & Walter, O. (2006). Multimethod Item Response Theory. In Handbook of multimethod measurement in psychology (pp. 249–268). American Psychological Association. https://doi.org/https://doi.org/10.1037/11383-018 Royal Society of Chemistry. (2016a). Guidelines for using the English Chemistry Curriculum map. https://edu.rsc.org/download?ac=16049 Royal Society of Chemistry. (2016b). Chemistry curriculum support. RSC Education. https://edu.rsc.org/resources/curriculum-support Rubin, D. B., Cronbach, L. J., Gleser, G. C., Nanda, H., & Rajaratnam, N. (1974). The Dependability of Behavioral Measurements: Theory of Generalizability for Scores and Profiles. Journal of the American Statistical Association, 69(348), 1050. https://doi.org/10.2307/2286194 Rudd, J. A., Greenbowe, T. J., & Hand, B. M. (2007). 
Using the science writing heuristic to improve students’ understanding of general equilibrium. Journal of Chemical Education, 84(12), 2007–2011. https://doi.org/10.1021/ed084p2007 Ryan, L. (2016). AQA GCSE Chemistry Student Book. Oxford University Press. Ryser, G. R. (2018). Qualitative and quantitative approaches to assessment. In S. K. Johnsen (Ed.), Identifying Gifted Students (3rd Editio, pp. 33–57). Routledge. https://doi.org/https://doi.org/10.4324/9781003235682 Sa’adah, E. N. L., & Sigit, D. (2018). Pengembangan instrumen penilaian sikap dan keterampilan psikomotorik pada materi elektrokimia [Development of attitudes and psychomotor skills assessment instruments in electrochemical materials]. Teori, Penelitian, Dan Pengembangan, 3(8), 1023—1026. https://bit.ly/3JSClyS Sa’adah, N., Langitasari, I., & Wijayanti, I. E. (2020). Implementasi pendekatan science writing heuristic pada laporan praktikum berbasis multipel representasi terhadap kemampuan interpretasi [Implementation of the science writing heuristic approach to multiple representation-based practicum reports on interpr. Jurnal Inovasi Pendidikan IPA, 6(2), 195–208. https://doi.org/10.21831/jipi.v6i2.31078 Sabri, A., Nazleen Nur Ain, Z., & Fatin Izzati, K. (2016). Assessing the fitness of a measurement model using Confirmatory Factor Analysis (CFA). International Journal of Innovation and Applied Studies, 17(1), 159–168. 334 Sainuddin, S., Subali, B., Jailani, & Elvira, M. (2022). The development and validation prospective mathematics teachers holistic assessment tools. Ingenierie Des Systemes d’Information, 27(2), 171–184. https://doi.org/10.18280/isi.270201 Salas-Pilco, S. Z., Yang, Y., & Zhang, Z. (2022). Student engagement in online learning in Latin American higher education during the COVID-19 pandemic: A systematic review. British Journal of Educational Technology, 53(3), 593– 619. https://doi.org/https://doi.org/10.1111/bjet.13190 Salkind, N. J. (2006). 
Encyclopedia of measurement and statistics. SAGE publications. Sandi-Urena, S., Cooper, M., & Stevens, R. (2012). Effect of Cooperative Problem- Based Lab Instruction on Metacognition and Problem-Solving Skills. Journal of Chemical Education, 89(6), 700–706. https://doi.org/10.1021/ed1011844 Saputri, N., Adlim, A., & Inda Rahmayani, R. F. (2018). Pengembangan Instrumen Penilaian Psikomotorik Untuk Praktikum Kimia Dasar. JTK (Jurnal Tadris Kimiya), 3(2), 114–124. https://doi.org/10.15575/jtk.v3i2.3444 Sari, S., Ferawati, S. A., Farida, I., Sobandi, O., & Kariadinata, R. (2018). Online based performance assessment for general chemistry laboratory. IOP Conference Series: Materials Science and Engineering, 434(1). https://doi.org/10.1088/1757-899X/434/1/012190 Sarmouk, C., Ingram, M. J., Read, C., Curdy, M. E., Spall, E., Farlow, A., Kristova, P., Quadir, A., Maatta, S., & Stephens, J. (2020). Pre-laboratory online learning resource improves preparedness and performance in pharmaceutical sciences practical classes. Innovations in Education and Teaching International, 57(4), 460–471. Schwartz, A. T. (2006). Contextualized chemistry education: The American experience. International Journal of Science Education, 28(9), 977–998. https://doi.org/10.1080/09500690600702488 Secolsky, C., & Denison, D. B. (2012). Handbook on Measurement, Assessment, and Evaluation in Higher Education. In Handbook on Measurement, Assessment, and Evaluation in Higher Education. Taylor and Francis. https://doi.org/10.4324/9780203142189 Seery, M. K., Agustian, H. Y., Doidge, E. D., Kucharski, M. M., O’Connor, H. M., & Price, A. (2017). Developing laboratory skills by incorporating peer-review and digital badges. Chemistry Education Research and Practice, 18(3), 403– 419. https://doi.org/10.1039/c7rp00003k Seery, M. K., Agustian, H. Y., & Zhang, X. (2019). A Framework for Learning in the Chemistry Laboratory. Israel Journal of Chemistry, 59(6), 546–553. 
https://doi.org/10.1002/ijch.201800093 335 Selvaraj, A. M., & Azman, H. (2020). Reframing the effectiveness of feedback in improving teaching and learning achievement. International Journal of Evaluation and Research in Education, 9(4), 1055–1062. https://doi.org/10.11591/ijere.v9i4.20654 Semmelroth, C. L., & Johnson, E. (2014). Measuring rater reliability on a special education observation tool. Assessment for Effective Intervention, 39(3), 131– 145. https://doi.org/10.1177/1534508413511488 Shallcross, D. E., Harrison, T. G., Shaw, A. J., Shallcross, K. L., Croker, S. J., & Norman, N. C. (2013). Lessons in Effective Practical Chemistry at Tertiary Level: Case Studies from a Chemistry Outreach Program. Higher Education Studies, 3(5), 1–10. Sharp, J. T. (2012). Practical organic chemistry: a student handbook of techniques. Springer Science & Business Media. Shavelson, R. J., & Webb, N. M. (1991). Generalizability Theory: A Primer. Sage Publications, Inc. Shi, D., Lee, T., & Maydeu-Olivares, A. (2019). Understanding the Model Size Effect on SEM Fit Indices. Educational and Psychological Measurement, 79(2), 310–334. https://doi.org/10.1177/0013164418783530 Shultz, G. V, & Li, Y. (2016). Student development of information literacy skills during problem-based organic chemistry laboratory experiments. Journal of Chemical Education, 93(3), 413–422. https://doi.org/10.1021/acs.jchemed.5b00523 Shweta, Bajpai, R. C., & Chaturvedi, H. K. (2015). Evaluation of inter-rater agreement and inter-rater reliability for observational data: An overview of concepts and methods. Journal of t citation: Elvira, Melly and Retnawati, Heri (2023) Pengembangan Instrumen Penilaian Kinerja Praktikum Kimia. S3 thesis, Sekolah Pascasarjana. document_url: http://eprints.uny.ac.id/79402/1/disertasi-melly%20elvira-17701261021.pdf