eprintid: 83146
rev_number: 10
eprint_status: archive
userid: 1290
dir: disk0/00/08/31/46
datestamp: 2024-09-04 04:10:55
lastmod: 2024-09-04 04:10:55
status_changed: 2024-09-04 04:10:55
type: thesis
metadata_visibility: show
creators_name: Sumin, Sumin
creators_name: Retnawati, Heri
title: Model Asesmen Nilai-Nilai Budaya Kerja Kementerian Agama Republik Indonesia.
ispublished: pub
subjects: B5
subjects: C3
subjects: D0
divisions: pps_lit_evazdik
full_text_status: restricted
keywords: work culture, assessment model, item response theory.
abstract: This study aims to (1) produce a construct of the work culture values of the Ministry of Religious Affairs of the Republic of Indonesia that is relevant and suited to the organizational context, (2) develop a valid and reliable instrument for the work culture values assessment model, (3) produce high-quality items for the work culture assessment model instrument, and (4) explore the profile of work culture values in the Ministry of Religious Affairs of the Republic of Indonesia. The study uses a quantitative approach with a development research design consisting of three stages: problem identification and needs analysis, design development and implementation, and evaluation. The research subjects comprised 20 people in the limited trial, 250 people in the expanded trial, and 500 people in the implementation of the assessment model, determined by stratified random sampling from a population of 232,770 employees of the Ministry of Religious Affairs of the Republic of Indonesia. Data were analyzed using item response theory for polytomous scores. The study succeeded in developing a work culture values assessment model that is valid, reliable, practical, and flexible, accompanied by an instrument covering the constructs of integrity, professionalism, responsibility, innovation, exemplary conduct, and religious behavior. The instrument consists of 25 self-assessment items and 30 peer-assessment items.
Validitas isi instrumen dengan CVR berkisar antara 0,750-1,00 dan reliabilitas inter-rater dengan koefisien Kripendorff’s Alpha berkisar antara 0,723-0,787. Diskriminasi butir instrumen penilaian teman sejawat berkisar antara 0,100-1,088, dan penilaian diri antara 0,635-1,034. Parameter threshold bervariasi antara −1,057-8,069 untuk penilaian teman sejawat, dan antara -3,909-2,666 untuk penilaian diri. Kurva ICC menunjukkan karakteristik butir yang efektif dalam membedakan kemampuan peserta, baik pada penilaian teman sejawat maupun penilain diri, sedangkan Kurva IIC mengungkapkan tingkat informasi yang tinggi pada kedua instrumen. Total informasi butir optimal berada pada θ≈ -1,5-0,2 untuk penilaian teman sejawat dan θ≈ -1,0-3,0 untuk penilaian diri, menandakan akurasi pengukuran pada kemampuan rendah, sedang dan tinggi di kedua instrumen. Profil budaya kerja menunjukkan tingkat internalisasi dan implementasi nilai-nilai budaya kerja yang berbeda-beda di kalangan pegawai. Kesimpulannya, model asesmen dapat digunakan untuk menilai implementasi nilai-nilai budaya kerja, dengan didukung melalui pendekatan dinamis, adaptif, dan strategis untuk memperkuat budaya kerja Kementerian Agama. Berdasarkan temuan ini, disarankan agar Kementerian Agama mengadopsi instrumen yang telah dikembangkan dan menggunakan hasil asesmen untuk merancang program pengembangan pegawai. date: 2024-07-01 date_type: published institution: Sekolah Program Pascasarjana department: Penelitian dan Evaluasi Pendidikan thesis_type: disertasi Employee Relations: The International Journal, 42(3), 698–716. https://doi.org/10.1108/ER-08-2019-0344 Aiken, L. R. (1985). Three coefficients for analyzing the reliability and validity of ratings. Educational and Psychological Measurement, 45(1), 131–142. https://doi.org/https://doi.org/10.1177/0013164485451012 Ali, U. S., Chang, H., & Anderson, C. J. (2015). Location indices for ordinal polytomous items based on item response theory. 
ETS Research Report Series, 2015(2), 1–13. Allen, M. J., & Yen, W. M. (2001). Introduction to measurement theory. Waveland Press. Allport, G. W., Clark, K., & Pettigrew, T. (1979). The nature of prejudice (25th Anniv). Perseus Publishing. Alphen, A. Van, Halfens, R., Hasman, A., & Imbos, T. (1994). Likert or Rasch? Nothing is more applicable than good theory. Journal of Advanced Nursing, 20(1), 196–201. American Psychological Association. (2015). APA dictionary of psychology (P. Gary R. VandenBos (ed.); 2nd ed.). American Psychological Association. Andryanto, S. D. (2021). KPK sindir Kemenag sering tersandung kasus korupsi, 2 menteri basuk bui. https://tempo.com Anton, A., & Abdullah, A. (2020). Relationship the work culture and training programs within performance. International Journal of Progressive Sciences and Technologies (IJPSAT), 20(1), 92–101. http://ijpsat.ijsht-journals.org Arfiansyah, M. R. (2020). Implementasi perilaku kerja berbasis nilai budaya kerja (NBK) pada Kementerian Agama Republik Indonesia: Analisis dengan pendekatan modifikasi theory of planned behavior (TPB). 320 Arnold, K. F., Harrison, W. J., Heppenstall, A. J., & Gilthorpe, M. S. (2019). DAG- informed regression modelling, agent-based modelling and microsimulation modelling: a critical comparison of methods for causal inference. International Journal of Epidemiology, 48(1), 243–253. Ashkanasy, N. M., Wilderom, C. P. M., & Peterson, M. F. (2000). Handbook of organizational culture and climate. Sage. Astin, A. W., Astin, H. S., & Lindholm, J. A. (2010). Cultivating the spirit: How college can enhance students’ inner lives. John Wiley & Sons. Atwater, L., & Carmeli, A. (2009). Leader–member exchange, feelings of energy, and involvement in creative work. The Leadership Quarterly, 20(3), 264–275. Ayala, R. . De. (2022). The theory and practice of item response theory. The Guilford Press. Azwar, S. (2016). Reliabilitas dan validitas aitem. Buletin Psikologi, 3(1), 19–26. Azwar, S. (2021). 
Penyusunan skala psikologi. In Pustaka Pelajar (3rd ed.). Pustaka Pelajar. Baker, F. B., & Kim, S.-H. (2004). Item response theory: Parameter estimation techniques. CRC press. Barclay, J. (2014). Conscious Culture: How to build a high performing workplace through values, ethics, and leadership. Morgan James Publishing. Beckler, D. T., Thumser, Z. C., Schofield, J. S., & Marasco, P. D. (2018). Reliability in evaluator-based tests: using simulation-constructed models to determine contextually relevant agreement thresholds. BMC Medical Research Methodology, 18. https://api.semanticscholar.org/CorpusID:53726581 Berkel, L. A., Armstrong, T. D., & Cokley, K. O. (2004). Similarities and differences between religiosity and spirituality in African American college students: A preliminary investigation. Counseling and Values, 49(1), 2–14. Biro Kepegawaian Sekretarial Jenderal Kementerian Agama RI. (2020). Buku statistik kementerian agama. Aparatur sipil negara Kementerian Agama RI. Kementerian Agama Repbulik Indonesia. https://simpeg.kemenag.go.id Bloom, B. S., Hastings, J. T., & Madaus, G. F. (1971). Handbook on formative and summative evaluation of student learning. (McGrew-Hill (ed.)). ERIC. Bolger, J., & Walker, P. (2014). Models of assessment. In Social work: an introduction (pp. 169–183). 321 https://www.sagepub.com/sites/default/files/upm- binaries/62946_Lishman.pdf Bourque, J., Skinner, H., Dupré, J., Bacchus, M., Ainslie, M., Ma, I. W. Y., & Cole, G. (2020). Performance of the Ebel standard-setting method for the spring 2019 Royal College of Physicians and Surgeons of Canada internal medicine certification examination consisting of multiple-choice questions. Journal of Educational Evaluation for Health Professions, 17, 12. https://doi.org/10.3352/jeehp.2020.17.12 Braeken, J., & Van Assen, M. A. L. M. (2017). An empirical Kaiser criterion. Psychological Methods, 22(3), 450. Brookhart, S. M., & McMillan, J. H. (2020). Classroom assessment and educational measurement. 
Taylor & Francis. Brookhart, S. M., & Nitko, A. J. (2019). Educational assessment of students. Pearson Education. Burhanuddin, B. (2019). The scale of school organizational culture in Indonesia. International Journal of Educational Management, 33(7), 1582–1595. https://doi.org/10.1108/IJEM-01-2018-0030 Cameron, K. S., & Quinn, R. E. (2011). Diagnosing and changing organizational culture: Based on the competing values framework. John Wiley & Sons. Chalmers, R. . (2012). MIRT: A multidimensional item response theory package for the R environment. Journal of Statistical Software, 48(6), 1–29. Chen, P. P., & Bonner, S. M. (2020). A framework for classroom assessment, learning, and self-regulation. Assessment in Education: Principles, Policy & Practice, 27(4), 373–393. https://doi.org/Chen, P. P., & Bonner, S. M. (2020). A framework for classroom assessment, learning, and self-regulation. Assessment in Education: Principles, Policy & Practice, 27(4), 373-393. Chew, K. J., Ross, A., Katz, A., & Matusovich, H. M. (2022). Defining Assessment: Foundation Knowledge Toward Exploring Engineering Faculty’s Assessment Mental Models. 2022 IEEE Frontiers in Education Conference (FIE), 1–8. https://doi.org/10.1109/FIE56618.2022.9962667 Christensen, K. B., Comins, J. D., Krogsgaard, M. R., Brodersen, J., Jensen, J., Hansen, C. F., & Kreiner, S. (2021). Psychometric validation of PROM instruments. Scandinavian Journal of Medicine & Science in Sports, 31(6), 1225–1238. Creemers, B., & Kyriakides, L. (2018). Developing, testing, and using theoretical 322 models for promoting quality in education. In Educational Effectiveness Theory (pp. 99–116). Routledge. Crocker, L., & Algina, J. (1986). Introduction to classical and modern test theory. ERIC. David, S. N. J., Valas, S., & Raghunathan, R. (2018). Assessing organization culture–a review on the ocai instrument. International Conference on Management and Information Systems, 21, 182–188. DeMars, C. E. (2018). 
Classical test theory and item response theory. The Wiley Handbook of Psychometric Testing: A Multidisciplinary Reference on Survey, Scale and Test Development, 49–73. Deng, N., Wells, C., & Hambleton, R. (2008). A confirmatory factor analytic study examining the dimensionality of educational achievement tests. Denison, D., Hooijberg, R., Lief, C., & Lane, N. (2012). Leading culture change in global organizations: Aligning culture and strategy. John Wiley & Sons. Denison, D. R., Janovics, J., Young, J., & Cho, H. J. (2006). Diagnosing organizational cultures: Validating a model and method. Documento de Trabajo. Denison Consulting Group, 1(1), 1–39. Deshpande, R., & Webster Jr, F. E. (1989). Organizational culture and marketing: defining the research agenda. Journal of Marketing, 53(1), 3–15. Desjardins, C. D., & Bulut, O. (2018). Handbook of educational measurement and psychometrics using R. CRC Press. Dev, S., & Sengupta, S. (2017). The impact of work culture on employee satisfaction-empirical evidence from the Indian banking sector. International Journal of Human Resources Development and Management, 17(3–4), 230– 246. DeVellis, R. F. (2006). Classical test theory. Medical Care, S50–S59. DeVellis, R. F., & Thorpe, C. T. (2017). Scale development: Theory and applications. Sage publications. Djidu, H., Ismail, R., Rachmanintyas, N. A., Sumin, Imawan, O. R., Suhariyono, Aviory, K., Prihono, E. W., Kurniawan, D. D., & Syahbrudin, J. (2022). Analisis instrumen penelitian dengan teori tes klsik dan modern menggunakan program R (S. Hadi & H. Retnawati (eds.)). UNY Press. Dobni, C. B. (2008). Measuring innovation culture in organizations. European 323 Journal of Innovation Management, 11(4), 539–559. https://doi.org/10.1108/14601060810911156 Downing, S. M. (2003). Item response theory: Applications of modern test theory in medical education. Medical Education, 37(8), 739–745. Eagleton, T. (2016). Culture. Yale University Press. Edelen, M. O., & Reeve, B. B. (2007). 
Applying item response theory (IRT) modeling to questionnaire development, evaluation, and refinement. Quality of Life Research, 16(1), 5–18. El Alaoui, M., El Yassini, K., & Ben-Azza, H. (2019). Peer Assessment Improvement Using Fuzzy Logic (pp. 408–418). https://doi.org/10.1007/978- 3-030-11196-0_35 Embley, D. W., & Thalheim, B. (2012). Handbook of conceptual modeling: theory, practice, and research challenges. Springer. Embretson, S. E., & Reise, S. P. (2000). Item response theory for psychologists. Lawrence Erlbaum Associates, Publishers. Etalong, T. A. (2020). Self Assessment Tools (SAT) as One of the Basic Tools for Performance Management: A Critique. Journal of Economics, Management and Trade, 1–10. https://doi.org/10.9734/jemt/2020/v26i1130303 Fernandez, M. R. (2016). Social responsibility and financial performance: The role of good corporate governance. BRQ Business Research Quarterly, 19(2), 137– 151. Finch, W. H., & French, B. F. (2018). Educational and psychological measurement. Routledge. Flamholtz, E. G., & Randle, Y. (2012). Corporate culture, business models, competitive advantage, strategic assets and the bottom line: Theoretical and measurement issues. Journal of Human Resource Costing & Accounting. Fowler Jr, F. J. (2013). Survey research methods. Sage publications. Franchignoni, F., Horak, F., Godi, M., Nardone, A., & Giordano, A. (2010). Using psychometric techniques to improve the balance evaluation system’s Test: the mini-BESTest. Journal of Rehabilitation Medicine: Official Journal of the UEMS European Board of Physical and Rehabilitation Medicine, 42(4), 323. Geisinger, K. F., Bracken, B. A., Carlson, J. F., Hansen, J.-I. C., Kuncel, N. R., Reise, S. P., & Rodriguez, M. C. (2013). APA handbook of testing and 324 assessment in psychology, Vol. 3: Testing and assessment in school psychology and education. American Psychological Association. Gilbert, N., & Doran, J. (2018). Simulating societies: The computer simulation of social phenomena. 
Routledge. Glaser, R. (1976). Components of a psychology of instruction: Toward a science of design. Review of Educational Research, 46(1), 1–24. Gonçalves, L. M., Tsuge, M. L. T., Borghi, V. S., Miranda, F. P., Sales, A. P. de A., Lucchetti, A. L. G., & Lucchetti, G. (2018). Spirituality, religiosity, quality of life and mental health among Pantaneiros: a study involving a vulnerable population in Pantanal Wetlands, Brazil. Journal of Religion and Health, 57, 2431–2443. Good, M., & Willoughby, T. (2006). The role of spirituality versus religiosity in adolescent psychosocial adjustment. Journal of Youth and Adolescence, 35, 39–53. Gregory, R. J. (2015). Psychological testing: History, principles and applications seventh edition. In Pearson Education (Global Edi). Pearson Education Limited. Gupta, M., Samant, K. T. S., & Tripathi, S. (2023). Measuring job performance of employees during WFH. Technology, Agility and Transformation: Emergent Business Practices, 240. Hair, J. F., Black, W. C., Babin, B. J., & Anderson, R. E. (2019). Multivariate data analysis (8th ed). Eight Edition, Cengage: Learning EMEA. https://books.google.co.id/books/about/Multivariate_Data_Analysis.html?id= 0R9ZswEACAAJ&redir_esc=y Hair Jr, J. F., Howard, M. C., & Nitzl, C. (2020). Assessing measurement model quality in PLS-SEM using confirmatory composite analysis. Journal of Business Research, 109, 101–110. https://doi.org/https://doi.org/10.1016/j.jbusres.2019.11.069 Hambleton, R. K., & Swaminathan, H. (1985). Item response theory principles and applications. Kluwer Nijhoff Publishng. Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Fundamentals of item response theory (Vol. 2). Sage. Harrison, J. S., Thurgood, G. R., Boivie, S., & Pfarrer, M. D. (2019). Measuring 325 CEO personality: Developing, validating, and testing a linguistic tool. Strategic Management Journal, 40(8), 1316–1330. Hayes, A. F., & Krippendorff, K. (2007). 
Answering the call for a standard reliability measure for coding data. Communication Methods and Measures, 1(1), 77–89. https://doi.org/10.1080/19312450709336664 Hofstede, G., Neuijen, B., Ohayv, D. D., & Sanders, G. (1990). Measuring organizational cultures: A qualitative and quantitative study across twenty cases. Administrative Science Quarterly, 286–316. Indiyaningsih, K. M. H., Murdyastuti, A., & Puspitaningtyas, Z. (2020). Efeect of human resource competency, work culture and utilization of information technology to performance of employees. International Journal of Scientific and Technology Research, 9(4), 3636–3641. https://doi.org/10.19184/issrd.v2i1.17468 Iramaneerat, C., Smith Jr, E. V, & Smith, R. M. (2008). An introduction to Rasch measurement. Best Practices in Quantitative Methods, 50–70. Irwing, P., Booth, T., & Hughes, D. J. (2018). The Wiley handbook of psychometric testing: A multidisciplinary reference on survey, scale and test development. John Wiley & Sons. Joo, S., Lee, P., Park, J. Y., & Stark, S. (2021). Assessing dimensionality of the ideal point item response theory model using posterior predictive model checking. Organizational Research Methods. https://doi.org/10.1177/10944281211050609 Joseph, M., Chang, J., Buck, S. G., Auerbach, M., Wong, A. H., Beardsley, T. D., Reeves, P. M., Ray, J., & Evans, L. V. (2020). A novel application of the modified angoff method to rate case difficulty in simulation-based research. Simulation in Healthcare the Journal of the Society for Simulation in Healthcare. https://doi.org/10.1097/sih.0000000000000530 Judge, T. A., & Robbins, S. P. (2017). Organizational behavior. Pearson. Juster, F. R., Baum, R. C., Zou, C., Risucci, D., Ly, A., Reiter, H., Miller, D. D., & Dore, K. L. (2019). Addressing the diversity–validity dilemma using situational judgment tests. Academic Medicine, 94(8), 1197–1203. Kabigting, J., Loures, L., & Brooks, D. (2019). 
The Denison organizational culture survey (DOCS): A culture measurement critique. Claremont Graduate University. Kamran, R., Rodrigues, J. N., Dobbs, T. D., Wormald, J. C. R., Trickett, R. W., & 326 Harrison, C. J. (2022). Computerized adaptive testing of symptom severity: a registry-based study of 924 patients with trapeziometacarpal arthritis. Journal of Hand Surgery (European Volume), 47(9), 893–898. Kanaslan, E. K., & Iyem, C. (2016). Is 360 degree feedback appraisal an effective way of performance evaluation. International Journal of Academic Research in Business and Social Sciences, 6(5), 172–182. Kashima, Y. (2019). What is culture for. Handbook of Culture and Psychology, 123–160. Kementerian Agama-Republik Indonesia. (2023). Perkembangan jumlah PNS Kementerian Agama. https://satudata.kemenag.go.id/dataset/detail/perkembangan-jumlah-pns- kementerian-agama Kementerian Agama Republik Indonesia. (2017). Keputusan Menteri Agama Republik Indonesia Nomor 582 Tahun 2017 tentang perubahan atas lampiran Keputusan Menteri Agama Nomor 447 Tahun 2015 tentang Road Map Reformasi Birokrasi Kementerian Agama Tahun 2015-2019 (582; pp. 1–69). Kementerian Agama Republik Indonesia. (2022a). Al-Qur’an terjemahan dan tafsir online Kementerian Agama (1). Kementerian Agama Repbulik Indonesia. https://quran.kemenag.go.id/ Kementerian Agama Republik Indonesia. (2022b). 214.306 ASN Kemenag ikuti survei indeks profesionalisme dan moderasi beragama. https://kemenag.go.id/read/214-306-asn-kemenag-ikuti-survei-indeks- profesionalisme-dan-moderasi-beragama-bgkpe Kementerian Agama Repulik Indonesia. (2020). Keputusan Menteri Agama Republik Indonesia Nomor 633 Tahun 2020 tentang pedoman pelaksanaan reformasi birokrasi pada Kementerian Agama (633; pp. 1–60). https://cms.kemenag.go.id/storage/flm/files/shares/files/kma-no-633-tahun- 2020.pdf Kementerian Pendidikan, Kebudayaan, Riset, dan Teknologi Republik Indonesia, B. (2016). Kamus besar bahasa Indonesia. 
In Badan Pengembangan dan Pembinaan Bahasa, Kementerian Pendidikan, Kebudayaan, Riset, dan Teknologi Republik Indonesia. https://kbbi.kemdikbud.go.id/ King, J. E., & Crowther, M. R. (2004). The measurement of religiosity and spirituality: Examples and issues from psychology. Journal of Organizational Change Management, 17(1), 83–101. Kitzman, J. O., MacKenzie, A. P., Adey, A., Hiatt, J. B., Patwardhan, R. P., 327 Sudmant, P. H., Ng, S. B., Alkan, C., Qiu, R., Eichler, E. E., & Shendure, J. (2011). Haplotype-resolved genome sequencing of a Gujarati Indian individual. Nature Biotechnology. https://doi.org/10.1038/nbt.1740 Kline, P. (1986). A handbook of test construction: Introduction to psychometric design. New York: Methuen. Inc. Kline, R. B. (2015). Principles and practice of structural equation modeling. Guilford publications. Koçak, R. (2006). The validity and reliability of the teachers’ performance evaluation scale. Educational Sciences: Theory & Practice, 6(3), 799–808. Kohli, N., Koran, J., & Henn, L. (2015). Relationships among classical test theory and item response theory frameworks via factor analytic models. Educational and Psychological Measurement, 75(3), 389–405. Kozina, S., Kowalski, M., Vlastelica, M., Mastelić, T., & Borovac, J. A. (2019). Traumatic memory of one’s son gone missing in war: Content analysis using Krippendorff’s Alpha. Sage Open. https://doi.org/10.1177/2158244019839627 Krippendorff, K. (2011). Computing Krippendorff’s alpha-reliability. Krippendorff, K. (2018). Content analysis: An introduction to its methodology. Sage publications. Kurniawati, D., Kusumawati, D., & Arifah, M. (2019). Developing a Decision Support System with Dynamic Criteria for The Best Employee Assessment. Journal of International Conference Proceedings, 2(2), 60–68. https://doi.org/10.32535/jicp.v2i2.603 Leventhal, B. C., & Stone, C. A. (2018). 
Bayesian analysis of multidimensional item response theory models: A discussion and illustration of three response style models. Measurement: Interdisciplinary Research and Perspectives, 16(2), 114–128. Lidwa, S. (2020). Ensiklopedi Hadis-Kitab 9 Imam. Jakarta: Salnatera. Lievens, F. (1998). Factors which improve the construct validity of assessment centers: A review. International Journal of Selection and Assessment, 6(3), 141–152. Likert, R. (1932). A technique for the measurement of attitudes. Archives of Psychology. 328 Linden, W. J. Van der. (2018). Handbook of item response theory: Three volume set. CRC Press. Linden, W. J., & Hambleton, R. K. (1997). Item response theory: Brief history, common models, and extensions. In Handbook of modern item response theory (pp. 1–28). Springer. Lord, F. M., & Novick, M. R. (2008). Statistical theories of mental test scores. IAP. Lutfah, A., Hariyati, N., & Handayaningrum, W. (2019). Improved teacher performance through work culture and environment. International Journal for Educational and Vocational Studies, 1(8), 859. https://doi.org/10.29103/ijevs.v1i8.2240 Lynn, M. L., Naughton, M. J., & VanderVeen, S. (2011). Connecting religion and work: Patterns and influences of work-faith integration. Human Relations, 64(5), 675–701. Lynn, P., Levy, P. S., & Lemeshow, S. (1993). Sampling of populations: Methods and applications. The Statistician, 42(2), 199. https://doi.org/10.2307/2348995 Lyons, P., & Bandura, R. P. (2018). Measures aiding in identifying voluntarily helpful employees: an accounting student sample. European Journal of Training and Development, 42(5/6), 305–318. https://doi.org/10.1108/EJTD- 01-2018-0008 Mackenzie, S. (1995). Surveying the organizational culture in an NHS trust. Journal of Management in Medicine, 9(6), 69–77. https://doi.org/10.1108/02689239510101157 Mardapi, D. (2017). Pengukuran penilaian dan evaluasi pendidikan edisi 2. Yogyakarta: Parama Publishing. Mathew, G. C., Prashar, S., & Ramanathan, H. 
N. (2018). Role of spirituality and religiosity on employee commitment and performance. International Journal of Indian Culture and Business Management, 16(3), 302–322. Mathew, J. (2018). Organisational culture and effectiveness: A multi-perspective evaluation of an Indian knowledge-intensive firm. Employee Relations: The International Journal. Maydeu-Olivares, A., Morera, O., & D’Zurilla, T. J. (1999). Using Graphical Methods in Assessing Measurement Invariance in Inventory Data. Multivariate Behavioral Research, 34(3), 397–420. https://doi.org/10.1207/S15327906MBR3403_5 329 McDonald, R. P. (2013). Test theory: A unified treatment. psychology press. McMillan, J. H. (2013). Classroom assessment. Principles and Practice for Effective Instruction. Boston: Ed Allyn and Bacon. Meijer, R. R., & Tendeiro, J. N. (2018). Response theory. The Wiley Handbook of Psychometric Testing: A Multidisciplinary Reference on Survey, Scale and Test Development, 413. Mislevy, R. J. (1996). Test theory reconceived. Journal of Educational Measurement, 33(4), 379–416. Morrison, M. (2015). Reconstructing reality: Models, mathematics, and simulations. Oxford Studies in Philosophy o. Moss, C. M., & Brookhart, S. M. (2019). Advancing formative assessment in every classroom: A guide for instructional leaders. ASCD. Mousavi, S.-F. (2017). Analysis of time series in hydrological processes using chaos theory (Case study: Monthly rainfall of Urmia Lake). Modares Civil Engineering Journal, 17(2), 213–223. Muijen, J. J. Van. (1999). Organizational culture: The focus questionnaire. European Journal of Work and Organizational Psychology, 8(4), 551–568. Munawwaroh, S., Larasati, E., Suwitri, S., & Warsono, H. (2020). Work culture change in Ministry of Religious Affairs (MoRA) Indonesia. IEOM Society International, 2843–2851. Muraki, E. (1992). A generalized partial credit model: Application of an EM algorithm. Applied Psychological Measurement, 16(2), 159–176. 
https://doi.org/10.1177/014662169201600206 Muraki, E., & Bock, R. D. (1997). PARSCALE-3: IRT based test scoring and item analysis for graded items and rating scales [Computer software]. Chicago: Scientific Software International. Nakayama, M., Sciarrone, F., Uto, M., & Temperini, M. (2021). Estimating Student’s Performance Based on Item Response Theory in a MOOC Environment with Peer Assessment (pp. 25–35). https://doi.org/10.1007/978- 3-030-52287-2_3 O’Reilly III, C. A., Caldwell, D. F., Chatman, J. A., & Doerr, B. (2014). The promise and problems of organizational culture: CEO personality, culture, and firm performance. Group & Organization Management, 39(6), 595–625. 330 O’Reilly III, C. A., Chatman, J., & Caldwell, D. F. (1991). People and organizational culture: A profile comparison approach to assessing person- organization fit. Academy of Management Journal, 34(3), 487–516. Obregon, S. L., Lopes, L. F. D., Kaczam, F., da Veiga, C. P., & da Silva, W. V. (2022). Religiosity, spirituality and work: A systematic literature review and research directions. Journal of Business Ethics, 179(2), 573–595. Ostini, R., & Nering, M. L. (2006). Polytomous item response theory models (Issue 144). Sage. Özaydin, Z., & Arslan, Ç. (2022). Assessment of mathematical reasoning competence in accordance with PISA 2021 mathematics framework. Kuramsal Eğitimbilim. https://doi.org/10.30831/akukeg.1027601 Paek, I., & Cole, K. (2019). Using R for item response theory model applications. Routledge. Pargament, K. I. (2001). The psychology of religion and coping: Theory, research, practice. Guilford press. Park, J., Yim, M. K., Kim, N. J., Ahn, D. S., & Kim, Y.-M. (2020). Similarity of the cut score in test sets with different item amounts using the modified Angoff, modified Ebel, and Hofstee standard-setting methods for the Korean Medical Licensing Examination. Journal of Educational Evaluation for Health Professions, 17, 28. https://doi.org/10.3352/jeehp.2020.17.28 Plomp, T. (2013). 
Educational design research: An introduction. Educational Design Research, 11–50. Raykov, T., Marcoulides, G. A., & Patelis, T. (2015). The importance of the assumption of uncorrelated errors in psychometric theory. Educational and Psychological Measurement, 75(4), 634–647. Reckase, M. D. (1985). Models for multidimensional tests and hierarchically structured training materials. American Coll Testing Program Iowa City IA Test Development Diva. Reckase, M. D. (2009a). Estimation of item and person parameters. Multidimensional Item Response Theory, 137–178. https://doi.org/10.1007/978-0-387-89976-3_6 Reckase, M. D. (2009b). Historical background for multidimensional item response theory (MIRT). Multidimensional Item Response Theory, 57–77. https://doi.org/10.1007/978-0-387-89976-3_3 331 Reise, S. P., & Rodriguez, A. (2016). Item response theory and the measurement of psychiatric constructs: some empirical and conceptual issues and challenges. Psychological Medicine, 46(10), 2025–2039. Rest, J., Thoma, S., & Edwards, L. (1997). Designing and validating a measure of moral judgment: Stage preference and stage consistency approaches. Journal of Educational Psychology, 89(1), 5–28. https://doi.org/10.1037/0022- 0663.89.1.5 Retnawati, H. (2014). Teori respons butir dan penerapannya: Untuk peneliti, praktisi pengukuran dan pengujian, mahasiswa pascasarjana. Nuha Medika. Rohmad, R., Dharin, A. D., & Azis, D. K. (2023). Developing self-assessment instruments of affective domain on belief and morality (Aqidah Akhlak) subject in Madrasah Tsanawiyah. Pegem Journal of Education and Instruction, 13(1). https://doi.org/10.47750/pegegog.13.01.21 Rusch, T., Lowry, P. B., Mair, P., & Treiblmaier, H. (2017). Breaking free from the limitations of classical test theory: Developing and measuring information systems scales using item response theory. Information & Management, 54(2), 189–203. Rust, J., & Golombok, S. (2014). Modern psychometrics: The science of psychological assessment. 
Routledge. Saini, V. P., & Bhaker, S. (2015). Exploring the dimensions of performance appraisal system in private ceramic tiles companies in India. International Journal of Education and Management Studies, 5(3), 273. Salam, R. (2021). The importance performance assessment and its impact on improving performance of public service organizations in South Tangerang City. Sosiohumaniora, 23(2), 226. https://doi.org/10.24198/sosiohumaniora.v23i2.31963 Samejima, F. (1997). Graded response model. In Handbook of modern item response theory (pp. 85–100). Springer. Santoso, A. B., Sutisna, D., & Tarmidi, D. (2019). Building a work culture that impacts on employee’s performance improvement. International Journal of Innovation, Creativity and Change, 6(6), 80–87. Sari, M., Lubis, A. F., Maksum, A., Lumbanraja, P., & Muda, I. (2018). The influence of organization’s culture and internal control to corporate governance and Its impact on state-owned enterprises corporate. Journal of Applied Economic Sciences, 13(3). 332 Sarle, W. S. (1995). Measurement theory: Frequently asked questions. Disseminations of the International Statistical Applications Institute, 1(4), 61– 66. Sattler, C., Sonntag, K., & Götzen, K. (2016). The quality culture inventory (QCI): An instrument assessing quality-related aspects of work. In Advances in Ergonomic Design of Systems, Products and Processes (pp. 43–56). Springer. Scheaffer, R. L., Mendenhall III, W., Ott, R. L., & Gerow, K. G. (2011). Elementary survey sampling. Cengage Learning. Schein, E. H. (2010). Organizational culture and leadership (Vol. 2). John Wiley & Sons. Schneider, B., & Barbera, K. M. (2014). The Oxford handbook of organizational climate and culture (P. E. Nathan (ed.)). Oxford University Press. Schriber, J. B., & Gutek, B. A. (1987). Some time dimensions of work: Measurement of an underlying aspect of organization culture. Journal of Applied Psychology, 72(4), 642. Schumacker, R. (2019). Psychometric packages in R. 
Measurement: Interdisciplinary Research and Perspectives, 17(2), 106–112. https://doi.org/10.1080/15366367.2018.1544434 Seijts, G., Espinoza, J. A., & Carswell, J. (2020). Utility analysis of character assessment in employee placement. Leadership & Organization Development Journal, 41(5), 703–720. https://doi.org/10.1108/LODJ-07-2019-0314 Seppala, E., & Cameron, K. (2015). Proof that positive work cultures are more productive. Harvard Business Review, 12(1), 44–50. Sharp, R. H. (2022). Authenticity, Religiosity, and Organizational Opportunity: Exploring the Synergies Between Religion, Spirituality, and Organizational Behavior. In Religion and Its Impact on Organizational Behavior (pp. 65–94). IGI Global. Silverberg, J. I., Lai, J. S., Kantor, R., Dalal, P., Hickey, C., Shaunfield, S., Kaiser, K., Correia, H., & Cella, D. (2020). Development, validation, and interpretation of the PROMIS Itch Questionnaire: A patient-peported outcome measure for the quality of life impact of Itch. Journal of Investigative Dermatology. https://doi.org/10.1016/j.jid.2019.08.452 Smit, E. vd M., Van der Post, W. Z., & De Coning, T. J. (1997). An instrument to measure organizational culture. South African Journal of Business Management, 28(4), 147–161. 333 Solís, M., & Mora-Esquivel, R. (2020). Development and validation of a measurement scale of the innovative culture in work teams. International Journal of Innovation Science, 11(2), 299–322. https://doi.org/10.1108/IJIS- 07-2018-0073 Souza, A. C. de, Alexandre, N. M. C., & Guirardello, E. de B. (2017). Psychometric properties in instruments evaluation of reliability and validity. Epidemiologia e Servicos de Saude, 26, 649–659. Spearman, C. (1987). The proof and measurement of association between two things. The American Journal of Psychology, 100(3/4), 441–471. Stanek, K. (2021). Self-assessment and coping with stress in the area of competence on the example of future social workers. Praca Socjalna, 36(2), 59–71. 
https://doi.org/10.5604/01.3001.0014.8732
Sugianingrat, I. A. P. W., & Sarmawa, I. W. G. (2017). Effect of work culture on employee performance with work motivation as mediator: study at non-star hotel in Denpasar-Bali, Indonesia. International Journal of Economics, Commerce and Management, V(12), 858–867.
Suskie, L. (2018). Assessing student learning: A common sense guide. John Wiley & Sons.
Tan, B.-S. (2019). In search of the link between organizational culture and performance. Leadership & Organization Development Journal, 40(3), 356–368. https://doi.org/10.1108/LODJ-06-2018-0238
Team, R. C. (2013). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. http://www.R-project.org/
Teltemann, J., & Schunck, R. (2020). Standardized Testing, Use of Assessment Data, and Low Reading Performance of Immigrant and Non-immigrant Students in OECD Countries. Frontiers in Sociology, 5. https://doi.org/10.3389/fsoc.2020.544628
Thontowi, Z. S., Qowim, M., & Dardiri, A. (2019). Implementasi lima nilai budaya kerja di Kantor Kementerian Agama Kabupaten Banyumas. FIKROTUNA: Jurnal Pendidikan Dan Manajemen Islam, 9(1), 1160–1178.
Tim Puslitbang LKKMO. (2022). Strategi implementasi lima nilai budaya kerja di lingkungan Kementerian Agama “Memperkuat etos dan produktivitas kerja”, Policy Brief. https://simlitbangdiklat.kemenag.go.id/
Topa, G., Jose Fernandez Muñoz, J., Rey Juan Carlos, U., Cristina García-Ael, S., Chapman, D. S., Reeves, P., & Chapin, M. (2018). A Lexical Approach to Identifying Dimensions of Organizational Culture. Frontiers in Psychology, 9. https://doi.org/10.3389/fpsyg.2018.00876
Tracey, P. (2012). Religion and organization: A critical review of current trends and future directions. Academy of Management Annals, 6(1), 87–134.
Traub, R. E. (1997).
Classical test theory in historical perspective. Educational Measurement, 16, 8–13.
Ulum, R., Sugiyarto, W., Wahab, A. J., & Muntafa, F. (2019). Indeks kesalehan sosial. In Litbangdiklat Press. Litbangdiklat Press. https://simlitbangdiklat.kemenag.go.id/simlitbang/assets_front/pdf/1611128195Indeks_Kesalehan_Sosial_2019.pdf
Wainer, H., & Thissen, D. (1996). How is reliability related to the quality of test scores? What is the effect of local dependence on reliability? Educational Measurement: Issues and Practice, 15(1), 22–29.
Warsah, I., & Imron, I. (2019). The Discourse of Spirituality Versus Religiosity in Islam. AL ALBAB, 8(2), 225–236.
Wesolowski, B. C. (2020). “Classroometrics”: The Validity, Reliability, and Fairness of Classroom Music Assessments. Music Educators Journal, 106(3), 29–37. https://doi.org/10.1177/0027432119894634
Wilensky, U., & Rand, W. (2015).
citation: Sumin, Sumin and Retnawati, Heri (2024) Model Asesmen Nilai-Nilai Budaya Kerja Kementerian Agama Republik Indonesia. S3 thesis, Sekolah Program Pascasarjana.
document_url: http://eprints.uny.ac.id/83146/1/disertasi_sumin_21701261018.pdf