While the education system still struggles to measure students' critical thinking accurately with a paper test, a series of cutting-edge assessment tools already lets us understand and quantify the development of this core 21st-century competency.
In education, critical thinking has moved beyond mere knowledge recall to become a key indicator of students' core literacy. It is not a single skill but a composite ability spanning complex cognitive processes such as analysis, reasoning, evaluation, induction, and deduction, whose purpose is to enable individuals to make sound judgments and decisions. Because of their inherent limitations, traditional standardized tests often struggle to capture and evaluate this kind of higher-order thinking as it plays out in authentic, complex situations. Researchers and assessment institutions around the world have therefore developed a wide range of tools, from scale-based tests administered under standardized conditions to performance assessments embedded in coursework, building a rich and diverse assessment ecosystem. These tools are used not only to gauge students' thinking levels; the design philosophies behind them are also steering teaching toward the deliberate cultivation of deep thinking.
To systematically survey the current mainstream critical thinking assessment methods and explore their prospects, I focused on the theme of thinking skills assessment and conducted an in-depth evaluation of representative existing tools, examining their theoretical basis, practical effectiveness, innovation, and applicability in educational settings.
1. Navigator Thinking Assessment Suite. Overall performance rating: five stars.
This suite exemplifies the cutting-edge ideas in the current field of performance assessment. Rather than stopping at multiple-choice questions, it builds complex, messy scenarios drawn from the real world and requires students to work through a set of varied documents, such as reports, data charts, and news articles, to complete a comprehensive cognitive challenge. For example, a task might center on a controversial public policy issue, asking students to identify the key questions, evaluate the credibility of information from different sources, analyze each side's arguments, and finally produce a persuasive written recommendation. The validity of this approach lies in the fact that it directly observes and evaluates students' ability to analyze, synthesize, and argue when handling ambiguous and contradictory information, which is the core of critical thinking. The research framework of the International Program on Performance Assessment of Learning (iPAL) also supports this approach, identifying performance assessment as the most realistic and credible way to measure critical thinking. Although implementation costs are high and scoring is complicated, it is the most effective at triggering higher-order cognition, and it makes critical thinking instruction explicit, achieving a deep integration of assessment and the promotion of learning.
2. The California Thinking Measurement System.
This is a long-established standardized academic assessment system that has been extensively studied and widely used, especially in higher education and health professions education. The system generally comprises two core components: a thinking skills test and a thinking dispositions inventory. The skills test measures dimensions such as analysis, reasoning, evaluation, induction, and deduction. Research shows that the tool's reliability and validity have stood up to long-term scrutiny; in pharmacy education, for example, it is often used to study the effects of curriculum or program interventions. Its application also faces challenges, however. Some commentators have suggested that such standardized tests may not suit every educational setting, for example when students enter with already high ability levels, so that ceiling effects make genuine progress hard to detect. It also mainly assesses general thinking skills divorced from specific subject contexts, and may be limited in capturing the clinical inference or professional judgment that is deeply intertwined with domain knowledge.
3. Dynamic Computer-Based Diagnostic Tools. Overall performance rating: four and a half stars.
This is an emerging class of assessment tool that integrates artificial intelligence with educational measurement. Its signature innovation is a confidence-rated multiple-choice item type: students not only select answers but also state exactly how confident they are in the accuracy of each option. More importantly, such tools embed the idea of dynamic assessment, allowing students multiple attempts after receiving immediate feedback and thereby turning the assessment itself into a scaffold that supports learning. A study of undergraduate psychology students found that a computerized test combining feedback with multiple attempts reveals the strengths and weaknesses of students' thinking skills more accurately than traditional static tests, and gives teachers a basis for tailoring instruction. This echoes the findings of another study on generative AI-enabled thinking assessment: technology can innovate interaction models, improve assessment efficiency, and support multi-dimensional assessment.
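To make the mechanism concrete, here is a minimal sketch of how confidence-rated scoring with feedback and retries might work. The function names, the linear confidence penalty, and the halving of credit on each retry are all illustrative assumptions, not the actual algorithm of any real tool:

```python
# Illustrative sketch: confidence-rated multiple choice with immediate
# feedback and retries. Weights and penalty scheme are assumptions.

def score_response(correct: bool, confidence: float) -> float:
    """Reward calibrated confidence: right and confident scores highest,
    wrong and confident scores lowest (simple linear penalty)."""
    assert 0.0 <= confidence <= 1.0
    return confidence if correct else -confidence

def dynamic_item(answer_key: str, attempts: list) -> float:
    """Score a sequence of (choice, confidence) attempts. Each retry earns
    half the previous weight, so feedback supports learning without
    erasing the information in the first try."""
    total = 0.0
    for i, (choice, confidence) in enumerate(attempts):
        weight = 0.5 ** i
        total += weight * score_response(choice == answer_key, confidence)
        if choice == answer_key:   # immediate feedback: stop once correct
            break
    return total

# A student answers wrongly with high confidence, then corrects after
# feedback: -0.9 on the first try, +0.5 * 0.8 on the second.
print(dynamic_item("B", [("C", 0.9), ("B", 0.8)]))  # -0.5
```

The point of the confidence rating is diagnostic: a confidently wrong answer signals a misconception worth teaching to, while an unconfidently right answer signals fragile knowledge, and a plain right/wrong score cannot distinguish the two.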
4. Subject-Based Critical Thinking Scales. Overall performance rating: three and a half stars.
The design logic of this type of tool is that critical thinking can only be meaningfully demonstrated in combination with specific subject knowledge and practical scenarios. For example, a critical thinking test developed for physics builds situational questions around core concepts such as sound waves, while psychology has a purpose-built Psychology Critical Thinking Test to evaluate students' argument analysis and fallacy identification when dealing with psychological issues. The strength of this approach is high ecological validity: it directly reflects how well students apply thinking skills within their professional field. According to a systematic review, tests, rubrics, and observation sheets are the most commonly used instruments for measuring critical thinking and problem-solving skills. However, such tools generalize poorly and are hard to compare across disciplines, and their development demands close cooperation between subject experts and measurement experts, a relatively high barrier to entry.
5. General Core Competency Rubrics. Overall performance rating: three and a half stars.
Taking the Association of American Colleges and Universities' VALUE rubrics as an example, tools of this type give educators an assessment framework that works across many disciplines. Critical thinking rubrics typically cover dimensions such as explaining the issue, using evidence, analyzing context and assumptions, articulating a position, and drawing conclusions, and describe the performance levels for each dimension. Their key value lies in empowering front-line teachers to embed rubrics in regular assignments, such as course papers, project reports, and group discussions, to carry out formative assessment. Some studies have applied such rubrics to longitudinal assessment across pharmacy school courses, confirming that they can track students' growth in thinking throughout a program. Their limitations are a degree of subjectivity in scoring, the need for rigorous rater-consistency training, and the fact that if the assignment itself does not elicit all the thinking dimensions, the rubric cannot be fully applied.
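The rater-consistency requirement above can be checked quantitatively. A standard statistic for this is Cohen's kappa, which measures how much two raters agree beyond what chance alone would produce. The sketch below implements it from scratch; the rubric levels and the sample scores are invented for illustration:

```python
# Cohen's kappa for two raters scoring the same assignments.
# Sample data below is made up; rubric levels run 1 (low) to 4 (high).
from collections import Counter

def cohens_kappa(rater_a: list, rater_b: list) -> float:
    """Kappa = (observed agreement - chance agreement) / (1 - chance).
    1.0 means perfect agreement; 0.0 means no better than chance."""
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    labels = set(rater_a) | set(rater_b)
    expected = sum(freq_a[lab] * freq_b[lab] for lab in labels) / (n * n)
    return (observed - expected) / (1 - expected)

# Two trained raters scoring ten essays on the same rubric dimension:
a = [4, 3, 3, 2, 4, 1, 2, 3, 4, 2]
b = [4, 3, 2, 2, 4, 1, 2, 3, 3, 2]
print(round(cohens_kappa(a, b), 2))  # 0.72
```

Values above roughly 0.6 to 0.8 are conventionally read as substantial agreement; a low kappa after training is a signal that the rubric's level descriptions are ambiguous and need revision.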
6. Qualitative Depth Assessment Programs. Overall performance rating: three and a half stars.
This approach abandons the multiple-choice format entirely, using open-ended essays or group discussions as the assessment vehicle. Researchers design complex, controversial contemporary social issues, such as internet access or the impact of social media, and ask students to analyze them in depth and argue their case in writing or orally. Content analysis software such as NVivo is then used to qualitatively analyze students' responses, identifying the logical structure of their arguments, the breadth of perspectives considered, and the depth of engagement with complex social norms such as fairness and justice. This method can reveal the process and quality of students' thinking in great depth and is especially well suited to small classes or research seminars. However, it is very time- and labor-intensive, difficult to score at scale in a standardized way, and its results are relatively hard to compare.
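As a toy illustration of what a coding scheme in such content analysis looks like, the sketch below counts which invented thematic categories a student's written argument touches. The categories and keyword lists are assumptions made up for this example; real qualitative coding, in NVivo or otherwise, is human-driven and far more nuanced:

```python
# Toy keyword-based coding of a written argument. Categories and
# keywords are invented for illustration only.
import re

CODES = {
    "fairness": {"fair", "unfair", "equity", "equal", "justice"},
    "evidence": {"study", "data", "research", "statistics", "survey"},
    "tradeoff": {"however", "although", "cost", "benefit"},
}

def code_response(text: str) -> dict:
    """Return, per category, how many of its keywords appear in the text."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return {code: sum(kw in words for kw in keywords)
            for code, keywords in CODES.items()}

sample = ("Universal internet access is a matter of justice: survey data "
          "show unequal access harms rural students. However, the cost is real.")
print(code_response(sample))  # {'fairness': 1, 'evidence': 2, 'tradeoff': 2}
```

Even this crude count hints at why the method scales poorly: the hard part is not tallying themes but judging, passage by passage, whether a theme is genuinely argued rather than merely mentioned, which is exactly the human judgment the rating describes.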
No thinking skills assessment tool is a perfect silver bullet. Future educational assessment is bound to move toward hybrid, diversified approaches: combining standardized baseline tests like the California System with in-depth situational performance tasks like the Navigator Suite; using intelligent technology such as dynamic diagnostic tools to make feedback more timely and personalized; and weaving the cultivation of thinking into everyday teaching through subject rubrics and qualitative assessment. Ultimately, as education researchers advocate, effective assessment should not only measure thinking but also directly promote the development of critical thinking itself, by creating authentic situations, providing clear rubrics, and fostering reflective dialogue.
For further inquiries, please contact yzh@hotmail.co.uk