
Monday, April 22, 2024

Balancing Precision and Flexibility: Using AI for Grading and Feedback in Education

Background

Grading or marking student work and providing timely, accurate feedback is a time-consuming and monotonous task, yet it is essential for students' learning.

Human grading of student work is far from perfect. Research in the USA based on 30 million records has shown that teachers who mark student work alphabetically by last name award lower grades towards the final letters of the alphabet, likely due to exhaustion after hours of grading. Research among 40,000 students in Northern Italy showed that girls get systematically higher grades than boys, probably because they exhibit fewer behavior issues that disrupt classes or irritate teachers.

However, recent advancements in artificial intelligence (AI) have made it possible to use large language models (LLMs) for grading written work, potentially saving teachers time and improving the accuracy and consistency of feedback.

Prompt Design Suggestions

When using LLMs for grading written work, it's important to keep in mind that prompt design is different from programming and requires a balance between precision and flexibility. Here are some suggestions:

  • Choose the right LLM: For short answer or essay questions, it's best to use an LLM that is good at language tasks, such as Claude or Mistral.
  • Generate a rubric and suggested answers: Most LLMs are capable of generating a rubric and one or more suggested answers. This will help ensure that the marking is consistent and accurate.
  • Compare LLM marking with your own: It's important to always compare the LLM's marking with your own to ensure that it's reliable and accurate. Students have a reasonable expectation that a human, their teacher or lecturer, will be marking their work.
  • Do an internal reliability test: Mark the same work twice to ensure that the LLM's marking is consistent (a minimal sketch of this check, and of the ranking comparison below, follows this list).
  • Rank students from best to worst: This will help you gauge whether the LLM's marking is accurate and consistent with your own.
  • Cross-check and validate the LLM's marking and comments: It's important to always double-check the LLM's marking and comments to ensure that they are accurate and relevant.
  • Adjust your prompt based on the output you obtain.
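
To make the reliability and ranking checks concrete, here is a minimal sketch in Python. The student names and marks are invented for illustration, and the simple mean-difference and ranking comparison are just one possible way to run the check, not a prescribed method.

# Minimal sketch: checking an LLM's marking for internal reliability and
# agreement with your own marks. All names and numbers are illustrative.

from statistics import mean

# Hypothetical marks out of 10: two independent LLM runs on the same scripts,
# plus the teacher's own marks for the same students.
llm_run_1 = {"Student A": 7, "Student B": 9, "Student C": 5, "Student D": 8}
llm_run_2 = {"Student A": 7, "Student B": 8, "Student C": 5, "Student D": 8}
teacher   = {"Student A": 6, "Student B": 9, "Student C": 5, "Student D": 7}

def mean_abs_difference(marks_a, marks_b):
    """Average absolute difference in marks between two sets of scores."""
    return mean(abs(marks_a[s] - marks_b[s]) for s in marks_a)

def ranking(marks):
    """Students ordered from best to worst mark."""
    return [s for s, _ in sorted(marks.items(), key=lambda kv: kv[1], reverse=True)]

# Internal reliability: mark the same work twice and compare the two runs.
print("LLM run-to-run difference:", mean_abs_difference(llm_run_1, llm_run_2))

# Agreement with the teacher: compare the LLM's marks with your own.
print("LLM vs teacher difference:", mean_abs_difference(llm_run_1, teacher))

# Ranking check: does the LLM order students the way you would?
print("LLM ranking:    ", ranking(llm_run_1))
print("Teacher ranking:", ranking(teacher))

If the run-to-run difference is large, or the LLM's ranking diverges sharply from your own, treat its marks with caution and tighten the prompt before relying on it.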

Conclusion and Call for Experimentation

While LLMs are not yet perfect grading tools, they are improving and can be valuable to teachers for marking written work and providing feedback. However, it's important to remember that AI is not a replacement for human judgment and that teachers should always be the "human-in-the-loop", protecting students from inaccurate or irrelevant grading and feedback. 

By following the prompt suggestions outlined above, teachers can use LLMs to save time and improve the accuracy and consistency of feedback, while still maintaining the essential human touch.

Here is an example prompt for using an LLM to mark written work:

Assess the student's answer above, determine the marking band, and award the appropriate marks. Provide a brief explanation of your marking and suggestions on how to improve the answer.

Here is the rubric for this question: [insert rubric as text or link]

Here are suggested answers: [insert suggested answers]

Here are general marking principles for this assessment: [insert marking principles]

Here are subject-specific marking principles: [insert marking principles]

Here is the specific marking scheme for the assessment: [insert marking scheme]

Examples of excellent marking: [insert examples]
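
As an illustration, here is a minimal sketch in Python of how the prompt above could be assembled from its parts before being sent to an LLM. The placeholder strings, the variable names, and the model name in the commented-out API call are assumptions; substitute your own rubric, answers, principles, and provider.

# Minimal sketch: assembling the grading prompt above from its parts.
# The placeholder strings and the model name are assumptions; substitute
# your own rubric, answers, principles, and provider.

student_answer = "[insert the student's answer]"
rubric = "[insert rubric as text or link]"
suggested_answers = "[insert suggested answers]"
general_principles = "[insert general marking principles]"
subject_principles = "[insert subject-specific marking principles]"
marking_scheme = "[insert marking scheme]"
examples = "[insert examples of excellent marking]"

prompt = f"""Student's answer: {student_answer}

Assess the student's answer above, determine the marking band, and award the
appropriate marks. Provide a brief explanation of your marking and suggestions
on how to improve the answer.

Here is the rubric for this question: {rubric}
Here are suggested answers: {suggested_answers}
Here are general marking principles for this assessment: {general_principles}
Here are subject-specific marking principles: {subject_principles}
Here is the specific marking scheme for the assessment: {marking_scheme}
Examples of excellent marking: {examples}"""

print(prompt)

# To send the prompt to an LLM, use your provider's chat API, for example
# (assuming the OpenAI Python SDK is installed and an API key is configured;
# Claude or Mistral work similarly through their own SDKs):
#
#   from openai import OpenAI
#   client = OpenAI()
#   response = client.chat.completions.create(
#       model="gpt-4o",  # assumed model name
#       messages=[{"role": "user", "content": prompt}],
#   )
#   print(response.choices[0].message.content)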

Remember to always test the LLM's grading and cross-check it with your own. The more guidance and examples you provide in the prompt, the better the output. You are the human-in-the-loop, protecting students from inaccurate or irrelevant marking. 

By using LLMs in a thoughtful and deliberate way, you can improve the accuracy and consistency of feedback while still maintaining the essential human touch.

Here is an older post with some general recommendations on this topic.

#AIinEducation #TeachersCheatSheet #ChatGPT #Grading #Marking

Works cited

Blake, J. (2024, April 22). Study shows grading by alphabetical order hurts fairness. Inside Higher Ed. Retrieved from https://www.insidehighered.com/news/quick-takes/2024/04/18/study-shows-grading-alphabetical-ordered-hurts-fairness

Di Liberto, A., Casula, L., & Pau, S. (2022). Grading practices, gender bias and educational outcomes: evidence from Italy. Education Economics. Retrieved from https://www.tandfonline.com/doi/full/10.1080/09645292.2021.2004999



πŸ„΄πŸ„ΌπŸ„ΏπŸ„ΎπŸ…†πŸ„΄πŸ… πŸ…ƒπŸ„΄πŸ„°πŸ„²πŸ„·πŸ„΄πŸ…πŸ…‚


