Teaching the NYS Global Regents CRQ

When the new Frameworks for social studies in New York State came out about six years ago, the first item that caught my eye was the constructed-response question sets. If you are not familiar with this task, in brief, it calls upon the student to examine two historical sources (I use primary sources exclusively): to provide historical or geographic context, to identify the point of view, intended audience, or purpose of a document, and then to use the two documents in a compare-contrast analysis, a cause-effect analysis, or a turning point identification.

Originally, the task was supposed to call upon the student to evaluate the reliability of the source. This was scrapped, probably after field testing questions. I regret their decision to scrap this (though it does appear as part of the analogous task, the short essay, in US History grade 11). Students can be taught to evaluate the reliability of sources at an early age (I incorporated it into my sixth- through eighth-grade social studies work). Students can be trained to evaluate based on features like the point of view (bias), intended audience, purpose, time and place, and authorship. Teaching this does require patience and practice. What I think happened is this: few teachers address the reliability of sources and so when they field tested the new exam format, everyone did so poorly that they scrapped the question. I wish I could find out how my kids did. I had been teaching the reliability of sources since they were in middle school.

I advocate a strategy of assigning one CRQ in each unit of study in grades nine and ten without access to notes. I used this as one of the tests at the end of a unit of study. I would tell students in advance what kinds of documents would appear and what historical context they should be able to recall. It is a challenging task for them.

The first challenge for novices is to understand what it means to provide context. Faced with the question “What is the historical context of this document?”, beginners will retell what the document says. The reason for this mistake is that, since they were little kids, teachers have asked them to relate what a text means to prove they understood it. This task asks students to bring to bear what they know about the history behind the document, which requires a level of recall that I don’t think we ask of students often enough.

Another challenge for students writing the CRQ is the third question where they must use two documents in analysis. Cause-effect relationships are the easiest for students, it seems, so I teach those first. When students are asked to compare and contrast the documents, some students have to be cautioned to select significant elements to compare. The fact that one document is a map and one is a diary entry, for example, is not a significant fact. The turning point question is the hardest, mostly because it calls on students to recall history and to understand historical trends both before and after the event. I find it useful to break this up into small tasks: first say what the turning point is, then say what is so special about it.

After a difficult task, especially for writing, I like to do a “debriefing”. In the debriefing after the CRQ, I like to share student answers anonymously in a slide show and discuss them. It is a very effective strategy for stamping out common errors early and permanently. Sample debriefing PowerPoints from an eighth-grade and ninth-grade CRQ are posted below.

Certain reminders recurred constantly in all my classes:

— Things are not reliable because of what they say or show.

— Things are not reliable because they are in quotation marks.

— Things are not reliable because they happen to match what you think is already true.

— Unreliable statements can be true.

— When you’re asked for historical context, do not describe what the document says.

And what about middle school? Can the constructed-response task be modified for that level?

Students with an IEP are often very challenged by these tasks because of the reading level of the texts. Since I use primary sources exclusively in my CRQs, this is doubly difficult for students with reading limitations. One answer is to obtain CRQs designed with a lower reading level.

If I may be so bold as to give advice, I would suggest that students do these throughout Global nine and ten, perhaps once a month. I suggest that they be given as tests rather than as open-book or take-home assignments because students need to prepare for recall situations. The debriefing afterward is vital to class improvement and it makes for a long-lasting correction. I would advocate resisting the temptation to leave this off for the Global ten teacher. It really does take a long time to learn to do this well.

Why didn’t anyone tell me about Standardized Scoring in Teacher School?

There was not a lot of math in my teacher preparation in the 1980s. Actually, I managed to take my BA without a math course (I took extra science instead). I sure wish I had statistics from the start!

Sometimes no matter our experience or preliminary testing, we are surprised at how poorly a class does on a test. Most teachers resort, quite rightly, to some kind of “curve” to alter the scores. Even a valid, reliable test can be too difficult for students.

A second issue of interest is in scaffolding difficult tasks for our students. Some things require time and practice to learn to do. Students may face low scores at first on such tasks. We don’t want their grades or their confidence to suffer. “Curving” the grades on a task while students are still in training for it is a good practice.

Enter the z-score standardization procedure. I don’t want this to be a post about mathematics (mostly because I am not confident enough to write one), but I would like to promote this as one of the best ways to alter a set of test scores in situations where (1) the group’s scores are lower than expected (say, more than 5 points below the class average overall in the course) or (2) the class is still practicing a difficult skill.

My interest in z-scores began around 2010 when I was working to establish that the different capstone unit tasks I let kids choose were, in fact, of equal difficulty. I value differentiated instruction, but I also strongly value fairness. Z-score standardization let me establish how the rubrics for tasks in my class compared to a state test.

Standardizing the scores requires data that basically establishes a norm. How “should” the class have performed based on how a large set of previous students have performed? One of the things that makes grade standardizing hard is that one does not always have access to this data. How should my kids have done compared to how all my previous kids have done? Well, I saved my data.

I taught French for the first thirteen years of my career (plus two years later on) and then the other eighteen years I taught social studies. Now, mind you, I’m not a person to save a lot of stuff. My classroom was always pretty bare and I threw out stuff I wasn’t using. But data, that’s something I like to save. I have hard copies of my final grade sheets for all my students from 1994 to 2013. I also have all my Regents results (for those of you not in New York State, “Regents” are standardized state tests in different subjects). Permit me to share my data with the reader. This data will give you the mean and standard deviation on population sizes of around 100 (between 92 and 100) for French grades 8-10 and for social studies grades 8-11. See below.

I learned to do this with the help of my colleague in the math department, to whom I am grateful for answering a lot of my questions over the years and helping me learn basic statistics. I am a computer programmer and I wrote an app to do the calculations. It’s available for free and I invite you to use it. Just enter your class’ test scores, then the mean and standard deviation of the standard test you’re standardizing the scores to. The app generates a table of standardized scores and some statistical information.

A good example of using this is when I was teaching Global Studies 10 and US History 11. The new New York State Regents exams in these subjects have stimulus-based multiple-choice questions. These are hard for students at first. I had them do one each unit as a test. Standardized scoring let me modify their scores so that their grades were not harmed and their confidence was preserved. So here’s the beauty of the standardized scoring: the method sets the mean of the current task to that of the standard, then adjusts everyone’s score using standard deviations. As the class improved on this task month by month, the class average approached the standard mean, so the grades were affected less and less. In effect, the standardized scoring was a useful set of training wheels that naturally disappeared once the class met the normal performance level!
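For readers who like to see the arithmetic, here is a minimal sketch in Python of the procedure described above. It is not the code of my app. The function name, the sample scores, and the reference mean of 80 and standard deviation of 10 are all hypothetical, and I am assuming the class's own mean and population standard deviation are used to compute each z-score.

```python
from statistics import mean, pstdev


def standardize_scores(scores, target_mean, target_sd):
    """Rescale raw scores so the class mean lands on target_mean.

    Each score becomes a z-score (its distance from the class mean,
    measured in class standard deviations) and is then mapped onto the
    reference scale defined by target_mean and target_sd.
    """
    class_mean = mean(scores)
    class_sd = pstdev(scores)  # assumption: population SD of this class
    if class_sd == 0:
        # Everyone earned the same score, so there is no spread to
        # rescale; each student simply receives the reference mean.
        return [float(target_mean) for _ in scores]
    return [
        target_mean + ((score - class_mean) / class_sd) * target_sd
        for score in scores
    ]


# Hypothetical class that averaged 60 on a hard task, rescaled to an
# illustrative norm with mean 80 and standard deviation 10.
raw = [40, 50, 60, 70, 80]
adjusted = standardize_scores(raw, target_mean=80, target_sd=10)
for before, after in zip(raw, adjusted):
    print(f"{before:>3} -> {after:.1f}")
```

When a class's own mean and spread already sit close to the reference values, the adjusted scores come out nearly identical to the raw ones, which is the training-wheels effect. Whether to cap adjusted scores at 100 is a separate choice that this sketch leaves out.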

Click here to shop for stimulus-based multiple-choice questions arranged by topics for Global Studies 10 and US History 11

A number of my education courses back in the ’80s were kind of useless. Hopefully, teacher training is better today. (I only had a one credit course in behavior management theory! Sheesh!) Statistics would have been a good course for me because I used it so extensively in my career. Readers who are interested might subscribe to InnovationAssessments.com to see the other statistical apps that you might find useful.

Basic Proficiency: A Classroom Differentiated Instruction Model

In the early 2000s, talk about differentiated instruction was picking up steam. Whether or not it was just another bandwagon remains to be seen, but it strikes me that differentiating middle and high school social studies classes is not only de rigueur right now but also the right thing to do. Like anything else, how this is done is worth considering.

As a beginning teacher in the early ’90s, my unconscious goal was to teach the course so every kid would aim to get an A. The unspoken, unchallenged notion seemed to have been that every kid who works hard enough can, and therefore should, get an A. Disabusing myself of this notion improved my teaching practice and, therefore, my students’ progress.

The concept begins with the idea that the average student across the world earns probably a C or C+ in most courses. That is, after all, the definition of average. This is really the goal of instruction for the majority of students. The mandate is to pass. I was greatly influenced by a paper from 1993 (Dempster, F. N. (1993). Exposing Our Students to Less Should Help Them Learn More. Phi Delta Kappan) in which the author argued that limiting the amount of content delivered will increase how much students retain. Much instruction seems to be guided by the notion that if we inundate students with knowledge they will scoop up more of it, as much as they can. The opposite is the case. I offer the analogy of a paper grocery bag that, when overfilled, breaks at the bottom so that all the groceries are lost. It is probably a question of cognitive load, or overload, as the case may be, that inhibits encoding of information into long-term memory. The “Basic Proficiency” curriculum is a parallel set of modified tasks for the regular classroom that may be accessed by anybody at any time, whether in special education or not.

The beneficiary of this program is the student who struggles to get that C+ in the course. Their performance and satisfaction are enhanced by offering less: more manageable chunks of data to process, integrate, and retain.

Click here to shop my TpT store for differentiated curriculum materials for social studies grades seven through ten.

So that’s the theory. In practice, it calls for a lot of preparation. I selected key assignments for modification: the reading task, the multiple-choice quiz, a writing task, and capstone options. The system worked thus (scroll to the bottom for sample parallel curricula): students electing basic proficiency needed to declare their intention at the start of a unit. Then as the unit progressed, they accessed the modified versions of each assignment. When I moved my course materials completely online to go paperless in 2019, students changed from pulling modified assignments from a different folder to accessing a different set of links at Innovation Assessments.

Fairness

I began developing this strategy about 2007. There are a number of key things I learned along the way. The first was to address the issue of fairness. A reasonable critique of differentiated instruction, and an element of this practice that was often overlooked by its proponents in the beginning, was the importance of ensuring that assignments that were different were actually of equivalent value. If student A is doing less than student B, how can student A reasonably expect to earn the same marks as student B? Furthermore, there were cases where strong students chose basic proficiency because they wanted an easy grade. Both of these problems were resolved by setting limits on the maximum score a student could earn on a modified task.

One of my friends in the special education department once made the case that a student with disabilities who was doing their best within their ability should be entitled to an A. I took a more conservative position on this, maintaining that the value of a work product was little influenced by the effort of the producer. There were valid and reliable ways of measuring the quality of student essays and the criteria were unaffected by ability.

Differentiated Reading

I would suggest that one of the key features of this plan, one which I embraced with some reservations at first, was to offer different levels of textbook reading for students on the basic plan. I was able to find history textbooks at a fifth-grade reading level for all my classes (I was teaching US history in grades seven and eight and global history in grades nine and ten). In advance of the school year, I selected page numbers of articles that mirrored what was to be assigned from the standard textbook. Students processed text in my classes using one of two methods, the “Five and Three summary” (blog post coming soon on that) or Cornell Note Taking.

It should be noted that these texts contained about half as many words in a larger font and less than half the information of the standard text. Students choosing the easier reading were also choosing to learn less content. While they may learn some of the missing content from other activities such as my lecture, they were still having less served up to them.

Many assignments were the same for everybody. Each unit progressed through the same type of activities, key elements of which were offered on a modified basis. Management considerations dictated that students had to choose one package or another, Standard Inquiry or Basic Proficiency, on a unit-by-unit basis. They could not choose on a task-by-task basis. It was not possible to manage it.

Don’t Worry: There was plenty of enrichment for advanced students.

In the spirit of differentiation, I maintained a collection of college level books my ambitious learners could choose for their reading assignments. I had developed rubrics for more sophisticated versions of our essay work so students could attempt the next grade level of work. Differentiating for these students was done on a task-by-task basis and less formally, but being self-directed scholars they managed this mostly themselves. I only needed to provide the materials and encouragement to challenge them.

So How Did It Go?

I carried out the plan for a little over ten years in all my classes, grades seven through ten social studies. I found some interesting things. Firstly, I discovered that weaker students who chose basic proficiency in grades seven and eight mostly moved to standard inquiry by grades nine and ten. They tired of the maximum score limitations and they developed the skills to approach academics more effectively by having materials at their ability level to work on. Secondly, I discovered that some students who would be candidates for basic proficiency would sometimes choose standard inquiry if they liked the topic. The American Civil War was often one that had most people doing standard work. The effect of this was to give the weaker students confidence.

In 2012, I did a study of student progress on the plan to see whether I wished to continue it. Results were strong enough to continue the practice.

But then I could not do it anymore…

In my last year teaching social studies before retiring, I had to discontinue the program. The reasons were practical. My course assignments and rosters increased during this time period. A big disappointment for me was the loss of this extra remediation time in my schedule. I lost my remediation periods in the schedule in favor of teaching more courses (we taught six different subjects / grade levels where I worked). This showed me that modifying the work was sometimes not enough. Some students needed more time with their teacher outside of class and being denied this was a serious blow to my program. For this kind of differentiated instruction to realize its full benefit, it must be paired with some remediation time outside class even if only thirty minutes a week.

When I did a study of the work submission rates of my students during the remote learning of the pandemic, I discovered that there was a huge drop in work submission and homework completion overall starting after 2017 when my remedial class periods were cancelled. I invite the reader to return in the future and read my blog post on my experiences teaching in the pandemic. Pertinent to this discussion, the detrimental effect of eliminating extra help for my students was demonstrable over time.

I present this basic proficiency idea to the reader as a possibility they might consider for their classrooms. I found it to be a recipe for success, especially when paired with appropriate remediation. One needs to bear in mind that even when the assignments are all prepared in advance, there is a significant investment in time for management of the plan and for scoring a wider variety of assignments. Technology can help a lot with this (like the Innovation Assessments learning platform!), but schools are advised to provide teachers who do this enough planning time and student contact time to make it happen. It is rewarding for the learner and well worth the investment.

Sample Parallel Curricula

Click here to shop my TpT store for differentiated curriculum materials for social studies grades seven through ten.

The Case for Prioritizing Debate for Critical Thinking in Secondary Social Studies

When I switched from teaching French to social studies in 2004, one of my first projects was to develop lesson plans for formal debate and mock trials for my classes. In time, these became a centerpiece of my units, second only to primary source work.

Maybe my interest in debate comes from my own youth. I was sent to parochial schools run by Franciscans who valued debate and the clash of ideas. While I was aware that I caused no end of irritation to my teachers by my willingness to play devil’s advocate in just about any discussion, my patient teachers helped me refine rhetorical practices in writing and speech. One of my religion teachers lent me a book on Aristotle to help me get my reasoning act together! Reasoned discourse was valued in our lessons and appealed to my innate rebel and, while I left high school without the belief system they sought to embed, I did leave well educated.

I value the critical discourse of debate in my middle and high school classes very highly, in the first place because it causes participating students to learn a greater quantity of history. In order to argue effectively, one needs facts and to understand the relationships between events such as cause-effect. Having to improvise arguments, or even plan and compose them for that matter, causes the student to develop schema of information that is long-lasting. In the second place, I hold rhetoric in high esteem because it develops the kind of critical thinking skills so necessary in a democracy. Citizens who are too easily swayed by propaganda or who consume social media without a critical eye are less citizens than they are pawns of powers seeking to use them. I think I left teaching French back in ’04 because I did not feel what I was doing was important enough somehow… but that’s for another post…

From my TeachersPayTeachers store: Click here for a set of rubrics and training manual for teaching debate.

Discussion Style Debate

The most basic type of debate is the “discussion style”. Teams sit across from each other at a table with a moderator at the head. They give timed, prepared speeches in turn and then engage in improvised cross-examination. The rules are fairly simple and many students came to really look forward to the debate. I required everyone to do this at first, but I soon learned it was best left as an elective unit capstone task for the willing and to offer other things for students who do not like public speaking.

Students who were fearful of public speaking needed rhetorical training as well. The next development was the online discussion. I coded an app at InnovationAssessments.com that worked especially well for moderated class discussions and streamlined the grading and scoring process for me. In every unit, there was an online discussion topic students had to address. Their assignment was similar to one I was given in some online college classes I took. They were to post their response to the prompt, giving two grounds for their position (sometimes I assigned them a position even if they held the opposing view). In step two, they were to reply in the opposing view to the student above them in the feed on the app. Finally, they were to go back later and reply in defense to the student in the class who offered them an opposing view. Once trained, students completed this assignment as a matter of course in each unit. During working periods in class, discussions would often erupt as students wrote and this was marvelous.

I gave formal lessons in rhetoric and identifying logical fallacies and I built a logical fallacy tracking function into the forum app so students could flag posts that contained logical fallacies.

From my TeachersPayTeachers store: Click here for a PowerPoint slide show for teaching rhetoric and logical fallacies.

An Elective Course in Rhetorical Strategies

A few years ago, I had the opportunity to teach “Debating Current Events”, an elective course in different rhetorical styles and skills. The course was assigned to me in response to a discussion at the Board of Education level about how the school can try to foster greater understanding and act as a block to rising polarization in society. The course syllabus opened with: “Debating Current Events gives students the chance to participate in the great clash of ideas of our democracy. But it’s not really just about clashing ideas: it’s about forming understanding of the opposing view that leads to appreciation of diverse opinions. Students will examine the historical and social context of emerging current events. They will become skillful consumers of information, learning to carefully examine sources and to read critically. They will understand persuasion not only as a strategy in their own discussions but as a tool to understand and evaluate media communication.” I had six brilliant and engaged students whose opinions fell across the political spectrum. It was a great way to start each day.

My course materials for Debating Current Events are available for sale at my store. Units of study are also sold separately. Unit 1: The political spectrum, tolerance and toleration, Robert’s Rules of Order (I give the class a great deal of control over our work using Robert’s Rules). Unit 2: Aristotelian rhetoric (does not include training in syllogisms) and Logical Fallacies. Unit 3: Toulmin’s Rhetorical Method. Unit 4: Rogerian Rhetoric.

The Mock Trial

By far the most popular activity in my classes was the mock trial. In this simulation, students are placed in a historical context as actors in a murder mystery trial they write themselves. I do not know what first gave me the idea to do this, but it was one of the first units I designed when I started teaching social studies. The unit begins with the election of judges and attorneys. The rest of the participants are randomly assigned to teams to role play as witnesses for the defense and the prosecution. I would seed the development of the historical fiction story we were going to write as a class by giving a basic scenario. The murder was placed at some time in history we were just learning about. Over three days, a story emerged, as attorneys and witnesses imagined their side of the story and the means, motive and opportunity were fleshed out. Judges completed an online mini-course in courtroom procedure during these days. The trial commenced on day four, with opening statements and the prosecution putting on its case. It took six to seven days to do this, so we only did one a year (although one year I had the opportunity to teach a half year elective class just in historic mock trials). Embedded in history, the stories we composed really stuck with us. Students would come back a decade later and comment on one of the trials we did and maybe some striking event on the witness stand. I invite the reader to return to the blog in the near future for a detailed account of how to teach an unscripted mock trial unit. To the point of this post, the four attorneys and three judges needed to be models for the class in the kind of evidentiary reasoning that I think we would all agree every citizen would benefit from possessing.

Click here to visit my store to purchase a Mock Trial Classroom Kit.

The Model House of Representatives

In the 2019-2020 academic year, I developed a Model House of Representatives unit for my US History classes. Like the mock trial, this unit plan was “modular”: able to be set in any historical time period. Students were trained in a basic version of one of the debate formats used by the United States House of Representatives after being assigned to political parties based on a survey of their personal political leanings. Parties elected their leaders and the majority was set in alignment with which party was in majority in the year in which the session was taking place. Members drafted bills, spoke on the house floor in debating a bill, etc. Crafting bills turned out to be an extremely useful activity in and of itself. Students were assigned a problem of the time period and asked to craft a law that would address it. I invite the reader to return to this blog in the near future for a more detailed account of how to use a Model House of Representatives activity.

Click Here to view Model House of Representatives unit materials in my TpT store.

I have often wondered whether an alternative career choice for me would have been as an attorney. I have an interest in law and justice that no doubt influenced me. But beyond that, these activities bring two important elements to the course in social studies: a deeper knowledge of historical context and the ability to reason well. A positive side benefit is in the ability to spot propaganda in social media, a lesson to be addressed in a future blog, so do stay tuned!

You can preview a set of lessons on consuming social media with a critical eye here and this lesson set is on sale at my TpT store here.

Teaching the NYS Global Regents Enduring Issue Essay

The first time I sat in on the regional scoring* for the new New York State Global History and Geography Regents, the other scorers and I had conversations about the difficulties teaching the enduring issue concept to students. Some waited until the end of the school year to practice these, since the essay prompt draws on documents across cultures and eras and calls on the student to observe some patterns. They argued that this could not be done during the year because it required many time periods. I think there’s a better method.

Learning to write the enduring issue essay calls upon the student to read sources, identify some common issue (as opposed to a theme), and then select three documents to combine with their own recollections of historical context to relate their conclusions. Frankly, I think it’s an outstanding task. I love that it calls upon students to draw conclusions from evidence, to synthesize for themselves. The class discussions we had while practicing these were interesting as students came to observe patterns I would not have thought of and to defend them admirably. Waiting until the end of the year during review sessions is a bad time to teach students how to do this. Almost all students whose papers I scored from other districts scored 2 out of 5 on these essays.

Bringing most of my students’ scores to 3-4 out of 5 came from committing ourselves to write one of these essays every ten weeks in Global 9 and 10. For those of you not familiar, in New York State students take Global History and Geography I in grade nine and the second half, part II, in grade ten. The state Regents exam now covers only the tenth grade course. I mention this because there is a temptation for Global 9 teachers to skip the enduring issue essay and wait until grade ten, just before the Regents. Permit me to suggest that this is a mistake and a missed opportunity.

Click here to visit my TeachersPayTeachers.com store where you can shop for enduring issue essay prompts for grades nine and ten.

The enduring issue essay was 45% of a student’s score on my ten-week interim exams. The first few in grade nine were heavily supported, as I coached students to bring up historical context to connect with the documents they chose.

The first challenge for students was to learn to distinguish a “theme” from an “issue”. “Movement of People” is a theme. Violent conflict caused by the movement of people is an issue. Students who have a more narrowly defined issue that societies have to address now have a clear direction for their writing. When students merely notice a theme, such as “power”, they fall into the trap of just proving that this was a “thing” because it appears in the three documents they chose. It may seem like splitting hairs, but it is a very important distinction and it matters in the quality of their essay (and therefore their score). The essays we wrote at weeks ten and twenty both included a lot of coaching on my part on formulating an enduring issue that was focused enough to lay the groundwork for an excellent essay.

The problem mentioned by my colleagues from other districts, namely that they felt they could not teach this until they had covered a lot of history and therefore not until the end of the year, is resolved by composing essay prompts only on the topics we have already studied. So on the week ten essay in Global 9 and 10, the documents I selected were only from the civilizations / time periods we had studied by that time. The reader is invited to browse my online store for essay prompts designed for different points in the course.

The next challenge for novice writers was to avoid the temptation to merely summarize what the document says. This goes against pretty much all their reading experience to date, where teachers demanded they restate what they read or answer questions on it in order to prove they understood. Writing this essay well demands that students draw upon their recalled background knowledge. The tendency to just summarize the documents was a very difficult habit to break. I came to advise students to spend no more than a few sentences of a paragraph summarizing the document and to spend the rest of that paragraph bringing in background history and stating explicitly how the document supports their issue.

My goal throughout the training is to get students to write a level 3 paper. I am tempted to teach everyone to shoot for a 5, but that is not reasonable. A score of 5 represents above-grade-level work. A score of 4 is reserved pretty much for those who recall a lot of history. Despite all our best efforts, our students on average do not recall a lot of history. So in my training I aim to polish a level of writing that is still above what I was seeing in the compositions of neighboring schools and that is doable given the memory of the average student. Students capable of 4’s and 5’s suffered no disadvantage from this: once they had perfected the method of identifying issues (as opposed to mere themes), selecting and interpreting documents, and bringing in historical context, all they needed to do to impress the raters was to dump a ton of historical knowledge in there.

This essay assignment is a strong feature of the new New York State Global Regents examination. It gives evidence of critical thinking and it promotes a rich classroom experience. Training in this should not wait for the end of the year and is best done throughout both grades nine and ten. The effort pays off and students often come to like this task, the latter being a surprising result of this training program.

* For the reader not familiar with scoring high school Regents exams in New York State, about a decade ago the state instituted regional scoring. They felt there was some kind of funny business going on when teachers were scoring their own Regents exams, so they mandated that we could no longer grade our own students’ work. Since I worked in a tiny district, I had to schlep off to another school where I would grade their exams and they would grade mine. The security was absurd: I could not even handle my students’ papers, lest I be accused of foul play. Each student’s essay was scored by two teachers trained on a rubric and on sample papers from prior field testing. If the two raters’ scores differed by more than 1 point, a third rater was called in. The system has some advantages, but in hindsight it seems a little unnecessary.

AI Won’t Make Teachers Obsolete

When I started teaching a long time ago, I never imagined that I would see a little laptop on each of my students’ desks, let alone something like ChatGPT. For a little less than half my career, I taught French. Something like Google Translate never occurred to me as anything more than science fiction. Yet here we are.

As with all technological advances, we’re going to pursue AI even though it wouldn’t be so bad if we didn’t. Human beings just can’t help themselves. Luddites smashed the textile machines that took their jobs, but innovation in manufacturing went on anyway. Self-restraint is not a human virtue we can ever maintain for long. Our inner drives, born in Paleolithic desperation, eventually will have their way. We cannot ban AI development because somebody, somewhere is going to do it and then those who didn’t will be at a disadvantage. All things considered, I think educators can look forward to the AI developments soon to be upon us with a sense of optimism instead of dread.

My interest in artificial intelligence grew from my interest in computer programming. When I started out learning BASIC in the early 1990s, I built a program that would block vulgar words in my students’ data entry fields. We were using pre-Windows DOS machines, 286’s donated from a closing Air Force base nearby. It occurred to me to try to make what I now know is called a “chatbot”. I tried devising software that would converse with me in simple sentences and that, if it didn’t know how to respond, would ask me and then store my answer as an option for future responses. The reader will not be surprised to find that this project did not work. In hindsight, I was way, way out of my league in attempting something like that. Besides, the computing power necessary for machine learning, let alone the troves of digital data needed to train an AI, did not exist in 1994. But I am content to know that I had the basic gist of the idea of machine learning that real engineers would eventually put to use.

About five years ago, I started exploring the idea of automating some of my grading. I was teaching social studies and I would assign summarizing as the way students were to process textbook articles. I am convinced this is a far better method than having students answer questions on the text they are reading. The problem was that I had about a hundred students across six different grade levels. The beginning of a unit would generate several hundred summaries a week to grade. Could I write an app that would grade my students’ summaries? The answer turned out to be a pretty decent “yes!”

AI-Scored Summaries

The AI grading assistant here at Innovation Assessments was trained on 500 human-scored summaries. The algorithm looks at eleven features of the student’s text and compares them to the same features of up to seven model summaries that scored 100%. These features include things like a Flesch-Kincaid readability measure, word count, common proper nouns and verb phrases, and statistical comparisons like cosine and Jaccard similarity. Before analysis, the app removes stop words, reduces words to their root form (lemma), and reduces many words to a common synonym (so the app can recognize the same idea written in slightly different wording). To establish the scoring algorithm, I charted these comparisons in a spreadsheet and adjusted them until the AI scored the work about as I would have most of the time.
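For readers curious about what cosine and Jaccard similarity look like in practice, here is a minimal sketch in Python. It is not the code that runs at Innovation Assessments. The tiny stop-word list and the function names (`tokens`, `cosine_similarity`, `jaccard_similarity`, `best_match`) are assumptions made for illustration, and the sketch skips the lemma and synonym steps entirely.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; a real list is much longer).
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "was", "were", "is", "it", "by"}


def tokens(text):
    """Lowercase the text, split it into words, and drop stop words."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]


def cosine_similarity(a, b):
    """Cosine of the angle between the word-count vectors of two token lists."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[w] * cb[w] for w in ca.keys() & cb.keys())
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def jaccard_similarity(a, b):
    """Size of the shared vocabulary divided by the size of the combined vocabulary."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


def best_match(student, models):
    """Return the (cosine, jaccard) pair for the model summary with the highest cosine score."""
    s = tokens(student)
    return max((cosine_similarity(s, tokens(m)), jaccard_similarity(s, tokens(m))) for m in models)
```

Both measures run from 0 (no words in common) to 1 (the same vocabulary). Cosine similarity takes into account how often each word is used, while Jaccard similarity only asks whether a word appears at all, which is one reason a scorer might look at both.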

I was very pleased with the results. Scoring student work became a lot faster. The app brings up the AI score estimate and I check it to confirm. This is why I call it an “AI grading assistant”: it still needs a human supervisor. As time went on, though, I came to trust the app more and more. When I set up the assignment, I would enter my own summary of the target text from the start. Once students completed the task, I first scored the work of students who usually get 100%. I could add up to six of these to the “corpus”, which is the body of model text the software uses to judge. The next step was to run the AI grading assistant on the submissions from the rest of the class.

The scoring of summaries in this way required one or more human-composed models. Next, I wondered whether I could write an algorithm that would summarize a text. I am not able to write code that can write “in its own words”. Instead, my little bot mainly extracts the first sentence of each paragraph and then some selected other sentences verbatim if they meet certain criteria (such as the presence of key words identified by frequency in the text). I had my doubts about how effective this would be. Surely, it would lose some important meaning sometimes since it was a formula and not really “reading” like a human would. Well, get this …

… When I submitted the algorithm-generated summary into the AI grading assistant, which evaluates it based on comparison to human-composed ones, it scored 100%. Every. Time.
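The extractive approach described above (keep the first sentence of each paragraph, then add other sentences that contain frequent key words) can be sketched in a few lines of Python. This is a simplified illustration, not the bot itself. The stop-word list, the naive sentence splitter, and the `top_n` and `min_hits` parameters are all assumptions; the real selection criteria are not spelled out here.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; a real list is much longer).
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "was", "were", "is", "it", "by"}


def split_sentences(paragraph):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s.strip()]


def key_words(text, top_n=3):
    """Return the top_n most frequent words in the text, ignoring stop words."""
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]
    return {w for w, _ in Counter(words).most_common(top_n)}


def summarize(text, top_n=3, min_hits=2):
    """Keep each paragraph's first sentence, plus any later sentence
    that contains at least min_hits of the key words. Sentences are copied verbatim."""
    keys = key_words(text, top_n)
    summary = []
    for paragraph in re.split(r"\n\s*\n", text.strip()):
        for i, sentence in enumerate(split_sentences(paragraph)):
            hits = len(keys & set(re.findall(r"[a-z]+", sentence.lower())))
            if i == 0 or hits >= min_hits:
                summary.append(sentence)
    return " ".join(summary)
```

Because every sentence in the output is lifted verbatim from the source, a summary built this way shares much of its vocabulary with a careful human summary of the same text, which may help explain why a similarity-based scorer rates it highly.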

AI-Scored Short Answer Tests

Another challenge of teaching social studies with a lot of reading and writing was the large volume of grading student work in the form of short answer tests, particularly document-based analyses. Could some similar software assist in scoring short answer tests?

The app development method was about the same: I had hundreds of student work samples to analyze. Using some similar methods as for grading summaries, the new app allowed the teacher to add up to five versions of full-credit answers into the corpus for comparison. One feature that was not examined in the summary AI grading assistant was the degree to which a student’s writing was analytical (as opposed to merely descriptive). This project went fairly well – well enough for an amateur programmer and accurate enough such that the short answer scoring was a huge help to me. Click here to read more about development of an algorithm to measure the degree of analysis in a student writing sample.

The AI-assisted scoring of short answer tests was most successful at evaluating responses that had a limited range of credit-worthy answers. The AI performed well for questions like “What caused the fall of the Roman Empire?” The AI did not perform well on questions such as evaluating the reliability of a primary source, since the range of possible correct answers would have required a lot more models to train on than the five the software allows. Nonetheless, the short answer AI grading assistant saved me tons of time. It allowed me to maintain a teaching method that was very time consuming by lightening the workload so I could spend my time in curriculum development.

Opportunities for AI to Coach Students

I came to have so much confidence in the AI grading assistant that I built in access for my students. Students composing their summaries at InnovationAssessments can access the coach, which gives them a pretty accurate score estimate while they write. This takes a little mystery out of “how am I doing?” and helps develop strong summarizing skills. That’s reading comprehension and basic composition.

The AI grading assistant is also an effective coach in short answer exercises. Enabling the coach for a practice run at short answer tasks permits students to have instant estimates of the quality of their work submissions and the AI offers little hints and suggestions drawn from the corpus of model answers on which it was trained.

We’re Not Being Replaced Yet

AI used in the way described here did not replace me. It still required supervision. I would assert that it enhanced my work, allowing me to use a better teaching methodology that was not very practical otherwise. The way I wanted to teach was really a recipe for burnout in the context of my particular teaching job. Assigning three summary tasks to a hundred students over a two week period, well, the reader can do the math. AI assisted scoring let me do the best job I could without burning myself out. That is a great reason to continue AI development and research, even for amateur programmers like myself.

There is a very solid reason why AI will not replace teachers: AI cannot develop the kind of personal rapport with students that has always been the foundation youngsters need in order to learn. AI cannot form emotional bonds with people. If the day ever comes that it can, then we will have something more than artificial intelligence; we will have a consciousness.

Can Video Replace Textbook?

In the 2021-22 academic year, I finally had the opportunity to teach a section of Regents United States History and Government. It’s something I had looked forward to after teaching US History in grades seven and eight for nearly two decades. The students were awesome. Because I was working in such a small school, I had taught them all since they were in eighth grade. We worked together to devise a course that would be most suitable for them, and in the process we discovered that classroom textbooks can be replaced by video lessons of a certain design. News to me!

My teaching practice for social studies always entailed a strong commitment to promoting reading. I am sold on the idea that middle school students, recently relieved of their elementary reading instruction, needed continued instruction in reading and that the content areas were a fine place to do that. I invite the reader to my other blog posts on reading in the content areas. The very idea that I would abandon this commitment was a little hard to swallow.

Check out this post “Teaching with Video: Three Paths to Engagement and Accountability” on research-based video teaching.

The school year began with lessons devised in my customary fashion. I selected textbook articles that gave good summaries of the US history my students needed to learn. I offered articles at a lower reading level for slightly reduced credit for those who do not yet read on grade level. My students were given a choice of either summarizing the selected text in a specific format or doing Cornell notes on the pages. This was my reading practice, developed over almost twenty years.

I should tell you about the class, warning you first of my bias about them since they stand as one of my favorites. The class was homogeneously grouped. Nearly all were enrolled half day in a vocational education program. Their path to earning a living was to be in the trades: heavy equipment operator, welder, mechanic, machinist, forestry management, etc. Those who were still pursuing a traditional educational program all shared a general distaste for traditional academic work. The textbook – summary or Cornell notes thing had never gone over well with them so when we began they were resigned to it at best.

Click here to Visit my TeachersPayTeachers store if you want to purchase access to my social studies enhanced videos.

In the system of my units of study, the reading days alternate with presentation days. On presentation days, I opened the class with a lecture I would classify as “semi-formal”. That is, I brought to it a list of ideas I wanted to discuss and a general plan of presentation, which I put on the board in a few words, but it was not a “formal lecture” in the traditional sense. These lasted around 15 minutes, after which students would engage with a video lesson that was more formal. The videos also ran about 15 minutes in length. The semi-formal presentation and the video lecture amplified what was in the text and zeroed in on important or interesting things that were not.

Click Here to Try one of the Enhanced Video Lessons for yourself. Go to TestDrive. Use the code 6MNU-9FBX-A11136Z-934-JON.

Presentation days were well received. I am known to be a good presenter and we often had good discussions. The video lessons also were well-received. It is this that actually came to replace the textbook.

I would like to term these “enhanced video lessons” to distinguish them from a lesson plan that merely asks students to watch a video. The enhanced video lesson includes embedded questions, in order of appearance in the video, which focus on what is important to remember or which ask students to reason out something important from the video. I made the videos myself. They were simple voice-overs of slideshows. The embedded questions could be in both multiple-choice and short-answer formats. The multiple-choice questions were auto-corrected so students could see how they did right away. In addition, once the class had finished, I could open access to the answers so students could see the correct ones. This could be used as a study guide before tests later on.

[Figure: Class averages on assessments throughout the year.]

The enhanced video lessons run here at InnovationAssessments.com. They are called “Etudes”. Thanks to the input of these students, an effective and elaborate learning tool evolved!

The test results were mostly consistent throughout the year and I was very satisfied with them. The class pretest average was 37 and it moved to 67 by the final exam. The class average on unit tests (numbered 11.1, 11.2, etc.) was mostly very good. I used only questions drawn from old New York State US History and Government Regents exams so that I would be sure the questions measured the standards and had been field-tested.

Click Here to check out the final exam. Use code: L5VC-JMM5-A11812Z-8452-JON

I do not recall how the suggestion came up to abandon the textbook assignments. Maybe I felt like giving up since so few actually did the work on time, even though I did not assign homework in the class. The reading class periods were often punctuated by jokes or horsing around that I have no doubt was inspired by a collective wish to put off a task they did not like doing.

A younger version of myself would have clamped down on this misbehavior with stern reprimands and consequences. This older self takes a few progressive discipline steps first, one of which is to examine the lesson itself and the learners. There were reasons I could challenge the concept of the textbook assignment lesson plan. Firstly, these were not middle school students who needed continued reading support. These eleventh graders were nearly done with their program and it is well known that reading instruction has little impact on most older learners. Secondly, the tests showed they were learning the material despite not turning in the reading.

[Figure: Class averages from the course pre-test, through the unit tests, to the final exam. Textbook assignments stopped after unit 2.]

It could not have been the semi-formal lectures. Though popular and often entertaining, these did not provide the breadth of information on the test. So it must have been the video lessons.

So sometime in October of that year, I offered a deal: the class would take a pretest at the start of each unit and then a post-test. The test would have items from old New York State US History and Government Regents exams. I said that, as long as the class showed adequate progress from the pretest to the post-test and as long as the standard deviation was fairly small (showing the class mostly performing clustered together), then I would not require any textbook reading.
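The deal rests on two numbers: the gain in the class average from pretest to post-test, and the standard deviation of the post-test scores. A minimal Python sketch of that check follows. The function name `textbook_waived` and the thresholds `min_gain` and `max_spread` are hypothetical; the actual cutoffs for "adequate progress" and "fairly small" are not stated here.

```python
from statistics import mean, pstdev


def textbook_waived(pre_scores, post_scores, min_gain=20.0, max_spread=10.0):
    """Return True when the class average rose by at least min_gain points
    and the post-test scores are clustered within max_spread points
    (population standard deviation, since the class is the whole population).
    Both thresholds are assumptions chosen for illustration."""
    gain = mean(post_scores) - mean(pre_scores)
    spread = pstdev(post_scores)
    return gain >= min_gain and spread <= max_spread


# Made-up scores for a class of four, for illustration only.
pretest = [35, 40, 36, 37]
posttest = [65, 70, 66, 67]
print(textbook_waived(pretest, posttest))
```

A small standard deviation matters because a healthy class average can hide a split class, with a few high scorers pulling up the mean while others fall behind.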

It worked! Throughout the rest of the year, the assessments showed that the enhanced video lessons were delivering the content that students had to retain to meet the New York State standards for social studies.

This is not to say that reading was abolished. The New York State Standards for social studies are very heavily document-based. My courses had been moving in this direction for years. Our reading experiences in the course now entailed excerpts from numerous primary source documents. This required more scaffolding for these students to be able to grasp, but they seemed willing to persevere through the difficulties with my assistance.

I would not advocate for a wholesale elimination of textbooks in secondary social studies classes. I know a strong case can be made that in middle school the kind of reading tasks described above serve a very important function. But in situations like this eleventh-grade class, enhanced video lessons proved to be a viable alternative. Granted, I had some advantages: I had been making video lessons since 2011 and I had over 400 of them across US and global history subjects. I am also a programmer, so I could code apps that did exactly what I wanted. But I think teachers without these things can still adopt this strategy. YouTube has many good history lessons made by teachers. InnovationAssessments.com offers the Etude app that I wrote for a very reasonable annual fee.
