Wisconsin Science and STEM Education: July 2016

Monday, July 25, 2016

Creating Rubrics for Performances Tasks Aligned to NGSS – Part 1

Evaluating current efforts to move science education forward, such as that framed by the Next Generation Science Standards, requires “assessments that are significantly different from those in current use” (National Academies report, Developing Assessments for the Next Generation Science Standards, 2014). Performance tasks in particular offer significant insights into what students know and are able to do. “Through the use of rubrics [for such tasks] … students can receive feedback” that provides them a “much better idea of what they can do differently next time” (Conley and Darling-Hammond, 2013). Learning is enhanced! Building from a vision of the skills and knowledge of a science literate student, rubrics can allow students (and other educators) to see a clearer path toward that literacy.

Of course, using rubrics with performance tasks is generally a more time-intensive process than creating a multiple-choice or fill-in-the-blank exam. In order to be used as part of a standardized testing system or as reliable common assessments, scoring these types of tasks requires more technical considerations:

Tasks must include a clear idea for what proficient and non-proficiency looks like;
Scoring must involve multiple scorers who all have a clear understanding of the criteria in the rubric; and,
Designers need to develop clear rubric descriptors and gather multiple student anchor responses at each level for reference.

Often, when teachers grade student projects or other performance tasks on a rubric, it looks something like this (click for larger image/pdf):

While this example comes from a 4^th grade classroom, secondary rubrics often have similar characteristics.

Considering alignment to the NGSS, and effective rubric qualities in general, there are several changes I’d make:

What’s the science learning involved? They appear to be drawing or making a model. About what? What understanding would a proficient model of that phenomenon display?
What would mistakes look like in a model? I’m a little worried that the “model” here is just memorizing and recreating a diagram from another source. Students’ models should look different. There might be a mistake in not including a key element of a model to describe the phenomenon, or not noting a relationship between two of those elements. Types of mistakes indicating students aren’t proficient should be detailed.
Neatness and organization are important, but I question the use of those terms as their own category. I would connect that idea to the practice of scientific communication. Does the student clearly and accurately communicate his or her ideas? Do they provide the necessary personal or research evidence to support their ideas? The same is true within the data category. I’m more concerned about whether the students can display the data accurately and explain what the data means, than whether they use pen, markers, and rulers to make their graphs…
It’s not a bad idea to connect to English Language Arts (ELA) standards—done here with the “project well-written” category. At the elementary level in particular that makes sense; however, I’d want to ensure that I’m connecting to more, actual ELA standards, such as the CCSS ELA anchor standards for writing, which emphasize ideas like using relevant and sufficient evidence. At upper grades, I’d also want to emphasize disciplinary literacy in science (e.g., how do scientists write?) over general literacy skills.
This rubric actually does better than many at focusing on student capacity rather than behaviors. I see many rubrics that score responsibility and on-task time, rather than scientific skills and understanding. Check out Rick Wormeli’s ideas.
What does “somewhat” really suggest? Is the different between two mistakes and three mistakes really a critical learning boundary? I see a lot of rubrics that substitute always, sometimes, and never for a true progression of what students should know and be able to do. I also see many rubrics that differentiate rankings by saying things like no more than two errors, three to five, errors, more than five errors. What do we really know from that? What types of errors are made? Is it the same error multiple times? What exactly can’t the student do in one proficiency category vs. the next? I really can’t tell by just saying two vs. four errors.
Use anchors for clarity – this rubric notes that models are “self-explanatory” and that sentences have “good structure.” Do students really know what that means? Have they seen and discussed a self-explanatory model vs. one that is not self-explanatory? If there is space, an example of a model or sentence meeting the standard could be embedded write into the rubric in the appropriate column. If there isn’t a space, a rubric on a Google doc could link to those types of examples.

Looking around, really looking, I have found very few rubrics that make an attempt to align to the NGSS. I suspect some people are still nervous about sharing. In my next blog post, I’ll share an NGSS-aligned, three-dimensional rubric I created and detail the process involved. I’ll also share some resources from other groups tackling this work.

Monday, July 18, 2016

Formative Assessment and the NGSS – Part 2

A new colleague here at the Wisconsin Department of Public Instruction, Lauren Zellmer, read through my last blog on formative assessments. Her question was, “What specifically do teachers do with the information once they’ve conducted these formative assessments?” Great question! I decided to write a Part 2 of the formative assessment blog, where I’ll share a few more details for possible instructional next steps based on hypothetical results.

First teacher – Rubric on modeling

In the first example, the teacher collected whole-class and individual information using a modeling practice rubric, as he walked around asking probing questions and jotting down student names across the rubric continuum. After some reflection in pairs, he had a few students share their models in order to highlight key aspects of the practice, which will help build capacity in all students. Depending on students’ level of understanding, further support might include the following:

If he found that most students did not fully understand this element of modeling (mostly 1’s and 2’s), he could provide scaffolded modeling instruction in the next part of the lesson requiring modeling. He would prepare a modeling handout which lists possible elements of the model and requires students to note whether to include those aspects and why. Students already proficient would complete models without that scaffold.
If he found that understanding is fairly varied (largely 2’s and 3’s, with some 1’s and 4’s), the teacher could provide further time for group reflection and sharing. That further reflection would best happen immediately—after a few, selected groups shared elements of their models, the class could get back in pairs to improve their models based on those ideas. And, next time modeling occurs in a lesson, the teacher could repeat a similar in-depth process, like the first time, to continue to provide significant support.
If he found that most students proficiently performed this aspect of the practice (largely 3’s, with some 2’s and 4’s), he could make a note of which students are still struggling. The next time modeling happens, instead of moving around the class generally to assess where students are at, he could narrowly focus his support and questioning on those students. He might provide them some in-depth small group help, with scaffolds provided. He could also pair them with proficient students where he knows they won’t just be given answers, but meaningfully supported in their learning.

Second teacher – Testable questions
In the second example, the teacher had students write questions about biodiversity while on a nature walk, thus collecting whole-class and individual information on whether students could write testable questions. The next day, with the whole class, she shared useful “yellows,” where students’ questions needed more work, and “greens,” where students’ questions were testable. She also made some notes in a file as to where the class seemed to be overall with this skill. Depending on students’ level of understanding, further support might include the following:

If she found that most students could not write a testable question (lots of “yellow”), she should do more than read through notable yellows and greens. After that review, students could receive more practice in a guided, whole class discussion, evaluating a series of questions, noting whether or not they’re testable, and fixing them to make them testable. Then, she could ask students in small groups to collaboratively revise their questions to make them testable. These groups would include a student who did proficiently demonstrate this skill where possible.
If she found that about half could write testable questions and half could not, I would again suggest she have students rewrite their questions in small groups. For students still struggling after a round-two attempt, she could support them in a small group with an activity like that noted above (evaluating examples together). The remainder of the students might begin some independent research or brainstorming on the design of the investigation to answer their questions.
If most could write testable questions, she might provide individual help to students still struggling with their questions after the discussion of the greens and yellows. Those students could use the reviewed questions as models to revise their own, with the teacher ensuring they can explain why their original ideas were not testable.

Formative assessment is “assessment for learning”. Teachers need specific ideas as to what strategies they’re going to implement depending on the results of the formative assessment. They also need some way to record progress, not only relying on “gut feelings” and memory (my brain and gut, at least, aren’t that reliable). Having some class assessment notes on a page or a rubric can be a quick means to do so. Finally, coming together with peers to discuss the data and possible strategies is critical to moving forward as a science department (and community).