Featured Speakers:
- Jenny Sarna, Project Director, Science and Engineering at WestEd
- Jill Wertheim, Senior Research Associate, Science and Engineering at WestEd
Host:
- Aimee McCarty, Communications Support Specialist, WestEd
Aimee McCarty:
Hello, everyone, and welcome to the 10th session of our Leading Together Series. In these 30-minute learning webinars, WestEd experts are sharing research and evidence-based practices that help bridge opportunity gaps, support positive outcomes for children and adults, and help build thriving communities. Our topic today is What Leaders Need to Know about Science Assessment. Our featured speakers today are Jenny Sarna, director of NextGenScience, and Jill Wertheim. Jill is director of SCALE Science. Thank you all very much for joining us. My name is Aimee McCarty, and I’ll be your host.
Now, before we move into the contents of today’s webinar, I’d like to take a brief moment to introduce WestEd. As a nonpartisan research, development, and service agency, WestEd works to promote excellence, improve learning, and increase opportunity for youth, children, and adults. Our staff partner with policymakers, district leaders, school leaders, communities, and others, providing a broad range of tailored services, including research and evaluation, professional learning, technical assistance, and policy guidance. We work to generate knowledge and apply evidence and expertise to improve policies, systems, and practices. Now I’d like to pass the mic over to Jenny. Jenny, take it away.
Jenny Sarna:
Thanks so much, Aimee, and thank you so much for joining us today for our webinar. I’m Jenny Sarna, again, and I’m joined by my colleague, Jill Wertheim, and we are so excited to talk to you today about this important conversation, What Leaders Need to Know about Science Assessments. And in our work at WestEd, Jill directs the SCALE Science project, and I lead the NextGenScience project. And together we focus on this question of supporting state, district, and school leaders with improving science education systems.
And you’ll see here a few areas we work in around instructional materials, professional learning, and, finally, high-quality science assessments. And all of these system areas are really dependent on the coherence within and between those system areas as well as this question of alignment. Alignment to standards and alignment to research-based teaching and learning. And so why we’re gonna focus on instruction and particularly in assessment today is because we know it matters, but we know it’s also a really tricky piece of the puzzle.
And so today we’re gonna take a look in our work at this question of innovative science assessments and why they matter, what they actually look like, and then how leaders can work with these types of assessments to improve their systems. And we know as leaders you play a critical role in the design of these local assessment systems, so we’re really gonna talk specifically about the question of local assessment systems. And we’re really excited to share with you some of the work we’ve been doing in this area.
And as we learn today, we really want you to think about your own context and any ways that something new from assessments might help you improve your work. And so, Jill, I am gonna turn it over to you because I know you and the SCALE Science team have been doing some pretty exciting work nationally in the area of science assessments. And would you tell us a bit more about what you’ve been working on?
Jill Wertheim:
Yes, absolutely. Thank you, Jenny. When I’ve been talking with teachers and district and state leaders about assessment, it’s pretty clear no one’s satisfied with where we are right now. Students are getting tested a whole lot, using a lot of instructional time. And the value of those assessments just isn’t clear for how it’s benefiting teaching and learning. And so in our work at SCALE Science, we’re trying to figure out how to change how assessment is done. So we try a new innovation and talk with teachers and students about how it’s going.
How did that assessment help us in achieving this vision? Was it meaningful? Was it enjoyable? And of course, how did it help them learn? And so, I wanna share a brief video. I took just little snippets of some reflections that some students, middle school students did, and a teacher, after trying a performance assessment that we developed. And I want you to really pay attention to what stood out to them, what was important to them about these assessments.
Student 1:
We had to think. It wasn’t just, like, multiple choice.
Student 2:
It’s easier, it’s more in depth, and you learn more from it.
Student 1:
The test shouldn’t just be about reflecting over what you learned, but you should also learn and gain something from the test.
Student 2:
When I had to come up with my own ideas instead of, say, the multiple choice questions, for me it was, well, I liked it, but for me it was hard because I have a problem that I really always want to make sure that I’m doing it right. And when I’m doing it by myself without the multiple choice questions, I don’t know what would be wrong with it and how this is gonna be graded.
Student 3:
Each of us tries to put something of our own.
Student 4:
We have to, like, get a lot of research, and then we have to write it down and stuff. And then we always have to check it again to make sure if, like, we got something right or not.
Student 5:
Sometimes you have to put, like, a little bit more effort into them.
Student 3:
Yeah.
Student 5:
Like, yeah, but they become, like, fun at the same time.
Student 4:
And so if we could actually prove how, like, how much we have learned, then we could also show it in the model of the biosphere.
Student 5:
Also it’s bringing something we might need in the future, and we’ve never really done that.
Educator:
A multiple choice test does not give you how the student is thinking. Answering a question, looking at data, trying to prove why these people made the choices or why you make the choices kinda shows the insight and what the students are thinking. I think the benefit of the performance assessment is at the lower level the students can give you something, and you could build on it. And if you’re walking around and looking at the assessment and they’ve kinda stopped, you can go, “I like what you’re thinking about. Is there anything else you can add?”
And they’ll probably start going, “Hey, I do know more than I think I know.” And that’s gonna just build more confidence. I think once the kids have gotten that support that says, “Oh, you really do want my opinion, you really do wanna know what I’ve learned,” that’s a really huge benefit of the performance assessment. And it gets them away from, is it just right or wrong? Is that what you want me to answer? It’s gonna take a little while to get that, but I think that’s gonna be a huge benefit of the performance assessments, is getting the students away from that, it’s just this answer.
Jenny Sarna:
Well, Jill, I gotta hand it to you. I don’t usually hear people say such positive things about assessments. And so, help frame this a little bit in terms of, like, when exactly are students taking assessments like these or, like, where do these fit into the rest of the assessments students might be taking?
Jill Wertheim:
So, I find it helpful to think about assessments as sort of part of a coherent system. And so if you look at this diagram on the left, big side of the funnel, we have the daily assessments. These are often part of instructional materials. And they happen all the time, and they’re really focused on eliciting information about individual students that help us support their learning. And on the other end of the funnel, the narrow end, we have those really infrequent assessments that happen usually at the state level, annually, or, in the case of science, at the end of the grade band.
And these are really oriented toward monitoring learning. In the middle of the funnel we have end-of-unit assessments, and these ideally come as part of instructional materials, and quarterly assessments that you can think of as, like, an interim assessment or a district benchmark assessment. And these two types of assessments can play a dual role of both supporting and monitoring learning. But what we’re seeing is that these common assessments, these district-level quarterly assessments, are sort of a missing part of the puzzle.
And that’s sort of what you heard the teacher and the students reflecting on, that they felt like they were getting something out of this assessment that they’re not getting in other ways. We’re hearing a lot from district leaders that they want deeper and more actionable insights into what’s happening in classrooms. And these assessments, if they’re well-designed, can play a really critical role in giving teachers timely, normed feedback about how their students are doing.
Leaders can pinpoint what is and isn’t going well across the school or district, but they can, at the same time, give teachers and students feedback that informs their ongoing learning, so the assessments can feel really valuable to students. It’s not just something that’s giving district leaders or state leaders data. So we’ve been focusing on that sort of what we see as kinda that missing part of the system.
Jenny Sarna:
I love it. Like, that sounds great. What my one concern here is I often hear from teachers and leaders, like, we’re already spending so much time on testing, especially, like, getting ready for the end of course or the state exam. Like, there’s sort of this hyper focus on testing that’s taking away from really good instructional time. So, like, do we really need more assessments? Like, tell me about this.
Jill Wertheim:
That is a great question, and it’s a concern I also hold. And so, before we developed any assessments, I first did sort of a national listening tour, talking to lots of stakeholders, asking them, if we’re gonna develop an interim assessment for elementary grades, which is something we were asked to do, what would make that assessment worthwhile? What do they need from an elementary interim assessment? And what I heard from the people I talked to in some cases really surprised me. So I heard that they didn’t necessarily want an assessment that covered every single standard.
What they wanted was an assessment that focused on the ideas that really mattered for students’ learning. So those standards that have, like, particular explanatory power or are particularly difficult but foundational for students. I also heard a lot about the variation in instructional time for elementary students. So an assessment would need to fit into blocks that range from like 15 or 20 minutes to 45 or 50 minutes. I heard that, you know, teachers are overloaded, and if we’re gonna bring a new assessment in, it has to be user-friendly.
Also, something that was interesting is that interims often historically were oriented toward predicting performance on that state test, but leaders were saying, “Really, I want something that is gonna help teachers do their job. I want something that is instructionally supportive, more than predictive.” Something I heard a lot that was really influential for me was that there are a lot of students who teachers and leaders feel are not being served well by the current assessments that they have.
So particularly students with IEPs or multilingual learners, that the assessments they felt do not reflect the learning that those students are doing. And so they would value an assessment that did a better job of authentically showing the learning that’s going on. They liked this idea of an assessment that really supports teachers in using student data meaningfully in instruction; an assessment that actually showed teachers how to do that, that was educative in that way, would be really useful.
And something I heard particularly from researchers was the challenges people are seeing with transfer. So these external assessments that ask students to look at a new phenomenon that they’re not familiar with and take everything they learned and apply it in this new way in an assessment context, even secondary students are really struggling with that. So an assessment would really need to support students through that cognitive step.
Jenny Sarna:
That sounds like a great list, and I’m, like, dying to dive in here and see these assessments, get a little preview of what they actually look like. And I think one big takeaway for me was that typically when I think about benchmarks or interim assessments, the focus is being predictive, about helping us kind of say, are we on track? Are we getting ready for that end-of-course or that state exam? And this almost seems like you’re saying, like, we can actually use this time really well to look back and help us think about, like, what has been supportive in instruction, where do we need to focus more, and kinda help us understand if we’re on track.
So it is exciting, ’cause I feel like it’s the best of both worlds. So, I’d love to, can you talk us through a little bit about, like, how does an assessment help us sort of at this very timely quarterly or interim moment, how does this design help us do all of those things?
Jill Wertheim:
Absolutely. So I’ll first give you a kind of big-picture description of the design of the assessment. So we took all of those things that the stakeholders told us and used them as sort of design goals. And we ended up with an assessment that looks very unusual as an interim. And so it has these three parts, they’re kind of indicated in dark blue here. One of the features I wanna just point out, you’ll see these three parts have different structures.
So a typical interim, you can imagine, you know, the room is silent, students are individually working on their assessment, but with this design, we’re actually using, we’re trying to elicit multimodal evidence of students’ thinking, and we’re using a variety of structures to do that. So we use a combination of whole class, small group, and individual structures to elicit evidence. Another thing I just wanna point out before I go in more depth in these three parts is that the design is really intentional around making explicit connections to the instruction that happened before the assessment and the instruction that will follow.
And so this is embedded into the assessment itself. Knowing that using assessments as a source of, like, just grading or scoring as sort of the norm, we wanted to build this idea that assessment can be, even an interim, can be seamlessly integrated into the process of teaching and learning in a classroom. And so that’s sort of woven into the fabric of the task. So I’ll tell you a little bit about each of those three parts.
Jenny Sarna:
Awesome. And so let me get this right, so I’ve taught a unit or two, and then I’m gonna do this three-part task. And then I’m gonna get some information, and that’s actually gonna help me then use that information not to look back and go and reteach but to use that to teach kind of things differently in the next unit.
Jill Wertheim:
Exactly right.
Jenny Sarna:
All right, let’s dive in. I’m excited to see these three parts.
Jill Wertheim:
So, the first part of the assessment introduces students to a new phenomenon. And the idea is that they’re introduced to a question or problem that is going to be really meaningful to them and motivate them to persist through a really complex task. And so, we can’t just sort of expect students to be motivated by a phenomenon, especially not all students; they won’t necessarily see why it’s meaningful.
And so we ask teachers to go through a routine to introduce students to the problem and to explore together how and why it matters, who it matters to, how it might relate to their own experiences so they can really take on this problem as something that matters to them. And that we find that when we take the time to do this, students come outta the assessment saying, “I feel like I did something really important. I feel like, you know, my work in figuring out something about this problem, like, matters to the world.”
And so the next thing the teachers do after sort of getting students familiar with the phenomenon, they bring something out from instruction, an artifact, something that can prompt students to revisit some core thinking that they did that’s particularly relevant to the task they’re about to approach. And so they do some thinking together in various structures, pairs, tables, and whole class, about how the ideas they learned might apply to this new phenomenon. And so you can see in the picture here, the students are working, this is a fourth grade classroom, it’s a task about structure and function.
And they’re thinking about a periscope as a tool for observing an animal without scaring it off. And so they’re thinking about light reflection and sort of how information is taken to their brain. And there’s a little image at the bottom here that’s from the teacher guide, where we instruct teachers how to adapt this piece to their classroom. And so you can see we show an example of a model they might co-construct as a class, but the teacher’s model that you see here is very different.
And that’s because it’s based on the learning they did. It’s based on the thinking that his students shared and what was important to them. And so this model stays up throughout the entire assessment so it’s available to students to support their work going forward.
Jenny Sarna:
So this is part one, we’re doing a whole class. And then tell us about where we go from here.
Jill Wertheim:
Okay, so then we go to small groups. And I’ll just tell you, a meaningful task that students care about is always very complex. And so we take sort of the most complex pieces of this task and we put it in the small group part. So students have an opportunity to talk together to sort of share language, share ways of expressing their thinking. This is where they do a lot of the data analysis and modeling. You can see a model of the periscope here. Students always do work on their own assessment, but they can share their ideas, they can point things out to each other, they talk about their ideas.
This is an opportunity to really work collaboratively and to encourage that. And what the teacher is doing is they’re walking around the classroom with an observational rubric that is pointing their attention to evidence of learning through students’ discourse and through their written and drawn products. And what we always hear teachers report to us is that they’re really surprised, particularly when they focus on those students that they find hard to assess, at the richness of their evidence of learning.
Some of those students will not say anything on a written assessment, but when they’re given the opportunity to talk through their ideas, they do it, and they really shine. And so encouraging teachers to recognize and value this evidence as authentic evidence from an assessment that helps us understand students’ thinking is a really important piece of the educative value of the task.
Jenny Sarna:
Absolutely, and so their scoring starts here. It looks like they’re already gathering this evidence of student thinking. And then tell us about part three.
Jill Wertheim:
All right, part three is where students work individually, and they’re asked to use their thinking from the entire assessment and build from there. So this is where some of the deepest sense-making happens. They move a little bit further than they went in part two and they record individually what they figured out about the phenomenon. We always ask them to do that in writing, but we also often offer the opportunity to draw and include data, you know, visualization and stuff like that, which they very often take us up on. And so this is also scored as evidence of students’ learning.
Jenny Sarna:
So this, you mentioned the word complex, and I’m thinking about elementary school students or intermediate students. This is a tremendous, like, performance when I think about all the student ideas that are here, but this also looks like a ton of information. So, I have all this data, we talked about, like, not spending more time doing assessments for just the sake of assessing, like, how do I actually translate this into something useful for teachers, students, and leaders?
Jill Wertheim:
This is one of the most important parts of the assessment. The rubric and the reports, for me, hold a lot of weight in terms of the value of the task. And so we’re gonna talk really quickly through just a few of the pieces there, both the individual reporting and the whole class reporting. And so individually, we can summarize students’ performance. And I’ll just say quickly, because of the complexity, we actually had to create an app for our rubrics to make them much simpler so we could help teachers make sense of the variety of students’ thinking in these really open-ended tasks really easily and efficiently.
And so then the app summarizes students’ individual performance on the group work and, separately, on the individual work. It gives a description of both their strengths and their areas for growth, very specifically and tied to the standards that are being assessed, which we encourage teachers to use to frame their feedback to students, and to look at the next performance level to talk with students about how they can continue growing. And then the whole class trends, you can see really visually across the class where those strengths are, where those areas for growth are, and they’re divided very granularly into pieces of each standard.
So we can target very, very specifically what to do next. One of the additional pieces that we built into the rubrics is an aspect of the reports that helps teachers think about what to do with the data. And so it summarizes students’ performance across the whole task, group and then individual. And it describes, based on your students’ data, here are some areas that you might work on next. So, in the case of this class working on the periscopes, lots of students needed to work on their systems thinking.
And so we guide teachers in looking to their next unit, in how they can create on-ramps for their students, to make sure we’re not just saying, like, “Okay, well that was your achievement on this task.” We’re actually using it to support ongoing growth, making sure we’re learning from it and providing the support students need to continue working on these ideas.
Jenny Sarna:
I love that. It seems like that would be incredibly useful for everybody involved. Thank you for kind of giving us a quick tour of this really innovative and exciting new resource you’re working on. And I just, when I look at this and I think about the three parts and how this was really about enhancing and not interrupting learning, but also giving us really useful information to look forward, it seems like just a really good use of time. So I’m excited to see kind of the full resource as it comes out and is available this summer. I wanna just wrap here, Jill.
Quickly, thinking about this funnel you shared at the beginning and, like, the importance of the HQIM, the high-quality instructional materials, and my sense is if I adopt good high-quality instructional materials, then, like, some of my daily and unit assessments are gonna be taken care of. Can you give me just a quick bit of advice for that quarterly or that benchmark? Like, how do I make sure that I get that piece right, because that’s gonna be really important?
Jill Wertheim:
Yes, great question. I can’t emphasize enough that adopting instructional materials that have a thoughtful embedded system of assessments is really critical for making the whole learning process work. We also see a lot of people looking to task banks for those quarterly assessments, and we certainly have task banks with curriculum-neutral assessments. Many other people do too. We’re just starting to see the limitations of those.
And then just, you know, buyer beware, even for these free assessments we offer, that we need to think very carefully about the opportunity to learn before the assessment and do a careful analysis of exactly what’s being assessed and exactly how students are being supported toward that assessment. We’re also seeing, with a lot of the quarterly interim assessments or the curriculum-neutral assessments being put out there in task banks, that claims of alignment are often not well-founded.
And so you just need to do a little bit of your own work to determine whether or not it’s assessing the things that you care about and you want to know more about. And then the last thing I’ll just say is, the rubrics and reports are where I start before I even look at an assessment. I wanna make sure that it has the ability to give actionable information that teachers can take up and do something with. And so that’s sort of the most important place to decide if it’s a quality assessment to work with.
Jenny Sarna:
It sounds like there’s a lot of parallels between, like, not all materials and not all interims are created equal, so, like, how do we really thoughtfully evaluate those two pieces before we make a purchasing decision? So, thank you so much for joining us today. And to all our science friends out there, we really appreciate the opportunity to share this exciting bit of work with you. I’m gonna drop a link to our websites in the chat. And then, as well, if there were a few questions we didn’t have time to answer today, we will be happy to follow up.
We’ll just send over an individual email to those of you who’ve reached out. And anyone, please do reach out if you have questions or you wanna learn more about this work. And I’m gonna turn it back over to Aimee.
Aimee McCarty:
All right, thank you so much, Jenny and Jill, for such a great discussion today. And thank you to all participants for joining us. We really appreciate you being here. For those of you interested in learning more about SCALE Science at WestEd, Jenny dropped the link in the chat. You can visit us online at scalescience.wested.org. And of course we invite you to reach out to Jenny and Jill via email if you have any questions about the work that we discussed today. You can reach Jenny via email at [email protected]. And you can reach Jill via email at [email protected]. For more information about our Leading Together Webinar Series, please visit us online at WestEd.org/leading-together-2025.
And finally, you can also sign up for WestEd’s email newsletter to receive updates. Subscribe online at WestEd.org/subscribe or scan the QR code displayed on the screen. You can also follow us on LinkedIn and Bluesky. And with that, thank you all very much for being here with us today, and we’ll see you next week.