19 December 2013

Guest Post: Teacher Attitude Affects Learning and Testing

The following guest post is from Eileen Nagle, an extraordinary teacher who taught my son in 6th and 8th grades. Here she relates how teacher attitude and context can dramatically impact students' testing experience.

A professor I had in New Jersey taught us that if we taught above and beyond the state standards, played games during test week, and didn't make a big deal about the tests, our students would score high on the state testing. From my previous experiences, I believe that testing results are very strongly determined by the teacher and the environment in which the test is given.

One year while teaching at an elementary charter school, I was the lead teacher of three grade level teachers. Jan (names are changed), who was in her 30s, had recently graduated from a local college. Next was Tammy who I had worked with the previous year, and then myself. I had received my certification only three years earlier but had homeschooled my own children for 17 years.

Jan wouldn’t do anything that she hadn't learned in college or that wasn't on the state standards. Being a charter school, we had a more enriched curriculum than the local public schools, which is why the parents sent their children to our school, but she wouldn’t teach it. Her students would express that they ‘hated’ various parts of the curriculum, parroting their teacher. Tammy was in her 20s, was open to ideas, and taught the extended curriculum. Because Jan protested that she couldn't handle them, Tammy and I each had a very high needs student in our classes. Tammy struggled each day with classroom management because of the challenges her student presented.

When state testing was coming up we met and I gave them some ideas on how to handle the week.  I told them to tell their students that the test wasn't any part of their grades and to just have fun with it. Jan was freaking out. With my students, I sent a note home asking for healthy snack donations to give to the students before and after tests. Parents were generous and many donations came in. I prepared an art project, a Mother’s Day gift, they could work on when they were finished with each testing section so they wouldn't sit and get bored.  We did fun games to loosen up muscles during the day.

Academically, I didn't do anything special to prepare them other than just teach as I always did. I believe an enriched curriculum taught in a fun way, with lots of music and role playing, will go far in preparing students for testing. They learn deduction skills, retain the information because it was fun, and do well on tests. I didn't do any 'test prep'.

At the end of test week my students wanted to know when it was supposed to get hard.  They thought it was a cake walk and requested another test week the following week.

Jan's experience was different. The first day of testing she came into my room in a panic saying two of her students had thrown up. Her hands were shaking as she described students saying they were scared and were crying. She was crying too saying the stress was too much. She wouldn't take any of my advice so there wasn't anything I could do about it.

The test results came back and my class scores were the highest, Tammy's were in the middle, and Jan’s class scored the lowest of the three. There are many variables to testing, but students will perform better if teachers will:
  • Creatively teach to a much higher level than the state tests all year.
  • Avoid teaching to the test.
  • Reduce any stress going into the tests.
If one must test, these steps will certainly improve the experience and the outcome.

Eileen Nagle is the Outreach and Workshop Coordinator at the Noorda Theatre Center for Children and Youth, Utah Valley University. She can be reached on LinkedIn (Eileen Nagle) or Facebook (Noorda Center).

25 November 2013

Quantifying Learning: Alternatives to the Carnegie Unit

In 1905, Andrew Carnegie was seeking "ways to improve the economic standing of college professors and the provisions for their financial security in old age" (ref here). In consultation with the president of MIT, he created a free pension fund for college professors. Of course, many colleges and universities were eager to participate in a free benefit of such value. So the Carnegie Foundation for the Advancement of Teaching, which administered the pension, had to set standards for qualification. Among the requirements was that institutions would use the "standard unit" when evaluating high school transcripts for student admission.
The standard unit was created by Charles W. Eliot at Harvard University. Essentially, it measures the number of contact-hours between student and professor. The unit used by the Carnegie Foundation represented 120 hours of class or contact time over the course of a year at the high school level. This is now known as the Carnegie Unit and remains the primary way of measuring achievement in U.S. high schools. On the heels of that, Morris L. Cooke (also with support from the Carnegie Foundation) established the collegiate Student Hour as one hour of lecture, lab work, or recitation per week for a single semester. Today we usually call these "Credit Hours."
Seat time measures like Carnegie Units and Credit Hours are only proxies for actual student learning. Adding class grades to the measure is an attempt to increase their reliability. But there are two problems with this. First, grades are not necessarily a good indicator of actual learning. Anyone who has been through school knows that their best grades aren't necessarily in the classes where they learned the most.
Second, grades reinforce the industrial era notion of school as a sorting device. We send thousands of different students through the same learning experience and then grade their performance. Based on those grades, society decides who is qualified for college and a professional career, who should go into service industries, manual labor, and, perhaps, who will be our criminals.
School doesn't have to sort so viciously. A growing body of evidence indicates that by personalizing learning a majority of students can achieve readiness for college and professional careers. That's important because with automation and offshoring, the number of unskilled jobs in the U.S. is diminishing. But with teacher compensation, student evaluation, school budgets, admissions, financial aid, and pension plans all tied to seat time measures, the environment hasn't been conducive to personalization.
Recognizing this, the Carnegie Foundation recently set out on a year-long quest to seek better ways to measure student learning. The result should be a measure based on competency, not time. The results of their study are due in 2014. In the meantime, here are some of the alternatives already emerging:

Challenges and Waivers

This is an effective interim solution. Alabama and Michigan have Seat Time Waiver policies for high school credit. If students can show mastery of a topic, they are granted credit for the course regardless of how much time they spent studying or in class. The Ohio Credit Flexibility Plan allows students to earn high school credit by demonstrating competency, completing classroom instruction or a combination of the two. The College-Level Examination Program and similar programs allow college students to obtain credit by demonstrating knowledge on a standardized test. Many universities also allow students to take challenge or exemption exams.
Notably, all of these programs convert demonstrations of competence into seat-time units or waivers thereof. The Carnegie Unit and Credit Hour as measures of learning remain intact. These options represent a transition rather than a new solution.

Merit Badges

The Boy Scouts and Girl Scouts award badges when youths demonstrate skills like First Aid, Knot-Tying, Swimming or Computer Programming. Patches are earned by attending events. Scouting organizations borrowed the badging concept from centuries of military tradition. Education badges are based on this model. Organizations like UC-Davis and Khan Academy have badging systems. The Mozilla Open Badges project is an effort to create a universal format and exchange for badges of all types. They've signed up a diverse variety of organizations and institutions including colleges and universities, MOOCs, professional training companies, the Smithsonian museums and more.

Competency-Based Schools

Western Governors University substitutes "Competency Units" for credit hours. Students receive credit when they prove competency. This lets students get credit for prior knowledge and also lets them progress through the course materials as quickly or slowly as they choose.
New Hampshire is initiating a statewide redesign of high school education that will be based on demonstrations of competency. In a similar vein, the Re-Inventing Schools Coalition (RISC) is working to help schools develop a performance-based system for earning credit. Among their members are the Adams County School District in Colorado and the Chugach School District in Alaska.


Professional certification programs like MCSE or CCNA specify a set of competencies and a way to demonstrate the associated skills. Individuals seeking a credential can choose the path that suits them – reading a book, attending a class, watching videos, or an online course. Once competencies have been specified, it's possible to separate the learning of a skill from demonstration of that skill. When learning and credentialing are unbundled it's possible to compare different learning methods to see which is more effective. And different students can choose methods that are better suited to their current needs, market positioning or student body. 

Making Success the Only Option

An oft-repeated phrase among competency advocates is that grades should be "A, B, and 'still working on it.'" This necessitates flexibility on the part of the teachers and the school to meet the needs of each individual student. To do this in a conventional classroom takes more time and energy than should reasonably be asked of a teacher. Among the best ways to apply technology in education is to expand teachers' capacity to personalize education and spend more time one-on-one with students.
The other source of capacity is the students themselves. In the long run, our goal is to train students to be self-learners. If the right resources are offered, students can adapt the learning experience to match their own needs.

11 October 2013

Things Engineers Can Teach Us About Feedback

Some time ago I wrote about feedback loops – how they are part of the engineering discipline of Control Theory and how, by substituting a few words, the principles apply surprisingly well to education.  Here's a diagram of a feedback loop according to control theory:
A closed-loop control system.
And here's the same diagram substituting educational terms for the engineering ones.

A personalized learning system.

In that post I noted that the feedback loop needs to be "closed" in the sense that we use feedback to influence the direction instruction should take. Also that feedback needs to be "negative" in the mathematical sense. That is, feedback should reflect the difference between skills the student demonstrates and standards that are to be taught. Both of these concepts, closed feedback loops and negative feedback, are derived from engineering control theory.

In this post, we'll consider three more insights we can gain from control theory: the role of a transfer function, speed and frequency of feedback, and sensitivity to what is being measured.

Transfer Functions

A transfer function is a mathematical description of the relationship between the input and the output of a system. Let's use the car example from my previous post. In that case, the input is the position of the gas pedal (more correctly called the "accelerator pedal" as we'll see in a moment). The output is the speed of the car. Pressing the pedal to a certain position doesn't make the car go a corresponding speed. Rather, pressing the pedal causes the car to accelerate at a rate proportional to the pedal position. If you keep the pedal pressed to the floor, the car will continue to accelerate to higher speeds until it reaches the limits of its construction. (For calculus fans, this means that the transfer function of a car's drive train is an integral.)
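For readers who prefer code to calculus, here is a minimal Python sketch of that integrating behavior. The `gain` and `dt` values are made-up assumptions for illustration, and the sketch ignores drag and the car's top speed.

```python
def simulate_speed(pedal_positions, gain=3.0, dt=1.0):
    """Integrate pedal position over time to get speed.

    gain (speed gained per second at full pedal) and dt (seconds per step)
    are illustrative assumptions. Drag and top-speed limits are ignored.
    """
    speed = 0.0
    speeds = []
    for pedal in pedal_positions:
        # Acceleration is proportional to pedal position, so speed accumulates.
        speed += gain * pedal * dt
        speeds.append(speed)
    return speeds


half_pedal = simulate_speed([0.5] * 4)
full_pedal = simulate_speed([1.0] * 4)
print(half_pedal)  # [1.5, 3.0, 4.5, 6.0]
print(full_pedal)  # [3.0, 6.0, 9.0, 12.0]
```

Holding the pedal still does not hold the speed still. The speed keeps climbing, just at a rate set by the pedal position.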

The transfer function is important because it's incorporated into the design of the controller. When an engineer designs a controller they use the transfer function to anticipate what will be the result of a particular input. Typically they take the inverse of the function to determine what input is required to achieve the desired output.

The educational equivalent of a transfer function is a learning theory – a description of how people learn. Learning theories help us select activities that will effectively help a student learn a particular skill. Descriptions of the various theories and their strengths and weaknesses are beyond the scope of this post (or my skills for that matter). I recommend the Wikipedia article on the subject. But we can derive two important insights from this:
  • A personalized learning system will inevitably express some learning theory in the selection of activities. It would be best to deliberately select the theory and design the system accordingly.
  • There are personal differences in the way each student learns. In engineering terms, this means that each student has their own personal transfer function. Therefore, the selection of activities should be tuned to the student's individual interests and affinities.

Speed and Frequency of Feedback

The time from the moment an output is measured to a resulting change in the input is called a propagation delay. In educational terms, this is the time from when a student's skill is assessed until the moment the student's activity is affected by that assessment. In a traditional math class the student does homework one day, submits it the next day and receives graded homework back the day after that. Thus, the propagation delay is two days (or two class periods). Fast feedback means a shorter propagation delay. Many online learning systems offer near instantaneous feedback. Measurements that require human grading will naturally be slower.

Frequency of feedback is a measure of how often the output (or skill) is measured and feedback generated. In the traditional mathematics example, feedback is daily (or once per class period). Some traditionally taught courses may only have two or three graded activities in the entire course. However, this may be a pessimistic way to measure frequency. For example, if students can check their answers in the back of the book then feedback is both faster and more frequent.

A third component of educational feedback is richness. In math, this might be the difference between being told that an answer is wrong and being informed about exactly what mistake was made. In English it might be the difference between a simple score and detailed feedback about how the student might improve their paper.

Students can influence all three feedback factors. For example, if English students seek help at a writing lab then they will be getting faster, more frequent and richer feedback than students that don't make use of the resource.

Control theory tells us that faster and more frequent feedback compensates for inaccurate measurements and poorer transfer functions. In education language, this means that if we can make feedback faster and more frequent we can compensate for a less-than-perfect learning theory and suboptimal assessments.

Of course it would be nice to have everything -- fast, frequent and rich feedback, good quality assessments and a solid learning theory. But it's useful to know that there are real tradeoffs among these factors.
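The trade-off between feedback frequency and model quality can be illustrated with a toy simulation in Python. Everything here is a simplifying assumption of mine: the "student" is a simple integrator, `true_gain` is how effective instruction really is, `assumed_gain` is what our (imperfect) learning theory predicts, and propagation delay is not modeled at all.

```python
def remaining_gap(total_steps, interval, true_gain=0.5, assumed_gain=1.0, target=100.0):
    """Return how far the skill is from the target after total_steps.

    Every `interval` steps the gap is measured (negative feedback) and the
    effort per step is re-planned using the assumed, possibly wrong, gain.
    All names and values are hypothetical, chosen only for illustration.
    """
    skill = 0.0
    effort = 0.0
    for step in range(total_steps):
        if step % interval == 0:
            gap = target - skill  # difference between standard and demonstrated skill
            # Plan to close the whole gap by the next measurement, per the model.
            effort = gap / (assumed_gain * interval)
        skill += true_gain * effort  # what actually happens
    return target - skill


frequent = remaining_gap(total_steps=12, interval=1)    # measure every step
infrequent = remaining_gap(total_steps=12, interval=6)  # measure twice
print(frequent, infrequent)  # frequent feedback leaves a far smaller gap
```

With the same flawed model, measuring every step nearly closes the gap while measuring only twice leaves about a quarter of it open. With a perfect model (`true_gain` equal to `assumed_gain`) even the infrequent schedule reaches the target.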

Sensitivity to What's Being Measured

Feedback loops are a very effective tool; so effective that if the wrong thing is being measured or the wrong feedback is offered then the wrong skill will be optimized. A recent manifestation of this is the complaint about "teaching to the test." The concern is that since summative tests are used to evaluate schools then the only skills that will be taught are those that are on the test. While this outcome is common, it's unfortunate since studies have shown that focus on conceptual understanding results in better test performance than test-focused instruction.

It's also manifest in the combination of skills that a particular problem might require. For example, a mathematics story problem might require reading, visualization, and problem solving skills in addition to the ability to solve the resulting mathematical equation. In order to offer feedback to a wrong answer, the system (whether human or automated) must be able to detect which of these skills was not applied properly. In most cases, this requires interacting with the student to discover the steps followed in answering the question.

It's tempting to try and isolate skills and only assess one at a time. There are two reasons why this won't work. First, it's very likely that you're seeking the student's ability to use multiple skills together. Second, the demand for some skills simply can't be eliminated. For example, nearly every assessment requires the skill, "Can read and follow directions."

Applying Feedback Loops

To summarize, engineering offers us the following insights about using feedback in education:
  • Choose your learning theory deliberately and measure its effectiveness.
  • Adapt not only to what the student has and has not mastered but to the individual learning patterns and affinities of each student.
  • Fast and frequent feedback can compensate for lower quality in other areas of the system. This is a two edged sword; you may think you have a good learning theory when, in fact, it's fast feedback that's making the difference. But it's also an opportunity to make deliberate trade-offs.
  • Be sure you're measuring what you think you're measuring. And don't forget that every assessment measures multiple skills.

26 July 2013

Can Consortia Improve Standardized Testing?

The NCLB embodiment of standardized testing has been in place for eleven years now. Unfortunately, it hasn't resulted in substantial improvement in student learning. But there are things the multi-state assessment consortia can do that will improve the situation.

To many of us, the lack of progress has been confusing. In just about every other industry, when you measure performance and report it back, performance improves. Education is proving to be a more difficult problem than most.

Unintended Consequences

Part of the issue is unintended consequences from standardized testing. Consider a teacher who is anticipating the year-end tests. She and her principal are under pressure to achieve Adequate Yearly Progress (AYP) goals. So, they have regular drill and review sessions -- test preparation to make sure the students are ready.

All of this "teaching to the test" takes time and resources away from more enriching and interesting learning experiences. And it doesn't work. The Measures of Effective Teaching project found that teaching to the test is not as effective as a focus on conceptual understanding and applications. (MET Preliminary Findings Page 21)

That shouldn't be surprising. I wrote about the Flow Channel or Zone of Proximal Development a few months ago. In order to keep the student's attention and prevent frustration, the work needs to be new and challenging but not excessively so. If the work is too easy, the student is bored. If too hard, the student is anxious. In either case the student is frustrated and learning does not occur. Constant drilling in preparation for the test puts students deeply in the boredom zone.

There are lots of important skills that aren't on the exam. Yong Zhao has written that the most important 21st Century Skills are creativity, entrepreneurship and independent learning. These aren't emphasized in the standardized exams. But the basic skills of literacy and numeracy (which do appear) are required for the kind of creativity and entrepreneurship we need. For this reason, I wish that the Common Core State Standards were named the "Common Foundation Standards" because that's the way I see them -- the foundation skills required for creative work.

This is another unintended consequence of standardized testing. Excessive focus on the exam steals classroom time that could be used for creative application of the knowledge or for self-directed learning. The MET study and others show that more of the latter activities result in both better prepared students and better exam results.

How To Do Better

Noting the minimal progress, many advocates call for abandoning the standards and assessments altogether. They look at the amount of time and money dedicated to assessment and suggest these resources could be spent in better ways. But, in the absence of standards and measurement we wouldn't know if we are succeeding or failing. The best we could hope for is blissful ignorance. In my opinion we must push forward, improving standards, measurement and teaching.

While improvement will require changes throughout the educational system, there are things that can be done with the assessments themselves to support improvement. Here are some of the things being done by the Smarter Balanced Assessment Consortium and by PARCC:

Richer, More Authentic Assessment Items
Both consortia will use computer-delivered assessments that are much closer to real-world activities. An emphasis on constructed response items will require students to compose an answer to a problem, not just select from a set of prewritten answers. Upon evaluating the consortia's assessments, the UCLA CRESST center concluded, "Both PARCC and Smarter Balanced summative assessments ... will represent many goals for deeper learning, particularly those related to mastering and being able to apply core academic content and cognitive strategies related to complex thinking, communication, and problem solving."

Teaching to the test becomes less of a problem the closer the exam gets to assessing real and authentic skills.

Guidance from Interim Assessments
In addition to the year-end summative assessments, both consortia are offering voluntary interim assessments that teachers and administrators can use to gauge students' understanding throughout the school year. If students are found to be prepared, less time will be spent on boring and unnecessary drills. Likewise, identification of weak areas can guide teachers in reviewing just the necessary lessons.

Professional Development and Formative Activities
The consortia are developing training materials for teachers. These will include information on how to plan formative assessment activities for the classroom and how to interpret and make use of assessment results.

Precise, Individual Level Reporting
Existing state assessments measure the number of students who have achieved the state competency threshold for their particular grade. These measures are reported to schools, districts and states in hopes of improving education programs at those institutions. Threshold tests can measure whether a student is above or below the expected competency but the further a student is from that level, the less accurately they can indicate the student's actual competency.

Smarter Balanced will use computer adaptive testing to precisely measure each student's skill level. In adaptive tests, questions are selected based on the results of previous assessment items and testing ends once the student's skill level has been measured to a certain level of confidence. Thus, students that are below grade level aren't subjected to a long series of questions that they can't answer and both those above and below the threshold receive accurate measures of their skill levels.
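To make the idea concrete, here is a toy Python sketch of an adaptive test. It is not the algorithm Smarter Balanced uses; it is a simple bisection in which each question's difficulty depends on the previous answers, and a hypothetical `tolerance` stands in for the "level of confidence" at which testing ends. The difficulty scale of 0 to 100 and the `answers_correctly` callback are also assumptions for illustration.

```python
def adaptive_test(answers_correctly, low=0.0, high=100.0, tolerance=5.0):
    """Estimate a skill level by narrowing a difficulty range.

    answers_correctly(difficulty) -> bool represents the student's response
    to a question of the given difficulty (hypothetical interface).
    Returns (estimated_skill, questions_asked).
    """
    questions_asked = 0
    while high - low > tolerance:
        difficulty = (low + high) / 2  # next question depends on earlier answers
        if answers_correctly(difficulty):
            low = difficulty   # skill is at least this high
        else:
            high = difficulty  # skill is below this level
        questions_asked += 1
    return (low + high) / 2, questions_asked


# Two simulated students, one well below and one well above the midpoint.
below_estimate, below_count = adaptive_test(lambda d: d <= 30)
above_estimate, above_count = adaptive_test(lambda d: d <= 85)
print(below_estimate, below_count)
print(above_estimate, above_count)
```

In this sketch both students answer the same number of questions and both receive an estimate within the tolerance, regardless of how far they are from the middle of the scale.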

These more precise assessment measures will be used to generate clear, individual level reports for students, their parents and teachers. The reports will have sufficient detail to show growth year over year and to optimize instruction to address individual student needs.

~  ~  ~  ~  ~

Standards and associated assessments haven't resulted in the improvements that were hoped for. But that doesn't mean we should give up on them. They offer an important support for the Personalized Learning theory which has been proven. Refinements to standardized tests listed above will reduce unintended consequences and offer the guidance needed to optimize each student's learning experience.

06 July 2013

References Needed to Complete Engelbart's Vision

Douglas Engelbart, a pioneer in personal computing and one of my heroes, died this past Thursday. Many of the concepts he invented are part of our daily personal computing lives. But there are still a few missing pieces, one of which is a referencing system.

Here's a rough outline of how a series of pioneers developed the technologies you find familiar:
  • 1945: Vannevar Bush, Director of the Office of Scientific Research and Development, writes "As We May Think." Writing at the conclusion of World War II, Bush considers how technologies developed for war can be used to further peace. He envisions an electromechanical system based on microfilm and dry photography that can manage all of the data a person needs and help them organize it into knowledge.
  • 1968: Douglas Engelbart, Director of the Augmentation Research Center at SRI, is inspired by Bush's article. He realizes that the concept can be achieved much more readily using digital computers instead of an electromechanical system. In 1968 he demonstrates their oNLine System (NLS) in what we now call the "Mother of all Demos" including a mouse, graphical user interface, collaborative word processing, teleconferencing and a host of other features that would take decades to make it into the mainstream.
  • 1973: Alan Kay, who had attended Engelbart's demo, incorporates many of Engelbart's ideas into the Xerox Alto. Designing the Alto so that it can be used by children, Kay's insight is that the user interface should manifest the functions that are available. Thus, the system itself can teach the individual how to use it.
  • 1984: Steve Jobs, who had seen a demo of the Alto in 1979, incorporates key elements into the Apple Macintosh. Features inherited from NLS and the Alto include the mouse, GUI and computer networking. Jobs' most important contribution is to get these ideas out of the lab and offer them to a mass market.
If you watch Engelbart's Demo you will see many now-familiar ways of using a computer. NLS centered on the creation and management of documents. These documents were indexed for convenient retrieval and sharable with all other NLS users.

But there was a critical feature in NLS that we have not yet replicated. Each document was given a unique ID. Printed NLS documents were easily recognized because they included index numbers in the margins. The document ID and index number allowed individuals to reference any line in any NLS document.

That need for consistent identifiers that can be referenced has yet to be addressed for most texts. Today, the best that citation systems can do is refer to a page number. But page numbers change with different formats (e.g. hardbound vs. paperback) and editions. Suppose, for example, you want to cite a particular quote from Huckleberry Finn. In order to do so, you have to specify the publisher and edition of the book before citing the page number. And the odds of a reader having that same edition are pretty low. Textbook publishers have taken advantage of this. By changing pagination between editions, they deliberately obsolete previous editions.
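The difference between a page reference and a stable reference can be sketched in a few lines of Python. This is only an illustration of the idea, not the actual NLS format; the document ID, the index numbers and the `paginate` and `resolve` helpers are all hypothetical.

```python
# Hypothetical library: document ID -> {stable index number: text}.
library = {
    "DOC-1": {
        1: "first statement",
        2: "second statement",
        3: "third statement",
        4: "fourth statement",
    }
}


def paginate(document, lines_per_page):
    """Map each stable index to the page it lands on for a given layout."""
    return {
        index: position // lines_per_page + 1
        for position, index in enumerate(sorted(document))
    }


def resolve(reference):
    """Look up the text for a (document ID, index number) reference."""
    doc_id, index = reference
    return library[doc_id][index]


hardbound = paginate(library["DOC-1"], lines_per_page=3)
paperback = paginate(library["DOC-1"], lines_per_page=2)

ref = ("DOC-1", 3)
print(hardbound[3], paperback[3])  # 1 2
print(resolve(ref))                # third statement
```

The page number for the same statement differs between the two layouts, while the (document ID, index number) pair resolves to the same text in both.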

With the advent of digital books the problem is compounded. Page numbers change according to user preferences like font size and page orientation. As a stopgap, the Amazon Kindle added "real page numbers" so that you can use references derived from the paper version of a book. But the problem of persistently valid references across editions lingers.

As we honor the legacy of Doug Engelbart, it's appropriate to consider one more of his innovations -- a persistent and universal referencing system. We still need it.

22 June 2013

Education Technology Readiness - Preventing the Unexpected

It's the first day of a new blended learning program. You've figured out how to acquire computers for all of the students. You've chosen a really exciting online curriculum that includes adaptive learning. You've spent the summer learning how the system works, adding personal touches to the lessons and preparing to coach the students. You've gone to the classroom, made sure it has Wi-Fi coverage. Tested bandwidth and played videos. The students arrive, log into their laptops... and everything crashes.

Technology readiness is a new concern for state and district technology directors. With high-stakes assessments going on, concerns are even higher. The RTTA Assessment Consortia have collaborated on a Technology Readiness Tool that states and districts are using to survey and report on their preparedness to perform assessments. Smarter Balanced and PARCC have added Technology Readiness Calculators that help districts and schools perform capacity planning.

But there are a bunch of things that go wrong that are overlooked by conventional planning and testing. Some of them are even missed by experienced network technicians. It's not that the tools above are flawed. It's the nature of these kinds of problems that requires them to be addressed in a different way. Today I hope to prevent them from biting you. The following list isn't comprehensive. But it's a good start.

Inadequate Bandwidth
This is why things work when you try them the night before, but on the first day of class, with 30 students all trying to stream video at the same time it all falls apart. It's a well-known issue and bandwidth planning is a key part of the planning tools I mentioned above. Still, this is important enough that it bears an additional mention.

EducationSuperhighway is doing a survey of actual in-classroom bandwidth and using that data to advocate for better connectivity. They use a bandwidth test and ask teachers and educators to run it periodically. While they acknowledge the weaknesses of this approach (listed below) getting a lot of samples will increase the accuracy of their reports. So, if you work in a school, please go to SchoolSpeedTest.org and run their test from time to time. Not only will it inform you but it will also contribute to nationwide advocacy for school bandwidth.

Mismeasurement of Bandwidth
Bandwidth tests like SchoolSpeedTest.org are useful tools but they can be misleading. Most internet service providers offer "burst speed" in excess of the guaranteed bandwidth. Consider a municipal ISP. Perhaps they purchase 10Gbps of bandwidth from their upstream provider and parcel it out to 100 customers at 100Mbps each. None of those customers will use all of their bandwidth all of the time. So the ISP lets them burst beyond 100Mbps, using some of their neighbors' unused bandwidth. So, if you do a bandwidth test at a favorable time, the result could be much higher than what's guaranteed by your ISP.
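The arithmetic behind that example can be sketched in Python. The even split of upstream capacity among whoever is active is a simplifying assumption of mine, not a description of how any particular ISP shapes traffic, and the count of 20 active customers is made up.

```python
UPSTREAM_MBPS = 10_000  # 10Gbps purchased from the upstream provider
CUSTOMERS = 100

# Guaranteed rate if every customer is busy at once.
guaranteed_mbps = UPSTREAM_MBPS / CUSTOMERS


def burst_share_mbps(active_customers):
    """Hypothetical burst rate when only some customers are active."""
    return UPSTREAM_MBPS / active_customers


quiet_hour_mbps = burst_share_mbps(20)  # assumed: only 20 customers active
print(guaranteed_mbps, quiet_hour_mbps)  # 100.0 500.0
```

A speed test run during the quiet hour in this sketch would report five times the rate you can count on when everyone is online.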

On the other hand, I've visited with schools who get much lower performance in their classrooms than what they pay for. In their cases, outdated networking equipment or problems with their wireless networking create a bottleneck that slows things below their purchased capacity.

Inadequate Access Point Capacity
Wireless networking bridges to wired networking through access points positioned around the building. Home networks combine the access point with the router. But commercial networks usually have one or two routers for the whole building with access points positioned strategically throughout.

Any access point has a limit to the number of computers it can serve. Consumer grade devices have a lower capacity but even commercial units can be overwhelmed if you get too many devices in the same room. Most people know that the access point's bandwidth (typically 54Mbps) is shared among all connected devices. But you can't just test the bandwidth in a room with a single computer and then divide by the expected number of computers to get available bandwidth. There's bandwidth overhead to each connection and there's a ceiling on the total number of computers that can be supported by a single access point. The max device count varies from model to model but it's always there.
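
To show why simple division is too optimistic, here is a minimal Python sketch. The per-device overhead and the device ceiling are made-up illustrative values, not figures from any particular access point; look up the real numbers in your equipment's specifications.

```python
def per_device_mbps(ap_mbps, devices,
                    overhead_per_device_mbps=0.5,  # assumed, for illustration
                    max_devices=30):               # assumed, for illustration
    """Rough usable bandwidth per device on a single access point.

    Returns None when the device count exceeds the access point's
    ceiling, since the extra devices cannot be served at all.
    """
    if devices < 1:
        raise ValueError("devices must be at least 1")
    if devices > max_devices:
        return None
    usable = max(ap_mbps - devices * overhead_per_device_mbps, 0.0)
    return usable / devices


naive = 54 / 30                       # simple division
estimate = per_device_mbps(54, 30)    # with per-connection overhead
over_limit = per_device_mbps(54, 31)  # past the ceiling: None
```

With these assumed values, simple division suggests 1.8Mbps per device, the overhead-adjusted estimate is closer to 1.3Mbps, and a 31st device gets nothing.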

Interference Between Access Points
One way to address access point capacity limits is to use more of them. But if you pack them too closely together they will interfere with each other -- thereby impairing your capacity rather than building it.

Interference From Other Devices
The 2.4GHz band used by most Wi-Fi devices is also used by Bluetooth, some cordless phones, and many other devices. The 5GHz band is also available for Wi-Fi but it isn't supported by as many devices and it has a shorter range. Microwave ovens also happen to operate in the 2.4GHz band. Poorly shielded units can jam all network traffic in their vicinity.

Bluetooth Keyboards
This particular case of device interference deserves special attention. Smarter Balanced supports iPads and other tablet devices as acceptable testing devices. However, we require a physical keyboard when taking assessments. Typically, people use wireless Bluetooth keyboards with iPads. As noted above, Bluetooth operates on the same frequency band as most Wi-Fi networks. It's not noticeable when three or four keyboards are in a room but when 30 get going, there can be significant interference with the Wi-Fi network. The network won't go down, but its bandwidth will be impaired.

Keyboards and other input devices can also interfere with each other. For example, Logitech's recommended density for their wireless keyboards and mice is far lower than a typical computer lab. Another danger is that students might mix up the keyboards among the devices. Finally, wireless devices have batteries to maintain.

Conveniently, Logitech and other manufacturers now offer wired keyboards for iPads.

Inadequate Router Capacity
In addition to bandwidth limits, routers have a number of other capacity limitations. Every open internet connection requires dedicated router capacity, even if it's idle. A lower-end router may not be able to handle more than 50 or 100 devices at a time.

Too Few Network Addresses
Routers also typically manage network address assignment. Using the DHCP protocol, routers "lease" out addresses from their pool. Home routers typically have a pool of 100 or fewer addresses. Even commercial routers in their default configuration may not have a pool bigger than 250. Lease time is also important. A typical router configuration might have a pool of 200 addresses and give out week-long leases. In such a situation, if more than 200 devices come through the doors of the school in a week's period, all of the addresses could be used up even if fewer than that number are present at any particular time.
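
The interaction between pool size and lease time can be sketched in a few lines of Python. This is a simplified model with assumptions of my own: every new device takes one lease, the lease is held for its full duration even after the device leaves, and devices don't return. The arrival numbers are invented for illustration.

```python
def peak_addresses_in_use(arrivals_by_day, lease_days):
    """Peak number of leased addresses, given the count of new
    devices seen on each day and the lease length in days."""
    peak = 0
    for day in range(len(arrivals_by_day)):
        # A lease issued on day d is still held through day d + lease_days - 1.
        first_live_day = max(0, day - lease_days + 1)
        held = sum(arrivals_by_day[first_live_day:day + 1])
        peak = max(peak, held)
    return peak


POOL_SIZE = 200
new_devices = [50, 50, 50, 50, 50]   # hypothetical: 50 new devices per school day

week_long = peak_addresses_in_use(new_devices, lease_days=7)
one_day = peak_addresses_in_use(new_devices, lease_days=1)
print(week_long > POOL_SIZE, one_day > POOL_SIZE)
```

In this model the week-long leases need 250 addresses by Friday, exhausting a pool of 200, while one-day leases never need more than 50. Shortening the lease time or enlarging the pool both address the problem.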

Insufficient Power or Cooling in the Room
If you're setting up a temporary computer lab (e.g. for year-end testing) you may find that the room hasn't been wired with enough power for the number of systems you set up. Also, a desktop computer with monitor puts out about as much heat as a person. So adding 30 computers to a room meant for 30 people can double the cooling requirement. Laptops and tablets consume less power and generate less heat but the demand is still notable.
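
The cooling arithmetic is simple enough to sketch in Python. The figure of roughly 100 watts of heat per seated person is an assumption used only for illustration, and the laptop figure is likewise hypothetical; check actual equipment ratings before sizing anything.

```python
WATTS_PER_PERSON = 100   # assumed approximate heat output of a seated person


def heat_load_watts(people, computers, watts_per_computer=WATTS_PER_PERSON):
    """Approximate heat load of a room, treating each desktop with
    monitor as equivalent to one person unless told otherwise."""
    return people * WATTS_PER_PERSON + computers * watts_per_computer


designed_for = heat_load_watts(people=30, computers=0)
as_a_lab = heat_load_watts(people=30, computers=30)
with_laptops = heat_load_watts(people=30, computers=30, watts_per_computer=40)
print(as_a_lab / designed_for)
```

Under these assumptions, the desktop lab doubles the load, and even lower-power laptops add noticeably to it.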

Averting Problems
Here are some ideas on how to prevent problems like the above before they happen:
  • Plan Ahead: Make sure the expected infrastructure is in place well in advance of key days (first day of school, first testing day) so that you have time to check everything out.
  • Wired is Better: Wired networking has much greater capacity and reliability than wireless. Wired input devices are naturally tethered to the corresponding device and don't require batteries.
  • Read the Specs: Don't just test your bandwidth. Find out from your ISP how much is guaranteed, and compare your purchased capacity against your tested bandwidth. Likewise, don't just test the wireless network; look up the capacity specifications of your access points and routers and make sure they meet your needs.
  • Hire a Tech: Get a trained network technician on staff, or at least under contract, and have them do a site survey.
  • Do a Scale Test: Load a room with the expected number of devices and get that many people to exercise them all at once.
  • Identify Interference Points: Who shares your internet connection? What other rooms share an access point? What facilities share a router? What is between the access point and its intended devices? (walls, furniture, etc.) Do any of the barriers move?
  • Build In Redundancy: Install redundant devices at key places (e.g. routers and access points). If redundancy isn't affordable, have spare equipment available on-site.
  • Map Your Network and Document Your Configurations: List everything you would want to know when troubleshooting or replacing a defective item.
Despite the best planning, unexpected problems are still going to occur. In the first years of online learning and assessment they may be painfully frequent. So a final recommendation is to Handle Crises with Grace. Develop a contingency plan. Be ready with an alternative activity when the systems go down. For testing, build excess days into the testing window in case you have to cancel for a day. If we plan well, technology will be a blessing and not a burden for education.

Updated 24 June 2013 to add information about SchoolSpeedTest.org.

11 June 2013

Data Standards in Service of Learning

My friends at SETDA have published a new paper, "Transforming Data to Information in Service of Learning". It represents a movement that I favor. Historically, educational data has been used primarily for accountability purposes. But, properly reported, data can guide instruction and learning and personalize the experience. The result is significant improvement to student achievement.

The SETDA paper incorporates two models that I've used to categorize standards. The Four-Layer Framework for Data Standards divides data standards into four layers of work that build upon each other. The more recent Taxonomy of Education Standards looks at categories of data within the education sector. Shortly before I changed jobs, one of my Gates Foundation colleagues asked if I could make a chart placing existing standards efforts against these two models. I decided to do it all at once by merging the models into a matrix. Here's the result:

To understand the chart better, I recommend reading the descriptions of the two models: the Four-Layer Framework for Data Standards and the Taxonomy of Education Standards.

There's a lot of crowding in the area of student data. The standards in this area don't compete as much as it would appear. While there's some overlap, most fill complementary roles. Details of all of these standards efforts and how they relate to each other are in the SETDA paper.

Here are links to the official websites and, in some cases, my writing related to each of the above standards.

CCSS: Common Core State Standards (Blog Post)
CEDS: Common Education Data Standards
SEED: Southeast Education Data Exchange (Digital Passport)
Ed-Fi: Ed-Fi Alliance
EDI: Electronic Data Interchange
ESB: Enterprise Service Bus
GIM-CCSS: Granular Identifiers and Metadata for CCSS (Blog Post)
IMS: IMS Global Learning Consortium
IMS LTI: IMS Learning Tools Interoperability
LR: Learning Registry (Blog Post)
LRMI: Learning Resource Metadata Initiative (Blog Post)
OBI: Open Badge Infrastructure
PESC: P20W Educational Standards Council
REST: Representational State Transfer
SIF: SIF Association
TCAPI: Tin Can API (AKA Experience API)

Like most everything on this blog, these models and this chart are free to reuse under a CC-BY license. I hope they're helpful to your efforts.

29 May 2013

CTO for the Smarter Balanced Assessment Consortium

I’m way overdue writing this message. About a month ago I started my new job as Chief Technology Officer for the Smarter Balanced Assessment Consortium. I expected to write about it that very week. After all, I had been composing this announcement in my head for a while by then.

That it’s taken so long is an indicator of what a whirlwind this has been. After all, the consortium has been operating for a couple of years now, contracts have been awarded and the work is underway. It’s like boarding a moving train. My new co-workers, partners and vendors have been incredibly gracious as I’m learning this new job.

Smarter Balanced is one of the Common Core Assessment Consortia. It’s a partnership of 25 states and 1 territory most of which have also adopted the Common Core State Standards. The concept is that with common standards for English Language Arts and Mathematics we can collaborate to develop better quality assessments of student skills than individual states could do working independently. Funding is through grants from the federal government and multiple foundations.

Those of you who know my passion for non-summative assessment may wonder whether I’ve gone over to the summative dark side. Summative assessments include the end-of-year assessments given to K-12 students. They may also be final exams or any other tests that come at the end of a course of study.

Non-summative assessments include formative assessments that occur at the beginning of a topic to help students and teachers understand how much they already know, they include daily exercises and assignments, and they include interim assessments. In short, they include any assessment that occurs before the unit or course of study is complete. My previous post on feedback loops and my VSS talk from last fall explain why I believe non-summative assessment has such potential to improve student learning.

As I looked into the Smarter Balanced opportunity I was delighted to find similar passion for non-summative assessment. To be sure, summative assessments are a huge part of our work. But even these exams will use Computer Adaptive Testing technology to accurately place students in a development sequence rather than just determine whether they’ve met some proficiency threshold. And we’re working hard to ensure these assessments are more authentic – that is, activities being measured are closer to the way skills are expressed in the real world.

Moving one step earlier in the learning cycle, Smarter Balanced will also offer voluntary interim assessments that can be used earlier in the year to find strengths and weaknesses in student skills and inform subsequent teaching. They will use the same kinds of questions and skills alignment as the summative exams. I’m exploring how we can offer these interim assessment items in a way that allows other organizations to integrate them into their adaptive learning systems.

And moving to the beginning of the learning process, we are developing a digital library of formative learning materials. These will be teacher-facing content that helps teachers plan and implement formative activities in their classrooms.

For more than two years at the Gates Foundation I studied the state of online assessment and how it fits into the Personalized Learning Model. I’m excited to apply those ideas at this scale. At Smarter Balanced we're addressing assessment needs in three ways: with year-end Summative Assessments, with Interim Assessments to inform students and teachers as they learn, and with Formative Activities to introduce topics and get a sense of existing understanding. It goes without saying that I’ll write much about that here.

02 May 2013

inBloom - For My Concerned Friends

In my post about the Common Core State Standards I wrote about how concerned pundits have lumped together five related but independent efforts. Today I'm writing about inBloom, which I'll contrast with Statewide Longitudinal Data Systems – two more of those five.

InBloom is a service designed to help students achieve academic success through personalized learning. Those of us who helped develop the Shared Learning Collaborative (which was renamed inBloom in February) are convinced that personalizing the learning experience is the best way to improve student achievement. Whether personalization is being done by a teacher, an online learning system, or a synergistic combination of the two, it happens when information about what the student needs to learn intersects with information about available learning materials.

With that in mind, we set out to supply teachers and students with the data they need. That's what inBloom does. It taps into existing student data systems at schools, districts and states and makes that data available, in a secure way, to authorized teachers, students and parents. Simultaneously it indexes a library of teaching materials and makes them available to those same individuals.

A lot of work went into preserving student privacy. inBloom requires two things to happen before any student data can be retrieved. First, the application they are using must be authorized by the school district. Second, the individual using the application must be logged into inBloom and be authorized to access the requested data. This protection of student privacy is compliant with and goes beyond the requirements of FERPA and state data privacy laws.

So, who can access student data? Teachers can access data about students who are enrolled in their classes. Parents, if authorized by the school or district, can access their children's data. And students can access their own data. An application, such as a personalized learning system, can only access private student data if an authorized user is logged in to the app.
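
The two-step check and the access rules described above can be sketched in Python. To be clear, this is not the inBloom API; every name here (User, can_access, the role strings) is hypothetical and exists only to illustrate the logic.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    user_id: str
    role: str                       # "teacher", "parent" or "student"
    logged_in: bool = False
    # For a teacher: students enrolled in their classes.
    # For a parent: their children, if the school or district authorized access.
    student_ids: set = field(default_factory=set)


def can_access(app_id, district_authorized_apps, user, student_id):
    """Return True only if both conditions hold: the application is
    authorized by the district, and a logged-in user is entitled to
    see this particular student's data."""
    if app_id not in district_authorized_apps:
        return False                # step 1: the app must be authorized
    if not user.logged_in:
        return False                # step 2: a user must be logged in...
    if user.role == "student":
        return user.user_id == student_id   # ...students see only their own data
    return student_id in user.student_ids   # ...teachers and parents see their students


apps = {"learning-app"}
teacher = User("t1", "teacher", logged_in=True, student_ids={"s1", "s2"})
student = User("s1", "student", logged_in=True)

print(can_access("learning-app", apps, teacher, "s1"))   # enrolled student
print(can_access("learning-app", apps, teacher, "s9"))   # not in this teacher's classes
print(can_access("other-app", apps, teacher, "s1"))      # app not authorized
```

Note that the application never gets access on its own; without a logged-in, authorized user the check fails.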

To match student achievement data against available learning resources, we need a common taxonomy of what it is that students need to learn. It's not sufficient to know that Johnny got an "A" on assignment number 5 but a "C" on assignment number 7. We need to know what learning objectives were represented by each of these assignments. That's why inBloom makes use of the Common Core State Standards. In the data, we can show that assignment 7 was on multi-digit multiplication. And, since it appears that Johnny needs some more practice, we can search the library for multiplication practice that's suitable to his age and preferences.
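
Here is a minimal Python sketch of the Johnny example. The objective labels, the library entries and the grade threshold are all invented for illustration; a real system would use official standard identifiers and a far richer catalog.

```python
GRADE_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}

# Hypothetical alignment of assignments to learning objectives.
ALIGNMENT = {5: "multi-digit addition", 7: "multi-digit multiplication"}

# Hypothetical library of learning resources.
LIBRARY = [
    {"title": "Multiplication Race", "objective": "multi-digit multiplication",
     "min_age": 8, "max_age": 11},
    {"title": "Times Tables for Teens", "objective": "multi-digit multiplication",
     "min_age": 13, "max_age": 16},
    {"title": "Adding Big Numbers", "objective": "multi-digit addition",
     "min_age": 7, "max_age": 10},
]


def objectives_needing_practice(grades, alignment, threshold="B"):
    """Objectives whose assignments were graded below the threshold."""
    cutoff = GRADE_POINTS[threshold]
    return sorted({alignment[a] for a, g in grades.items()
                   if GRADE_POINTS[g] < cutoff})


def find_resources(library, objective, age):
    """Titles aligned to the objective and suitable for the student's age."""
    return [r["title"] for r in library
            if r["objective"] == objective and r["min_age"] <= age <= r["max_age"]]


johnny_grades = {5: "A", 7: "C"}
needs = objectives_needing_practice(johnny_grades, ALIGNMENT)
suggestions = [t for obj in needs for t in find_resources(LIBRARY, obj, age=9)]
print(needs, suggestions)
```

Without the alignment table, the grades alone say nothing about what Johnny should practice; with it, the "C" on assignment 7 leads directly to age-appropriate multiplication material.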

In a nutshell, inBloom supplies the student and content data needed for effective personalized learning.

Statewide Longitudinal Data Systems

For whatever reason, some people have confused inBloom with Statewide Longitudinal Data Systems (SLDS). The SLDS effort was launched more than a decade ago by the Bush Administration and funded by the Educational Technical Assistance Act of 2002. While a separate statute, it's related to the No Child Left Behind Act of 2001. The official SLDS website describes it this way:
Better decisions require better information. This principle lies at the heart of the Statewide Longitudinal Data Systems (SLDS) Grant Program. Through grants and a growing range of services and resources, the program has helped propel the successful design, development, implementation, and expansion of K12 and P-20W (early learning through the workforce) longitudinal data systems. These systems are intended to enhance the ability of States to efficiently and accurately manage, analyze, and use education data, including individual student records. The SLDSs should help states, districts, schools, educators, and other stakeholders to make data-informed decisions to improve student learning and outcomes; as well as to facilitate research to increase student achievement and close achievement gaps.
Under grants from the SLDS program, 47 states are developing longitudinal data systems that aspire to collect student data from preschool through college and even into workforce placement. Analysis of the data should help researchers understand the impact of different factors and programs on student achievement.

Before being analyzed to find trends, the data is either anonymized or aggregated in order to preserve the privacy of the students. However, the databases themselves necessarily contain personally identifiable information (PII). That's because the data comes from multiple sources: K-12 schools, colleges and workforce databases. In order to connect all of the data about an individual together, you need to be able to match up records and that requires the personal identity information about each individual.
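
A small Python sketch shows why the identity fields are needed for matching even though they never appear in the reported results. The record layout, the matching key (name plus birth date) and the sample data are assumptions for illustration only; real systems use more careful matching.

```python
from collections import defaultdict


def enrollment_rate_by_program(k12_records, college_records):
    """Link K-12 and college records on identity fields, then return
    only an aggregate: the college enrollment rate per K-12 program.
    No names or birth dates appear in the result."""
    enrolled = {(r["name"], r["birth_date"]) for r in college_records}
    totals = defaultdict(int)
    went_to_college = defaultdict(int)
    for r in k12_records:
        totals[r["program"]] += 1
        if (r["name"], r["birth_date"]) in enrolled:
            went_to_college[r["program"]] += 1
    return {p: went_to_college[p] / totals[p] for p in totals}


k12 = [
    {"name": "Student A", "birth_date": "1995-01-01", "program": "tutoring"},
    {"name": "Student B", "birth_date": "1995-02-02", "program": "tutoring"},
    {"name": "Student C", "birth_date": "1995-03-03", "program": "none"},
    {"name": "Student D", "birth_date": "1995-04-04", "program": "none"},
]
college = [
    {"name": "Student A", "birth_date": "1995-01-01"},
    {"name": "Student B", "birth_date": "1995-02-02"},
    {"name": "Student C", "birth_date": "1995-03-03"},
]

rates = enrollment_rate_by_program(k12, college)
print(rates)
```

The join requires the identifying fields from both sources, but what comes out is just a rate per program.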

This concentration of individual data spanning decades of educational experiences spooks a lot of people. Two factors help moderate those fears. First, according to federal regulation, data is not combined between states nor is it reported to the federal government. Only aggregate data (sums, averages and so forth) is reported to the federal government. Second, the Family Educational Rights and Privacy Act (FERPA) prohibits the release of any student information without permission from a parent. Of course, that doesn't reassure everyone. The mere fact that such databases exist concerns many.

I have a different concern. I've previously written about Theories of Change for educational improvement. In this case, the theory is that over time the collected data will help government officials, education officials, teachers and curriculum developers make better decisions based on what really works. But if we're trying to figure out how a particular curriculum choice in elementary school affects a student's college prospects, it may take 10 years or more to have the data to measure that effect. My concern is that this effort will take a long time to make a difference.

~ ~ ~ ~ ~

inBloom and SLDS both collect student data. Both leverage CEDS definitions for the data fields they collect. But the purposes of the data sets and the people who have access to the data are entirely different. Of the two, I'm more optimistic that inBloom will achieve the impact on student learning that our country needs.

23 April 2013

The Common Core State Standards - For My Concerned Friends

Even before their adoption in the Summer of 2010, the Common Core State Standards (CCSS) had their advocates and their critics. Recently, however, that criticism has made its way into the popular press. Knowing that I've worked on related projects at the Bill & Melinda Gates Foundation, friends and family have asked my opinion.

Several of the pundits have conflated five different projects as if they were all the Common Core. These are to some degree related but each has its own sponsors and they are being managed and adopted separately. They are:
In this post I'll address the Common Core and what distinguishes it from a curriculum. In a future post I'll write about inBloom and other data systems. And one more post will cover the assessment consortia.

The concept of state core standards gained prominence during the Bush Administration as part of the No Child Left Behind Act. In a recent blog post I wrote about how they are part of the Standards and Accountability theory of education reform and how later and more promising theories also rely on quality standards.

The result of NCLB and related efforts is that each of the 50 states developed its own core standards. This has the vague advantage of more local influence but it has two significant disadvantages. First, there are differences between what students learn in different states. So colleges and universities don't have a consistent standard of preparation to expect from students. Second, developers of tests and curriculum spread their resources 50 different ways. The result is lower quality teaching materials and examinations.

Starting in 2008 a consortium of state representatives developed the Common Core State Standards for ELA/Literacy and Mathematics. They don't include Science, Social Studies, History or any other subject. However they do specify literacy standards for Science and Social Studies. In other words, they specify that reading should be a significant part of those subjects without specifying the actual titles or subjects to be read.

The standards are written in the form of "competencies" – that is, descriptions of things that students should be able to do. For example, standard CCSS.ELA-Literacy.RL.8.5 reads, "Compare and contrast the structure of two or more texts and analyze how the differing structure of each text contributes to its meaning and style." Standards like the common core describe what is to be taught while curriculum describes how it should be taught.

Here are some ways that distinction applies: The Common Core describes the difficulty of text to be read at each grade; curriculum gives a list of actual books and stories. The common core describes the kinds of problems a student should be able to solve; curriculum specifies the order concepts will be taught and includes exercises to be performed. The rivalry between Phonics and Whole Language is not resolved by the Common Core; that decision remains in the hands of district curriculum committees.

Critics of the core have missed an opportunity here. Since curriculum involves textbooks, lesson plans and teaching materials, it consists of thousands of pages, tens of hours of video and other media. It's also copyrighted. All of this makes reviewing a curriculum a daunting task – albeit an important one. Meanwhile, the standards are relatively short and accessible. They are released under an open license and you can read them online at http://corestandards.org. They total somewhere around 200 pages long including appendices so you can review them in an afternoon.

They are different from previous standards. The ELA/Literacy standards start with a 50/50 balance between literary and informational texts (fiction and non-fiction) in the lower grades and increase that to a 30/70 split when social studies and science reading are included in upper grades. Reading in the English classes remains a 50/50 split through all grades.

The focus in all texts, whether fiction or non-fiction, is on critical thinking and extracting arguments and meaning from the text itself. As a result, "response papers" where a student expresses their opinion or thoughts about a document are discouraged in favor of more analytical writing that identifies arguments, contrasts perspectives and uses evidence from the documents themselves.

Many English teachers have objected to the shift away from an emphasis on fictional reading and writing. Picking up on that, one pundit suggested that Huckleberry Finn would be eliminated in favor of the phone book. Of course, the phone book isn't what the Common Core means by "informational texts". A sample list can be found in Appendix B of the common core. Remember that actual reading lists are the domain of curriculum. That's why this is in an appendix; it's not normative to the standard. Examples of informational texts in that list include the founding documents of our country, Lincoln's "Gettysburg Address" and Ronald Reagan's “Address to Students at Moscow State University”. This isn't the phone book.

To get an idea of how these texts might be taught, I recommend this video from David Coleman. He was a coordinator and key author of the ELA standards. In this video he demonstrates how to teach the standards using Martin Luther King's "Letter from a Birmingham Jail" and Lincoln's "Gettysburg Address." In both cases he shows the brilliance of the authors and how it's not necessary to teach a lot of background because the authors include the needed information in the texts themselves.

Regarding the math standards, there are two important shifts from existing teaching practice. First is that they have reduced the total amount of information to be taught. The overall theme is narrower and deeper. For example they require fewer methods for solving quadratic equations (narrower), but they also introduce complex numbers and the possibility of an imaginary result to a quadratic (deeper).

The second change is that they teach mathematics at three levels: conceptual understanding, computational and procedural fluency, and mathematical thinking. The overall goal is to help children become "numerate." That is, students should naturally apply mathematics to interpret things in their daily lives.

So, what are the objections? A common one is that this is a federal program to control what our students learn. First off, this is a state-led initiative, not a federal one. Secondly, the standards will only control teaching if they are considered to be limits to what is taught. But they are really a floor, not a ceiling, and most of the details remain left to the curriculum.

Other objections come from academics arguing for or against certain pedagogical theories that the rest of us aren't familiar with. For example, advocates for both Phonics and Whole Language have complained that the Common Core is a capitulation to the other side. But the Common Core Standards aren't as opaque as all of that. As I wrote a couple of months ago, the English standards focus on a few basic skills applied to increasingly complex texts. The math standards cover the familiar topics of arithmetic, algebra, geometry and so forth.

The biggest issue is that change is difficult and frequently unpopular. The changes demanded by the common core aren't easy ones. They require changes to curriculum; they require new lesson plans; and they require teachers to approach subjects in new ways. Many people are excited by the possibilities but it's not surprising that some would prefer to preserve the status quo. Unfortunately, status quo isn't good enough.

The Common Core State Standards offer two important advantages over previous state core standards. First is simply that they are common. We hope that by concentrating their efforts on one standard instead of 45, developers of curriculum and examinations can do a better job than before. The second advantage is that the Common Core is a second-generation standard built on a foundation of the best state standards and informed by the experience of those who built the first generation.

Are they perfect? Not likely. But these new standards are better than previous ones and they will become a valuable tool in our personalized learning arsenal.
Update 26 July 2013:

29 March 2013

A Taxonomy of Education Standards

I've previously posted and updated A Four-Layer Framework for Data Standards. When working with education standards I've also used the following taxonomy that categorizes standards according to their purpose. For convenience, this taxonomy is also available in PDF form under a CC0 dedication.

Types of Standards
There are three types of standards that are involved in educational efforts: Academic Standards, Data Standards and Technology Standards.

Academic Standards include achievement standards like the Common Core State Standards (CCSS) plus curriculum and testing standards. Contemporary practice in the U.S. is to describe academic standards in the form of learning objectives – descriptions of skills that students can acquire or demonstrate. Historically it was more common to describe standards in syllabus form – as a list of subjects to be studied.

Encouraged by the No Child Left Behind Act, the 50 states have each defined core curriculum standards. More recently, the CCSS standards for Mathematics and ELA-Literacy have been adopted by 45 states. Using a similar process, the Next Generation Science Standards have been proposed for multi-state adoption. In higher education there is no such consistency. Some institutions have developed their own sets of standards but most leave the objectives up to the professor. A few industry organizations publish standard sets. These include the AAAS Benchmarks for Science Literacy and the National Center for History in the Schools standards for History.

Data Standards define the data elements and structures used to store and exchange educational information. In the Four-Layer Framework data standards may include layers 1-3 (Data Dictionary, Data Model and Serialization).

For education, the three major domains of data standards are Student Data, Educator Data and Content Data. Important metrics like graduation rate, student financial aid repayment or college-going rate are derived from data sets but aren’t data in and of themselves.

Student Data includes traditional demographic information as well as a student record which includes academic achievements, assessment results, learning activities, attendance and so forth. Educator Data includes information about teachers and staff. It includes qualifying information like academic credentials, a portfolio of creative works and publications and data about teaching performance. Content Data, often called metadata, is information about learning materials including textbooks, assessments, multimedia and digital resources. Content data often indicates the alignment between learning resources and academic standards like the CCSS.

Technical Standards define how systems interoperate. Accordingly, they usually include the protocol layer of the Four-Layer Framework. A wide variety of standards may fit into this category but the majority of education-related technical standards involve Content Packaging Formats, Interoperability Protocols and Data Exchange Protocols.

Content Packaging Formats support the transport of learning content (e.g. text, video, graphics, etc.) and assessments between systems. Examples include IMS Common Cartridge and SCORM.

Interoperability Protocols support interoperability among learning systems. The most common use case is integration of learning tools (like simulations, games or assessments) into learning environments (like a learning management system). Key functions are to identify the user to the learning tool, ensure that they are authorized to access the content, transfer control to the tool, and collect data back. Common examples include OpenID, SAML, OAuth and IMS LTI.

Data Exchange Protocols represent layer 4 in the Four-Layer Framework for Data Standards. Thus, data exchange protocols are usually paired with a corresponding data standard. Frameworks for setting up data exchange protocols include ESB, SOAP and REST.

20 March 2013

Progress Report: The Personalized Learning Model

A bit more than two years ago my colleagues and I at the Gates Foundation came up with the Personalized Learning Model. Eighteen months ago I introduced it on this blog. Two weeks ago, at SXSWedu, we celebrated the launch of inBloom which is a set of services that support the Personalized Learning Model.

The concept of personalized learning was not new or unique to us. Indeed, we chose it because the benefits have been well-proven. Our model was a way to describe how technological supports could be designed to facilitate personalized learning. As we've been working on this for a couple of years now, it's time for a progress report.

Learning Objectives
In 2010 a consortium of states, coordinated by the Council of Chief State School Officers (CCSSO) and the National Governors Association (NGA), introduced the Common Core State Standards for English/Literacy and Mathematics. They were rapidly adopted by 45 U.S. states. Having common standards across states is, of course, convenient but these standards also seek to be an improvement on the previous generation.
The Common Core State Standards were written by building on the best and highest state standards in existence in the U.S., examining the expectations of other high performing countries around the world, and careful study of the research and literature available on what students need to know and be able to do to be successful in college and careers. No state in the country was asked to lower their expectations for their students in adopting the Common Core. The standards are evidence-based, aligned with college and work expectations, include rigorous content and skills, and are informed by other top performing countries. They were developed in consultation with teachers and parents from across the country so they are also realistic and practical for the classroom. (From the CCSS FAQ.)
In August of 2012, the CCSSO and NGA released official identifiers and an XML representation of the Common Core, thereby facilitating alignment of digital learning resources to the core standards. Driven by the need to measure and prove coverage of the standards, finer-grained identifiers are being assigned to individual learning objectives within the Common Core standards.

The Next Generation Science Standards are also under development with an expected release before the end of March. Following their release, state education boards will consider adoption.

Postsecondary education is taking a different approach. There's little formal agreement between colleges and universities on the learning objectives that compose common courses. However, college and university departments are defining the objectives for core curriculum and there is growth in the sharing of these objectives within university systems. Colleges are also considering use of the Common Core for developmental education courses.

Student Data
Common Education Data Standards (CEDS) is a project to create a common data dictionary and logical data model for education data. Applications that align to CEDS use the same definitions for data fields making data exchange easier and increasing fidelity.

The inBloom Data Store uses CEDS for its data model and ingests data in SIF and Ed-Fi data formats. It offers an API through which personalized learning applications can store and retrieve common student data. Security features preserve the privacy of data and ensure that only authorized people can access it.

Newer data stores align student activity and assessment data to standard learning objectives. The goal is to derive a model of what the student knows, what the student is learning and what the student has yet to learn. This enables rich reporting on student competency levels on an objective-by-objective basis and the stimulation of targeted interventions.
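To make that concrete, here is a minimal sketch of how such a model might be derived. It assumes each activity or assessment result is already tagged with a learning objective identifier and a score between 0 and 1. The record layout, the identifiers and the 0.8 mastery threshold are hypothetical, chosen only for illustration; they are not taken from CEDS or inBloom.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical results: (student, learning-objective id, score in 0..1).
RESULTS = [
    ("ana", "MATH.6.RP.1", 0.9),
    ("ana", "MATH.6.RP.1", 0.8),
    ("ana", "MATH.6.RP.2", 0.5),
    ("ben", "MATH.6.RP.1", 0.4),
]

MASTERY_THRESHOLD = 0.8  # assumed cut-off, chosen only for illustration


def competency_report(results, objectives):
    """Label each objective per student: mastered, learning or not started."""
    # Group every score by (student, objective).
    scores = defaultdict(list)
    for student, objective, score in results:
        scores[(student, objective)].append(score)

    students = sorted({student for student, _, _ in results})
    report = {}
    for student in students:
        report[student] = {}
        for objective in objectives:
            attempts = scores.get((student, objective))
            if not attempts:
                status = "not started"   # what the student has yet to learn
            elif mean(attempts) >= MASTERY_THRESHOLD:
                status = "mastered"      # what the student knows
            else:
                status = "learning"      # what the student is learning
            report[student][objective] = status
    return report


if __name__ == "__main__":
    objectives = ["MATH.6.RP.1", "MATH.6.RP.2"]
    for student, statuses in competency_report(RESULTS, objectives).items():
        print(student, statuses)
```

A real system would likely weight recent evidence more heavily than a plain average does, but the objective-by-objective shape of the report is the point here.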

I prefer to talk about educational content as learning activities. There are the traditional passive media such as reading, lectures, video and so forth. More engaging are interactive activities like virtual labs, simulations, virtual worlds and games. For both active and passive content, education doesn't need special formats. The web content formats managed by the W3C are adequate and well-supported. What is needed is a way to represent the alignment between the content or activities and the standard learning objectives.

The Learning Resource Metadata Initiative (LRMI) is a standard way to describe educational materials including their alignment to standards. It's based on the Schema.org metadata standard adopted by Google, Yahoo!, Bing and Yandex.
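As a rough illustration, an LRMI description of a single activity might look something like the sketch below. The property names (educationalAlignment, alignmentType, educationalFramework, targetName, targetUrl, learningResourceType, typicalAgeRange) come from LRMI and Schema.org. The resource itself, its URLs and the helper function are hypothetical. JSON-LD is used only because it is compact; the same properties can be embedded in a web page as microdata.

```python
import json

# Hypothetical learning activity described with LRMI / Schema.org properties.
# All names and URLs below are placeholders for illustration.
resource = {
    "@context": "http://schema.org/",
    "@type": "CreativeWork",
    "name": "Ratios with a Double Number Line",
    "url": "http://example.org/activities/ratios",
    "learningResourceType": "simulation",
    "typicalAgeRange": "11-12",
    "educationalAlignment": {
        "@type": "AlignmentObject",
        "alignmentType": "teaches",
        "educationalFramework": "Common Core State Standards",
        "targetName": "CCSS.Math.Content.6.RP.A.1",
        "targetUrl": "http://example.org/standards/6.RP.A.1",  # placeholder
    },
}


def is_aligned_to(item, target_name):
    """Return True if the item declares an alignment to the named objective."""
    alignment = item.get("educationalAlignment")
    if alignment is None:
        return False
    # Schema.org allows a single alignment object or a list of them.
    alignments = alignment if isinstance(alignment, list) else [alignment]
    return any(a.get("targetName") == target_name for a in alignments)


if __name__ == "__main__":
    print(json.dumps(resource, indent=2))
    print(is_aligned_to(resource, "CCSS.Math.Content.6.RP.A.1"))
```

The alignment object is what lets a search service match content to a specific learning objective rather than to a keyword.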

LRMI metadata can be shared between systems using the Learning Registry. The inBloom index consumes LRMI data from the learning registry and offers a search service that can find educational content suited to specific student needs.

IMS Global defines standards for packaging learning content for import into learning management systems. However, I prefer the approach IMS uses for Learning Tools Interoperability. Instead of packaging content, this protocol allows content from other sites on the web to be seamlessly integrated into the learning experience. Integration in this way avoids limitations imposed by the packaging format and lets the developers of learning activities collect data about the use and effectiveness of their products.

In my opinion, assessments are presently the weakest part of the Personalized Learning Model, but that's changing rapidly. Two multistate assessment consortia, Smarter Balanced and PARCC, are developing new assessments aligned to the Common Core State Standards. Both are committed to supplying formative and interim assessments in addition to year-end summative exams. CoreSpring is pooling assessments from more than six different sources to supply a bank of good quality assessments that can be used in class, for quizzes and in interactive learning environments. MOOC developers such as Coursera, edX and Udacity are having to invent new ways to offer interactive assessments at extremely large scale.

In the long run, I expect the line between learning activities and assessment activities to blur. After all, much of learning occurs when the student demonstrates understanding. With adequately instrumented activities, the accumulated data about student competencies should reduce the need for big summative exams at the end of the year.

~ ~ ~

We've come a long way in the last couple of years. Pioneers in this space like DreamBox, Knewton, Read180 and GrockIt had to build a whole infrastructure. But now there's a solid set of building blocks on which developers can build personalized learning applications. I anticipate a lot more innovation at the place where student data and content come together.

06 March 2013

Theories of Education Reform

My oldest son was a junior in high school when the standardized tests associated with No Child Left Behind were rolled out. One day, shortly before the exams, he asked me, "Why do we have to take these tests anyway?"

I answered truthfully, "They're not evaluating you, they're evaluating your school."

I found out later that with that information, he and his friends challenged each other to get the lowest scores possible. I sometimes use this story to illustrate broken feedback loops. It was nine months before the scores had any impact. When he returned to school the next fall he found he had been enrolled in remedial math despite acing Pre-Calculus the previous year. He had to meet with the counselor to get into the right class.

Today, however, I want to explore the theories of education reform that drove the deployment of these exams. There are three prominent theories of reform with a few variations. Most contemporary efforts to improve education are based on at least one of these.

Theory: Standards and School Accountability
This is the primary theory represented by No Child Left Behind (NCLB). It's based on the broader theory that measuring something and reporting on those measurements will bring about improvement – especially if improvement is incentivized. It also represents the truism that if you don't measure something, you can't tell whether you've changed it for the better.

In order to bring about accountability, NCLB requires states to define learning objectives for each year or grade. These objectives are commonly referred to as the state core standards and each U.S. state has its own set. Furthermore, any public school receiving federal funding must administer a state-wide standardized test to every student in grades 3-8 and at least once in grades 10-12. Student scores are compared with previous years' results to determine whether the school has achieved Adequate Yearly Progress (AYP). Certain consequences are tied to individual schools' success or failure to achieve progress for all students.

The core of the theory is this: If we set higher standards, measure against those standards and report performance then learning will improve. Unfortunately, 11 years into this experiment the quality of U.S. student learning is nearly flat.

There are numerous criticisms of standards and testing; but my personal concern is that by themselves they are a blunt instrument. In the absence of a proven formula for improvement the result is a form of natural selection – schools that underperform are taken out (actually they "receive interventions") while better performers survive. Natural selection is proven to work but it takes many generations and a lot of the population are brutalized before measurable improvement occurs.

Despite the lack of success, it's not time to abandon standards or accountability. Prior to 2002 most states didn't have well-defined core standards nor was student performance consistently measured. Now all states have standards, we are measuring regularly and 45 of the states have recently agreed to the Common Core State Standards. While standards and testing are inadequate remedies by themselves, they are important assets on which to build.

Theory: Highly Qualified Teacher
Where the Standards and Accountability theory focuses on school improvement, this theory focuses on teacher improvement. It's certainly intuitive; most of us have had one or more great teachers and we know they make a big difference. It's also justified by the data. Studies confirm that teacher quality is an important factor in student achievement and that the variation in achievement between classes within the same school is greater than variation between schools.

NCLB includes a mandate for states to supply highly qualified teachers to every student but it leaves it up to states to determine what it means to be highly qualified. And that turns out to be a problem. Studies show that certain teachers are consistently more effective than others; value added measures can identify which ones they are (albeit with a moderate error rate); but individual teachers often don't know what they need to do to improve.

In raw form this becomes another application of natural selection. If we reward teachers who perform well and eliminate those who don't then eventually performance will improve – assuming we don't run out of teachers beforehand. But many generations will be required and a lot of brutal actions will be taken in the meantime. No wonder there's so much controversy around teacher evaluations being tied to wages and promotions.

I'm actually in favor of merit pay for teachers so long as good quality performance measures are used. But those evaluations need to be deployed concurrently with professional development that informs teachers on how they are doing and what they can do to improve. Conveniently, resources are emerging to support that. For example, the Measures of Effective Teaching project used the Danielson Framework for Teaching to identify teacher behaviors that are well-correlated with student performance. These and similar frameworks can be used to inform teachers on how they can do better.

Even so, effective teachers alone are not enough. In our current educational system, teachers account for approximately 8.5% of variation in student achievement. School-, teacher-, and class-level factors combined account for about 21%. Meanwhile, background characteristics such as race, parental achievement and family income combine to account for 60% of variation in achievement levels.

So, if every teacher in the country were equivalent to our very best, it still wouldn't be enough to overcome the cycle of intergenerational poverty. To achieve that dream, we have to increase the influence school has over student achievement. That can be done by adapting the learning experience to the needs of individual students.

Theory: Personalized Learning
There's a pattern to these theories: The Standards and School Accountability theory introduces the concept of measurement and uses it to assess whole schools. The Highly Qualified Teacher theory takes those same measures and applies them at the teacher level. For this third theory, feedback is applied at the student level.

Personalized learning leverages the same standards as the other theories. It can also incorporate the same measures. However, annual testing alone is insufficient for personalization. Instead, understanding is measured weekly, daily or, in the best adaptive learning systems, continuously. Measurement must happen soon enough and feedback given quickly enough to affect learning activities. A truly personalized system selects activities according to student needs and also adapts to student behavior within an activity.
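The feedback loop described above can be sketched in a few lines. This is only an illustration under stated assumptions: the activity catalog, the 0 to 1 mastery estimates, the 0.8 threshold and the moving-average update are all hypothetical simplifications, not how any particular adaptive system works.

```python
# Hypothetical catalog: activity name -> learning objective it targets.
ACTIVITIES = {
    "ratio-game": "ratios",
    "fraction-lab": "fractions",
    "decimal-quiz": "decimals",
}


def next_activity(mastery, activities, mastered_at=0.8):
    """Pick an activity for the not-yet-mastered objective with the lowest estimate."""
    candidates = [
        (mastery.get(objective, 0.0), name)
        for name, objective in activities.items()
        if mastery.get(objective, 0.0) < mastered_at
    ]
    if not candidates:
        return None  # every objective in the catalog is mastered
    return min(candidates)[1]


def update_mastery(mastery, objective, score, weight=0.5):
    """Blend a new score into the running estimate (simple moving average)."""
    previous = mastery.get(objective, 0.0)
    updated = dict(mastery)
    updated[objective] = (1 - weight) * previous + weight * score
    return updated


if __name__ == "__main__":
    mastery = {"ratios": 0.9, "fractions": 0.3, "decimals": 0.6}
    activity = next_activity(mastery, ACTIVITIES)
    while activity is not None:
        objective = ACTIVITIES[activity]
        # Pretend the student scores perfectly each time, to keep the demo short.
        mastery = update_mastery(mastery, objective, 1.0)
        print(activity, round(mastery[objective], 3))
        activity = next_activity(mastery, ACTIVITIES)
```

The important property is that every measurement immediately changes what the student is offered next, which annual testing cannot do.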

Bloom's Two Sigma experiments and the follow-up work they inspired make me optimistic about Personalized Learning. These and other studies have shown that personalized learning experiences enabled by immediate feedback consistently deliver one to two standard deviations of improvement in learning. We believe that is sufficient to overcome background factors, thereby enabling a majority of students to become high achievers.
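To get a feel for what one or two standard deviations means, it helps to translate the gain into percentiles. The sketch below assumes achievement is normally distributed, which is a simplification, and uses a hypothetical helper built on Python's standard library.

```python
from statistics import NormalDist


def percentile_after_gain(sigma_gain, start_percentile=50.0):
    """Percentile reached after moving up by sigma_gain standard deviations.

    Assumes a normal distribution of achievement (an idealization).
    """
    standard = NormalDist()  # mean 0, standard deviation 1
    start_z = standard.inv_cdf(start_percentile / 100.0)
    return 100.0 * standard.cdf(start_z + sigma_gain)


if __name__ == "__main__":
    for gain in (1.0, 2.0):
        print(f"+{gain} sigma: {percentile_after_gain(gain):.1f}th percentile")
```

Under that assumption, a student who starts at the median moves to roughly the 84th percentile with a one-sigma gain and to roughly the 98th percentile with a two-sigma gain.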

Personalized learning is the natural result of 1:1 tutoring which is why tutoring is so effective. To do personalized learning at classroom scale generally requires 1:1 computers and a role change for the teacher as she shifts from "deliverer of knowledge" to "facilitator of learning." As with the other theories, there's a lot of skepticism and resistance to change. But pilot deployments are showing great promise.

Variation: School Choice
School Choice attempts to bring competitive pressure for schools to perform better. In this way, it's a variation on the Standards and School Accountability theory. Like NCLB, School Choice needs standards to be set and school performance must be measured against those standards. Performance is reported to parents who are expected to make an informed choice of which school their students should attend.

Since allocation of school funds is tied to enrollment, the theory is that schools seeking students will compete, not only on standards and their measures, but also on the basis of any other factor that's important to parents and students.

School Choice efforts include charter schools, magnet schools and voucher programs. The idea is to give public and private schools more freedom to experiment, thereby accelerating the identification of viable formulas for improved learning. Studies have shown this to be the case, as the average of charter school outcomes is similar to that of public schools while variation among charter schools is much greater. Therefore, some charter schools are substantially better and should be emulated while others are substantially worse and should be shut down or reorganized. It's exactly this kind of variety and freedom that school choice advocates seek.

School choice can incorporate Highly Qualified Teachers and Personalized Learning. Indeed, since both of these theories have been shown to be effective, the expectation is that schools that incorporate these principles will be the best rated and will attract more students.

Variation: Small Classes
The small classes movement is based on studies showing that students learn better in smaller classes – all other factors being equal. But other factors are not equal. Lowering the student:teacher ratio costs a lot of money and other factors such as teacher skill have a greater impact than class size. For example, when California mandated smaller classes they had to hire many more teachers. For at-risk populations, the impact of less-experienced teachers overcame the benefits of smaller classes resulting in lower performance instead of the expected improvement.

Variation: No Excuses
The No Excuses model centers on maintaining high expectations for student performance without making excuses for external issues such as background, troubles at home and so forth. It's associated with charter management organizations such as KIPP and BES. Proponents emphasize pillars such as college expectations, culture of respect, voluntary participation and high discipline. They also have extended hours and extended school years. A key value is the whole school's commitment to each student's success. If a student is struggling or falling behind, they discover that early and engage counseling, tutoring and other supports to ensure the student succeeds.

No Excuses engages all three theories: overall school performance is measured, the schools hire and train highly effective teachers, and they adapt the learning environment to the needs of individual students, although most No Excuses schools do this adaptation with limited use of technology. Over the last decade, No Excuses schools have demonstrated that background factors can, indeed, be overcome by a supportive school structure. On the other hand, their high reliance on supportive interventions sometimes leaves students underprepared for the independent learning discipline required in college. Recognizing this, No Excuses organizations are updating their practices to better train students to become independent learners.

~ ~ ~ ~ ~

As with the variations listed here, most reform projects mix two or more of these theories. Even NCLB includes a mandate for Highly Qualified Teachers. Personalized Learning efforts are more common at charter schools than conventional public schools.

Education Reform will remain an important part of our civic dialog for a long time. Unsurprisingly, it means different things to different people. For some it's a moral crusade. To those being asked or forced to reform it's more threatening. All too often arguments about reform neglect the research (which is abundant) and fail to fully express the theories on which they are based. That shouldn't be the case, as there are decades' worth of data and studies behind each of these theories, sufficient for advocates and policy makers to make informed decisions.

The data tells those of us seeking to eliminate poverty that incremental improvement to existing schools is insufficient. Personalized learning with an eye toward training independent learners seems to be the most promising approach. Deploying this at scale requires whole-school changes to the way programs are funded, to the choices of curriculum and technology, and to the roles of educators.