25 November 2013
11 October 2013
[Figure: A closed-loop control system.]
[Figure: A personalized learning system.]
In that post I noted that the feedback loop needs to be "closed" in the sense that we use feedback to influence the direction instruction should take. I also noted that feedback needs to be "negative" in the mathematical sense: it should reflect the difference between the skills the student demonstrates and the standards that are to be taught. Both of these concepts, closed feedback loops and negative feedback, are derived from engineering control theory.
In this post, we'll consider three more insights we can gain from control theory: the role of a transfer function, speed and frequency of feedback, and sensitivity to what is being measured.
Transfer Functions
A transfer function is a mathematical description of the relationship between the input and the output of a system. Let's use the car example from my previous post. In that case, the input is the position of the gas pedal (more correctly called the "accelerator pedal" as we'll see in a moment). The output is the speed of the car. Pressing the pedal to a certain position doesn't make the car go a corresponding speed. Rather, pressing the pedal causes the car to accelerate at a rate proportional to the pedal position. If you keep the pedal pressed to the floor, the car will continue to accelerate to higher speeds until it reaches the limits of its construction. (For calculus fans, this means that the transfer function of a car's drive train is an integral.)
The transfer function is important because it's incorporated into the design of the controller. When an engineer designs a controller they use the transfer function to anticipate what will be the result of a particular input. Typically they take the inverse of the function to determine what input is required to achieve the desired output.
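Here's a minimal sketch of the car example as a closed loop. It's a toy model, not a real cruise controller: the function name, the gain, and the acceleration constant are all values I made up for illustration. The pedal position sets the acceleration, the speed is the running sum (integral) of that acceleration, and the controller presses the pedal in proportion to the gap between the target speed and the measured speed.

```python
def simulate_cruise(target_speed, gain=0.5, accel_per_pedal=3.0, dt=0.1, steps=600):
    """Toy closed loop: pedal position sets acceleration; speed is its integral."""
    speed = 0.0
    history = []
    for _ in range(steps):
        # Negative feedback: the desired output minus the measured output.
        error = target_speed - speed
        # Controller: pedal position proportional to the error, clamped to 0..1.
        pedal = max(0.0, min(1.0, gain * error))
        # Plant (the transfer function): integrate acceleration into speed.
        speed += accel_per_pedal * pedal * dt
        history.append(speed)
    return history


speeds = simulate_cruise(target_speed=25.0)
print(round(speeds[-1], 2))
```

With these made-up constants the speed climbs steadily while the pedal is floored, then eases into the target without overshooting it.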
The educational equivalent of a transfer function is a learning theory – a description of how people learn. Learning theories help us select activities that will effectively help a student learn a particular skill. Descriptions of the various theories and their strengths and weaknesses are beyond the scope of this post (or my skills for that matter). I recommend the Wikipedia article on the subject. But we can derive two important insights from this:
- A personalized learning system will inevitably express some learning theory in the selection of activities. It would be best to deliberately select the theory and design the system accordingly.
- There are personal differences in the way each student learns. In engineering terms, this means that each student has their own personal transfer function. Therefore, the selection of activities should be tuned to the student's individual interests and affinities.
Speed and Frequency of Feedback
The time from the moment an output is measured to a resulting change in the input is called a propagation delay. In educational terms, this is the time from when a student's skill is assessed until the moment the student's activity is affected by that assessment. In a traditional math class the student does homework one day, submits it the next day and receives the graded homework back the day after that. Thus, the propagation delay is two days (or two class periods). Fast feedback means a shorter propagation delay. Many online learning systems offer near instantaneous feedback. Measurements that require human grading will naturally be slower.
Frequency of feedback is a measure of how often the output (or skill) is measured and feedback generated. In the traditional mathematics example, feedback is daily (or once per class period). Some traditionally taught courses may only have two or three graded activities in the entire course. However, this may be a pessimistic way to measure frequency. For example, if students can check their answers in the back of the book then feedback is both faster and more frequent.
A third component of educational feedback is richness. In math, this might be the difference between being told that an answer is wrong and being informed about exactly what mistake was made. In English it might be the difference between a simple score and detailed feedback about how the student might improve their paper.
Students can influence all three feedback factors. For example, if English students seek help at a writing lab then they will be getting faster, more frequent and richer feedback than students who don't make use of the resource.
Control theory tells us that faster and more frequent feedback compensates for inaccurate measurements and poorer transfer functions. In education language, this means that if we can make feedback faster and more frequent we can compensate for a less-than-perfect learning theory and suboptimal assessments.
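To make that claim concrete, here's a small sketch under stated assumptions. It is a toy model, not a validated model of learning: the controller believes each unit of effort produces twice the progress it really does (a poor transfer function), and it only corrects its belief when a measurement arrives. The function name and every constant are hypothetical.

```python
def final_gap(measure_every, steps=40, target=100.0,
              true_gain=0.5, model_gain=1.0, rate=0.2):
    """Return how far short of the target the system ends up."""
    actual = 0.0    # the real level, hidden between measurements
    believed = 0.0  # what the controller thinks the level is
    for step in range(steps):
        if step % measure_every == 0:
            believed = actual  # feedback: replace belief with a measurement
        effort = rate * (target - believed) / model_gain
        actual += true_gain * effort     # what really happens
        believed += model_gain * effort  # what the flawed model predicts
    return target - actual


for interval in (1, 10, 40):
    print(interval, round(final_gap(interval), 1))
```

With the same flawed model in all three runs, measuring every step leaves a gap of roughly 1.5, measuring every ten steps leaves roughly 9.4, and measuring only once at the start leaves roughly 50. The model never improved; only the feedback frequency changed.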
Of course it would be nice to have everything -- fast, frequent and rich feedback, good quality assessments and a solid learning theory. But it's useful to know that there are real tradeoffs among these factors.
Sensitivity to What's Being Measured
Feedback loops are a very effective tool; so effective that if the wrong thing is being measured or the wrong feedback is offered then the wrong skill will be optimized. A recent manifestation of this is the complaint about "teaching to the test." The concern is that since summative tests are used to evaluate schools, the only skills that will be taught are those that are on the test. While this outcome is common, it's unfortunate since studies have shown that a focus on conceptual understanding results in better test performance than test-focused instruction.
It's also manifest in the combination of skills that a particular problem might require. For example, a mathematics story problem might require reading, visualization, and problem solving skills in addition to the ability to solve the resulting mathematical equation. In order to offer feedback to a wrong answer, the system (whether human or automated) must be able to detect which of these skills was not applied properly. In most cases, this requires interacting with the student to discover the steps followed in answering the question.
It's tempting to try and isolate skills and only assess one at a time. There are two reasons why this won't work. First, it's very likely that you're seeking the student's ability to use multiple skills together. Second, the demand for some skills simply can't be eliminated. For example, nearly every assessment requires the skill, "Can read and follow directions."
Applying Feedback Loops
To summarize, engineering offers us the following insights about using feedback in education:
- Choose your learning theory deliberately and measure its effectiveness.
- Adapt not only to what the student has and has not mastered but to the individual learning patterns and affinities of each student.
- Fast and frequent feedback can compensate for lower quality in other areas of the system. This is a two-edged sword; you may think you have a good learning theory when, in fact, it's fast feedback that's making the difference. But it's also an opportunity to make deliberate trade-offs.
- Be sure you're measuring what you think you're measuring. And don't forget that every assessment measures multiple skills.
26 July 2013
To many of us, the lack of progress has been confusing. In just about every other industry, when you measure performance and report it back, performance improves. Education is proving to be a more difficult problem than most.
Unintended Consequences
Part of the issue is unintended consequences from standardized testing. Consider a teacher who is anticipating the year-end tests. She and her principal are under pressure to achieve Adequate Yearly Progress (AYP) goals. So, they have regular drill and review sessions -- test preparation to make sure the students are ready.
All of this "teaching to the test" takes time and resources away from more enriching and interesting learning experiences. And it doesn't work. The Measures of Effective Teaching project found that teaching to the test is not as effective as a focus on conceptual understanding and applications. (MET Preliminary Findings Page 21)
That shouldn't be surprising. I wrote about the Flow Channel or Zone of Proximal Development a few months ago. In order to keep the student's attention and prevent frustration, the work needs to be new and challenging but not excessively so. If the work is too easy, the student is bored. If too hard, the student is anxious. In either case the student is frustrated and learning does not occur. Constant drilling in preparation for the test leaves students squarely in the boredom zone.
There are lots of important skills that aren't on the exam. Yong Zhao has written that the most important 21st Century Skills are creativity, entrepreneurship and independent learning. These aren't emphasized in the standardized exams. But the basic skills of literacy and numeracy (which do appear) are required for the kind of creativity and entrepreneurship we need. For this reason, I wish that the Common Core State Standards were named the "Common Foundation Standards" because that's the way I see them -- the foundation skills required for creative work.
This is another unintended consequence of standardized testing. Excessive focus on the exam steals classroom time that could be used for creative application of the knowledge or for self-directed learning. The MET study and others show that more of the latter activities result in both better prepared students and better exam results.
How To Do Better
Noting the minimal progress, many advocates call for abandoning the standards and assessments altogether. They look at the amount of time and money dedicated to assessment and suggest these resources could be spent in better ways. But, in the absence of standards and measurement we wouldn't know if we are succeeding or failing. The best we could hope for is blissful ignorance. In my opinion we must push forward, improving standards, measurement and teaching.
While improvement will require changes throughout the educational system, there are things that can be done with the assessments themselves to support improvement. Here are some of the things being done by the Smarter Balanced Assessment Consortium and by PARCC:
Richer, More Authentic Assessment Items
Both consortia will use computer-delivered assessments that are much closer to real-world activities. An emphasis on constructed response items will require students to compose an answer to a problem, not just select from a set of prewritten answers. Upon evaluating the consortia's assessments, the UCLA CRESST center concluded, "Both PARCC and Smarter Balanced summative assessments ... will represent many goals for deeper learning, particularly those related to mastering and being able to apply core academic content and cognitive strategies related to complex thinking, communication, and problem solving."
Teaching to the test becomes less of a problem the closer the exam gets to assessing real and authentic skills.
Guidance from Interim Assessments
In addition to the year-end summative assessments, both consortia are offering voluntary interim assessments that teachers and administrators can use to gauge students' understanding throughout the school year. If students are found to be prepared, less time will be spent on boring and unnecessary drills. Likewise, identification of weak areas can guide teachers in reviewing just the necessary lessons.
Professional Development and Formative Activities
The consortia are developing training materials for teachers. These will include information on how to plan formative assessment activities for the classroom and how to interpret and make use of assessment results.
Precise, Individual Level Reporting
Existing state assessments measure the number of students who have achieved the state competency threshold for their particular grade. These measures are reported to schools, districts and states in hopes of improving education programs at those institutions. Threshold tests can measure whether a student is above or below the expected competency but the further a student is from that level, the less accurately they can indicate the student's actual competency.
Smarter Balanced will use computer adaptive testing to precisely measure each student's skill level. In adaptive tests, questions are selected based on the results of previous assessment items and testing ends once the student's skill level has been measured to a certain level of confidence. Thus, students that are below grade level aren't subjected to a long series of questions that they can't answer and both those above and below the threshold receive accurate measures of their skill levels.
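Here's a rough sketch of the adaptive idea. To be clear about the assumptions: this is not the Smarter Balanced algorithm. Real adaptive tests use statistical models of how likely a student is to answer each item; I've swapped in simple bisection on a made-up 0 to 100 skill scale, and I assume a student always answers correctly when the item's difficulty is at or below their skill. The names are hypothetical.

```python
def adaptive_test(answers_correctly, low=0.0, high=100.0, tolerance=5.0):
    """Narrow the skill estimate until the remaining range is within tolerance."""
    asked = []
    while high - low > tolerance:
        # Pick the next item based on the results so far.
        difficulty = (low + high) / 2
        asked.append(difficulty)
        if answers_correctly(difficulty):
            low = difficulty   # the student is at least this skilled
        else:
            high = difficulty  # the student is below this level
    return (low + high) / 2, asked


# A hypothetical student well below the midpoint of the scale.
estimate, asked = adaptive_test(lambda difficulty: 23.0 >= difficulty)
print(estimate, len(asked))
```

Notice that after two missed questions the sketch stops asking hard items and spends the rest of the test near the student's actual level. That is the property that spares below-grade-level students a long series of questions they can't answer.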
These more precise assessment measures are used to generate clear, individual level reports for students, their parents and teachers. The reports will have sufficient detail to show growth year over year and to optimize instruction to individual student needs.
Standards and associated assessments haven't resulted in the improvements that were hoped for. But that doesn't mean we should give up on them. They offer an important support for the Personalized Learning theory which has been proven. Refinements to standardized tests listed above will reduce unintended consequences and offer the guidance needed to optimize each student's learning experience.
06 July 2013
Here's a rough outline of how a series of pioneers developed the technologies you find familiar:
- 1945: Vannevar Bush, Director of the Office of Scientific Research and Development, writes "As We May Think." Writing at the conclusion of World War II, Bush considers how technologies developed for war can be used to further peace. He envisions an electromechanical system based on microfilm and dry photography that can manage all of the data a person needs and help them organize it into knowledge.
- 1968: Douglas Engelbart, Director of the Augmentation Research Center at SRI, is inspired by Bush's article. He realizes that the concept can be achieved much more readily using digital computers instead of an electromechanical system. In 1968 he demonstrates their oNLine System (NLS) in what we now call the "Mother of all Demos" including a mouse, graphical user interface, collaborative word processing, teleconferencing and a host of other features that would take decades to make it into the mainstream.
- 1973: Alan Kay, who had attended Engelbart's demo, incorporates many of Engelbart's ideas into the Xerox Alto. Designing the Alto so that it can be used by children, Kay's insight is that the user interface should manifest the functions that are available. Thus, the system itself can teach the individual how to use it.
- 1984: Steve Jobs, who had seen a demo of the Alto in 1979, incorporates key elements into the Apple Macintosh. Features inherited from NLS and the Alto include the mouse, GUI and computer networking. Jobs' most important contribution is to get these ideas out of the lab and offer them to a mass market.
But there was a critical feature in NLS that we have not yet replicated. Each document was given a unique ID. Printed NLS documents were easily recognized because they included index numbers in the margins. The document ID and index number allowed individuals to reference any line in any NLS document.
That need for consistent identifiers that can be referenced has yet to be addressed for most texts. Today, the best that citation systems can do is refer to a page number. But page numbers change with different formats (e.g. hardbound vs. paperback) and editions. Suppose, for example, you want to cite a particular quote from Huckleberry Finn. In order to do so, you have to specify the publisher and edition of the book before citing the page number. And the odds of a reader having that same edition are pretty low. Textbook publishers have taken advantage of this. By changing pagination between editions, they deliberately obsolete previous editions.
With the advent of digital books the problem is compounded. Page numbers change according to user preferences like font size and page orientation. As a stopgap, the Amazon Kindle added "real page numbers" so that you can use references derived from the paper version of a book. But the problem of persistently valid references across editions lingers.
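A tiny sketch shows why page numbers make fragile references. The document ID and paragraph index below are hypothetical, and I'm assuming a layout with a fixed number of paragraphs per page, which real books don't have. The point is only that the page moves with the layout while an identifier tied to the text does not.

```python
def page_of(paragraph_index, paragraphs_per_page):
    """Page number (starting at 1) on which a paragraph lands for a given layout."""
    return paragraph_index // paragraphs_per_page + 1


# Hypothetical persistent reference: (document ID, paragraph index).
reference = ("huck-finn", 137)

large_print_page = page_of(reference[1], paragraphs_per_page=4)
small_print_page = page_of(reference[1], paragraphs_per_page=10)
print(reference, large_print_page, small_print_page)
```

The same passage lands on page 35 in one layout and page 14 in the other, while the reference itself never changes.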
As we honor the legacy of Doug Engelbart, it's appropriate to consider one more of his innovations -- a persistent and universal referencing system. We still need it.
22 June 2013
Technology readiness is a new concern for state and district technology directors. With high-stakes assessments going on, concerns are even higher. The RTTA Assessment Consortia have collaborated on a Technology Readiness Tool that states and districts are using to survey and report on their preparedness to perform assessments. Smarter Balanced and PARCC have added Technology Readiness Calculators that help districts and schools perform capacity planning.
But there are a bunch of things that go wrong that are overlooked by conventional planning and testing. Some of them are even missed by experienced network technicians. It's not that the tools above are flawed. It's the nature of these kinds of problems that requires them to be addressed in a different way. Today I hope to prevent them from biting you. The following list isn't comprehensive. But it's a good start.
This is why things work when you try them the night before, but on the first day of class, with 30 students all trying to stream video at the same time it all falls apart. It's a well-known issue and bandwidth planning is a key part of the planning tools I mentioned above. Still, this is important enough that it bears an additional mention.
EducationSuperhighway is doing a survey of actual in-classroom bandwidth and using that data to advocate for better connectivity. They use a bandwidth test and ask teachers and educators to run it periodically. While they acknowledge the weaknesses of this approach (listed below) getting a lot of samples will increase the accuracy of their reports. So, if you work in a school, please go to SchoolSpeedTest.org and run their test from time to time. Not only will it inform you but it will also contribute to nationwide advocacy for school bandwidth.
Mismeasurement of Bandwidth
Bandwidth tests like SchoolSpeedTest.org are useful tools but they can be misleading. Most internet service providers offer "burst speed" in excess of the guaranteed bandwidth. Consider a municipal ISP. Perhaps they purchase 10Gbps of bandwidth from their upstream provider and parcel it out to 100 customers at 100Mbps each. None of those customers will use all of their bandwidth all of the time. So the ISP lets them burst beyond 100Mbps, using some of their neighbors' unused bandwidth. So, if you do a bandwidth test at a favorable time, the result could be much higher than what's guaranteed by your ISP.
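The arithmetic behind that can be sketched in a few lines. This is a simplification under my own assumptions: the ISP hands out all idle upstream capacity to whoever asks, and there is no separate cap on your line. The function name and the neighbor traffic figures are invented for illustration.

```python
def burst_ceiling_mbps(upstream_mbps, guaranteed_mbps, neighbors_usage_mbps):
    """Upper bound on one customer's speed: their guarantee plus idle upstream."""
    idle = upstream_mbps - guaranteed_mbps - sum(neighbors_usage_mbps)
    return guaranteed_mbps + max(0.0, idle)


# 10Gbps upstream shared by 100 customers at a guaranteed 100Mbps each.
quiet_evening = burst_ceiling_mbps(10_000, 100, [20] * 99)   # neighbors mostly idle
busy_morning = burst_ceiling_mbps(10_000, 100, [100] * 99)   # everyone at their limit
print(quiet_evening, busy_morning)
```

A test run on the quiet evening could report far more than 100Mbps, while the same test on the busy morning would report just the guaranteed rate.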
On the other hand, I've visited with schools who get much lower performance in their classrooms than what they pay for. In their cases, outdated networking equipment or problems with their wireless networking create a bottleneck that slows things below their purchased capacity.
Inadequate Access Point Capacity
Wireless networking bridges to wired networking through access points positioned around the building. Home networks combine the access point with the router. But commercial networks usually have one or two routers for the whole building with access points positioned strategically throughout.
Any access point has a limit to the number of computers it can serve. Consumer grade devices have a lower capacity but even commercial units can be overwhelmed if you get too many devices in the same room. Most people know that the access point's bandwidth (typically 54Mbps) is shared among all connected devices. But you can't just test the bandwidth in a room with a single computer and then divide by the expected number of computers to get available bandwidth. There's bandwidth overhead to each connection and there's a ceiling on the total number of computers that can be supported by a single access point. The max device count varies from model to model but it's always there.
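Here's a rough sketch of why simple division overestimates what each computer gets. The 1% overhead per connection and the ceiling of 30 devices are numbers I made up; check the specifications of your own access points for real figures. Treating everything beyond the ceiling as zero bandwidth is also a simplification.

```python
def per_device_mbps(devices, ap_mbps=54.0, overhead_per_device=0.01, max_devices=30):
    """Estimate bandwidth per device on one access point (hypothetical figures)."""
    if devices > max_devices:
        # Simplification: past the ceiling, treat the room as unserviceable.
        return 0.0
    # Each connection consumes a slice of capacity before any data moves.
    usable = ap_mbps * max(0.0, 1.0 - overhead_per_device * devices)
    return usable / devices


naive = 54.0 / 30
print(round(naive, 2), round(per_device_mbps(30), 2), per_device_mbps(31))
```

Under these assumptions, 30 devices each get noticeably less than the naive estimate, and the 31st device pushes the room past what the access point can serve at all.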
Interference Between Access Points
One way to address access point capacity limits is to use more of them. But if you pack them too closely together they will interfere with each other -- thereby impairing your capacity rather than building it.
Interference From Other Devices
The 2.4GHz band used by most Wi-Fi devices is also used by Bluetooth, some cordless phones, and many other devices. The 5GHz band is also available for Wi-Fi but it isn't supported by as many devices and it has a shorter range. Microwave ovens also happen to be in the 2.4GHz band. Poorly shielded units can jam all network traffic in their vicinity.
This particular case of device interference deserves special attention. Smarter Balanced supports iPads and other tablet devices as acceptable testing devices. However, we require a physical keyboard when taking assessments. Typically, people use wireless Bluetooth keyboards with iPads. As noted above, Bluetooth operates on the same frequency band as most Wi-Fi networks. It's not noticeable when three or four keyboards are in a room but when 30 get going, there can be significant interference with the Wi-Fi network. The network won't go down, but its bandwidth will be impaired.
Keyboards and other input devices can also interfere with each other. For example, Logitech's recommended density for their wireless keyboards and mice is far lower than a typical computer lab. Another danger is that students might mix up the keyboards among the devices. Finally, wireless devices have batteries to maintain.
Conveniently, Logitech and other manufacturers now offer wired keyboards for iPads.
Inadequate Router Capacity
In addition to bandwidth limits, routers have a number of other capacity limitations. Every open internet connection requires dedicated router capacity, even if it's idle. A lower-end router may not be able to handle more than 50 or 100 devices at a time.
Too Few Network Addresses
Routers also typically manage network address assignment. Using the DHCP protocol, routers "lease" out addresses from their pool. Home routers typically have a pool of 100 or fewer addresses. Even commercial routers in their default configuration may not have a pool bigger than 250. Lease time is also important. A typical router configuration might have a pool of 200 addresses and give out week-long leases. In such a situation, if more than 200 devices come through the doors of the school in a week's period all of the addresses could be used up even if fewer than that number are present at any particular time.
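The lease problem is easy to see in a small simulation. This is a sketch with invented numbers: 50 regular devices that show up every day, plus 40 different visiting devices each day. I'm also assuming a device renews its lease on every day it's present and that an address is held until the lease runs out.

```python
def leases_held(devices_seen_by_day, lease_days):
    """Count addresses tied up at the end of each day."""
    last_seen = {}
    held = []
    for day, devices in enumerate(devices_seen_by_day):
        for device in devices:
            last_seen[device] = day  # new lease or renewal
        # An address stays reserved until its lease expires.
        active = sum(1 for seen in last_seen.values() if day - seen < lease_days)
        held.append(active)
    return held


# Hypothetical school week: 50 regulars daily plus 40 new visitors each day.
week = [
    set(range(50)) | {f"visitor-{day}-{i}" for i in range(40)}
    for day in range(5)
]
week_long = max(leases_held(week, lease_days=7))
day_long = max(leases_held(week, lease_days=1))
print(week_long, day_long)
```

No more than 90 devices are ever in the building at once, yet week-long leases tie up 250 addresses by Friday, more than the pool of 200 in the example above. With one-day leases the peak is 90.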
Insufficient Power or Cooling in the Room
If you're setting up a temporary computer lab (e.g. for year-end testing) you may find that the room hasn't been wired with enough power for the number of systems you set up. Also, a desktop computer with monitor puts out about as much heat as a person. So adding 30 computers to a room meant for 30 people can double the cooling requirement. Laptops and tablets consume less power and generate less heat but the demand is still notable.
Here are some ideas on how to prevent problems like the above before they happen:
- Plan Ahead: Make sure the expected infrastructure is in place well in advance of key days (first day of school, first testing day) so that you have time to check everything out.
- Wired is Better: Wired networking has much greater capacity and reliability than wireless. Wired input devices are naturally tethered to the corresponding device and don't require batteries.
- Read the Specs: Don't just test your bandwidth. Find out from your ISP how much is guaranteed. And compare your purchased capacity against your tested bandwidth. Likewise, don't just test the wireless network, look up the capacity specifications of your access points and routers and make sure they meet your needs.
- Hire a Tech: Get a trained network technician on staff, or at least under contract, and have them do a site survey.
- Do a Scale Test: Load a room with the expected number of devices and get that many people to exercise them all at once.
- Identify Interference Points: Who shares your internet connection? What other rooms share an access point? What facilities share a router? What is between the access point and its intended devices? (walls, furniture, etc.) Do any of the barriers move?
- Build In Redundancy: Install redundant devices at key places (e.g. routers and access points). If redundancy isn't affordable, have spare equipment available on-site.
- Map Your Network and Document Your Configurations: List everything you would want to know when troubleshooting or replacing a defective item.
Updated 24 June 2013 to add information about SchoolSpeedTest.org.
11 June 2013
The SETDA paper incorporates two models that I've used to categorize standards. The Four-Layer Framework for Data Standards divides data standards into four layers of work that build upon each other. The more recent Taxonomy of Education Standards looks at categories of data within the education sector. Shortly before I changed jobs, one of my Gates Foundation colleagues asked if I could make a chart placing existing standards efforts against these two models. I decided to do it all at once by merging the models into a matrix. Here's the result:
To understand the chart better, I recommend reading the descriptions of the two models: the Four-Layer Framework for Data Standards and the Taxonomy of Education Standards.
There's a lot of crowding in the area of student data. The standards in this area don't compete as much as it would appear. While there's some overlap, most fill complementary roles. Details of all of these standards efforts and how they relate to each other are in the SETDA paper.
Here are links to the official websites and, in some cases, my writing related to each of the above standards.
Like most everything on this blog, these models and this chart are free to reuse under a CC-BY license. I hope they're helpful to your efforts.
29 May 2013
That it’s taken so long is an indicator of what a whirlwind this has been. After all, the consortium has been operating for a couple of years now, contracts have been awarded and the work is underway. It’s like boarding a moving train. My new co-workers, partners and vendors have been incredibly gracious as I’m learning this new job.
Smarter Balanced is one of the Common Core Assessment Consortia. It’s a partnership of 25 states and 1 territory, most of which have also adopted the Common Core State Standards. The concept is that with common standards for English Language Arts and Mathematics we can collaborate to develop better quality assessments of student skills than individual states could do working independently. Funding is through grants from the federal government and multiple foundations.
Those of you who know my passion for non-summative assessment may wonder whether I’ve gone over to the summative dark side. Summative assessments include the end-of-year assessments given to K-12 students. They may also be final exams or any other tests that come at the end of a course of study.
Non-summative assessments include formative assessments that occur at the beginning of a topic to help students and teachers understand how much they already know, daily exercises and assignments, and interim assessments. In short, they include any assessment that occurs before the unit or course of study is complete. My previous post on feedback loops and my VSS talk from last fall explain why I believe non-summative assessment has such potential to improve student learning.
As I looked into the Smarter Balanced opportunity I was delighted to find similar passion for non-summative assessment. To be sure, summative assessments are a huge part of our work. But even these exams will use Computer Adaptive Testing technology to accurately place students in a development sequence rather than just determine whether they’ve met some proficiency threshold. And we’re working hard to ensure these assessments are more authentic – that is, activities being measured are closer to the way skills are expressed in the real world.
Moving one step earlier in the learning cycle, Smarter Balanced will also offer voluntary interim assessments that can be used earlier in the year to find strengths and weaknesses in student skills and inform subsequent teaching. They will use the same kinds of questions and skills alignment as the summative exams. I’m exploring how we can offer these interim assessment items in a way that allows other organizations to integrate them into their adaptive learning systems.
And moving to the beginning of the learning process, we are developing a digital library of formative learning materials. These will be teacher-facing content that helps teachers plan and implement formative activities in their classrooms.
For more than two years at the Gates Foundation I studied the state of online assessment and how it fits into the Personalized Learning Model. I’m excited to apply those ideas at this scale. At Smarter Balanced we're addressing assessment needs in three ways: with year-end Summative Assessments, Interim Assessments to inform students and teachers as they learn, and with Formative Activities to introduce topics and get a sense of existing understanding. It goes without saying that I’ll write much about that here.
02 May 2013
InBloom is a service designed to help students achieve academic success through personalized learning. Those of us who helped develop the Shared Learning Collaborative (which was renamed inBloom in February) are convinced that personalizing the learning experience is the best way to improve student achievement. Whether personalization is being done by a teacher, an online learning system, or a synergistic combination of the two, it happens when information about what the student needs to learn intersects with information about available learning materials.
With that in mind, we set out to supply teachers and students with the data they need. That's what inBloom does. It taps into existing student data systems at schools, districts and states and makes that data available, in a secure way, to authorized teachers, students and parents. Simultaneously it indexes a library of teaching materials and makes them available to those same individuals.
A lot of work went into preserving student privacy. inBloom requires two things to happen before any student data can be retrieved. First, the application they are using must be authorized by the school district. Second, the individual using the application must be logged into inBloom and be authorized to access the requested data. This protection of student privacy is compliant with and goes beyond the requirements of FERPA and state data privacy laws.
So, who can access student data? Teachers can access data about students who are enrolled in their classes. Parents, if authorized by the school or district, can access their children's data. And students can access their own data. An application, such as a personalized learning system, can only access private student data if an authorized user is logged in to the app.
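The two requirements can be sketched as a simple check. This is purely illustrative and is not inBloom's actual API or data model; the class, function, and field names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    user_id: str
    logged_in: bool = False
    # Students whose data this user may see: a teacher's enrolled students,
    # a parent's children, or a student's own ID.
    visible_students: set = field(default_factory=set)


def can_read_student_data(app_id, district_approved_apps, user, student_id):
    # Requirement 1: the school district has authorized the application.
    if app_id not in district_approved_apps:
        return False
    # Requirement 2: a user is logged in and authorized for this student.
    if user is None or not user.logged_in:
        return False
    return student_id in user.visible_students


approved = {"math-tutor"}
teacher = User("ms-lee", logged_in=True, visible_students={"johnny", "maria"})
print(can_read_student_data("math-tutor", approved, teacher, "johnny"))
```

Both checks have to pass. An approved application with nobody logged in gets nothing, and a logged-in teacher using an unapproved application gets nothing either.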
To match student achievement data against available learning resources, we need a common taxonomy of what it is that students need to learn. It's not sufficient to know that Johnny got an "A" on assignment number 5 but a "C" on assignment number 7. We need to know what learning objectives were represented by each of these assignments. That's why inBloom makes use of the Common Core State Standards. In the data, we can show that assignment 7 was on multi-digit multiplication. And, since it appears that Johnny needs some more practice, we can search the library for multiplication practice that's suitable to his age and preferences.
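Here's a minimal sketch of that matching step. The records, field names, and grade threshold are hypothetical, and I've used plain-language objective labels rather than real Common Core identifiers. The source only says that assignment 7 was on multi-digit multiplication; the topic for assignment 5 is invented.

```python
GRADE_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}


def recommend(assignments, library, student_grade_level, threshold=3):
    """Find library items for objectives where the student scored below threshold."""
    weak = {
        a["objective"] for a in assignments
        if GRADE_POINTS[a["grade"]] < threshold
    }
    return [
        item["title"] for item in library
        if item["objective"] in weak and item["grade_level"] == student_grade_level
    ]


assignments = [
    {"number": 5, "objective": "fractions", "grade": "A"},
    {"number": 7, "objective": "multi-digit multiplication", "grade": "C"},
]
library = [
    {"title": "Multiplication practice set", "objective": "multi-digit multiplication", "grade_level": 4},
    {"title": "Multiplication for middle school", "objective": "multi-digit multiplication", "grade_level": 7},
    {"title": "Fraction games", "objective": "fractions", "grade_level": 4},
]
suggestions = recommend(assignments, library, student_grade_level=4)
print(suggestions)
```

Without the shared objective labels on both the assignments and the library items, there would be nothing to join on. That's the role the common standards play.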
In a nutshell, inBloom supplies the student and content data needed for effective personalized learning.
Statewide Longitudinal Data Systems
For whatever reason, some people have confused inBloom with Statewide Longitudinal Data Systems (SLDS). The SLDS effort was launched more than a decade ago by the Bush Administration and funded by the Educational Technical Assistance Act of 2002. While a separate statute, it's related to the No Child Left Behind Act of 2001. The official SLDS website describes it this way:
Better decisions require better information. This principle lies at the heart of the Statewide Longitudinal Data Systems (SLDS) Grant Program. Through grants and a growing range of services and resources, the program has helped propel the successful design, development, implementation, and expansion of K12 and P-20W (early learning through the workforce) longitudinal data systems. These systems are intended to enhance the ability of States to efficiently and accurately manage, analyze, and use education data, including individual student records. The SLDSs should help states, districts, schools, educators, and other stakeholders to make data-informed decisions to improve student learning and outcomes; as well as to facilitate research to increase student achievement and close achievement gaps.
Under grants from the SLDS program, 47 states are developing longitudinal data systems that aspire to collect student data from preschool through college and even into workforce placement. Analysis of the data should help researchers understand the impact of different factors and programs on student achievement.
Before being analyzed to find trends, the data is either anonymized or aggregated in order to preserve the privacy of the students. However, the databases themselves necessarily contain personally identifiable information (PII). That's because the data comes from multiple sources: K-12 schools, colleges and workforce databases. In order to connect all of the data about an individual together, you need to be able to match up records and that requires the personal identity information about each individual.
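To make the two steps concrete, here is a rough Python sketch: records from two sources are first linked on identifying fields, then the identity is dropped and only aggregates are kept. Everything in it is hypothetical, including the choice of name plus birth date as the matching key; real systems use more robust matching than this.

```python
from collections import defaultdict
from statistics import mean


def link_records(k12_records, college_records):
    """Join the two sources on identifying fields, then drop the identity."""
    college_by_key = {(r["name"], r["birth_date"]): r for r in college_records}
    linked = []
    for record in k12_records:
        match = college_by_key.get((record["name"], record["birth_date"]))
        if match is not None:
            # Only non-identifying fields survive the join.
            linked.append({"program": record["program"], "college_gpa": match["gpa"]})
    return linked


def aggregate(linked):
    """Report only an average per program, never an individual row."""
    by_program = defaultdict(list)
    for row in linked:
        by_program[row["program"]].append(row["college_gpa"])
    return {program: round(mean(gpas), 2) for program, gpas in by_program.items()}


# Made-up records for illustration only.
k12 = [
    {"name": "Student A", "birth_date": "1995-03-01", "program": "X"},
    {"name": "Student B", "birth_date": "1995-07-12", "program": "X"},
    {"name": "Student C", "birth_date": "1995-11-30", "program": "Y"},
]
college = [
    {"name": "Student A", "birth_date": "1995-03-01", "gpa": 3.0},
    {"name": "Student B", "birth_date": "1995-07-12", "gpa": 3.5},
    {"name": "Student C", "birth_date": "1995-11-30", "gpa": 2.8},
]

print(aggregate(link_records(k12, college)))
```

The join is the step that cannot be done without PII, which is why the identifying fields have to exist in the database even though they never appear in the reported results.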
This concentration of individual data spanning decades of educational experiences spooks a lot of people. Two factors help moderate those fears. First, according to federal regulation, data is not combined between states nor is it reported to the federal government. Only aggregate data (sums, averages and so forth) is reported to the federal government. Second, the Family Educational Rights and Privacy Act (FERPA) prohibits the release of any student information without permission from a parent. Of course, that doesn't reassure everyone. The mere fact that such databases exist concerns many.
I have a different concern. I've previously written about Theories of Change for educational improvement. In this case, the theory is that over time the collected data will help government officials, education officials, teachers and curriculum developers make better decisions based on what really works. But if we're trying to figure out how a particular curriculum choice in elementary school affects a student's college prospects, it may take 10 years or more to have the data to measure that effect. My concern is that this effort will take a long time to make a difference.
inBloom and SLDS both collect student data. Both leverage CEDS definitions for the data fields they collect. But the purposes of the data sets and the people who have access to the data are entirely different. Of the two, I'm more optimistic that inBloom will achieve the impact on student learning that our country needs.
23 April 2013
Several of the pundits have conflated five different projects as if they were all the Common Core. These are to some degree related but each has its own sponsors and they are being managed and adopted separately. They are:
- The Common Core State Standards themselves.
- State or district curriculum.
- State Longitudinal Data Systems (SLDS).
- Race to the Top Assessment Consortia.
The concept of state core standards gained prominence during the Bush Administration as part of the No Child Left Behind act. In a recent blog post I wrote about how they are part of the Standards and Accountability theory of education reform and how later and more promising theories also rely on quality standards.
The result of NCLB and related efforts is that each of the 50 states developed its own core standards. This has the vague advantage of more local influence but it has two significant disadvantages. First, there are differences between what students learn in different states. So colleges and universities don't have a consistent standard of preparation to expect from students. Second, developers of tests and curriculum spread their resources 50 different ways. The result is lower quality teaching materials and examinations.
Here are some ways that distinction applies: The Common Core describes the difficulty of text to be read at each grade; curriculum gives a list of actual books and stories. The Common Core describes the kinds of problems a student should be able to solve; curriculum specifies the order concepts will be taught and includes exercises to be performed. The rivalry between Phonics and Whole Language is not resolved by the Common Core; that decision remains in the hands of district curriculum committees.
Critics of the core have missed an opportunity here. Since curriculum involves textbooks, lesson plans and teaching materials, it consists of thousands of pages, tens of hours of video and other media. It's also copyrighted. All of this makes reviewing a curriculum a daunting task – albeit an important one. Meanwhile, the standards are relatively short and accessible. They are released under an open license and you can read them online at http://corestandards.org. They total somewhere around 200 pages long including appendices so you can review them in an afternoon.
So, what are the objections? A common one is that this is a federal program to control what our students learn. First off, this is a state-led initiative, not a federal one. Secondly, the standards will only control teaching if they are considered to be limits to what is taught. But they are really a floor, not a ceiling, and most of the details remain left to the curriculum.
Other objections come from academics arguing for or against certain pedagogical theories that the rest of us aren't familiar with. For example, advocates for both Phonics and Whole Language have complained that the Common Core is a capitulation to the other side. But the Common Core Standards aren't as opaque as all of that. As I wrote a couple of months ago, the English standards focus on a few basic skills applied to increasingly complex texts. The math standards cover the familiar topics of arithmetic, algebra, geometry and so forth.
The biggest issue is that change is difficult and frequently unpopular. The changes demanded by the Common Core aren't easy ones. They require changes to curriculum; they require new lesson plans; and they require teachers to approach subjects in new ways. Many people are excited by the possibilities but it's not surprising that some would prefer to preserve the status quo.
Unfortunately, the status quo isn't good enough.
The Common Core State Standards offer two important advantages over previous state core standards. First is simply that they are common. We hope that by concentrating their efforts on one standard instead of 45, developers of curriculum and examinations can do a better job than before. The second advantage is that the Common Core is a second-generation standard built on a foundation of the best state standards and informed by the experience of those who built the first generation.
Are they perfect? Not likely. But these new standards are better than previous ones and they will become a valuable tool in our personalized learning arsenal.
Update 26 July 2013:
- The Fordham Institute has launched a website representing conservative support for the Common Core: http://highercorestandards.org/
- While older, this is another good editorial on the subject: http://www.hoover.org/publications/defining-ideas/article/147681
29 March 2013
Academic Standards include achievement standards like the Common Core State Standards (CCSS) plus curriculum and testing standards. Contemporary practice in the U.S. is to describe academic standards in the form of learning objectives – descriptions of skills that students can acquire or demonstrate. Historically it was more common to describe standards in syllabus form – as a list of subjects to be studied.
Encouraged by the No Child Left Behind Act, the 50 states have each defined core curriculum standards. More recently, the CCSS standards for Mathematics and ELA-Literacy have been adopted by 45 states. Using a similar process, the Next Generation Science Standards have been proposed for multi-state adoption. In higher education there is no such consistency. Some institutions have developed their own sets of standards but most leave the objectives up to the professor. A few industry organizations publish standard sets. These include the AAAS Benchmarks for Science Literacy and the National Center for History in the Schools standards for History.
Data Standards define the data elements and structures used to store and exchange educational information. In the Four-Layer Framework data standards may include layers 1-3 (Data Dictionary, Data Model and Serialization).
For education, the three major domains of data standards are Student Data, Educator Data and Content Data. Important metrics like graduation rate, student financial aid repayment or college-going rate are derived from data sets but aren’t data in and of themselves.
Student Data includes traditional demographic information as well as a student record which includes academic achievements, assessment results, learning activities, attendance and so forth. Educator Data includes information about teachers and staff. It includes qualifying information like academic credentials, a portfolio of creative works and publications and data about teaching performance. Content Data, often called metadata, is information about learning materials including textbooks, assessments, multimedia and digital resources. Content data often indicates the alignment between learning resources and academic standards like the CCSS.
Technical Standards define how systems interoperate. Accordingly, they usually include the protocol layer of the Four-Layer Framework. A wide variety of standards may fit into this category but the majority of education-related technical standards involve Content Packaging Formats, Interoperability Protocols and Data Exchange Protocols.
Content Packaging Formats support the transport of learning content (e.g. text, video, graphics, etc.) and assessments between systems. Examples include IMS Common Cartridge and SCORM.
Interoperability Protocols support interoperability among learning systems. The most common use case is integration of learning tools (like simulations, games or assessments) into learning environments (like a learning management system). Key functions are to identify the user to the learning tool, ensure that they are authorized to access the content, transfer control to the tool, and collect data back. Common examples include OpenID, SAML, OAuth and IMS QTI.
Data Exchange Protocols represent layer 4 in the Four-Layer Framework for Data Standards. Thus, data exchange protocols are usually paired with a corresponding data standard. Frameworks for setting up data exchange protocols include ESB, SOAP and REST.
20 March 2013
have been well-proven. Our model was a way to describe how technological supports could be designed to facilitate personalized learning. As we've been working on this for a couple of years now, it's time for a progress report.
In 2010 a consortium of states, coordinated by the Council of Chief State School Officers (CCSSO) and the National Governors Association (NGA), introduced the Common Core State Standards for English/Literacy and Mathematics. They were rapidly adopted by 45 U.S. states. Having common standards across states is, of course, convenient but these standards also seek to be an improvement on the previous generation.
The Common Core State Standards were written by building on the best and highest state standards in existence in the U.S., examining the expectations of other high performing countries around the world, and careful study of the research and literature available on what students need to know and be able to do to be successful in college and careers. No state in the country was asked to lower their expectations for their students in adopting the Common Core. The standards are evidence-based, aligned with college and work expectations, include rigorous content and skills, and are informed by other top performing countries. They were developed in consultation with teachers and parents from across the country so they are also realistic and practical for the classroom. (From the CCSS FAQ.)
In August of 2012, the CCSSO and NGA released official identifiers and an XML representation of the Common Core, thereby facilitating alignment of digital learning resources to the core standards. Driven by the need to measure and prove coverage of the standards, finer-grained identifiers are being assigned to individual learning objectives within the Common Core standards.
The Next Generation Science Standards are also under development with an expected release before the end of March. Following their release, state education boards will consider adoption.
Postsecondary education is taking a different approach. There's little formal agreement between colleges and universities on the learning objectives that compose common courses. However, college and university departments are defining the objectives for core curriculum and there is growth in the sharing of these objectives within university systems. Colleges are also considering use of the Common Core for developmental education courses.
Common Education Data Standards (CEDS) is a project to create a common data dictionary and logical data model for education data. Applications that align to CEDS use the same definitions for data fields making data exchange easier and increasing fidelity.
The inBloom Data Store uses CEDS for its data model and ingests data in SIF and Ed-Fi data formats. It offers an API through which personalized learning applications can store and retrieve common student data. Security features preserve the privacy of data and ensure that only authorized people can access it.
Newer data stores align student activity and assessment data to standard learning objectives. The goal is to derive a model of what the student knows, what the student is learning and what the student has yet to learn. This enables rich reporting on student competency levels on an objective-by-objective basis and the stimulation of targeted interventions.
I prefer to talk about educational content as learning activities. There are the traditional passive media such as reading, lectures, video and so forth. More engaging are interactive activities like virtual labs, simulations, virtual worlds and games. For both active and passive content, education doesn't need special formats. The web content formats managed by the W3C are adequate and well-supported. What is needed is a way to represent the alignment between the content or activities and the standard learning objectives.
The Learning Resource Metadata Initiative (LRMI) is a standard way to describe educational materials including their alignment to standards. It's based on the Schema.org metadata standard adopted by Google, Yahoo!, Bing and Yandex.
LRMI metadata can be shared between systems using the Learning Registry. The inBloom index consumes LRMI data from the Learning Registry and offers a search service that can find educational content suited to specific student needs.
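As an illustration, here is roughly what an LRMI-style description of one resource might look like, built as a Python dictionary and serialized to JSON. The property names (`educationalAlignment`, `AlignmentObject`, `alignmentType`, `educationalFramework`, `targetName`, `typicalAgeRange`) come from the Schema.org vocabulary; the resource itself, its values and the `alignment_targets` helper are made up for the example, and this is not the exact record format used by the Learning Registry or the inBloom index.

```python
import json

# Hypothetical resource description using Schema.org / LRMI property names.
resource = {
    "@context": "http://schema.org",
    "@type": "CreativeWork",
    "name": "Multiplication practice game",
    "typicalAgeRange": "9-10",
    "educationalAlignment": {
        "@type": "AlignmentObject",
        "alignmentType": "teaches",
        "educationalFramework": "Common Core State Standards",
        "targetName": "CCSS.Math.Content.4.NBT.B.5",
    },
}


def alignment_targets(record):
    """Return the standard identifiers a resource claims alignment to."""
    alignments = record.get("educationalAlignment", [])
    if isinstance(alignments, dict):   # a single alignment may appear without a list
        alignments = [alignments]
    return [a["targetName"] for a in alignments if "targetName" in a]


print(json.dumps(resource, indent=2))
print(alignment_targets(resource))
```

A search service only needs the alignment targets and a few descriptive fields to answer a query like "multiplication practice for a nine-year-old", which is why the alignment is the important part of the record.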
IMS Global defines standards for packaging learning content for import into learning management systems. However, I prefer the approach IMS uses for Learning Tools Interoperability. Instead of packaging content, this protocol allows content from other sites on the web to be seamlessly integrated into the learning experience. Integration in this way avoids limitations imposed by the packaging format and lets the developers of learning activities collect data about the use and effectiveness of their products.
In my opinion, assessments are presently the weakest part of the Personalized Learning Model but that's changing rapidly. Two multistate assessment consortia, Smarter Balanced and PARCC, are developing new assessments aligned to the Common Core State Standards. Both are committed to supplying formative and interim assessments in addition to year-end summative exams. CoreSpring is pooling assessments from more than six different sources to supply a bank of good quality assessments that can be used in class, for quizzes and in interactive learning environments. MOOC developers such as Coursera, edX and Udacity are having to invent new ways to offer interactive assessments at extremely large scale.
In the long run, I expect the line between learning activities and assessment activities to blur. After all, much of learning occurs when the student demonstrates understanding. With adequately instrumented activities, the accumulated data about student competencies should reduce the need for big summative exams at the end of the year.
We've come a long way in the last couple of years. Pioneers in this space like DreamBox, Knewton, Read180 and GrockIt had to build a whole infrastructure. But now there's a solid set of building blocks on which developers can build personalized learning applications. I anticipate a lot more innovation at the place where student data and content come together.
06 March 2013
I answered truthfully, "They're not evaluating you, they're evaluating your school."
I found out later that with that information, he and his friends challenged each other to get the lowest scores possible. I sometimes use this story to illustrate broken feedback loops. It was nine months before the scores had any impact. When he returned to school the next fall he found he had been enrolled in remedial math despite acing Pre-Calculus the previous year. He had to meet with the counselor to get into the right class.
Today, however, I want to explore the theories of education reform that drove the deployment of these exams. There are three prominent theories of reform with a few variations. Most contemporary efforts to improve education are based on at least one of these.
Theory: Standards and School Accountability
This is the primary theory represented by No Child Left Behind (NCLB). It's based on the broader theory that measuring something and reporting on those measurements will bring about improvement – especially if improvement is incentivized. It also represents the truism that if you don't measure something, you can't tell whether you've changed it for the better.
In order to bring about accountability, NCLB requires states to define learning objectives for each year or grade. These objectives are commonly referred to as the state core standards and each U.S. state has its own set. Furthermore, any public school receiving federal funding must administer a state-wide standardized test to every student in grades 3-8 and at least once in grades 10-12. Student scores are compared with previous years' results to determine whether they have achieved Adequate Yearly Progress (AYP). Certain consequences are tied to individual schools' success or failure to achieve progress for all students.
The core of the theory is this: If we set higher standards, measure against those standards and report performance then learning will improve. Unfortunately, 11 years into this experiment the quality of U.S. student learning is nearly flat.
There are numerous criticisms of standards and testing; but my personal concern is that by themselves they are a blunt instrument. In the absence of a proven formula for improvement the result is a form of natural selection – schools that underperform are taken out (actually they "receive interventions") while better performers survive. Natural selection is proven to work but it takes many generations and a lot of the population are brutalized before measurable improvement occurs.
Despite the lack of success, it's not time to abandon standards or accountability. Prior to 2002 most states didn't have well-defined core standards nor was student performance consistently measured. Now all states have standards, we are measuring regularly and 45 of the states have recently agreed to the Common Core State Standards. While standards and testing are inadequate remedies by themselves, they are important assets on which to build.
Theory: Highly Qualified Teacher
Where the Standards and Accountability theory focuses on school improvement, this theory focuses on teacher improvement. It's certainly intuitive; most of us have had one or more great teachers and we know they make a big difference. It's also justified by the data. Studies confirm that teacher quality is an important factor in student achievement and that the variation in achievement between classes within the same school is greater than variation between schools.
NCLB includes a mandate for states to supply highly qualified teachers to every student but it leaves it up to states to determine what it means to be highly qualified. And that turns out to be a problem. Studies show that certain teachers are consistently more effective than others; value added measures can identify which ones they are (albeit with a moderate error rate); but individual teachers often don't know what they need to do to improve.
In raw form this becomes another application of natural selection. If we reward teachers who perform well and eliminate those who don't then eventually performance will improve – assuming we don't run out of teachers beforehand. But many generations will be required and a lot of brutal actions will be taken in the meantime. No wonder there's so much controversy around teacher evaluations being tied to wages and promotions.
I'm actually in favor of merit pay for teachers so long as good quality performance measures are used. But those evaluations need to be deployed concurrently with professional development that informs teachers on how they are doing and what they can do to improve. Conveniently, resources are emerging to support that. For example, the Measures of Effective Teaching project used the Danielson Framework for Teaching to identify teacher behaviors that are well-correlated with student performance. These and similar frameworks can be used to inform teachers on how they can do better.
Even so, effective teachers alone are not enough. In our current educational system, teachers account for approximately 8.5% of variation in student achievement. School-, teacher-, and class-level factors combined account for about 21%. Meanwhile, background characteristics such as race, parental achievement and family income combine to account for 60% of variation in achievement levels.
So, if every teacher in the country were equivalent to our very best, it still wouldn't be enough to overcome the cycle of intergenerational poverty. To achieve that dream, we have to increase the influence school has over student achievement. That can be done by adapting the learning experience to the needs of individual students.
Theory: Personalized Learning
There's a pattern to these theories: The Standards and School Accountability theory introduces the concept of measurement and uses it to assess whole schools. The Highly Qualified Teacher theory takes those same measures and applies them at the teacher level. For this third theory, feedback is applied at the student level.
Personalized learning leverages the same standards as the other theories. It can also incorporate the same measures. However, annual testing alone is insufficient for personalization. Instead, understanding is measured weekly, daily or, in the best adaptive learning systems, continuously. Measurement must happen soon enough and feedback given quickly enough to affect learning activities. A truly personalized system selects activities according to student needs and also adapts to student behavior within an activity.
Bloom's Two Sigma experiments and the follow-up work they inspired make me optimistic about Personalized Learning. These and other studies have shown that personalized learning experiences enabled by immediate feedback consistently deliver one to two standard deviations of improvement in learning. We believe that is sufficient to overcome background factors, thereby enabling a majority of students to become high achievers.
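To get a feel for what one or two standard deviations means, the short Python calculation below converts a gain in standard deviations into a change in percentile rank. It assumes that scores follow a normal distribution, which is a simplifying assumption of mine for illustration; the function name is also just for this example.

```python
from statistics import NormalDist


def percentile_after_gain(start_percentile, gain_in_sd):
    """Percentile rank after a gain measured in standard deviations.

    Assumes normally distributed scores (an assumption for illustration).
    """
    standard_normal = NormalDist()                        # mean 0, standard deviation 1
    start_z = standard_normal.inv_cdf(start_percentile / 100)
    return 100 * standard_normal.cdf(start_z + gain_in_sd)


for gain in (1, 2):
    print(gain, round(percentile_after_gain(50, gain), 1))
```

Under that assumption, a student who starts at the median would move to roughly the 84th percentile with a one-sigma gain and to roughly the 98th percentile with a two-sigma gain.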
Personalized learning is the natural result of 1:1 tutoring which is why tutoring is so effective. To do personalized learning at classroom scale generally requires 1:1 computers and a role change for the teacher as she shifts from "deliverer of knowledge" to "facilitator of learning." As with the other theories, there's a lot of skepticism and resistance to change. But pilot deployments are showing great promise.
Variation: School Choice
School Choice attempts to bring competitive pressure for schools to perform better. In this way, it's a variation on the Standards and School Accountability theory. Like NCLB, School Choice needs standards to be set and school performance must be measured against those standards. Performance is reported to parents who are expected to make an informed choice of which school their students should attend.
Since allocation of school funds is tied to enrollment, the theory is that schools seeking students will compete, not only on standards and their measures, but also on the basis of any other factor that's important to parents and students.
School Choice efforts include charter schools, magnet schools and voucher programs. The idea is to give public and private schools more freedom to experiment, thereby accelerating the identification of viable formulas for improved learning. Studies have shown this to be the case as the average of charter school outcomes is similar to that of public schools while variation among charter schools is much greater. Therefore, some charter schools are substantially better and should be emulated while others are substantially worse and should be shut down or reorganized. It's exactly this kind of variety and freedom that school choice advocates seek.
School choice can incorporate Highly Qualified Teachers and Personalized Learning. Indeed, since both of these theories have been shown to be effective, the expectation is that schools that incorporate these principles will be the best rated and will attract more students.
Variation: Small Classes
The small classes movement is based on studies showing that students learn better in smaller classes – all other factors being equal. But other factors are not equal. Lowering the student:teacher ratio costs a lot of money and other factors such as teacher skill have a greater impact than class size. For example, when California mandated smaller classes they had to hire many more teachers. For at-risk populations, the impact of less-experienced teachers overcame the benefits of smaller classes resulting in lower performance instead of the expected improvement.
Variation: No Excuses
The No Excuses model centers on maintaining high expectations for student performance without making excuses for external issues such as background, troubles at home and so forth. It's associated with charter management organizations such as KIPP and BES. Proponents emphasize pillars such as college expectations, culture of respect, voluntary participation and high discipline. They also have extended hours and extended school years. A key value is the whole school's commitment to each student's success. If a student is struggling or falling behind, they discover that early and engage counseling, tutoring and other supports to ensure the student succeeds.
No Excuses engages all three theories: overall school performance is measured, they hire and train highly effective teachers, and they adapt the learning environment to the needs of individual students, although most No Excuses schools do this adaptation with limited use of technology. Over the last decade, No Excuses schools have demonstrated that background factors can, indeed, be overcome by a supportive school structure. On the other hand, their high reliance on supportive interventions sometimes leaves students underprepared for the independent learning discipline required in college. Recognizing this, No Excuses organizations are updating their practices to better train students to become independent learners.
Education Reform will remain an important part of our civic dialog for a long time. Unsurprisingly, it means different things to different people. For some it's a moral crusade. To those being asked or forced to reform it's more threatening. All too often arguments about reform neglect the research (which is abundant) and fail to fully express the theories on which they are based. That shouldn't be the case as there are decades' worth of data and studies behind each of these theories – sufficient for advocates and policy makers to make informed decisions.
The data tells those of us seeking to eliminate poverty that incremental improvement to existing schools is insufficient. Personalized learning with an eye toward training independent learners seems to be the most promising approach. Deploying this at scale requires whole-school changes to the way programs are funded, to the choices of curriculum and technology, and to the roles of educators.