26 January 2015

K-12 Education Funding... and the Strings Attached

In the 2013-2014 fiscal year, California spent $70 billion on K-12 education. To put that in perspective, Bill Gates' net worth is $80.4 billion. So, in a single year, California spends nearly all of Bill Gates' wealth on teaching children. This is a good thing, of course, but it's also an impressive number.

Nationwide, the country spent $632 billion on public elementary and secondary schools in the 2010-2011 school year (the latest year for which I could find data). That's nearly 4% of U.S. GDP and 10% of total U.S. government spending (including federal, state and local).

Here's where the 2013-2014 California money came from, in billions of dollars. Other states have similar proportions between federal and state/local funds:

Source           Amount ($ billions)   Share
Local Funds      $21.780               31%
State Funds      $40.864               58%
Federal Funds    $7.382                11%
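
As a quick check on the arithmetic, here is a minimal Python sketch that recomputes each source's share from the dollar figures above. The variable names are my own; the only data is the three amounts from the table.

```python
# Funding figures from the table above, in billions of dollars.
funds = {
    "Local Funds": 21.780,
    "State Funds": 40.864,
    "Federal Funds": 7.382,
}

total = sum(funds.values())

# Share of the total for each source, rounded to a whole percent.
shares = {source: round(100 * amount / total) for source, amount in funds.items()}

for source, amount in funds.items():
    print(f"{source}: ${amount:.3f} billion ({shares[source]}%)")
print(f"Total: ${total:.3f} billion")
```

The rounded shares match the percentages in the table, and the total comes to roughly the $70 billion quoted at the top of the post.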

For this post I'm going to concentrate on the strings attached to the Federal funds.

The Elementary and Secondary Education Act (ESEA 1965)

Federal funding of education, at least at contemporary rates, centers on the Elementary and Secondary Education Act (ESEA). Passed in 1965 as part of Lyndon Johnson's "War on Poverty," the ESEA was intended to address inequities in education. It had long been observed that students from lower-income, urban schools have significantly lower educational achievement than their middle-income, suburban contemporaries. ESEA provided supplementary funding to the lowest-achieving schools, with provisions intended to ensure that existing funding was preserved rather than replaced.

The ESEA was set up to require periodic reauthorization by Congress – typically every five years. However, due to congressional gridlock on educational ideas, the reauthorizations have often been single-year continuing resolutions that continue funding for another year without changing the provisions of the law. Major updates occurred in 1981 under the Reagan administration and in 1994 under the Clinton administration. But the biggest update was No Child Left Behind, proposed in 2001 and signed by President Bush in January of 2002.

No Child Left Behind (NCLB 2002)

The No Child Left Behind Act (NCLB) is the name given to the 2001/2002 reauthorization of ESEA. It establishes the accountability and reform framework in which state education systems presently operate. In theory, states have the ability to opt out at the expense of federal funding. In practice, no state is willing to give up approximately 11% of their educational budget.

The principal focus of NCLB is on the Standards and Accountability theory of education reform. Here are the main requirements:
  • States must establish state standards (sometimes known as core standards) for achievement in English Language Arts (ELA), Mathematics, and Science. Most states also include standards for Social Studies and other subjects.
  • States must test all students in grades 3 through 8 and again in either grade 11 or 12 to measure progress in ELA and Math. 
  • At a minimum, states must test students in science three times: once in grades 3-5, once in grades 6-9, and once in grades 10-12.
  • The testing results for each school should show Adequate Yearly Progress (AYP) toward having all students meeting or exceeding state standards by the 2013-2014 school year.

Adequate Yearly Progress (AYP)

Among the most challenging parts of NCLB has been the Adequate Yearly Progress requirement for schools. Schools receiving Title I assistance (those with a large number of low-income students) receive increasingly stringent interventions each consecutive year they fail to achieve AYP:
  • Year 1: No intervention.
  • Year 2: Develop an improvement plan, give students the option to transfer to other schools (including paying for the transportation to get there), and follow prescribed uses of Title I funds.
  • Year 3: Must continue year 2 interventions and also provide tutoring and/or after-school programs from a state-appointed provider.
  • Year 4: Must continue year 2 and 3 interventions plus one or more of the following: Replace responsible staff; Implement a new curriculum; Decrease a school's management authority; Appoint an external expert to advise the school; or Restructure the internal organization of the school.
  • Year 5: Shut down or completely restructure the school.

When NCLB was passed, there was an optimistic outlook. Within 12 years, nearly all schools would be meeting state standards for performance, with a small number of underperforming schools receiving intervention. It turns out that, as a country, we haven't worked out a formula for consistent school improvement. If the process for meeting AYP standards were well known, the goals might have been met.
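
The escalation schedule in the list above is cumulative, which is easy to miss on a first read. Here is a rough Python sketch of it as a lookup. The wording of each entry is my paraphrase, and two details are assumptions on my part: that year 5 replaces the earlier interventions, and that anything past year 5 is treated as year 5.

```python
# Interventions that begin in each consecutive year of missed AYP,
# paraphrased from the list above.
NEW_INTERVENTIONS = {
    1: [],
    2: [
        "develop an improvement plan",
        "offer transfers with paid transportation",
        "follow prescribed uses of Title I funds",
    ],
    3: ["provide tutoring and/or after-school programs"],
    4: ["take one or more corrective actions"],
    5: ["shut down or completely restructure the school"],
}


def interventions(years_missed):
    """Return the interventions in force after this many consecutive misses."""
    if years_missed < 1:
        return []
    year = min(years_missed, 5)  # assumption: later years are treated like year 5
    if year == 5:
        # Assumption: closing or restructuring supersedes everything earlier.
        return list(NEW_INTERVENTIONS[5])
    in_force = []
    for y in range(1, year + 1):
        in_force.extend(NEW_INTERVENTIONS[y])  # years 2 and 3 carry forward
    return in_force


for year in range(1, 6):
    print(year, interventions(year))
```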

One concern has been that certain states set unreasonably low standards. Prior to adopting the Common Core State Standards, Tennessee had the lowest standards for reading while Massachusetts had the highest.

Despite low and inconsistent standards, so many schools are failing to meet AYP goals that there aren't enough resources to deliver the prescribed remedies. In 2011, 48% of public schools failed to meet AYP goals. In 21 states, more than half of schools didn't meet AYP goals and in 41 states and Washington D.C. more than one fourth of schools didn't make AYP. There aren't enough tutoring organizations, replacement staff, or trained principals to supply the year 4 and 5 remedies for this many schools, not to mention sufficient funds to pay for these interventions.


With so many schools failing to meet AYP goals and the remedies being impractical to implement, Congress is long overdue for an ESEA reauthorization that adapts to current circumstances. Unfortunately, no proposed update has made any significant progress. Congress has left us with continuing resolutions that preserve the law as it stands.

To relieve pressure, the Department of Education, under Secretary Arne Duncan, has begun granting NCLB waivers to states that produce an acceptable alternative plan. Not surprisingly, the granting of waivers is controversial. The authority of the executive branch to waive requirements like these seems to have legal precedent. However, it's not clear that alternative requirements can be applied without congressional action.

Nevertheless, every state except Nebraska has applied for a waiver, many have been granted, and even Nebraska has announced plans to apply for a waiver in 2015.

The Way Forward

There's growing hope that Congress may finally address ESEA reauthorization in 2015. There are even hints that the reauthorization may include support for competency education. Many organizations, from civil rights groups to advocates of federalist solutions, are offering wish lists for reauthorization. As in the past, divisions on education don't follow traditional political lines.

Here is my personal wish list for an ESEA reauthorization:
  • Preserve and strengthen state standards, encourage but don't require alignment of standards between states.
  • Preserve regular assessment of student achievement with an increasing emphasis on Depth of Knowledge.
  • Accelerate the shift from seat-time measures to direct measures of competency for the granting of secondary school credit.
  • Encourage the transition from periodic testing events to continuous assessment of student skills (curriculum-embedded assessment) with frequent and rapid feedback to students, teachers and parents.
  • Clarify the difference between standards and curriculum and establish a framework for public review of both standards and curriculum. Require schools to report the origin of curricular materials on public websites and on every worksheet or assignment.
  • Sustain the concept of interventions for schools not achieving AYP goals while shifting to more practical and supportive remedies than those in NCLB.

20 November 2014

Education Data Standards Update

Over the last couple of years, some colleagues and I have developed several models that are useful for understanding education data standards, where they apply and how they fit together. Many thanks go to the host of collaborators who have reviewed and helped with these models.

The first is the Four-Layer Framework for Data Standards. This framework has helped guide decisions about the Common Education Data Standards – what should be the scope and how CEDS should relate to other standards in the space. However, the framework is not limited to education standards. Any organization that's developing specifications for the exchange of data should think of these four layers and try to describe each part semi-independently.

Last year I developed A Taxonomy of Education Standards. This framework categorizes standards according to their purpose or the domain in which they are applied.

These education standards are not exclusively data standards. Academic Standards, which include Achievement Standards and Competency Standards, describe skills that students should be able to demonstrate as they achieve certain levels of education. Nevertheless, there are data standards for describing Academic Standards and for aligning content to those standards.

In May of 2013 my friends at SETDA published Transforming Data to Information In Service of Learning. This is an enormously valuable survey of existing data standards with guidance on how organizations can apply them to improve learning and support interoperability of their learning technologies. In doing so, they used both the four-layer model and the taxonomy.

Shortly thereafter, I combined the models into a two-dimensional matrix with the four layers on the horizontal axis and the taxonomy on the vertical axis. This allows us to plot existing and proposed standards against the two dimensions to see how they fit together.

At the iNACOL symposium two weeks ago Liz Glowa, Jim Goodell and I presented a workshop on "Competency Education Informed by Data". For that workshop I updated the matrix to reflect changes in the standards landscape over the last year. Here's the updated version:

For that same workshop, Jim Goodell developed a matrix plotting the layers on the vertical axis and the progression from Pre-K to primary, secondary, higher education, and workforce data on the horizontal.

And to tie these all together, here's a translation of the acronyms into the standards with links to their corresponding websites.

AIF        Assessment Interoperability Framework
CCSS       Common Core State Standards (Blog Post)
CEDS       Common Education Data Standards
Ed-Fi      Ed-Fi Alliance
EDI        Electronic Data Interchange
ESB        Enterprise Service Bus
IMS CC     IMS Common Cartridge
IMS LTI    IMS Learning Tools Interoperability
IMS QTI    IMS Question and Test Interoperability
LR         Learning Registry (Blog Post)
LRMI       Learning Resource Metadata Initiative (Blog Post)
NGSS       Next Generation Science Standards
OAI-PMH    Open Archives Initiative - Protocol for Metadata Harvesting
OBI        Open Badge Infrastructure
PESC       P20W Educational Standards Council
REST       Representational State Transfer
SEED       State Exchange of Education Data
SIF        SIF Association
xAPI       Experience API (AKA Tin-Can API)

Updated: 25 Nov 2014 to add the OAI-PMH protocol.

30 July 2014

Bitcoin - What Makes a Currency?

Today I'm diverging from the education theme to write about cryptocurrency. I am provoked, in part, by this quote from Alan Greenspan:

“It [Bitcoin] has to have intrinsic value. You have to really stretch your imagination to infer what the intrinsic value of Bitcoin is. I haven’t been able to do it. Maybe somebody else can.”

Now, Greenspan should know better than to say something like that. As a fiat currency, the dollar doesn't have any more intrinsic value than Bitcoin. And that's why I decided to write about this. Most of the supposed "Bitcoin Primers" out there are more confusing than helpful. They don't explain how money works or how cryptocurrencies like Bitcoin satisfy the requirements to become a currency.

What makes a Currency?

Currency is a form of money that is accepted by a group of people to exchange value. A functional currency must have three important characteristics:
  • Scarcity - If there is too much of the currency, its value will plummet toward zero. So, there must be a limited supply.
  • Verifiability - You must be able to verify that a unit or token of the currency is valid and not a forgery or imitation.
  • Availability - Despite scarcity, there still must be a stable supply of the currency to match growth in the corresponding economy.
Precious metals like gold and silver were the first common currencies. They meet all of the foregoing criteria. Gold is scarce; there's a limited amount of it available thereby endowing a small amount of gold with considerable value. It's verifiable; gold has certain characteristics, such as density, malleability and color, that make it easy to distinguish from other materials. And gold is available; while it is not common, gold mines still offer a consistent supply of the material.

One of the difficulties with early uses of gold currency was the complexity of exchange. Merchants had to use a balance or scale to determine how much gold was being offered. To facilitate easier exchange, governments, banks, and other trusted organizations would mint coins of consistent size and weight. This would allow someone to verify the value of a coin without resorting to a balance.

Fiat Currency

"Fiat" means, roughly, "because I said so." Fiat currency has value simply because some trusted entity says it does. It need not have any intrinsic value.

The first fiat money was the banknote. When making a large payment it could be inconvenient or dangerous to move large quantities of coins or bullion. Banks solved this problem for their customers by issuing banknotes. A banknote is a paper that a bank or other entity promises to exchange for a certain amount of coin, gold, or other currency. The bank could keep the corresponding gold locked away in a vault and people could carry more convenient paper certificates.

In 1863, the United States began issuing gold certificates as a form of paper money or banknote. Certificates like these were backed by stockpiles of gold held in places like Fort Knox. European countries did similar things. With the stresses of late 19th-century wars and the First World War that followed, countries discovered that they could issue more banknotes than their corresponding stockpiles. This led to a lot of instability until countries figured out how to regulate their currencies. But, by the end of the Great Depression, pretty much every economically developed country had fiat currencies controlled by a central bank. While backed by gold or other reserves, the value of these currencies is not directly tied to the value of gold.

Here's how the U.S. Federal Reserve system works: The Federal Reserve Bank creates the money. Money is issued as currency (the familiar U.S. coins and bills) but also simply as bank balances. Indeed, far more money exists as bank records than in actual physical currency. Originally this was done through careful bookkeeping in bank ledgers. Now it's all done on computers. The money is issued in the form of low-interest loans, primarily to banks, which then lend the money to their customers and to other, smaller banks. Other central banking systems like the European Central Bank work in a similar way.

So, how does fiat money meet our requirements for currency?

Scarcity: Only one entity, the central bank, has the authority to create and issue the currency. The central bank limits the issue of money in order to preserve its value.

Verifiability: Coins and paper money are printed or minted using materials and techniques that are difficult for average people to reproduce but fairly easy to verify. Money in the form of bank balances is verifiable because each bank or credit union has accounts with higher-level banks, ultimately reaching the Federal Reserve. So, when I write a check from my bank to yours, our two banks contact each other and transfer the value, sending records up the banking chain until they reach a common parent bank, which may be the Fed. Each bank in the chain verifies that the appropriate balances are in place before allowing the transaction to proceed.

Availability: Central banks can create as much money as they think the economy needs. The primary challenge for central banks is to manage the money supply - ensuring both scarcity and availability.


Bitcoin is the first, but by no means the only, cryptocurrency. The challenge that the pseudonymous creators of Bitcoin tackled was to achieve the three features of currency - scarcity, verifiability, and availability - in the digital realm. They magnified the challenge by prohibiting a central authority like a government or a central bank. Trust, in the case of Bitcoin, is in the system, not in any particular institution.

Scarcity: The "coin" part of most cryptocurrency names is somewhat misleading. Bitcoin doesn't consist of a bunch of digital tokens that are exchanged. If that were the case it would be hard to prevent double-spending of the same token. Instead, cryptocurrencies work more like bank account balances. Bitcoin has one big public ledger that is duplicated thousands of times. All transactions in the ledger must balance - for one account to receive value, another account must be reduced by the same amount. This ledger is called the block chain and it contains a record of every transaction since the creation of the currency.
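
To make the ledger idea concrete, here is a toy Python sketch of a ledger in which every transfer must balance and an account cannot spend more than it holds. It follows the account-balance picture described above; the real Bitcoin data structures differ in detail, and the class, method, and account names are all invented for illustration.

```python
class Ledger:
    """A toy public ledger: balances plus an append-only transaction history."""

    def __init__(self, opening_balances):
        self.balances = dict(opening_balances)
        self.transactions = []  # every transfer ever made, oldest first

    def transfer(self, sender, receiver, amount):
        """Move value between accounts, rejecting overdrafts (double-spends)."""
        if amount <= 0:
            raise ValueError("amount must be positive")
        if self.balances.get(sender, 0) < amount:
            raise ValueError("insufficient balance")
        # The debit and the credit are equal, so the ledger always balances.
        self.balances[sender] -= amount
        self.balances[receiver] = self.balances.get(receiver, 0) + amount
        self.transactions.append((sender, receiver, amount))

    def total(self):
        """Total value in the ledger; transfers never change it."""
        return sum(self.balances.values())


ledger = Ledger({"alice": 10, "bob": 0})
ledger.transfer("alice", "bob", 7)

try:
    ledger.transfer("alice", "carol", 7)  # alice only has 3 left
    double_spend_rejected = False
except ValueError:
    double_spend_rejected = True

print(ledger.balances, ledger.total(), double_spend_rejected)
```

Because value only moves between accounts, the total never changes through transfers. That is the scarcity property in miniature.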

Verifiability: Cryptocurrencies rely on public-key cryptography to ensure that only the owner of a currency balance can initiate its transfer. The bitcoin owner uses their private key to sign the transfer record and then posts it to the network of block chain replicas. Any entity in the network can use that owner's public key to verify that the transaction is valid and that ownership has been transferred.
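
Here is a toy Python sketch of the sign-then-verify idea. It uses textbook RSA with two small, well-known primes purely because that fits in a few lines of standard-library code. Bitcoin itself uses a different signature scheme (elliptic-curve signatures), and keys this small offer no real security. The function names and the sample transfer record are made up.

```python
import hashlib

# Two well-known Mersenne primes. Far too small for real-world security.
P = 2**31 - 1
Q = 2**61 - 1
N = P * Q                          # public modulus
E = 65537                          # public exponent (the "public key")
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)


def digest(message):
    """Hash the message and reduce it into the range the toy key can sign."""
    return int.from_bytes(hashlib.sha256(message.encode()).digest(), "big") % N


def sign(message, private_key=D):
    """Only the holder of the private key can produce this value."""
    return pow(digest(message), private_key, N)


def verify(message, signature, public_key=E):
    """Anyone holding the public key can check the signature."""
    return pow(signature, public_key, N) == digest(message)


transfer = "alice pays bob 7"
signature = sign(transfer)

print(verify(transfer, signature))               # the genuine record
print(verify("alice pays mallory 7", signature)) # an altered record
```

The genuine record verifies, while a record altered after signing does not, because its hash no longer matches what the signature encodes.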

Availability: Those who host a copy of the block chain have to perform the cryptographic calculations necessary to verify transaction validity and prevent fraud. Those who do this fastest are periodically rewarded through the creation of new Bitcoin balances. Because of the reward, maintaining the block chain is known as "mining" and a small industry of Bitcoin mining software and devices has developed. All users of cryptocurrency benefit from this because the more miners exist, the more secure the currency becomes due to the duplication of records and validation.
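
In Bitcoin, the race among miners takes the form of a hash puzzle known as proof of work. Here is a much simplified Python sketch of that kind of puzzle: search for a number (a "nonce") that makes the hash of a record start with a run of zeros. The record format, the separator, and the difficulty of three hex digits are my own choices for illustration, not Bitcoin's actual block format or target.

```python
import hashlib


def mine(block_data, difficulty=3):
    """Try nonces until the SHA-256 hash starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        candidate = f"{block_data}|{nonce}".encode()
        block_hash = hashlib.sha256(candidate).hexdigest()
        if block_hash.startswith(prefix):
            return nonce, block_hash
        nonce += 1


nonce, block_hash = mine("alice pays bob 7")
print(nonce, block_hash)
```

Finding the nonce takes thousands of attempts on average at this difficulty, but checking it takes a single hash. That asymmetry is what lets every replica cheaply confirm a miner's work.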

This is a tremendously clever scheme because it simultaneously ensures a consistent supply of currency, decentralizes operation, and secures the network against manipulation by creating thousands of replicas of the block chain.

Potential Impact

The true value of any currency is the willingness of a community of people to use it for daily transactions. The three requirements, Scarcity, Verifiability, and Availability combine to cause people to trust a particular currency. When that trust is lost you can get bank runs, hyperinflation, or simple destruction of wealth. Meanwhile, the community rushes to find a new currency.

The advent of the internet with myriad handheld devices capable of initiating transactions makes it possible for multiple currencies to coexist. For the first time in history, people may have a choice among currencies to use in daily transactions. Central bankers, and the sovereign countries that endow them with their power, are appropriately worried. An industry that has historically been immune to competition no longer has that protection.

I think this is a good thing. As in any other competitive market, competition should incentivize good behavior both from established central banks and from upstart cryptocurrencies.

23 May 2014

Illusions of Success when Inputs are Confused with Outputs

Prosperity has been defined as, "the state of flourishing, thriving, good fortune and / or successful social status." In the United States we tend to measure prosperity in terms of wealth, or lack thereof. Indeed, the U.S. government defines poverty (the lack of prosperity) as having an income below $15,730 for a household of two. The trouble is that this confuses the output (or outcome) of prosperity with one of its inputs, income (or wealth). And while the two values often correlate, they can be quite different.

In the early 1800s, Georgia gave away millions of acres of land through a series of land lotteries. Nearly everyone who was eligible entered the lottery because an individual had a roughly 1 in 5 chance of winning and a typical parcel was worth about the median net worth of a Georgia resident. A penniless person who entered the lottery had a one in five chance of suddenly becoming wealthier than half of the residents of the state.

When Hoyt Bleakley, of the University of Chicago, and Joseph Ferrie, of Northwestern University, learned of this event they found it to be a convenient natural experiment. Does handing out wealth to random individuals elevate their prosperity and does that prosperity carry over to future generations? The answer, at least in this particular case, seems to be "no." Even though wealth and prosperity are correlated, increasing wealth didn't increase the prosperity of the children. As Bleakley said on a Freakonomics podcast, "Maybe the resources have to come from outside the household, be it say a good public school. Maybe the resources have to come from the parents, but the parents don’t know how to provide it in terms of nurturing, in terms of reading and communicating ideas to their children, etc." In other words, wealth is only one of the contributors to prosperity and it may be among the least important.

Optimizing the Wrong Thing

When two features, like wealth and prosperity, are correlated, and one is easier to measure or influence than the other, a common mistake is to focus on the more convenient factor. The result is a host of unintended consequences.

This is a case where feedback loops offer insight:

[Figure: A feedback loop with a short-circuit bypassing the system (or student).]

In a proper feedback loop, we measure the output, compare it with the reference, and use it to choose the proper input. But when inputs are confused with outputs, the feedback loop is short-circuited – as with the red line in the above diagram. The evidence of this is when we get all kinds of reports showing how good the inputs are. Meanwhile, the real goal suffers.

A pedagogical feedback loop measures student outcomes (in the form of competencies or skills), compares them with standards of what students should know, and uses the result to choose appropriate learning activities. But, when inputs are confused with outputs we get reports of good student attendance, appropriate construction of curriculum, the prescribed amount of seat time, properly trained and certified teachers, high quality facilities, and all kinds of other reports about the inputs. Meanwhile, the output, in terms of student skills, remains unimproved.
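
The difference between the two loops can be shown with a very small Python simulation. Everything in it is made up for illustration: the "system" is just a multiplier that turns input into output, and the numbers (a target of 80, a gain of 0.5, a prescribed input of 100) are arbitrary assumptions, not data.

```python
def proper_loop(reference, gain, steps=20, k=1.0):
    """Measure the output, compare it with the reference, adjust the input."""
    input_level = 0.0
    for _ in range(steps):
        output = gain * input_level      # what the system actually produces
        error = reference - output       # compare output with the reference
        input_level += k * error         # choose the next input from the error
    return input_level, gain * input_level


def short_circuited_loop(prescribed_input, gain):
    """Hold the input at its prescribed level and never look at the output."""
    return prescribed_input, gain * prescribed_input


target = 80.0
used_input, achieved = proper_loop(reference=target, gain=0.5)
reported_input, actual = short_circuited_loop(prescribed_input=100.0, gain=0.5)

print(f"proper loop: input {used_input:.1f}, output {achieved:.1f}")
print(f"short-circuited: input {reported_input:.1f}, output {actual:.1f}")
```

The proper loop settles on whatever input is needed to reach the target. The short-circuited loop can report that the prescribed input was delivered in full while the output sits well short of the target.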

Here are a few other inputs and outputs to consider:
To be sure, there's correlation in every one of these cases. But, just as with the Georgia Land Lottery, manipulating the input frequently diminishes the correlation and results in a less-than-desired outcome. Focusing on, and reporting about, the inputs can give the illusion of success. Focusing on the outcome helps identify other factors that contribute to the desired result.

Furthermore, excess focus on inputs results in missed opportunities. As Michael Horn and Katherine Mackey wrote, "Focusing on inputs has the effect of locking a system into a set way of doing things and inhibiting innovation; focusing on outcomes, on the other hand, encourages continuous improvement against a set of overall goals and can unlock a path toward the creation of a student-centric education system."

Incentives are Inputs

Just as mistaking outputs for inputs causes trouble, the reverse is also true. A 2011 study by the Hamilton Project compared incentives tied to inputs with incentives tied to outputs. Groups of students were offered financial incentives tied to input activities such as number of books read, time spent reading, or number of math objectives completed. Other groups were offered incentives tied to outcomes such as high test scores or class grades. The study found that input incentives were much more effective than output incentives. Among their recommendations are:
  • "Provide incentives for inputs, not outputs, especially for younger children."
  • "Think carefully about what to incentivize."
  • "Don't believe that all education incentives destroy intrinsic motivation."

This shouldn't be surprising. Incentives, at least when given to the student, are inputs. Incentivizing outcomes is a different kind of short-circuit in the feedback loop.

[Figure: Feedback loop with a short-circuit bypassing instructional influence.]

In a pedagogical feedback loop the instructional system interprets the results of assessment before passing them on to the student. When we incentivize the outcomes (or assessment thereof) we bypass the capacity of the education system to interpret student needs and prescribe the right learning activities.

It's notable that the Hamilton Project study found that incentivizing outcomes was especially ineffective for younger students. Among the goals of any educational system should be to develop students into independent learners. A mature, independent learner has taken on pedagogical skill and responsibility. For independent learners, incentivizing outcomes should be more effective.

Nevertheless, the Hamilton Project study didn't neglect outputs. In every experiment, the effect of the incentives was evaluated according to student outcomes. Only the point of intervention was changed.

Effective Measurement and Improvement

In 2005, New Hampshire abolished the Carnegie unit – a measure of seat time by which most U.S. schools quantify educational credits. "In its place, the state mandated that all high schools measure credit according to students’ mastery of material, rather than time spent in class." Thus, New Hampshire has shifted its fundamental measure of student achievement from an input to an output. Early results of that change are promising.

To be sure, optimizing certain inputs still has a positive impact. Otherwise schools would have completely failed since the institution of the Carnegie Unit in 1905. But shifting the focus from inputs to the outputs we wish to optimize will open the door to greater innovations and more rapid improvements in student achievement.

17 March 2014

Lecture Experiment at Summit Public Schools

A couple of weeks ago I attended the LearnLaunch conference in Boston. In one of the sessions, Diego Arambula from Summit Public Schools told a great story:

In one of their blended learning classes the students were taught by a team of teachers and given flexibility to choose the activities they felt would best help them learn the subject. One of the activities the teachers introduced was optional lectures. Strategically scheduled shortly before tests, the lectures gave students a chance to review material and solidify understanding.

At first, the lectures were quite popular – probably due to their proximity to tests. However, they found that the scores of those students who attended the lectures were not significantly different from those who chose not to do so. The students must have sensed the lack of impact because attendance at the lectures dwindled.

When lecture attendance fell to 3-5 students, scores of those who attended suddenly shot up. Arambula asked the teachers what was happening. The teachers said that with so few students attending, they didn't really deliver a lecture. Rather, they asked the students what areas they were struggling with and they concentrated the time on those particular issues. In other words, the lectures turned into teacher-led study groups or small-group tutoring sessions.

Eventually the teachers abandoned the lecture format and opened a "help bar" at the back of the classroom. With at least one teacher staffing the bar, students could go there just about any time for one-on-one or small-group assistance.

There are a bunch of things to learn from this vignette. Here are a few:
  • Summit was prepared to measure the effectiveness of the optional lectures (and presumably any other learning option they offer).
  • The teachers and staff are as much in a learning mode as the students. They discover what works and adjust in those directions.
  • Tutoring and small-group instruction are tremendously effective even when they account for a small part of the student's learning experience.
Finally, Summit established an environment where innovation like this is natural and encouraged.

27 January 2014

Personalization Relies on Standardization - A Medical Metaphor

In my last post, I wrote about Yong Zhao's observation that the U.S. leads the world in cultivating 21st century skills like Confidence, Risk-Taking, Creativity and Entrepreneurship. Zhao is concerned that the current U.S. "obsession" with standards and assessment will result in reduced appreciation of creative endeavor. Indeed, Zhao's concerns are confirmed by contemporary de-emphasis of arts and humanities education in U.S. public schools.

I share Zhao's concern that today's schools suffer from excess focus on achievement as measured by test scores. I also agree with him that some of this is encouraged by federal programs like No Child Left Behind. However, I disagree with Zhao in that I believe that achievement standards and testing aren't the cause of the problem. Indeed, they're a critical part of the solution.

To explain this apparent contradiction, I’ll borrow a metaphor from Sir Ken Robinson. When I go to my physician, I expect a personalized, custom experience. I expect him to diagnose, treat and prescribe according to my personal needs. In order to do this, however, the doctor will use standard tests. He'll do a standardized exam and ask me standard questions. For example, he’ll measure my temperature in degrees and compare it against 98.6 Fahrenheit. He’ll measure my blood pressure in millimeters of mercury and compare that against standards established by the American Medical Association. Based on those results he may follow up with custom questions or tests chosen according to my individual needs. But even those follow-on tests will be compared against standards. Finally, he'll prescribe a course of treatment that's customized to my individual needs.

Admittedly, not all doctors handle standards the same way. For example, when my cholesterol tested high, one doctor called in a prescription for Statin drugs without consulting me. This bothered me as I wanted to discuss how serious the problem was and consider alternatives like diet and exercise before simply taking a drug. Indeed, another doctor recommended a Coronary Calcium Scan before going on Statins. The test came out clean and I'm putting additional effort into my exercise.

That’s what standardized testing, properly done, is all about. This school year, the Smarter Balanced Assessment Consortium will test more than three million students in grades 3 to 11. The results from this first year will be used to calibrate the tests and find reasonable benchmarks for student achievement in English and Mathematics. In future years, students’ test results will be used by teachers, students and parents to customize learning activities to the needs of every child.

This isn't a complete solution. We need to actively fight the tendency to teach only what’s going to be tested. Not only is it not good for the child, strangely enough, “teaching to the test” doesn't improve scores as much as a well-rounded education. We also need to resist efforts to standardize curriculum and teaching. Standards belong to measurement of the results of education, not to the inputs.

Doctors can only directly measure a few vital signs and compare them to standards. For more detail they perform or prescribe more extensive tests. Some of these are screenings like the cholesterol test I had with my annual physical. Others are specific to certain problems like the CT scan I had after breaking some ribs. But even the full battery of tests available to a physician can't discover all issues. For the rest, a physician has to rely on interviews, experience, consultation with other doctors and sometimes trial-and-error.

The same is true for education. We can only measure a few of the factors that go into a well-rounded education. The Common Core State Standards only apply to fundamental skills in reading and mathematics. It's a small fraction of all that we hope children will learn. But that doesn't mean we should throw out the standards. Literacy and numeracy are fundamental skills that are prerequisite to every other academic skill we desire students to develop. The mistake is to assume that just because these are the skills that are being measured that they are the only ones that count.

Standards and testing are useful tools – but only when they serve the greater goal of developing confident, creative adults who are capable of a lifetime of self-directed learning.

16 January 2014

Is the U.S. Leading or Trailing the World in Education?

Is the United States leading or trailing the world in education? Unsurprisingly, it all depends on how you measure. And if we emphasize the wrong factor, we risk losing important qualities of the existing educational system.
[Figure: 2012 PISA Rankings for Mathematics]

First, the bad news. The results of the 2012 PISA tests were released in early December 2013. The United States ranks 26th in math and is below the OECD average in all three tested areas: Mathematics, Reading and Science. So, the common narrative that U.S. education trails the economically-developed world seems to be supported.

But if that's the case, how then does the U.S. rank 6th in per-capita GDP, 5th in Global Competitiveness and 2nd in Global Creativity?
Could it be that the U.S. economy is simply coasting based on a previous lead? That doesn't appear to be the case. Previous studies show that the U.S. trailed other industrialized countries in Mathematics, Science and Reading in the 1960s, 1980s, and 1990s. In fact, U.S. rankings relative to other countries have improved somewhat over the last 50 years. It would seem that the U.S. advantage leverages factors not captured by these test scores.

TIMSS is an international test of Mathematics and Science proficiency. In addition to measuring students' mathematical skills it also surveys their attitudes toward mathematics. Yong Zhao, an articulate critic of factory-model education, has drawn some interesting information out of the TIMSS results:

Country          Math Scores   Confidence % (4th Grade)   Value Math %
Korea            613           03 (11)                    14
Singapore        611           14 (21)                    43
Chinese Taipei   609           07 (20)                    13
Hong Kong        586           07 (24)                    26
Japan            570           02 (09)                    13
United States    509           24 (40)                    51
England          507           16 (33)                    48
Australia        505           17 (38)                    46

Among countries, there's an inverse relationship between achieving high math scores and either valuing or having confidence in the use of math. Not visible in the table is that the TIMSS results also show that within countries, higher math achievement does correlate with greater confidence and with valuing mathematics. So, while higher skill in math results in greater confidence on an individual level, countrywide programs that result in high math scores do not result in high mathematical confidence or a sense of the value of mathematics.

It also suggests that development of mathematical skill must be combined with gaining confidence in applying mathematics and a sense of the value of mathematics. Mathematical skill alone is not sufficient to develop Numeracy.
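For readers who want to check the between-country pattern themselves, here's a minimal sketch that computes a Pearson correlation from the figures in the table above. The function and variable names are my own, and I'm assuming the first figure in the Confidence column is the one to pair with the math score (the parenthesized figure is the 4th-grade value). With only eight countries this is an illustration, not a statistical finding, and a correlation says nothing about cause.

```python
from math import sqrt

# Figures copied from the table above:
# country -> (math score, confidence %, value math %).
# Assumption: the confidence figure is the first one in each row,
# not the parenthesized 4th-grade figure.
TIMSS_ROWS = {
    "Korea": (613, 3, 14),
    "Singapore": (611, 14, 43),
    "Chinese Taipei": (609, 7, 13),
    "Hong Kong": (586, 7, 26),
    "Japan": (570, 2, 13),
    "United States": (509, 24, 51),
    "England": (507, 16, 48),
    "Australia": (505, 17, 46),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


scores = [row[0] for row in TIMSS_ROWS.values()]
confidence = [row[1] for row in TIMSS_ROWS.values()]
value = [row[2] for row in TIMSS_ROWS.values()]

score_vs_confidence = pearson(scores, confidence)
score_vs_value = pearson(scores, value)

# A negative coefficient means higher-scoring countries report lower percentages.
print(f"score vs. confidence: {score_vs_confidence:.2f}")
print(f"score vs. value:      {score_vs_value:.2f}")
```

Both coefficients come out negative for these eight rows, which is the inverse relationship described above.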

Zhao concludes his analysis by pointing out that Confidence, Creativity and Entrepreneurship are key skills that drive U.S. economic leadership. An excessive emphasis on rote learning and test scores, what he calls an "employee-oriented" education, tends to suppress the more "entrepreneur-oriented" skills that are in demand for the 21st century. Rather than praise U.S. education for developing those skills, he simply says that U.S. education "is much less successful in stifling creativity and suppressing entrepreneurship."

I join Zhao and many others in decrying the factory model of education. We can do a lot better than simply "less bad." Our schools should foster more creativity and offer more personalized learning experiences. They should be places where it's safe to fail – especially when taking on a big challenge. And schools should encourage students to pursue studies in individual areas of interest.

Strangely enough, standards and even standardized testing can help with this but only when used properly. I'll elaborate on how that might be accomplished in my next post.

19 December 2013

Guest Post: Teacher Attitude Affects Learning and Testing

The following guest post is from Eileen Nagle, an extraordinary teacher who taught my son in 6th and 8th grades. Here she relates how teacher attitude and context can dramatically impact students' testing experience.

A professor I had in New Jersey taught us that if we taught above and beyond the state standards, played games during test week, and didn't make a big deal about the tests that our students would score high on the state testing. From my previous experiences I believe that testing results are very strongly determined by the teacher and the environment in which the test is given.

One year while teaching at an elementary charter school, I was the lead teacher of three grade level teachers. Jan (names are changed), who was in her 30s, had recently graduated from a local college. Next was Tammy who I had worked with the previous year, and then myself. I had received my certification only three years earlier but had homeschooled my own children for 17 years.

Jan wouldn’t do anything that she hadn't learned in college or that wasn't on the state standards. Being a charter school we had a more enriched curriculum than the local public schools, which is why the parents sent their children to our school, but she wouldn’t do it. Her students would express that they ‘hated’ various parts of the curriculum, parroting their teacher. Tammy was in her 20s and was open to ideas and taught the extended curriculum. Because of Jan’s protests of not being able to handle them, Tammy and I each had a very high needs student in our classes. Tammy struggled each day with classroom management because of the challenges her student presented.

When state testing was coming up we met and I gave them some ideas on how to handle the week.  I told them to tell their students that the test wasn't any part of their grades and to just have fun with it. Jan was freaking out. With my students, I sent a note home asking for healthy snack donations to give to the students before and after tests. Parents were generous and many donations came in. I prepared an art project, a Mother’s Day gift, they could work on when they were finished with each testing section so they wouldn't sit and get bored.  We did fun games to loosen up muscles during the day.

Academically, I didn't do anything special to prepare them other than just teach as I always did. I believe an enriched curriculum taught in a fun way, with lots of music and role playing, will go far in preparing students for testing. They learn deduction skills, retain the information because it was fun, and do well on tests. I didn't do any 'test prep'.

At the end of test week my students wanted to know when it was supposed to get hard.  They thought it was a cake walk and requested another test week the following week.

Jan's experience was different. The first day of testing she came into my room in a panic saying two of her students had thrown up. Her hands were shaking as she described students saying they were scared and were crying. She was crying too saying the stress was too much. She wouldn't take any of my advice so there wasn't anything I could do about it.

The test results came back: my class scored the highest, Tammy's was in the middle and Jan’s class scored the lowest of the three. There are many variables to testing, but students will perform better if teachers will:
  • Creatively teach to a much higher level than the state tests all year.
  • Avoid teaching to the test.
  • Reduce any stress going into the tests.
If one must test, these steps will certainly improve the experience and the outcome.

Eileen Nagle is the Outreach and Workshop Coordinator at the Noorda Theatre Center for Children and Youth, Utah Valley University. She can be reached on LinkedIn (Eileen Nagle) or Facebook (Noorda Center)

25 November 2013

Quantifying Learning: Alternatives to the Carnegie Unit

In 1905, Andrew Carnegie was seeking "ways to improve the economic standing of college professors and the provisions for their financial security in old age" (ref here). In consultation with the president of MIT, he created a free pension fund for college professors. Of course, many colleges and universities were eager to participate in a free benefit of such value. So the Carnegie Foundation for the Advancement of Teaching, which administered the pension, had to set standards for qualification. Among the requirements was that institutions would use the "standard unit" when evaluating high school transcripts for student admission.
The standard unit was created by Charles W. Eliot at Harvard University. Essentially, it measures the number of contact-hours between student and professor. The unit used by the Carnegie Foundation represented 120 hours of class or contact time over the course of a year at the high school level. This is now known as the Carnegie Unit and remains the primary way of measuring achievement in U.S. high schools. On the heels of that, Morris L. Cooke (also with support from the Carnegie Foundation) established the collegiate Student Hour as one hour of lecture, lab work, or recitation per week for a single semester. Today we usually call these "Credit Hours."
Seat time measures like Carnegie Units and Credit Hours are only proxies for actual student learning. Adding class grades to the measure is an attempt to increase their reliability. But there are two problems with this. First, grades are not necessarily a good indicator of actual learning. Anyone who has been through school knows that their best grades aren't necessarily in the classes where they learned the most.
Second, grades reinforce the industrial era notion of school as a sorting device. We send thousands of different students through the same learning experience and then grade their performance. Based on those grades, society decides who is qualified for college and a professional career, who should go into service industries, manual labor, and, perhaps, who will be our criminals.
School doesn't have to sort so viciously. A growing body of evidence indicates that by personalizing learning a majority of students can achieve readiness for college and professional careers. That's important because with automation and offshoring, the number of unskilled jobs in the U.S. is diminishing. But with teacher compensation, student evaluation, school budgets, admissions, financial aid, and pension plans all tied to seat time measures, the environment hasn't been conducive to personalization.
Recognizing this, the Carnegie Foundation recently set out on a year-long quest to seek better ways to measure student learning. The result should be a measure based on competency, not time. The results of their study are due in 2014. In the meantime, here are some of the alternatives already emerging:

Challenges and Waivers

This is an effective interim solution. Alabama and Michigan have Seat Time Waiver policies for high school credit. If students can show mastery of a topic, they are granted credit for the course regardless of how much time they spent studying or in class. The Ohio Credit Flexibility Plan allows students to earn high school credit by demonstrating competency, completing classroom instruction or a combination of the two. The College-Level Examination Program and similar programs allow college students to obtain credit by demonstrating knowledge on a standardized test. Many universities also allow students to take challenge or exemption exams.
Notably, all of these programs convert demonstrations of competence into seat-time units or waivers thereof. The Carnegie Unit and Credit Hour as measures of learning remain intact. These options represent a transition rather than a new solution.

Merit Badges

The Boy Scouts and Girl Scouts award badges when youths demonstrate skills like First Aid, Knot-Tying, Swimming or Computer Programming. Patches are earned by attending events. Scouting organizations borrowed the badging concept from centuries of military tradition. Education badges are based on this model. Organizations like UC-Davis and Khan Academy have badging systems. The Mozilla Open Badges project is an effort to create a universal format and exchange for badges of all types. They've signed up a diverse variety of organizations and institutions including colleges and universities, MOOCs, professional training companies, the Smithsonian museums and more.

Competency-Based Schools

Western Governors University substitutes "Competency Units" for credit hours. Students receive credit when they prove competency. This lets students get credit for prior knowledge and also lets them progress through the course materials as quickly or slowly as they choose.
New Hampshire is initiating a statewide redesign of high school education that will be based on demonstrations of competency. In a similar vein, the Re-Inventing Schools Coalition (RISC) is working to help schools develop a performance-based system for earning credit. Among their members are the Adams County School District in Colorado and the Chugach School District in Alaska.


Professional certification programs like MCSE or CCNA specify a set of competencies and a way to demonstrate the associated skills. Individuals seeking a credential can choose the path that suits them – reading a book, attending a class, watching videos, or an online course. Once competencies have been specified, it's possible to separate the learning of a skill from demonstration of that skill. When learning and credentialing are unbundled it's possible to compare different learning methods to see which is more effective. And different students can choose methods that are better suited to their current needs, market positioning or student body. 

Making Success the Only Option

An oft-repeated phrase among competency advocates is that grades should be "A, B, and 'still working on it.'" This necessitates flexibility on the part of the teachers and the school to meet the needs of each individual student. To do this in a conventional classroom takes more time and energy than should reasonably be asked of a teacher. Among the best ways to apply technology in education is to expand teachers' capacity to personalize education and spend more time one-on-one with students.
The other source of capacity is the students themselves. In the long run, our goal is to train students to be self-learners. If the right resources are offered, students can adapt the learning experience to match their own needs.

11 October 2013

Things Engineers Can Teach Us About Feedback

Some time ago I wrote about feedback loops – how they are part of the engineering discipline of Control Theory and how, by substituting a few words, the principles apply surprisingly well to education.  Here's a diagram of a feedback loop according to control theory:
A closed-loop control system.
And here's the same diagram substituting educational terms for the engineering ones.

A personalized learning system.

In that post I noted that the feedback loop needs to be "closed" in the sense that we use feedback to influence the direction instruction should take. Also that feedback needs to be "negative" in the mathematical sense. That is, feedback should reflect the difference between skills the student demonstrates and standards that are to be taught. Both of these concepts, closed feedback loops and negative feedback, are derived from engineering control theory.
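To make "negative feedback" concrete, here's a small sketch of the subtraction involved. The skill names, the 0-to-1 mastery scale and both function names are hypothetical, chosen only for illustration; a real system would need a far richer model of skills and standards.

```python
def skill_gaps(standards, demonstrated):
    """Negative feedback: the target level minus the demonstrated level, per skill.

    Both arguments map a skill name to a mastery level between 0.0 and 1.0
    (a hypothetical scale). A skill with no evidence yet counts as 0.0.
    """
    return {
        skill: target - demonstrated.get(skill, 0.0)
        for skill, target in standards.items()
    }


def next_focus(standards, demonstrated):
    """Closing the loop: pick the skill with the largest remaining gap.

    Returns None when every standard has been met or exceeded.
    """
    gaps = skill_gaps(standards, demonstrated)
    skill = max(gaps, key=gaps.get)
    return skill if gaps[skill] > 0 else None


standards = {"fractions": 0.8, "decimals": 0.8}
demonstrated = {"fractions": 0.9, "decimals": 0.5}
print(next_focus(standards, demonstrated))  # prints "decimals"
```

The loop is "closed" because the gap feeds directly into the choice of what to work on next, rather than being recorded and set aside.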

In this post, we'll consider three more insights we can gain from control theory: the role of a transfer function, speed and frequency of feedback, and sensitivity to what is being measured.

Transfer Functions

A transfer function is a mathematical description of the relationship between the input and the output of a system. Let's use the car example from my previous post. In that case, the input is the position of the gas pedal (more correctly called the "accelerator pedal" as we'll see in a moment). The output is the speed of the car. Pressing the pedal to a certain position doesn't make the car go a corresponding speed. Rather, pressing the pedal causes the car to accelerate at a rate proportional to the pedal position. If you keep the pedal pressed to the floor, the car will continue to accelerate to higher speeds until it reaches the limits of its construction. (For calculus fans, this means that the transfer function of a car's drive train is an integral.)

The transfer function is important because it's incorporated into the design of the controller. When an engineer designs a controller they use the transfer function to anticipate what will be the result of a particular input. Typically they take the inverse of the function to determine what input is required to achieve the desired output.
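Here's a rough sketch of the car example in code. It assumes an idealized car with no drag and no top speed, and the constant of 3.0 speed units per second at full pedal is made up for illustration. The first function is the transfer function (speed accumulates, or integrates, the pedal position over time). The second is the inverse a controller designer would use to decide how far to press the pedal.

```python
def simulate_speed(pedal_positions, accel_per_unit_pedal=3.0, dt=1.0):
    """Idealized drive train: acceleration is proportional to pedal position,
    so speed is the running sum (integral) of the pedal input.

    accel_per_unit_pedal is a made-up constant; drag and top speed are ignored.
    """
    speed = 0.0
    speeds = []
    for pedal in pedal_positions:
        speed += accel_per_unit_pedal * pedal * dt
        speeds.append(speed)
    return speeds


def pedal_for_target(current_speed, target_speed, accel_per_unit_pedal=3.0, dt=1.0):
    """Inverse of the transfer function: the pedal position needed to reach
    target_speed within one time step under the same idealized model."""
    return (target_speed - current_speed) / (accel_per_unit_pedal * dt)


# Holding the pedal steady does not hold the speed steady; the car keeps accelerating.
print(simulate_speed([1.0, 1.0, 1.0]))  # [3.0, 6.0, 9.0]

# The inverse tells the controller what input yields the desired output.
pedal = pedal_for_target(current_speed=0.0, target_speed=6.0)
print(pedal, simulate_speed([pedal]))  # 2.0 [6.0]
```

If the model inside `pedal_for_target` is wrong for a particular car, the controller overshoots or undershoots, which is why the quality of the transfer function matters.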

The educational equivalent of a transfer function is a learning theory – a description of how people learn. Learning theories help us select activities that will effectively help a student learn a particular skill. Descriptions of the various theories and their strengths and weaknesses are beyond the scope of this post (or my skills for that matter). I recommend the Wikipedia article on the subject. But we can derive two important insights from this:
  • A personalized learning system will inevitably express some learning theory in the selection of activities. It would be best to deliberately select the theory and design the system accordingly.
  • There are personal differences in the way each student learns. In engineering terms, this means that each student has their own personal transfer function. Therefore, the selection of activities should be tuned to the student's individual interests and affinities.

Speed and Frequency of Feedback

The time from the moment an output is measured to a resulting change in the input is called a propagation delay. In educational terms, this is the time from when a student's skill is assessed until the moment the student's activity is affected by that assessment. In a traditional math class the student does homework one day, submits it the next day and receives graded homework back the day after that. Thus, the propagation delay is two days (or two class periods). Fast feedback means a shorter propagation delay. Many online learning systems offer near instantaneous feedback. Measurements that require human grading will naturally be slower.

Frequency of feedback is a measure of how often the output (or skill) is measured and feedback generated. In the traditional mathematics example, feedback is daily (or once per class period). Some traditionally taught courses may only have two or three graded activities in the entire course. However, this may be a pessimistic way to measure frequency. For example, if students can check their answers in the back of the book then feedback is both faster and more frequent.

A third component of educational feedback is richness. In math, this might be the difference between being told that an answer is wrong and being informed about exactly what mistake was made. In English it might be the difference between a simple score and detailed feedback about how the student might improve their paper.

Students can influence all three feedback factors. For example, if English students seek help at a writing lab then they will be getting faster, more frequent and richer feedback than students that don't make use of the resource.

Control theory tells us that faster and more frequent feedback compensates for inaccurate measurements and poorer transfer functions. In education language, this means that if we can make feedback faster and more frequent we can compensate for a less-than-perfect learning theory and suboptimal assessments.

Of course it would be nice to have everything -- fast, frequent and rich feedback, good quality assessments and a solid learning theory. But it's useful to know that there are real tradeoffs among these factors.
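The tradeoff between feedback frequency and model quality can be seen in a toy simulation. Everything here is an assumption made for illustration: the "level" is a single number, the controller believes its efforts are twice as effective as they really are (a poor transfer function), and it only re-measures every `period` steps.

```python
def final_error(period, steps=12, target=100.0, true_gain=1.0, assumed_gain=2.0):
    """Distance from the target after `steps` steps when feedback arrives
    only once every `period` steps.

    The controller plans with assumed_gain, but the system really responds
    with true_gain, so each plan falls short. All numbers are illustrative.
    """
    level = 0.0
    effort = 0.0
    for step in range(steps):
        if step % period == 0:
            # Feedback arrives: measure the gap and re-plan the effort so that
            # (according to the flawed model) the gap closes in one period.
            gap = target - level
            effort = gap / (assumed_gain * period)
        # Between measurements the same plan is followed without correction.
        level += true_gain * effort
    return abs(target - level)


for period in (6, 3, 1):
    print(f"feedback every {period} step(s): remaining gap {final_error(period):.2f}")
```

With the same flawed model and the same twelve steps, measuring every six steps leaves a quarter of the original gap, while measuring every step leaves almost none. Each measurement is a chance to correct for the model's error, so more measurements mean more corrections.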

Sensitivity to What's Being Measured

Feedback loops are a very effective tool; so effective that if the wrong thing is being measured or the wrong feedback is offered then the wrong skill will be optimized. A recent manifestation of this is the complaint about "teaching to the test." The concern is that since summative tests are used to evaluate schools then the only skills that will be taught are those that are on the test. While this outcome is common, it's unfortunate since studies have shown that focus on conceptual understanding results in better test performance than test-focused instruction.

It's also manifest in the combination of skills that a particular problem might require. For example, a mathematics story problem might require reading, visualization, and problem solving skills in addition to the ability to solve the resulting mathematical equation. In order to offer feedback to a wrong answer, the system (whether human or automated) must be able to detect which of these skills was not applied properly. In most cases, this requires interacting with the student to discover the steps followed in answering the question.

It's tempting to try and isolate skills and only assess one at a time. There are two reasons why this won't work. First, it's very likely that you're seeking the student's ability to use multiple skills together. Second, the demand for some skills simply can't be eliminated. For example, nearly every assessment requires the skill, "Can read and follow directions."

Applying Feedback Loops

To summarize, engineering offers us the following insights about using feedback in education:
  • Choose your learning theory deliberately and measure its effectiveness.
  • Adapt not only to what the student has and has not mastered but to the individual learning patterns and affinities of each student.
  • Fast and frequent feedback can compensate for lower quality in other areas of the system. This is a two edged sword; you may think you have a good learning theory when, in fact, it's fast feedback that's making the difference. But it's also an opportunity to make deliberate trade-offs.
  • Be sure you're measuring what you think you're measuring. And don't forget that every assessment measures multiple skills.

26 July 2013

Can Consortia Improve Standardized Testing?

The NCLB embodiment of standardized testing has been in place for eleven years now. Unfortunately, it hasn't resulted in substantial improvement in student learning. But there are things the multi-state assessment consortia can do that will improve the situation.

To many of us, the lack of progress has been confusing. In just about every other industry, when you measure performance and report it back, performance improves. Education is proving to be a more difficult problem than most.

Unintended Consequences

Part of the issue is unintended consequences from standardized testing. Consider a teacher who is anticipating the year-end tests. She and her principal are under pressure to achieve Adequate Yearly Progress (AYP) goals. So, they have regular drill and review sessions -- test preparation to make sure the students are ready.

All of this "teaching to the test" takes time and resources away from more enriching and interesting learning experiences. And it doesn't work. The Measures of Effective Teaching project found that teaching to the test is not as effective as a focus on conceptual understanding and applications. (MET Preliminary Findings Page 21)

That shouldn't be surprising. I wrote about the Flow Channel or Zone of Proximal Development a few months ago. In order to keep the student's attention and prevent frustration, the work needs to be new and challenging but not excessively so. If the work is too easy, the student is bored. If too hard, the student is anxious. In either case the student is frustrated and learning does not occur. Constant drilling in preparation for the test puts students deeply in the boredom zone.

There are lots of important skills that aren't on the exam. Yong Zhao has written that the most important 21st Century Skills are creativity, entrepreneurship and independent learning. These aren't emphasized in the standardized exams. But the basic skills of literacy and numeracy (which do appear) are required for the kind of creativity and entrepreneurship we need. For this reason, I wish that the Common Core State Standards were named the "Common Foundation Standards" because that's the way I see them -- the foundation skills required for creative work.

This is another unintended consequence of standardized testing. Excessive focus on the exam steals classroom time that could be used for creative application of the knowledge or for self-directed learning. The MET study and others show that more of the latter activities result in both better prepared students and better exam results.

How To Do Better

Noting the minimal progress, many advocates call for abandoning the standards and assessments altogether. They look at the amount of time and money dedicated to assessment and suggest these resources could be spent in better ways. But, in the absence of standards and measurement we wouldn't know if we are succeeding or failing. The best we could hope for is blissful ignorance. In my opinion we must push forward, improving standards, measurement and teaching.

While improvement will require changes throughout the educational system, there are things that can be done with the assessments themselves to support improvement. Here are some of the things being done by the Smarter Balanced Assessment Consortium and by PARCC:

Richer, More Authentic Assessment Items
Both consortia will use computer-delivered assessments that are much closer to real-world activities. An emphasis on constructed response items will require students to compose an answer to a problem, not just select from a set of prewritten answers. Upon evaluating the consortia's assessments, the UCLA CRESST center concluded, "Both PARCC and Smarter Balanced summative assessments ... will represent many goals for deeper learning, particularly those related to mastering and being able to apply core academic content and cognitive strategies related to complex thinking, communication, and problem solving."

Teaching to the test becomes less of a problem the closer the exam gets to assessing real and authentic skills.

Guidance from Interim Assessments
In addition to the year-end summative assessments, both consortia are offering voluntary interim assessments that teachers and administrators can use to gauge students' understanding throughout the school year. If students are found to be prepared, less time will be spent on boring and unnecessary drills. Likewise, identification of weak areas can guide teachers in reviewing just the necessary lessons.

Professional Development and Formative Activities
The consortia are developing training materials for teachers. These will include information on how to plan formative assessment activities for the classroom and how to interpret and make use of assessment results.

Precise, Individual Level Reporting
Existing state assessments measure the number of students who have achieved the state competency threshold for their particular grade. These measures are reported to schools, districts and states in hopes of improving education programs at those institutions. Threshold tests can measure whether a student is above or below the expected competency but the further a student is from that level, the less accurately they can indicate the student's actual competency.

Smarter Balanced will use computer adaptive testing to precisely measure each student's skill level. In adaptive tests, questions are selected based on the results of previous assessment items and testing ends once the student's skill level has been measured to a certain level of confidence. Thus, students that are below grade level aren't subjected to a long series of questions that they can't answer and both those above and below the threshold receive accurate measures of their skill levels.
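To give a feel for how an adaptive test narrows in on a skill level, here's a deliberately simplified sketch. It is not the Smarter Balanced algorithm. It assumes a single 0-to-100 difficulty scale and an idealized student who answers every item at or below their ability correctly and every harder item incorrectly. Real adaptive tests use statistical models because actual responses are noisy.

```python
def adaptive_test(answers_correctly, low=0.0, high=100.0, tolerance=5.0):
    """Choose each item's difficulty from the previous results and stop once
    the possible range of the student's skill is narrower than `tolerance`.

    answers_correctly(difficulty) -> bool stands in for presenting an item.
    The scale, the tolerance and the bisection rule are illustrative
    assumptions, not taken from any real assessment.
    """
    asked = []
    while high - low > tolerance:
        difficulty = (low + high) / 2  # next item sits mid-range
        asked.append(difficulty)
        if answers_correctly(difficulty):
            low = difficulty   # skill is at least this high
        else:
            high = difficulty  # skill is below this
    return (low + high) / 2, asked


def make_student(ability):
    """Idealized student who gets an item right exactly when it is not too hard."""
    return lambda difficulty: difficulty <= ability


# A student well below the midpoint of the scale:
estimate, items = adaptive_test(make_student(30.0))
print(f"estimate {estimate:.1f} after {len(items)} items: {items}")
```

After one miss at difficulty 50, this student is never shown anything harder than 37.5, and the test ends after five items. A fixed-form test would have kept presenting items far out of reach.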

These more precise assessment measures will be used to generate clear, individual level reports for students, their parents and teachers. The reports will have sufficient detail to show growth year over year and to optimize instruction to address individual student needs.

~  ~  ~  ~  ~

Standards and associated assessments haven't resulted in the improvements that were hoped for. But that doesn't mean we should give up on them. They offer an important support for the Personalized Learning theory which has been proven. Refinements to standardized tests listed above will reduce unintended consequences and offer the guidance needed to optimize each student's learning experience.

06 July 2013

References Needed to Complete Engelbart's Vision

Douglas Engelbart, a pioneer in personal computing and one of my heroes, died this past Thursday. Many of the concepts he invented are part of our daily personal computing lives. But there are still a few missing pieces, one of which is a referencing system.

Here's a rough outline of how a series of pioneers developed the technologies you find familiar:
  • 1945: Vannevar Bush, Director of the Office of Scientific Research and Development, writes "As We May Think." Writing at the conclusion of World War II, Bush considers how technologies developed for war can be used to further peace. He envisions an electromechanical system based on microfilm and dry photography that can manage all of the data a person needs and help them organize it into knowledge.
  • 1968: Douglas Engelbart, Director of the Augmentation Research Center at SRI, is inspired by Bush's article. He realizes that the concept can be achieved much more readily using digital computers instead of an electromechanical system. In 1968 he demonstrates their oNLine System (NLS) in what we now call the "Mother of all Demos" including a mouse, graphical user interface, collaborative word processing, teleconferencing and a host of other features that would take decades to make it into the mainstream.
  • 1973: Alan Kay, who had attended Engelbart's demo, incorporates many of Engelbart's ideas into the Xerox Alto. Designing the Alto so that it can be used by children, Kay's insight is that the user interface should manifest the functions that are available. Thus, the system itself can teach the individual how to use it.
  • 1984: Steve Jobs, who had seen a demo of the Alto in 1979, incorporates key elements into the Apple Macintosh. Features inherited from NLS and the Alto include the mouse, GUI and computer networking. Jobs' most important contribution is to get these ideas out of the lab and offer them to a mass market.
If you watch Engelbart's Demo you will see many now-familiar ways of using a computer. NLS centered on the creation and management of documents. These documents were indexed for convenient retrieval and sharable with all other NLS users.

But there was a critical feature in NLS that we have not yet replicated. Each document was given a unique ID. Printed NLS documents were easily recognized because they included index numbers in the margins. The document ID and index number allowed individuals to reference any line in any NLS document.

That need for consistent identifiers that can be referenced has yet to be addressed for most texts. Today, the best that citation systems can do is refer to a page number. But page numbers change with different formats (e.g. hardbound vs. paperback) and editions. Suppose, for example, you want to cite a particular quote from Huckleberry Finn. In order to do so, you have to specify the publisher and edition of the book before citing the page number. And the odds of a reader having that same edition are pretty low. Textbook publishers have taken advantage of this. By changing pagination between editions, they deliberately obsolete previous editions.

With the advent of digital books the problem is compounded. Page numbers change according to user preferences like font size and page orientation. As a stopgap, the Amazon Kindle added "real page numbers" so that you can use references derived from the paper version of a book. But the problem of persistently valid references across editions lingers.

As we honor the legacy of Doug Engelbart, it's appropriate to consider one more of his innovations -- a persistent and universal referencing system. We still need it.

22 June 2013

Education Technology Readiness - Preventing the Unexpected

It's the first day of a new blended learning program. You've figured out how to acquire computers for all of the students. You've chosen a really exciting online curriculum that includes adaptive learning. You've spent the summer learning how the system works, adding personal touches to the lessons and preparing to coach the students. You've gone to the classroom, made sure it has Wi-Fi coverage, tested the bandwidth and played videos. The students arrive, log into their laptops... and everything crashes.

Technology readiness is a new concern for state and district technology directors. With high-stakes assessments going on, concerns are even higher. The RTTA Assessment Consortia have collaborated on a Technology Readiness Tool that states and districts are using to survey and report on their preparedness to perform assessments. Smarter Balanced and PARCC have added Technology Readiness Calculators that help districts and schools perform capacity planning.

But there are a bunch of things that go wrong that are overlooked by conventional planning and testing. Some of them are even missed by experienced network technicians. It's not that the tools above are flawed. It's the nature of these kinds of problems that requires them to be addressed in a different way. Today I hope to prevent them from biting you. The following list isn't comprehensive. But it's a good start.

Inadequate Bandwidth
This is why things work when you try them the night before, but on the first day of class, with 30 students all trying to stream video at the same time, it all falls apart. It's a well-known issue and bandwidth planning is a key part of the planning tools I mentioned above. Still, this is important enough that it bears an additional mention.

EducationSuperhighway is doing a survey of actual in-classroom bandwidth and using that data to advocate for better connectivity. They use a bandwidth test and ask teachers and educators to run it periodically. While they acknowledge the weaknesses of this approach (listed below) getting a lot of samples will increase the accuracy of their reports. So, if you work in a school, please go to SchoolSpeedTest.org and run their test from time to time. Not only will it inform you but it will also contribute to nationwide advocacy for school bandwidth.

Mismeasurement of Bandwidth
Bandwidth tests like SchoolSpeedTest.org are useful tools but they can be misleading. Most internet service providers offer "burst speed" in excess of the guaranteed bandwidth. Consider a municipal ISP. Perhaps they purchase 10Gbps of bandwidth from their upstream provider and parcel it out to 100 customers at 100Mbps each. None of those customers will use all of their bandwidth all of the time. So the ISP lets them burst beyond 100Mbps, using some of their neighbors' unused bandwidth. So, if you do a bandwidth test at a favorable time, the result could be much higher than what's guaranteed by your ISP.

On the other hand, I've visited with schools that get much lower performance in their classrooms than what they pay for. In their cases, outdated networking equipment or problems with their wireless networking create a bottleneck that slows things below their purchased capacity.
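
To make the point concrete, here is a minimal Python sketch of planning around the lower of the guaranteed rate and the worst speed test you have actually seen in the classroom. The function names, the sample numbers and the 3Mbps-per-video-stream figure are all assumptions for illustration; substitute values from your own ISP contract and your own tests.

```python
def plannable_mbps(guaranteed_mbps, measured_samples_mbps):
    """Capacity to plan around: the lower of the ISP guarantee and the
    worst in-classroom speed test result (all values in Mbps)."""
    if not measured_samples_mbps:
        raise ValueError("need at least one speed test sample")
    # A burst can push a test above the guarantee, and a local bottleneck
    # can pull it below, so take the most pessimistic figure.
    return min(guaranteed_mbps, min(measured_samples_mbps))


def streams_supported(capacity_mbps, mbps_per_stream):
    """Whole number of simultaneous streams that fit in the capacity."""
    return int(capacity_mbps // mbps_per_stream)


# Hypothetical school: 100 Mbps guaranteed, three speed tests on different days.
capacity = plannable_mbps(100, [180, 240, 95])
print(capacity, streams_supported(capacity, 3))
```

The two high samples in this made-up example are the kind of burst readings described above. The low one is the kind of local bottleneck that a single favorable test would hide.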

Inadequate Access Point Capacity
Wireless networking bridges to wired networking through access points positioned around the building. Home networks combine the access point with the router. But commercial networks usually have one or two routers for the whole building with access points positioned strategically throughout.

Any access point has a limit to the number of computers it can serve. Consumer grade devices have a lower capacity but even commercial units can be overwhelmed if you get too many devices in the same room. Most people know that the access point's bandwidth (typically 54Mbps) is shared among all connected devices. But you can't just test the bandwidth in a room with a single computer and then divide by the expected number of computers to get available bandwidth. There's bandwidth overhead to each connection and there's a ceiling on the total number of computers that can be supported by a single access point. The max device count varies from model to model but it's always there.
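
Here is a rough Python sketch of why simple division overestimates what each student gets. Every number in it is an assumption for illustration: the fraction of the nominal rate that is usable, the per-connection overhead and the device ceiling all vary by model, and the linear overhead model is deliberately crude. Look up real figures in your access point's specifications.

```python
def per_device_mbps(link_mbps, devices, max_devices,
                    usable_fraction=0.5, per_device_overhead=0.01):
    """Rough per-device throughput on one shared access point.

    usable_fraction and per_device_overhead are illustrative assumptions,
    not vendor specifications. Returns None when the device count exceeds
    the access point's ceiling.
    """
    if devices < 1:
        raise ValueError("devices must be at least 1")
    if devices > max_devices:
        return None  # over the ceiling: some devices cannot connect at all
    # Usable throughput is well below the nominal link rate.
    usable = link_mbps * usable_fraction
    # Each additional connection costs a slice of what remains.
    usable *= max(0.0, 1.0 - per_device_overhead * (devices - 1))
    return usable / devices


# Hypothetical 54 Mbps access point with a 25-device ceiling.
print(54 / 20)                      # naive division
print(per_device_mbps(54, 20, 25))  # estimate with overhead
print(per_device_mbps(54, 30, 25))  # None: over the ceiling
```

Under these assumptions, 20 devices each get roughly 1.1Mbps rather than the 2.7Mbps that simple division suggests, and a class of 30 doesn't fit on the access point at all.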

Interference Between Access Points
One way to address access point capacity limits is to use more of them. But if you pack them too closely together they will interfere with each other -- thereby impairing your capacity rather than building it.

Interference From Other Devices
The 2.4GHz band used by most Wi-Fi devices is also used by Bluetooth, some cordless phones, and many other devices. The 5GHz band is also available for Wi-Fi but it isn't supported by as many devices and it has a shorter range. Microwave ovens also happen to operate in the 2.4GHz band. Poorly shielded units can jam all network traffic in their vicinity.

Bluetooth Keyboards
This particular case of device interference deserves special attention. Smarter Balanced supports iPads and other tablet devices as acceptable testing devices. However, we require a physical keyboard when taking assessments. Typically, people use wireless Bluetooth keyboards with iPads. As noted above, Bluetooth operates on the same frequency band as most Wi-Fi networks. It's not noticeable when three or four keyboards are in a room but when 30 get going, there can be significant interference with the Wi-Fi network. The network won't go down, but its bandwidth will be impaired.

Keyboards and other input devices can also interfere with each other. For example, Logitech's recommended density for their wireless keyboards and mice is far lower than that of a typical computer lab. Another danger is that students might mix up the keyboards among the devices. Finally, wireless devices have batteries to maintain.

Conveniently, Logitech and other manufacturers now offer wired keyboards for iPads.

Inadequate Router Capacity
In addition to bandwidth limits, routers have a number of other capacity limitations. Every open internet connection requires dedicated router capacity, even if it's idle. A lower-end router may not be able to handle more than 50 or 100 devices at a time.

Too Few Network Addresses
Routers also typically manage network address assignment. In a network using the DHCP protocol, routers "lease" out addresses from their pool. Home routers typically have a pool of 100 or fewer addresses. Even commercial routers in their default configuration may not have a pool bigger than 250. Lease time is also important. A typical router configuration might have a pool of 200 addresses and give out week-long leases. In such a situation, if more than 200 devices come through the doors of the school in a week's period, all of the addresses could be used up even if fewer than that number are present at any particular time.
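
The lease arithmetic is easy to see in a small simulation. This Python sketch is a simplification under stated assumptions: each device requests an address once on the day it first appears, never renews and never releases the lease early. The function name and the daily traffic figures are hypothetical.

```python
def addresses_in_use(arrivals_by_day, lease_days):
    """Count leased addresses at the end of each day.

    arrivals_by_day: list of sets of device IDs requesting a lease that day.
    A lease is held for lease_days from the request, whether or not the
    device is still in the building (simplified: no early release).
    """
    lease_expiry = {}
    in_use = []
    for day, devices in enumerate(arrivals_by_day):
        for device in devices:
            lease_expiry[device] = day + lease_days
        # An address stays taken until its lease expires.
        active = sum(1 for expiry in lease_expiry.values() if expiry > day)
        in_use.append(active)
    return in_use


# Hypothetical week: 50 different devices show up each school day.
week = [set(range(day * 50, (day + 1) * 50)) for day in range(5)]

print(addresses_in_use(week, lease_days=7))  # week-long leases
print(addresses_in_use(week, lease_days=1))  # one-day leases
```

With week-long leases the count climbs by 50 a day and passes a 200-address pool on the fifth day, even though only 50 devices are ever present at once. With one-day leases it never goes above 50. Shortening the lease time or enlarging the pool both address the problem.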

Insufficient Power or Cooling in the Room
If you're setting up a temporary computer lab (e.g. for year-end testing) you may find that the room hasn't been wired with enough power for the number of systems you set up. Also, a desktop computer with monitor puts out about as much heat as a person. So adding 30 computers to a room meant for 30 people can double the cooling requirement. Laptops and tablets consume less power and generate less heat but the demand is still notable.
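
A back-of-the-envelope version of that estimate in Python follows. The figures are assumptions, not measurements: roughly 100 watts of heat per seated person and per desktop with monitor, and a North American 120 volt, 15 amp circuit loaded to no more than 80% for continuous use. Check the nameplate ratings on your own equipment and ask an electrician about your building's wiring.

```python
import math


def room_heat_watts(people, computers,
                    watts_per_person=100.0, watts_per_computer=100.0):
    """Approximate heat load in watts; per-unit figures are assumptions."""
    return people * watts_per_person + computers * watts_per_computer


def circuits_needed(computers, watts_per_computer=100.0,
                    volts=120.0, amps=15.0, continuous_fraction=0.8):
    """Circuits required to power the computers (assumed circuit ratings)."""
    usable_watts_per_circuit = volts * amps * continuous_fraction
    return math.ceil(computers * watts_per_computer / usable_watts_per_circuit)


print(room_heat_watts(30, 0))   # the room as designed: 30 people
print(room_heat_watts(30, 30))  # the same room with 30 desktops added
print(circuits_needed(30))      # circuits to power those desktops
```

Under these assumptions the heat load doubles from 3,000 to 6,000 watts, and the 30 desktops need three circuits rather than the single one a classroom may have been wired with.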

Averting Problems
Here are some ideas on how to prevent problems like the above before they happen:
  • Plan Ahead: Make sure the expected infrastructure is in place well in advance of key days (first day of school, first testing day) so that you have time to check everything out.
  • Wired is Better: Wired networking has much greater capacity and reliability than wireless. Wired input devices are naturally tethered to the corresponding device and don't require batteries.
  • Read the Specs: Don't just test your bandwidth. Find out from your ISP how much is guaranteed. And compare your purchased capacity against your tested bandwidth. Likewise, don't just test the wireless network, look up the capacity specifications of your access points and routers and make sure they meet your needs.
  • Hire a Tech: Get a trained network technician on staff, or at least under contract, and have them do a site survey.
  • Do a Scale Test: Load a room with the expected number of devices and get that many people to exercise them all at once.
  • Identify Interference Points: Who shares your internet connection? What other rooms share an access point? What facilities share a router? What is between the access point and its intended devices? (walls, furniture, etc.) Do any of the barriers move?
  • Build In Redundancy: Install redundant devices at key places (e.g. routers and access points). If redundancy isn't affordable, have spare equipment available on-site.
  • Map Your Network and Document Your Configurations: List everything you would want to know when troubleshooting or replacing a defective item.
Despite the best planning, unexpected problems are still going to occur. In the first years of online learning and assessment they may be painfully frequent. So a final recommendation is to Handle Crises with Grace. Develop a contingency plan. Be ready with an alternative activity when the systems go down. For testing, build excess days into the testing window in case you have to cancel for a day. If we plan well, technology will be a blessing and not a burden for education.

Updated 24 June 2013 to add information about SchoolSpeedTest.org.