09 Jan 2006 - 15:57

blookis for Bayes


I majored in experimental psychology, and was taught that the 'frequentist' model was the only model.

Large sample size, random assignment, double-blind controls, tests for significance: these were the only conceivable means to discover the truth or something close to it.

Nobody said boo about Bayes.

At some point along the line, probably within the last 10 years, I realized something was missing.

First of all, peer-reviewed, random-assignment, frequentist studies are often wrong.

How often?

At least 15% of the time, even for large, well-designed studies, and far more often for the rest (subscription required):


THEODORE STURGEON, an American science-fiction writer, once observed that “95% of everything is crap”. John Ioannidis, a Greek epidemiologist, would not go that far. His benchmark is 50%. But that figure, he thinks, is a fair estimate of the proportion of scientific papers that eventually turn out to be wrong.

Dr Ioannidis, who works at the University of Ioannina, in northern Greece, makes his claim in PLoS Medicine, an online journal published by the Public Library of Science. His thesis that many scientific papers come to false conclusions is not new. Science is a Darwinian process that proceeds as much by refutation as by publication. But until recently no one has tried to quantify the matter.

Dr Ioannidis began by looking at specific studies, in a paper published in the Journal of the American Medical Association in July. He examined 49 research articles printed in widely read medical journals between 1990 and 2003. Each of these articles had been cited by other scientists in their own papers 1,000 times or more. However, 14 of them—almost a third—were later refuted by other work. Some of the refuted studies looked into whether hormone-replacement therapy was safe for women (it was, then it wasn't), whether vitamin E increased coronary health (it did, then it didn't), and whether stents are more effective than balloon angioplasty for coronary-artery disease (they are, but not nearly as much as was thought).

[snip]

...he concluded that even a large, well-designed study with little researcher bias has only an 85% chance of being right. An underpowered, poorly performed drug trial with researcher bias has but a 17% chance of producing true conclusions. Overall, more than half of all published research is probably wrong.




Jakob Nielsen says to use bullets, so I'm using bullets

What are the odds of any given study being right?

  • large, well-designed study with little researcher bias: 85% chance of being right

  • underpowered, poorly performed drug trial with researcher bias: 17% chance of being right

  • all published research, taken as a whole: less than a 50% chance of being right
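The first two figures in the bullets above can be reproduced with a little arithmetic. The sketch below assumes the positive-predictive-value formula from Ioannidis' PLoS Medicine paper, which the excerpt does not spell out. The function name and the inputs (power, pre-study odds, bias, and a 0.05 significance threshold) are assumptions chosen because they appear to match the two scenarios quoted.

```python
def chance_study_is_right(power, prestudy_odds, bias, alpha=0.05):
    """Probability that a statistically significant finding is actually true.

    power         -- chance the study detects a real effect (1 - beta)
    prestudy_odds -- odds, before the study, that the effect is real (R)
    bias          -- share of would-be negative results reported as positive (u)
    alpha         -- significance threshold (assumed to be 0.05)
    """
    beta = 1.0 - power
    r, u = prestudy_odds, bias
    # Proportional to the significant findings that reflect a real effect.
    true_positives = power * r + u * beta * r
    # Proportional to all significant findings, real or not.
    all_positives = r + alpha - beta * r + u - u * alpha + u * beta * r
    return true_positives / all_positives


# Assumed inputs: 80% power, 1:1 pre-study odds, 10% bias.
good_trial = chance_study_is_right(power=0.80, prestudy_odds=1.0, bias=0.10)
# Assumed inputs: 20% power, 1:5 pre-study odds, 80% bias.
poor_trial = chance_study_is_right(power=0.20, prestudy_odds=0.2, bias=0.80)

print(f"large, well-designed study: {good_trial:.0%}")  # prints 85%
print(f"underpowered, biased trial: {poor_trial:.0%}")  # prints 17%
```

The pre-study odds play the role of a prior. Lower them and the same study design becomes much less likely to be right.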




med school

Apparently, Dr. Ioannidis' exercise has been a tradition in med schools for some time.

Two physicians, who attended different medical schools, have told me that when they started med school their professors said that half of the articles published in JAMA that year would prove to be wrong by the time they graduated.

These professors had never conducted a study.

So how did they come up with a figure of 50-50?

I'd say they used Bayesian reasoning.

This is an example of the human mind using Bayesian analysis to arrive at a correct conclusion -- the same conclusion a frequentist study like Ioannidis' will reach (assuming his study is correct, of course).



when you don't need a large sample

Carolyn linked to an ECONOMIST article on research showing the human mind probably uses Bayesian reasoning.

...the Bayesian capacity to draw strong inferences from sparse data could be crucial to the way the mind perceives the world, plans actions, comprehends and learns language, reasons from correlation to causation, and even understands the goals and beliefs of other minds. [snip]

The key to successful Bayesian reasoning is not in having an extensive, unbiased sample, which is the eternal worry of frequentists, but rather in having an appropriate “prior”, as it is known to the cognoscenti. This prior is an assumption about the way the world works—in essence, a hypothesis about reality—that can be expressed as a mathematical probability distribution of the frequency with which events of a particular magnitude happen.

The best known of these probability distributions is the “normal”, or Gaussian distribution. This has a curve similar to the cross-section of a bell, with events of middling magnitude being common, and those of small and large magnitude rare, so it is sometimes known by a third name, the bell-curve distribution. But there are also the Poisson distribution, the Erlang distribution, the power-law distribution and many even weirder ones that are not the consequence of simple mathematical equations (or, at least, of equations that mathematicians regard as simple).

With the correct prior, even a single piece of data can be used to make meaningful Bayesian predictions. By contrast frequentists, though they deal with the same probability distributions as Bayesians, make fewer prior assumptions about the distribution that applies in any particular situation. Frequentism is thus a more robust approach, but one that is not well suited to making decisions on the basis of limited information—which is something that people have to do all the time.





more bullets!

  • Bayesian reasoning draws strong — and accurate — inferences from 'sparse data'

  • all you need for Bayesian reasoning to work is an 'appropriate prior' — an accurate hypothesis about the way the world works

  • if you have a good hypothesis about the way the world works, you don't need a huge sample

  • real people in real life have to make decisions based on limited data all the time; hence we probably developed Bayesian analytic abilities
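Here is a rough sketch of what "one piece of data plus a good prior" can look like in practice. Everything in it is a hypothetical illustration, not something taken from the article. The task is guessing a total lifespan from a single observed age; the prior is bell-shaped, centered on 75 with a spread of 15; and the observed age is assumed equally likely to fall anywhere within the total span.

```python
import math


def posterior_median_total(observed, prior_weight, max_total=200):
    """Median of p(total | observed), computed on an integer grid of totals."""
    totals = range(1, max_total + 1)
    # Likelihood: the observation could fall anywhere in 0..total with equal
    # chance, so p(observed | total) = 1/total when total >= observed, else 0.
    weights = [prior_weight(t) / t if t >= observed else 0.0 for t in totals]
    norm = sum(weights)
    cumulative = 0.0
    for t, w in zip(totals, weights):
        cumulative += w / norm
        if cumulative >= 0.5:
            return t
    return max_total


def bell_curve_prior(mean, sd):
    """Unnormalized Gaussian prior over the total (hypothetical values)."""
    return lambda t: math.exp(-0.5 * ((t - mean) / sd) ** 2)


def flat_prior(t):
    """No real hypothesis about the world: every total is equally plausible."""
    return 1.0


lifespan_prior = bell_curve_prior(mean=75, sd=15)

# One data point each: a person observed at age 18, and one observed at age 90.
guess_for_18 = posterior_median_total(18, lifespan_prior)
guess_for_90 = posterior_median_total(90, lifespan_prior)
guess_without_prior = posterior_median_total(18, flat_prior)

print(guess_for_18, guess_for_90, guess_without_prior)
```

With the bell-curve prior, a single observation yields a sensible guess in both cases. With the flat prior, the answer depends mostly on the arbitrary cap of 200 built into the grid, which is one way of seeing why the prior matters so much.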




the cognitive unconscious knows what it's talking about

I believe it.

As I was saying, at some point I realized that:

a) published, peer-reviewed research is frequently wrong

and

b) personal opinions, gut feelings, and intuition are frequently right


At least, my own personal opinions & gut feelings have proved correct often enough that I never dismiss personal opinion & gut feeling — my own or other people's — out of hand.

But until I read this article, I didn't know why, when, or how.

I would have a 'feeling' about something, or an idea, and I would have no clue whether this was or was not likely to be right.

Then, after a while, I accumulated so much experience in certain realms that I began to trust my judgment.

For example, after a few years working with medication for Jimmy, I began to have a sense of what we ought to try with him. Often, I was right.

I had meant to write a post about this back when we were talking about 'partnering' with teachers. I've had numerous partnerships with Jimmy's doctors. I would read a piece of research that made sense, bring it in to our doctor, and our doctor would either instantly agree that it made sense, or would pursue it further.

Often he or she decided to try the medication I thought should be tried.

There are no medications approved for autism; all prescribing is done off-label. When we began working with meds, the standing belief was that medication 'did not treat autism.' The most you could hope for was to ameliorate a couple of symptoms, like hyperactivity and insomnia, and these symptoms were considered not to be 'core.' I rejected that line of reasoning years before the profession did, and I was right.

Now Ed has developed tremendous 'Bayesian' expertise with meds. He's been supervising medication for the past 10 years, since the twins were born, and he knows what he's doing. We're working with one of the best psychiatrists in the world (IMO) and Ed can frequently predict what Dr. Hollander will do next.

That's the cognitive unconscious at work. Research on the cognitive unconscious, which Arthur Reber surveys in his book Implicit Learning and Tacit Knowledge: An Essay on the Cognitive Unconscious, shows that it is startlingly accurate.

Since reading Reber, I know that the cognitive unconscious — my own or others' — knows what it's talking about at least some of the time.

My problem has been figuring out when.

There's probably a simple answer to that.

According to Robin Hogarth, who wrote Educating Intuition, intuition — the cognitive unconscious — is likely to be right in realms that offer feedback.

A weatherman gets feedback. On Monday he predicts rain. On Tuesday, either it rains or it doesn't.

That's feedback. An experienced weatherman is going to develop good intuition.

A constructivist teacher who's not using formative assessment is getting very little feedback. In September he predicts that kids in the TRAILBLAZERS curriculum will learn their math facts without drill. In May he assumes they have.

That's not feedback.

This is why I don't listen to the casual observations and assertions of constructivists.

They haven't had enough feedback to develop good intuition.

In my experience, at least, a constructivist talking education is often talking belief, not experience.



rule of thumb

That last sentence gave me a new rule of thumb:

I tend to trust people who sound as if they're speaking from direct experience.

I don't trust people who sound as if they are restating educational philosophy.

This is the glaring difference between the writings of an Engelmann or a John Taylor Gatto and a generic constructivist.

Engelmann's work is filled with experience. I don't have to perform a post-hoc analysis of the statistical techniques used by Project Follow-Through to conclude that Engelmann knows what he's talking about.

He's got a Bayesian brain, I've got a Bayesian brain, and 95% of the time he's talking about his experience, not his philosophy.



blookis for Bayes

Which brings me to Kitchen Table Math.

A blooki is the perfect venue for Bayesian analysis.

I remember back in the first couple of months, someone left a personal narrative & then apologized for having done so, saying that his experience was just one example, nothing more.

I answered that a major reason I started writing Kitchen Table Math in the first place was that I wanted to learn about other people's personal experiences.

I'm not looking for the 'large sample' of a frequentist study.

I'm looking for the personal experience & observations of people with good priors.

That's what I've been getting ever since we started!






Bayes statistics & false positives
does human mind use Bayesian reasoning?
Bayesian reasoning, intuition, & the cognitive unconscious
most bell curves have thick tails
ECONOMIST explanation Bayesian statistics
Bayesian certainty scale

Bayesianprobability






Comments



Catherine--you are amazing!!!

-- KtmGuest - 09 Jan 2006


wow!

thank you!

-- CatherineJohnson - 09 Jan 2006


actually, I feel great

this is something I've been trying to figure out for a long time — and I think I finally have a handle on it —

-- CatherineJohnson - 09 Jan 2006


This leads back to "expert systems" research. Experts are much better at guessing than the general population; codifying why that is and how to replicate it in software is an ongoing struggle.

Sturgeon's Law: "Sure, 90% of science fiction is crud. That's because 90% of everything is crud."

"John Ioannidis, a Greek epidemiologist, would not go that far. His benchmark is 50%. But that figure, he thinks, is a fair estimate of the proportion of scientific papers that eventually turn out to be wrong."

That only 50% of scientific papers are demonstrably wrong doesn't imply that only 50% of scientific papers are crud. "Crud" (Sturgeon's original term) includes far more than just the "incorrect". There's also the "useless", the "misleading", and the "unconvincing".

"Dr Ioannidis, who works at the University of Ioannina..."

He's probably married to Ioanna and has a son named Ioan and a daughter named Ioanniffer.

-- DougSundseth - 09 Jan 2006


"I remember back in the first couple of months, someone left a personal narrative & then apologized for having done so, saying that his experience was just one example, nothing more."

Anecdotes are data points. We judge each of them based on our knowledge and experience (and our biases) to refine our understanding of a subject. They are not worthless.

Some, unfortunately, argue that no knowledge or understanding is worthwhile unless it is the result of a double-blind, etc. formal study. Of course, the math fuzzies who are in control want these studies now to justify why they should do something different. As everyone knows, educational studies have way too many variables. I am not arguing that these studies are not proper, just that we can't wait for them and we can't expect agreement over what the results mean. This is true especially when viewed through fundamental biases over what constitutes a proper education.

I mentioned once that I told a couple of people on our school committee that the schools should hand out Hirsch's "What Your First (Second, Third, etc.) Grader Needs to Know" and tell the parents that this is NOT the education your child will receive. In other words, what good are formal studies when the school declares that they won't do "pull-out" and only offer a pre-algebra level course in eighth grade? Can a formal study decide whether it is proper or improper to offer a rigorous algebra course in eighth grade? Common sense would say that by seventh or eighth grade, schools should begin to offer different levels (not quality) of math classes.

All of this is not complicated. K-8 math is not complicated. There is no need for formal studies for most of the things we talk about at KTM. A full course in algebra by eighth grade and a curriculum to get the student there.

-- SteveH - 09 Jan 2006


I'm not so sure about the forecasters.

I worked as an economic forecaster for a couple of years and one of the general results in that profession was that an experienced forecaster doesn't actually give you much more accuracy in predictions than an inexperienced forecaster. The main difference is that the experienced forecaster knows a lot more about what is currently going on - and explaining what is currently going on is how a lot of forecasters get paid.

Another result is that generally if forecasters adjust the results of their statistical models they reduce accuracy.

Oh, and if you get a group of forecasters together to discuss their forecasts, they generally reach consensus quite fast, but reduce accuracy.

See Armstrong's work on forecasting

-- TracyW - 09 Jan 2006


I mentioned once that I told a couple of people on our school committee that the schools should hand out Hirsch's "What Your First (Second, Third, etc.) Grader Needs to Know" and tell the parents that this is NOT the education your child will receive.

I'd forgotten that!

I love it!

-- CatherineJohnson - 09 Jan 2006
