Clinical Trials: Asking the Right Questions -- Interview with Lewis Sheiner, M. D.
Lewis Sheiner, M. D. is Professor of Laboratory Medicine, Medicine, and Pharmacy at the University of California, San
Francisco. He is active in the AIDS Clinical Trials Group (ACTG)
of the U. S. National Institute of Allergy and Infectious
Diseases (NIAID), and a member of the Antiviral Advisory
Committee of the U. S. Food and Drug Administration (FDA).
Dr. Sheiner has a particular interest in how to obtain the
best possible information from clinical trials when real-world
circumstances do not allow ideal study design. We asked him for
his recommendations on how existing trials might be made more
useful and more efficient. In the following edited text of our
interview, some technical supporting information was omitted; we
added clarifying comments in [brackets].
Dr. Sheiner: We spend about a hundred million dollars on
clinical testing of a major new drug, and I doubt that we get
even one million dollars worth of information. We are immensely
inefficient. I think the problem can be traced to the fact that
we are asking the wrong question. The whole process has to be
revamped.
The problem, as I see it, is like that of an academic
scientist whose main motivation becomes earning money -- instead
of understanding the universe better, which is what a research
scientist should be doing. Such an academic might engage in
research which was most remunerative, or might try to get
promoted by getting a lot of papers published without a lot of
thought about what was in the papers. To some extent the system
would regulate this behavior; for example, it would not be easy
to publish papers that had no content. But there's still a
spectrum -- subjects that lead to papers more quickly, papers
that record just new information as opposed to new understanding,
and which can be published in other than the best journals. So
the system might keep a person producing a product that looked
something like what an academic should produce. It wouldn't be
high quality and would therefore be wasteful.
To be productive, scientists need to keep their eye on the
ball, on the problem, which is understanding the subject matter
better or teaching students better. Then everything else falls
out; they become successful as researchers, or successful as
teachers, and get the rewards. But they should not keep the
rewards in mind as the reason for it.
The problem is that the drug industry is not keeping its eye
on the ball. The drug industry is concerned, properly (because
we've assigned coming up with drugs to the private sector) with,
"How do I make money?" And that translates into, "How do I get
my drug approved?" So when they're doing clinical testing of
drugs, the question is, "What do we need to do to get the FDA to
approve the drugs?" They're not asking, "What does a doctor (and
the patient) need to know, in order to use this drug sensibly?"
If everybody kept their eye on that ball, everything else would
fall out fine. But the drug companies have their eyes on the
ball of "What do I have to do to satisfy the FDA?"
Therefore the FDA has to come up with rules that will cause
people who are not actually addressing the problem, but rather
are trying to get the drug approved, to behave in ways that sort
of address the actual problem. It's a tough business to make a
set of rules that will cause the behavior to come out one way
when the motivation is another. We see this all over society.
What you want to do is get people engaged in the real question,
and then they use their ingenuity to answer it.
This is an underlying structural problem, since we've
assigned this major contribution to the public health [new drug
development] to private industry. I'm not sure we could do any
better. I do know that the set of rules the FDA has come up with
has been determined largely by the legislation they have been
given, which is that you have to have safety and efficacy, and by
conservatism having to do with the process by which drugs are
approved -- people's impression is that essentially you get one
shot. You get to hold the manufacturer ransom from earning any
money on their product for a while, while they supply information
on the drug. Once you decide that the company has supplied
enough information to warrant marketing, then you don't have much
opportunity to learn more from that source. Other sources start
to kick in [after the drug is approved]: physicians and academic
people will pursue research on the use of various drugs for their
own reasons, and we may get the answers eventually. But there's no
organized search for information, except in the few instances
where the FDA requires additional studies after approval. Or
where the public, through the NIH or others, decides we need a
large trial for certain conditions because we don't know the
answer; we've seen that in coronary disease and certain other
areas.
What are the questions that are being asked when clinical
trials are designed? And what kinds of answers can we expect?
The question that's being asked, even in those large trials done
under government auspices, is primarily, "Does a drug work?" You
control things, give a group of patients a new therapy, another
group an old therapy or a placebo, look at the outcome, and ask
if there is a difference. But that is an empirical
[observational] result; it doesn't tell you anything scientific.
Science is understanding the world.
We focus on the question, does it work or not? That's a
mistake. Because the important question is, "What do we need to
know in order to use this treatment sensibly?" The question is
not, "Does it work?" but "How does it work in different patient
groups, in combinations with other drugs, and so on?"
The second thing we do wrong, after deciding to answer what
I think is a limited question, is that we then optimize the way
in which we do an experiment to answer that question. That's a
serious mistake. When you optimize a system to do one thing,
it's almost certain that you make it inefficient to do anything
else. Whereas if you are a little robust, and say, "I'll do the
best I can with respect to one question, but I'll also spread my
energies around and do OK with other questions," it turns out
that you can do lots of things with the data. It's like biology;
when you get highly specialized evolution of the animal it's
great, but only if the environment doesn't change at all.
The epistemologic paradigm we are using is that we're going
to find a counterexample to the claim that the drug doesn't work.
So we set up a "null hypothesis" which is that the drug does not
do anything, and then we look for a circumstance in which we can
gather evidence which refutes that.
The randomized controlled trial essentially functions as a
counterexample that disproves the null hypothesis. And then we
try to make that example as strong as possible by magnifying the
treatment difference -- by giving one group zero dose, and the
other group as large a dose as safely possible.
We try to maximize the "signal" and minimize the "noise."
So we get a homogeneous group of patients, and treat them, in
respect to everything except the drug of interest, very
homogeneously. We treat them by protocol, which in general
doesn't look like real medical therapy.
It's standard practice in clinical trials to reject a
minimum of 80 percent of people who are potentially qualified.
It's a rare trial that uses 20 percent. For example, when
testing a drug for high blood pressure, at least 80 percent of
those who come in with high blood pressure are usually rejected
because they are the wrong age, or the wrong size, or they are
taking some other drug, or they had some unusual test result,
etc.
The first problem arises from the optimization: how can
we extrapolate from this highly select sample to the whole
patient population? [We can't] -- but we do it [wrongly] all the
time. By trying to minimize noise [the variation among
patients], we put ourselves in a situation where we don't have a
representative patient sample.
If you really had your eye on the ball of what do we need to
know to use this drug, you would want to study lots of different
kinds of patients, with lots of different regimens, and look at
various outcomes. But the randomized controlled trial usually
looks at a very homogeneous group of patients, with just two
doses, zero and some large dose, and usually looks at only a
restricted set of outcomes. [Therefore it doesn't give you the
information you need to use the drug in real medical practice.]
So when the trial is done, we have proved that there is some
group in which the drug does something -- so we can deny the
"null hypothesis" that it does nothing. But we are not very far
ahead of the game. And it costs us millions, doing these
extremely rigid studies.
[What makes the problem even worse is] that we want the
sense of assurance, that we're absolutely sure the answer is
right, that we're not making a mistake. The kind of mistake we
seem to fear the most is holding that a drug is active when it's
not.
So we design our studies in a very rigid way, and analyze
them according to very low-assumption models -- one group vs. one
other group, and simple statistics essentially based on the fact
that we randomized. That way of testing is very demanding in
stringency of design; you have to make sure that nothing else is
going on. You can't deal with the fact that patients are
different.
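[Illustration, not from the interview: the low-assumption analysis
described above can be sketched as a randomization (permutation)
test, which relies only on the fact that treatment labels were
assigned at random. The function name, the use of a difference in
means as the test statistic, and the default settings are
assumptions made for this sketch.]

```python
import random
from statistics import mean


def randomization_test(treated, control, n_permutations=10000, seed=0):
    """Two-sided randomization test for a difference in group means.

    Returns (observed_difference, approximate_p_value).
    """
    rng = random.Random(seed)
    observed = mean(treated) - mean(control)
    pooled = list(treated) + list(control)
    n_treated = len(treated)

    extreme = 0
    for _ in range(n_permutations):
        # Re-assign the treatment labels at random, as the null
        # hypothesis ("the drug does nothing") says we may.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_treated]) - mean(pooled[n_treated:])
        if abs(diff) >= abs(observed):
            extreme += 1

    # The add-one correction keeps the estimated p-value above zero.
    return observed, (extreme + 1) / (n_permutations + 1)
```

[Note that nothing in this calculation uses patient characteristics
or dose; it answers only whether the two groups differ.]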
Consequently, we have to generate an experiment that does
not resemble real life, and is almost useless for answering any
other question [than whether the drug works in some situation,
however artificial]. Even if we get a good estimate of what is
happening to one kind of patient, that does not help us much with
other patients [who are different from those in the study].
But if you designed the study to spread the doses out, to
spread the kinds of patients out, you would be forced to analyze
it with a model which took into account more variability -- which
means a model with many more assumptions.
In this kind of model [which Dr. Sheiner proposes] there
would be randomization of doses. The "noise" level is allowed to
be higher, since variation among patients is allowed. You would
have to average this out by having larger patient groups, but
larger groups would be easier to get, since you would be using
designs in which you treat people much more like you treat them
in real life.
Why not randomize to ranges of doses, but allow the
physician to adjust the dose in that range, depending upon what
they observe in their patients, which is what physicians do?
Then you make it easier to enroll more people, as the trial
deviates less from what we would call good treatment.
If you look at the expanded access for ddI, for example,
there was no problem getting ten thousand people for this program,
because entry criteria were less stringent than for clinical
trials. From those ten thousand people we learned very little,
unfortunately, because we did not have an appropriate design.
But if you put a bit of design in that kind of study -- if you
randomize the doses into ranges, and then gather data on what
actually occurs -- you can get good information. There are
models to deal with this sort of variation [among patients],
certainly better than ignoring it, which is what we do now.
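[Illustration, not from the interview: a deliberately simple
stand-in for the kind of model mentioned above is an ordinary
least-squares line relating each patient's actual dose to the
observed response. Real analyses would use richer models that
account for differences among patients; the function name and the
straight-line assumption are ours.]

```python
def fit_dose_response(doses, responses):
    """Least-squares fit of response = intercept + slope * dose.

    Returns (intercept, slope).
    """
    n = len(doses)
    if n != len(responses) or n < 2:
        raise ValueError("need at least two (dose, response) pairs")

    mean_dose = sum(doses) / n
    mean_resp = sum(responses) / n

    # Spread of the doses actually given; zero spread means the data
    # carry no information about how response changes with dose.
    sxx = sum((d - mean_dose) ** 2 for d in doses)
    if sxx == 0:
        raise ValueError("doses must vary to estimate a dose-response slope")

    sxy = sum((d - mean_dose) * (r - mean_resp)
              for d, r in zip(doses, responses))
    slope = sxy / sxx
    intercept = mean_resp - slope * mean_dose
    return intercept, slope
```

[The slope can only be estimated when the recorded doses differ
among patients, which is why spreading the doses out matters.]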
Because we ask the wrong question, and optimize our study
design to answer it, we learn little about the questions we do
need answered. You need to understand this in terms of the
process of drug approval and use. People today are designing
drugs that we know have pharmacologic action; that is not the
important question. The real issues are two: toxicity, which
can always be unpredictable, and ultimate efficacy (clinical
benefit). Those are the questions that need to be answered.
The societal question is, for life-threatening diseases, do
we have to answer that second question [ultimate clinical
benefit] before we allow a drug to be used? The answer seems to
be clearly no -- we've decided that we will use a drug when a
prudent person believes that the probable benefit is greater than
the probable harm. We can base that on the known pharmacology,
on the biology, that the drug ought to be helpful. We may make
mistakes -- but we do act before we are sure. What we need,
then, is a mechanism for checking whether we made the right
decisions. But it doesn't have to be pre-marketing [before the
drug is approved].
So the view I'm proposing is, "Present us with enough
evidence to make us believe that using this drug is a sensible
thing to do at the present time. But understand that we're going
to have to gather more evidence, and if the drug really is not
efficacious, we have to be able to take it off the market."
That's the fundamental change that I believe we need to make.
Then the approval risk -- the risk the FDA takes when it
approves a drug, the risk that it may turn out to be harmful --
is lower [than now, when the drug usually will not be tested
further by the manufacturer, and usually cannot be recalled].
Let's lower the approval risk, so we don't have a biased
approval process. If the FDA has a chance to correct a
decision if it turns out to be wrong, it can approve drugs much
earlier, because it will not have to worry about a mistake
forever. That's the overall view.
[We asked about the "large simple trial" design, which is
now generating interest for community-based AIDS research.]
Dr. Sheiner: The large simple trials come in at the stage
of checking to see if the drug works. You take all comers, and
perhaps randomize them to dose vs. no dose; dose is anything the
doctor wants to give, and may be adjusted. Or look at dose-range
groups for analysis.
Within the large simple trial it is very easy to pull out
more information about the response, by knowing exactly what dose
everybody gets, by using pharmacokinetics [e.g. blood levels of
the drug] to see what the actual exposure was, by measuring area
under the curve, or by using explicit protocols for dosage
adjustment, so that you know why doses were changed.
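[Illustration, not from the interview: "area under the curve" is
commonly approximated from a patient's blood-level samples with the
linear trapezoidal rule. The sketch below assumes sampling times in
hours and concentrations in any consistent unit; the function name
is ours.]

```python
def auc_trapezoid(times_h, concentrations):
    """Area under a concentration-time curve by the linear trapezoidal rule.

    times_h must be strictly increasing and should be the times the
    samples were actually drawn, not the times the protocol called for.
    """
    if len(times_h) != len(concentrations) or len(times_h) < 2:
        raise ValueError("need at least two matching time/concentration samples")

    area = 0.0
    for i in range(1, len(times_h)):
        dt = times_h[i] - times_h[i - 1]
        if dt <= 0:
            raise ValueError("sampling times must be strictly increasing")
        # Each interval contributes a trapezoid: width times mean height.
        area += dt * (concentrations[i] + concentrations[i - 1]) / 2.0
    return area
```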
Allowing a mechanism whereby drugs that do not prove to have
efficacy can be removed from the market should be attractive to
pharmaceutical companies. They can start making money while they
are still testing -- so the price of those studies is being paid
while they are making money, not out of pocket. But this isn't a
free ride; companies cannot market their drugs and then leave it
to somebody else to figure out whether they work. Drug companies
will hope that they can do this, can do less work to make money.
We have to tell them that's wrong.
But studies could be done for less money, if we were clear
about how well we needed to know what. A lot of the money that
is being spent on drug trials is being spent to do studies
extremely stringently to get very objective evidence, where a few
assumptions would allow us to get perfectly adequate information
to allow us to proceed for a lot cheaper.
For example, in hypertension [although not in AIDS], what
happens to moderate hypertensives treated for three months with
placebo has not changed in the last decade and a half or two. So
the fact that half of the patients you have to pay for in every
study receive placebo is ridiculous. The only rational
justification for assigning exactly half the patients to a
standard therapy or placebo would be if you are as ignorant of
what happens with standard therapy as with the new drug, which is
almost never true. Some smaller number ought to be assigned to
the standard therapy, because we have prior information about it.
Procedures have taken over, and we do these things without
thinking about why.
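[Illustration, not from the interview: one way to see the argument
above is to treat what is already known about placebo response as
if it were worth some number of previously studied patients, and
then ask how many new patients the control arm needs. This sketch
assumes equal outcome variability in both arms and assumes the
historical information is fully comparable to the new trial, which
is a strong assumption. All names and numbers are hypothetical.]

```python
def control_allocation(total_patients, prior_equivalent_patients, min_control=1):
    """Split new patients between control and treatment arms.

    The variance of the estimated treatment difference is proportional
    to 1/n_treated + 1/(n_control + prior_equivalent_patients), which is
    smallest when the two denominators are equal.
    Returns (n_control, n_treated).
    """
    if total_patients < 2:
        raise ValueError("need at least two patients to form two arms")
    balanced = (total_patients - prior_equivalent_patients) / 2
    n_control = max(min_control, round(balanced))
    n_control = min(n_control, total_patients - 1)
    return n_control, total_patients - n_control


def variance_of_difference(n_treated, n_control, prior_equivalent_patients, sigma=1.0):
    """Variance of (treated mean - control mean) under the assumptions above."""
    return sigma ** 2 / n_treated + sigma ** 2 / (n_control + prior_equivalent_patients)
```

[For example, with 200 new patients and prior information assumed
to be worth 100 patients, the sketch assigns 50 to the control arm
and 150 to the new drug. With no prior information it returns the
usual equal split.]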
[We brought up the case of hypericin, an experimental
antiviral just now beginning clinical trials. Apparently it was
delayed for three years because financial resources were not
available to meet the high FDA hurdles for initial human
testing.]
Dr. Sheiner: Asking a drug company capitalized at billions
to invest another million dollars in some testing that makes
someone feel a little more secure is no problem; they're going to
spend 100 or 200 million dollars anyway. But take an academic
institution, and 100 thousand dollars means something to you.
The whole drug-development system is set up for people who have
lots of money to spend. What we're doing is buying what I
believe is a false sense of security, but nonetheless we've got
many procedures in place that are quite costly. There are many
[questionable] reasons for maintaining them: (a) we did
it before and it's unfair to do otherwise; (b) somebody's
going to complain that you haven't done enough toxicity studies
if something goes wrong later; (c) you get raked over the coals
in Congress if you didn't have the same standards; and on and on.
It's mixed up with the political process, with the consumer. You
just shouldn't be in the making-drugs business unless you've got
millions of dollars in capital.
Take hypericin. This is the sort of situation where
somebody found a natural product [not created by rational drug
design] and believes that it might work. But at the moment, we
don't understand enough about that molecule and about the virus,
etc., to say whether it ought to work or not to work. It's all
based on empirical evidence. It's not like you designed this
drug to interact with viral proteins, etc. I'm not even saying
that would be the right way to go; the era of modern drug
development, of designing drugs to interact with certain
molecules, is very recent; it still doesn't account for the
majority of useful drugs.
The way that hypericin, or most drugs used today, came to
attention has been the standard way for new drugs to come along;
somebody notices that some natural compound does something, and
you do a careful set of empirical experiments that leads step by
step to the conclusion that this drug is useful. But in that
case, it is hard to say what you ought to be doing other than
that. You cannot jump over the process, know that a drug looks
like another drug, and interacts with the same molecule, so you
know that it will be active, and the only question is toxicity.
Here, you have a long trail to go on. Unfortunately, it's a
trail where the traditions have been developed in the context of
an industry that has no problem spending ten million dollars on
that phase.
We suddenly have people [community-based AIDS research
groups] wanting to enter the drug-development process in a
legitimate way, but having to meet standards that developed when
money was no object. I'm not sure that those standards make
sense. What about the thousands of people who have taken
(underground) drugs; does that experience mean nothing?
[We suggested that, if safety is under control, then why not
do a small trial first under close medical supervision --
quickly, before a drug goes into widespread "underground" use?]
Dr. Sheiner: I have outlined where we really could change
the way we do things in terms of asking drug companies to do some
testing after marketing. Now you are talking about something
that makes even more sense, which is that one size doesn't fit
all. At every point, we have a balance of many factors: our past
experience, what we know about whether a drug is dangerous or
not, what we know about animals, the fact that people may have
been eating it for ages as a natural product, the need for new
treatment because it's a life-threatening disease. Why don't we
sit down in each situation, thrash it out and think about it, and
come up with what makes sense for that particular situation?
Obviously that is the sensible way to behave. I despair that it
will be done soon.
[We asked about the usefulness of observational or
epidemiological data, compared to data from randomized controlled
trials.]
Dr. Sheiner: People miss the difference between two
situations. One is
observational or epidemiological data which has been accurately
gathered but on which we have not exercised control; such data
are valuable, although a little bit of control adds a lot. But
what is useless are data that are not accurate, because we did
not even record them properly.
The best can be the enemy of the good. Many clinical trials
are so stringent that, for example, research nurses must draw a
blood sample exactly six hours after giving the dose. But often
they can't draw it at exactly six hours; the patient is away from
the bed, etc. But they write down six hours because they're
afraid to write down the actual time. The methods of analysis
that we use in classical trials are crude, and highly sensitive
to violation of protocol, because they make no assumptions about
what is going on, because they don't build into their model time
variations, etc. So we make a very strict protocol; but then the
actual protocol, what actually happens, generally does not look
very much like the nominal protocol [which was supposed to
happen]. So we fool ourselves, when we use these supposedly
objective methods, because often what happened is not what we
intended to happen.
One reason for paying attention to more observational-type
data is that you will get to find out what did happen. The
nominal and the actual designs will match, although you need a
more complicated model. We are so much worse off not knowing
when people in studies use drugs that are not allowed in the
protocol, than if we asked them to write down what they took and
what dose. A basic rule in life is that you want to look at
reality.
These concerns are so mixed up with social and political
issues that it's hard to just talk about what makes sense
scientifically [aside from the social context].
In the public, and in Congress, we have to focus on the
process of getting the right folks thinking about the problem,
rather than on what actually happens. Or every time there's a
mistake, you will have somebody like a (Congressman) Dingell
making a fuss. The process itself has layers of assurance so
that you avoid criticism and can engender a sense of security. You
and I both know it's a false sense of security, and that we are
paying this huge overhead for what is essentially a political and
social phenomenon of everybody wanting to have their fingers in
the pie and be able to second guess, trying to be sure of
something you can't be sure about, trying to legislate away risk.
We can and do spend a fortune trying to do it, but it's an
ill-spent fortune.
source: AIDS Treatment News




