Sometimes, you come across problems that you think should be well studied but aren’t.

(please correct me if I’m wrong)

Medicine is famous for not having good cross pollination with the maths. See this famous paper which rediscovers the Riemann sum.

However, as bad as medicine is at noticing math, most disciplines are especially bad at adapting the study of algorithms, one of the most theoretical branches of Computer Science. This is true even for seasoned programmers.

There is a reason for this. Brute force works remarkably well. In fact, so well that Ken Thomspon, author of the Unix operating system, said:

“When in doubt, use *brute force*.” – *Ken Thompson*

There are some geniuses in the field, notably Don Knuth. In *Coders At Work* some of the best Computer Scientists admit to reading little to none of his series *The Art of Computer Programming*, saying the math is too difficult.

So, it’s not surprising that medical dosages are done with brute force. There is a method for picking the starting point, called median effective dosage which produces a result in 50% of the population.

However, doctors are averse to either overdosing or underdosing depending on the condition. They tend to start at some point and then proceed straight through the search space until they get to the right level. This algorithm is called linear search. It is brute force and it is guaranteed to find the right level should it exist.

There are much better ways to do a search assuming a sorted list. Let’s create a game to illustrate this:

Suppose I chose a number between 1 and 100 and You had to guess which number I chose. Every time you guessed, you pay me a dollar and I’ll tell you if you are above or below. If you guess correctly I pay you 10 dollars. Clearly, you are not going to start at 1 and just keep guessing up to 100. Instead you can start at 50 and then eliminate half the range. If you keep halving what’s left you can find any number in 7 tries. Thus earning 3 dollars of profit.

The process of halving the interval is called binary search and it works great when the cost of guessing to high or to low is the same. In this case you paid a dollar for each guess. Suppose though instead, you paid 2 dollars if you guess above and only 1 dollar if you guess below. What strategy should you pursue?

What if you pay the difference between the guess and the true answer and 1 dollar if it’s below. Obviously, you would pay this at the end of the game so as not to reveal the answer immediately. What does the reward have to be to make this game fair and what is the optimal strategy. In effect, this is the game doctors play when they are prescribing a dosage. Every guess, costs the patient time and health, but there is a higher cost to over prescribing or under prescribing. And this asymmetry can be captured in such a game.

A possible solution is to divide the interval into 4 and then guess that number. Keep doing that and you are guaranteed to get an answer as well and you’ll also guess under about 3/4 of the time. The right way to sub divide the interval and guess depends on the cost of each guess and the cost of guessing over. Is there an answer? Sure. But I haven’t seen a medical journal that has researched this question and so we do brute force, and that’s a problem.

Below is python code that simulates these questions:

def linear_search(alist, item, cost_func): first = 0 last = len(alist)-1 found = False cost = 0 while first<=last and not found: current = first if alist[current] == item: found = True else: if item < alist[current]: cost = cost + cost_func(alist[midpoint],item) else: cost = cost + 1 first = current+1 return cost+1 def binary_search(alist, item, cost_func, cost=0, divisor=4): blist = list(alist) if len(blist) == 0: return cost else: midpoint = len(blist)//divisor if blist[midpoint]==item: return cost + 1 else: if item<blist[midpoint]: c = cost + cost_func(blist[midpoint],item) return binary_q_search(blist[:midpoint],item, cost_func, c) else: c = cost + 1 return binary_q_search(blist[midpoint+1:],item, cost_func, c) def binary_q_search(alist, item, cost_func, cost=0, divisor=4): blist = list(alist) if len(blist) == 0: return cost else: midpoint = len(blist)//divisor if blist[midpoint]==item: return cost + 1 else: if item<blist[midpoint]: c = cost + cost_func(blist[midpoint],item) return binary_q_search(blist[:midpoint],item, cost_func, c) else: c = cost + 1 return binary_q_search(blist[midpoint+1:],item, cost_func, c) def trivial_cost_func(guess, true): return 1 def scalar_cost_func(guess, true): return 100 def dif_cost_func(guess, true): return (guess-true) lin = [] binar = [] quar = [] cost_func = scalar_cost_func a = 1000 for i in xrange(0,a): lin.append(linear_search(xrange(0,a),i, cost_func)) binar.append(binary_search(xrange(0,a),i, cost_func)) quar.append(binary_q_search(xrange(0,a),i, cost_func,divisor=4)) print "The average cost for linear search: " + str(sum(lin)/float(a)) print "The average cost for binary search: " + str(sum(binar)/float(a)) print "The average cost for quarter search: " + str(sum(quar)/float(a))

Continued in Part 2