If you’ve spent any time hillwalking at all you’ll be
familiar with the “false summit effect”. From afar you
can tell where the top of the mountain is, but as you get onto the
slope itself your view is occluded. You fix your sights on a little
hill just in front of you that blocks your view of everything else.
But just as you reach this summit, you find that there’s more
to come. And in the worst case, you might have to go back down from
the mini-summit you’re on to reach the slopes of the higher
part of the hill.
That is the false summit.
Searching for things in computer science is often likened to
climbing a hill, where the summit represents the goal, and every
point on the landscape represents a candidate for this goal. The
idea is that, without knowing where the summit is, one can get
there by always walking uphill. Assuming an ideal
(smooth) landscape, one could get to the top of a hill blindfold.
The flaw, which I’m sure you’ll have spotted, is the
false summit. If we can’t see where we’re going then we
can never be totally sure if we’re on a minor summit rather
than the very top of the mountain. This is called a
local maximum: every direction you walk goes
down, but there is a direction which will eventually take you to a
higher peak.
The solution is to add a bit of randomness. If you’re at the
top and don’t know where to go, spin round in circles until
you’re dizzy and then strike out in the direction you end up
facing. (I don’t suggest you do this literally either.
I’m not going to be responsible for dizzy people hurling
themselves off hillsides.) With a bit of randomness added to the
search pattern you can break out of a local maximum.
Of course, this is one of those circumstances where you can never
be totally sure you’re at the summit. Though if all
you’re looking for is a hill that’s high enough, then
whatever summit you reach will fit the bill adequately.
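In code, the whole strategy is only a few lines. Here’s a minimal sketch in Python, with a toy one-dimensional landscape of my own invention (function names and all numbers are illustrative, not any standard library): greedy climbing stops at whatever summit it finds, and random restarts shake it free of the false ones.

```python
import random

def hill_climb(f, x, step=0.1, iterations=1000):
    """Greedy hill climbing: keep stepping to a higher neighbour."""
    for _ in range(iterations):
        best = max([x - step, x + step], key=f)
        if f(best) <= f(x):
            break  # no neighbour is higher: a summit, possibly a false one
        x = best
    return x

def hill_climb_with_restarts(f, restarts=20):
    """Add randomness: climb from several random starting points
    and keep the highest summit found."""
    summits = [hill_climb(f, random.uniform(-10, 10)) for _ in range(restarts)]
    return max(summits, key=f)

# A toy landscape with a false summit near x = -1.9 and the true
# peak near x = 3.1.
def landscape(x):
    return -((x - 3) * (x + 2)) ** 2 + 5 * x

random.seed(0)
best = hill_climb_with_restarts(landscape)
```

A single climb started on the left-hand slope gets stuck on the false summit; with restarts, at least one start almost always lands in the basin of the higher peak.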
Photograph credit goes to Christof Autengruber on Flickr.
The next instalment in this exciting saga of family feuds and
cyberwarfare. Tybalt has sent an invitation email to Romeo,
impersonating Juliet in order to tempt Romeo to reveal himself.
Romeo opens his mail and finds this message waiting for him:
From: “Juliet” firstname.lastname@example.org
Subject: secret meeting
Come to the town square at midnight, behind the clock. Come
alone and make sure you’re not followed.
If he takes the bait and travels to meet his love, who knows what
terrible fate will befall him?
Luckily, we don’t have to worry, because Romeo and Juliet
have been smart. They’ve taken precautions.
Some more on information theory. Last time I covered issues to do with the reliability of the ‘channel’, the medium used to transfer the data between sender and receiver. To summarise what was in the last post:
- Information is data transferred between sender and receiver, so that the receiver then learns something that they didn’t know before.
- All transmission channels have some error rate, so it is useful to be able to talk about, and calculate the effects of, these errors on data we transmit.
- Adding various types of redundancy can mitigate the errors. We know how much redundancy to use because we can make accurate calculations about error rates.
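The simplest redundancy scheme of all makes those points concrete: repeat each bit three times and take a majority vote at the other end. A quick sketch (the 5% error rate and all the names are just illustrative assumptions):

```python
import random
from collections import Counter

def encode(bits, n=3):
    """Add redundancy by repeating each bit n times."""
    return [b for bit in bits for b in [bit] * n]

def noisy_channel(bits, error_rate, rng):
    """Flip each bit independently with probability error_rate."""
    return [bit ^ 1 if rng.random() < error_rate else bit for bit in bits]

def decode(bits, n=3):
    """Recover each original bit by majority vote over its n copies."""
    groups = (bits[i:i + n] for i in range(0, len(bits), n))
    return [Counter(g).most_common(1)[0][0] for g in groups]

rng = random.Random(42)
message = [rng.randint(0, 1) for _ in range(1000)]
received = decode(noisy_channel(encode(message), 0.05, rng))
errors = sum(a != b for a, b in zip(message, received))
# A 5% raw error rate falls to roughly 3p^2, under 1% of bits,
# at the cost of tripling the transmission length.
```

Because we know the channel’s error rate, we can calculate exactly how much repetition buys us, which is the point of the third bullet above.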
This deals with the problem of transmission, but we still don’t know how to create the message in the first place. What is the most efficient way of turning my thoughts into something which will travel over the channel? This age-old problem is called coding: the problem of translating thoughts into voice or words or pictures.
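For a taste of what a good code looks like, here’s a toy prefix code in the spirit of Morse or Huffman coding: common symbols get short codewords, rare ones get long ones. The four-symbol alphabet and the particular codewords are my own invented example:

```python
# Common symbols get short codewords, rare ones longer. Because no
# codeword is a prefix of another, the bitstream decodes unambiguously.
code = {"e": "0", "t": "10", "a": "110", "q": "111"}

def encode(text):
    return "".join(code[ch] for ch in text)

def decode(bits):
    inverse = {v: k for k, v in code.items()}
    out, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in inverse:          # a complete codeword: emit it
            out.append(inverse[buffer])
            buffer = ""
    return "".join(out)

# "teetee" costs 8 bits here; a fixed-length code for a four-symbol
# alphabet would need 2 bits per symbol, 12 bits in all.
```

The saving comes from matching codeword lengths to how often each symbol turns up, which is exactly the efficiency question posed above.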
According to Darrell Huff, it’s easy to lie with statistics. But it’s just as easy to lie with any kind of science, for the same reason — there’s a lot of knowledge out there and most people aren’t familiar with the basics. So people can come up with intuitively plausible ideas like homeopathy using some quantum-theoretic means of healing, and many people will accept it.
Creationist lecturers make a living from this ambiguity: giving just enough knowledge to tempt the audience into making a leap to a fallacious conclusion. One of the examples the Intelligent Design creationists like is something called Information Theory, which they misrepresent and abuse readily.
Information Theory sounds like it might be about meaning and purpose. Of course, information in that sense is entirely in the eye of the beholder. (Imagine having a crossed line on the telephone while you’re talking to a friend. As far as you’re concerned, the other conversation going on is ‘noise’, while yours is ‘information’; the other people think the exact opposite.) There are no formalisms for deciding how useful something is. But the creationists don’t clarify what is meant by information; they just let you make assumptions because it suits their purpose.
In information theory, a message has information if you can’t predict what it says before you read it. So a dice with a single spot on each face provides no information — you knew it would show a 1 before you rolled. An ordinary dice is unpredictable, so when it stops you’ve learned some piece of information. This is how information theory regards information: as something which informs.
Whether it is interesting or useful is irrelevant. In fact, the most informative message possible is one which consists entirely of random numbers, since it has the most unpredictability.
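This notion of information-as-unpredictability is exactly what Shannon entropy measures. A quick sketch with the two dice from above (the formula is the standard definition; the variable names are mine):

```python
from math import log2

def entropy(probabilities):
    """Shannon entropy in bits: the average unpredictability of one outcome."""
    return -sum(p * log2(p) for p in probabilities if p > 0)

# A dice with a single spot on every face: the outcome is certain,
# so rolling it tells you nothing.
fixed_die = entropy([1.0])        # 0 bits
# An ordinary fair dice: six equally unpredictable outcomes.
fair_die = entropy([1 / 6] * 6)   # log2(6), about 2.58 bits
```

Uniformly random outcomes maximise the entropy, which is why a stream of random numbers is, in this technical sense, maximally informative.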
As well as not defining what is meant by information, creationists then claim that information cannot be created. (This is why they bring it up: to show that complex DNA cannot evolve.) But from the correct definition given above, we can see it is very easy to create complex information. Any noisy signals — leaves rustling, background radiation — are fantastic sources of information.
They mean to say that structured, interesting information cannot be spontaneously created, so genomes require a designer. But in this regard, information theory is not helpful, so they lie about what it says and hope to create a convincing argument from the half-truths and fancy mathematical terms.
In further posts I’ll talk about the good stuff in information theory, and places where you’ll come across it. Tune in next time!
I occasionally listen to a podcast called Computational Complexity. They recently had an interesting blog post explaining a zero-knowledge proof for a Sudoku puzzle.
A zero-knowledge proof is an interesting beast. It involves two parties, the Prover (who knows something) and the Skeptic (who doesn’t). The Prover has to convince the Skeptic that they know something, without revealing what they know.
The example given on the linked article is of a pair of sudoku players. One of them (the Prover, who we shall call Pam) has solved the puzzle. The other (the Skeptic, Sam) is convinced that there is no solution. How can Pam convince Sam that there is a solution without giving him the solution?
This is the purpose of a zero-knowledge proof. For a sudoku, the Prover takes the solved puzzle and permutes all the numbers. The structure of the solution is identical, but the particular symbol in each grid square is different. For example, if the top-left square and the middle square of the original both contained a 7, then in the permuted puzzle they would both contain the same value, but it would no longer be a 7.
So let us say that Pam has taken her solved puzzle, and produced an identical but mixed-up version. How does she prove that to Sam? She writes each digit on a small card and arranges them face-down on a table. Sam then has a choice:
- See a single row
- See a single column
- See a 3x3 block
Pam then turns over whichever row, column or block he asks for. If she has a real solution he can confirm that the digits 1–9 each appear exactly once. This information doesn’t help him solve his own puzzle: the symbols have been permuted, so he has no way of mapping what he has seen back onto his own grid.
So Pam could have cheated. She may just be pretending to have solved the puzzle and be revealing small sets of numbers from a much easier one. Which is why Sam also has a fourth choice: he can ask Pam to turn over the cards which correspond to the starting numbers of his original puzzle. These let him confirm that her grid, under whatever relabelling she used, starts from the same clues as his own.
The problem is that if Sam could combine several reveals, say a couple of 3x3 blocks plus the starting combination, he could begin piecing together her solution, which is what we’re trying to prevent. The way round it is simple: Sam only gets to make one choice from the four options. Once he has made his choice and confirmed that Pam wasn’t lying in some way, she sweeps all her cards off the table and starts again. She makes a different permutation of the original solution, and allows him to make another choice.
Each time he chooses, he has 28 options: any one of the 9 rows, 9 columns or 9 blocks, or the opening state of the board. If she’s lying he can only catch her out with probability 1/28: she has 27 ways out of 28 of getting away with it on every go! But according to the post, if they repeat the game 150 times:
Paula’s chance of cheating in every round is at most (27/28)^150 &lt; 0.5%.
So while it might not actually be practical for a couple of people fighting over an “impossible” sudoku, it demonstrates the principle marvellously. And in reality, it shouldn’t take long to convince someone.
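The figure from the post is easy to check, and we can confirm it by simulating a cheating Pam. This is my own sketch of the arithmetic, using the round count from the quote:

```python
import random

CHOICES = 28   # 9 rows + 9 columns + 9 blocks + the starting clues
ROUNDS = 150

# Exact bound: a cheater survives one round with probability at most
# 27/28, so surviving all 150 independent rounds has probability:
p_survive_all = (27 / 28) ** ROUNDS   # about 0.004, under 0.5%

# The same figure by simulation: in each round, a cheater is caught
# if Sam happens to pick the one reveal she can't fake.
rng = random.Random(1)
trials = 100_000
survived = sum(
    all(rng.randrange(CHOICES) != 0 for _ in range(ROUNDS))
    for _ in range(trials)
)
# survived / trials comes out close to p_survive_all
```

Each round only shaves a factor of 27/28 off the cheater’s chances, but 150 of them compound into near-certainty for Sam.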
I mentioned in my last post on computational models that a Turing machine is the pinnacle of computational power. There is nothing that can be computed that can’t be computed by a Turing machine. The only limit is the set of rules (the states and transitions) used by the Turing machine.
But there’s one thing we haven’t really broached yet, one problem which none of our machines so far has been able to solve. That problem is generality: the point where a machine dedicated to a specific problem becomes a fully programmable computer.
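It’s worth seeing first just how little machinery a Turing machine needs. This is a minimal sketch, not any canonical formulation; the rule format and the bit-inverting example are my own:

```python
def run_turing_machine(rules, tape, state="start", blank="_", max_steps=1000):
    """A minimal Turing machine: an unbounded tape, a read/write head,
    and a table of rules (state, symbol) -> (state, symbol, move)."""
    cells = dict(enumerate(tape))
    head = 0
    for _ in range(max_steps):
        if state == "halt":
            break
        symbol = cells.get(head, blank)
        state, cells[head], move = rules[(state, symbol)]
        head += 1 if move == "R" else -1
    positions = sorted(cells)
    return "".join(
        cells[i] for i in range(positions[0], positions[-1] + 1)
    ).strip(blank)

# A rule table for a machine that inverts every bit of its input,
# halting when it runs off the end.
invert = {
    ("start", "0"): ("start", "1", "R"),
    ("start", "1"): ("start", "0", "R"),
    ("start", "_"): ("halt", "_", "R"),
}
```

Running it on the tape "1011" leaves "0100" behind. Each rule table gives a machine dedicated to one task; generality means a single rule table that can read some other machine’s rules off the tape and follow them.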
In my last post on computational models I finished with a question. I had introduced the concept of a finite state machine, and then a pushdown automaton — essentially a finite state machine with a simple memory. The memory was represented as a ‘stack’, where only the top item was visible.
My question was: does adding more stacks to a system make it more powerful? Is there something which a two-stack machine can do which a one-stack machine can’t? The answer to that question is a resounding Yes. So let’s investigate the limitations of a one-stack machine and how two stacks help us overcome them.
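The classic illustration goes like this (the encoding of the machines as Python functions is my own sketch). One stack is enough to check strings of the form a^n b^n, because the stack counts the a’s and cancels them against the b’s. But it can’t also check a matching run of c’s, because counting the b’s destroys the count; a second stack fixes that:

```python
def one_stack_accepts(s):
    """A pushdown automaton for a^n b^n: push on 'a', pop on 'b'."""
    stack, phase = [], "a"
    for ch in s:
        if ch == "a" and phase == "a":
            stack.append("x")
        elif ch == "b" and stack:
            phase = "b"
            stack.pop()
        else:
            return False
    return not stack and phase == "b"

def two_stack_accepts(s):
    """Two stacks handle a^n b^n c^n: the b's move the count from the
    first stack onto the second, and the c's check against the second."""
    s1, s2, phase = [], [], "a"
    for ch in s:
        if ch == "a" and phase == "a":
            s1.append("x")
        elif ch == "b" and phase in ("a", "b") and s1:
            phase = "b"
            s1.pop()
            s2.append("x")
        elif ch == "c" and phase in ("b", "c") and not s1 and s2:
            phase = "c"
            s2.pop()
        else:
            return False
    return not s1 and not s2 and phase == "c"
```

The one-stack machine has nowhere to put the count once it has spent it; the second stack lets the machine save what the first stack throws away.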
The general trend of this series of posts is about slowly adding features to the idea of a machine until it becomes as sophisticated as we can make it. Last time I mentioned the finite state machine, the simplest possible abstraction of computation.
A finite state machine is nothing but a table of rules. It reads a single input symbol at a time, and all it can do is act on that symbol or not. This isn’t really as sophisticated as we want. We’d like our computer to have a notion of memory: somewhere to store results, so that behaviour can change depending on what the memory contains.
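That “table of rules” really is the entire machine. A sketch (the parity-of-ones example and the names are mine):

```python
# The whole machine is a table: (current state, input symbol) -> next state.
rules = {
    ("even", "0"): "even",
    ("even", "1"): "odd",
    ("odd", "0"): "odd",
    ("odd", "1"): "even",
}

def run_fsm(rules, inputs, state="even"):
    """Feed the input one symbol at a time; the state is the only memory."""
    for symbol in inputs:
        state = rules[(state, symbol)]
    return state

# This machine tracks whether it has seen an even or odd number of 1s,
# but it cannot count *how many*: that would need real memory.
```

Two states are enough to remember parity, but no finite table of states can remember an unbounded count, which is exactly the limitation motivating the next step.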
It’s hard for most people to think about things abstractly. People understand abstractions only by generalising from specific examples, rather than learning the abstraction first. It’s difficult to say what the idea of a chair is, for example, but if you’ve seen enough examples you can form your opinion. You begin to generalise and so don’t make a fool of yourself when you visit a new restaurant.
Building machines in the garage
Mathematics often seems to be all about dealing only in the abstract. The work of early twentieth century mathematicians was particularly aimed at defining maths purely in abstract terms — to divorce it entirely from notions of measurement and counting. They were ultimately unsuccessful, but it is definitely true that modern mathematics is not about counting sheep or weighing grain.
And yet even real mathematicians hanker after the physical examples, the concrete. Once you have answered a question in general terms, it’s a matter of putting flesh to the ghost. A furniture designer knows what a chair is and what it must have. The excitement comes in creating a new example of a chair that no-one has seen before. The same is true with mathematical ideas.
To compute, simply enough, is to process an input to produce an output. People wondered about the limits of computation: could a process be found to answer any question? Intuitively one would say no, but that won’t advance the field of human understanding. How much is no, and what questions come into the category yes?
Boil the question down to its essentials. There is a process, so in theory there should be a processor — some machine to carry out the task. How about we design this machine? There will be steam and shiny brass knobs, I hope.
So in the next few weeks I hope to write about this machine. We will go through many iterations, adding features and redesigning from scratch. Eventually we should reach the point where every new gizmo we add to it doesn’t actually make it more powerful. Then we will have reached our answer — a processing machine which can compute everything computable.
If you were paying attention to the field of pseudoscience a couple of weeks ago you’ll remember the story of the maths teacher who revolutionised division by zero and without a by-your-leave was teaching it to secondary school pupils. Not only was he subverting the standard procedure of childhood education (by teaching his pet fantasies as the next great Truth) he was also talking a load of cobblers.
His point rests on the unfortunate fact that there is no answer for division by zero. In fact, it’s deliberately left undefined. In computer programming this has the unfortunate result of occasionally causing a problem or two. Division doesn’t get used in maths or programming as much as, for example, addition, but it does pop up. What was your average score for all the games you’ve played? Well, add up all the scores and divide by the number of games.
But what if you’ve only just installed the software and the number of games played is zero? What then is the average, since we cannot divide by zero? This problem is easy enough to predict and avoid: we only calculate the average if there’s been at least one game played. Some circumstances are less easy to predict, and sometimes the programmer just plain forgets to check.
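The guard itself is a one-liner. A sketch in Python (the function name is mine); note that Python actually raises an error outright on division by zero, while languages following IEEE 754 floating point quietly produce the NaN discussed below:

```python
import math

def average_score(scores):
    """Average that guards against the zero-games case before dividing."""
    if not scores:
        return None  # a deliberate choice: report "no games yet"
    return sum(scores) / len(scores)

# Unguarded, an empty list would mean computing 0 / 0. Python raises
# ZeroDivisionError; C or JavaScript would evaluate 0.0 / 0.0 to NaN.
assert average_score([3, 4, 8]) == 5.0
assert average_score([]) is None
assert math.isnan(float("nan"))
```

Returning a sentinel like None forces the caller to decide what “no games yet” should look like on screen, instead of letting NaN leak into the display.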
The result of dividing zero by zero is a special value named Not a Number, or NaN for short. If the programmer commits such a faux pas then one of two things will happen. The program may follow Elvis and “leave the building”, which will often result in a little dialog box in Windows saying the program attempted a division by zero.
The other thing that happens, and I think this is more common in languages which don’t mind about well-typed results, is that “NaN” is rendered as a literal result. So the average score over the last zero games played is… NaN.
Just to illustrate the point, codeman38 kindly allowed me to use the weather widget shown. The person who wrote this obviously didn’t check for the presence of zeroes before dividing and the result is as you see — very silly.