## Archive for **August 2011**

## Service Announcement

Folks, for two weeks starting from today I will be away in parts foreign. Rest assured, however, that the S I’s candles are always burning, and regular posts will continue in my absence, although I may be slow responding to your comments.

Meanwhile, if you find yourself with an hour or so to spare, make yourself some tea and toast and watch this talk about free will. Science is showing us that our minds are purely biological phenomena, governed ultimately by the laws of physics. Does this mean we have free will?

## Infinite Beauty

In the last post I described one of the uses of large prime numbers, and added a link to the biggest yet known. It is a Mersenne prime, meaning that it is 2 to the power of another prime, minus one, and can therefore be written as 2^{43112609}-1; fully expanded, it is over twelve million digits long.
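The digit count can be checked without writing the number out. The sketch below is one way to do it, assuming only that a number's decimal length follows from its base-10 logarithm; the function name is made up for illustration.

```python
import math


def decimal_digits_of_mersenne(exponent):
    """Return the number of decimal digits of 2**exponent - 1 (exponent >= 1)."""
    # 2**exponent is never a power of ten, so subtracting 1 cannot shorten it:
    # 2**exponent - 1 has exactly as many digits as 2**exponent.
    return math.floor(exponent * math.log10(2)) + 1


print(decimal_digits_of_mersenne(43112609))  # just under thirteen million
```

Working through logarithms avoids building a number with millions of digits just to count them.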

So large prime numbers are useful. And fortunately, they are an inexhaustible resource. When I say we know this for certain, I really do mean it. We are more certain of the infinitude of primes than we are of the size or age of the universe; more certain of it than we are that the Earth revolves around the sun.

Just as there are some books which everybody ought to read, there are some solutions to problems that are just so *neat* that it seems a shame not to know them. Euclid’s proof, devised over two thousand years ago, is a real glimpse of mathematical beauty.

How do we know that there are an infinite number of primes? Imagine that we don’t. We might think that the prime numbers stop after a while, and that every subsequent number is nonprime. If that were the case, there would have to be some highest-possible prime, the largest of them all. Call this ultimate über-prime *P*.

*P* stands at the head of a gigantic list of prime numbers, going all the way from *P* to the smallest prime, 2. We might arrange these numbers in order – write them down on a piece of paper longer than the universe.

2, 3, 5, 7, 11, 13 … and so on for millions of light years of paper until finally … *P*

This piece of paper contains an exhaustive list of every prime number that exists. Now imagine taking all the numbers on this list and multiplying them together, to give an even more unimaginably huge number, *R*.

2 x 3 x 5 x 7 x 11 x 13 … x *P* = *R*

*R* ends up being a number so huge that *P* seems tiny – but we know for a fact that *R* is not a prime number. We know this because *R*, divided by any prime number on our list, will leave no remainder. *R* is not prime because it is a multiple of 2 – and of 3, and of 5, and of 7, and even of *P*…

Now take this mind-blowingly, egregiously vast number *R* and add 1.

*R* + 1 = *Q*

What do we know about *Q*? Well, we know that if we divide it by 2, we’ll get one left over – the one we’ve just added to *R*. And if we divide it by 3, we’ll get one left over. And by 5, one left over… all the way up to *P*. If we divide *Q* by *P*, we’ll get one left over.

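Which means that no prime on our list divides *Q*. So either *Q* is itself a prime number – much, much bigger than *P*, the number we assumed was the largest prime – or it is divisible by some prime that is missing from our supposedly complete list, which must therefore also be bigger than *P*. Either way, *P* was not the largest prime after all. And if we crown the newcomer as our new *P*, the same argument produces yet another prime to dwarf it.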
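To see the argument at work on short lists, here is a small sketch in Python. It pretends a short list of primes is complete, builds *Q*, and finds a prime that the list missed. The function names are made up for illustration, and trial division is used only because the numbers are tiny.

```python
from math import prod


def smallest_prime_factor(n):
    """Return the smallest prime factor of n (n >= 2) by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # no divisor up to sqrt(n), so n itself is prime


def euclid_step(primes):
    """Given a finite list of primes, return (Q, a prime that is not on the list)."""
    q = prod(primes) + 1
    # Dividing Q by any prime on the list leaves exactly one left over.
    assert all(q % p == 1 for p in primes)
    return q, smallest_prime_factor(q)


for primes in ([2, 3, 5], [2, 3, 5, 7, 11, 13]):
    q, new_prime = euclid_step(primes)
    print(primes, "-> Q =", q, "| prime missing from the list:", new_prime)
```

For the first list *Q* is 31, which is prime. For the second, *Q* is 30031 = 59 × 509, which is not prime, but both of its factors are primes absent from the list. In each case the list turns out to be incomplete, which is all the proof needs.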

When it comes to arguments, what could be more decisive – or more elegant – than this?

REFERENCES

This proof and others are discussed in G H Hardy’s classic essay on mathematical beauty, *A Mathematician’s Apology*. The full text can be found here.

## Critical Factors

Prime numbers are the building blocks of integers. Any integer greater than 1 is either the product of two or more prime numbers, or a prime number itself.

6 is the product of 2 and 3.

943 is the product of 23 and 41.

44,467 is the product of 53 and 839.

These numbers each took about ten seconds to calculate – I just drew prime numbers out of a list and multiplied them together.

So tell me: which pair of prime numbers did I multiply together to get the number 47,124,299?

There is only one answer to this question, and in one sense it is easy to find: simply divide 47,124,299 by each prime number in turn; if the result is a whole number, it is the other prime, and there is the answer. But armed with the same tool as me – a pocket calculator – it would take you days to find the answer to a number I generated in seconds. If you and I were in a race, I would have a clear advantage – and the bigger the primes I chose, the harder your job would be.

The fast factorisation of numbers has been a problem for thousands of years, and we still aren’t very good at it. Granted, my computer could find the prime factors of 47,124,299 in a nanosecond or two, but this is a measure of computational speed, not cleverness. It is still exhaustively checking off the primes, one by one, just as someone would have done on paper a thousand years ago.
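The brute-force search described above can be sketched in a few lines of Python. This is a minimal illustration rather than a serious factoring tool: for simplicity it tries every integer divisor instead of only the primes, which gives the same answer because any composite divisor has already had its prime factors divided out.

```python
def trial_division(n):
    """Return the prime factors of n (n >= 2) in ascending order."""
    factors = []
    d = 2
    while d * d <= n:
        # Divide out each factor as often as it occurs.
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)  # whatever is left over is itself prime
    return factors


print(trial_division(943))       # [23, 41]
print(trial_division(44467))     # [53, 839]
print(trial_division(47124299))  # the answer to the puzzle above
```

Going the other way takes a single multiplication, which is the asymmetry in question: the work of multiplying grows gently with the size of the primes, while the number of candidate divisors to check grows enormously.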

This asymmetry is exploited in public-key cryptography, one of the most powerful forms of encryption in existence that doesn’t involve the weird quantum effects of spinning electrons.

If you want to use public-key cryptography, you choose two large prime numbers *p* and *q* – gigantic numbers, hundreds of digits long – and multiply them together to create an even bigger number, *N*. You would then be free to advertise this number *N* to anyone who wants it; one might list different people’s *N*s in something like a phone book, free for people to look up.

To send you a message *M*, I would have to look up your *N*, and perform a simple calculation involving *M* and *N*. The outcome of this one-way function is the encrypted message. This can then be transmitted to you, and nobody – *not even me* – could convert the encrypted message back to *M*.

The only person who can read the message is you, *because you know what p and q are*. You alone on Earth know the prime factors of *N*, and that means only you can perform the reverse maths that decrypts the message.
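The post does not name a particular scheme, so the sketch below assumes the textbook RSA construction, using the toy primes 53 and 839 from earlier. The public exponent `e` and the private exponent `d` are details the description above leaves out; in RSA, `e` is published alongside *N*, and `d` can only be computed by someone who knows *p* and *q*. Real systems also add padding and use vastly larger primes, so treat this purely as an illustration.

```python
def make_keys(p, q, e=17):
    """Return ((N, e), d): a public key and the matching private exponent."""
    n = p * q
    phi = (p - 1) * (q - 1)  # computing this requires knowing p and q
    d = pow(e, -1, phi)      # modular inverse of e (Python 3.8+)
    return (n, e), d


def encrypt(message, public_key):
    """Encrypt an integer message (0 <= message < N) using only public information."""
    n, e = public_key
    return pow(message, e, n)


def decrypt(ciphertext, public_key, d):
    """Recover the message; only the holder of d can do this."""
    n, _ = public_key
    return pow(ciphertext, d, n)


public_key, private_d = make_keys(53, 839)
ciphertext = encrypt(1234, public_key)
print(ciphertext, "->", decrypt(ciphertext, public_key, private_d))
```

The sender never touches `private_d`, which is why even the person who encrypted the message cannot turn the ciphertext back into *M*.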

Of course, *N* is public, and anybody could work out *p* and *q* from *N* simply by exhausting the possible primes. As we have seen, this is simple, but time-consuming: at present sizes of commercially available *p*s and *q*s, factorising an *N* would take the combined computing power of the entire planet several billion years.

REFERENCES

Simon Singh, *The Code Book*

At the time of writing, the largest known prime was 2^{43112609} – 1. The number can be found written out fully here.

## A Common Tragedy

The Tragedy of the Commons occurs when a number of people using a finite resource realise that they each stand to gain from taking more than their fair share. You know everybody else benefits from taking more, so that’s probably what they *are* doing; the more likely they are to be taking more, the more sense it makes for *you* to take more. And since everybody knows that it makes sense for *you* to take more… and so on, until the resource is depleted.

Fishers often know that their overfishing will lead to extinction of certain species of fish, and farmers often know that overfarming will leave the land infertile; but cessation is simply not feasible for them as individuals in competition with one another.

In a way the problem of short-term benefits outweighing long-term disadvantages resembles addiction. A chocoholic knows chocolate will make him fat; but it’s just too tasty to say no to! His willpower isn’t strong enough.

But our chocoholic friend has an option: he can employ a willpower-assisting strategy. He might not feel he needs chocolate just now, but can imagine a future point when the craving really sets in, when his willpower won’t be enough. So he acts now, while he can, and flushes the chocolate down the toilet, removing the temptation in advance. He enacts policy that anticipates future temptations. Sometimes people entrust this to others. “Don’t bring me any chocolate, I’ll only end up eating it.”

This is one solution to the Tragedy of the Commons. The consumers make a pact: they all agree on how much they can safely extract from the resource, and agree to be punished if they take more, even if – especially if – it later becomes profitable in the short term to do so. This leads to the creation of national parks, protected areas, one-child policies, fishing quotas and so on.
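A toy model makes the difference between the pact and the free-for-all concrete. Every number below is invented purely for illustration: ten fishers share a stock of 1,000 fish that regrows by 25% a year up to its original size, and each fisher either sticks to a quota of 20 fish or takes 40.

```python
def simulate(take_per_fisher, fishers=10, years=20, capacity=1000.0, regrowth=1.25):
    """Return (final_stock, total_catch) for a shared stock; all parameters are hypothetical."""
    stock = capacity
    total_catch = 0.0
    for _ in range(years):
        catch = min(stock, fishers * take_per_fisher)  # cannot catch fish that are not there
        stock -= catch
        total_catch += catch
        stock = min(capacity, stock * regrowth)        # survivors breed, up to capacity
    return stock, total_catch


print("quota :", simulate(take_per_fisher=20))  # (1000.0, 4000.0)
print("greedy:", simulate(take_per_fisher=40))  # (0.0, 1246.875)
```

Under the quota the stock recovers fully every year and the fishers land 4,000 fish between them over two decades. Without it, the catch is doubled for three years, the stock collapses in the fourth, and everyone ends up with less than a third of that total.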

This kind of contract exploits a very human difference in our valuation of rewards depending on how far away in the future they are. Agreeing to protect a resource is easy when the resource is plentiful.

The trouble is that a sufficiently plentiful resource might not even be *seen* as a resource. What value do you place, for example, on air? Not much, until you start to see it polluted.

There was a time when the Earth was seen by most people as an infinite source and an infinite sink. Why regulate things that will never run out? It is only after we realise that something is in danger that it makes sense to protect it – but if the realisation comes late, and we see that the resource is actually scarce, then the short-term benefits of looting it faster than the other fella become very real to us indeed.

There is a time window between thinking something too free to regulate and thinking it too precious to regulate, and the window is often narrow. What have we missed it for? And what do we still have time to protect from our future, greedier selves?

REFERENCES

This was written on a train having just finished *The Logic of Life* by Tim Harford, and with Dennett’s *Freedom Evolves* fresh in memory. Both are well worth reading.

## Ideas Worthy of Nurture

For human failing/strength/preference/proclivity *x*, which is more important, nature or nurture?

Nothing could be more of an empirical question. Science can’t explain everything, but there are some things that are absolutely slap-bang in the centre of what science can explain.* This is one of them. The methodology is well laid-out. Take a group of people who have a similar genetic makeup but different environments (like identical twins raised apart), and another group who have a shared environment but different genes (like adopted children). See how much variation in *x* there is between groups, and compare that to the variation *within* the groups. Perform the necessary statistical tests, see what the outcome is.

This should be as simple, or as complicated or imperfect or conclusive or vague, as any other scientific enquiry. Nevertheless, the nature/nurture question is different. No other issue has more power to fog the process of rational investigation, because it is so intimately involved in how we apportion blame.

It is easy to blame people for things they choose. But it is much harder to blame them for what they *are*.

For human trait *x*, whichever one you’re interested in, the research will exist – or it won’t. It will be a well-planned experiment or something so poorly executed you’d be amazed it snuck through peer review. It’ll tell you one thing or another, or something in between, or nothing. But in a lot of cases, this won’t matter. In a world of conflicting information, complicated science and a lack of understanding of the relationship between how we were born and what we can become, a lot of people will select the evidence that suits the prejudices of the time. And sometimes great harm results.

To an extent, this is a question of who speaks loudest. The voice of a scientist with graphs and facts is too easily drowned out by a hysterical politician’s claims that people are born violent or raised gay, brought up female or psychotic from birth (or the other way around, as suits). The scientist’s problem is not just making herself heard: she must also overcome the public’s misunderstandings about what exactly we *mean* when we say that a gene influences behaviour.

If something is genetically determined, that does not make it inevitable. And just because a thing is natural, that doesn’t make it good. Until these two ideas are widely understood, a society built on an accurate understanding of human nature will always face hostility from people who won’t be told what they don’t want to hear.

REFERENCES

All of this and more (and better) in *The Blank Slate: The Modern Denial of Human Nature* by Steven Pinker. See also *Freedom Evolves* by Daniel C Dennett for the difference between *determined* and *inevitable*.

* Don’t even get me started on homeopathy.

## Alu and You

Selfish gene theory is a gene-centred view of natural selection. Your genome is made up of thousands of individual genes, each one of which has only one goal: replication. And since the resources available for replication are finite, the genes compete.

One way in which a gene might ensure it gets reproduced is to gang up with some other genes to make a body. This body will act as a temporary, disposable vehicle, to be thrown away once it has had children, but worth designing carefully. The genes try to make the body survive at least to childrearing age, by equipping it with sharp teeth or keen eyes. Genes have to learn to work together: a gene for light bones might do well to pair up with a gene for wings; genes for gills and flippers go hand in hand. All of the wonderful good design of life comes from genes cooperating to compete – building bodies that enhance their chances of replication.

But even within a body, the competition between genes is still going on.

Imagine your genome as a sequence of letters, T, C, A and G – 2.9 billion letters arranged in a line, maybe on a tape of paper. Every generation, this piece of paper gets transcribed and copied. Sometimes, mistakes are made.

About 65 million years ago, early in the evolution of primates, an error in gene replication created a monster called Alu.

Alu is a transposon: a short piece of DNA – about 300 letters long – that is able to reproduce itself *within* the genome. It appeared when a gene necessary for protein synthesis was mistranscribed. It only had to appear once. Since then, Alu has been quietly copying itself in the genomes of primates – humans included.

Its success has varied over history. Right now it is believed to create one extra copy of itself in the genome every 200 generations or so; at times in the past it has been more successful. In those periods, almost every child had one more Alu unit than its parent.

Like other genes, it is copied from one generation to the next; the effects are cumulative. Over the immensely long course of its existence, Alu has done well. Imagine huge segments of your DNA existing as the same sequence of letters repeating over and over again – not just hundreds of times, but around a *million* times.

65 million years after it first appeared as a mutation in a single primate, Alu now occupies 10% of your genome. 10% of your DNA is made up of these 300 letters, meaningless junk repeated over and over again, simply because in the great competition of evolution, Alu found a way to cheat.
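Taking the figures in this post at face value (2.9 billion letters, 10% of them Alu, about 300 letters per copy), a quick back-of-envelope calculation gives the number of copies. The function name is made up, and since the underlying numbers vary from paper to paper the result is only a rough estimate.

```python
def alu_copy_estimate(genome_letters=2_900_000_000, alu_percent=10, alu_letters=300):
    """Estimate how many Alu copies fit into the stated share of the genome."""
    letters_occupied = genome_letters * alu_percent // 100
    return letters_occupied // alu_letters


print(alu_copy_estimate())  # 966666, i.e. roughly a million copies
```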

But even Alu can mutate, adapt, evolve, and now there are subtle variations of Alu, superfamilies that compete with each other, trying to out-copy one another…

And the game goes on.

REFERENCES

Richard Dawkins: *The Selfish Gene* and *The Ancestor’s Tale*

The numbers vary from one paper to another; I took the above from “Alu Repeats and Human Genetic Diversity”, Batzer and Deininger; see also *Molecular Biology of the Cell*, second edition.