03 August 2009

Year 2038 Problem (Y2K32)

Remember the last few years leading up to the year 2000? Everyone was talking and worrying about the Y2K bug. I was quite confident back then, and still firmly believe today (although I haven't looked for actual proof), that a lot of computer programmers made a lot of money by not doing much of anything. I suspect there were only a small number of real Y2K bugs, and that they were easily fixed or had a negligible impact on business.

I was so confident of this that in 1997 I set the year on my computer at work to 2000. All of the software produced by the company I was working for handled it just fine. In fact, every piece of software on that computer handled it just fine except for one. It's been 12 years now, so I can't remember what that one piece of software was, but I do remember that it worked just fine except for displaying the wrong year. I also remember that this one bug would have had no impact on my productivity or the productivity of the company I was working for.

Why am I bringing this up? Because there is a bug that really will have a large impact on productivity around the world, and there is no easy fix for it. This bug will become a major problem on January 19th, 2038 at 3:14:07 Greenwich Mean Time (or for you ISO geeks, 2038-01-19T03:14:07+00:00). That's when the Unix timestamp runs out of seconds.

No one seems to be talking about it yet, because very few people know much about computers, let alone what the Unix timestamp is. First of all, Unix and Unix-like operating systems (such as the many versions of Linux) are the main backbone of the Internet. Windows and Mac web servers will not save the day because most of them also use the Unix timestamp, so the Y2K32 bug will affect almost the entire World Wide Web.

The Unix timestamp is the number of seconds before or after midnight UTC on January 1st, 1970. On January 19th, 2038 at 3:14:07 UTC we run out of seconds.
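If you're curious what this number looks like right now, here is a minimal C sketch (assuming a Unix-like system and a standard C compiler) that prints the current timestamp and the date it represents:

/* Print the current Unix timestamp and the date it represents.
   A minimal sketch; assumes a Unix-like system and a standard C compiler. */
#include <stdio.h>
#include <time.h>

int main(void)
{
    time_t now = time(NULL);  /* seconds since 1970-01-01 00:00:00 UTC */
    printf("Unix timestamp: %lld\n", (long long)now);
    printf("Which is:       %s", asctime(gmtime(&now)));
    return 0;
}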

This may seem like an odd date and time, but it makes perfect sense if you understand computers. Computers don't count like humans. Computers only know 1 and 0; like a switch, "on" and "off" (actually, that's all computers are, a bunch of switches). These 1s and 0s (or switches) are called bits. Eight bits are called a byte. If a number in a computer is only one byte in size, it has a range from 0 to 255, or if it is signed (can be negative), -128 to 127.

Here is an example of how a computer counts from 0 to 10 inside a byte:

0 : 00000000
1 : 00000001
2 : 00000010
3 : 00000011
4 : 00000100
5 : 00000101
6 : 00000110
7 : 00000111
8 : 00001000
9 : 00001001
10 : 00001010

Do you get the idea?

If we keep going to 255 we get 11111111. If we add 1 to 255 we go right back to 00000000 because we run out of space in the byte, which has only eight bits (or digits). Likewise, if we subtract 1 from 0 we get 255, or 11111111.
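You can watch this wraparound happen with a few lines of C (a sketch; it assumes an 8-bit unsigned char, which is what virtually every machine uses):

/* Demonstrate an 8-bit value wrapping around when it runs out of bits.
   Sketch; assumes unsigned char is 8 bits, as on essentially all hardware. */
#include <stdio.h>

int main(void)
{
    unsigned char b = 255;            /* 11111111 */
    b = b + 1;                        /* no ninth bit, so it wraps */
    printf("255 + 1 = %d\n", b);      /* prints 0 */

    b = 0;                            /* 00000000 */
    b = b - 1;                        /* wraps the other way */
    printf("0 - 1   = %d\n", b);      /* prints 255 */
    return 0;
}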

If this byte is signed (can be negative), the computer treats the highest bit as the sign (a scheme called two's complement). Thus, the bit pattern for 255, 11111111, is read as -1.

Here are some signed examples:

-128 : 10000000
-127 : 10000001
-2 : 11111110
-1 : 11111111
0 : 00000000
1 : 00000001
2 : 00000010
126 : 01111110
127 : 01111111
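Here's a small C sketch that shows the same bit patterns being read both ways (the signed results assume a two's-complement machine, which in practice is every machine you'll meet):

/* Read the same bit patterns as unsigned and as signed.
   Sketch; the signed results assume a two's-complement machine. */
#include <stdio.h>

int main(void)
{
    unsigned char bits = 0xFF;                  /* 11111111 */
    printf("unsigned: %d  signed: %d\n",
           bits, (signed char)bits);            /* 255 and -1 */

    bits = 0x80;                                /* 10000000 */
    printf("unsigned: %d  signed: %d\n",
           bits, (signed char)bits);            /* 128 and -128 */
    return 0;
}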

Now that you know how to count like a computer, let's take a look at the Unix timestamp. The Unix timestamp is not just one byte but four bytes. That means it's made up of 32 bits.

Here's what a 32-bit signed number looks like (I put in some spaces so it's easier for humans to look at):

-2147483648 : 1000 0000 0000 0000 0000 0000 0000 0000
-2147483647 : 1000 0000 0000 0000 0000 0000 0000 0001
-2 : 1111 1111 1111 1111 1111 1111 1111 1110
-1 : 1111 1111 1111 1111 1111 1111 1111 1111
0 : 0000 0000 0000 0000 0000 0000 0000 0000
1 : 0000 0000 0000 0000 0000 0000 0000 0001
2 : 0000 0000 0000 0000 0000 0000 0000 0010
2147483646 : 0111 1111 1111 1111 1111 1111 1111 1110
2147483647 : 0111 1111 1111 1111 1111 1111 1111 1111

This means that 2,147,483,647 seconds after midnight of January 1st, 1970 is 7 seconds after 3:14 am on January 19th, 2038. One more second wraps us around to 2,147,483,648 seconds before midnight of January 1st, 1970, which is 52 seconds after 8:45 pm on December 13, 1901.
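You can reproduce that arithmetic yourself. Here's a C sketch that pins the counter to an explicit 32-bit integer, so it shows where the edges land even on a machine with a 64-bit clock:

/* Show where the edges of a 32-bit timestamp land on the calendar.
   Sketch; forces 32-bit values so it works even on 64-bit systems. */
#include <stdio.h>
#include <stdint.h>
#include <time.h>

static void show(int32_t ts)
{
    time_t t = ts;   /* widen to whatever time_t this system uses */
    char buf[64];
    strftime(buf, sizeof buf, "%Y-%m-%d %H:%M:%S UTC", gmtime(&t));
    printf("%11ld -> %s\n", (long)ts, buf);
}

int main(void)
{
    show(INT32_MAX);   /*  2147483647 -> 2038-01-19 03:14:07 */
    show(INT32_MIN);   /* one second later the counter wraps here:
                          -2147483648 -> 1901-12-13 20:45:52 */
    return 0;
}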

As I said, there is no simple solution to this problem. The timestamp can be made unsigned (always a positive number), which would be good until the year 2106, but this would be incompatible with current systems. Many newer systems use a 64-bit timestamp, which is good until roughly the year 292,277,026,596. However, this doesn't solve the problem of hundreds of millions of 32-bit systems, many of which are embedded systems that can't be upgraded. Some file formats also use a 32-bit timestamp. Most people want to keep their data longer than the next 29 years, so this is a major concern.
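If you want to know whether the machine in front of you keeps its clock in 32 or 64 bits, one rough check (a sketch; it only tells you about the system clock, not about file formats or individual programs) is the size of time_t:

/* Check how wide time_t is on this system. A sketch: 8 bytes means the
   system clock itself reaches far beyond 2038; 4 bytes means it will wrap. */
#include <stdio.h>
#include <time.h>

int main(void)
{
    printf("time_t is %zu bytes (%zu bits)\n",
           sizeof(time_t), sizeof(time_t) * 8);
    return 0;
}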

That's right. It will be less than 29 years before this all happens. I'll be almost 69, so as long as I print out everything I want to outlast me, I should be safe. I think I'll get rid of my computer when I'm 68 and only read books.