The Y2K38 Bug... Y2K revisited???

I still remember the hysteria, tensions and doomsday predictions that accompanied the Y2K bug. I was in my penultimate year of engineering.

The two things I still remember about that time are:
  • The doomsday conspiracies about how the world as we know it may cease to exist at the stroke of midnight on Dec 31, 1999.
  • The sudden spurt in the demand for COBOL (an archaic programming language) still widely prevalent in financial and banking software. The local entrepreneurs(??) looked upon this as a great opportunity, and overnight hundreds of COBOL training institutes ("we specialise in the Y2K bug") sprang up all over town. The Y2K bug actually launched the careers of a lot of engineers who had either not yet found jobs or were in jobs that did not pay much.
Finally Jan 1, 2000 dawned and the world was pretty much the same, except for the massive hangovers that a lot of people had due to the hard partying. :-)
 

A lot of people started the "I told you nothing was going to happen" routine. But the impact was minimised only because of hectic efforts all over the world to rectify COBOL code so that it stored the century digits of the year. The Y2K bug arose because many older systems stored only the last two digits of the year, hence the year 2000 and the year 1900 would appear the same.

Of course we now know that the prevalence of computers that would fail because of this error was greatly exaggerated by the media. Computer scientists were generally aware that most machines would continue operating as usual through the century turnover, with the worst result being an incorrect date. This prediction held true through the new millennium. Affected systems were tested and corrected in time, although the correction and verification of those systems was monumentally expensive.

The year-2038 bug is similar to the Y2K bug in that it involves a time wrap not handled by programmers.

There are however several other problems with date handling on machines in the world today. Some are less prevalent than others, but almost all computers suffer from one critical limitation. Most programs work out their dates from Unix time: the number of seconds elapsed since 00:00:00 UTC (Coordinated Universal Time) on Jan 1, 1970. A recent milestone was Sep 9, 2001, when this value wrapped from 999,999,999 seconds to 1,000,000,000 seconds. Very few programs anywhere store this count as a 9-digit number, and therefore the extra digit was not a problem.

Most computers store this second count in a type named time_t, which on traditional systems is a signed 4-byte (32-bit) integer. Of those 32 bits, 31 hold the count itself, giving a maximum value of 2^31 - 1 = 2,147,483,647; the remaining bit (bit no. 32) is the sign.

A few examples would help in explaining this:-

Date & time                       time_t
1-Jan-1970, 12:00:00 AM GMT       0
1-Jan-1970, 12:01:00 AM GMT       60
1-Jan-1970, 01:00:00 AM GMT       3600
2-Jan-1970, 12:00:00 AM GMT       86400
1-Jan-1971, 12:00:00 AM GMT       31536000
1-Jan-1972, 12:00:00 AM GMT       63072000
1-Jan-2038, 12:00:00 AM GMT       2145916800
19-Jan-2038, 03:14:07 AM GMT      2147483647

This means that when the second count reaches 2,147,483,647 and one more second elapses, it wraps to -2,147,483,648.

The precise date of this occurrence is Tue Jan 19 03:14:07 2038. At this instant, a machine prone to this bug will show the time Fri Dec 13 20:45:52 1901, hence it is possible that the media will call this the Friday the 13th Bug.

This problem may not be as easy to fix as the Y2K bug. Almost all the software that we see around us, and a lot that we do not see (embedded systems), is coded exclusively in C. Unlike the Y2K problem, the Y2K38 problem is more deeply rooted in the system and can rear up in more than one way. All personal computers are likely to be 64-bit by the year 2038, so the problem should disappear (for the time being) due to the increase in the storage capacity of time_t from 4 bytes to 8 bytes. But most embedded systems will still be 32-bit or less. Note that the large majority of embedded systems today are still 8- or 16-bit. These could include microwave ovens, wrist-watches, elevators, gas-station pumps, car fuel injection computers, radios etc. There are orders of magnitude more small embedded systems in the world than there are desktop computers.

To get into finer points, application software running on 64-bit systems may not use the POSIX time_t type correctly. For instance, a C programmer may inadvertently cast a time_t to a 32-bit type during a time calculation, or implicitly knock off the top 32 bits of a 64-bit time while storing or retrieving it.

For more details on this (hopefully will be solved well before 2038) problem, go to http://www.2038bug.com/ or simply search for 2038 on Google.
