Reverse Engineering for Software Systems

In its most rudimentary form, reverse engineering can be defined as “going backwards through the development cycle”. It has its origins in espionage, when rival countries would take apart each other's products and figure out how they worked in order to build similar ones. Wikipedia defines reverse engineering as:

Reverse engineering (RE) is the process of discovering the technological principles of a device, object or system through analysis of its structure, function and operation. It often involves taking something (e.g., a mechanical device, electronic component, or software program) apart and analyzing its workings in detail to be used in maintenance, or to try to make a new device or program that does the same thing without utilizing any physical part of the original.

When I started my career at an Indian public sector telecommunication company, one of the projects I worked on was a versatile multiplexer. At that time the Indian market was still closed, and access to technological know-how from the more developed countries was limited. We did not even have a sample of the multiplexer, just a technical brochure with a feature list of the product. I am not sure if that can be called reverse engineering, but our team did manage to build a working product. By the time it hit the market I had left the company, so I do not know how well it fared. However, I have had other brushes with reverse engineering, both successful and not-so-successful.

One of them was when I was writing the software for a PC-card-based POS system. The requirement was that the system display a certain message while it was booting up. Since the PC display had been replaced by a custom hardware display, there was an inordinately long wait before the user would see any message. The POS software was written in C on top of MS-DOS. Since I was familiar with assembly code for the Intel 8086, I retrieved the BIOS that came with the PC motherboard and inserted machine code, entered in hexadecimal, to initialize the display and show a message up front, right after the BIOS had completed its initializations. This was the easy part. The hard part was that the code worked when I burnt it onto a blank ROM, but refused to work when the exact same code was added to the BIOS. I had to tear apart the entire BIOS using a disassembler and consult many a book on BIOS and DOS internals before hitting upon the source of the problem. The IBM reference manual for the PC XT provided the mnemonics, and I chanced upon the chunk of code that computed a checksum to verify that the BIOS was intact. Recomputing the checksum and writing it back in the right place did the job.
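The kind of fix described above can be sketched in a few lines. This is a minimal illustration, not the actual BIOS routine: it assumes a simple 8-bit additive checksum in which all bytes of the ROM image must sum to zero modulo 256, and it assumes the checksum byte sits at a known offset (here, hypothetically, the last byte). The function names and the toy 16-byte image are made up for the example.

```python
def checksum8(image: bytes) -> int:
    """Return the sum of all bytes in the image, modulo 256."""
    return sum(image) & 0xFF


def fix_checksum(image: bytes, checksum_offset: int) -> bytes:
    """Return a copy of image whose byte at checksum_offset makes the sum 0 mod 256.

    Assumption: the ROM is valid when all of its bytes add up to zero (mod 256).
    """
    patched = bytearray(image)
    patched[checksum_offset] = 0  # leave the old checksum byte out of the sum
    patched[checksum_offset] = (-sum(patched)) & 0xFF
    return bytes(patched)


# Hypothetical 16-byte "ROM"; the last byte is assumed to hold the checksum.
rom = fix_checksum(bytes(range(1, 17)), checksum_offset=15)

# Patching in new code changes the sum, so the integrity check now fails.
modified = bytearray(rom)
modified[3] ^= 0xFF  # stand-in for the inserted display code
modified = bytes(modified)

# Recomputing the checksum byte makes the image pass the check again.
fixed = fix_checksum(modified, checksum_offset=15)

print(checksum8(rom), checksum8(modified) != 0, checksum8(fixed))
```

Under these assumptions, any change to the image has to be followed by a call like `fix_checksum` before the ROM is burnt, otherwise the start-up integrity check rejects it even though the inserted code is itself correct.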

This was an example of relatively successful tweaking through reverse engineering. My other endeavors have been less successful. A few years later, when I was re-engineering software used in the manufacture of sophisticated X-ray rooms, I was impeded not only by my lack of knowledge of COBOL and of the network databases used by the legacy system, but also by the complete lack of documentation left by the programmers. After a month of unsuccessful attempts at understanding the source code, my lead and I had to switch to a top-down approach and start by understanding the business scenarios.

Even on a more recent project, re-engineering an enterprise system for a medical chain, my team and I found that a top-down analysis was much more useful for building the new application. Later the customer informed us that a number of unsuccessful attempts had been made earlier to re-engineer the system, all of which had started by trying to reverse engineer the legacy application.

The major challenges I see with reverse engineering are:

  • The source code, where it is available, is often ‘spaghetti’ code and very difficult to decipher
  • The cost of reverse engineering is extremely high, and its returns range from low at best to unpredictable at worst
  • Often because of obsolete technology, the technical skills are hard to find
  • Even if the source code is readable, it may be dependent on hard coded data or implicit assumptions that may not be visible to the person trying to reverse engineer the code
  • In case of applications, business requirements change so much over a period of time that a lot of the functionality in the code may not even be relevant any longer

A slightly more useful artifact with which to start the reverse engineering process is the database structure or the ER diagram, where it is available. Even there, my experience has been that beyond the list of entities, which may help us understand the system, the details are misleading. The reasons are similar to those listed above: archaic and anachronistic remnants from the past.

To conclude, while reverse engineering sounds very good and can be used successfully, it should always be treated as a procedure of last resort.


