|
Softpanorama |
May the source be with you, but remember the KISS principle ;-)
|
| News | Recommended Links | Recommended Papers | Comprehension | Slicing | Outlining | Literate programming | Beautifiers |
| Code metrics | Code Browsers | Call graph analyzers | Diff tools | Xref tools | Code review and inspections | Humor | Etc |
Please note that one of the best editor that can help in comprehension and has built-in slicing facilities is probably XEDIT family (Kedit, THE, etc.). Some members of this family (THE) are free.
The second very useful and underused feature is outlining. The classical implementation of outlining can be found in MS Word, although without special macros MS Word does not support slicing. It's important to note that Ms Word is probably the simplest way to implement "literate programming" in MS windows environment. I recommend to use HTML-mode. In many cases conversion of the source to HTML and working with HTML instead of ASCII text can lead to better software quality. In this case Source tree becomes a WEB page and can contain useful links from comments to other parts of the tree.
Dr. Nikolai Bezroukov
|
|||||||
About: Sunifdef is a command line tool for eliminating superfluous preprocessor clutter from C and C++ source files. It is a more powerful successor to the FreeBSD 'unifdef' tool. Sunifdef is most useful to developers of constantly evolving products with large code bases, where preprocessor conditionals are used to configure the feature sets, APIs or implementations of different releases. In these environments, the code base steadily accumulates #ifdef-pollution as transient configuration options become obsolete. Sunifdef can largely automate the recurrent task of purging redundant #if logic from the code.
Changes: Six bugs are fixed in this release. Five of these fixes tackle longstanding defects of sunifdef's parsing and evaluation of integer constants, a niche that has received little scrutiny since the tool branched from unifdef. This version provides robust parsing of hex, decimal, and octal numerals and arithmetic on them. However, sunifdef still evaluates all integer constants as ints and performs signed integer arithmetic upon them. This falls short of emulating the C preprocessor's arithmetic in limit cases, which is an unfixed defect.
I would suggest a slight variation on the theme. Fire up the application, start it on one of its typical tasks, and then interrupt it in the debugger to catch it. While the process is stopped mid-flight, take note of the call stack to see which classes and methods are being used. Maybe step through a few calls, then let the program run some more.
By doing this repeatedly, you will quickly get a sense for which parts of the code see the most action, and would provide the most obvious places to start studying the code base, and provide the best bang-for-buck return on your time.
Doxygen I thought did java-doc like parsing for C++? I was thinking he should look for something able to build a UML diagram based on the code... I hate UML, but if there isn't any documentation telling you the structures of the code it might be a place to look.
Doxygen is more than a javadoc replacement.
I like Doxygen + Graphviz. Just set Doxygen to document all (instead of just the code with tags) and set it to generate class diagrams, call trees, and dependency graphs and allow it to generate a cross reference document that you can read using your web browser. Set the html generator to frame based, and your browsing of code will be easier. I would also set Doxygen to inline the code within the documentation.
I've use Doxygen to reverse engineer very large programs and had good luck with it. I will say Doxygen is not going to do all your work for you, but it will make your job easier. Especially if you add comments to the code as you figure each section out.
Now if you like to see the logical flow of each method then try JGrasp (jgrasp.org). It has a neat feature called CSD that allow you to follow the logic of the code a little better. It's a java based IDE so that may be a turn off for you. I do whole heartedly recommend the Doxygen (w/ Graphviz).
Good luck.
In fact, it nicely highlights the
difference between "software engineers" and
"code monkeys". Code monkeys just dive in;
they never pause to think. In fact
In fact, a software engineer will happily spend a day or two putting the right tools in place, *including* a full backup and a proper version management system for when he's going to have to touch anything.
The first thing you want to know about a
new code base (after you find out what it's
supposed to be doing) is its structure.
Tools like Doxygen (see previous posts) show
you that structure *far* quicker and *far*
more reliably than any amount of dumb
code-browsing can. And besides
The second thing is to figure out if it calls any "large" functionalities like subroutine libraries or even stand-alone programs like databases, let alone if it makes operating system calls. The call-tree will give you an excellent view, and the linker files can complete the picture. You wouldn't be the first maintenance programmer who found out after months that his application critically depends on some other application he wasn't told about.
The third thing is to see where your code does dirty things. Let the compiler help you. Just compile your application with warnings on and have a look at what the compiler comes up with. You might be surprised (and horrified). Then compile with the settings used by your predecessor and check that your executable is bit-for-bit identical to what's running (you wouldn't be the first sucker who's given a slightly-off code base).
If performance is at all important, then running the whole thing for a night on a standard case under a good profiler will also tells you lots of important things. Starting with where your code spends its time, where it allocated memory and how much, and where the heavily-used bits of code are. All neatly written down in the profiler logs.
Finally, run your application with a tool
to detect memory management errors the first
chance you get. Useful tools are Valgrind
(in a Linux environment), Purify (expensive,
but probably worth it) under Windows, and
sundry proprietary utilities under Unix.
Just about 90% of the errors made in C
programs come from memory management
problems, and half of them don't show up
except through memory leakage and
overwritten variables (or stacks
When you know all that (if you have the tools in place, all of this can be done within 1 day + 1 overnight run + 1 hour reading the profiler output), go ahead and trace through the code in a debugger. You'll be in a *far* better position to judge what you should be reading.
ETrace : Run-time tracing http://freshmeat.net/projects/etrace/ [freshmeat.net]
This book is worth a read http://www.spinellis.gr/codereading/ [spinellis.gr]
Draw some static graphs of functions of interest using CodeViz http://freshmeat.net/projects/codeviz/ [freshmeat.net]
Write lots of notes, preferably on paper with a pen rather than electronically.
Creating small demo apps that use the code can also help.
mhack
Get the guys who use it to explain what they're trying to do, read the code for a couple of days and then have them show you how they use the application. Then plan on six months to a year to get to the point where you can look at buggy output and know immediately where the failure is occurring. In the mean time just work in it as much as you can and don't try to redesign major parts of it until you know what it's doing.
That said, I recommend against just diving in to some random bit of code. You'll probably never need most of it. Heck, I've never read the majority of the code of the project I work on, and that's after several years, with approx 1M lines to consider.
You need to get the big picture instead. Identify the entry point(s), and look for the major functions they call, and so on down until you start to get a feel for how the work is broken down. Look for the major data structures and code operating on them as well, because if you can establish the important data flows in the program you'll be well on your way. Hopefully the design is fairly modular, and if you're in OO world or you're working in a language with packages, looking at how the modules fit together can help a lot too. Any good IDE will have some basic tools to plot things like call graphs and inheritance/containment diagrams, if not there are tools like Doxygen that can do some of it independently.
If you're working on a large code base without a decent overall design that you can grok within a few days, then I'm afraid you're doomed and no amount of tools or documentation or reading files full of code will help you. Projects in that state invariably die, usually slowly and painfully, IME.
Error 'Format Conversion Error, converting
from Y2K to Z2L' added to module x1
Error 'Out of Memory Banks' added to module x2
Error 'Object Expected; found adjective instead'
added to module x3
Error 'bitbucket 95% full; please empty' added
to module x4
Added 1,000,042 to some random value in module
x5
Added 5,555,555 to some random value in module
x6
Not only will you learn about the code, you'll make a great impression on your boss, when, within minutes, you are able to resolve some mysterious problem that has never happened before.
I've since added both cscope [sourceforge.net] and freescope [sourceforge.net], as well as the old Red Hat Source Navigator [sourceforge.net] for good measure.
Stepping Through (Score:5, Insightful)
If there's no one at your company that can help answer your questions and bring you up to speed, I feel for you - your employers ought to know enough to give you some extra margin. It can be very hard to take over a large code base without some human-to-human handover time.
Also, is it an object-oriented system? I assume that it's not, based on your post, but you don't say either way. If it is, the important aspects of program flow often live in the interactions between classes and objects and the business logic is decentralized. OO is great, but it can be harder to reverse-engineer business logic because it's distributed among various classes. A debugger that lets you step through running code is almost essential in this case.
Re:Stepping Through (Score:5, Insightful)
Place a breakpoint somewhere you think will get hit (e.g. main), and then start stepping over and into functions. I usually attack this problem as follows:
Place breakpoint. Use step-in functionality to drop down a ways into the program, looking at things as I go. What are they doing, how do they work, etc.
Once I feel like I understand how a section of code works, I step over that code on subsequent visits. If I feel like this isn't taking me fast enough, I let the program run for a bit, then randomly break the program and see where I am.
Lather, rinse, repeat.
Also, this should go without saying, but you should ask someone who works with you for a high-level overview of what the code is doing. The two of these combined should get you up to speed as quickly as possible.