Geoffrey Hoffman

February 24, 1998

MEng. Progress Report #2

I made a number of changes to the formatter, but most were design changes, and its functionality has not changed much. One of the major changes was moving the main formatter logic and decision making into its own class. The UI now invokes the formatter, passing it a DataInputStream and some parameters that it pulls from the control panel, and the formatter returns a string containing the output. This makes the program easier to understand and may make it easier to port the formatter to other situations.
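As a rough illustration of that separation, the sketch below shows a formatter class that the UI could call with a stream and one setting. The names FormatterSketch, format, and indentWidth are assumptions made for this example, and the body is a placeholder that only counts braces at line ends; it is not the actual formatter logic.

```java
import java.io.BufferedReader;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStreamReader;

// Hypothetical stand-in for the formatter class; all names here are assumptions.
class FormatterSketch {
    private final int indentWidth;   // a setting the UI would read from the control panel

    FormatterSketch(int indentWidth) {
        this.indentWidth = indentWidth;
    }

    // Reads source text from the stream and returns the formatted output as one string.
    String format(DataInputStream in) throws IOException {
        BufferedReader reader = new BufferedReader(new InputStreamReader(in));
        StringBuffer out = new StringBuffer();
        int indent = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            String text = line.trim();
            if (text.isEmpty()) {
                out.append('\n');
                continue;
            }
            // Placeholder logic: a leading } closes a block before the line is printed.
            if (text.startsWith("}")) {
                indent = Math.max(0, indent - 1);
            }
            for (int i = 0; i < indent * indentWidth; i++) {
                out.append(' ');
            }
            out.append(text).append('\n');
            // Placeholder logic: a trailing { opens a block for the following lines.
            if (text.endsWith("{")) {
                indent++;
            }
        }
        return out.toString();
    }
}
```

With this shape, the UI only has to build the stream, read the settings, call format, and display the returned string, so the same class could be driven from a command-line tool as well.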

I changed some of the structure of the logic in the formatter, mainly to solve a whole class of problems that involve knowing when to increment and decrement the indent count. Initially, one can just change the count when a { or } is seen. This works fine if you assume that all blocks have braces around them, but an IF, FOR, or WHILE statement can have a single-statement body with no braces, in which case the body is still indented but has no trailing }. One potential solution is to keep an array indexed by the indent count, or a stack, containing some sort of type identifier for each block of code. This way, when a semicolon is seen, for example, the formatter can check whether it ends the body of a single-line IF or loop and, if so, decrement the count.
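A minimal sketch of the stack idea follows, assuming the parser can report four events: an opening brace, a closing brace, a semicolon that ends a statement, and an IF, FOR, or WHILE header that is not followed by a brace. The class name IndentTracker, the BlockKind values, and the method names are all invented for this illustration, and ELSE is not handled.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical tracker: one stack entry per open block, tagged with its kind.
class IndentTracker {
    enum BlockKind { BRACED, SINGLE_STATEMENT }

    private final Deque<BlockKind> blocks = new ArrayDeque<BlockKind>();

    // The indent count is simply the number of open blocks.
    int depth() {
        return blocks.size();
    }

    // A { was seen.
    void openBrace() {
        blocks.push(BlockKind.BRACED);
    }

    // An IF, FOR, or WHILE header ended and the next token was not a {.
    void headerWithoutBrace() {
        blocks.push(BlockKind.SINGLE_STATEMENT);
    }

    // A statement-ending semicolon closes any brace-less bodies it terminates.
    void semicolon() {
        unwindSingleStatements();
    }

    // A } closes its braced block, and with it any brace-less header that owned the block.
    void closeBrace() {
        if (blocks.peek() == BlockKind.BRACED) {
            blocks.pop();
        }
        unwindSingleStatements();
    }

    private void unwindSingleStatements() {
        while (blocks.peek() == BlockKind.SINGLE_STATEMENT) {
            blocks.pop();
        }
    }
}
```

Because the semicolon only pops entries tagged SINGLE_STATEMENT, a semicolon inside an ordinary braced block leaves the count alone, which is the comparison described above.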

This solution can also be applied to another problem. I also decrement the indent count when a BREAK is seen, so that the count is back at the right level for the next CASE statement. The problem arises when a BREAK appears inside an IF statement or a loop. At the moment, the formatter decrements at the BREAK and then again at the }, which is not correct. In theory, it can use the same array or stack to track the type of the enclosing code block. There are other related SWITCH-CASE issues that need to be worked out, since those statements can be very convoluted. I also need to investigate DO-WHILE loops some more.
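The double decrement could be avoided along these lines: a BREAK only lowers the count when the innermost open block is a CASE body. This is a sketch under assumptions, not the current implementation; the class SwitchIndentSketch, the Kind values, and the event methods are hypothetical, and brace-less IF bodies and DO-WHILE loops are left out.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical tracker that distinguishes braced blocks from CASE bodies.
class SwitchIndentSketch {
    enum Kind { BRACED, CASE_BODY }

    private final Deque<Kind> blocks = new ArrayDeque<Kind>();

    // Indent count for the lines that follow the last event.
    int depth() {
        return blocks.size();
    }

    // A { was seen (switch body, IF body, loop body, and so on).
    void openBrace() {
        blocks.push(Kind.BRACED);
    }

    // A CASE or DEFAULT label was seen.
    void caseLabel() {
        // A previous case body may still be open (fall-through, or its BREAK was nested).
        if (blocks.peek() == Kind.CASE_BODY) {
            blocks.pop();
        }
        blocks.push(Kind.CASE_BODY);
    }

    // A BREAK was seen: decrement only if it sits directly in a case body.
    void breakStatement() {
        if (blocks.peek() == Kind.CASE_BODY) {
            blocks.pop();
        }
    }

    // A } was seen: close a still-open case body first, then the braced block.
    void closeBrace() {
        if (blocks.peek() == Kind.CASE_BODY) {
            blocks.pop();
        }
        if (blocks.peek() == Kind.BRACED) {
            blocks.pop();
        }
    }
}
```

When the BREAK is inside an IF block, the top of the stack is BRACED, so the count is left alone and only the closing } decrements it.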

Another problem that is not yet solved is the function spacing. Since the parser only looks one token ahead, it is hard to know whether a statement is going to turn out to be a function or a variable declaration. One solution would be to at least put spacing around every item declared at the base level of the class, but when a class declares a complicated UI or some other large collection of objects, the formatter would make the file very spread out.
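To make the ambiguity concrete, the sketch below decides between the two cases by scanning further ahead than one token. This is not the approach the formatter takes; it assumes the tokens of a class-level declaration are available as a list of strings, and the class and method names are hypothetical.

```java
import java.util.List;

// Hypothetical helper: classifies a class-level declaration from its leading tokens.
class MemberKindSketch {
    // Returns true when a "(" appears before any "=" or ";", which suggests a
    // method or constructor rather than a variable declaration.
    static boolean looksLikeMethod(List<String> tokens) {
        for (String token : tokens) {
            if (token.equals("(")) {
                return true;
            }
            if (token.equals("=") || token.equals(";")) {
                return false;
            }
        }
        return false;   // ran out of tokens without a deciding symbol
    }
}
```

A declaration such as "int x = f();" is classified as a variable because the = comes before the first parenthesis, while "public void run() {" is classified as a method. The cost is that the decision needs several tokens of lookahead instead of one.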

I also added an option to track the depth of parenthetical nesting. I need this to see where the header of a single-line IF, FOR, or WHILE ends (since I won't see a { ), and also in FOR statements, where the semicolons separate the parts of the FOR header and should not have a line break added after them.
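The semicolon case can be sketched as follows, working on raw characters for brevity rather than on tokens. The class and method names are assumptions, and string literals and comments are ignored here, which a real formatter could not do.

```java
// Hypothetical helper: breaks lines only at semicolons outside parentheses.
class ParenDepthSketch {
    static String breakAtStatementEnds(String code) {
        StringBuffer out = new StringBuffer();
        int parenDepth = 0;   // how many ( are currently open
        for (int i = 0; i < code.length(); i++) {
            char c = code.charAt(i);
            out.append(c);
            if (c == '(') {
                parenDepth++;
            } else if (c == ')' && parenDepth > 0) {
                parenDepth--;
            } else if (c == ';' && parenDepth == 0) {
                // Only a semicolon at depth zero ends a statement.
                out.append('\n');
            }
        }
        return out.toString();
    }
}
```

The same counter tells the formatter where an IF, FOR, or WHILE header ends: it is the closing parenthesis that brings the depth back to zero.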

Some other minor tweaking has been done, including a feature that displays a { immediately followed by a } together, instead of spacing them apart as the formatter otherwise would.

For the future, the big things involve refining the logic. I plan to sit down and lay out what each token depends on, and to make the logic cleaner and more accurate, since it has become a bit patchwork as I was trying to get it to work. I also plan to refine the grammar file for the JLex parser, since there are a few minor problems with it, and to change all the string operations over to StringBuffers for speed. I hope to have a fairly solid version in a short period.
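The reason for the StringBuffer change is that building output with repeated String concatenation creates a new String on each step, while a StringBuffer appends into one growing buffer. A small sketch of output built that way is below; the class and method names are assumptions and this is not code from the formatter.

```java
// Hypothetical helper: builds indented output with a single StringBuffer.
class IndentBufferSketch {
    static String indentLines(String[] lines, int indentWidth) {
        StringBuffer out = new StringBuffer();
        for (int i = 0; i < lines.length; i++) {
            // Append the indent and the line into the same buffer; no intermediate Strings.
            for (int j = 0; j < indentWidth; j++) {
                out.append(' ');
            }
            out.append(lines[i]).append('\n');
        }
        return out.toString();   // one conversion to String at the very end
    }
}
```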