
.class File code block and local variables

 
Robert Konigsberg
Ranch Hand
Posts: 172
Hello all,

I am reviewing the .class file format and I would like to ask a couple of questions:

Here is a sample disassembly:


which represents

The local variables were pulled in from the LocalVariableTable attribute of the Code attribute. Note how each local variable has an index.

QUESTION #1 Can I assume that the first local variable will ALWAYS have index 0, the second index 1, and so on up to count-1? I suspect the answer is "no" and that I should sort the list before working with it.
QUESTION #2 How can I determine which local variables are actually method parameters? Again, should I assume that they are the first n entries, so that n parameters are indexed 0 through n-1?
QUESTION #3 Any hints on how to deal with the fact that the "int i = 2" starts at PC 0, but the local variable table indicates that "i" starts at PC 2?

Thanks, Rob
 
Corey McGlone
Ranch Hand
Posts: 3271
First of all, what are you trying to do, Robert? Bytecode can vary from compiler to compiler so, if you're trying to process this bytecode in some way, you may very well be limited to one compiler.

Additionally, I'm not sure if any answers I could give you would be accurate. If this bytecode really does vary from one compiler to the next, what might be true in this bytecode might be false in the bytecode generated by a different compiler.
 
Robert Konigsberg
Ranch Hand
Posts: 172
Oh I'm building my own decompiler for the challenge, and any answers you can give would be good. Just tell me what experience you have with your compiler! Additionally, if someone else can provide some further guidance, I'd like to hear it, too!

Thanks.

Rob
[ August 16, 2004: Message edited by: Robert Konigsberg ]
 
Corey McGlone
Ranch Hand
Posts: 3271
Well, Robert, if you're diving into bytecode deep enough to write your own decompiler, I'm afraid you probably already know a lot more about bytecode than I do. Anyway, let me do what I can.

First off, I just finished writing this article. It's going into this month's edition of the JavaRanch Journal as soon as it goes out. My guess is that you probably know everything in that article already but, nonetheless, it's there. If you do happen to read through it, I'd be happy to hear any comments you might have.
 
Robert Konigsberg
Ranch Hand
Posts: 172
Hey, that's a pretty good article! I didn't bother reading about String vs. StringBuffer, but only because I intend to read it later on in my project.

I think what I'll do is rearrange some of the fields in the local variable table in a .class file, and see how that impacts javap. Are you interested in any results?

Also, if you are interested in putting another article together about some of the details under the hood, I'd be happy to help you and/or contribute.

RK
 
Ernest Friedman-Hill
author and iconoclast
Sheriff
Posts: 24217
I can confirm that for member methods, local variable 0 is always "this", and subsequent local variables correspond to the parameters, in order. For statics, there is no "this" so the parameters start at 0.
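
A minimal sketch of that layout, added for illustration and not taken from anyone's post: it walks a method descriptor and reports the slot where each parameter starts. The class name `ParameterSlots` and the method `slotsFor` are made up for this example. One detail the rule above does not spell out is that `long` and `double` values occupy two consecutive slots, so "in order" does not always mean "consecutive indexes".

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: maps each parameter of a method to its first local variable slot. */
final class ParameterSlots {

    /**
     * @param descriptor method descriptor as stored in the .class file, e.g. "(II)V"
     * @param isStatic   true if the method has the ACC_STATIC flag
     * @return the starting slot of each parameter, in declaration order
     */
    static List<Integer> slotsFor(String descriptor, boolean isStatic) {
        List<Integer> slots = new ArrayList<>();
        int slot = isStatic ? 0 : 1;              // slot 0 holds "this" for instance methods
        int i = descriptor.indexOf('(') + 1;
        while (descriptor.charAt(i) != ')') {
            slots.add(slot);
            char c = descriptor.charAt(i);
            if (c == 'J' || c == 'D') {
                slot += 2;                        // long and double take two slots
                i++;
            } else {
                while (descriptor.charAt(i) == '[') {
                    i++;                          // skip array dimensions (an array is one reference)
                }
                if (descriptor.charAt(i) == 'L') {
                    i = descriptor.indexOf(';', i);   // skip over the class name
                }
                i++;
                slot += 1;
            }
        }
        return slots;
    }

    public static void main(String[] args) {
        // Instance method taking two ints: parameters land in slots 1 and 2.
        System.out.println(slotsFor("(II)V", false));                     // [1, 2]
        // Static method: no "this", and the long pushes later parameters up by one.
        System.out.println(slotsFor("(JI[DLjava/lang/String;)V", true));  // [0, 2, 3, 4]
    }
}
```

A decompiler could compare these slots against the `index` field of each local variable table entry to decide which entries are parameters; whatever slots remain belong to ordinary locals.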

As far as the scope of "i": at PC=2, "i" has been defined and is available for use. Before that, it hasn't been and it isn't. Make sense?
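
To illustrate the scope rule, here is a small sketch (added for illustration; `LocalVarEntry` and its methods are hypothetical names). An entry covers the code offsets from `start_pc` up to, but not including, `start_pc + length`. The example assumes that `int i = 2;` in an instance method compiles to `iconst_2` at PC 0 and `istore_1` at PC 1, which would explain why the entry for `i` starts at PC 2. The "initializing store" check is only a heuristic a decompiler might use to tie the declaration back to the store that precedes the range.

```java
/** Hypothetical model of one LocalVariableTable entry; field names mirror the class file format. */
final class LocalVarEntry {
    final int startPc;   // start_pc: first code offset at which the variable holds a value
    final int length;    // length: number of code bytes covered, counted from start_pc
    final int index;     // index: slot in the local variable array
    final String name;

    LocalVarEntry(int startPc, int length, int index, String name) {
        this.startPc = startPc;
        this.length = length;
        this.index = index;
        this.name = name;
    }

    /** True if pc lies in the half-open range [start_pc, start_pc + length). */
    boolean inScope(int pc) {
        return pc >= startPc && pc < startPc + length;
    }

    /**
     * Heuristic (an assumption, not a rule from the spec): a store into this slot
     * that ends exactly where the range begins is treated as the declaration.
     */
    boolean isInitializingStore(int storePc, int storeSize, int storeSlot) {
        return storeSlot == index && storePc + storeSize == startPc;
    }

    public static void main(String[] args) {
        // Assumed layout: pc 0 iconst_2, pc 1 istore_1, pc 2 next instruction.
        LocalVarEntry i = new LocalVarEntry(2, 3, 1, "i");   // length 3 is a made-up value
        System.out.println(i.inScope(1));                    // false: still being initialized
        System.out.println(i.inScope(2));                    // true
        System.out.println(i.isInitializingStore(1, 1, 1));  // true: istore_1 at pc 1 is 1 byte long
    }
}
```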

As far as "register allocation" goes, there's nothing to assure that the compiler will use only contiguous blocks of local variable numbers (although of course, it should want to.) But the compiler could, for example, choose not to use #2 for anything. You just have to see what's there.
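
As a sketch of "see what's there" (illustrative only; `SlotUsage`, `Var` and `unnamedSlots` are hypothetical names), the code below marks which slots the table entries claim and lists the ones left over. It assumes each entry carries the variable's field descriptor, so that the second slot of a `long` or `double` is not mistaken for a gap, and that `maxLocals` comes from the method's `max_locals` value.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that reports local variable slots no table entry accounts for. */
final class SlotUsage {

    /** Hypothetical entry: slot index, name, and field descriptor of one variable. */
    static final class Var {
        final int index;
        final String name;
        final String descriptor;

        Var(int index, String name, String descriptor) {
            this.index = index;
            this.name = name;
            this.descriptor = descriptor;
        }
    }

    /** Returns every slot below maxLocals that is not claimed by any entry. */
    static List<Integer> unnamedSlots(List<Var> vars, int maxLocals) {
        boolean[] used = new boolean[maxLocals];
        for (Var v : vars) {
            used[v.index] = true;
            // long ("J") and double ("D") also occupy the following slot;
            // a well-formed class file keeps index + 1 below max_locals.
            if (v.descriptor.equals("J") || v.descriptor.equals("D")) {
                used[v.index + 1] = true;
            }
        }
        List<Integer> free = new ArrayList<>();
        for (int slot = 0; slot < maxLocals; slot++) {
            if (!used[slot]) {
                free.add(slot);
            }
        }
        return free;
    }

    public static void main(String[] args) {
        List<Var> vars = new ArrayList<>();
        vars.add(new Var(0, "this", "LFoo;"));
        vars.add(new Var(1, "a", "J"));      // occupies slots 1 and 2
        vars.add(new Var(4, "b", "I"));      // slot 3 was skipped
        System.out.println(unnamedSlots(vars, 5));   // [3]
    }
}
```

An unclaimed slot is not necessarily unused by the bytecode; it may simply hold a compiler temporary that has no name in the table.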
 
Robert Konigsberg
Ranch Hand
Posts: 172
Originally posted by Ernest Friedman-Hill:
I can confirm that for member methods, local variable 0 is always "this", and subsequent local variables correspond to the parameters, in order. For statics, there is no "this" so the parameters start at 0.

My (limited) experience says something different. It suggests that local variables are laid out like so:

0-(n-1): method parameters
n: this
n+1-... local variables

And for static methods the "n: this" is simply omitted.

In other words, I agree with you, except that method parameters seem to come first.

Originally posted by Ernest Friedman-Hill:
As far as the scope of "i": at PC=2, "i" has been defined and is available for use. Before that, it hasn't been and it isn't. Make sense?

Not yet, but I think I just need to deal with a whole lot of examining output. Right now I'm limiting myself to one test class so it'll be a while. I'm looking forward to figuring out how to map the IFNE (or whatever it is) and GOTO to "if", "while", and "for". I have a plan though...

Originally posted by Ernest Friedman-Hill:
As far as "register allocation" goes, there's nothing to assure that the compiler will use only contiguous blocks of local variable numbers (although of course, it should want to.) But the compiler could, for example, choose not to use #2 for anything. You just have to see what's there.


I don't understand what you're saying here. Please explain.

Additionally, I did a fun test: I took a method like this:
public void func()
{
int firstParameter = 2;
int secondParameter = 3;
}

which has three locals: this, firstParameter and secondParameter. I reordered their entries in the local variable table to firstParameter, secondParameter, this, without changing their index values. In other words it went from this:



to this


I did this by editing the .class file with UltraEdit. When I ran DJ Decompiler on it, it restored the source code correctly. That doesn't say anything about compilers, but it means that DJ's authors thought about this, too.
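
The experiment above suggests a decompiler should find a variable by its index and PC range, never by its position in the table. A minimal sketch of such a lookup follows (added for illustration; `LocalVarLookup`, `Entry` and `nameAt` are hypothetical names, and the start_pc and length values are assumed for a body of `iconst_2, istore_1, iconst_3, istore_2, return`, not copied from the edited file).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical lookup that ignores the order of entries in the local variable table. */
final class LocalVarLookup {

    /** Hypothetical table entry: PC range, slot index, and variable name. */
    static final class Entry {
        final int startPc;
        final int length;
        final int index;
        final String name;

        Entry(int startPc, int length, int index, String name) {
            this.startPc = startPc;
            this.length = length;
            this.index = index;
            this.name = name;
        }
    }

    /** Returns the name of the variable occupying the slot at the given pc, or null if none. */
    static String nameAt(List<Entry> table, int slot, int pc) {
        for (Entry e : table) {
            if (e.index == slot && pc >= e.startPc && pc < e.startPc + e.length) {
                return e.name;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Same three entries as func(), listed in the rearranged order.
        List<Entry> reordered = new ArrayList<>();
        reordered.add(new Entry(2, 3, 1, "firstParameter"));
        reordered.add(new Entry(4, 1, 2, "secondParameter"));
        reordered.add(new Entry(0, 5, 0, "this"));

        System.out.println(nameAt(reordered, 0, 0));   // this
        System.out.println(nameAt(reordered, 1, 2));   // firstParameter
        System.out.println(nameAt(reordered, 2, 4));   // secondParameter
    }
}
```

Because the match depends only on `index` and the PC range, shuffling the entries gives the same answers, which is consistent with the rearranged file still decompiling correctly.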
 
Corey McGlone
Ranch Hand
Posts: 3271
Originally posted by Ernest Friedman-Hill:
I can confirm that for member methods, local variable 0 is always "this", and subsequent local variables correspond to the parameters, in order. For statics, there is no "this" so the parameters start at 0.


I thought Ernest was correct about this so I wrote up a quick little test program. Take a look at this:



Note lines 5, 12, 19, and 26 in the bytecode. You can see easily that the reference to "this" is found at index 0 of the local variable table while the two parameters are at indexes 1 and 2, respectively. The local variable, z, is then placed after the parameters, at index 3.

Are you seeing something different in your own bytecode? If so, what compiler are you using?
[ August 17, 2004: Message edited by: Corey McGlone ]
 
Robert Konigsberg
Ranch Hand
Posts: 172
Arg. Once again I might have missed something. OK maybe I need to actually PAY SOME DAMN ATTENTION!!! Sorry. It's my "THIMK" syndrome.
 
Corey McGlone
Ranch Hand
Posts: 3271
Originally posted by Robert Konigsberg:
Arg. Once again I might have missed something.


Hehe. As you said yourself, writing a decompiler is going to be a challenge. Heck, as we can see here, just interpreting the byte code spec can be a challenge.
 
Robert Konigsberg
Ranch Hand
Posts: 172
Yeah, but not when I'm my own enemy!!



Yeah, *this* comes first, then parameters, then locals.

RK
 