
Some questions about DB file structure

 
Gautham Kumar
Greenhorn
Posts: 11
Hi Roel,

Two more questions:

1) Magic cookie: it seems this is stored in bytes. What validation do we need to do with it? Do we need to convert it to an int and compare it with a known value?
2) I have created a RandomAccessFile to read the db file and I could read the initial content.



Output:
Magic cookie value is 514
No of Fields value is 512

I am not sure whether the output I am producing is valid, or whether I need to convert it into another data type and compare.

Once I sort out the magic cookie and the items below, I will proceed with the schema description and data sections.

start of file
4 byte numeric, magic cookie value identifies this as a data file
4 byte numeric, offset to start of record zero
2 byte numeric, number of fields in each record


Please let me know,
 
Roel De Nijs
Sheriff
Posts: 10662
144
AngularJS Chrome Eclipse IDE Hibernate Java jQuery MySQL Database Spring Tomcat Server
Hi Gautham Kumar,

When you post code snippets, please make sure your code has proper indentation and use code tags. This makes the code (and your question) much easier to read, with no need to decipher the code. And an easier-to-read question will probably get an answer more quickly.

Kind regards,
Roel
 
Roel De Nijs
Sheriff
Posts: 10662
144
AngularJS Chrome Eclipse IDE Hibernate Java jQuery MySQL Database Spring Tomcat Server
1/ That's completely up to you. Some people didn't care about this magic cookie. Others read the file, printed the magic cookie, created a constant with this value, and used that constant to validate the magic cookie and so ensure the database file is valid. Don't forget to explain your decision in choices.txt. That's true for the complete assignment. The instructions are very open, so you can go several ways. There is no right or wrong (as long as you don't violate a must requirement, of course); just explain in your choices.txt why you opted for a certain approach and that's just fine.
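
For anyone who goes the constant route, a minimal sketch could look like the one below. The class name MagicCookieCheck and the method isValidDataFile are made up for illustration, and the value 514 is simply the one printed earlier in this thread; your own database file may well use a different value.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

class MagicCookieCheck {
    // Assumed value: printed once from the poster's own file; yours may differ.
    static final int EXPECTED_MAGIC_COOKIE = 514;

    // Returns true when the first 4 bytes of the file match the expected cookie.
    // readInt() throws EOFException (an IOException) if the file is shorter than 4 bytes.
    static boolean isValidDataFile(String path) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            return raf.readInt() == EXPECTED_MAGIC_COOKIE;
        }
    }
}
```

Whether a mismatch should abort the application or only log a warning is again a design decision for choices.txt.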

2/ I doubt that your database file contains 512 fields. That seems like a lot. Normally your assignment defines which fields the database file contains, with a short explanation of each field.

If your instructions state something is 4 byte numeric you should indeed use readInt() from RandomAccessFile. And a 2 byte numeric is the equivalent of readShort(). So if we take a look at the database file structure:
  • magic cookie = 4 bytes --> readInt()
  • offset to 1st record = 4 bytes --> readInt()
  • number of fields = 2 bytes --> readShort()
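
The three reads listed above can be sketched as follows. The class DbHeader and its field names are hypothetical; only readInt() and readShort() come from RandomAccessFile itself.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

class DbHeader {
    final int magicCookie;      // 4 byte numeric
    final int recordZeroOffset; // 4 byte numeric
    final short fieldCount;     // 2 byte numeric

    DbHeader(int magicCookie, int recordZeroOffset, short fieldCount) {
        this.magicCookie = magicCookie;
        this.recordZeroOffset = recordZeroOffset;
        this.fieldCount = fieldCount;
    }

    // Reads the three header values in file order, starting at position 0.
    static DbHeader read(String path) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            int magic = raf.readInt();      // bytes 0-3
            int offset = raf.readInt();     // bytes 4-7
            short fields = raf.readShort(); // bytes 8-9
            return new DbHeader(magic, offset, fields);
        }
    }
}
```

If a value looks implausible (such as 512 fields), the usual suspects are a read of the wrong size or a read that started at the wrong position.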


Hope it helps!
     
    Rehan Zahoor
    Ranch Hand
    Posts: 83
    Android Java Netbeans IDE
    I did not care about the magic cookie, but I did check the file for consistency.
    You can view your db file in WordPad if you are using Windows. Modify your main method to read the whole db file. That would be encouraging.
     
    Gautham Kumar
    Greenhorn
    Posts: 11
    Hi Roel,

    Good News!!

    Finally, I could successfully read the first 3 items using RandomAccessFile :-) and compare the output with that generated by the DB File Reader Tool from Roberto.

    Can you please validate?

    Output:
    Magic cookie value is 514
    Offset to start of record zero 70
    No of Fields in each record 6

    Question#1: Can you please verify/confirm if first 4 bytes are for Magic Cookie, next 4 bytes are for Rec Length and next 2 bytes are for "No of Fields in each record".

    Am i in the right direction?

    Question #2: The second entry in the output is "70". Is this the "offset to start of record zero" or the "record length"? I am confused ;-( What exactly is an offset? Please explain a little about it.

    Question#3 : Are there any spaces between the entries in the DB file? If so, how to skip spaces and jump to next entry using Random Access File?


    FYI, My data file format:

    *********************
    Start of file
    4 byte numeric, magic cookie value identifies this as a data file
    4 byte numeric, offset to start of record zero
    2 byte numeric, number of fields in each record

    Schema description section.
    Repeated for each field in a record:
    2 byte numeric, length in bytes of field name
    n bytes (defined by previous entry), field name
    2 byte numeric, field length in bytes
    end of repeating block

    Data section. (offset into file equal to "offset to start of record zero" value)
    Repeat to end of file:
    2 byte flag. 00 implies valid record, 0x8000 implies deleted record
    Record containing fields in order specified in schema section, no separators between fields, each field fixed length at maximum specified in schema information

    End of file
    *********************

    Thanks,
     
    Roel De Nijs
    Sheriff
    Posts: 10662
    144
    AngularJS Chrome Eclipse IDE Hibernate Java jQuery MySQL Database Spring Tomcat Server
    Gautham Kumar wrote:Question#1: Can you please verify/confirm if first 4 bytes are for Magic Cookie, next 4 bytes are for Rec Length and next 2 bytes are for "No of Fields in each record".

    According to your data file format, the order is magic cookie (4 bytes), offset to record 0 = 1st record (4 bytes), and number of fields in each record (2 bytes). So no rocket science here, just reading/interpreting your instructions about the data file format correctly.
    Start of file
    4 byte numeric, magic cookie value identifies this as a data file
    4 byte numeric, offset to start of record zero
    2 byte numeric, number of fields in each record


    Gautham Kumar wrote:Question #2: The second entry in the output is "70". Is this the "offset to start of record zero" or the "record length"? I am confused ;-( What exactly is an offset? Please explain a little about it.

    The 70 is (according to your data file format) the offset to the start of record 0 (= the 1st record), not the record length. Your data file contains 2 "sections": the 1st contains some metadata (the magic cookie, the offset, the number of fields, the schema description) and the 2nd contains the actual data (the records). The offset is nothing more than the size of the 1st section. So if you jump (seek) to position 70, you can start reading the 1st record.
    All assignments have slightly different instructions or database file layouts. For example, my assignment didn't have an offset to the start of record zero; it had a record length instead.
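
To make the idea of the offset concrete, here is a small sketch that reads the offset from the header and jumps straight to record zero. It assumes the file layout posted in this thread (a 2-byte flag starts each record); the class and method names are invented for the example.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

class RecordZeroSeek {
    // Reads the offset from the header, jumps there, and returns the
    // 2-byte flag that starts record zero.
    static short readFirstRecordFlag(String path) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            raf.readInt();                  // skip the magic cookie (4 bytes)
            int dataOffset = raf.readInt(); // offset to start of record zero
            raf.seek(dataOffset);           // jump over the rest of the metadata section
            return raf.readShort();         // first 2 bytes of the data section
        }
    }
}
```

Because the offset comes from the file itself, nothing in the code depends on the value being 70.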

    Gautham Kumar wrote:Question#3 : Are there any spaces between the entries in the DB file? If so, how to skip spaces and jump to next entry using Random Access File?

    Again, read your instructions carefully. They clearly state: "Record containing fields in order specified in schema section, no separators between fields, each field fixed length at maximum specified in schema information". So there are no separators and each field has a fixed length. In my database file, for example, the first field had a length of 64, so I had to read a byte array of length 64 to get the value for this field. I could then immediately read the 2nd value (without repositioning my file pointer) using a byte array of a different length.
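
Reading one fixed-length field into a byte array might look like the sketch below. The helper name readField is hypothetical, and two things are assumptions not stated in this thread: that the text is US-ASCII, and that values are padded with spaces or null bytes that can be trimmed. Check your own instructions for the actual character encoding and padding.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

class FixedLengthFieldReader {
    // Reads exactly 'length' bytes at the current file pointer and decodes them as text.
    static String readField(RandomAccessFile raf, int length) throws IOException {
        byte[] buffer = new byte[length];
        raf.readFully(buffer); // throws EOFException if fewer than 'length' bytes remain
        // Assumption: US-ASCII text; trim() drops trailing spaces and null bytes.
        return new String(buffer, StandardCharsets.US_ASCII).trim();
    }
}
```

Calling readField twice in a row reads two adjacent fields, since each call leaves the file pointer right after the bytes it consumed.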

    Gautham Kumar wrote:Am i in the right direction?

    Your code looks similar to my custom database file reader. The only remark I have is: why do you use seek calls between the readInt calls? When you open the file, the file pointer is at position 0 (so seek(0) does nothing). When you call readInt, the method returns an int and the file pointer moves to position 4 (because 4 bytes were read to get the int value), so you can read the next int right away (seek(4) again does nothing). Next you need to read the schema description: a number of fields, each with a name and a length (e.g. field "name", length = 64).
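
If you want to see the file pointer move for yourself, getFilePointer() reports the current position. The sketch below (the class name is made up) records the position before and after each header read, with no seek calls in between.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

class FilePointerDemo {
    // Returns the file pointer positions observed at open and after each header read.
    static long[] positionsWhileReadingHeader(String path) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            long atOpen = raf.getFilePointer();          // 0 for a freshly opened file
            raf.readInt();                               // magic cookie
            long afterCookie = raf.getFilePointer();     // 4
            raf.readInt();                               // offset to record zero
            long afterOffset = raf.getFilePointer();     // 8
            raf.readShort();                             // number of fields
            long afterFieldCount = raf.getFilePointer(); // 10
            return new long[] {atOpen, afterCookie, afterOffset, afterFieldCount};
        }
    }
}
```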

    Hope it helps!
    Good luck!
     
    Gautham Kumar
    Greenhorn
    Posts: 11
    Hi Roel,

    Thanks a lot for your valuable input.

    Please check my updated code:


    I could get to "record zero" in the data section with seek(70) and, using the readLine() method, print all the data elements in one row.

    Question#1: I don't understand the schema definition; it is very ambiguous. Can you please explain it in more detail, especially "n bytes (defined by previous entry)"? Are the entries separated by commas (",")? How do I read the schema definition? Is the schema definition the same as the field names and the length of each field?

    Question#2: How to read "field names" and "field length" which seem to be critical for reading rest of contents.

    Question#3: Which methods should i use in Random Access File to read each element in data section as i have ~28 data records? with readLine(), i could print all 28 records in one line.

    I am taking one step at a time and could see light at the end of the tunnel :-)

    Appreciate your help,

    Thanks,
     
    Roel De Nijs
    Sheriff
    Posts: 10662
    144
    AngularJS Chrome Eclipse IDE Hibernate Java jQuery MySQL Database Spring Tomcat Server
    Gautham Kumar wrote:Please check my updated code:

    I will of course not verify and comment on your (updated) code each time. Just this once, to put you on the right track; after that it's completely up to you.

    Why do you name your variable recLength when it holds the offset to the 1st record? dataOffset or recordOffset would make much more sense. You could then pass this variable to the seek call (instead of using a hard-coded value).

    Gautham Kumar wrote:Question#1: I don't understand the schema definition; it is very ambiguous. Can you please explain it in more detail, especially "n bytes (defined by previous entry)"? Are the entries separated by commas (",")? How do I read the schema definition? Is the schema definition the same as the field names and the length of each field?

    You are not the 1st one having trouble deciphering this schema description section. Once you get the hang of it, it's very easy. Just like in the record section, no separators are used. So let's have a look at the schema description:
    Repeated for each field in a record:
    2 byte numeric, length in bytes of field name
    n bytes (defined by previous entry), field name
    2 byte numeric, field length in bytes
    end of repeating block


    So for each field (and you know how many fields you have in a record) this schema describes 3 things:
  • the length in bytes of the field name (2 bytes numeric --> readShort). Let's store this value in fieldNameLength.
  • the field name itself. How many bytes? This is different for every field (which is why the schema description says "n bytes"). So how do you know the number of bytes? You have just read it in the previous step and stored the value in the variable fieldNameLength (that's what the schema description means by "defined by previous entry"). Let's see if you can figure out yourself which method from RandomAccessFile you can use to read the field name. I can't do all the work.
  • the length of a field (2 bytes numeric --> readShort)
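
For readers who want to compare against their own attempt, the three steps above can be sketched as a loop. This is only one possible way: the class SchemaReader is hypothetical, readFully(byte[]) is used for the name because the thread mentions reading byte arrays, and US-ASCII is an assumed encoding that your instructions should confirm.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

class SchemaReader {
    // Reads 'fieldCount' schema entries starting at the current file pointer.
    // Returns field name -> field length, preserving the order in the file.
    static Map<String, Integer> readSchema(RandomAccessFile raf, int fieldCount) throws IOException {
        Map<String, Integer> schema = new LinkedHashMap<>();
        for (int i = 0; i < fieldCount; i++) {
            short fieldNameLength = raf.readShort();       // step 1: length of the name
            byte[] nameBytes = new byte[fieldNameLength];
            raf.readFully(nameBytes);                      // step 2: the "n bytes" of the name
            // Assumption: names are US-ASCII text.
            String fieldName = new String(nameBytes, StandardCharsets.US_ASCII);
            short fieldLength = raf.readShort();           // step 3: length of the field's values
            schema.put(fieldName, (int) fieldLength);
        }
        return schema;
    }
}
```

The method expects the file pointer to sit right after the number of fields, which is where it ends up after reading the header without any seek calls.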


    Gautham Kumar wrote:Question#2: How to read "field names" and "field length" which seem to be critical for reading rest of contents.

    See question #1.

    Just for the record: being able to read field names and lengths is not critical for this assignment. Normally your instructions include a table describing the record structure, with field names, data types, field lengths and a short description of each field. So you could hard-code these values in your program and start reading records (after jumping to the 1st record) based on them. Many ranchers followed this hard-coded approach and passed this certification. So dynamically reading the schema description is not a must requirement (and not critical).

    Gautham Kumar wrote:Question#3: Which methods should i use in Random Access File to read each element in data section as i have ~28 data records? with readLine(), i could print all 28 records in one line.

    As you already noticed, readLine is good enough to get a view of each record and put something on the console. But it's not useful for reading each field separately. So for each record (until you hit EOF, so that's another loop) you'll need to read the value of each field (yet another loop) using the field length (hard-coded or read from the schema description). The method you'll need for each read is the same one you use to read the field name.
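
The two nested loops could be sketched like this. Everything here is illustrative: the class RecordReader, skipping deleted records, the US-ASCII encoding and the trimming of padding are assumptions, while the 2-byte flag and the 0x8000 value come from the data file format posted in this thread.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class RecordReader {
    static final int DELETED_FLAG = 0x8000; // per the posted format; 0 means valid

    // Reads all non-deleted records, starting at dataOffset, using the given field lengths.
    static List<String[]> readRecords(RandomAccessFile raf, long dataOffset, int[] fieldLengths)
            throws IOException {
        List<String[]> records = new ArrayList<>();
        raf.seek(dataOffset);
        while (raf.getFilePointer() < raf.length()) {       // outer loop: until end of file
            int flag = raf.readUnsignedShort();             // 2-byte flag before each record
            String[] values = new String[fieldLengths.length];
            for (int i = 0; i < fieldLengths.length; i++) { // inner loop: one read per field
                byte[] buffer = new byte[fieldLengths[i]];
                raf.readFully(buffer);                      // EOFException on a truncated record
                // Assumption: US-ASCII text padded with spaces or null bytes.
                values[i] = new String(buffer, StandardCharsets.US_ASCII).trim();
            }
            if (flag != DELETED_FLAG) {
                records.add(values);
            }
        }
        return records;
    }
}
```

The fieldLengths array can be filled either from hard-coded values or from the schema description, so the loop itself does not care which approach you choose.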

    Hope it helps!
    Good luck!
     