How can I store deserialized objects in a List?

 
Greenhorn
Hi!
I have a problem to solve that I am really stuck on.
I will briefly describe the goal, then what I have so far, and finally the problem where I am stuck. At the very end I will share code snippets so that you can understand it better.

Goal
We have two text files and we want to build a very simple code-versioning environment (like GitHub, but much simpler).
At first our program creates a folder with two subfolders.
In one of the subfolders ('staging') we want to record every file that we have added, in a .ser file.
This .ser file lives in the 'staging' folder.
We want to serialize a kind of staging area (into the .ser file) and then deserialize it by reading from this .ser file.
When we read from it, we want to print to the console
          a.) the filename
          b.) the hash of the file (for this we have built a convertToHash method, which creates a unique hash from the file content).


What I have so far:
I am able to accomplish all of the above, but only for one file.
When I add another file, it becomes obvious that the first file was never stored in a list as intended, nor in the .ser file.
The .ser file is simply overwritten with the second added file and the first added file is no longer in it, so only the second added file is displayed.
Please see the attached picture, which shows the console output as well as the directories mentioned above, so that you get a better idea of what I mean. Be aware that this is the second added file; the first file has disappeared.

My problem
It seems clear that storing into my list does not work.
Here is the relevant code of
    a.) the serialization method:

Outside of any method (at the top of this class) I have declared an ArrayList of type 'StagingAreaItem', which is a class that I have created.
This class ('StagingAreaItem') contains the two attributes 'name' and 'hash'.

Declaration of the list mentioned above:


The File 'stagingAreaDir' is the staging.ser file which you can see in the attached picture; every file that is added shall be stored in it (name and hash).


And here is my deserializing-code:



As I said, it works for one file, but this first file is not kept in the .ser file (= stagingAreaDir) and thus not in my list, since I write the list into this file.

I really have no idea how to make it work and would appreciate any help.
Regards
Michael

[Attached image: ListStoringProblem.PNG]
 
Marshal
Can we have that very simply. Are you trying to serialise multiple objects to the same Path? That won't work; you will only retain the most recently serialised object.
What are you serialising, and why are you serialising anything?
Why doesn't your object have the digest hash as a field? You can probably find ready‑made classes that will calculate an SHA256 or similar for you. You need to be circumspect about when you call that method, so you aren't storing anything with an out of date SHA.
If you are doing version control, why do you need a serialised object? Why can't you simply use text files? Much easier and less error‑prone; I heard that Oracle want to get rid of serialisation because it is error‑prone, anyway.
Why are you using new File in line 5? Why aren't you using try with resources? I can't see that output stream being closed anywhere, which means the file may be locked as write‑only.
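To illustrate the point about ready-made digest classes and about keeping the hash as a field that cannot go out of date, here is a minimal sketch. It assumes a hypothetical class named HashedFile (not the poster's StagingAreaItem) and uses the JDK's java.security.MessageDigest with SHA-256; the digest is computed at the moment the item is created, so name and hash always belong together.

```java
import java.io.IOException;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical item type: a file name plus the hex digest of the file's content. */
final class HashedFile implements Serializable {
    private static final long serialVersionUID = 1L;

    final String name;
    final String hash;

    private HashedFile(String name, String hash) {
        this.name = name;
        this.hash = hash;
    }

    /** Reads the file and computes the digest right now, so the stored hash matches this content. */
    static HashedFile of(Path file) throws IOException {
        return new HashedFile(file.getFileName().toString(), sha256Hex(Files.readAllBytes(file)));
    }

    /** Hex-encoded SHA-256 digest of the given bytes, using the JDK's MessageDigest. */
    static String sha256Hex(byte[] content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }
}
```

If the file changes later, a new HashedFile would have to be created; the old object deliberately keeps the digest of the content it was built from.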
 
Michael Mutek
Greenhorn

Campbell Ritchie wrote:Can we have that very simply. Are you trying to serialise multiple objects to the same Path?
Yeah, basically the goal is to serialize multiple objects into the same .ser-file.

That won't work; you will only retain the most recently serialised object.
What are you serialising, and why are you serialising anything?
I want to track potential changes to the contents of the files.
I plan to track changes by using a Merkle tree (I have already implemented the code).
If I understand it correctly, the idea of a Merkle tree is that the hash of a parent is created from the hashes of its two children: they are concatenated (hash1 + hash2), and the concatenated string is then hashed with a SHA algorithm (I have already implemented one).
I have simplified my problem to only two files and one parent directory for now - when it works, I will extend it.
See the attached image.
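The parent-hash rule described above can be sketched in a few lines. This is only an illustration under assumptions: the class and method names are made up, and the JDK's MessageDigest stands in for the poster's own SHA-1 implementation, which is not shown in the thread.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class MerkleSketch {

    /** Hex-encoded SHA-1 of a string (JDK implementation, used here as a stand-in). */
    static String sha1Hex(String text) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-1");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(text.getBytes(StandardCharsets.UTF_8))) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Parent hash = hash of the two child hashes concatenated (left first, then right). */
    static String parentHash(String leftChildHash, String rightChildHash) {
        return sha1Hex(leftChildHash + rightChildHash);
    }
}
```

Because the parent depends on both children, a change in either file changes its own hash and therefore the hash of the parent directory as well.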


Why doesn't your object have the digest hash as a field? You can probably find ready‑made classes that will calculate an SHA256 or similar for you. You need to be circumspect about when you call that method, so you aren't storing anything with an out of date SHA.

I know that SHA-1 is outdated and that there are ready-made classes for it. But for this task I still want to use SHA-1 and implement it myself (already done and it works, as you can see in my image in the first post -> the file gets a hash).
The objects have an attribute hash, as well as an attribute name (= filename).
But the hash needs to be created from the content of the file using my convertToHash method.


If you are doing version control, why do you need a serialised object? Why can't you simply use text files? Much easier and less error‑prone; I heard that Oracle want to get rid of serialisation because it is error‑prone, anyway.
Yeah, but for this task I need to use serialization for the code-versioning system.
It's not my call...
Edit: Or maybe I misunderstand the task? What does this sentence mean in our context:
"so you have to serialize the information stored in the “staging area” and load
again before every following call."

Maybe I should not serialize from the beginning, but:
1.) First store the objects normally in the .ser file (which represents the staging area).
2.) And afterwards serialize/deserialize this .ser file?

Was it maybe meant like that, or totally differently, and I do not get it at all?

Why are you using new File in line 5? Why aren't you using try with resources? I can't see that output stream being closed anywhere, which means the file may be locked as write‑only.



@try-with-resources statements:
I only learned about them yesterday - I will update my code accordingly.
But this will not solve my issue that I cannot store multiple objects within my .ser-file, right?
How can this be done?

Thanks so far.
Regards
Michael
 
Campbell Ritchie
Marshal

Michael Mutek wrote:. . . I know that the SHA-1 is outdated . . .

Does that matter, as long as you get a recognisable hash from it?

. . . I cannot store multiple objects within my .ser-file, right?
How can this be done? . . .

I am afraid that, as far as I know, it can't be done. You can only serialise one object at a time in an XYZ.ser file.

You can create a List<XYZ> and serialise that; most standard List implementations are serialisable. Its elements are rather like fields of the List, and fields of an object are serialised along with it. As long as those fields can be serialised at all.
Otherwise you would need multiple xyz.ser files; you can try adding date and time or hash to their names.
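The remark that the elements must themselves be serialisable can be checked with a small experiment. The sketch below uses two made-up element classes, Plain and Marked, and writes to an in-memory stream; writeObject throws java.io.NotSerializableException as soon as it reaches an element that does not implement Serializable.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class ElementSerialisabilitySketch {

    /** Hypothetical element type that does NOT implement Serializable. */
    static final class Plain {
        final String name;
        Plain(String name) { this.name = name; }
    }

    /** Hypothetical element type that does implement Serializable. */
    static final class Marked implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        Marked(String name) { this.name = name; }
    }

    /** Returns true if the whole object graph can be written, false if some part is not Serializable. */
    static boolean canSerialise(Object candidate) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(candidate);
            return true;
        } catch (NotSerializableException e) {
            return false;
        }
    }
}
```

An ArrayList of Marked objects passes this check, while an ArrayList containing a Plain object does not, even though ArrayList itself is serialisable.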
 
Master Rancher

Campbell Ritchie wrote:

Michael Mutek wrote:. . . I cannot store multiple objects within my .ser-file, right?
How can this be done? . . .

I am afraid that, as far as I know, it can't be done. You can only serialise one object at a time in an XYZ.ser file.


Sure you can.  You can, for example, write a file with five objects in a row, simply by calling writeObject() five times with five different arguments, then closing the stream, and file.  Then when you read it, you call readObject() five times in a row.  Works just fine.

The problem, though, is that this requires you to know in advance how many objects will be in the file.  Often, you don't know that.  So there are several possible solutions:

1. Write N objects, then when deserializing, just read objects in a loop until you get an error from an end-of-file.  This is messy, and may be hard to distinguish between a normal end-of-file error, and other errors that indicate other problems.

2. Use writeInt() and readInt() to write and read the number of objects at the beginning of the file.  To write 5 objects, use writeInt(5), then use writeObject() 5 times. To read, use readInt(), then call readObject() that number of times.

3. Just put everything in a list or array, and write the list / array.

4. Define a custom object to signal the end of file.  Write as many objects as you want, then write your EOF object. To read, just keep reading in a loop, and when you find an EOF object, you're done.

Option 3 is by far the easiest and most common (which is why it was already suggested).  Option 4 can be useful if you don't know in advance how many objects there will be, and you don't want to store them all in memory at once.  Option 2 can make sense as part of a custom serialization method.  And option 1 is really only done by accident.
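Option 2 from the list above might look roughly like this. It is a sketch, not a recommendation over option 3: the class and method names are invented, and the objects are plain Serializable values written through a single ObjectOutputStream, with the count written first via writeInt.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

final class CountPrefixedSketch {

    /** Writes the number of objects first, then each object in turn, all through one stream. */
    static void writeAll(List<? extends Serializable> objects, File file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file))) {
            out.writeInt(objects.size());
            for (Serializable object : objects) {
                out.writeObject(object);
            }
        }
    }

    /** Reads the count, then calls readObject() exactly that many times. */
    static List<Object> readAll(File file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(file))) {
            int count = in.readInt();
            List<Object> result = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                result.add(in.readObject());
            }
            return result;
        }
    }
}
```

The reader never has to guess where the file ends, which is exactly what makes option 1 messy.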
 
Michael Mutek
Greenhorn

Campbell Ritchie wrote:

Michael Mutek wrote:. . . I know that the SHA-1 is outdated . . .

Does that matter, as long as you get a recognisable hash from it?

. . . I cannot store multiple objects within my .ser-file, right?
Actually, thinking about my intended use case, security indeed does not matter here.
I basically just need distinctive, unique "IDs", i.e. hashes, to avoid collisions... so yes, thinking more about it, SHA-1 is sufficient here.

How can this be done? . . .

I am afraid that, as far as I know, it can't be done. You can only serialise one object at a time in an XYZ.ser file.
Yeah, I was thinking a lot about that yesterday but could not wrap my head around it. Before going to sleep it hit me that it makes sense that it cannot be done. As you said, I can serialize only one single object at a time.
If I serialize one more time later, the newly serialized object will just overwrite the content of my .ser file - is this correct?
If so, then my solution idea would be the following:

When I want to represent a kind of staging area like in GitHub (only simpler), then what I need first is an object in which I can store the contents of the current working directory that I want to add to my staging area.
So, for instance, an object of type List.
After that, I can serialize this list object and store its contents in my .ser file (= representation of my staging area).

Now, let's say I want to add another file to my staging area (like I have tried with the test2.txt file).
If I serialize it right away at this point, it would just overwrite the contents of my .ser file - is this correct?
So what I need to do is add this new test2.txt file to my list object and then serialize my (updated) list object again...
Would you agree with my solution idea, or do I still get it wrong?


You can create a List<XYZ> and serialise that; most standard List implementations are serialisable. Its elements are rather like fields of the List, and fields of an object are serialised along with it. As long as those fields can be serialised at all.
Otherwise you would need multiple xyz.ser files; you can try adding date and time or hash to their names.

Oops, next time I should make a habit of reading the whole post first before starting to reply right away D:
Anyway,
is what you say basically the same as my solution idea?
And yes, I would rather keep it as simple as possible, so I will try your first solution idea and skip, for now, the idea of adding multiple .ser files and somehow labelling them distinctively.




Thanks a lot for your reply - I feel that thanks to your replies I am slowly getting a better idea of this problem :)
 
Michael Mutek
Greenhorn

Mike Simmons wrote:

Campbell Ritchie wrote:

Michael Mutek wrote:. . . I cannot store multiple objects within my .ser-file, right?
How can this be done? . . .

I am afraid that, as far as I know, it can't be done. You can only serialise one object at a time in an XYZ.ser file.


Sure you can.  You can, for example, write a file with five objects in a row, simply by calling writeObject() five times with five different arguments, then closing the stream, and file.  Then when you read it, you call readObject() five times in a row.  Works just fine.

The problem, though, is that this requires you to know in advance how many objects will be in the file.  Often, you don't know that.  So there are several possible solutions:

1. Write N objects, then when deserializing, just read objects in a loop until you get an error from an end-of-file.  This is messy, and may be hard to distinguish between a normal end-of-file error, and other errors that indicate other problems.

2. Use writeInt() and readInt() to write and read the number of objects at the beginning of the file.  To write 5 objects, use writeInt(5), then use writeObject() 5 times. To read, use readInt(), then call readObject() that number of times.

3. Just put everything in a list or array, and write the list / array.
So this basically means storing everything (and also every update, i.e. add or remove) in a list object (or array object) and only then serializing this list/array object?
If so, I like that best for now and will try it out.



4. Define a custom object to signal the end of file.  Write as many objects as you want, then write your EOF object. To read, just keep reading in a loop, and when you find an EOF object, you're done.

Option 3 is by far the easiest and most common (which is why it was already suggested).  Option 4 can be useful if you don't know in advance how many objects there will be, and you don't want to store them all in memory at once.  Option 2 can make sense as part of a custom serialization method.  And option 1 is really only done by accident.



Anyway, thanks for your suggestions - some of them sound fancy, but also pretty interesting.
I might come back to them in the future and try them out, as soon as I have solved my initial problem in as common, simple and straightforward a way as possible,
which at times already seems to be quite an overwhelming challenge in itself for me xD
 
Michael Mutek
Greenhorn
Update:

I have now tried to serialize/deserialize a list object, but somehow the added elements are apparently not stored in the list:
my .ser file gets overwritten whenever I run the program, the list size always stays 1, and the newly added element overwrites
the older element. Why isn't the older element stored, so that when I run the program twice
I would have two elements in the list, which I would then serialize and afterwards deserialize? Here is my code so far - maybe you can spot my mistake:

Code of the add function (it shall add an element to the list):


Each element is of the type StagingAreaItem with the attributes 'filename' and 'hash'.
The 'List<StagingAreaItem> testStagingArea' was declared outside this method.


Code of my attempt to serialize the list into the .ser file (= stagingAreaDir):



Code of my attempt to deserialize the .ser file, i.e. its contents:


Does anyone have an idea why my list is apparently not storing the added elements correctly, but just overwriting itself with every new
element I want to add?


Edit/Update 2:

I kept thinking and came to the conclusion that my strategy for solving the problem is still not correct - I think the right strategy might be this one:

Step 1 (when adding the first file, i.e. object):
The object is added to my list.
Then this data structure, i.e. the list, is serialized.

Step 2 (when adding a second file, i.e. a second object):
Before doing anything else,
I first need to deserialize the .ser file (which represents my list and contains its content).
Side question: Does deserializing a .ser file give you a sort of copy of the object which you serialized before?
In this case a kind of clone of our list, which is an object of its own (I have tested it with the debugger and it has its own object ID in memory) - is this correct?


After deserializing the .ser file I have a clone of my list, which I will call listClone.
This listClone contains all the elements from Step 1 (when I called my add method the first time by starting the program
with 'add + filepath' as arguments) - so far only one object.
Now I would
        a.) add this object from the listClone to my original list
        b.) add the second object (a representation of the argument given when I run my program) to my original list, too.

Finally I serialize the original list again (which now contains two elements/objects) before terminating the program.

I repeat these two steps once for every object I would like to add to my staging area, which is represented by the .ser file;
the contents of the .ser file are represented by my original list.
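The two steps above amount to "load the stored list, add to it, write it back". A minimal sketch of that flow follows. It makes some assumptions: the shape of StagingAreaItem is guessed from the two attributes named in the thread, the method names are invented, and the deserialised list is used directly instead of being copied into a second "original" list, since a freshly started program has nothing else in memory anyway.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

final class StagingAreaSketch {

    /** Assumed shape of the item class; only the two attributes are known from the post. */
    static final class StagingAreaItem implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final String hash;

        StagingAreaItem(String name, String hash) {
            this.name = name;
            this.hash = hash;
        }
    }

    /** Returns the list stored in the file, or an empty list if nothing has been staged yet. */
    @SuppressWarnings("unchecked") // only a List<StagingAreaItem> is ever written to this file
    static List<StagingAreaItem> load(File stagingFile) throws IOException, ClassNotFoundException {
        if (!stagingFile.exists()) {
            return new ArrayList<>();
        }
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(stagingFile))) {
            return (List<StagingAreaItem>) in.readObject();
        }
    }

    /** Overwrites the file with the complete, current list. */
    static void save(List<StagingAreaItem> items, File stagingFile) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(stagingFile))) {
            out.writeObject(items);
        }
    }

    /** Step 2 in one place: deserialise what is there, add the new item, serialise everything again. */
    static void add(File stagingFile, String name, String hash)
            throws IOException, ClassNotFoundException {
        List<StagingAreaItem> items = load(stagingFile);
        items.add(new StagingAreaItem(name, hash));
        save(items, stagingFile);
    }
}
```

Calling add once per program run keeps earlier entries, because the file is overwritten with a list that already contains them. On the side question: yes, readObject() builds new objects with the same content, not references to the ones that were written.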


What do you think about my solution idea? Am I on to something or is it still way off?
Anyway, I will be offline until Monday, which hopefully gives me enough time to think about all of it further
before I test it on Monday - I hope for the best :)
I will post an update here on Monday - thanks so far.

cheers,
Michael
 
Marshal

Michael Mutek wrote:Does anyone have an idea why my list is apparently not storing the added elements correctly, but just overwriting itself with every new element I want to add?



Yes. When you open a file for output, that's what it does. If there was anything there in a file of that name before, it's thrown away. Unless you use the FileOutputStream(File, boolean) constructor in which you can say you want to append to the file.
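One caveat when combining the append constructor with object streams: every new ObjectOutputStream writes its own stream header, so a file that was appended to through a second ObjectOutputStream contains a header in the middle, and a single ObjectInputStream stops with a StreamCorruptedException when it reaches it. The sketch below (class and method names are made up) shows the effect by counting how many objects can be read back.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.StreamCorruptedException;

final class AppendCaveatSketch {

    /** Writes one object; 'append' is passed straight to FileOutputStream(File, boolean). */
    static void writeOne(File file, Serializable value, boolean append) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file, append))) {
            out.writeObject(value);
        }
    }

    /** Tries to read 'expected' objects through ONE ObjectInputStream; returns how many succeeded. */
    static int readableObjects(File file, int expected) throws IOException, ClassNotFoundException {
        int read = 0;
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(file))) {
            for (int i = 0; i < expected; i++) {
                in.readObject();
                read++;
            }
        } catch (StreamCorruptedException e) {
            // Reached the header written by the second ObjectOutputStream.
        }
        return read;
    }
}
```

So appending works at the byte level, but for a serialised list it is usually simpler to read the list, change it, and overwrite the file.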

You may also be having a problem because you don't close the file after you finish writing to it. Because you didn't do that, the last part of the output may not be flushed to wherever on disk the file is located.

And some parts of your code are excessively complicated. In one method you pass a File object and you want to put that object into a FileOutputStream. So you should just do that. Converting the File object to a String and then back to a File is at best a waste of time. Probably it doesn't do any harm but why bother?

I have to say that I didn't read most of that post. It looks like you're trying to build a workaround for problems which would be better addressed by just doing things in the straightforward (and correct) way.
 
Campbell Ritchie
Marshal

Paul Clapham wrote:. . . I didn't read most of that post.

Nor did I.

. . . the straightforward (and correct) way.

...but I did try running some straightforward code.

javac mutek/Demo.java mutek/Foo.java mutek/Serialiser.java
Note: mutek/Serialiser.java uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
campbell@xyz~/java$ java mutek.Demo myFile.ser
Running demo for file “myFile.ser”.
[Foo[123, false, Campbell Ritchie], Foo[234, true, Michael Mutek]]
List serialised.
[Foo[123, false, Campbell Ritchie], Foo[234, true, Michael Mutek], Foo[345, false, Mike Simmonds]]
Deserialised list = [Foo[123, false, Campbell Ritchie], Foo[234, true, Michael Mutek]]

You can see I am adding elements to the List and de-serialising the older (size=2) version. I presume you know what the warning from the javac tool means. Try that sort of simple code and play with it until it is running correctly. Note I used try with resources, so I didn't need to write close() anywhere.
More about serialisation in Thinking in Java by Bruce Eckel (4 editions latest about 2006) and Sierra and Bates's cert exam guide and probably in Boyarsky and Selikoff's cert exam guide, too. I don't see that you need anything other than ordinary serialisation at the moment.
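On the javac note about unchecked operations: it typically comes from casting the result of readObject() to a generic type such as List<Foo>, which the compiler cannot verify. One way to avoid the cast, sketched here with invented names (this is not the Serialiser class from the demo, which was not posted), is to check each element while copying it into a typed list.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.util.ArrayList;
import java.util.List;

final class UncheckedCastSketch {

    /** Reads a serialised List and verifies every element's type, so no unchecked cast is needed. */
    static <T> List<T> readListOf(Class<T> elementType, File file)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(file))) {
            Object raw = in.readObject();
            if (!(raw instanceof List)) {
                throw new InvalidObjectException("Expected a List but found: " + raw);
            }
            List<T> result = new ArrayList<>();
            for (Object element : (List<?>) raw) {
                // Class.cast throws ClassCastException if the element has the wrong type.
                result.add(elementType.cast(element));
            }
            return result;
        }
    }
}
```

The simpler alternative is to keep the cast and annotate the method with @SuppressWarnings("unchecked"), accepting that a wrong element type would only show up later.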

I was mistaken yesterday (sorry: thanks to MS for finding my mistake). The ObjectOutputStream documentation shows how to write several objects to the same file and it seems quite simple.

[edit]Remove a class which had been accidentally duplicated.
 
Rancher

Michael Mutek wrote:
Does anyone have an idea why my list is apparently not storing the added elements correctly, but just overwriting itself with every new
element I want to add?



The first place I would check is how you are maintaining the full List in testStagingArea.
I'm taking a bit of a guess here, but I'm going to suggest this is a new List each time, and so will only ever contain the single StagingAreaItem when it is serialised.

Exactly how you ensure that List is correct before adding a new element is up to you, and likely depends on the system itself.
 
Michael Mutek
Greenhorn

Paul Clapham wrote:

Michael Mutek wrote:Does anyone have an idea why my list is apparently not storing the added elements correctly, but just overwriting itself with every new element I want to add?



Yes. When you open a file for output, that's what it does. If there was anything there in a file of that name before, it's thrown away. Unless you use the FileOutputStream(File, boolean) constructor in which you can say you want to append to the file.
Ah, nice one, I did not even know this constructor - I will have a look.
By now I have solved my problem and everything works, but there is quite a bit of laborious code in my solution, specifically 'helper lists'.
By using 'FileOutputStream(File, boolean)' I might get rid of them (because my .ser file would not be overwritten every time I call the add() method),
which would lead to a nicer solution - I will try it out.


You may also be having a problem because you don't close the file after you finish writing to it. Because you didn't do that, the last part of the output may not be flushed to wherever on disk the file is located.
I have fixed that one in the meantime by using try-with-resources statements.

And some parts of your code are excessively complicated. In one method you pass a File object and you want to put that object into a FileOutputStream. So you should just do that. Converting the File object to a String and then back to a File is at best a waste of time. Probably it doesn't do any harm but why bother?

Yeah, up to now there are still many redundancies and it does not look too nice...
I will try to refactor a lot to get a nicer result.


I have to say that I didn't read most of that post. It looks like you're trying to build a workaround for problems which would be better addressed by just doing things in the straightforward (and correct) way.
First, one must know how to do it straightforwardly in order to do it straightforwardly - it seems I lack too many basics to solve this task in a more solid way.



Anyways, thanks for your input:)
 
Michael Mutek
Greenhorn

Campbell Ritchie wrote:. . . I did try running some straightforward code. . . .




Nice one - I am almost a little bit proud of myself that my solution (from before I wrote these posts) looks almost like yours. The only difference is that
you return a boolean from serialization(), while mine is void...
My deserialisation() also returns a list (basically a kind of copy, in regard to the content, of the original list). In the function where I call
deserialisation(), I then add all its objects to my original list, then also add the new file to my original list, and at the end serialize this original list
-> the .ser file will contain all objects after the program has terminated.
It all works now, but is likely not straightforward.

I guess the reason you return a boolean from your serialization() is the one Paul Clapham hinted at?
-> to be able to append objects to an existing .ser file, I guess?
I do not see how it works, but I will read the ObjectOutputStream documentation for that, as you recommended - thanks so far :)
 
Michael Mutek
Greenhorn

Dave Tolls wrote:

Michael Mutek wrote:
Does anyone have an idea why my list is apparently not storing the added elements correctly, but just overwriting itself with every new
element I want to add?



The first place I would check is how you are maintaining the full List in testStagingArea.
I'm taking a bit of a guess here, but I'm going to suggest this is a new List each time, and so will only ever contain the single StagingAreaItem when it is serialised.

Exactly how you ensure that List is correct before adding a new element is up to you, and likely depends on the system itself.



Yes, that was part of my problem. Now it all works fine in terms of functionality, but it is not the most straightforward way; as said above, I will rework my
solution to make it more straightforward.
 
Campbell Ritchie
Marshal

Michael Mutek wrote:. . . Nice one . . .

Thank you.

I guess the reason you return a boolean from your serialization() is the one Paul Clapham hinted at? . . .

No. It was simply to signal to the calling method that the serialisation had completed without throwing an exception, so you can print

. . .
List serialised.
. . .

Had an exception been thrown you would have seen a different printout.
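A method along those lines might look like the sketch below. The class name, method name and messages are illustrative only, not the code from the demo: the method returns true when writeObject completed and false when an IOException was thrown, so the caller can decide what to print.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.List;

final class ReportingSerialiser {

    /** Returns true if the list was written without an exception, false otherwise. */
    static boolean serialise(List<? extends Serializable> list, File file) {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file))) {
            out.writeObject(list);
            return true;
        } catch (IOException e) {
            System.err.println("Could not serialise list: " + e.getMessage());
            return false;
        }
    }
}
```

A caller could then write something like: if (ReportingSerialiser.serialise(items, file)) System.out.println("List serialised.");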
 
Michael Mutek
Greenhorn
@Campbell Ritchie:
Ah ok, got it - referring to your post before your penultimate post.
@your last post: Sorry, I will keep this in mind from now on whenever I post something.
 
Creator of Enthuware JWS+ V6
Congratulations Michael Mutek,

Your question has made it to our Journal

Have a Cow!
 