
NX: URLyBird (Database Schema Parser)

 
Lanuk Jajab
Greenhorn
Posts: 19
Hi,
I just downloaded my assignment 2 days ago. One of the first things I have started to work on is the design of my data access layer. I do not like the idea of hardcoding the number of columns, their sizes, etc., especially when these can be deduced from the schema header specified in the database.
Thus, what I have implemented is a Database Schema Parser (for lack of a better term), so that one could modify the schema (or provide a different database) by adding or modifying field names, types, sizes, etc. without requiring recompilation. This information would then be parsed and cached in a singleton through a lazy loading process and be available for the life of the application through helper methods such as getNumberOfColumns(), getColumnName(i), getColumnSize(i), etc. This could also be expanded to cache multiple database schemas, using the entire path specified at runtime as the key.
I guess my question is whether anyone can see potential problems with this approach.
Thanks and I appreciate any feedback.
- Lanuk.
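
A minimal sketch of the lazily loaded schema cache described in the post above. The header layout used here (a 2-byte column count, then for each column a 2-byte name length, the name bytes, and a 2-byte field size) is an assumption made for illustration only, not the assignment's actual file format, and the class name LazySchema is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical schema holder that parses its header on first use. */
final class LazySchema {
    private final byte[] header;   // raw schema header (assumed layout, see above)
    private List<String> names;    // null until the first helper call
    private List<Integer> sizes;

    LazySchema(byte[] header) {
        this.header = header.clone();
    }

    /** Parses the header once; later calls return immediately. */
    private synchronized void load() {
        if (names != null) {
            return;
        }
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(header))) {
            int count = in.readUnsignedShort();
            List<String> parsedNames = new ArrayList<>();
            List<Integer> parsedSizes = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                byte[] raw = new byte[in.readUnsignedShort()];
                in.readFully(raw);
                parsedNames.add(new String(raw, StandardCharsets.US_ASCII));
                parsedSizes.add(in.readUnsignedShort());
            }
            sizes = parsedSizes;
            names = parsedNames;
        } catch (IOException e) {
            throw new IllegalStateException("Malformed schema header", e);
        }
    }

    int getNumberOfColumns()   { load(); return names.size(); }
    String getColumnName(int i) { load(); return names.get(i); }
    int getColumnSize(int i)    { load(); return sizes.get(i); }
}
```

Nothing is parsed until one of the three helper methods is called, and every later call reuses the cached lists.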
 
Andrew Monkhouse
author and jackaroo
Marshal Commander
Pie
Posts: 12014
Hi Lanuk
Welcome to JavaRanch.
I think this is a good approach.
Regards, Andrew
 
Philippe Maquet
Bartender
Posts: 1872
Hi Lanuk,
I am not sure I understand what you mean when you write:

This information would then be parsed and cached in a singleton through a lazy loading process and be available ...

In my implementation, Data reads its meta-data (and keeps it) when it is opened. It's not a "lazy" process, because Data cannot work until it knows its meta-data ... :-) As I implemented Data as a multiton, I can see its meta-data as a singleton, but one held and managed by the Data class itself. Of course, any relevant meta-data information may be published by Data through methods like those you enumerate.
But do you mean that, in your implementation, the meta-data is kept outside the Data class, I would say "centralized"? If that's the case, I don't think it's good design.
Regards,
Phil.
 
Lanuk Jajab
Greenhorn
Posts: 19
Philippe,
Maybe I didn't explain myself clearly. My intention was to cache the metadata of the database (schema) in a singleton that would be available to the Data class only (preferably as a private inner class). This is because the metadata implementation really supports a specific implementation of the DB interface and isn't a metadata approach for all databases.

The reason I mentioned the lazy load process is that the information would be parsed the first time the Data class requests any metadata from the schema singleton (since we may not know the database until runtime), and then cached for the remainder of the application's life.
I plan to have a single instance of the Data class at any given time, as my implementation need only worry about multiple concurrent clients and not multiple programs accessing the database file.

Hope that helps
thanks
- Lanuk.
 
Philippe Maquet
Bartender
Posts: 1872
OK, now I understand perfectly, and ... it looks perfect! :-)
Regards,
Phil.
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
I like this design, and have done something very similar. One advantage to putting the schema info in a separate class is that you can set that class up as an immutable class, which makes it very easy to ensure thread-safe access. As others here know, this is a big issue for me - I will insist on some sort of synchronization somewhere for any mutable data that's accessible by more than one thread. So it's very worthwhile to clearly delineate the immutable stuff, so as not to worry about it.
One quibble though - I'm not sure why either SchemaParser or Schema should be a singleton. In the current application there's no need for more than one instance, true, but I could imagine an application with two or more Data classes accessing different tables with different schemas. I see no need to prevent this possibility by using a singleton here. As long as a Data instance holds a single Schema instance as a private instance variable, it doesn't really matter if there are any other Schema instances in the JVM, IMO.
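
As a rough sketch of the immutable-schema idea in the post above: every field is final, the column list is defensively copied, and each Data-like object simply holds its own Schema as a private instance variable. The names Column, Schema and SchemaBackedData are hypothetical stand-ins, not classes from the assignment.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** One column description; all fields are final, so instances never change. */
final class Column {
    private final String name;
    private final int size;

    Column(String name, int size) {
        this.name = name;
        this.size = size;
    }

    String getName() { return name; }
    int getSize()    { return size; }
}

/** Immutable schema: it has no mutable state, so threads can share it freely. */
final class Schema {
    private final List<Column> columns;

    Schema(List<Column> columns) {
        // Defensive copy, so later changes to the caller's list cannot leak in.
        this.columns = Collections.unmodifiableList(new ArrayList<>(columns));
    }

    int getNumberOfColumns() { return columns.size(); }
    Column getColumn(int i)  { return columns.get(i); }
}

/** Hypothetical stand-in for the Data class: each instance owns one Schema. */
final class SchemaBackedData {
    private final Schema schema;

    SchemaBackedData(Schema schema) {
        this.schema = schema;
    }

    Schema getSchema() { return schema; }
}
```

Because Schema is not a singleton here, two SchemaBackedData instances with different schemas can coexist in the same JVM without interfering with each other.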
 
Lanuk Jajab
Greenhorn
Posts: 19
Jim,
Some great insights. You are correct, there really isn't a need for a Schema singleton so long as each Data instance has a reference to its own Schema. That is exactly how I have it currently implemented. However, as the creation of my Schema directly follows the instantiation of the DB implementation object (which follows the singleton design pattern), I have been "misidentifying" it as a singleton.
One of the things I am currently evaluating is the merits of having my Data implementation as a singleton. I guess in my mind, from the requirements of the application:
<quote>
You may assume that at any moment, at most one program is accessing the database file; therefore your locking system only needs to be concerned with multiple concurrent clients of your server.
</quote>

there wasn't a need to provide a multiton implementation of DB, allowing for the possibility of each having its own schema.
Thus, if anyone wanted to use a different database, they would essentially have to restart the application (in local/networked mode) and provide a new/different database location.
I guess the real question is, at what point should we consider certain things to be out of scope?
Thanks again.
- Lanuk.
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
"there wasn't a need to provide a multiton implementation of DB, allowing for the possibility of each having its own schema."
True, this isn't a requirement. But there isn't any reason to make it a singleton either, is there? The simplest thing, IMO, is to create a class that has a public constructor like normal. Such a class isn't a singleton - that would take a little extra effort to make the constructor(s) private (esp. if it's a default constructor that wasn't even declared previously), declare a static instance field, and provide a static factory method to retrieve the singleton instance. This isn't that much work, true, but it's something - and what benefit does it give us? None that I can see. It's of no benefit now, and it actually interferes with possible enhancements down the road, IMO.
"I guess the real question is, at what point should we consider certain things to be out of scope?"
Tough question in general - but in this case the simpler solution is also best for later enhancements, so there's not really a conflict between these goals.
 
Philippe Maquet
Bartender
Posts: 1872
Hi Lanuk and Jim,
Lanuk:
there wasn't a need to provide a multiton implementation of DB, allowing for the possibility of each having its own schema.

Jim:
True, this isn't a requirement. But there isn't any reason to make it a singleton either, is there? The simplest thing, IMO, is to create a class that has a public constructor like normal.

1. The "singleton" idea: I do think that many things are *easier* to implement if you may assume that you'll have only one instance of the Data class per database file. For example:
  • a cache of last accessed records
  • in-memory field indexes
  • a non-blocking live backup
  • ...

Of course, keeping "a public constructor like normal" is OK, but if some features you implemented require that only one instance exists to work as expected, isn't it better to enforce that uniqueness?

2. The "multiton" solution: The database we currently write has only one table. That situation is so rare in the db world (:-)) that I feel it's a must to have a design which supports multiple tables from the beginning. Notice that LockManager is concerned too: to acquire locks on multiple tables, recNo information alone is not enough!
Regards,
Phil.
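
To illustrate the last point about locks across multiple tables, here is a small sketch of a lock key that combines a table name with a record number. LockKey and SimpleLockManager are hypothetical names, and the manager below is deliberately non-blocking (it only reports whether a lock was obtained), so it is not the LockManager discussed in this thread.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

/** Identifies a record by table name plus record number. */
final class LockKey {
    private final String table;
    private final long recNo;

    LockKey(String table, long recNo) {
        this.table = Objects.requireNonNull(table);
        this.recNo = recNo;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof LockKey)) {
            return false;
        }
        LockKey other = (LockKey) o;
        return recNo == other.recNo && table.equals(other.table);
    }

    @Override
    public int hashCode() {
        return Objects.hash(table, recNo);
    }
}

/** Minimal lock registry keyed by (table, recNo); all access is synchronized. */
final class SimpleLockManager {
    private final Set<LockKey> locked = new HashSet<>();

    /** Returns true if the lock was free and is now held, false otherwise. */
    synchronized boolean tryLock(String table, long recNo) {
        return locked.add(new LockKey(table, recNo));
    }

    synchronized void unlock(String table, long recNo) {
        locked.remove(new LockKey(table, recNo));
    }
}
```

With this key, record 1 of one table and record 1 of another table are locked independently, which a bare recNo could not express.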
     
Lanuk Jajab
Greenhorn
Posts: 19
Thanks for all your input!
I decided to go with the approach suggested by you, Jim, and provided a public constructor to Data.
What I did do to guarantee one Data instance per db file is to provide a Factory that returns either the existing instance or a new one based on the location of the database, so that multiple databases are supported. This is done by maintaining an internal HashMap using the physical database location as a key. This approach still guarantees one Data instance per physical database file.

Thanks,
- Lanuk.
[ July 08, 2003: Message edited by: Lanuk Jajab ]
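
A minimal sketch of such a factory, assuming the database location arrives as a file path string. FileData and FileDataFactory are hypothetical names. The key is the absolute, normalized path, which treats "db/hotels.db" and "db/../db/hotels.db" as the same file but does not resolve symbolic links.

```java
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the Data class. */
final class FileData {
    private final String location;

    FileData(String location) {
        this.location = location;
    }

    String getLocation() { return location; }
}

/** Hands out one FileData per physical database location. */
final class FileDataFactory {
    private static final Map<String, FileData> INSTANCES = new HashMap<>();

    private FileDataFactory() {
        // static use only
    }

    static synchronized FileData getData(String location) {
        // Normalize first, so different spellings of one path share a key.
        String key = Paths.get(location).toAbsolutePath().normalize().toString();
        return INSTANCES.computeIfAbsent(key, FileData::new);
    }
}
```

Note that the one-instance-per-file guarantee only holds for callers that go through the factory; code that calls an accessible constructor directly bypasses the map.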
     
Jim Yingst
Wanderer
Sheriff
Posts: 18671
"if some features you implemented require that only one instance exists to work as expected, isn't it better to enforce that uniqueness?"
Absolutely - if you're implementing features that make this assumption. I'd say enforcing the singleton nature is necessary in this case, especially if there are junior programmers in the area. But I'd hesitate before committing to any features that require a singleton, because it's harder to back out later if someone decides you really do want multiple tables. In my design I haven't encountered a particular reason to move to singleton yet, so I haven't.
I suppose the best solution at this point might actually be to have a private constructor and a public static factory method - so that other classes don't get used to calling the constructor, and we can decide later whether to make it a singleton or not, depending on future concerns. This approach is a little more confusing for junior programmers now - they may wonder "what's this method for, why not just use a public constructor" - but it buys us adaptability later. Might be worth it.
     
Jim Yingst
Wanderer
Sheriff
Posts: 18671
"What I did do to guarantee one Data instance per db file is to provide a Factory that returns either the existing instance or a new one based on the location of the database, so that multiple databases are supported. This is done by maintaining an internal HashMap using the physical database location as a key."
Yup, same thing I did.
     
Philippe Maquet
Bartender
Posts: 1872
Hi Jim,
Well, it's exactly how I implemented my multiton ... :-)
But I have a technical question about how to release those unique instances, because the first technique I thought of is not satisfactory:
  • Data instances are stored in a private static HashMap
  • They are reference-counted (private int referenceCount)
  • My static method Data getInstance(String dbFileName) increments the counter
  • A public void releaseInstance() method decrements it and, if it reaches zero, removes the instance from the HashMap

It is not OK, because I noticed that this design makes it possible to build two (or more) different instances pointing to the same database file:

My question is: is this not a good use case for WeakHashMap?
  • No need for a releaseInstance() method anymore
  • The Data instances may be safely copied into multiple Data variables if needed
  • They are automatically removed from the WeakHashMap when all other references to them become unreachable

Is it OK? I have no experience with WeakHashMap, and moreover I wonder whether there are potential multi-threading issues, as WeakHashMap is documented as not synchronized. Does the garbage collector perform external synchronization when it decides to remove an entry? Or is it safer to wrap it with the Collections.synchronizedMap method?
Thanks for your advice,
Phil.
     
Max Habibi
town drunk
( and author)
Sheriff
Posts: 4118
Hi Phil,
Using a WeakHashMap is just fine: I've had lots of private students, as well as people here, do extremely well with it. As far as synchronization goes, that's a non-issue, because you'll be playing with the WeakHashMap in a synchronized block.
M
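
One possible shape for a WeakHashMap-based multiton, offered as a sketch rather than as anyone's actual submission. Two details are worth knowing. First, WeakHashMap holds its keys weakly but its values strongly, so the value is wrapped in a WeakReference here, and the instance itself keeps the only strong reference to its key. Second, stale entries are expunged inside the map's own methods on the calling thread, so guarding every access with one synchronized block is sufficient. The class name WeakData and the path normalization are assumptions for illustration.

```java
import java.lang.ref.WeakReference;
import java.nio.file.Paths;
import java.util.Map;
import java.util.WeakHashMap;

/** Hypothetical Data-like multiton: at most one live instance per file. */
final class WeakData {
    private static final Map<String, WeakReference<WeakData>> INSTANCES =
            new WeakHashMap<>();

    // Strong reference to the map key: the entry survives exactly as long
    // as this instance is reachable from somewhere else.
    private final String key;

    private WeakData(String key) {
        this.key = key;
    }

    static WeakData getInstance(String dbFileName) {
        String normalized =
                Paths.get(dbFileName).toAbsolutePath().normalize().toString();
        synchronized (INSTANCES) {
            WeakReference<WeakData> ref = INSTANCES.get(normalized);
            WeakData data = (ref == null) ? null : ref.get();
            if (data == null) {
                data = new WeakData(normalized);
                // Drop any stale entry first, so the map stores the new key
                // object (the one the new instance holds) and not the old one.
                INSTANCES.remove(normalized);
                INSTANCES.put(normalized, new WeakReference<WeakData>(data));
            }
            return data;
        }
    }

    String getLocation() { return key; }
}
```

No releaseInstance() method is needed: once callers drop their references, the instance and its key become collectable and the entry eventually disappears from the map.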
     
Philippe Maquet
Bartender
Posts: 1872
Thank you Max for your reply.
Of course I am playing with the WeakHashMap within a synchronized block (as I would do with a HashMap BTW).
And I suppose from your answer that the garbage collector does likewise when it removes an entry. Well, it seems logical: if the garbage collector's actions were not thread-safe ... it'd make a hell of a mess!
Regards,
Phil.
     
Max Habibi
town drunk
( and author)
Sheriff
Posts: 4118
Happy to help
M
     