
storage of files in website

 
Joshua Cloch
Ranch Hand
Posts: 95
Hi guys!

I have run into a problem.
On my website, users can upload files, such as plain text or .doc files.
Users can also use a feature of the site to generate a .doc file. All of these files are automatically stored in a particular directory on the Tomcat server. So far, everything works fine.

Do you think handling files this way is poor design? Would it be better to store all of these files in a database, for example as BLOBs in SQL Server? Or is the current approach OK? I am just not sure.
 
Ajith George
Ranch Hand
Posts: 109
If you want to apply any authorization to the files uploaded to the site, it is better to store them in the DB. Moreover, if you want added security by encrypting the documents, that is also better done with the DB.

Authorization is also possible for files stored on disk, but you may not get the same flexibility in dealing with it.
 
Joshua Cloch
Ranch Hand
Posts: 95
Thanks, Ajith George!

What do you mean by authorization? Is it something like encrypting the files with a well-known algorithm, and then decrypting them with the corresponding algorithm when I want to use them, in order to make the files stored in the database more secure?
 
Jaime M. Tovar
Ranch Hand
Posts: 133
In a database you inherit the database's user schema, so you can keep a bunch of files together with their owners, using database users or a primary key. If you don't use this schema you will need to implement one yourself. Let's say one of your users finds a way to download a file that is owned by another user. Maybe you will have to create a specific folder for each user, but then what happens if a user wants two files with the same name but different contents? Having all your blobs in a database can save you a lot of work. It delegates the data organization to the database (and databases are really good at that).
 
Darren Edwards
Ranch Hand
Posts: 69
Originally posted by Ajith George:
it is better you store it on db


I do not agree with this - databases can store binary data, but it is not their intended purpose. It can slow them down and bloat the database unnecessarily.

First thing is to make sure you do not store these generated files in a place directly accessible to the web (you want access via a download servlet). When you have decided where to store them, keep a reference to the file path and the user who created it in the database - this will allow you to ensure only the user who created the file can download it.

If you are worried about having unencrypted files on the web server hard disk then generate a random name and run them through some encryption algorithm. In the database you can now store the file path (random name), user id, real file name and passcode to decrypt the file.
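
To make the encryption step concrete, here is a minimal sketch of encrypting the file bytes before they are written to disk. AES-GCM, the class name `FileCrypto` and its method names are assumptions chosen for illustration; the post above only says "some encryption algorithm". The key is what would be kept with the file's database row (or, better, in a separate key store).

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical helper, not part of any framework mentioned in this thread.
class FileCrypto {
    private static final int IV_BYTES = 12;   // usual IV size for GCM
    private static final int TAG_BITS = 128;  // authentication tag length
    private static final SecureRandom RANDOM = new SecureRandom();

    // Generates a fresh AES key for one file.
    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("AES is not available", e);
        }
    }

    // Returns the random IV followed by the ciphertext (which includes the tag).
    static byte[] encrypt(SecretKey key, byte[] plain) {
        try {
            byte[] iv = new byte[IV_BYTES];
            RANDOM.nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] cipherText = cipher.doFinal(plain);
            byte[] stored = new byte[iv.length + cipherText.length];
            System.arraycopy(iv, 0, stored, 0, iv.length);
            System.arraycopy(cipherText, 0, stored, iv.length, cipherText.length);
            return stored;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Encryption failed", e);
        }
    }

    // Reads the IV from the front of the stored bytes and decrypts the rest.
    static byte[] decrypt(SecretKey key, byte[] stored) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key,
                    new GCMParameterSpec(TAG_BITS, stored, 0, IV_BYTES));
            return cipher.doFinal(stored, IV_BYTES, stored.length - IV_BYTES);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Decryption failed or data was modified", e);
        }
    }

    public static void main(String[] args) {
        SecretKey key = newKey();
        byte[] stored = encrypt(key, "report contents".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(decrypt(key, stored), StandardCharsets.UTF_8));
    }
}
```

Because GCM authenticates the data, a file that was altered on disk fails to decrypt instead of silently returning garbage.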
 
Joshua Cloch
Ranch Hand
Posts: 95
Thanks guys.

It really is a problem when a user uploads two files that have the same name. Storing the relative path of the file is a good approach, but storing the files in a DB may be more flexible.
 
Darren Edwards
Ranch Hand
Posts: 69
Then do not store them under the original filename. You must be assigning some ID to the file to store multiples in the DB, so why not just call the file (on the filesystem) after the ID and store the 'real' name in the DB?

Personally I prefer to generate a random name for the file and keep the real filename in the DB.
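
A rough sketch of that naming scheme, with an in-memory map standing in for the database table. The class `UploadRegistry`, its `Entry` fields and the use of `UUID` for the random name are all assumptions made for illustration:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// In-memory stand-in for a DB table that maps stored names to file metadata.
class UploadRegistry {

    // One row per uploaded file (hypothetical columns).
    static final class Entry {
        final String storedName;    // random name used on the filesystem
        final String originalName;  // the 'real' name the user uploaded
        final String ownerId;       // user who created the file

        Entry(String storedName, String originalName, String ownerId) {
            this.storedName = storedName;
            this.originalName = originalName;
            this.ownerId = ownerId;
        }
    }

    private final Map<String, Entry> byStoredName = new HashMap<>();

    // Assigns a random on-disk name so equal original names never collide.
    Entry register(String ownerId, String originalName) {
        String storedName = UUID.randomUUID().toString();
        Entry entry = new Entry(storedName, originalName, ownerId);
        byStoredName.put(storedName, entry);
        return entry;
    }

    // Looks up the metadata for a stored name, or null if it is unknown.
    Entry find(String storedName) {
        return byStoredName.get(storedName);
    }

    public static void main(String[] args) {
        UploadRegistry registry = new UploadRegistry();
        Entry first = registry.register("user-1", "company.doc");
        Entry second = registry.register("user-1", "company.doc");
        System.out.println(first.storedName + " -> " + first.originalName);
        System.out.println(second.storedName + " -> " + second.originalName);
    }
}
```

Two uploads called company.doc end up as two different files on disk, and the download code can still present the original name to the user.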
 
Joshua Cloch
Ranch Hand
Posts: 95
Thanks Darren.

Indeed, that is a solution.

Could anybody compare these two kinds of storage? Which one is preferred in which real-life situations?
 
Joshua Cloch
Ranch Hand
Posts: 95
Plus, if we use the file system instead of storing the data in the DB, how can I secure these files? I mean, if someone gets hold of the path of a file, say http://localhost:8080/reports/company.doc, it will be very easy for them to download it.
 
Bear Bibeault
Author and ninkuma
Marshal
Posts: 65826
I would personally not use the DB to store files. I would use Darren's file reference solution for the reasons that he stated.

And, as he said, do not store the files where they are addressable via URL.
 
Bear Bibeault
Author and ninkuma
Marshal
Posts: 65826
Personally I would not store the files in the DB. I'd use Darren's suggestion of storing file references in the DB to files stored in the filesystem in locations that are not addressable via URL.
 
Joshua Cloch
Ranch Hand
Posts: 95
I am afraid that using encryption or a random file name is still not secure.

In real life, we can download the .css files of many websites just from the information in the "head" tag. If a person knows the file system layout of a website, don't you think he could work out the path of a file and download it?
 
Bear Bibeault
Author and ninkuma
Marshal
Posts: 65826
You're not paying attention.

If the file is stored outside of the web app, and a servlet is used to serve the file from the filesystem after checking credentials, where is the security problem? There is nothing the user can discern about the location of the actual file by looking at HTML source, and even if they could, there's no way to directly address it via URL.
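
The check such a download servlet would run before streaming anything can be sketched in plain Java. The servlet wiring is left out, and the class `DownloadGuard`, its methods and the in-memory owner map (standing in for the DB lookup) are hypothetical names for illustration only:

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Decides whether a user may read a stored file and where that file lives.
class DownloadGuard {
    private final Path storageRoot;  // directory outside the web application
    private final Map<String, String> ownerByFileId = new HashMap<>();  // stands in for the DB

    DownloadGuard(Path storageRoot) {
        this.storageRoot = storageRoot.toAbsolutePath().normalize();
    }

    // In a real application this would be a row written at upload time.
    void recordOwner(String fileId, String ownerId) {
        ownerByFileId.put(fileId, ownerId);
    }

    // Returns the file path only if the user owns the file and the
    // resolved path stays inside the storage root.
    Optional<Path> resolveFor(String userId, String fileId) {
        String owner = ownerByFileId.get(fileId);
        if (owner == null || !owner.equals(userId)) {
            return Optional.empty();  // unknown file or not the owner
        }
        Path candidate = storageRoot.resolve(fileId).normalize();
        if (!candidate.startsWith(storageRoot)) {
            return Optional.empty();  // refuse anything that escapes the root
        }
        return Optional.of(candidate);
    }

    public static void main(String[] args) {
        DownloadGuard guard = new DownloadGuard(Paths.get("uploads"));
        guard.recordOwner("file-1", "alice");
        System.out.println("alice: " + guard.resolveFor("alice", "file-1").isPresent());
        System.out.println("bob: " + guard.resolveFor("bob", "file-1").isPresent());
    }
}
```

The servlet would copy the file to the response only when `resolveFor` returns a path, and answer with an error status otherwise. The client only ever sees the file ID, never the location on disk.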
 
Jaime M. Tovar
Ranch Hand
Posts: 133
I work at a company that stores a lot of files, sometimes 20 gigs a month, which is around 80000 files a month. We use the DB blob approach because we let the DB handle all the space, names (generated random names), network sharing, security and searchability. The database internally generates blobs that are files in the filesystem, and I think that is a common approach in many databases. It automates the hash name generation, the backup, mirroring, etc. It is also a more robust approach when working with distributed apps. Let's say you have a web server farm: it is easier to share the documents if they are in a database than if they are in one server's file system (I mean it is difficult, not impossible). My recommendation: let the database handle the data, and let your code handle the business logic.

 
Joshua Cloch
Ranch Hand
Posts: 95
Excellent responses, thank you, guys!
That'll help me a lot.
Can somebody explain a little more about how to handle these files in the DB?
When a user uploads a file, should it be stored in a directory first, then transferred to the DB, and finally deleted from the directory? Is that the whole workflow? Or can the file be stored in the DB directly when it is uploaded?
 
Jaime M. Tovar
Ranch Hand
Posts: 133
Some frameworks give you file upload utilities, Struts and Commons to name two. They normally store the upload in a temp file and give you a handle to it. Others keep it in memory and give you an object reference. In both cases you can create a byte stream and handle it via a java.sql.Blob to store it in a database. Just look for a file upload tutorial on Google; there are many, and there are also tutorials about using blobs in Java. Take care, because some JDBC implementations can be tricky about the blob stuff.
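
As a rough illustration of the direct route, the upload stream can be handed straight to a JDBC insert without first copying it into a directory of your own. This sketch uses `PreparedStatement.setBinaryStream` rather than building a `java.sql.Blob` by hand; the table `uploaded_file`, its columns and the class `BlobUploadDao` are made-up names, and some drivers may not support the length-less `setBinaryStream` overload.

```java
import java.io.InputStream;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical DAO; assumes a table
// uploaded_file(owner_id, original_name, content) with a BLOB content column.
class BlobUploadDao {
    private static final String INSERT_SQL =
            "INSERT INTO uploaded_file (owner_id, original_name, content) VALUES (?, ?, ?)";

    // Streams the uploaded bytes into the BLOB column and returns the
    // number of rows inserted. The caller owns the connection and the stream.
    static int save(Connection connection, String ownerId,
                    String originalName, InputStream content) {
        try (PreparedStatement statement = connection.prepareStatement(INSERT_SQL)) {
            statement.setString(1, ownerId);
            statement.setString(2, originalName);
            statement.setBinaryStream(3, content);  // driver reads the stream on execute
            return statement.executeUpdate();
        } catch (SQLException e) {
            throw new IllegalStateException("Could not store the upload", e);
        }
    }
}
```

Here `content` would be whatever stream the upload library exposes for the uploaded item, whether it is backed by a temp file or by memory.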
 
Joshua Cloch
Ranch Hand
Posts: 95
Thanks again!
 