Is it feasible to save a lot of text in a database?

 
Alan Smith
Ranch Hand
Posts: 185
Hi,

Not to give you my life story, but I want to become a writer and start writing short stories, etc. I'm also not too bad at Java, so I have decided to write a "book writing" program to improve my skills while I am at it, and also because I would use it. If I find there is a feature I would like, I can just add it myself; that's the goal anyway! Basically I want to know (not having much experience with databases) whether it is feasible to store the text of chapters, character info, etc. in a database. Character info such as age, name, etc. is an obvious candidate for database storage, but whole chapters of possibly thousands of words may not be. I just feel that a database would be far better for persistent storage than, say, files all over my machine. Is it even possible for a database to store that much text in one entry? I just think that a "database per book" approach would be much easier to manage in the long run. Also, I would have put this in the database forum, but I don't know if it's specific enough. Any suggestions would be much appreciated!

P.S. Yes, I could use MS Word, etc., but I want to boost my Java skills (and SQL skills if need be), so why not try to write my own text editor and fit it to my needs?
 
Winston Gutkowski
Bartender
Posts: 10575
Alan Smith wrote:Basically I want to know (not having too much experience with databases) if it is feasible to store the text from chapters, character info, etc into a database. Character info such as age, name, etc is an obvious candidate for database storage but whole chapters of possibly thousands of words may not be. I just feel that a database would be far better for persistent storage than say files all over my machine. Is it even possible for a database to store that much text in an entry?

Easy peasy Japanesey.
It's what databases were designed to do. The VARCHAR type alone can usually store over 1,000 characters (and almost always at least 255).

But here's the rub: SQL datatypes are NOT standardized, either in terms of their storage capacity or (in some cases) their name (MS products, as usual, are particularly bad on this score). For example, a TEXT field in MS-Access will only hold 255 characters; on MySQL it will hold 65K.

They can even store things called BLOBs (Binary Large Objects), which can store practically anything you like, including a Word document.

Alan Smith wrote:I just think that if I went about a "database per book" method it would be much easier to manage in the long run.

You could do, but if you want to utilize the database's power, it'd probably be better to store all your books in one database. You could then break them down into books and chapters and categories... anything you like really. Most databases these days also have some very nifty indexing features for text searches.
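
To make the "all your books in one database" idea a bit more concrete, here is a minimal in-memory sketch in Java. It is only an illustration under assumptions: the `ChapterRow` and `Library` classes and their fields are hypothetical names standing in for a "chapter" table with a book ID column, and the linear scan in `search` is a stand-in for whatever text indexing a real database offers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** One row of a hypothetical "chapter" table; bookId acts like a foreign key to a "book" table. */
final class ChapterRow {
    final int bookId;
    final int number;
    final String text;

    ChapterRow(int bookId, int number, String text) {
        this.bookId = bookId;
        this.number = number;
        this.text = text;
    }
}

/** In-memory stand-in for a single database that holds every book. */
final class Library {
    private final List<ChapterRow> chapters = new ArrayList<>();

    void addChapter(int bookId, int number, String text) {
        chapters.add(new ChapterRow(bookId, number, text));
    }

    /** Roughly what "SELECT ... FROM chapter WHERE book_id = ?" would return. */
    List<ChapterRow> chaptersOf(int bookId) {
        List<ChapterRow> result = new ArrayList<>();
        for (ChapterRow row : chapters) {
            if (row.bookId == bookId) {
                result.add(row);
            }
        }
        return result;
    }

    /** Case-insensitive phrase search across all books (a plain scan, not a real text index). */
    List<ChapterRow> search(String phrase) {
        String needle = phrase.toLowerCase(Locale.ROOT);
        List<ChapterRow> hits = new ArrayList<>();
        for (ChapterRow row : chapters) {
            if (row.text.toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(row);
            }
        }
        return hits;
    }
}
```

Because every chapter carries its book ID, one search covers all books at once, which is the main thing a "database per book" layout would make awkward.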

Winston
 
Alan Smith
Ranch Hand
Posts: 185
Winston Gutkowski wrote:
But here's the rub: SQL datatypes are NOT standardized, either in terms of their storage capacity or (in some cases) their name (MS products, as usual, are particularly bad on this score). For example, a TEXT field in MS-Access will only hold 255 characters; on MySQL it will hold 65K.


Great, MS Access was what I planned on using because of its simplicity as a DBMS. The databases are stored locally like files. I'll try SQL Workbench and see if that is as easy to use. Do you know of any other free DBMS applications that are easy to use? Cheers for the info.
 
Campbell Ritchie
Marshal
Posts: 56536
For text you would consider a CLOB, which is like a BLOB, but the C means “character”.
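
A rough sketch of what reading and writing such a column could look like through JDBC, for anyone following along. Everything schema-related here is an assumption: the `chapter` table, its `id` and `body` columns, and the `ChapterClobDao` class are made-up names, the large-text type is called different things in different products, and you would need a JDBC driver for your chosen database on the classpath to get a `Connection` in the first place.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

/** Hypothetical helper; assumes a table like chapter(id INT PRIMARY KEY, body CLOB). */
final class ChapterClobDao {

    static final String INSERT_SQL = "INSERT INTO chapter (id, body) VALUES (?, ?)";
    static final String SELECT_SQL = "SELECT body FROM chapter WHERE id = ?";

    static void saveChapter(Connection conn, int id, String text) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            ps.setInt(1, id);
            // Stream the text as a parameter rather than pasting it into the SQL string.
            ps.setCharacterStream(2, new StringReader(text), text.length());
            ps.executeUpdate();
        }
    }

    /** Returns the chapter text, or null if the row is missing or the column is NULL. */
    static String loadChapter(Connection conn, int id) throws SQLException, IOException {
        try (PreparedStatement ps = conn.prepareStatement(SELECT_SQL)) {
            ps.setInt(1, id);
            try (ResultSet rs = ps.executeQuery()) {
                if (!rs.next()) {
                    return null;
                }
                try (Reader reader = rs.getCharacterStream(1)) {
                    return reader == null ? null : readAll(reader);
                }
            }
        }
    }

    /** Drains a character stream into a String. */
    static String readAll(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buffer = new char[8192];
        int n;
        while ((n = reader.read(buffer)) != -1) {
            sb.append(buffer, 0, n);
        }
        return sb.toString();
    }
}
```

For chapter-sized text many drivers will also accept a plain `setString`/`getString`; the streaming calls are shown only because they are the usual route for large character data.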
 
Winston Gutkowski
Bartender
Posts: 10575
Alan Smith wrote:Great, MS Access was what I planned on using because of its simplicity as a DBMS. The databases are stored locally like files. I'll try SQL Workbench and see if that is as easy to use. Any other free DBMS applications you know are easy to use? Cheers for the info.

MySQL is pretty good, and it's a fully fledged database. It's also been around for a while. There are also Java-based DBs that plug right into your jar, like Derby (also distributed as JavaDB). And then of course there's DB2, which has been around since shortly after the K-T Extinction event.

This page has quite a few open-source DBs, but I can't vouch for any of them except Derby.

Winston
 
Alan Smith
Ranch Hand
Posts: 185
Campbell Ritchie wrote:For text you would consider a CLOB, which is like a BLOB, but the C means “character”.

Thanks I will look into it. And cheers for the pointer Winston.
 
Martin Vajsar
Sheriff
Posts: 3752
Alan Smith wrote:Great, MS Access was what I planned on using because of its simplicity as a DBMS. The databases are stored locally like files. I'll try SQL Workbench and see if that is as easy to use. Any other free DBMS applications you know are easy to use? Cheers for the info.

A few notes:

- MS Access has the MEMO field type, which can hold large texts (I don't know what the size limit is, but it should be easy to look up).

- Some commercial databases offer a free version. Oracle has the XE edition, a free-of-charge edition of MS SQL Server exists (though I don't remember the name), and the same might be true for others. Usually there are some limitations regarding the total size of the data, memory usage, CPU usage or concurrent access, but I don't think these would pose any problems for you.

- Before you settle on a product, do a small test: try to store several gigabytes of text in it (you can generate pure gibberish, or store a single story several thousand times). Make sure you'll be able to back up the resulting database effectively. (You plan to back up your work, don't you?)

- I'd say a database per book is not a very good solution in the long run. I can imagine you'd want to search for a phrase you wrote several years ago without remembering exactly where; having to go through umpteen databases for this is not exactly an attractive idea to me.

- What is the driving reason to put your work into a database? It is certainly doable, it might even help you (or hurt you, depending on your future needs) by enforcing a unified formal structure over your whole work, but you don't need to run a database just for that. What benefits do you see? Aren't you actually looking for some kind of version control software instead?
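
For the load test mentioned in the notes above, a small generator of filler text might be handy. This is just a sketch: the `Gibberish` class and its `generate` method are made-up names, and the word lengths and alphabet are arbitrary choices.

```java
import java.util.Random;

/** Hypothetical helper that produces filler text for a storage test. */
final class Gibberish {

    /** Returns exactly {@code length} characters of random lowercase "words" separated by spaces. */
    static String generate(int length, long seed) {
        Random random = new Random(seed); // fixed seed makes a test run repeatable
        StringBuilder sb = new StringBuilder(length);
        while (sb.length() < length) {
            int wordLength = 1 + random.nextInt(10);
            for (int i = 0; i < wordLength && sb.length() < length; i++) {
                sb.append((char) ('a' + random.nextInt(26)));
            }
            if (sb.length() < length) {
                sb.append(' ');
            }
        }
        return sb.toString();
    }
}
```

To reach gigabytes you would call this in a loop with chapter-sized chunks (say, a few tens of thousands of characters each, with a different seed per chunk) and insert each chunk as its own row, rather than trying to build one enormous string.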
 
Alan Smith
Ranch Hand
Posts: 185
Martin Vajsar wrote:
- What is the driving reason to put your work into a database? It is certainly doable, it might even help you (or hurt you - depending on your future needs) by enforcing a unified formal structure over your whole work, but you don't need to run a database just for that. What benefits do you see? Aren't you actually looking for some kind of version control software instead?


No, not at all, but it never actually occurred to me to use version control, although it's a great idea now that you mention it. I was just weighing the pros and cons of storing large amounts of text without cluttering up my system with files, where it could get messy. The idea for my GUI is a tabbed pane where each tab I add has a text area for writing. Each tab represents a chapter. I have buttons to add or remove any chapter at any stage if I don't like something, and other text areas for rough work, etc. My thinking behind the DB is that I can loop through the tabs, get the text from each tab's text area, and load them into a DB table one by one, possibly backing up previous versions to another table in the process. Wouldn't it be easier to give each chapter an ID representing the chapter number than to rename files all the time? I think it would be far better than having a convention to manage files representing the chapters. That's the part I am working on now; the rest can wait : )
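
The save loop described above could be sketched roughly like this. It is a simplified illustration, not a finished design: `ChapterStore` is a hypothetical class, two in-memory maps stand in for the "current" and "previous versions" tables, and the Swing side is left out, so the list passed to `saveAll` is assumed to hold the text already read from each tab's text area, in tab order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical store: one map plays the chapter table, the other the backup table. */
final class ChapterStore {
    private final Map<Integer, String> current = new HashMap<>();
    private final Map<Integer, List<String>> previous = new HashMap<>();

    /** Saves one chapter; if the text changed, the old text is kept as a previous version. */
    void save(int chapterNumber, String text) {
        String old = current.put(chapterNumber, text);
        if (old != null && !old.equals(text)) {
            previous.computeIfAbsent(chapterNumber, k -> new ArrayList<>()).add(old);
        }
    }

    /** Saves every tab's text; the tab at index 0 becomes chapter 1, and so on. */
    void saveAll(List<String> tabTexts) {
        for (int i = 0; i < tabTexts.size(); i++) {
            save(i + 1, tabTexts.get(i));
        }
    }

    String load(int chapterNumber) {
        return current.get(chapterNumber);
    }

    /** Older versions of a chapter, oldest first; empty if it was never overwritten. */
    List<String> history(int chapterNumber) {
        List<String> versions = previous.getOrDefault(chapterNumber, Collections.<String>emptyList());
        return Collections.unmodifiableList(versions);
    }
}
```

One thing this sketch glosses over: if chapters are keyed by tab position, removing a tab in the middle renumbers everything after it, so a stable chapter ID separate from the display order may be worth considering.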
 
Martin Vajsar
Sheriff
Posts: 3752
Alan Smith wrote:Just the pros and cons of storing large amounts of text and without having to clutter up my system with files where it could get messy.

Managing thousands of files is certainly doable. If it were me, I'd build a database that would allow me to store keywords, tags and other attributes together with the full file path, or search for software that already does that (there must be some).

Example: we've got a small, home-grown issue-tracking database in MS Access. We keep accompanying documentation in Word (Excel, Visio, ...) files on the disk. The database generates file names for new documents (incorporating an issue ID and author ID into the name) and allows us to list files related to the selected issue and easily open them. Closed issues' documents are moved to an archive directory. I imagine a system similar to this might serve you well.
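
As an illustration of generated file names of that kind, here is a small Java sketch. The naming pattern, the `DocumentNames` class and its parameters are all invented for the example; the actual convention used in the Access database is not described in this thread.

```java
import java.util.Locale;

/** Hypothetical naming helper; the pattern is an assumption, not a real convention. */
final class DocumentNames {

    /** Builds a name such as issue-00042_mv_login-page-crash.docx from its parts. */
    static String fileName(int issueId, String authorId, String title, String extension) {
        String slug = title.trim()
                .toLowerCase(Locale.ROOT)
                .replaceAll("[^a-z0-9]+", "-")   // collapse anything unusual into a hyphen
                .replaceAll("(^-|-$)", "");      // drop a leading or trailing hyphen
        return String.format(Locale.ROOT, "issue-%05d_%s_%s.%s", issueId, authorId, slug, extension);
    }
}
```

For a writing tool the same trick would work with a book ID and chapter number in place of the issue and author IDs, with the database row holding the generated path.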

I don't want to dampen your enthusiasm, but imagine this: you'll want to send your book to a publisher/editor for review. You'll need to export your book into a format your publisher can use, and when he sends it back with remarks and corrections, you'll need to import that back into your system. That (mainly the import phase) is LOTS of work, and honestly, it is pretty cumbersome. You could try to automate the import, however - let me recall - did you intend to write software or books?
 
Winston Gutkowski
Bartender
Posts: 10575
Alan Smith wrote:Wouldn't it be easier to give each chapter an id representing chapter number than renaming files all the time? I think it would be far better than having a convention to manage files representing the chapters. That's the part I am working on now, the rest can wait : )

I don't see any reason why you can't have both (db and versioning, that is).

Martin's point is well made though - versioning can be a pain in the ____, and having a ready-made solution (they come under the collective term of "revision control systems") takes a lot of the tedium out of writing versioning code. It's been a while since I checked all the possibilities, but I wouldn't be at all surprised if at least one of the major names offers the capability to save/pull from an SQL database rather than a filesystem; although you may be restricted as to choice. This page contains some comparison criteria for you, but DB storage isn't on the list, so you may have to check out the websites. Just FYI, I believe the two biggest freeware names are still CVS and Subversion (SVN).

Another thing to think about is formatting. Are you only interested in the text, or will you need it in some sort of "printable" form? If so, you may need to think about storing style sheets (or something like them), along with the text.

Winston
 
Alan Smith
Ranch Hand
Posts: 185
- Martin, I intend to write both : ) I have a few journals, notebooks, etc., but the stuff is all over the place: plot info, character info, themes, rough work, etc. I have had the idea for a while now to keep it all in one centralised place, and what better than a program I write for my own needs? It only dawned on me the other day, believe it or not! I haven't got much experience with Java, although I understand it (I'm OCJP certified), so this is the only way I will learn, and the project has lots of aspects to it. I have thought about printing, file export/import (PDF particularly), a built-in email feature so I can email sections to whoever, etc. as I go along. I want something solid at each step, and I don't want to move on until the latest feature is fully working. Versioning is something that I might look into later, but a one-off backup would do for now as long as it works.

- Winston, what do you mean by style sheets? As in CSS? Really I'm just interested in the text and will eventually add buttons to change the font style (italics, bold, etc.).

Cheers for all the help, lads. I didn't think I would get this much!
 
fred rosenberger
lowercase baba
Bartender
Posts: 12563
I am not a writer, but this seems like an odd use of a database to me. It sounds like you are trying to use a database for the wrong purpose.

Have you looked at getting software that is designed for writers? A quick Google search turned up this page, which lists ten or so options ranging from $280 to under $30. Something like that may be better than a home-grown database.

Just my 2 cents.
 
Alan Smith
Ranch Hand
Posts: 185
fred rosenberger wrote:I am not a writer, but this seems like an odd use of a database to me. It sounds like you are trying to use a database for the wrong purpose.

Have you looked at getting software that is designed for writers? a quick google search turned up this page, which lists ten or so options ranging from $280 to under $30. Something like that may be better than a home-grown database.

just my 2-cents.


I actually have Dramatica Pro, but I don't like it at all, and that cost $300. I'm trying to improve my Java skills as well, so what's the harm? It's a nice project.
 
Winston Gutkowski
Bartender
Posts: 10575
Alan Smith wrote:Winston, what do you mean by style sheets? As in CSS? Really I'm just interested in the text and will eventually add in buttons to change the font (italics, bold, etc).

Precisely. It's probably good that you're not concerning yourself with that at the moment (Programmer); but at some point you may well want to reconstruct printable versions of your text (Salesman) and be able to send it to a thousand companies (IT Engineer).

Alan Smith wrote:Cheers for all the help lads I didn't think I would get this much!

Hey, all part of the service (the Friendly Java forum). Most of us old farts love a question we can get our teeth into.

Winston
 
Paul Clapham
Sheriff
Posts: 22828
Alan Smith wrote:I'm trying to improve my Java skills as well so what's the harm, it's a nice project.


Yes, a project which is something that you would actually use in real life is an excellent choice. However I think you should consider using the project to improve your design skills as well. In other words, I think you're jumping ahead a bit. Your requirements so far seem to be "I want to store my text in a database"... you need something more detailed than that. What do you expect to do with the text once it's in the database? Sketch out a few scenarios of things you might want to do, then design the database accordingly. Remember, if your database design is a dud then your Java code is going to consist largely of workarounds dealing with the database.
 
Alan Smith
Ranch Hand
Posts: 185
Paul Clapham wrote:

Your requirements so far seem to be "I want to store my text in a database"...


Not at all, I just need somewhere to store the chapters, character info, etc. related to each story. I figure a database would be better than writing files for everything. I'm working out the design of the GUI too (features, layout, etc.). I'm going to do it all in Swing and have already made a start. Some things I am still figuring out, but I will get them in time. I just posted here to ask for storage suggestions for large amounts of text. So far that's all I want to get working. There is no point in writing into a writing program and not being able to store what you have done! The sooner I get that done the better. Other features can wait. That's just my way of looking at it. Feel free to suggest other plans of attack!

- Winston, I don't get what I would need CSS for in printable versions?
 
Winston Gutkowski
Bartender
Posts: 10575
Alan Smith wrote:Winston I don't get what I need CSS for for printable versions?

I don't know that you specifically need CSS - in fact, as I said, I'm not sure that you need anything other than raw text for the moment - but it may be worth thinking about. I'm presuming that at some point you're going to need to produce that text in some sort of "pretty" form (e.g., for a publisher?), and style sheets are one way of separating the "look" from the content. That way, you can update your text without having to worry about mucking up the formatting.

In fact, I notice from the DB2 website that it can handle XML documents directly (and that facility works on their free version too), so you might be able to store your docs as 'web-ready' text. I've never used the facility though, so I can't comment on how good it is.
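
To show what "separating the look from the content" can mean in practice, here is a minimal Java sketch that turns stored plain text into an HTML page which only links to a style sheet. It is one possible approach under assumptions: `ChapterHtml`, `render` and the `book.css` file name are hypothetical, and paragraphs are assumed to be separated by blank lines in the stored text.

```java
/** Hypothetical renderer: the stored text stays plain, the style sheet carries the formatting. */
final class ChapterHtml {

    /** Escapes the characters that have special meaning in HTML. */
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /** Wraps each blank-line-separated paragraph in a p element and links the given style sheet. */
    static String render(String title, String text, String stylesheetHref) {
        StringBuilder html = new StringBuilder();
        html.append("<!DOCTYPE html>\n<html>\n<head>\n")
            .append("<meta charset=\"utf-8\">\n")
            .append("<title>").append(escape(title)).append("</title>\n")
            .append("<link rel=\"stylesheet\" href=\"").append(escape(stylesheetHref)).append("\">\n")
            .append("</head>\n<body>\n")
            .append("<h1>").append(escape(title)).append("</h1>\n");
        for (String paragraph : text.split("\\R\\s*\\R")) {
            String trimmed = paragraph.trim();
            if (!trimmed.isEmpty()) {
                html.append("<p>").append(escape(trimmed)).append("</p>\n");
            }
        }
        html.append("</body>\n</html>\n");
        return html.toString();
    }
}
```

Changing fonts or margins then means editing the style sheet only; the chapter text in the database is never touched.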

Winston
 