This week's book giveaway is in the Artificial Intelligence forum.
We're giving away four copies of Pragmatic AI and have Noah Gift on-line!
See this thread for details.
Win a copy of Pragmatic AI this week in the Artificial Intelligence forum!
  • Post Reply Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other all forums
this forum made possible by our volunteer staff, including ...
Marshals:
  • Jeanne Boyarsky
  • Liutauras Vilda
  • Campbell Ritchie
  • Tim Cooke
  • Bear Bibeault
Sheriffs:
  • Paul Clapham
  • Junilu Lacar
  • Knute Snortum
Saloon Keepers:
  • Ron McLeod
  • Ganesh Patekar
  • Tim Moores
  • Pete Letkeman
  • Stephan van Hulst
Bartenders:
  • Carey Brown
  • Tim Holloway
  • Joe Ess

Storing Parsed Web Data  RSS feed

 
Greenhorn
Posts: 10
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Hello everyone/anyone :-)

I'm working on a network application that reads web data, parses it and processes it with conditional logic. Now, I'm wanting to store the data so I can run analysis on historic data at some point in the future. The book I have isn't very telling and I'm not able to find conclusive support for a particular method online. What I'm considering now is storing these values in a tab-delimited text file since I don't have any local database on my PC. Ultimately, though, I would like to be able to use Java to take the data directly from the text file (if this is where I should store it for now) and insert it into a table in SQL Server so I can query it. However, since I'm still far off from using SQL Server, is storing in a text file the best practice? Also, would it be impractical to take the data from the text file and insert it into a database table someday? Please share your thoughts. Thanks in advance!
 
Sheriff
Posts: 23649
49
Eclipse IDE Firefox Browser MySQL Database
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
As far as transferring the data from a text file to a database table, that should be pretty easy. Read a line from the text file, extract the data from the line, insert the data into the database table, repeat until no more lines.

But as for your design questions: As usual the design would depend on several things. For example, how much data do you have? If your text file ends up being several megabytes in size then scanning through it to find one particular row out of one million is not going to be practical. And you said it was historic data, which sounds like that means the data is not going to change. But is that a correct assumption? If the data changes frequently then you'd have to rebuild the text file frequently too. But if it changes, say, annually, then that probably wouldn't be a big deal.
 
Bartender
Posts: 19673
92
Android Eclipse IDE Linux
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
You could do worse than to store data in a TSV file. Just remember 2 things:

1. When you create the file make the path absolute, don't use a relative directory path. The concept of "current directory" isn't valid for a webapp.
2. Absolutely do not store the file inside the WAR or the webapp server directories. Storing in the webapp server directory tree pollutes the product. Storing inside the WAR can fail if the WAR hasn't been exploded and if it doesn't the entire file may get deleted if the WAR is updated.

Other options. If your data is hierarchical instead of flat, consider using YAML format. It's easy for Java to parse YAML.

Linux people have an advantage. The SQLite DBMS is almost guaranteed to be installed on a Linux box because a lot of system utilities use it for lightweight storage needs. There is a free version of Microsoft SQL Server, although it's not that lightweight. Alternatively you could install something like Apache Derby for testing purposes.
 
Paul Adams
Greenhorn
Posts: 10
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Gentlemen, thank you for your advice and recommendations – it’s all very useful to me. I like the idea of using Apache Derby a lot. I’ll probably work on building some data models using that and then move them to a local DB when the data becomes large.

Currently, I have about 25 rows of completely flat integer and double data, three columns wide. The data will never change, but will be appended to. I haven’t decided for how long the data will be stored, but probably for about a year. However, I don’t think I would collect more than maybe 100k rows in this timeframe.

Also, I did decide to separate the values with commas; my reasoning is that I’ll more confident about the consistency of my data if I’m parsing at commas instead of blanks.

Thanks again!
 
  • Post Reply Bookmark Topic Watch Topic
  • New Topic
Boost this thread!