Corey McGlone

Ranch Hand, since Dec 20, 2001

Recent posts by Corey McGlone

This works:



...as does this:



This does not:



The latter gives me a compiler error stating that the method setParameters(String[]) is not applicable for the arguments (String, String, String).

Any ideas as to why the shortcut syntax is not allowed when used as a parameter to a method?
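The missing snippets were presumably along these lines (the setParameters name and its String[] signature come from the error message above; the class and variable names are assumptions):

```java
public class ArrayInitDemo {
    public static int setParameters(String[] params) {
        return params.length;
    }

    public static void main(String[] args) {
        // This works: the shorthand initializer is legal in a declaration.
        String[] params = {"one", "two", "three"};

        // ...as does this: an explicit array creation expression.
        int n = setParameters(new String[]{"one", "two", "three"});

        // This does not compile -- a bare {...} initializer is only allowed
        // in a variable declaration or after new String[], never directly
        // as a method argument (JLS 10.6):
        // setParameters({"one", "two", "three"});

        System.out.println(params.length + n); // 6
    }
}
```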

On a side note, I've gotten around the issue by changing the signature of generator.setParameters to be:



That allows me to do this:



I'm just curious why the notation above doesn't work in that scenario. Thanks.
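The varargs workaround described above would look roughly like this (names are assumptions):

```java
public class VarargsDemo {
    // Declaring the parameter as String... lets callers pass a bare
    // comma-separated list; the compiler packs it into a String[] for you.
    public static int setParameters(String... params) {
        return params.length;
    }

    public static void main(String[] args) {
        System.out.println(setParameters("one", "two", "three")); // 3
    }
}
```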
8 years ago

Scott Selikoff wrote:Normalize the data so that the answers and the questions are in separate tables. For example, if a question has one answer, that would be 2 rows: one in the question table and one in the answer table. If a question has 2 answers, that would be 3 rows: one in the question table and two in the answer table.



That's exactly what we're shooting for. We have a question table and we have an answer table, set up just as you've described.

The trouble is that we're trying to load a text file in which two answers are represented by a single row into that table structure. We're using SSIS to load the data so we're essentially using SQL. We've considered using a cursor to step over the rows in the file and processing each one but the amount of data we're working with is so large that it prevents us from using a cursor - it's just too slow.

What we have now is a "load table" that looks like this:



...and we want to translate that into this structure:

We're trying to load a text file into a SQL Server database and ran into a bit of a challenge. The text file contains answers to questions on a questionnaire and each line represents a single response, so it'd look something like this:



The trouble we face is that, in some cases, the questions may be "check all that apply", as opposed to just "check one answer". Those types of questions can have multiple answers and the file we receive notes them like this:



In this case, the user selected both answers 4 and 5 for question 456. In the end, we'd like to have this represented by two rows in the database, one with answer #4 and one with answer #5.

Anyone have any slick ideas for how we might make that happen using SQL?
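One set-based option, as a sketch only — the actual file layout isn't shown above, so the load-table shape and every name here are assumptions. If the multi-answer field lands in a known, small number of columns, one UNION ALL per column splits each row without a cursor:

```sql
-- Hypothetical load table: LoadRow(respondentId, questionId, answer1, answer2)
INSERT INTO Answer (respondentId, questionId, answerValue)
SELECT respondentId, questionId, answer1
FROM   LoadRow
WHERE  answer1 IS NOT NULL
UNION ALL
SELECT respondentId, questionId, answer2
FROM   LoadRow
WHERE  answer2 IS NOT NULL;
```

Each SELECT stays set-based, so SQL Server can process the whole load in one statement rather than row by row.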
Thanks for the replies, folks. I had seen that method there, but I guess I never paid any attention to what it did. As an addendum to Rob's reply, here's the actual implementation of the method ArrayList.toArray(T[] t):

8 years ago
This is something that has long irritated me, but I've never been able to find a solution. If I have a List of apples and I want to turn it into an array of apples, I have to do this:



I'd love to be able to do this:



Unfortunately, this doesn't work because the toArray method returns an array of Objects, not an array of Apples. So is there any better way to do this? Having to write that silly for loop any time I want to make this happen seems so wasteful.
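For reference, the typed overload List.toArray(T[]) — the method the replies above point to — avoids the loop entirely. A sketch with Strings standing in for apples:

```java
import java.util.ArrayList;
import java.util.List;

public class ToArrayDemo {
    public static void main(String[] args) {
        List<String> apples = new ArrayList<>();
        apples.add("Fuji");
        apples.add("Gala");

        // Passing a typed (even zero-length) array tells toArray which
        // component type to allocate, so no cast or manual loop is needed.
        String[] arr = apples.toArray(new String[0]);

        System.out.println(arr.length); // 2
    }
}
```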

Thanks.
8 years ago

Jeanne Boyarsky wrote:Whereas this one says you should use the table name rather than the alias.



It did, after all, turn out to be just this simple. The table alias in the UPDATE statement seemed to be the problem and, once I took it out, the query ran perfectly, like this:



Thanks for the assistance.

Jeanne Boyarsky wrote:That looks close to being right. What error are you getting?

One issue I see is that "SELECT endDate" is ambiguous. I think you mean ig.enddate there.



Good point, although that's not the error I'm seeing. The error I get is:

Incorrect syntax near 'i'



It doesn't seem that the alias is well-liked in that position.
So I have two tables that look like this:



I want to write a query that will take all records in the Indicator table that have a NULL endDate value and copy the value of endDate in the IndicatorGroup table that correspond to that indicator over, but my syntax is apparently no good. Here's what I have:



Any ideas how else I might be able to do this?
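A sketch of the SQL Server UPDATE...FROM form that the reply above suggests (table name rather than alias after UPDATE); the schema isn't shown in the post, so the column and join names here are assumptions:

```sql
-- Hypothetical schema: Indicator(indicatorId, groupId, endDate),
-- IndicatorGroup(groupId, endDate)
UPDATE Indicator
SET    endDate = ig.endDate
FROM   IndicatorGroup ig
WHERE  Indicator.groupId = ig.groupId
  AND  Indicator.endDate IS NULL;
```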

Thanks,
Corey
Just to close up this thread - I found my issue today. As usual, the problem lay in code that I wasn't even looking at. It was inside this method:



Inside that method, I was creating a CallableStatement and using it to invoke a stored procedure on the database I'm connected to. After processing, I was properly closing the ResultSet, but I had forgotten to close the CallableStatement. Adding cStmt.close() fixed all my memory issues.
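The same fix generalizes to try-with-resources (Java 7+), which closes the statement even when an exception is thrown mid-block. A runnable sketch with a stand-in class, since a real CallableStatement needs a live database:

```java
public class CloseDemo {
    static final StringBuilder log = new StringBuilder();

    // Stand-in for a CallableStatement; any AutoCloseable behaves the same.
    static class TrackedStatement implements AutoCloseable {
        void execute() { log.append("exec;"); }
        @Override public void close() { log.append("closed;"); }
    }

    public static void main(String[] args) {
        // close() runs automatically at the end of the block, even if
        // execute() throws -- no forgotten cStmt.close() possible.
        try (TrackedStatement stmt = new TrackedStatement()) {
            stmt.execute();
        }
        System.out.println(log); // exec;closed;
    }
}
```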

Thanks for all the help, everyone.
8 years ago

Steve Fahlbusch wrote:Use adaptive calculations. Bytes processed / bytes in file should act as a good approximation of the % complete.

Get the size of the file from the OS.

Keep a count of the bytes read. Bytes read / Size of file * 100 = percent of completion.



I didn't really figure this was where my issue was - after all, once the numbers are loaded into that list, the list doesn't grow. If I'm getting past this portion of my code (which I know I am), then this shouldn't be an issue.

That said, this is using memory that isn't absolutely necessary to use so I went ahead and tried this. I read in from my file one byte at a time and process each number as I get it. This got me all the way from 15% to 16% before dying with an OutOfMemoryError.

Given that, I'm pretty sure this isn't where my problem lies.
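For reference, Steve's bytes-read formula can be wrapped into a small counting stream. This is a sketch only, and all the names in it are mine:

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ProgressStream extends FilterInputStream {
    private long bytesRead;
    private final long totalBytes;

    public ProgressStream(InputStream in, long totalBytes) {
        super(in);
        this.totalBytes = totalBytes; // file size from the OS
    }

    @Override public int read() throws IOException {
        int b = super.read();
        if (b != -1) bytesRead++;
        return b;
    }

    @Override public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) bytesRead += n;
        return n;
    }

    // Bytes read / size of file * 100 = percent complete.
    public int percentComplete() {
        return (int) (bytesRead * 100 / totalBytes);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[200]; // stand-in for the uploaded file
        try (ProgressStream ps = new ProgressStream(
                new ByteArrayInputStream(data), data.length)) {
            ps.read(new byte[50], 0, 50);
            System.out.println(ps.percentComplete()); // 25
        }
    }
}
```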
8 years ago

John de Michele wrote:It would seem to me that keeping a running count would be just as accurate, and a much better use of resources. What happens when your file gets to 200,000 lines, or 2,000,000?



I do keep a running count of the number of lines I've processed (that's the variable "totalRowsProcessed"). But, in order to know my progress, which is expressed as a percentage, I need to know how many rows were in the original file as well.
8 years ago

Paul Clapham wrote:It isn't obvious to me why you have to read all 20,000 strings into memory before processing them. Couldn't you just do something like "read a string, process a string, until end of file"?



I've considered this, as well. The reason I read all the Strings (which are numeric values 6 to 8 digits long, so they're not horribly large) into memory is so that I can get an accurate count of how many there are. This allows me to provide progress statistics. I read the numbers into a list and then set the original file to null so the "net memory usage increase" should be negligible.

Even still, like you said, 20,000 Strings of that size shouldn't really be causing that much of an issue.
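One middle ground between the two approaches, sketched below: count the lines in a cheap first pass that retains nothing, then stream the real pass and report progress against that count. The helper names are mine:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

public class TwoPassDemo {
    // Pass 1: count lines without keeping any of them in memory.
    public static long countLines(BufferedReader reader) throws IOException {
        long count = 0;
        while (reader.readLine() != null) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        String file = "123456\n7654321\n99999999\n"; // stand-in for the ID file
        long total = countLines(new BufferedReader(new StringReader(file)));

        // Pass 2: re-open the source and process line by line, computing
        // progress as linesDone * 100 / total.
        System.out.println(total); // 3
    }
}
```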
8 years ago

fred rosenberger wrote:you can use the -Xmx and -Xms to increase your heap size. This may not be a permanent fix, but it might let you limp through your initial struggles.



I appreciate that but, unfortunately, that's not an option. This code is running as a J2EE application and is hosted on a box which I do not have control over. My chances of getting them to raise the heap size for my application on that box are...well...let's say I'd be better off playing the lottery.
8 years ago
I'm working on a utility that will take in a file full of ID numbers, check each ID number against a database to see if it has been modified, and then mail a CSV file back to the requester with the results. It's quite possible that the file being uploaded would be very large but, when I get to more than roughly 20,000 ID numbers in that file, I constantly run into OutOfMemoryErrors.

I've tried to do all sorts of things to reduce memory consumption but, no matter what I do, I always run out of memory at the same stage. Even when I make changes that should result in less memory usage, I don't see any change in performance, which I find very confusing. Here's a snippet of some of the relevant code I'm using. Any thoughts on what I could do to make this more efficient would be much appreciated.

I've tried periodically calling System.gc(), as well, but it made no impact on the result and degraded my performance considerably, so I've removed it.

Thanks, folks.

8 years ago
Okay, so a couple of my co-workers helped me come up with a solution, but we're not entirely certain why it works.

I had been focused on the client-side code, assuming that the issue was there, but it actually turns out that the problem was in the server. The first request sent to the server was to process a file - this request could take quite some time. During that time, this process places updates into the session so that other threads can monitor the progress. Subsequent requests attempt to pull those attributes from the session and send them back to the client code. When I looked at the session IDs on the server side, I found that the first request was pointing to one session (we'll call it session 1) while the subsequent requests were pointing to a different session (session 2). As such, the values were being set in one session and I was trying to read them from another session, which obviously isn't going to work.

We don't completely understand what's happening, but our best guess is that the first request creates a session but, until that request is complete (which could take quite some time), that session is not completely created. That means that any other requests that come in looking for a session won't see an already-created session and will attempt to create their own. One particularly odd part of this is that, while the file processing is taking place, that thread points to session 1 while the update threads point to session 2 - as soon as the file processing is completed, session 2 is dropped and all further update requests go to session 1. The result is that, from the client side, it looks like you hang at 0% complete and, when the file is done, you jump to 100% complete. Subsequent file submissions work perfectly fine as, at this point, the session has been completely created.

So the solution...

I added an update request to the document.ready handler. This really just pings the server and tries to get variables from the session that I know won't exist. The benefit is that it forces creation of the session and, when I go on to submit a file for processing and request updates on that process, all requests now point to that single, fully created, session.

If anyone has anything to add to this, I'd love to know if this reasoning is correct, or not. It seems to fit this scenario, but I can't be sure how accurate we are.

Thanks.