• Post Reply Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other Pie Elite all forums
this forum made possible by our volunteer staff, including ...
Marshals:
  • Campbell Ritchie
  • Jeanne Boyarsky
  • Ron McLeod
  • Paul Clapham
  • Liutauras Vilda
Sheriffs:
  • paul wheaton
  • Rob Spoor
  • Devaka Cooray
Saloon Keepers:
  • Stephan van Hulst
  • Tim Holloway
  • Carey Brown
  • Frits Walraven
  • Tim Moores
Bartenders:
  • Mikalai Zaikin

How to convert End Of line character of CSV file using java.

 
Ranch Hand
Posts: 658
2
Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
I am using some API and getting data stream, then i saved that data stream to a csv file using java.

Now the problem is that the End of Line character in this csv file is LF, but I want them to be CRLF (carriage return+line feed).
Is their any way I can do this ?

Thanks
 
Puspender Tanwar
Ranch Hand
Posts: 658
2
Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
I know I can end each record with CRLF using :


This will end every line with CRLF (for windows environment)
But original stream has some lines ending with LF and some ending with CRLF, all I want is to swap them with each other.
 
Saloon Keeper
Posts: 15491
363
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Why?
 
Marshal
Posts: 79153
377
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
I presume somebody has told you they want CRLF, which is the only justification for using a particular line end. In which case forget about buffered writers. Use a Formatter object.
myFileFormatter.format("%s\r\n", textOfLine);

You would probably be better off using a special CSV writing program.
 
Puspender Tanwar
Ranch Hand
Posts: 658
2
Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Stephan van Hulst wrote:Why?


Actually I am importing the CSV file to oracle DB using Oracle SQL Loader. But there is a multiline record. Now I have two CSV having exactly same records but they have different EOL character.
The one which is inserting successfully contains the CRLF as EOL character and for the multi line record, the record is seperated by LF. Attached is the snapshot(correctFile.jpg).
The other while is contains same record has EOL character as LF, and for separating multi line record CRLF is used. No for this file SQL Loader is giving me the error as:
Record 1: Rejected - Error on table ODI_DEV_TARGET.CASE, column DESCRIPTION.
second enclosure string not present


Below is the .ctl file used for SQL Loader :


Since the file which is inserting successfully contains CRLF as EOL character, I modified my .ctl file by changing INFILE :
and using this also file got successfully loaded.

So I thought that for CRLF if passing "str '\r\n'" works, than passing the "str '\n'" for LF should work but that's not the case.

Case_correct.JPG
[Thumbnail for Case_correct.JPG]
file which successfully inserted
case_incorrect.JPG
[Thumbnail for case_incorrect.JPG]
file which didn't successfully inserted
 
Campbell Ritchie
Marshal
Posts: 79153
377
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Puspender Tanwar wrote:Actually I am importing the CSV file to oracle DB using Oracle SQL Loader.  . . .

Please always tell us the full details; if you give incomplete details you will get the answer to the wrong question. Which has happened here. Nobody is asking you for a file with CRLF. Since methods like readLine() remove the line end sequence anyway, you cannot tell which line end sequence a line uses. Also, a look at the dcumentation for readLine() shou‍ld tell you that the line end is a red herring.
How do you know you are reading the input file correctly in the first place? What happens if you simply read it and count the lines?

I think the error has to do with how you are writing to the database. What happens if you insert the contents of a line into the database via the command line?

I shall add this discussion to our databases forum.
 
Campbell Ritchie
Marshal
Posts: 79153
377
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
By the way: why has one file got four columns in and the other three?
 
Puspender Tanwar
Ranch Hand
Posts: 658
2
Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
I am sorry for the details. Please delete this thread. I am creating another on Database Forum with full details.
Thanks
 
Campbell Ritchie
Marshal
Posts: 79153
377
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Puspender Tanwar wrote:. . . Now I have two CSV having exactly same records

Are you sure about that?

. . . second enclosure string not present . . .

Doesn't that error message give you any hints?

So I thought that for CRLF if passing "str '\r\n'" works, than passing the "str '\n'" for LF should work but that's not the case.

I thought earlier that the line ends were a red herribng.
 
Puspender Tanwar
Ranch Hand
Posts: 658
2
Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Campbell Ritchie wrote:Are you sure about that?


yes, Attaching the new snapshot for csv file. sorry for earlier snaps.

Campbell Ritchie wrote:Doesn't that error message give you any hints?


Yes, I understand that error. A double quote started but did't closed before EOL. But why don't that error comes up when I loaded the same file(first file, that uses different EOL character).

correctFile.jpg
[Thumbnail for correctFile.jpg]
file that got inserted
incorrectFile.jpg
[Thumbnail for incorrectFile.jpg]
file that didn't got inserted
 
Rancher
Posts: 4801
50
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
As Campbell says, the line endings are a red herring.

From your control file:


That says, to only continue reading (ie treat the following line as a continuation of the record) if the line doesn't end in double-quotes.
But your example data does.

Now, since you have a file that does work, and that appears to have double quotes, could you actually copy/paste from both files (not an image) the record in question here.
Don't put it in code tags or anything.

I'm asking that because, and this could be something to do with image quality, the two images really don't look like they have the same quote glyphs.
 
Marshal
Posts: 8856
637
Mac OS X VI Editor BSD Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Does header have any significance in your application? They aren't same.
 
Liutauras Vilda
Marshal
Posts: 8856
637
Mac OS X VI Editor BSD Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Records at lines 2, 3, 4 are also strange.
At line 2 there is one extra quote, which can escape last quote, that would mean last column isn't closed, hence missing.
Line 3 don't get at all.
Line 4 problems look similar to line 2, just way less columns.

What these are?

 
Campbell Ritchie
Marshal
Posts: 79153
377
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
The files now look different from what you posted first. As DT says, please post the actual text.
 
Puspender Tanwar
Ranch Hand
Posts: 658
2
Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Sorry to late response. Here are the files.
The file which got loaded:

"ID","ISDELETED","CASENUMBER","DESCRIPTION"
"5002800000MMzJUAA1","false","00001000","Per customer-""
The client is now thinking of buying vs remodeling - I'll keep you posted!
"""
"5002800000MMzJVAA1","false","00001001",""
"5002800000MMzJWAA1","false","00001002",""
"5002800000MMzJXAA1","false","00001003",""
"5002800000MMzJYAA1","false","00001004",""
"5002800000MMzJZAA1","false","00001005",""
"5002800000MMzJaAAL","false","00001006",""
"5002800000MMzJbAAL","false","00001007",""
"5002800000MMzJcAAL","false","00001008",""
"5002800000MMzJdAAL","false","00001009",""
"5002800000MMzJeAAL","false","00001010",""
"5002800000MMzJfAAL","false","00001011",""
"5002800000MMzJgAAL","false","00001012",""
"5002800000MMzJhAAL","false","00001013",""
"5002800000MMzJiAAL","false","00001014",""
"5002800000MMzJjAAL","false","00001015",""
"5002800000MMzJkAAL","false","00001016",""
"5002800000MMzJlAAL","false","00001017",""
"5002800000MMzJmAAL","false","00001018",""
"5002800000MMzJnAAL","false","00001019",""
"5002800000MMzJoAAL","false","00001020",""
"5002800000MMzJpAAL","false","00001021",""
"5002800000MMzJqAAL","false","00001022",""
"5002800000MMzJrAAL","false","00001023",""
"5002800000MMzJsAAL","false","00001024",""
"5002800000MMzJtAAL","false","00001025",""

File which failed to load :

"Id","IsDeleted","CaseNumber","Description"
"5002800000MMzJUAA1","false","00001000","Per customer-""
The client is now thinking of buying vs remodeling - I'll keep you posted!
"""
"5002800000MMzJVAA1","false","00001001",""
"5002800000MMzJWAA1","false","00001002",""
"5002800000MMzJXAA1","false","00001003",""
"5002800000MMzJYAA1","false","00001004",""
"5002800000MMzJZAA1","false","00001005",""
"5002800000MMzJaAAL","false","00001006",""
"5002800000MMzJbAAL","false","00001007",""
"5002800000MMzJcAAL","false","00001008",""
"5002800000MMzJdAAL","false","00001009",""
"5002800000MMzJeAAL","false","00001010",""
"5002800000MMzJfAAL","false","00001011",""
"5002800000MMzJgAAL","false","00001012",""
"5002800000MMzJhAAL","false","00001013",""
"5002800000MMzJiAAL","false","00001014",""
"5002800000MMzJjAAL","false","00001015",""
"5002800000MMzJkAAL","false","00001016",""
"5002800000MMzJlAAL","false","00001017",""
"5002800000MMzJmAAL","false","00001018",""
"5002800000MMzJnAAL","false","00001019",""
"5002800000MMzJoAAL","false","00001020",""
"5002800000MMzJpAAL","false","00001021",""
"5002800000MMzJqAAL","false","00001022",""
"5002800000MMzJrAAL","false","00001023",""
"5002800000MMzJsAAL","false","00001024",""
"5002800000MMzJtAAL","false","00001025",""


Liutauras Vilda wrote:Does header have any significance in your application? They aren't same.


No, columns are not case sensitive.
 
Puspender Tanwar
Ranch Hand
Posts: 658
2
Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
I counted the double quotes as well, but they are same in both the files.
Why couldn't it be EOL character issue ?
 
Campbell Ritchie
Marshal
Posts: 79153
377
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Thank you for posting the text; I did a diff and the only difference is that the headings are in UPPER CASE in one file and lower case in the other. You have already confirmed that doesn't matter, since SQL is case‑insensitive. In view of the triple quote mark in line 4, I am surprised that either file can be correctly parsed into SQL.
I told you to look at the documentation for readLine() and that it removes the line ends, so LF versus CRLF is still a red herring.
 
Puspender Tanwar
Ranch Hand
Posts: 658
2
Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Campbell Ritchie wrote:Thank you for posting the text; I did a diff and the only difference is that the headings are in UPPER CASE in one file and lower case in the other. You have already confirmed that doesn't matter, since SQL is case‑insensitive. In view of the triple quote mark in line 4, I am surprised that either file can be correctly parsed into SQL.
I told you to look at the documentation for readLine() and that it removes the line ends, so LF versus CRLF is still a red herring.


Hello Campbell,
readline() is not in the picture now. I have these csv files, I am loading them to oracle using SQL loader command
sqlldr odi_dev_target/odi_dev_target@odidb52 control=ctlfile.ctl log=logfile.log
Java code is not coming to the picture, that only came into picture when I create the CSV file. But the question here is why SQL Loader is behaving different for different EOL characters.
 
Puspender Tanwar
Ranch Hand
Posts: 658
2
Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Also, the CSV which in not loading(EOL LF) is created using java code, but not by readline(). it is created the first code i have posted in this thread. So, readline() not in picture.
And the second CSV which is loading successfully, I have downloaded that from some application.
Actually its a salesforce.com data, first I have downloaded using their API and second is using Data Loader tool.
 
Liutauras Vilda
Marshal
Posts: 8856
637
Mac OS X VI Editor BSD Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Open that second file (which doesn't load) in notepad++ or other editor and change manually those line endings and please verify to us, that problem indeed is that.
It is taking too long to debug all that stuff.
 
Puspender Tanwar
Ranch Hand
Posts: 658
2
Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Liutauras Vilda wrote:Open that second file (which doesn't load) in notepad++ or other editor and change manually those line endings and please verify to us, that problem indeed is that.
It is taking too long to debug all that stuff.


In Notepad++ we have only option to change EOL character according to the platforms(windows, unix, mac). But that will convert all of the EOL characters, I want to swap them. Swap LF with CRLF and vice versa.
 
Liutauras Vilda
Marshal
Posts: 8856
637
Mac OS X VI Editor BSD Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
So, what is the problem. Do replace for all records:
find: \n
replace with: \r\n

And then you end up with 2 and 3rd line records which need to be in sync with first file, meaning have LF, which you can using regular expression change too.

Load it and see if it goes through.
 
Puspender Tanwar
Ranch Hand
Posts: 658
2
Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Liutauras Vilda wrote:And then you end up with 2 and 3rd line records which need to be in sync with first file, meaning have LF, which you can using regular expression change too.


How to change CRLF with LF using Regex ?
 
Puspender Tanwar
Ranch Hand
Posts: 658
2
Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
After replacing the LF with CRLF using notepad++, I have the below(attached) file now. But in order to sync it with the first file(correct.csv), I need to convert the EOL character of 2nd and 3rd to LF. How can I do it ?

incorrect_change.JPG
[Thumbnail for incorrect_change.JPG]
CRLF at every line's ending
 
Liutauras Vilda
Marshal
Posts: 8856
637
Mac OS X VI Editor BSD Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Take it.
Filename: for_puspender.csv
File size: 1 Kbytes
 
Liutauras Vilda
Marshal
Posts: 8856
637
Mac OS X VI Editor BSD Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
After that, try to load your version with all CRLF's and tell us what happens.
 
Puspender Tanwar
Ranch Hand
Posts: 658
2
Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Liutauras Vilda wrote:After that, try to load your version with all CRLF's and tell us what happens.


The csv you attached got loaded successfully, attached is the logFile.log. Could you please tell me how you changed CRLF to LF ?


SQL*Loader: Release 11.2.0.1.0 - Production on Mon Jul 3 19:11:12 2017

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

Control File:   ctlfile.ctl
Character Set UTF8 specified for all input.

Data File:      C:\Users\puspender.tanwar\Desktop\try_SQLLDR\for_puspender.csv
 Bad File:     C:\Users\puspender.tanwar\Desktop\try_SQLLDR\badfile.bad
 Discard File: C:\Users\puspender.tanwar\Desktop\try_SQLLDR\DSCfile.dsc
(Allow all discards)

Number to load: ALL
Number to skip: 1
Errors allowed: 1
Bind array:     64 rows, maximum of 256000 bytes
Continuation:   Last non-white character != 0X22(character '"')
               Preserving continuation characters as data
Path used:      Conventional

Table ODI_DEV_TARGET.CASE, loaded from every logical record.
Insert option in effect for this table: INSERT
TRAILING NULLCOLS option in effect

  Column Name                  Position   Len  Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ---------------------
ID                                  FIRST     *   ,  O(") CHARACTER            
   SQL string for column : "REPLACE(:ID,'<br>',chr(10))"
ISDELETED                            NEXT     *   ,  O(") CHARACTER            
   SQL string for column : "CASE WHEN :ISDELETED='true' then 'T' ELSE 'F' END"
CASENUMBER                           NEXT     *   ,  O(") CHARACTER            
   SQL string for column : "REPLACE(:CASENUMBER,'<br>',chr(10))"
DESCRIPTION                          NEXT 30000   ,  O(") CHARACTER            

value used for ROWS parameter changed from 64 to 8

Table ODI_DEV_TARGET.CASE:
 26 Rows successfully loaded.
 0 Rows not loaded due to data errors.
 0 Rows not loaded because all WHEN clauses were failed.
 0 Rows not loaded because all fields were null.


Space allocated for bind array:                 246208 bytes(8 rows)
Read   buffer bytes: 1048576

Total logical records skipped:          1
Total logical records read:            26
Total logical records rejected:         0
Total logical records discarded:        0

Run began on Mon Jul 03 19:11:12 2017
Run ended on Mon Jul 03 19:11:13 2017

Elapsed time was:     00:00:00.50
CPU time was:         00:00:00.03

 
Puspender Tanwar
Ranch Hand
Posts: 658
2
Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Liutauras Vilda wrote:After that, try to load your version with all CRLF's and tell us what happens.


sorry, in last reply I quoted wrong post. That was for the csv you attached for me.
Below is for the file which contains all records ending with CRLF:
The first record got skipped and other got inserted. Below is the logFile.log for file containing CRLF for all record:


SQL*Loader: Release 11.2.0.1.0 - Production on Mon Jul 3 19:19:36 2017

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

Control File:   ctlfile.ctl
Character Set UTF8 specified for all input.

Data File:      C:\Users\puspender.tanwar\Desktop\try_SQLLDR\csvfile3.csv
 Bad File:     C:\Users\puspender.tanwar\Desktop\try_SQLLDR\badfile.bad
 Discard File: C:\Users\puspender.tanwar\Desktop\try_SQLLDR\DSCfile.dsc
(Allow all discards)

Number to load: ALL
Number to skip: 1
Errors allowed: 1
Bind array:     64 rows, maximum of 256000 bytes
Continuation:   Last non-white character != 0X22(character '"')
               Preserving continuation characters as data
Path used:      Conventional

Table ODI_DEV_TARGET.CASE, loaded from every logical record.
Insert option in effect for this table: INSERT
TRAILING NULLCOLS option in effect

  Column Name                  Position   Len  Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ---------------------
ID                                  FIRST     *   ,  O(") CHARACTER            
   SQL string for column : "REPLACE(:ID,'<br>',chr(10))"
ISDELETED                            NEXT     *   ,  O(") CHARACTER            
   SQL string for column : "CASE WHEN :ISDELETED='true' then 'T' ELSE 'F' END"
CASENUMBER                           NEXT     *   ,  O(") CHARACTER            
   SQL string for column : "REPLACE(:CASENUMBER,'<br>',chr(10))"
DESCRIPTION                          NEXT 30000   ,  O(") CHARACTER            

value used for ROWS parameter changed from 64 to 8
Record 1: Rejected - Error on table ODI_DEV_TARGET.CASE, column DESCRIPTION.
second enclosure string not present
Record 2: Rejected - Error on table ODI_DEV_TARGET.CASE, column ID.
ORA-12899: value too large for column "ODI_DEV_TARGET"."CASE"."ID" (actual: 77, maximum: 50)


MAXIMUM ERROR COUNT EXCEEDED - Above statistics reflect partial run.

Table ODI_DEV_TARGET.CASE:
 0 Rows successfully loaded.
 2 Rows not loaded due to data errors.
 0 Rows not loaded because all WHEN clauses were failed.
 0 Rows not loaded because all fields were null.


Space allocated for bind array:                 246208 bytes(8 rows)
Read   buffer bytes: 1048576

Total logical records skipped:          1
Total logical records read:             9
Total logical records rejected:         2
Total logical records discarded:        0

Run began on Mon Jul 03 19:19:36 2017
Run ended on Mon Jul 03 19:19:37 2017

Elapsed time was:     00:00:00.25
CPU time was:         00:00:00.07


"5002800000MMzJUAA1","false","00001000","Per customer-""
The client is now thinking of buying vs remodeling - I'll keep you posted!
"""

got skipped and gone to the badFile.bad


 
Liutauras Vilda
Marshal
Posts: 8856
637
Mac OS X VI Editor BSD Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Alright, clear what is that bit. So the lines 2 and 3 and 4 are actually 1 record, and those excessive quotes are escaping some quotes in order to have sentence enclosed in quotes (this is what I said at the beginning), so after all, it has 4 columns too and this record ends with CRLF (line 4) as every other record.

Why are you writing that record on 3 separate lines?

 
Liutauras Vilda
Marshal
Posts: 8856
637
Mac OS X VI Editor BSD Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Puspender Tanwar wrote:"5002800000MMzJUAA1","false","00001000","Per customer-""
The client is now thinking of buying vs remodeling - I'll keep you posted!
"""
got skipped and gone to the badFile.bad


Yes, because these were treated as 3 records, where:
1st has 4 columns + 1 loose quote (broken record)
2nd has just a text not enclosed in quotes, even if it were, there would be missing another 3 columns, so, that would need to be "sentence","","","" (after all, broken record)
3rd has 1 column and 1 loose quote (broken record).

You need to work on file writing phase in my opinion in order either to get lines 2, 3 and 4 on the same line, or get line endings LF for those lines, so during the load the program would know that lines 2, 3 and 4 are same record.

Let us see what guys think here and can suggest to you.

 
Puspender Tanwar
Ranch Hand
Posts: 658
2
Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Liutauras Vilda wrote:Alright, clear what is that bit. So the lines 2 and 3 and 4 are actually 1 record, and those excessive quotes are escaping some quotes in order to have sentence enclosed in quotes (this is what I said at the beginning), so after all, it has 4 columns too and this record ends with CRLF (line 4) as every other record.

Why are you writing that record on 3 separate lines?


Yes, 2, 3 and 4 lines are single record. I am not writing these records, I am using the Salesforce.com Api and I get the data in this format only(the incorrect file). I am not supposed to change these multiline records to single line, I need to import this format csv only to oracle.

comma(,) is used as seperator and double quote(") is used as enclosing character for data.
 
Liutauras Vilda
Marshal
Posts: 8856
637
Mac OS X VI Editor BSD Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Puspender Tanwar wrote:I am not writing these records, I am using the Salesforce.com Api and I get the data in this format only(the incorrect file).


Dodgy API then, because it gives you a file which is broken already. I need to take it back due to recent Dave's findings.

Puspender Tanwar wrote:I am not supposed to change these multiline records to single line


Who told you that? This is only how the raw file looks like, after insertion, it gets inserted in the DB and treated as 1 record, regardless if you have it on 3 different lines or you'd have on 1 line

Load file with lines 2, 3 and 4 as one liner ending with CRLF and you'll see it goes through and in DB looking exactly the same.
 
Liutauras Vilda
Marshal
Posts: 8856
637
Mac OS X VI Editor BSD Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Your end goal as far as I understand is to load data to DB after it gets downloaded from SalesForce crm system or so.

So, keep original file untouched after it gets downloaded and make copy of it, but fixed one. That fixing part needs to be happening upon file writing (as you don't want to fix them manually).
 
Dave Tolls
Rancher
Posts: 4801
50
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Right, just gone through the docs, and this is the key part (https://docs.oracle.com/cd/B28359_01/server.111/b28319/ldr_concepts.htm#i1004685, the bit under Stream Record Format):

On Windows NT, if no terminator_string is specified, then SQL*Loader uses either \n or \r\n as the record terminator, depending on which one it finds first in the datafile. This means that if you know that one or more records in your datafile has \n embedded in a field, but you want \r\n to be used as the record terminator, you must specify it.



So, for the file that loads sqlldr has selected CR/LF as the terminator.
Consequently the "multiline" record is actually no such thing, it is treated as a single record and the continueif is never used...an LF is simply not recognised as the end of record.

For the one that fails it has picked up LF as the terminator, so all lines are being treated as new records.
This implies that your continueif is not working.
 
Dave Tolls
Rancher
Posts: 4801
50
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
So, it's the continueif.
For the version with LF as the record terminator, does the in-record new line always come out as CRLF?

If so, how about trying CR ('\r') as the CONTINUEIF LAST instaed of the double quotes?
 
Dave Tolls
Rancher
Posts: 4801
50
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Actually, am I reading the first few posts correctly?

Is the data coming from the Salesforce.com Api you call correct?
That is, does it leave Salesforce with the "correct" mix of CRLF anf LF?
 
Liutauras Vilda
Marshal
Posts: 8856
637
Mac OS X VI Editor BSD Java
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Dave Tolls wrote:So, it's the continueif.
For the version with LF as the record terminator, does the in-record new line always come out as CRLF?

If so, how about trying CR ('\r') as the CONTINUEIF LAST instaed of the double quotes?


Yeah, he got very unfortunate 1st record, because sentence is enclosed into the actual quotes, hence this continueif misbehaving.
 
Dave Tolls
Rancher
Posts: 4801
50
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
It's not really misbehaving.  It's doing what's asked of it.


If I read the earlier parts of the thread correctly, the Salesforce API is producing the correct output (CRLF terminators, with LF used inside multiline values), but the Java code is misinterpreting it?
That would make sense on their part, as it "solves" the import side...at least for Windows machines.  Wonder what they do for *nixes?
Anyway, I'll have to wait for the OP to come back to clarify that part.
 
reply
    Bookmark Topic Watch Topic
  • New Topic