Reading variable blocked records

 
Greenhorn
Posts: 7
I have a legacy COBOL system that generates some variable blocked files (RECFM=VBA), which I need to process. Can this even be done in native Java (11), or is there a 3rd-party tool that I can purchase to satisfy my requirements? I am completely lost. Any insight would be greatly appreciated.
 
Marshal
Posts: 73760
You probably can, but I don't know myself. Please supply more information. Details of the format of the file would be useful.

We haven't got a dedicated COBOL forum, so I moved you to “other languages”. If you can write COBOL, you are unusual; I have heard there is a shortage of COBOL programmers who can maintain old bank systems, which enhances the salary they can earn.
 
Michael Sosa
Greenhorn
Posts: 7
I don't think I understand what other details of the file format you want. The file can have 2 types of records; one could be 800 bytes while the other 400 bytes. The first 4 bytes of each record contain the length of the record in binary format. What else would you like to know?
 
Master Rancher
Posts: 4460
Do you have a layout definition for what is in the blocks? The fields could be text, packed, binary or other.
It shouldn't be too hard to write Java code that reads the blocks and breaks out each record as a byte array, for starters.
NB My wife and I both coded COBOL in the early 70s.
 
Michael Sosa
Greenhorn
Posts: 7
The COBOL system is riddled with them. I was asking in general. Other than the binary field at the beginning of each record, all the fields are, for the most part, text, with some packed decimal to add grief to an already depressing problem. I could get you a real layout if you think it would help, but I'm not asking someone to code a solution, just to tell me how to do it.
 
Norm Radder
Master Rancher
Posts: 4460
Michael Sosa wrote: "... tell me how to do it."

Read the records into a byte array and convert the different fields according to their expected data types. The 4-byte length field is a binary value; use that length to pick out the record's bytes.
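
To make that concrete, here is a minimal sketch of the read loop. It assumes the file was transferred in binary with the 4-byte prefix intact, and that the prefix follows the usual IBM record descriptor layout: a 2-byte big-endian length that counts the prefix itself, followed by 2 reserved bytes. The class and method names are made up for illustration; check a hex dump of a real file before relying on any of this.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits a stream of length-prefixed records into byte arrays.
class VbRecordReader {

    /** Returns the payload (the bytes after the 4-byte prefix) of every record. */
    static List<byte[]> readAll(InputStream source) {
        List<byte[]> records = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(source)) {
            while (true) {
                int first = in.read();
                if (first < 0) {
                    break; // clean end of file: no more records
                }
                // Assumed layout: 2-byte big-endian length, then 2 reserved bytes.
                int length = (first << 8) | in.readUnsignedByte();
                in.readUnsignedShort(); // reserved bytes, ignored here
                if (length < 4) {
                    throw new IOException("Bad record length: " + length);
                }
                byte[] payload = new byte[length - 4]; // length includes the prefix
                in.readFully(payload);
                records.add(payload);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }
}
```

Each returned array is one record, still in its original encoding, ready for field-by-field conversion.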
 
Sheriff
Posts: 16578
You might have to account for how the files from the COBOL system are encoded. If they're EBCDIC, you'll probably have to convert to ASCII before doing anything with them in Java.

Search for "how to convert EBCDIC to ASCII in Java"
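
As a rough illustration of what that search turns up: Java can decode EBCDIC bytes directly through a Charset, so no separate ASCII step is needed. The sketch below assumes code page 037 (US/Canada EBCDIC); the actual code page depends on the mainframe's configuration, and the class name is hypothetical. The IBM037 charset ships in the JDK's extended charsets module (jdk.charsets), which is present in standard JDK builds but may be missing from a trimmed runtime image.

```java
import java.nio.charset.Charset;

// Hypothetical helper: decodes EBCDIC text bytes into a Java String.
class EbcdicText {

    // Assumption: the data uses code page 037. Other sites may use IBM1047, IBM500, etc.
    static final Charset EBCDIC = Charset.forName("IBM037");

    static String decode(byte[] textBytes) {
        return new String(textBytes, EBCDIC);
    }
}
```

This only makes sense for bytes that really are text; binary and packed decimal fields need their own handling.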
 
Saloon Keeper
Posts: 24207
You can do all this via brute force in user-written Java code. However, you'll probably prefer a ready-made solution, and that's an ETL (Extract, Transform, Load) tool. There are several to choose from, both free and with commercial support. One of the most popular historically was Talend. Another is Pentaho DI from Hitachi. Pentaho is the one I'm most familiar with (I have in fact contributed source code modifications to it), and it is a high-performance and very flexible app written in Java. Talend is (I think!) also a Java system, but I'm not very familiar with it or with any of the other products that might be out there.

You have 2 issues.

First, you have to pull down the VBA data records. Pentaho can actually FTP into a mainframe and do that itself as part of the ETL pipeline. VBA as stored on disk/tape consists of a binary block-length header field followed by 0 or more records, each of which is headed by a binary logical record length (LRECL) field. So de-blocking may be an essential first step. It depends on what the IBM FTP server will do for you.
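
If the block headers do survive the transfer, de-blocking in plain Java could look roughly like this. It is a sketch under assumptions: both the block header and each record header are taken to be 4 bytes, with a 2-byte big-endian length that includes the header itself followed by 2 reserved bytes (the classic non-extended layout), and the names are invented for the example.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits one variable-blocked block into its records.
class VbDeblocker {

    static List<byte[]> splitBlock(byte[] block) {
        ByteBuffer buf = ByteBuffer.wrap(block); // ByteBuffer is big-endian by default
        int blockLength = buf.getShort() & 0xFFFF; // assumed to include the 4-byte header
        buf.getShort(); // reserved bytes, ignored here
        if (blockLength < 4 || blockLength > block.length) {
            throw new IllegalArgumentException("Bad block length: " + blockLength);
        }
        buf.limit(blockLength); // ignore anything past the declared block end

        List<byte[]> records = new ArrayList<>();
        while (buf.remaining() >= 4) {
            int recordLength = buf.getShort() & 0xFFFF; // includes its own 4-byte header
            buf.getShort(); // reserved bytes, ignored here
            if (recordLength < 4 || recordLength - 4 > buf.remaining()) {
                throw new IllegalArgumentException("Bad record length: " + recordLength);
            }
            byte[] payload = new byte[recordLength - 4];
            buf.get(payload);
            records.add(payload);
        }
        return records;
    }
}
```

Whether you need this at all depends on the transfer: the file may arrive with record headers only, or with no headers at all.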

Once you have the data broken out into records, the real fun begins, because chances are that the record in question has text AND binary data in it, and both require further processing. Fields defined as CHARACTER would have to be converted from EBCDIC to ASCII, and thence to Java String or Character objects, unless they're intended to be binary indicators (for example, "Y" for yes/true, "N" for no/false). You may have a binary length prefix on CHAR VARYING to deal with as well.
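
For the character fields, a sketch of slicing one record by offset might look like the following. The layout is entirely made up (a 10-byte name followed by a 1-byte Y/N flag), as are the class and field names, and code page 037 is an assumption; the real offsets have to come from the COBOL copybook.

```java
import java.nio.charset.Charset;

// Hypothetical record layout: bytes 0-9 = name (text), byte 10 = active flag ('Y'/'N').
class CustomerRecord {

    // Assumption: EBCDIC code page 037.
    static final Charset EBCDIC = Charset.forName("IBM037");
    static final byte EBCDIC_Y = (byte) 0xE8; // 'Y' in code page 037

    final String name;
    final boolean active;

    CustomerRecord(byte[] payload) {
        // Decode only the text slice, then drop the trailing space padding.
        name = new String(payload, 0, 10, EBCDIC).trim();
        // Indicator fields can be compared as raw bytes without decoding.
        active = payload[10] == EBCDIC_Y;
    }
}
```

Decoding field by field, rather than converting the whole record at once, keeps the binary and packed fields from being mangled by the character conversion.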

Then there's the numeric stuff, which may be characters but is more likely to be COMPUTATIONAL, COMP-2 or COMP-3. COMPUTATIONAL is straight binary, but it's big-endian (most significant byte first), not little-endian as Intel processors use. COMP-2 is, of course, floating-point, but in addition to byte-order considerations, the original IBM floating-point binary representation was completely different from the IEEE format used by Java, although newer IBM mainframes added IEEE as an option.
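
Both cases can be handled in a few lines of Java. The sketch below assumes a 4-byte binary field and an 8-byte COMP-2 field in the original IBM hexadecimal format (1 sign bit, a 7-bit excess-64 exponent in base 16, and a 56-bit fraction); the class and method names are invented, and field widths must be taken from the real layout.

```java
import java.nio.ByteBuffer;

// Hypothetical helpers for mainframe numeric fields.
class MainframeNumbers {

    /** Reads a 4-byte big-endian binary field; ByteBuffer is big-endian by default. */
    static int readBinaryInt(byte[] record, int offset) {
        return ByteBuffer.wrap(record, offset, 4).getInt();
    }

    /** Converts an 8-byte IBM hexadecimal float (assumed format) to a Java double. */
    static double hexFloatToDouble(byte[] record, int offset) {
        long bits = ByteBuffer.wrap(record, offset, 8).getLong();
        long fraction = bits & 0x00FFFFFFFFFFFFFFL; // low 56 bits
        if (fraction == 0) {
            return 0.0;
        }
        int exponent = (int) ((bits >>> 56) & 0x7F) - 64; // power of 16
        // value = (fraction / 2^56) * 16^exponent = fraction * 2^(4*exponent - 56)
        double magnitude = Math.scalb((double) fraction, 4 * exponent - 56);
        return bits < 0 ? -magnitude : magnitude; // top bit is the sign
    }
}
```

The 56-bit fraction is rounded to the 53 bits a double can hold, so the last few bits of very precise values may be lost.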

And then there's good old COMP-3. This is actually pretty easy, since it's just BCD: 2 decimal digits per byte, except for the last byte, which holds one digit and the sign.
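
A minimal unpacking sketch, assuming the usual sign nibbles (D or B for negative, anything else treated as positive or unsigned) and with the number of implied decimal places passed in by the caller, since the packed bytes don't record it. The names are hypothetical.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical helper: unpacks a COMP-3 (packed decimal) field.
class PackedDecimal {

    /** scale = number of implied decimal places, e.g. 2 for PIC S9(3)V99. */
    static BigDecimal unpack(byte[] field, int scale) {
        StringBuilder digits = new StringBuilder();
        boolean negative = false;
        for (int i = 0; i < field.length; i++) {
            int high = (field[i] >> 4) & 0x0F;
            int low = field[i] & 0x0F;
            appendDigit(digits, high);
            if (i < field.length - 1) {
                appendDigit(digits, low);
            } else {
                // The low nibble of the last byte is the sign, not a digit.
                negative = (low == 0x0D || low == 0x0B);
            }
        }
        BigInteger unscaled = new BigInteger(digits.toString());
        return new BigDecimal(negative ? unscaled.negate() : unscaled, scale);
    }

    private static void appendDigit(StringBuilder digits, int nibble) {
        if (nibble > 9) {
            throw new IllegalArgumentException("Not a packed decimal digit: " + nibble);
        }
        digits.append(nibble);
    }
}
```

For example, the three bytes 0x12 0x34 0x5C with a scale of 2 unpack to 123.45.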

Which is why an ETL tool can be so handy. Rather than hand-coding custom field transformations for a multitude of complex records, you can use the GUI editor to string together processing blocks and build an easily maintainable transformation profile.

One thing to note, however: the processing model for Pentaho DI (Kettle), at least, involves a parallel set of extracted data columns running down the pipe. Conversions don't actually transform; they create new data columns in the desired form while the original column also remains in the stream. There is a certain mindset to it.
 