Groovy script to replace delimiters and to conditionally remove LF at the end of the matched lines

 
Ranch Hand
I am using Apache NiFi (https://nifi.apache.org/) to build my dataflow, and the data I am dealing with at the moment consists of delimited values. I would like to use ExecuteScript (https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.script.ExecuteScript/index.html), so I have put together a simple Groovy script that should do the following:

1) replace the current delimiter with a pipe (|)

2) replace each tab (\t) with " " (a single space)

3) include a conditional whereby if and only if the last part of a line has the form of , then include a \n at the end of it.
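
The three steps above can be sketched as a plain Groovy function, independent of NiFi. This is a minimal sketch under stated assumptions, not the script from this post: the function name `cleanRecords`, the `;` input delimiter, and the end-of-record pattern (a trailing date field) are all hypothetical, since the pattern in step 3 is not shown.

```groovy
// Hypothetical helper: the name, the ';' delimiter and the end-of-record
// pattern are assumptions for illustration only.
String cleanRecords(String text, String oldDelimiter, String recordEndRegex) {
    def out = new StringBuilder()
    text.split(/\r?\n/).each { line ->
        // Steps 1 and 2: swap the delimiter for a pipe and each tab for a space.
        def cleaned = line.replace(oldDelimiter, '|').replace('\t', ' ')
        out << cleaned
        // Step 3: keep the line break only when the whole line matches the
        // end-of-record pattern; otherwise join it to the next line with a space.
        out << ((cleaned ==~ recordEndRegex) ? '\n' : ' ')
    }
    out.toString()
}

// Assumed sample data: a record is complete once it ends with a date field.
def input = 'a;some long\ttext that\ncontinues here;2020-01-01\nb;short;2020-01-02\n'
def result = cleanRecords(input, ';', /.*\|\d{4}-\d{2}-\d{2}/)
print result
```

Because the line break is written explicitly for every input line, a condition that never matches would join everything into one long line, which is a useful symptom to check for when the output looks wrong.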

The reason for this script has to do with some data cleaning and wrangling on a dataset that shows the following issues:

a) text (often long) is broken across lines by a tab or a line feed. This can happen before a full stop, but not consistently.

I have come up with the following code:



Although it removes the tabs and correctly replaces the delimiters, the conditional seems to go wrong and all the content ends up in one long single line. This is not what I was aiming for, because later on I will need to split the flowfile line by line. Can you help? Is there anything particularly wrong in my code?

Thank you so much for your help.
 
Sheriff

I believe this is the problem. With double quotes you need to escape the backslash (that is, write it as a double backslash), but not with single quotes.
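
A quick way to check how Groovy itself treats backslashes in each literal style is a few assertions like the ones below. This is only a sketch and does not reproduce the script under discussion; the variable names are made up. In Groovy, single- and double-quoted literals both process escapes such as `\t` and `\n` (they differ in interpolation), so the single-versus-double distinction is the rule in languages such as shell or PHP, while in Groovy it is the slashy string that leaves backslashes as written.

```groovy
// Single- and double-quoted literals both process escape sequences.
def single = 'a\tb'
def dbl = "a\tb"
assert single == dbl
assert single.length() == 3          // 'a', TAB, 'b'

// A doubled backslash yields one literal backslash in either style.
def escaped = 'a\\tb'
assert escaped.length() == 4         // 'a', backslash, 't', 'b'

// Slashy strings keep backslashes as written, which suits regexes.
def slashy = /a\tb/
assert slashy == escaped

// Used as a regex, each of these forms matches a real tab character.
assert 'x\ty'.replaceAll(/\t/, ' ') == 'x y'
assert 'x\ty'.replaceAll('\\t', ' ') == 'x y'
assert 'x\ty'.replaceAll('\t', ' ') == 'x y'
```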