Rules based ETL

 
Rancher
Does anyone know of a rules-based ETL framework that can handle transformation of large amounts of data? The idea is that the transformation logic is specified in a DSL, and the ETL tool reads the rules from the DSL and performs the transformation. We would prefer that business users maintain the DSL, but it is OK if the developers do it too. We want the rules to be separated from the Java code, so we can update the rules without doing a complete release.
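To make the question concrete, here is a minimal sketch of rules kept outside the Java code. The one-rule-per-line syntax (`TARGET = op(source)`), the operation names `copy`, `upper`, and `trim`, and the class `RuleEtlSketch` are all hypothetical, invented for illustration; they do not come from any real ETL framework. In practice the rule text would be loaded from a file or database so it can change without a release.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.UnaryOperator;

public class RuleEtlSketch {

    // Hypothetical operations that the made-up DSL is allowed to name.
    private static final Map<String, UnaryOperator<String>> OPS = new HashMap<>();
    static {
        OPS.put("copy", s -> s);
        OPS.put("upper", s -> s.toUpperCase(Locale.ROOT));
        OPS.put("trim", s -> s.trim());
    }

    // One parsed rule of the form: TARGET = op(source)
    public static final class Rule {
        final String target;
        final String op;
        final String source;

        Rule(String target, String op, String source) {
            this.target = target;
            this.op = op;
            this.source = source;
        }
    }

    // Parses the rule text; blank lines and lines starting with '#' are skipped.
    public static List<Rule> parseRules(String text) {
        List<Rule> rules = new ArrayList<>();
        for (String raw : text.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            String[] sides = line.split("=", 2);
            if (sides.length != 2) {
                throw new IllegalArgumentException("Bad rule: " + line);
            }
            String rhs = sides[1].trim();
            int open = rhs.indexOf('(');
            if (open < 0 || !rhs.endsWith(")")) {
                throw new IllegalArgumentException("Bad rule: " + line);
            }
            String op = rhs.substring(0, open).trim();
            String source = rhs.substring(open + 1, rhs.length() - 1).trim();
            if (!OPS.containsKey(op)) {
                throw new IllegalArgumentException("Unknown operation: " + op);
            }
            rules.add(new Rule(sides[0].trim(), op, source));
        }
        return rules;
    }

    // Applies every rule to one input row and returns the transformed row.
    // A missing source field is treated as an empty string.
    public static Map<String, String> transform(List<Rule> rules, Map<String, String> row) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Rule rule : rules) {
            String value = row.getOrDefault(rule.source, "");
            out.put(rule.target, OPS.get(rule.op).apply(value));
        }
        return out;
    }

    public static void main(String[] args) {
        // In a real setup this text would be read from an external rule file.
        String ruleText = "# hypothetical rule file\n"
                + "CUSTOMER_NAME = upper(name)\n"
                + "CITY = trim(city)\n";

        Map<String, String> row = new HashMap<>();
        row.put("name", "alice");
        row.put("city", "  Boston ");

        // prints {CUSTOMER_NAME=ALICE, CITY=Boston}
        System.out.println(transform(parseRules(ruleText), row));
    }
}
```

The Java side only knows the fixed set of operations; which fields get which operation lives entirely in the rule text, which is the separation being asked about.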
 
Saloon Keeper
I know something more or less like that.

The Pentaho Business Intelligence suite contains an ETL tool named "Kettle" (also known as Pentaho DI). The ETL rules can be stored in an XML file and can be edited by non-programmers via a GUI editor app named "Spoon".

Spoon is basically a drag/drop/drool UI where you drag sources, destinations, and processing operations into a work area, configure them, and wire them together to make the transformation ruleset.

It is very performant. I have used it to populate databases with hundreds of millions of records at a shot, and that was just basic operation without exploiting its abilities to work with parallelized databases.

The one thing I don't like about it is that some of the processes are fairly non-intuitive. One of them, in fact, used to be Excel input, but I got so fed up with that one that I made modifications to the source code which have since become a permanent part of the Kettle system.
 
Jayesh A Lalwani
Rancher
Awesome! I will look at it. Thanks, Tim.
 