The power of Pentaho Data Integration (PDI) for data access, blending and governance has been demonstrated and documented numerous times. However, perhaps less well known is how PDI as a platform, with all its data munging[1] power, is ideally suited to orchestrate and automate up to three stages of the CRISP-DM[2] life-cycle for the data science practitioner: generic data preparation/feature…
Source: Pentaho Big Data Blog posts