  • Bug: Title in Data Grid

    I would like to report a bug, but I was not able to create an account in Jira. In PDI 8.0, the window title of the Data Grid step is "Add constant row" rather than "Data Grid".
    created by Kamil Nesetril
  • Analyzing #PWorld15 Tweets in Real Time

    This post originally published by Chris Deptula on Tuesday, October 27, 2015   I recently attended the Strata-HadoopWorld conference in NYC.  I have been attending this conference for the past few years and ...
    created by Kevin Haas
  • Harvesting Raw Data into Custom Data API's with Pentaho Data Integration

    This post originally published by Kevin Haas on Tuesday, July 14, 2015   When working with our clients, we find a growing number who regard their customer or transaction data as not just an internally leveraged a...
    created by Kevin Haas
  • Parent-Child Hierarchies with Abnormal Genealogy

    This post originally published by Bryan Senseman on Wednesday, October 15, 2014   I'm a huge user of Mondrian, the high speed open source OLAP engine behind Pentaho Analyzer, Saiku, and more. While the core Mond...
    created by Kevin Haas
  • Lessons Learned With Sqoop

    This post originally published by Chris Deptula on Wednesday, November 19, 2014.   Many of you requested more information on the inner workings of the Sqoop component. Perhaps the best way to explain is via "les...
    created by Kevin Haas
  • Working with Small Files in Hadoop - Part 3

    This post originally published by Chris Deptula on Tuesday, February 24, 2015.   This is the third in a three part blog on working with small files in Hadoop.   In my previous blogs, we defined what consti...
    last modified by Kevin Haas
  • Working with Small Files in Hadoop - Part 2

    This post originally published by Chris Deptula on Wednesday, February 18, 2015   This is the second in a three part blog on working with small files in Hadoop. In my first blog, I discussed what constitutes a s...
    created by Kevin Haas
  • Working with Small Files in Hadoop - Part 1

    This post published by Chris Deptula on Wednesday, February 11, 2015   This is the first in a 3 part blog on working with small files in Hadoop. Hadoop does not work well with lots of small files and instead wan...
    last modified by Kevin Haas
  • Insert/Update function

    UPDATE: I have now changed my workflow to the following:   However,...
    last modified by Rico de Feijter
  • Pentaho Data Integration Best Practices: Key-based, Single Record Lookup

    This post was written by Dave Reinke and originally published on Wednesday, June 22, 2016   In a previous blog, we discussed the importance of tuning data lookups within Pentaho Data Integration (PDI) transformat...
    last modified by Kevin Haas
  • Pentaho Data Integration Best Practices: Lookup Most Recent Record

    This post was written by Dave Reinke and originally published on Wednesday, July 6, 2016   As we continue our series of Pentaho Data Integration (PDI) Lookup Patterns, we next discuss best practice options for l...
    last modified by Kevin Haas
  • Hadoop: How to Update without Update

    This post was written by Chris Deptula and originally published on Wednesday, January 28, 2015   With an immutable file system and no update command, how do you perform updates in Hadoop?   This proble...
    last modified by Kevin Haas
  • CDE can't execute PDI

    I have a simple PDI transformation which pulls a table and outputs it with Select values.   I have uploaded this transformation to PUC and have added it as a datasource to a CDE. But when I open CDA it throws e...
    last modified by Raj Karan
  • CDE: group rows of table component

    I have created a dashboard which has a table displaying a thousand rows. I want to group rows on a specific column. Once the whole table is broken into groups, I want to sum the values of that column.   I am pulli...
    last modified by Raj Karan
  • PUC: How to recover trashed folder

    In Pentaho User Console, the public folder got trashed by mistake. How can I recover it or create a new one?
    last modified by Raj Karan
  • Pentaho Spoon - How to read email and push data in columns of DB table?

    Hello world,   I need to create a workflow with Pentaho Spoon that: - reads mail; - parses the email body (capturing info); - pushes the data into the respective columns of a DB table.   I tried with a workflow custom compo...
    created by B523W5TI
  • salesforce upsert problem

    Hi, I ran into a problem: using Salesforce upsert in Kettle 7.1 sometimes results in a connection timeout, and sometimes data is even lost. Does anyone know why, and how I can fix it? Thanks!
    created by shayne lynn
  • Pentaho Spoon - How to get/Parse the body of email?

    Hello world,   I need to create a Transformation with Pentaho Spoon that: - receives the email - captures the data contained in the body of the email (like a field) - writes the data to a CSV file.   So, I am trying ...
    created by B523W5TI
  • Creating CDE Histograms using PDI and a User Defined Java Class

    Histograms are a great way to probe density functions.  Visually they look like ordinary bar charts, but the bars (or bins in histogram-speak) are always quantitative and ordered rather than qualitative.  A ...
    last modified by Greg Graham
  • Dynamically Exploring the Central Limit Theorem with CDE

    Back when I took the Johns Hopkins Data Science track on Coursera, one of my homework assignments for the Developing Data Products course was to create a dynamic tool using R and Shiny that would graphically demonstra...
    last modified by Greg Graham