• 2017 Pentaho Excellence Award Winner for ROI: LAZIOcrea

    LAZIOcrea won the 2017 Pentaho Excellence Award in ROI, a category that recognizes the use of Pentaho to make a positive impact on business, society and the planet. RISING HEALTHCARE COSTS PROMPT ONE-OF-A-KIND DATA I...
    created by cchaffey
  • 2017 Pentaho Excellence Award Winner for Big Data: ZeniMax Media

    This year we recognized six customers for their innovative use of Pentaho Data Integration and Business Analytics to achieve data-driven business outcomes. This blog will recognize ZeniMax Media - the winner in the Bi...
    created by cchaffey
  • 2017 Pentaho Excellence Award Winner for Embedded Analytics: STIWA

    STIWA Group won the 2017 Pentaho Excellence Award in Embedded Analytics, a category that recognizes the successful embedding, extending or customizing of Pentaho to create a unique, value-added analytics solution. Cus...
    created by cchaffey
  • Creating PDI Transformation Steps for Sending UDP Packets - Part I

    Pentaho Data Integration comprises a powerful, high-throughput framework for moving data through a network of transformation steps, turning data from one form into another. PDI is an excellent choice for data eng...
    last modified by Greg Graham
  • Creating CDE Histograms using PDI and a User Defined Java Class

    Histograms are a great way to probe density functions.  Visually they look like ordinary bar charts, but the bars (or bins in histogram-speak) are always quantitative and ordered rather than qualitative.  A ...
    last modified by Greg Graham
  • SVG Component

    Introduction: Born from the need to simplify data consumption on dashboards, this component's goal is to provide more flexibility by leveraging SVG rendering capabilities. The Spark and the Idea: The design of a pa...
    last modified by Joao Barbosa
  • Viz-SVG

    Introduction: Charts are a good way to visualize our data, but in some cases they are not enough. Imagine you need to show a vehicle's current status based on its sensor data; don't you think the image below is m...
    created by Joao Gameiro
  • JDBC SQL logging

    Introduction: A very typical use case for SQL logging is to troubleshoot various database-related issues during development. For this purpose, P6Spy is a useful library to log all SQL statements and par...
    last modified by Kleyson Rios
  • Plotting ESRI Shapefiles on a Map in Pentaho

    This article was co-authored with Benjamin Webb. Foundational to any map, whether it be a globe, GPS or any online map, is the functionality to understand data on specific locations. The ability to p...
    last modified by Jesse Zuckerman
  • Analyzing #PWorld15 Tweets in Real Time

    This post was originally published by Chris Deptula on Tuesday, October 27, 2015. I recently attended the Strata-HadoopWorld conference in NYC. I have been attending this conference for the past few years and ...
    created by Kevin Haas
  • Harvesting Raw Data into Custom Data API's with Pentaho Data Integration

    This post was originally published by Kevin Haas on Tuesday, July 14, 2015. When working with our clients, we find a growing number who regard their customer or transaction data as not just an internally leveraged a...
    created by Kevin Haas
  • Parent-Child Hierarchies with Abnormal Genealogy

    This post was originally published by Bryan Senseman on Wednesday, October 15, 2014. I'm a huge user of Mondrian, the high-speed, open-source OLAP engine behind Pentaho Analyzer, Saiku, and more. While the core Mond...
    created by Kevin Haas
  • Lessons Learned With Sqoop

    This post was originally published by Chris Deptula on Wednesday, November 19, 2014. Many of you requested more information on the inner workings of the Sqoop component. Perhaps the best way to explain is via "les...
    created by Kevin Haas
  • Working with Small Files in Hadoop - Part 3

    This post was originally published by Chris Deptula on Tuesday, February 24, 2015. This is the third in a three-part blog on working with small files in Hadoop. In my previous blogs, we defined what consti...
    last modified by Kevin Haas
  • Working with Small Files in Hadoop - Part 2

    This post was originally published by Chris Deptula on Wednesday, February 18, 2015. This is the second in a three-part blog on working with small files in Hadoop. In my first blog, I discussed what constitutes a s...
    created by Kevin Haas
  • Working with Small Files in Hadoop - Part 1

    This post was published by Chris Deptula on Wednesday, February 11, 2015. This is the first in a three-part blog on working with small files in Hadoop. Hadoop does not work well with lots of small files and instead wan...
    last modified by Kevin Haas
  • Pentaho Data Integration Best Practices: Key-based, Single Record Lookup

    This post was written by Dave Reinke and originally published on Wednesday, June 22, 2016. In a previous blog, we discussed the importance of tuning data lookups within Pentaho Data Integration (PDI) transformat...
    last modified by Kevin Haas
  • Pentaho Data Integration Best Practices: Lookup Most Recent Record

    This post was written by Dave Reinke and originally published on Wednesday, July 6, 2016. As we continue our series of Pentaho Data Integration (PDI) Lookup Patterns, we next discuss best practice options for l...
    last modified by Kevin Haas
  • Hadoop: How to Update without Update

    This post was written by Chris Deptula and originally published on Wednesday, January 28, 2015. With an immutable file system and no update command, how do you perform updates in Hadoop? This proble...
    last modified by Kevin Haas
  • Dynamically Exploring the Central Limit Theorem with CDE

    Back when I took the Johns Hopkins Data Science track on Coursera, one of my homework assignments for the Developing Data Products course was to create a dynamic tool using R and Shiny that would graphically demonstra...
    last modified by Greg Graham