** I am not much of a content writer, so please bear with my English.
For the last few years, we have been serving a customer who owns a chain of retail superstores. We provided them with a robust open source ERP solution with back-office ERP modules, point of sale, mobile POS, ecommerce, apps, etc., centralized the data warehouse, and published reports. Everything was going smoothly.
One day, the owner of the business called a meeting and showed us the reports from various branches, comparing sales, inventory, purchases, profits, etc. After we explained how the reports work, he asked a very simple set of questions. They were all about tomatoes, since there was a lorry/transport strike in many places at that time.
Will you be able to tell me ...
- why are my tomato sales different across regions/stores, on the same or different dates?
- how much will it sell next week/month?
- how much to stock, and when?
- how much is getting wasted daily? [perishable goods]
- what is the price of tomatoes in my competitors' stores?
- when my customers buy tomatoes, what else do they buy? Can I create a combo offer?
- what is my real profit from sales, compared against the fluctuating market cost?
BTW, also make it available on my mobile and notify me with alerts, since I travel a lot!!!
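The combo-offer question in that list is a classic market basket problem. Below is a minimal sketch of the idea, written in Python rather than the R used in the project, and not the scripts that were actually run. The baskets are made up for illustration and the function name `companions` is hypothetical. It counts how often other products show up in the same basket as tomato and ranks them by lift (how much more often the pair occurs than chance would suggest).

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets, made up for illustration (not the customer's data).
baskets = [
    {"tomato", "onion", "chilli"},
    {"tomato", "onion"},
    {"tomato", "potato"},
    {"onion", "potato"},
    {"milk", "bread"},
    {"tomato", "onion", "coriander"},
]


def companions(baskets, item):
    """Rank products bought together with `item` by lift, highest first."""
    n = len(baskets)
    item_count = Counter()   # baskets containing each product
    pair_count = Counter()   # baskets containing each pair of products
    for basket in baskets:
        item_count.update(basket)
        pair_count.update(frozenset(p) for p in combinations(sorted(basket), 2))

    rows = []
    for other in item_count:
        if other == item:
            continue
        both = pair_count[frozenset((item, other))]
        if both == 0:
            continue
        support = both / n                           # share of all baskets with both
        confidence = both / item_count[item]         # P(other | item)
        lift = confidence / (item_count[other] / n)  # above 1 means bought together more than chance
        rows.append((other, support, confidence, lift))
    return sorted(rows, key=lambda r: r[3], reverse=True)


for other, support, confidence, lift in companions(baskets, "tomato"):
    print(f"{other:10s} support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

A product with high lift and decent support is a candidate for a combo offer. With only a handful of baskets the numbers mean little, so real data would need a minimum support threshold.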
Well, we took this as homework and told him, "Sir, we will get back to you" :-)
We had a Tomato challenge to solve, and we wanted to drink some tomato juice first and think it over.
We still debate internally what Big Data, Analytics and Intelligence really are, so we are not experts on this. However, we are learning. We are a small, niche open source IT services company in Chennai, India, that has been doing some Pentaho work for the past 9 years, and we are happy about the HDS family now. Every project or engagement is different. Usually they demand Pentaho PDI/ETL skills, Analyzer for cubes, and visualization skills for reports and dashboards, and we have used the Adaptive layer to connect to Hadoop/MongoDB/BigQuery/Amazon/MarkLogic. This requirement was different: it pushed us down the path of R and Weka, exploring data science using Pentaho with R/Weka.
With the help of some mathematical geniuses [I call them Pythons], we came up with a Service Catalog of defined KPIs/KRAs, using algorithms to answer those questions.
We took 2 years of real enterprise data, social data, and data from other web stores, put it on the cloud, and ran the R magic to create the analytical database. On top of that we built some customized visuals, mobile apps, and push notifications.
The Tomato challenge led us to various other data insights, and we now have a solution with bells and whistles.
Some of the things we could build using R with Pentaho for the Tomato challenge are:
- Product Performance Index
- Benchmark Performance Index
- New Customer share
- New Product share
- Date & Time Impact on sales
- Regularity on Portfolio split
- Customer demographic analysis
- Price sensitivity index
- Substitution and Cannibalization analysis
- Uplift modeling
- Engagement analysis
- Customer brand loyalty
- Stock efficiency
- General Trends, Alerts, etc
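To give a feel for what one of these indicators can look like, here is a rough sketch of one possible reading of "Date & Time Impact on sales": a weekday index, i.e. the average sales on each weekday divided by the overall daily average. This definition is an assumption made for illustration, not the formula used in the project, and the sketch is in Python rather than R. The sales figures and the function name `weekday_index` are hypothetical.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical daily tomato sales in kg for one store (1 Jan 2024 was a Monday).
week1 = [70, 80, 90, 90, 100, 130, 140]
week2 = [90, 80, 90, 90, 100, 130, 120]
daily_sales = [(date(2024, 1, day + 1), kg) for day, kg in enumerate(week1 + week2)]


def weekday_index(daily_sales):
    """Average sales per weekday divided by the overall daily average.

    An index above 1.0 means that weekday sells more than a typical day.
    """
    overall = mean(kg for _, kg in daily_sales)
    by_day = defaultdict(list)
    for day, kg in daily_sales:
        by_day[WEEKDAYS[day.weekday()]].append(kg)
    return {name: mean(by_day[name]) / overall for name in WEEKDAYS if name in by_day}


for name, index in weekday_index(daily_sales).items():
    print(f"{name:10s} {index:.2f}")
```

An index like this can feed the stocking question directly: a store where Saturday sits well above 1.0 needs more tomatoes delivered on Friday.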
How did we do the Big Data Analytics using Pentaho for the Tomato Challenge?
Data Warehouse Optimization:
- Integrated the data from all branches and built a centralized data warehouse using Hadoop on Amazon [EMR]
- Wrote R scripts using algorithms [don't ask me which] to analyse the data
- Used the Pentaho ETL workflow designer to run the R scripts and build the analytical database on Postgres
** In other words, for someone like me: I understood this as a small "view" query on a large table, but one that carries more insight and value
- From the analytical database, used Pentaho Analyzer to build the MDX queries and created the cubes, dashboards and reports
- Also built a simple Android/HTML5 app with push notifications, so the same visualizations are available in a native app
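The "small view from a large table" idea can be sketched in a few lines. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for both the Hadoop warehouse and the Postgres analytical database, plain SQL stands in for the R scripts, and the table names, column names and rows are all made up.

```python
import sqlite3

# In-memory SQLite as a stand-in for the warehouse and the analytical database.
conn = sqlite3.connect(":memory:")

# The "large" table: one row per sale line from every store (hypothetical schema).
conn.execute(
    "CREATE TABLE sales ("
    "store TEXT, sale_date TEXT, product TEXT, "
    "qty_kg REAL, revenue REAL, cost REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("Chennai-1", "2024-01-01", "tomato", 2.0, 80.0, 60.0),
        ("Chennai-1", "2024-01-01", "tomato", 1.0, 40.0, 30.0),
        ("Chennai-1", "2024-01-01", "onion", 1.5, 45.0, 30.0),
        ("Chennai-2", "2024-01-01", "tomato", 3.0, 135.0, 90.0),
        ("Chennai-2", "2024-01-02", "tomato", 1.0, 50.0, 30.0),
    ],
)

# The "small" analytical table: one row per store, day and product,
# carrying the figures a cube or dashboard would read.
conn.execute(
    "CREATE TABLE product_daily AS "
    "SELECT store, sale_date, product, "
    "       SUM(qty_kg) AS qty_kg, "
    "       SUM(revenue) AS revenue, "
    "       SUM(revenue - cost) AS margin "
    "FROM sales GROUP BY store, sale_date, product"
)

summary = conn.execute(
    "SELECT store, sale_date, qty_kg, revenue, margin "
    "FROM product_daily WHERE product = 'tomato' "
    "ORDER BY store, sale_date"
).fetchall()

for row in summary:
    print(row)
```

The reporting layer then only ever touches the small table, which is why dashboards and mobile views stay fast even when the raw sales data is large.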
We simply named this solution "BIG DATA TOMATO", and it is ready to use for any retailer. The customer is happy with it and, obviously, is now asking more questions about the Potato!!??
Every time I see a pie chart now, I see a cut piece of tomato.
Is there a market for this? Has this problem not yet been solved for retailers? Are there embedded analytics opportunities here? What else can we do on top of this? Could we maybe even create "farmer to retailer to consumer" analytics and build a social innovation platform out of it? Would you be interested in seeing a demo and giving us your inputs?
We are still learning this beautiful thing called Big Data Analytics.