
European banks will need to have Open Banking APIs in place by January 2018.

This whiteboard video explains how to enable your API platform and keep existing systems safe.

 

Implementing Open Banking APIs

Open banking APIs have become a financial services industry hot topic, thanks to two regulatory decisions.

The first is in the UK, where the Competition and Markets Authority (CMA) started investigating competition in retail banking. They produced a report last year which proposed several recommendations and requirements.

A principal recommendation was to improve competition in retail banking. To achieve this, the CMA decided traditional banks should expose their customer data to third parties that would deliver additional retail banking services.

In parallel with the CMA, the European Commission started its second review of the Payment Services Directive. That review also proposed that banks, with customer consent, should expose customer data to third parties, who could then potentially deliver superior services.

 

Four challenges of implementation

 

From talking to our existing banking customers, we have identified four challenges of introducing an open banking API.

The first is being compliant in time. These are requirements from the CMA and a directive from the European Commission. The APIs need to be in place at the start of 2018, which leaves banks little time at this point.

Second is improving customer experience. Retail banks across Europe are increasingly focused on delivering new and improved customer experiences.

Third is competition. The principal aim of introducing open banking APIs is to allow other service providers to utilise the data, then offer new and improved services to retail banking customers.

Finally one that doesn’t come up very often, but we think is important, is the operational risk that building and exposing APIs places on traditional systems.

 

Typical existing core systems

 

No bank started life in its current form. The majority have built up core systems over many years through mergers and acquisitions, and they have delivered many different services over those years too.

As a result, those systems have become interlinked, interdependent, and incredibly complex. They are traditional architectures, and they scale up.

By scale up, I mean that if they run out of capacity to deliver new services, the fix is to install a bigger system or a bigger storage device. Scale-up systems are capital intensive and take time to become productive.

We should consider how existing systems are managed and changed. Due to the complexity, banks must make sure that those systems are reliable and secure. To achieve this, they wrap rigorous change control and management processes around the systems.  As a result, any major change, which exposing these APIs certainly is, equates to a substantial risk.

There is one other aspect worth considering. Banks know how many transactions their existing core systems need to process. Once an open API is exposed, that predictability disappears: the volume and shape of the transactions those APIs will generate is difficult to forecast.

 

Database extension alternative

 

Instead of using existing core systems, our view is that most banks will build a database extension or caching layer. In this alternative, when a customer consents to their data being exposed to third parties, the bank extracts that data from its existing core systems, transforms it for the new-style database, and then populates the database extension with it.
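To make that flow concrete, here is a minimal sketch of the extract-transform-populate step. All names (the record fields, the in-memory cache) are hypothetical illustrations, not any particular core-banking or caching product's API.

```python
# Minimal sketch of the extract-transform-populate flow described above.
# All field names and the in-memory cache are hypothetical illustrations,
# not any particular core-banking or caching product's API.
from dataclasses import dataclass
from datetime import date


@dataclass
class ApiAccountView:
    """Shape of the record served to third parties via the open banking API."""
    account_id: str
    balance_pence: int
    as_of: date


def transform(core_record: dict) -> ApiAccountView:
    # Flatten the core-system record into the API-friendly shape.
    return ApiAccountView(
        account_id=core_record["acct_no"],
        balance_pence=int(round(core_record["ledger_balance"] * 100)),
        as_of=date.fromisoformat(core_record["posting_date"]),
    )


def refresh_cache(core_records: list[dict], cache: dict[str, ApiAccountView]) -> None:
    # Populate the database extension only for customers who have consented.
    for rec in core_records:
        if rec.get("consent_given"):
            view = transform(rec)
            cache[view.account_id] = view


if __name__ == "__main__":
    extracted = [{"acct_no": "GB-001", "ledger_balance": 1250.75,
                  "posting_date": "2017-06-30", "consent_given": True}]
    cache: dict[str, ApiAccountView] = {}
    refresh_cache(extracted, cache)
    print(cache)
```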

This alternative provides several benefits. First, banks can quickly become compliant and provide open banking APIs. This solution will scale out, so as banks add more customers to this process, they can scale easily.

More importantly, expect forward thinking banks to use the API to add new services. Potentially they will start to incorporate lots of different data sources. Not only traditional data, but geospatial data, weather data and social media data too.

This would enable banks to deliver a rich set of services to their existing customers through the open banking API and potentially monetise them.

 

Moving data from existing systems to the new database

 

Most banks will have several tools which extract data out of systems and populate business information systems and data warehouses.

Extracting data from traditional systems, then transforming and blending it for use in these new, agile scale-out systems, however, requires something different. Many older tools, which have been very good at extracting data, are not effective at new-style transformation processes.

One tool which is effective at this is Pentaho, which specialises in transforming data from traditional sources and blending it with other data sources so that banks can offer a richer set of services.
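As a rough, hypothetical illustration of that blend step (this is not Pentaho itself, just plain pandas), the sketch below joins invented transaction records with a second data source, weather observations, keyed on date, which is the kind of enrichment described above.

```python
# Hypothetical illustration of blending two data sources by a shared key.
# In practice a data-integration tool such as Pentaho would do this at scale.
import pandas as pd

transactions = pd.DataFrame({
    "account_id": ["A1", "A1", "B2"],
    "date": ["2017-06-01", "2017-06-02", "2017-06-01"],
    "amount": [-12.50, -8.20, -30.00],
})

weather = pd.DataFrame({
    "date": ["2017-06-01", "2017-06-02"],
    "rainfall_mm": [4.2, 0.0],
})

# Blend: attach the weather observation to each transaction by date,
# producing a richer record that downstream services could use.
blended = transactions.merge(weather, on="date", how="left")
print(blended)
```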

 

Monetizing the API layer

 

Regardless of the approach a bank takes, it will need to support open banking APIs from the start of next year. That leaves little time to become compliant, and because compliance alone is simply a cost, we believe the more forward-thinking banks will quickly want to extend the capability of those open banking APIs to develop new revenue streams and monetise them.

We at Hitachi think this is an exciting time, not only for fintech start-ups but traditional banks too, who through these directives, have been given an opportunity to deliver something new to their customers.

 

If you would like to learn more about how Hitachi can help you introduce Open Banking APIs, get in touch via LinkedIn or learn more about our financial services offering here.

As a brand and as a company, Hitachi is known for building high-quality solutions – from proton beam therapy and high-speed, mass transit bullet trains to water-treatment offerings and artificial intelligence (AI) technology – offerings that make the world a better place to live. For this reason, we hold ourselves to the highest of standards and we sweat the details. We know that, in many of these cases of social innovation, failure on our part could have dire or disastrous consequences.

 

Of course, we can’t make the world a better place alone. We need partners who will sweat the details. Partners like Intel, who, with their introduction of the latest Intel Xeon family of processors or their work on computational performance of a deep learning framework, demonstrate their intense focus on innovation, quality and performance.

 

As we continue to examine the Intel Xeon family of processors, we see unlimited potential to accelerate and drive our own innovations. New capabilities can help us achieve greater application performance and improved efficiency with our converged and hyperconverged Hitachi Unified Compute Platform (UCP). And, as Intel pushes the envelope even further with next-generation field-programmable gate array (FPGA) technologies as well, we estimate that users could see upwards of a 10x performance boost and a significant reduction in IT footprint.

 

In important vertical markets like financial services, we have seen tremendous success around ultra-dense processing with previous-generation Intel processing technologies. There, we are able to capture packets at microsecond granularity, filter in only financial services data, and report on both packet performance and the financial instruments embedded in the packets. We can’t wait to see how Intel’s latest processing advancements help us exceed expectations and today’s state of the art.

 

We look forward to the road ahead. In the meantime, we’ll keep sweating the details and working with partners like Intel who do the same.

 

The regulatory burdens placed on financial services organisations have reached unprecedented levels. From data security and access with GDPR to investor protection and the various themes in MiFID II/MiFIR, businesses are besieged by new regulations on an almost monthly basis.

 

According to Business Insider, from the 2008 financial crisis through 2015, the annual volume of regulatory publications, changes and announcements has increased by a staggering 492%. It is an issue I have addressed not only at the numerous events I have spoken at and attended since joining Hitachi, but throughout my career.

 

Understandably, organisations are looking for ways to ease this regulatory burden by automating onerous processes, and to make the Risk, Operations, Compliance and Audit (ROCA) line of business more cost effective and efficient while removing the resource burdens that these organisations currently face.

 

After all, these organisations are not in the business of ROCA; they are in the business of generating revenue, which these functions clearly do not generate. The need to ease this burden has seen the rapid rise of RegTech, or Regulatory Technology.

 

The idea behind RegTech is that it harnesses the power of technology to ease regulatory pressures. As FinTech innovates, RegTech will be needed to ensure that the right checks and balances are quickly put in place so that organisations do not fall short on their regulatory obligations.

 

RegTech is not just about financial services technology or regulations; it is broader than that and can be utilised in numerous industries such as HR, oil & gas and pharmaceuticals. With RegTech, the approach is to understand the “problem” (be it operational, risk, compliance or audit related), identify which regulations are affected by it, and solve it using technology.

 

RegTech is a valuable partner to FinTech. Although some refer to it as a sub-set of FinTech, in my view RegTech goes hand-in-hand with FinTech - it should work in conjunction with financial technology innovation.

 

RegTech focuses on technologies that facilitate the delivery of regulatory requirements more efficiently and effectively than existing capabilities. RegTech helps to provide process automation, reduce ROCA costs, decrease resource burdens and create efficiency.

 

FinTech, by its nature, is disruptive. It aims to give organisations a competitive edge in the market. When FinTech first took off, one of its main disruptions was the creation of algorithmic and high-frequency trading systems operating at lightning speeds.

 

As these FinTech innovations have become faster, more in depth and more intricate, regulators across the globe have sought to establish some boundaries to prevent fraud, protect consumers and standardise the capabilities of this technology. 

 

The accelerated pace at which FinTech has been adopted and continues to innovate means the regulators have struggled to keep up. Now, however, far-reaching and broader regulations are being established regularly – hence the requirement for RegTech to help manage this plethora of rules and procedures. RegTech is particularly relevant within the ROCA arena, where oversight of the regulations sits squarely within the remit.

The financial services industry is heavily regulated, through myriad interlinking global regulations. These regulations are implemented through reports – whether it’s through trade/transaction/position/periodic reporting or through some sort of disclosure. Reports are the lifeblood of regulation and are based on data - therefore data is a crucial part of compliance.

 

At the core of most regulations is the need for financial services organisations to locate, protect and report on the data and information held within their systems.  The regulations require not just audit trails, but each report must demonstrate exactly how data is handled both internally and externally. 

 

Reporting and regulation are unavoidable for all financial services organisations. FinTech, which is still developing and not yet heavily regulated, will be caught up with very quickly as the regulators quicken their pace in keeping up to date with innovation and possible disruptions.

 

The challenge is collating and curating this level of information from the existing systems within the banks, within the deadlines specified by the regulations. This is why RegTech exists and plays such a key role.

 

At a very fundamental level, RegTech helps financial services organisations to automate many of the manual processes, especially those within legacy systems, whether that be reporting, locating customer data, transactional information or systems intelligence. 

 

The crucial element here is not only the legacy and ageing systems still held within many financial institutions - where data is stored in everything from warehouses to virtual arrays, making it a huge challenge to locate and retrieve - but also the legacy thinking of leadership within those organisations.

 

Many of these organisations are led by individuals whose only thought is the next six months. As Warren Buffett said, however, “someone is sitting in the shade today because someone planted a tree a long time ago.” Leadership needs to think strategically.

 

The recent WannaCry ransomware attack is a perfect example of the dark side of legacy thinking and systems. Had leadership in the affected organisations made strategic infrastructure investments, replacing vulnerable existing systems with modern ones implemented with the correct governance, systems and controls, the attack would not have caused as much harm as it did.

 

Using RegTech to automate these tactical and manual processes streamlines the approach to compliance and reduces risk by closely monitoring regulatory obligations. Vitally, it can lower costs by decreasing the level of resource required to manage the compliance burden. And RegTech can do far more than just automate processes.

 

Organisations are using it to conduct data mining and analysis, and provide useful, actionable data to other areas of the business, as well as running more sophisticated aggregated risk-based scenarios for stress-testing, for example.

 

Deloitte estimates that in 2014 banks in Europe spent €55bn on IT, however only €9bn was spent on new systems. The balance was used to bolt-on more systems to the antiquated existing technologies and simply keep the old technology going. 

 

This is a risky and costly strategy. The colossal resource required to keep existing systems running, patched and secure, coupled with managing elevated levels of compliance requirements, will drain budgets over time. Beyond that, manually sourcing data or relying on piecemeal solutions presents the very real risk of noncompliance.

 

RegTech is not a silver bullet and it is not going to solve all the compliance headaches businesses are suffering from. However, as ESMA (the European Securities and Markets Authority) recently stated, firms must “embrace RegTech, or drown in regulation”.

 

RegTech will play a leading role, especially when used to maximum effect. Take, as an example, reporting. We know through our research that this is an industry-wide challenge; on average a firm has 160 reporting requirements under different regulations globally, each with different drivers and usually with different teams producing those reports.

 

By using RegTech, not only could those team resources be reduced, but the agility and speed with which reports can be produced will ensure compliance deadlines are adhered to. Additionally, resources can then be focused elsewhere, such as on driving innovation and helping to move the company forward. 

 

Rather than focusing on what a burden the regulations are, organisations that use RegTech will see them as an opportunity to get systems, processes and data in order, and to use the resulting intelligence and resources to drive the company to greater success. To take it one step further, I believe regulation does not hinder or stifle innovation - in fact, it breeds creativity and innovation.

 

If you would like to learn more about RegTech and my work with Hitachi follow me on Twitter and LinkedIn.

Last week I visited Hannover Messe, the world’s largest Industrial Automation & IoT exhibition, for the first time, and I have to say I was overwhelmed by the sheer size and scale of the event. With 225,000 visitors and 6,500 exhibitors showcasing their Industrial Automation & IoT capabilities, the race is definitely on to take a share of the massive future IoT market opportunity in these sectors. There are varying estimates, but according to IDC the IoT market is expected to reach $1.9 trillion by 2021.

 

What I learnt in Hannover was that the industrial and energy sectors are on the cusp of a huge digital data explosion. Why? Because, like all industries, they are under pressure to innovate and embrace new technologies that will significantly accelerate the intelligence, automation and capabilities of factory production lines, reduce manufacturing defects and fault tolerance levels, and improve the reliability, performance and TCO of all kinds of industrial machinery as they become digitised. The sources driving this data explosion will include machine-generated data, sensor data, predictive analytics data, as well as data from enhanced precision robotics and artificial intelligence. All of this data will give a competitive edge and valuable insight to companies who deploy these technologies wisely and use the data generated in the right way to drive their business forward intelligently and autonomously.

 

These innovations arguably make Hannover Messe one of the most relevant exhibitions in the IoT space today, and last week Hitachi was able to showcase the full power of its Industrial Automation and IoT capability. This included Lumada IoT solutions, real industrial assets, advanced research and its humanoid robotics vision through EMIEW, a customer service robot being developed by Hitachi to assist people in public and commercial places with information, whatever their language, which generated huge interest from attendees.

 

Hitachi had a large team of IoT experts present who spoke in depth about the technologies and use cases its customers need to advance and digitise their businesses. To say Hitachi inserted itself into the IoT conversation last week is an understatement. Hitachi is serious about this business, and this was further reflected in the extensive global brand advertising campaign in and around the show, which included prominent adverts in Hannover’s main railway station, on the Hannover Messe sky walkways and in a number of global media publications, all driving the 225,000 visitors to its 400-square-metre booth to experience its IoT solutions.

 

As I left Hannover, I came away with two key takeaways. Firstly, IoT is here and now; given the level of investment being made by companies, the focus it is receiving and the potential returns for businesses, you can start to understand how it will drive the next huge wave of industrial change. My second takeaway is the potential Hitachi has to be a dominant force in IoT and the ambition it has to be the market leader. Last week the company made a giant stride towards achieving that goal. You can follow the conversation on Twitter by searching #HM17.

 


Scott Ross

Help! Thinking Differently

Posted by Scott Ross Mar 20, 2017

 

Help! The Beatles said it best. We'll come back to that later.

 

Beep! Beep! Beep! My alarm wakes me up and the sudden realisation sets in. Like many other money-savvy 20-somethings, I get a calendar reminder on my smartphone alerting me that, on this occasion, my car insurance expires soon. The inexorable battle through the web of online comparison supermarkets is about to commence. Before I set about my task I go into the kitchen to pour a cup of tea and discover that my milk is off. I nip across to the shop to pick up some more and pay for it with my smartphone. I pour my tea, take a deep breath and off I go.

 

Naturally I start with my current provider and run a renewal quote on their website. Having spent the past 12 months as their customer I had high hopes for this to be the standard for others to compete with. After some not-so-easy site navigation and a lot of persistence I managed to get a renewal quote. Shockingly, this was significantly more than my current agreement despite nothing changing other than my age. Having struggled through their website for the best part of an hour re-entering my personal information that they could (and should) have very easily auto-completed for me, needless to say I was far from pleased with the outcome.

 

 

Next, to the plethora of comparison sites. I use a quick Google search to review which is best. I select my suitor and off I go. I discover that this website is significantly easier to navigate, which somewhat alleviates the painstaking process of having to enter the exact same details that I’ve already spent the best part of my morning entering into my current provider’s site. That pleasing process was enhanced further by the next page: the results! They were staggering. A large number of providers were offering a greater level of service at a considerably lower price. “How can that be?” I asked myself. For now I had to focus on which offer was best and ponder the fact later.

 

I review the top three policies independently, through both a google search and by using the site’s own review tool, and finally settle on my desired option. Two clicks later and I’m on the new provider’s website, details already filled in, quote as per the comparison site and a blank space waiting to complete the transaction. All I needed to do now was fill in my payment details and it was complete. Easy. I would have a new provider once my old contract ends…or so I thought!

 

 

Having settled on a new provider I go about cancelling my current service before the auto-renewal kicks in and I lose my hard-earned new policy. I call the contact centre, give my details and ask to cancel. The operator asks a few questions about “why” and then begins to offer discounts and price matching against what I’ve just signed up to. Why couldn’t they offer this level of service upfront? Why does it take me leaving for them to offer something better? In today’s economy where not just the savvy, but everybody is looking to get more for their money, why would a business continually act like this? This, in my opinion, shows a poor level of customer knowledge and more importantly a poor customer experience.

 

Quickly I begin to realise that many organisations across all consumer industries are acting in a similar way. In fact, only the ‘new-age’ organisations can offer something different, and even then, are they maximising their potential? This got me thinking (back to the title): “Help! I need somebody, not just anybody!” I need my current provider to look after me. To help me. Even better, to do it for me. I need them to navigate the renewal journey for me. To offer me a bespoke service, price, whatever…designed to meet my needs, my characteristics. To act in my best interest.

Maybe this is a utopia we may never reach; however, I can’t help but imagine a world where ‘the man’ is looking out for me. Providing targeted messaging about me, my spend, and how and where to spend better, wiser, cheaper. Unlike The Beatles, most organisations aren’t drowning in their own success and, instead, are screaming out for a different kind of help! But what if they weren’t? Imagine a world where your bank offers you a discount at your regular coffee spot, knows you’re paying above average on your street for home insurance and provides an alternative, automatically moves your savings to the best available rate, or suggests alternative insurance products based on your driving style, health or lifestyle. The list is endless.

 

 

The point of this story is the power of insight, experience and the Internet of Things (IoT). If our providers harnessed the data they already have (or could have) and turned it into valuable information, they would be more relevant to us and, in return, we would be better off. As consumers we are looking for greater value, and what better way than our existing providers changing the game? One example could be taking the comparison game to us - offering their services bespoke to our needs; after all, they already know us. Another could be improving the journey through their website, making it easier to transact. What if my bank knew my milk was already off, alerted me to buy more and attached a special offer to the message?! By empowering their staff, systems and processes, even the oldest traditional organisations can realise the advantage. Increasing their customer insight and ultimately improving customer experience will bring about new markets, greater revenues and thriving customer loyalty.

 

Don't miss my next blog to see if we can work it out!

 

 

If you would like to learn more about Hitachi and how we can help Financial Service organisations click here.

 

To learn more about me visit my LinkedIn profile here.

Why Digital Transformation must be a strategic priority in 2017

 

It’s with good reason that Digital Transformation has become the latest watchword in our industry; organisations across the world are finally seeing the profound business advantages of leveraging digital technology.

According to a 2016 Forbes Insights and Hitachi survey of 573 top global executives, Digital Transformation sits at the top of the strategic agenda. Legendary former General Electric CEO, Jack Welch sums up perfectly why Digital Transformation has become a board-level priority: “If the rate of change on the outside exceeds the rate of change on the inside, the end is near.”

 

Organisations are seeing such an unprecedented rate of change all around them that Digital Transformation is no longer a ‘nice to have’; it is a ‘must have’ for corporate survival. To discover the state of enterprises’ Digital Transformation projects in 2017, Hitachi partnered with UK technology publication Computer Business Review (CBR) to survey IT decision makers in its readership on their efforts. While not scientifically representative of enterprises across the UK or Europe, the research provides some enlightening anecdotal evidence.

 

In this blog, I’ll explore some of those findings and discuss why I think 2017 will be the year of Digital Transformation.

In the UK, just under two-thirds of CBR readers revealed they are through the emergent stages of Digital Transformation, and consider their organisation to be at an intermediate stage in their journey. Only one in ten described themselves as beginners, with one in four stating they are leading the pack when it comes to transforming their businesses.

 


We’ve found similar scenarios within many of our customers. Some are teetering on the edge, while others, such as K&H Bank, the largest commercial bank in Hungary, are already reaping the rewards. Through upgrading its storage solutions, K&H Bank has halved the time it takes for new business information to arrive in its data warehouse, ready for analysis, and cut its data recovery time by half. This enables K&H Bank to get quicker insights into its business and react faster than its competitors.

 

It is exactly this type of optimisation that is fuelling Digital Transformation. By cultivating improved internal processes and competencies, it drives tangible business benefits. In fact, just under two-thirds of CBR readers identified improving internal operations as the top driver for Digital Transformation, while a quarter highlighted customer experience.

 

Of course, while Digital Transformation can provide both optimised operations and improved customer experience, by initially focusing on internal programmes, organisations can work through issues and absorb the lessons learned. Take, for example, Rabobank in the Netherlands. The finance and services company has transformed its compliance function by optimising its operations through a new platform. This strategy enables simplified access to the structured and unstructured data needed for investigations, easing the regulatory burden on the bank.

 


 

This kind of Big Data analysis, combined with other technologies such as cloud computing and the Internet of Things (IoT), is at the core of many successful Digital Transformation stories. Cloud computing, for example, was cited by 67% of readers surveyed as helping them to progress along their digital journey.

 

Indeed, our customers have demonstrated a keen interest in cloud technology as an integrated element of a Digital Transformation strategy. Deluxe, a US-based finance organisation, is benefitting from improved flexibility, security and control through Hitachi’s enterprise cloud offerings. By moving to a private cloud within a managed services environment, it now has the technology to integrate acquisitions, deploy next-generation applications and accelerate its time-to-market.

 

Other technologies, such as data analytics, cited by 20%, and IoT, cited by 10% of readers, are likely to grow in popularity as more powerful technology is developed. Although awareness of Artificial Intelligence (AI) is increasing, with innovative rollouts from organisations such as Enfield Council, it is not currently a strategic focus for UK businesses on their Digital Transformation journey, cited by only 3% of readers. This is likely to change, however, as more and more applications for the technology are discovered.

 

What our survey highlighted was not whether organisations are starting and progressing their Digital Transformation journey, but when and how far along the path they are. That’s not to say it’s easy. But there is help along the way - my colleague Bob Plumridge recently shared three excellent pieces of advice, regardless of where you are in your journey. And, most importantly, the rewards are worth it. Improving internal operations and processes will help drive increased innovation and therefore improve customer experience. Embarking on Digital Transformation will also help keep your pace of change ahead of the competition, just as Jack Welch advised.

Last week my team hosted an exciting event at the Four Seasons in Houston, TX, progressing our efforts in this vertical. It was an event that mixed users, partners and customers, plus the many faces of Hitachi. Our aim was two-pronged:

  1. Be inspired through the continued exploration of new challenges from the industry, and
  2. Validate areas we're already progressing, adjusting based upon user feedback.

Doug Gibson and Matt Hall (Agile Geoscience) kicked us off by discussing the state of the industry and various challenges with managing and processing Seismic data. It was quite inspiring and certainly revealing to hear where the industry is investing across Upstream, Midstream and Downstream -- the meat of it: Upstream used to be king, but investments are moving to both Midstream and Downstream. Matt expressed his passion for literally seeing the geological progression of the Earth through Seismic Data. What an infectious and grand meme!

More generally, I believe that our event can be seen as a "coming out party" for work we began several years ago -- you'll continue to hear more from us as we work our execution path. Further, inspired by Matt Hall, we ran a series of un-sessions resulting in valuable interactions.

 


The Edge or Cloud?

In one of the un-sessions, Doug and Ravi (Hitachi Research in Santa Clara) facilitated a discussion about shifting some part of analytics to the edge for faster and more complete decision making. There are many reasons for this, and I think the three most significant are narrow transmission rates, large data (as in velocity, volume and variety), and tight decision-making schedules. Even though some processes (especially geologic ones) may take weeks, months or years to conclude, when urgency matters a round trip to a centralized cloud fails! Specifically, HSE (Health, Safety and Environment) related matters, plus matters related to the production of both oil and gas, mandate rapid analysis and decision making. Maybe a better way to say this is through numerical "orders of magnitude" -- specific details are anonymized to "protect the innocent."

  • Last mile wireless networks are being modernized in places like the Permian Basin with links moving from satellite (think Kbps) to 10Mbps using 4G/LTE or unlicensed spectrum.  Even these modernized networks may buckle when faced with terabytes and petabytes of data on the edge.
  • Sensing systems from companies like FOTECH are capable of producing multiple terabytes per day, which join a variety of other emerging and very mature sensing platforms. Further, digital cameras are also present to protect safety and guard against theft. This means that the full set of Big Data categories (volume, velocity and variety) exists on the edge.
  • In the case of Seismic exploration systems, used to acquire data, designs include "converged-like" systems placed in ISO containers to capture and format Seismic Data, potentially up to the scale of tens of petabytes. Because of the remote locations these exploration systems operate in, there is a serious lack of bandwidth to move data from edge to core over networks. Therefore, services companies literally ship the data from edge to core on tape, optical or ruggedized magnetic storage devices.
  • Operators of brown-field factories with thousands of events and tens of "red alarms" per day desire to operate more optimally.  However, low bit rate networks and little to no storage in the factory, to capture the data for analysis, suggest something more fundamental is needed on the edge before basic analysis of current operations can start.

This certainly gets me thinking that while the public cloud providers are trying to get all of this data onto their platforms, there are some hard realities to cope with. Maybe a better way to classify this problem is as trying to squeeze an elephant through a straw! However, many of the virtues of cloud are desirable, so what can we do?
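To make the "skinny down the data at the edge" idea concrete, here is a small, hypothetical sketch: raw sensor readings are collapsed locally into per-window summaries and alarm flags, so only compact records need to cross the constrained link. The record shapes and the alarm threshold are invented purely for illustration.

```python
# Hypothetical sketch of edge-side reduction: summarise raw readings locally
# and forward only compact summaries and alarms over a constrained link.
from statistics import mean

ALARM_THRESHOLD = 90.0  # invented threshold, for illustration only


def summarise_window(sensor_id: str, readings: list[float]) -> dict:
    """Collapse a window of raw readings into one small record."""
    return {
        "sensor_id": sensor_id,
        "count": len(readings),
        "mean": round(mean(readings), 2),
        "max": max(readings),
        "alarm": max(readings) > ALARM_THRESHOLD,
    }


def process_at_edge(windows: dict[str, list[float]]) -> list[dict]:
    # Only these summaries (and any alarms) go to the core or cloud;
    # the raw samples stay on edge storage for later bulk shipment.
    return [summarise_window(sid, readings) for sid, readings in windows.items()]


if __name__ == "__main__":
    raw = {"pump-7": [71.2, 73.5, 95.1], "valve-3": [40.0, 41.1, 39.8]}
    for summary in process_at_edge(raw):
        print(summary)
```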

 

Progressing Cloud to the Edge

Certainly the faces of Hitachi already have industry-optimized solutions in the market that enrich data at the edge, analyze and process it to skinny down edge data, and provide business advisory systems capable of improving edge-related processes. However, my conclusion from last week is that resolutions to these complex problems are less about what kind of widget you bring to the table and more about how you approach solving a problem. This is indeed the spirit of the Hitachi Insight Group's Lumada Platform, because it includes methods to engage users and ecosystems, and brings tools to the table as appropriate. I was inspired to revisit problem solving (not product selling) because Matt Hall said, "I was pleased to see that the Hitachi folks were beginning to honestly understand the scope of the problem" as we closed our summit.

 

Is O&G the poster child for Edge Cloud? It seems that, given the challenges uncovered during our summit plus other industry interactions, the likely answer is yes. Perhaps the why is self-evident, because processing on the edge, purpose-building for the industry and mixing in cloud design patterns are obvious as stacks are modernized. It is the "how" part that I believe deserves attention. Matt's quote from the last paragraph guides us on how to push cloud principles to the edge. Essentially, for this industry we must pursue "old fashioned" and sometimes face-to-face interactions with people who engage in various parts of the O&G ecosystem, like geologists, drilling engineers, geophysicists, and so on.

Given these interactions, which problems to solve, and their scope and depth, become more obvious and even compelling. It is then, when we draft execution plans and make them real, that we will resolve to build the cloud at the edge. However, if we sit in a central location and merely read about and imagine these problems, we won't develop sufficient understanding and empathy to really do our best. So, again, yes, Oil and Gas will engender edge clouds, but it is the adventure of understanding user journeys that guides us on which problems matter.

 

Attributions

  1. Top Banner Picture - Author: Stig Nygaard, URL: Oil rig | Somewhere in the North Sea... | Stig Nygaard | Flickr, License: Creative Commons
  2. Seismic Image - Relinked from Burning the surface onto the subsurface — Agile, and from the USGS data repository.
Harry Zimmer

Data is the New Oil

Posted by Harry Zimmer Jan 31, 2017

There are at least 20 terms today that describe something that was once called Decision Support (more than 30 years ago). Using computers to make fact-based, intelligent decisions has continually been a noble goal for all organizations to achieve. Here we are in 2017, and the subject of Decision Support has been by and large forgotten. The following word cloud shows many of the terms that are used today. They all point to doing super-advanced Decision Support. Some of the terms, like Data Warehouse, have been replaced (or superseded) by the concept of the Data Lake. Other old terms, like Artificial Intelligence (AI), have been re-energized as core to most organizations' IT plans for this year.

 

[Word cloud: current terms for what was once called Decision Support]

 

In discussing this whole set of topics with many customers around the world over the past 12 months, it has become clear to me that in general most companies are still struggling with the deployment and in some cases the ROI around this whole set of topics. The education levels are also all over the map and the sophistication of systems is inconsistent.

 

In my upcoming WebTech (webinar) I will be sharing a new model that takes into account the old and the new. It will deliver some foundational education in under an hour that should provide incredible clarity – especially for C-level executives.

 

I have developed what I think is a very useful model, or architecture, that can be adapted and adopted by most, if not all, organizations. The model provides an ability to self-assess exactly where the organization is and what the road forward will look like. All of this has been done with the goal of achieving state-of-the-art capability in 2017.

 

The model is a direct plug-in to the industry-wide digital transformation initiative. In fact, without the inclusion of the model – or something similar to it – a digital transformation project will most likely fail.

 

The other direct linkage is to another hot topic: the Internet of Things (IoT). Here too there is a direct connection to the model. In fact, as IoT becomes mainstream across all organizations, it will be a valuable new source of data drawn from the evolving world of sensor technologies.

 

I hope you are able to join me for this WebTech. I am sure you will find that it will be extremely valuable to you and your organization and spur a ton of follow-on discussion.

 

To register for my upcoming WebTech, click here.

 

For additional readings:

  • Storytelling With Data, by Cole Nussbaumer Knaflic
  • 80 Fundamental Models for Business Analysts, by Alberto Scappini
  • Tomorrow Today, by Donal Daly
  • Predictive Analytics for Dummies, by Bari, Chaouchi, & Jung

In Superman 30, the Man of Steel theorises that the latest incarnation of his enemy Doomsday emits so much energy that when he emerged he boiled the ocean. Not an easy task, even for a supervillain, and certainly out of reach for mere mortals.

 

So why are some enterprises taking this approach with Digital Transformation projects? Moreover, if overreaching doesn’t work what steps should be taken?

 

Hitachi recently partnered with Forbes Insights, interviewing nearly 600 C-level executives from North America, Latin America, Europe and Asia-Pacific. The global research revealed that a transition toward digital maturity involves five major steps, some of which are proving easier to take than others.

 

1. Top Strategic Priority

Half of executives polled said their organisation will be vastly transformed within two years. I expect the figure is actually higher in Europe, where companies are already on the move. One bank we work with even has a permanent board member dedicated to, and responsible for, its Digital Transformation.

 

The realisation has dawned in boardrooms that growth and survival are now tied-up with digital capabilities.

 

2. Enterprise-wide approach

The research revealed that cross-functional teams are not adequately involved in developing or implementing strategy, with the bulk of this work done by IT. In our experience this is no longer the case across Europe or the Middle East. According to Gartner for example, shadow IT investments, purchases outside of CIO control, often exceed 30 percent of total IT spend.

 

I recently attended an IDC event in Dubai dedicated to Digital Transformation in the banking and finance sector. The congress was dominated by line-of-business executives from sales and marketing rather than IT leaders. Every session and attendee I spoke with shared an active interest in making Digital Transformation the cornerstone of their company strategy.

 

3. Focused on business outcomes

The ability to innovate was the top measure of success for 46% of companies polled, and it’s something I hear a lot from customers.

 

The ability to innovate cannot be achieved through technology alone; instead, enterprises should seek partners they can trust to solve the underlying technical and architectural challenges and deliver a solution that addresses and enables business outcomes.

 


 

One thing the report does not consider, however, is what will happen to those who fail to invest in digital capabilities. Failure to modernise cyber security systems, for example, is an issue regularly covered by media outlets. Prof Richard Benham, chairman of the National Cyber Management Centre, has even predicted that in 2017 "a major bank will fail as a result of a cyber-attack, leading to a loss of confidence and a run on that bank."

 

Digital Transformation isn’t essential to just growth, but survival too.

 

4. Untapped potential of data and analytics

Only 44% of companies surveyed see themselves as advanced leaders in data and analytics. In my opinion, this is a conservative estimate. Some businesses may be guarding achievements – as poker players do. The term poker face relates to the blank expressions players use to make it difficult to guess their hand.  The fewer the cues, the greater the paranoia among other players.

 

You could speculate that some businesses may be keeping their best weapons secret too.

Besides, nobody wants to be that company who brags one day and is overtaken the next.

 

But we should take comfort from those companies in Europe making solid progress. K&H Bank, the largest commercial bank in Hungary, has halved the time it takes for new business figures to arrive in its data warehouse, ready for analysis and cut its data recovery time by 50%. Or consider Rabobank in The Netherlands, which has gained “control of the uncontrollable” and mastered the handling of its data for compliance purposes.

 

5. Marrying technology with people power

When dealing with challenges associated with Digital Transformation, the survey found that people are organisations’ biggest obstacle. Investing in technology is ranked lowest – indicating companies may already have the technology but not the skills. Support from strategic technology partners can help bridge the gap here.

 

These obstacles also bring me back to my original warning – don’t try to boil the ocean. An organisation might race ahead and invest heavily in technology but without the right culture and know-how, it could waste an awful lot of money at best, and lose senior support at worst.

 

So, what can your business learn from this research? Here are three things that you could do, within a safe temperature range:

 

  1. Hire new talent to challenge the status quo. Your current team may not have the fresh vision needed to shift your enterprise to mode 2 infrastructure. You need an intergenerational mix of young, enthusiastic staff and seasoned experts
  2. Nominate a senior executive as the Digital Transformation ambassador. Organisations need a senior sponsor to push the agenda. To overcome engrained ways of doing things, you need people with strong messages that can cascade down
  3. Be bold and take calculated risks. One bank in Europe has even banned its CIO from buying mode 1 IT infrastructure – meaning the bank has no choice but to embrace a more agile digital environment (rather than fall back on the devil it knows). Another bank in The Netherlands took the bold step of replacing its hierarchical structure with small squads of people, each with end-to-end responsibility for making an impact on a focused area of the business.

 

To achieve Digital Transformation, enterprises need to push the internal idea of ‘safe’ as far as possible. As Mark Zuckerberg declared “in a world that’s changing really quickly, the only strategy that is guaranteed to fail is not taking risks”. If a business does so iteratively and learns from its mistakes, it won’t run the risk of trying to boil the ocean and failing from the sheer magnitude of the task.

 

If you would like to learn more about the research, I recommend you download the full report here.

In the post that Shmuel and I published last month (The Many Core Phenomena), we hinted at the end about some upcoming news.

Hitachi has demonstrated a functional prototype running with HNAS and VSP to capture finance data and report on things like currency market movements, etc. (more on this in the near future).

Well, there is obviously more to the story than just this somewhat vague statement, and Maxeler Technologies has announced our mutual collaboration around a high-fidelity packet capture and analytics solution. To provide a bit more detail, I'm embedding a video, narrated by Maxeler's Itay Greenspan, within this post.

 

Joint Maxeler HDS CPU-less Packet Capture Plus Analytics

Commentary
As HDS and Maxeler set out on our collaborative R&D journey, we were initially inspired by market intelligence related to an emerging EU Financial Services directive called MiFID II. This EU directive, and its associated regulation, was designed to help the regulators better handle High Frequency Trading (HFT) and so-called Dark Pools. In other words, to increase transparency in the markets. Shmuel Shottan, Scott Nacey, Karl Kholmoos and I were all aware of HFT efforts because we ran into folks from a "captive startup" who "spilled the beans." Essentially, some Financial Services firms were employing these "captive startups" to build FPGA-based HFT solutions, enabling money making in a timespan faster than the blink of an eye. So as Maxeler and HDS approached our R&D, we assumed a hypothetical use case which would enable the capture and decode of packets at speeds equivalent to HFT. We then took the prototype on the road to validate or invalidate the hypothesis and see where our R&D actions would fit in the market. Our findings were surprising, and while the prototype did its job of getting us in the door, we ultimately ended up moving in a different direction.

 

As the reader/viewer can see in the video, we leveraged many off-the-shelf components and technologies -- we actually used generation -1 tech, but heck, who's counting. As stated in the video, we accomplished our operational prototype through the use of Maxeler's DFE (Data Flow Engine) network cards, Dataflow-based capture/decode capability executing on Dataflow hardware, a hardware-accelerated NFS client, Hitachi's CB500, Pentaho, and Hitachi Unified Storage (HUS). While mentioned in the video, a point worth restating is: all of the implemented hardware-accelerated software functions fit on about 20% - 30% of the available Dataflow hardware resources, and since we're computing in space, the vast majority of that space remains free for future novel functions. Furthermore, the overall system from packet capture to NFS write does not use a single server-side CPU cycle! (Technically, the NFS server, file system and object/file-aware sector caches are all also running on FPGAs. So even on the HUS, general CPUs are augmented by FPGAs.)

 

As serial innovators we picked previous-generation off-the-shelf technologies for two primary reasons. The first and most important was to make the resulting system fit into an accelerated market availability model -- we wanted the results to be visible and reachable without deep and lengthy R&D cycles. Second was an overt choice to make the prototype mirror our UCP (Unified Compute Platform) system so that, when revealed, we could be congruent with our current portfolio and field skill sets. Beyond these key points, a secondary and emergent benefit is that the architecture could readily be extended to support almost any packet analysis problem. (While unknown to us at the time, the architecture also resembles both the Azure FPGA-accelerated networking stack and a private version of Amazon EC2 F1, lending further credibility to it being leading edge and general purpose.)

Something that was readily visible during our rapid R&D cycle is Maxeler's key innovation: lowering the bar for programming an FPGA from needing to be an Electrical Engineer/Computer Engineer to being a mere mortal developer with knowledge of C and Java. For reference, what we've historically observed is an FPGA development cycle that takes no less than six months for a component-level functional prototype; in the case of Maxeler's DFEs and development toolchain, we witnessed 3-4 weeks of development time for a fully functional prototype system. This is dramatic! For a view on Maxeler's COTS-derived FPGA computing elements (DFEs) and our mutual collaboration, let me quote Oskar Mencer (Maxeler's CEO).

Multi-scale Dataflow Computing looks at computing in a vertical way and multiple scales of abstraction: the math level, the algorithm level, the architecture level all the way down to the bit level. Efficiency gained from dataflow lies in maximizing the number of arithmetic unit workers inside a chip, as well as a distributed buffer architecture to replace the traditional register file bottleneck of a microprocessor. Just as in the industrial revolution where highly skilled artisans get replaced by low-wage workers in a factory, the super high end arithmetic units of a high end microprocessor get replaced by tiny ultra-low energy arithmetic units that are trained to maximize throughput rather than latency. As such we are building latency tolerant architecture and achieve maximum performance per Watt and per unit of space.

 

The key to success in such an environment is data, and therefore partnership between Maxeler and Hitachi Data Systems is a natural opportunity to maximize value of storage and data lakes, as well as bring dataflow computing closer to the home of data. (Oskar Mencer)

Projecting into the now and ahead a bit: firstly, we're "open for business." Maxeler is in the HDS TAP program and we can meet in the market, engage through HDS, and when it makes sense we (HDS) are keen to help users directly realize extreme benefits. As for targets, we need a tough network programming or computing problem where the user is willing to reimagine what they are doing. In the case of the already constructed packet capture solution, we could extend, with some effort, from financial packet analysis to, say, cyber packet forensics, Telco customer assurance, low attack surface network defenses and so on. For other potential problems (especially those in the computing space) please reach out.

With respect to projecting a bit into the future, I want to pull forward some of Oskar's words from his quote to make a point: "[Computing] per unit space." This is something that I really had to wrap my head around to understand, and I think it is worthy of both calling out and explaining a bit -- the futurism aspect will come into focus shortly. Unlike CPUs, which work off complex queuing methodologies and compute in time, Maxeler's DFEs, and more generally FPGAs, compute in space. What that means is that as data flows through the computing element, it can be processed at ultra-low latency and little cost. This is nothing short of profound, because it means that in the case of networking the valueless action of moving data from system A to system B can now provide value. This is in fact what Microsoft's Azure FPGA acceleration efforts for Neural Networks, Compression, Encryption, Search acceleration, etc. are all about.

To drive the point home further, what if you could put a networking card in a production database system and, through the live database log, ETL the data via a read operation, immediately putting it into your data warehouse? This would completely remove the need for an entire Hadoop infrastructure or for performing ELT, and that means computing in space frees data center space. Putting ETL on a programmable card is my projection ahead to tease the reader with now-possible use cases, and ETL logic executing on a Dataflow card gets down to computing in space, not time!
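As a purely software-level analogue of that idea (the actual proposal is to run this logic on a Dataflow card in the data path, not on a server CPU), the sketch below transforms hypothetical change-log events into warehouse-ready rows as they stream past, with no intermediate batch ELT stage. Every name here is invented for illustration.

```python
# Hypothetical, software-only analogue of "ETL on the card": transform change-log
# events into warehouse rows as they stream past, instead of batch ELT later.
from typing import Iterable, Iterator


def read_change_log(events: Iterable[dict]) -> Iterator[dict]:
    """Stand-in for tapping a live database log (or a network card observing it)."""
    for event in events:
        yield event


def transform(event: dict) -> dict:
    # Reshape the raw log event into the warehouse schema on the fly.
    return {
        "order_id": event["key"],
        "amount_usd": round(event["payload"]["amount"], 2),
        "op": event["op"].upper(),
    }


def stream_into_warehouse(events: Iterable[dict], warehouse: list) -> None:
    # In the dataflow version this loop is the "computing in space" part:
    # the data is processed while it moves, not parked and re-read later.
    for event in read_change_log(events):
        warehouse.append(transform(event))


if __name__ == "__main__":
    log = [{"key": 1, "op": "insert", "payload": {"amount": 19.995}}]
    table: list = []
    stream_into_warehouse(log, table)
    print(table)
```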

Michael Hay

The Many Core Phenomena

Posted by Michael Hay Dec 13, 2016

Introduction

Have you ever started an effort to push something, figured out it is like pushing a rope, paused, and then realized someone else has picked it up and is pulling you instead? Well, that is what F1 (not the car racing, but Amazon's EC2 F1) did for the market's use of many core computing. For Hitachi, our usage of FPGAs, custom chips and standard CPUs, also known as many core computing, is something we do because it elegantly and efficiently solves customer problems. Even if it isn't fashionable, it provides benefits, and we believe in benefiting our customers. So in a very real sense the self-initiated mission that Shmuel and I embarked on more than 10 years ago has just come full circle.

 

About a decade ago we (Shmuel Shottan and Michael Hay) began an open dialogue and emphasized our efforts utilizing FPGA technologies in combination with "traditional" multi-core processors. The product was of course HNAS, and the results were levels of performance unachievable using only general-purpose CPUs. Another benefit is that the cost performance of such an offering is naturally superior (from both a CAPEX and OPEX perspective) to that of an architecture utilizing only one type of computing.

 

Our dialogue depicted an architecture in which the following attributes defined the implementation:

  • High degree of parallelism - Parallelism is key to performance, and many systems attempt to achieve it. While processor-based implementations can provide some parallelism (provided the data has parallelism), as demonstrated by traditional MIMD architectures (cache-coherent or message-passing), such implementations require synchronization that limits scalability. We chose fine-grain parallelism by implementing state machines in FPGAs.
  • Off-loading - Off-loading allows the core file system to independently process metadata and move data while the multi-core processor module is dedicated to data management. This is similar to traditional coprocessors (DSPs, Systolic Arrays, Graphics engines). This architecture provides yet another degree of parallelism.
  • Pipelining - Pipelining is achieved when multiple instructions are simultaneously overlapped in execution. For a NAS system it means multiple file requests overlapping in execution (a rough software analogy appears after the quote below).
…”So, why offload network file system functions? The key reason for it was the need to achieve massive fine grain parallelism. Some applications indeed lend themselves well to achieving parallelism with a multiplicity of cores; most do not. Since a NAS system will “park” on network and storage resources, any implementation that requires a multiplicity of processors will create synchronization chatter larger than the advantage of adding processing elements beyond a very small number. Thus, the idea of offloading to a “co-processor” required the design from the ground up of an inherently parallelized and pipelined processing element by design. Choosing a state machine approach and leveraging this design methodology by implementing in FPGAs provided the massive parallelism for the file system, as the synchronization was 'free'…” (Shmuel Shottan)
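As a very coarse software analogy to the pipelining point above, and nothing more than that (it says nothing about the FPGA state-machine implementation itself), the sketch below keeps several file requests in flight at once instead of serving them strictly one after another; the request handler and timings are invented for illustration.

```python
# Coarse software analogy of pipelining: several file requests are in flight
# at once rather than being handled strictly one after another.
from concurrent.futures import ThreadPoolExecutor
import time


def handle_request(name: str) -> str:
    """Stand-in for servicing a single NAS file request."""
    time.sleep(0.1)  # pretend I/O latency
    return f"served {name}"


requests = [f"file-{i}.dat" for i in range(8)]

start = time.time()
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(handle_request, requests))
print(results)
print(f"elapsed: {time.time() - start:.2f}s")  # roughly 0.2s overlapped vs about 0.8s serial
```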

The implementation examples given in the above quote were about network file system functions, dedupe, etc. At the time the reference was published it seemed like a depiction of an esoteric technology, and it did not resonate well within the technical community. Basically, it was rather like pushing on a rope, meaning it was something of an exercise in futility for us. People were on the general purpose CPU train and weren't willing to think differently. Maybe a better way to say it: the perceived return from the investments in writing for FPGAs, GPUs and other seemingly exotic computing devices was low. If you did the math it wasn't so, but at the time that was the perception. Another attribute of the time: Moore's law still had plenty of gas. So again, our message was like pushing on a rope: lots of effort, very little progress. Our arguments weren't lost on everyone, though. The effort was tracked by HiPC, and Shmuel was invited to deliver a talk at the HiPC conference. Additionally, within the Reconfigurable Computing track of IEEE, people paid attention.

 

Mr. Moore, strike one for the general-purpose CPU: Intel acquired Altera, an FPGA company.

In a world where the trend of commoditization in storage, servers and networking has been discussed many times and “accepted” by many, the addition of the FPGA as an accepted “canvas” for developers to paint on is natural and welcome. Intel, the world's largest chip company, is copying a page from its playbook: identify a mature and growing market and then embed that function into its chipsets. Does anyone remember when motherboards included only compute and core logic elements? Intel identified graphics, networking, SAS connectivity (now NVMe) and core RAID engines as ubiquitous over the previous decades, and indeed most NIC, HBA, RAID and graphics chip providers vanished.

 

Companies that leverage the functions provided by Intel's chipsets for supply-chain efficiency, while continuing to innovate for users not satisfied with mediocrity, will excel (look at NVIDIA with its focus on GPUs and CUDA). Others, which elected to compete with Intel on cost, perished. The lesson is that when a segment matures into a substantial market, expect commoditization, and innovate on top of and in addition to the commodity building blocks, since one size does not fit all.

 

Some carry the commoditization argument further. Not only are all networking, storage and compute systems predicted to run on the same Intel platform from one's favorite OEM, but even software will become just “general-purpose software”; thus, the argument goes, effort put into software development is a waste since Linux will eventually do just as well. This hypothesis is fundamentally flawed. It conflates leveraging a supply chain for hardware, and open source software where applicable, with the claim that innovation is dead. Innovation is not, and never will be, dead.

 

IoT, AI and other modern applications will only place more demands on networking, compute and storage systems. Innovation should, and will, now include implementing relevant applications with FPGAs and, where appropriate, custom chips. Hitachi, as a leading provider of enterprise scalable systems, is well positioned to lead in this new world; honestly, we know of no other vendor better positioned to benefit from such a trend. There are several reasons for this, but the most important is that we recognize innovation isn't about OR but AND: general-purpose CPUs and FPGAs, scale-up and scale-out, and so on. We don't extinguish skills because of fashion; we invest over the long term and return the benefits to our customers.

 

Mr. Moore, second swing and strike two: Microsoft catapults Azure into the future

Microsoft announced that it has deployed hundreds of thousands of FPGAs (field-programmable gate arrays) across servers in 15 countries and five different continents. The chips have been put to use in a variety of first-party Microsoft services, and they're now starting to accelerate networking on the company's Azure cloud platform.

 

In addition to improving networking speeds, the FPGAs, which sit on custom Microsoft-designed boards connected to Azure servers, can also be used to improve the speed of machine-learning tasks and other key cloud functionality. Microsoft hasn't said exactly what the boards contain, other than revealing that they hold an FPGA, static RAM chips and hardened digital signal processors. Microsoft's deployment of the programmable hardware is important as the previously reliable increase in CPU speeds continues to slow down. FPGAs can provide an additional speed boost in processing power for the particular tasks that they've been configured to work on, cutting down on the time it takes to do things like manage the flow of network traffic or translate text.

 

Azure CTO Mark Russinovich said using the FPGAs was key to helping Azure take advantage of the networking hardware that it put into its data centers. While the hardware could support 40Gbps speeds, actually moving all that network traffic with the different software-defined networking rules that are attached to it took a massive amount of CPU power.

"That's just not economically viable," he said in an interview. "Why take those CPUs away from what we can sell to customers in virtual machines, when we could potentially have that off-loaded into FPGA? They could serve that purpose as well as future purposes, and get us familiar with FPGAs in our data center. It became a pretty clear win for us."

"If we want to allocate 1,000 FPGAs to a single [deep neural net] we can," said Mike Burger, a distinguished engineer in Microsoft Research. "We get that kind of scale." (Microsoft Azure Networking..., PcWorld)

That scale can provide massive amounts of computing power. If Microsoft used Azure's entire FPGA deployment to translate the English-language Wikipedia, it would take only a tenth of a second, Burger said on stage at Ignite. (Programmable chips turning Azure into a supercomputing powerhouse | Ars Technica )

Third strike, you're out: Amazon announces FPGA-enabled EC2 instances, the new F1
The paragon of the commodity movement, Amazon Web Services rather than Intel, has just placed bets on Moore's Law, but not where you might think. You see, if we change the game and say that Moore was right about the trajectory of processing power in general, then Mr. Moore's hypothesis still holds for FPGAs, GPUs and other special-purpose elements. Both of us believe that if software is something like a digital organism, then hardware is the digital environment those organisms live on. And just as in the natural world, both the environment and the organisms evolve independently and together, dynamically. Let's tune into Amazon's blog post announcing F1 to get a snapshot of their perspective.

"Have you ever had to decide between a general purpose tool and one built for a very specific purpose? The general purpose tools can be used to solve many different problems, but may not be the best choice for any particular one. Purpose-built tools excel at one task, but you may need to do that particular task infrequently.

 

"…Computer engineers face this problem when designing architectures and instruction sets, almost always pursuing an approach that delivers good performance across a very wide range of workloads. From time to time, new types of workloads and working conditions emerge that are best addressed by custom hardware. This requires another balancing act: trading off the potential for incredible performance vs. a development life cycle often measured in quarters or years….

 

"...One of the more interesting routes to a custom, hardware-based solution is known as a Field Programmable Gate Array, or FPGA. In contrast to a purpose-built chip which is designed with a single function in mind and then hard-wired to implement it, an FPGA is more flexible. It can be programmed in the field, after it has been plugged in to a socket on a PC board….

 

"…This highly parallelized model is ideal for building custom accelerators to process compute-intensive problems. Properly programmed, an FPGA has the potential to provide a 30x speedup to many types of genomics, seismic analysis, financial risk analysis, big data search, and encryption algorithms and applications…. (Developer Preview..., Amazon Blogs)

Hitachi and its ecosystem of partners have led the way, and continue to lead, in bringing FPGA-based and, where relevant, custom-chip-based innovations to areas such as genomics, seismic sensing and analysis, financial risk analysis, big data search, combinatorics and more.

 

The real game: innovation is about multiple degrees of freedom

To end our discussion let’s start by reviewing Hitachi’s credo.

[Our aim, as members of Hitachi,] is to further elevate [our] founding concepts of harmony, sincerity and pioneering spirit, to instill a resolute pride in being a member of Hitachi, and thereby to contribute to society through the development of superior, original technology and products.

 

Deeply aware that a business enterprise is itself a member of society, Hitachi [members are] also resolved to strive as good citizens of the community towards the realization of a truly prosperous society and, to this end, to conduct [our] corporate activities in a fair and open manner, promote harmony with the natural environment, and engage vigorously in activities that contribute to social progress.

This credo has guided Hitachi employees for more than 100 years and in our opinion is a key to our success.  What it inspires is innovation across many degrees of freedom to improve society.  This freedom could be in the adoption of a clever commercial model, novel technologies, commodity technologies, new experiences, co-creation and so on.  In other words, we are Hitachi Unlimited! 

 

VSP FPGA Blade - sketch.jpeg

During the time of pushing on the rope (with respect to FPGA and custom chip usage), if we had listened to the pundits and chased only general-purpose CPUs, we would not be in a leadership position now that the market has picked up and pulled. Given this innovation rope, here are four examples:

 

Beyond these examples, there are many active and innovative FPGA and custom chip development tracks in our research pipeline. So we are continuing the intention of our credo by picking up other ropes and pushing abundantly to better society!

What do complex social infrastructure systems and artificial intelligence systems have in common? They both impose enormous and expanding computational loads. Given enough time and resources these loads might be successfully processed using conventional computing architectures. However, as their complexity and volume increase, at some point it will become economically and technically infeasible to continue with conventional computing.

 

What if you could move specific computational problem-solving loads to a chip designed to handle these complex problems far more efficiently, and at a fraction of the footprint and cost?

 

Enter Hitachi's Ising chip. Rather than executing problem-solving procedures sequentially, as with conventional compute architectures, Hitachi is proposing a different concept called “natural computing” as applied to an Ising model.

 

At this point we need to understand the following key terms:

 

1. Combinatorial optimization

Combinatorial optimization refers to a class of mathematical problems where a solution must be found that maximizes (or minimizes) a performance index under given conditions. A characteristic of combinatorial optimization problems is that the number of candidate solutions increases explosively as the number of parameters increases. A classic example is the “traveling salesperson problem”: the salesperson has a list of cities that must be visited, and the question is “what is the most efficient route through all of the cities that minimizes my travel time?” What is being sought is the best of many possible answers, and as more cities are added to the list the number of possible answers grows dramatically (the small sketch below shows how quickly). Social infrastructure and AI technologies can generate very large combinatorial optimization problems.
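To give a feel for the explosion, here is a minimal Python sketch, with made-up city coordinates, that solves a tiny traveling-salesperson instance by brute force and then prints how the number of candidate tours grows with the number of cities. It only illustrates the problem class, not how Hitachi's chip works.

import itertools
import math

# Hypothetical city coordinates, purely illustrative.
cities = {"A": (0, 0), "B": (1, 5), "C": (4, 2), "D": (6, 6), "E": (3, 7)}

def tour_length(order):
    # Total distance of visiting the cities in the given order and returning home.
    legs = zip(order, order[1:] + order[:1])
    return sum(math.dist(cities[a], cities[b]) for a, b in legs)

# Brute force: enumerate every possible ordering of the cities.
best = min(itertools.permutations(cities), key=tour_length)
print("best tour:", best, "length:", round(tour_length(best), 2))

# The explosion: the number of candidate tours grows factorially.
for n in (5, 10, 15, 20):
    print(n, "cities ->", math.factorial(n), "candidate tours")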

 

2. Natural computing

Quoting Springer Intl. Publishing, publisher of Natural Computing Journal: http://link.springer.com/journal/11047

"Natural Computing refers to computational processes observed in nature, and human-designed computing inspired by nature. When complex natural phenomena are analyzed in terms of computational processes, our understanding of both nature and the essence of computation is enhanced. Characteristic for human-designed computing inspired by nature is the metaphorical use of concepts, principles and mechanisms underlying natural systems. Natural computing includes evolutionary algorithms, neural networks, molecular computing and quantum computing."

 

3. The Ising model

The Ising model (named after Ernst Ising) is a mathematical model concerned with the physics of phase transitions, which occur when a small change in a parameter causes a large-scale, qualitative change in the state of a system. The properties of a magnetic material are determined by magnetic spins, each of which can be oriented up or down. An Ising model is expressed in terms of the individual spin states, the interaction coefficients that represent the strength of the interactions between pairs of spins, and the external magnetic coefficients that represent the strength of the external magnetic field.
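For reference, in the commonly used textbook form (sign conventions and index ranges vary by formulation), the energy of an Ising model with spins s_i taking the values +1 or -1, interaction coefficients J_ij and external field coefficients h_i is:

E(s) = -\sum_{i<j} J_{ij}\, s_i s_j - \sum_{i} h_i s_i

Finding the spin configuration with the minimum energy is itself a combinatorial optimization problem, which is exactly why other optimization problems can be mapped onto it.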

 

 

What has Hitachi Research done?

 

Instead of heating up the data center by running countless permutations of variable combinations on conventional computers in search of the optimal one, Hitachi proposes a natural computing method: use a natural phenomenon to model the problem to be solved (mapping), and take advantage of the convergence inherent in that phenomenon to converge on a solution.

 

Hitachi proposes replicating the Ising model in a Complementary Metal Oxide Semiconductor (CMOS) circuit. The combinatorial problem is mapped onto the Ising model in the CMOS circuit in such a way that the problem's performance index corresponds to the model's energy; the Ising model is then allowed to converge so that the spin state adopts the minimum-energy configuration. That minimum-energy configuration is equivalent to obtaining the combination of parameters that minimizes the performance index of the original optimization problem. (A simple software analogue of this mapping and convergence is sketched below.)
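Here is a rough software analogue, in Python, of converging a small random Ising instance toward a low-energy spin state using simulated annealing. The instance values are invented and the sketch only illustrates the idea of mapping a cost to energy and letting the system settle; the actual chip converges in CMOS hardware, and its mechanism is not simulated annealing run on a CPU.

import math
import random

# A tiny random Ising instance: couplings J[i][j] and external fields h[i].
# The values are purely illustrative; a real problem would be mapped here.
n = 12
random.seed(1)
J = [[random.choice([-1, 0, 1]) for _ in range(n)] for _ in range(n)]
h = [random.choice([-1, 0, 1]) for _ in range(n)]

def energy(s):
    e = -sum(h[i] * s[i] for i in range(n))
    e -= sum(J[i][j] * s[i] * s[j] for i in range(n) for j in range(i + 1, n))
    return e

# Start from a random spin state and let it settle toward minimum energy.
s = [random.choice([-1, 1]) for _ in range(n)]
t = 2.0
while t > 0.01:
    i = random.randrange(n)
    before = energy(s)
    s[i] = -s[i]                      # propose flipping one spin
    delta = energy(s) - before
    if delta > 0 and random.random() >= math.exp(-delta / t):
        s[i] = -s[i]                  # usually reject flips that raise the energy
    t *= 0.999                        # cool down gradually

print("spin state:", s)
print("energy:", energy(s))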

 

Early prototypes have proven effective and as much as 1,800 times more energy efficient than competing Ising computer technologies. Hitachi continues to innovate in this area to address the growing computational demands of AI and social infrastructure systems. If you would like to learn more about this Hitachi Research activity, I recommend the following articles published in Hitachi Review magazine:

http://www.hitachi.com/rev/pdf/2015/r2015_08_111.pdf

http://www.hitachi.com/rev/pdf/2016/r2016_06_110.pdf

INTRODUCTION

At the top of the headlines, ahead of Brexit, is Mr. Trump's ascendancy from Candidate Trump to President-elect Trump. Hidden below these headlines, however, are older news threads and data that carry lessons and messages. Is a new Darker Age emerging? As I learned a couple of years ago, the data doesn't seem to point in that direction; in fact, it trends toward something more positive1. So what do we make of this? I think that instead of living in interesting times, we live in uncertain times! To motivate our discussion, let's look at a few headlines and, in some cases, their hidden dimensions.

 

uncertain news.png

Samsung's exploding phones, recalls and product withdrawals are very visible; if you have traveled on an airplane recently, you cannot have avoided a reference to Samsung from the airline. Below the surface are other data points, say about Samsung washing machines in both the US and Australia. While we aren't privy to the discussions and debates inside Samsung, I bet they are looking at many things, from their supply chain to their QA processes. Maybe there is a lesson to learn here: sweating the details and testing complex devices is critical. Perhaps we can even say that hardware matters.

 

With Deutsche Bank now required to pay the US government for past misdeeds, can a company ever really close its books on a given quarter or year? Is this the new Lehman shock? Analysts just don't know.

 

Apple may or may not owe back taxes; the conclusion has yet to be reached, and Brexit looms here too, so perhaps the whole question becomes irrelevant. Then again, Apple's potential burden is nearly the same size as Deutsche Bank's. There is correlation for sure, but I'm not sure we can say one caused the other.

 

With China, people thought the growth machine would go on forever, but if you talk to people in China there are real, new market dynamics now. The country is in transition from being the factory of the world to, well, something else, and its growth pattern is following that transition.

 

Ireland is the opposite of China: after accepting more than 60 billion euros in bailout funding, it is now one of the hottest economies in Europe. Yet its key advantages are low corporate tax rates and, now, the looming Brexit.

 

On the technology side, who would have thought we'd see NoSQL projects espousing the ideas of scale-up, monolithic storage as they transition from scale-out at all costs to in-memory? This isn't the only contrarian technology meme below the surface, either. My colleague Vincent Franceschini pointed out an interesting company called Nervana, which makes an AI chip; Google has its TensorFlow chip (the TPU); and Microsoft is pursuing FPGAs for deep learning. What is specialized hardware doing coming from cloud computing companies?

 

In all of these cases the outcomes are clear in hindsight, but ahead of time, prediction, let alone scenario planning, is more than hard unless you have a time machine. To me this means the world is simultaneously more complex and more uncertain than ever, certainly more so than people want to admit. Given these points, I believe the stage is set for exploring the idea implied in the title: innovating in uncertain times.

 

uncertainty normal.png

ASSUME A GIVEN AND THEN PROGRESS

For this post we'll assume that uncertainty is the new normal, cannot be avoided, and is therefore our given. So let's progress... Obviously we'd like to at least intellectually understand why this is the case. To do that we'll run a little thought experiment, brewing together some ideas from statistics and queueing theory, and then reflect on what that means for innovation.

 

COPING WITH UNIFORM VARIANCE

uniform variance.png

In the old world there was a sense of consistent or uniform variance: specializations could be built, and the inherent variation in needs was relatively low. This resulted in simplified service/product differentiation and necessarily favored depth over breadth. Small or “tight” variance shaped market definitions and even provided a framework for comparing and contrasting the actors in a market. (For example, it allowed market analyst firms to define what a market like storage meant, who the actors were, and how they compared to each other.) Beyond market definitions and comparative models, this tight variance meant that legitimate stretches and leaps were possible. In the case of ICT (Information and Communications Technology), having a storage OR network OR server company stretch to become a converged systems company was realistic, and we can see these changes realized in that market today. The key point is that tight variance lets companies carve out a niche in a defined market, compete, and grow in a measured, semi-diversified manner.

 

To start our experiment, let's review some properties of the Gaussian distribution. There are many things you can pull from a Gaussian, such as the mean, median, mode and variance. Depending on the character of the data, these descriptive statistics have more or less meaning. Specifically, when the variance is tight, descriptive statistics like the mean have strong meaning; in fact, the mean of a Gaussian with a tight variance can almost represent the character of the data in the distribution. (We'll look at this a little later, but for now be aware that the mean of a Gaussian with a large variance loses information and has poor descriptive meaning.) The figure included in this section attempts to visualize these points by illustrating both low statistical and low coloration variance -- all of the “quantized units” are shades of green. So we can say that the “mean” is green, with a variance of green-1, green-2, green-3, and so on. Linking this back to market specialization, and ICT specifically, this “tight” statistical and color variance is meant to imply that stretching to new, adjacent and largely congruent areas is, or was, achievable. You would therefore expect to see companies diversify from being storage specialists to being capable of converged systems and appliances, and we can find evidence of this in the likes of both Hitachi Data Systems and NetApp. (A small numeric sketch of why a tight variance makes the mean so descriptive follows below.)
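As a quick illustration of the statistics being leaned on here, the following Python sketch draws two samples with the same mean, one with a tight variance and one with a wide variance, and shows how much of the data sits close to the mean in each case. The numbers are arbitrary and only make the descriptive point.

import random
import statistics

random.seed(0)

# Same mean, very different spread.
tight = [random.gauss(mu=100, sigma=2) for _ in range(10_000)]
wide = [random.gauss(mu=100, sigma=40) for _ in range(10_000)]

for name, data in (("tight", tight), ("wide", wide)):
    mean = statistics.mean(data)
    stdev = statistics.stdev(data)
    near = sum(abs(x - mean) <= 5 for x in data) / len(data)
    # With a tight variance the mean describes nearly every observation;
    # with a wide variance the same mean hides most of the story.
    print(f"{name}: mean={mean:.1f} stdev={stdev:.1f} within +/-5 of mean={near:.0%}")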

 

Continuing our thought experiment, let's take each of these quantized green squares to represent an abstracted need, and then imagine how it would be handled. In our experiment the needs are queued up and a “single server” exhausts the queue. Looking at the image below, we take all of the various shades of green to represent a family of needs. Because some elements in the family are identical, they can be processed as batches; that is, a single response or unit of work can resolve multiple need quanta at once. Using that line of thinking, for all instances in this example queue that are marked as “same” we can perform one action and meet the need of many individual instances, shrinking the number of unique needs in the queue from 23 to 8. (A short sketch after the image illustrates this batching.)

 

processing uniform.png
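Here is a minimal Python sketch of that batching step. The queue contents are invented labels, chosen only so that 23 queued instances collapse into 8 unique needs, matching the numbers above.

from collections import Counter

# A hypothetical queue of incoming needs; repeated labels mean "same need".
queue = ["green-1", "green-2", "green-1", "green-3", "green-2", "green-1",
         "green-4", "green-2", "green-5", "green-1", "green-3", "green-6",
         "green-2", "green-7", "green-1", "green-4", "green-3", "green-2",
         "green-8", "green-1", "green-5", "green-2", "green-3"]

batches = Counter(queue)               # one batch per unique need
print(len(queue), "queued instances")  # 23 individual requests...
print(len(batches), "units of work")   # ...collapse into 8 unique needs

for need, count in batches.items():
    print("one response to", need, "satisfies", count, "instances")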

 

Need consistency makes it easier for a single centralized organization, team or individual to cope on a global scale and to semi-diversify -- as in the cases of Hitachi Data Systems and NetApp. But what happens when the statistical variance of the hypothetical distribution is higher and the spread of colors is greater?

 

COPING WITH NON-UNIFORM VARIANCE?

In the new world, needs are more unique, moving quickly beyond either real or perceived certainty and consistency. Expressed needs also come from varying parts of businesses, a wider array of locations, and new parties, which adds up to a depressingly gargantuan variance. With the variance both larger mathematically and more geographically dispersed, it is next to impossible for a single centralized organization to respond to incoming needs. This is because it isn't a game of handling adjacent, similar needs, but of coping with major chasms between needs, compounded by location dispersal. For example, say you were an organization selling Big Data software stacks, but now an O&G company asks you to participate in seismic sensing and interpretation using AI, the company's past data, and so on. At first glance this is doable, right? When you understand the problem more deeply, however, you realize not so fast: working through it actually requires on-boarding seismic interpreters, geophysicists and seismic data handlers. The specialization ask has another dimension, too: the teams can only be fielded in select geographic locations. This combination of specialization and dispersed location defeats the premise of a centralized, consolidated organization; mix in tight timelines and who knows. Instead, a distributed team with the right skills in the right locations, plus a matching “centralized curator”, is likely a better fit. Let's explore this.

 

nonuniform variance.png

 

This fictitious Gaussian distribution is significantly different from the previous one: the variance is wider and the color palette is broader. As alluded to earlier, a large variance causes descriptive statistics to lose meaning and information, and the wider suggested variance and broader color palette are meant to over-emphasize that point. While, like the previous hypothetical distribution, the mean is green, we cannot say that the variance is green-1, green-2, green-3; we must instead say it is blue-1, blue-2, green-1, green-2, brown-1, yellow-1, yellow-2, and so on. In other words, although there is a clear mean, using it to describe the overall distribution without understanding the wide (color) variance would lose information. Similarly, if we imagine the impact on a server optimized to handle a queue of green needs with low variance, we find that while the greens can be processed, it is the blues, yellows and browns that persist in the queue unprocessed. Let's examine this behavior as we continue the thought experiment.

 

processing nonuniform.png

Since the server can process any green need, the queue is initially reduced, but needs that are not green remain. If non-green needs continue to arrive, the queue grows without bound. Beyond boundless growth, when new green needs enter the queue they may be occluded, resulting either in elongated time to service or in those with green needs abandoning the queue. In any of these cases a negative impression of the server is created. Now let's equate this situation to a fictitious organization, with the green server representing the organization. Initially the organization can handle the green needs, but soon enough, when non-green needs arrive, processing fails and the entities adding both green and non-green needs to the queue become dissatisfied. You might ask why this would happen, or why an organization would let it. The answer stems from uncertainty: to remedy uncertainty, organizations look beyond their current stable of customers, competitors, vendors and suppliers to new ones, because entities outside an organization's domain may help it invent something novel. This naturally creates a scenario where novel needs are accepted into the organization, and novel needs carry both unknowns and emergent properties (for example, firm associations to specific geographic regions that are not apparent at an initial review). (A small simulation of this queue behavior follows below.)
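The following Python sketch simulates that behavior with invented arrival rates: a server that only knows how to process “green” needs keeps up with the greens while every other kind of need quietly accumulates.

import random
from collections import deque

random.seed(2)
queue = deque()

# A server specialized for "green" needs: anything else is skipped over.
for tick in range(200):
    # Each tick one new need arrives; most are green, some are not.
    queue.append(random.choices(["green", "blue", "yellow", "brown"],
                                weights=[7, 1, 1, 1])[0])
    # The server serves the first green need it can find, if any.
    for need in list(queue):
        if need == "green":
            queue.remove(need)
            break

print("unserved needs after 200 ticks:", len(queue))
print("what is left in the queue:", set(queue))  # only non-green needs remain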

 

ADDING MORE CENTRAL CAPABILITY

A likely response by many organizations today is to hire new individuals and teams capable of handling the novel (non-green) needs. Such a formation would look like multiple servers handling multiple queues, as the image in this paragraph illustrates.

processing nonuniform central.png

Given enough time, and in the absence of geographic variation, forming a centralized set of teams is both possible and economical. However, earlier in this post I referenced President-elect Trump and Brexit; why? Beyond the uncertainty they create, these two events represent a rejection of decades of globalization. In fact, there appears to be a desire to move toward a more intense local-country focus -- at least in America and the U.K. so far -- at the expense of centralized global responses. More evidence comes from this year's interactions with customers in various countries. Most recently, I gave a talk in Europe where I dubbed this movement re-localization, both to open the dialogue and to gauge interest in increased country-specific innovation. What I found from the audience was a desire to learn about happenings globally but to optimize them for the local country. Moreover, my visits during the remainder of the week led to interactions with the country's telecommunications company. I learned that this company had shed nearly all of its global holdings and was spending the resulting money locally to develop an IoT practice. We spent the evening with this telco, and I was able to connect with one of their team members to talk about things like Google's TensorFlow chip, FPGAs and so on. During that discussion it was obvious to me that the goal was to learn about things from around the world and apply the results locally, unencumbered by centralized global control. While this is mostly about my recent visit, I have repeatedly found customers, partners and Hitachi's field teams all mobilizing to support localities first and global responses second. If we overlay this on our experimental idea of adding more, different types of servers in a centralized location, in my opinion it runs afoul of the emerging trend of re-localization. Firstly, such a formation is hard pressed to develop empathy for any specific locality, which means local optimizations are next to impossible. Secondly, any such formation implies a hierarchy, which runs afoul of the need to respond locally and autonomously.

 

So if the minor stretches suggested by handling tight variance and shades of green aren't enough to respond to the market, and if allowing in more variance -- non-green variance in our thought experiment -- and adding multiple servers with discrete specializations in a centralized location also fails, what is the response? Disruptive innovation, right?

 

DISRUPTIVE INNOVATION?

agility.png

To understand whether disruptive innovation is a response to uncertainty, let's start from the definition of disruption and see where it takes us.

Disruption - to cause (something) to be unable to continue in the normal way : to interrupt the normal progress or activity of (something) [Merriam-Webster, Disrupt | Definition of Disrupt by Merriam-Webster]

So there is a “normal” condition assumed in the definition, and disruptions push against or move away from that normal. Yet these are uncertain times; does that suggest we can resolve uncertainty with disruptive thinking and innovation? Let's turn to the dictionary again, this time for the definition of uncertain.

Uncertain - not certain to occur :  problematical <his success was uncertain> : not reliable :  untrustworthy <an uncertain ally> : not known beyond doubt :  dubious <an uncertain claim> :  not having certain knowledge : doubtful <remains uncertain about her plans> : not clearly identified or defined <a fire of uncertain origin> : not constant. [Merriam-Webster, Uncertain | Definition of Uncertain by Merriam-Webster]

In this case I used a longer definition of “uncertain” to avoid any sense that “uncertainty” contains a reliable normal. Given that we're living in uncertain times, I think it is fair to say that disruptive innovation is unlikely to work in the Era of Uncertainty. We're going to have to deliver something decidedly different from what we've seen previously. But what? Fortunately, simple responses are sometimes best; more specifically, agility, more intensive localization and new kinds of pairing can help.

 

  • Agility is about flexible adaptation to the surrounding environment. Those who are more agile tend to field more frequent responses to changing and uncertain market conditions, allowing them to vet those responses and find the best fit for the time. Eliminating options too early means potential solutions to known and unknown problems are removed before they can be observed in the wild.
  • Additionally, due to noisy uncertainty with intense local variation, a centralized strategy to rule all strategies is doomed from the outset. Instead, more experimentation in various localities, with a central discipline that acts more as a matchmaker between localities, may be a better “strategy” to employ. Given the bazaar2 approach to open source development, this could also mean that an open-source-like model is a key discipline for avoiding institutional thinking and internal politics. In other words, being a bit more unruly and letting sales and field teams engage in the “innovation funnel” themselves could be more productive than top-down thinking. Even in natural systems, simple but profound locally executed rules can have a dramatic impact. Ants are an example: given some basic rules, like finding food and reacting to danger using scents, ants can appear more intelligent than they actually are. Structure and complexity emerge in ant societies through these simple rules exchanged in a one-on-one, local manner. The same can be said of organizations.
  • Pairing is not new at all; what is new about pairing or partnering is who organizations are doing it with. Having a co-pilot who can share, in a nonthreatening manner, how they are transforming while navigating alongside you is important. When distinct organizations work together, each can point out new findings to the other, simply because each shows up with different biases. It is almost like the “consulting effect”, where an outsider can see something, or capture an organizational wisdom, that you're not paying attention to internally. At a micro level, agile development takes advantage of so-called pair programming for “increased code quality: ‘programming out loud’ leads to clearer articulation of the complexities and hidden details in coding tasks, reducing the risk of error or going down blind alleys.”

 

The key will be how to balance these three areas and, most importantly, how to manage the implied changes for centralized teams, field organizations, partners and customers. The result is likely to be many new ecosystems that vary by geography. To be clear, it is probably not possible to tackle innovating in uncertain times through incremental enhancements to corporate functions. In fact, my experience shows that corporate-to-corporate collaborations are often challenged: even when willpower is invested, if the corporate team you're collaborating with reorganizes, you end up resetting the engagement, thwarting the original design and stifling innovation. So I'm assuming that collaboration on the edge, with field organizations rather than central corporate teams, is the ideal place to start. Once again we can be inspired by the example of ants: generally speaking, food isn't found in the nest but out in the field, and the same is true of the food of business, revenue. So exactly what organizational structure can we explore that has the potential to aid the quest to innovate in this Era of Uncertainty?

 

CONCLUSIONS

As to a structure compatible with this Era of Uncertainty, I think we can draw lessons from both the natural and the digital worlds. Specifically, the ideal of simple rules in insect societies, like those of ants, is very attractive, but what's missing is how to bridge larger geographic dispersion. Centralized command-and-control mechanisms -- referred to as the Cathedral by Eric Raymond -- are attractive because we can simply “order” certain actions in various geographies. Yet, as I spent a long time explaining, a Cathedral-like structure isn't suitable for today. So, again, what to do? Somehow we have to merge these two seemingly opposite approaches into one structure, and this is where I think we can look to digital systems for inspiration. More specifically, I think the internet's Domain Name System3 (DNS) provides a likely example of how to build an organizational structure fit for these times. You can get into the details of the DNS, including its rules and commands, but inside its mission of resolving names to addresses there are some basic rules that are inspiring. Notably: check locally to see if you've got the address cached; if you do, use it; if you don't, ask the next nearest neighbor on the list for help. This simple rule is repeated recursively until the name is resolved to an address and you eventually reach your targeted service. One could argue that this venerable, mature service, while having very simple rules, has endured the tests and challenges of the internet over time, including uncertainty. In terms of what the structure would look like and how need processing would occur, below is a series of diagrams that point to that very model.

 

In this first diagram you'll find centralized curation and distributed execution in two sites, one labeled 1 and the other 2. As we step through each state you can see that, under normal operations, the local needs-processing capability works well when needs that can be processed locally arrive. However, when unfamiliar needs arrive in both site 1 and site 2, what should be done?

 

org 1.png

 

If we imagine that the role of the centralized curation team is to help specific localities find the capability to process new needs, the next set of images starts to make sense. In the first pane the localities ask the central team to help them find someone who can handle the brown and blue needs. In response, the centralized curation team gives each locality the address of a locality capable of handling those needs. In the last step, the localities exchange work on a peer-to-peer basis to meet the needs.

 

org 2.png

This means the impression of handling the response locally is maintained, the work is accomplished, and the entities expressing the needs are satisfied. With root servers, resolvers and so on, this organizational structure -- inspired by a digital system -- has an uncanny similarity to the DNS. I'm sure that when readers look at this there are obvious gaps, such as overworked individuals, the need to add skills in particular localities should workloads increase, and so on. Honestly, I'm not trying to resolve those issues yet; I'm more interested in getting a well-thought-out idea into the wild for review and comment. I suppose I'm following Eric Raymond's idea of release early and release often!
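To make the resolution rule concrete, here is a toy Python sketch of the "check locally, otherwise ask the next nearest neighbor" behavior. It is an organizational analogy, not an implementation of the DNS protocol, and the site names and needs are invented; the curated, peer-to-peer variant shown in the diagrams above would hand back an address rather than doing the work itself.

class Node:
    def __init__(self, name, handles, neighbor=None):
        self.name = name          # a locality or team
        self.handles = handles    # the needs this node knows how to process
        self.neighbor = neighbor  # the next nearest neighbor to ask for help

    def resolve(self, need):
        if need in self.handles:
            return self.name + " handles '" + need + "'"
        if self.neighbor is None:
            return "nobody found for '" + need + "'"
        # Not known locally: recursively delegate to the next nearest neighbor.
        return self.neighbor.resolve(need)

curator = Node("central curator", {"brown", "blue"})
site2 = Node("site 2", {"yellow"}, neighbor=curator)
site1 = Node("site 1", {"green"}, neighbor=site2)

print(site1.resolve("green"))   # answered locally
print(site1.resolve("brown"))   # escalated until a capable party is found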

 

 

FOOTNOTES

  1. I read a great piece by Steven Pinker and Andrew Mack over at Slate that looks at the question of an emerging Darker Age. Their article, The world is not falling apart: The trend lines reveal an increasingly peaceful period in history, uses data to suggest that the reality is quite the opposite. Essentially, hidden below the noise of bad-news headlines are data showing crime, war and death falling to historic lows. This obviously contradicts the nearly hourly reporting of all things bad, and it is the data that unwinds the intuitive leap from bad news to a new Darker Age.
  2. Eric Raymond's timeless essay, The Cathedral and the Bazaar, chronicles musings about unruly open source projects, including Linux and his own Fetchmail. While the stories are largely about building open source communities, I believe there are organizational lessons and implications beyond just creating code. Specifically, I think some of the ideas about letting a project evolve quickly and in the open are essential to consider in the era of uncertainty.
  3. There are new movements afoot to move the ideas of the DNS onto newer stacks like the Blockchain -- one such effort is called Namecoin. "Namecoin was also the first solution to Zooko’s Triangle, the long-standing problem of producing a naming system that is simultaneously secure, decentralized, and human-meaningful." (https://namecoin.org)  Such a system would enable a more resilient internet, thwart opportunities for censorship and improve overall usability. However, given the theory of the Blockchain, I cannot easily imagine what a structure based upon it would look like. Would we end up with many, many small organizations/companies that band together to solve market problems? Who knows. I can say that I believe the DNS in its current form is a sufficient leap both for the Era of Uncertainty and for building organizations that enable Digital Transformation. In fact, one could argue that to effectively enable Digital Transformation you'd need to organize based upon digital rather than natural systems, but perhaps that is a topic for a future post.

“Everyone talks about it, nobody really knows how to do it.”

 

Luckily, things have moved on since Geoffrey Moore’s famous quote about Big Data, but there’s still an awful lot of confusion and frustration when it comes to analytics.

This is not a unique experience. Companies struggle to exploit Big Data, partly because they don’t know how to overcome the technical challenges and partly because they don’t know how to approach Big Data analytics.

 

The most common problem is data complexity. Often this is self-inflicted: when starting out with Big Data analytics, companies try to “boil the ocean.” IT teams become overwhelmed and the task turns out to be impossible.

 

It’s true, data analytics can deliver important business insights, but it’s not a solution for every corporate problem or opportunity.

 

Complexity can also be a symptom of another problem: companies struggling to extract data from a hotchpotch of legacy technologies. The reality is that many companies will be tied to legacy technologies for years to come; they need to find a way to work within this context.

 

Another source of trouble is setting wrong or poorly planned business objectives.  This can result in people asking the wrong questions and interrogating non-traditional data sets through traditional means.

 

Take Google Flu Trends, an initiative launched by Google to predict flu epidemics. It made the mistake of asking: “When will the next flu epidemic hit North America?”

When the data was analysed, it was discovered that Google Flu Trends missed the 2009 US epidemic and consistently over-predicted flu trends. The initiative was abandoned in 2013.

 

An academic later speculated that if the researchers had asked “what do the frequency and number of Google search terms tell us?” the project may have proved more successful.

 

data-000084922357-low.jpg

 

Moving mountains with simplicity

 

The renowned American poet, Henry Wadsworth Longfellow, once wrote: “In character, in manner, in style, in all things, the supreme excellence is simplicity”.

 

Too often, people associate simplicity with a lack of ambition and accomplishment. In fact, it’s the key to unlocking a great deal of power in business.  Steve Jobs once said you can move mountains with ‘simple’.

 

Over the years, technology has progressed by getting simpler rather than more complex. However, this doesn’t mean the back-end isn’t complicated. Rather, a huge amount of work goes into creating an intuitive user experience.

 

Consider Microsoft Word. Every time you type, transistors switch on or off and voltage changes take place all over the computer and its storage media. You only see the document, but a lot of technical wizardry is happening in the background.

Cooling the ocean with an abstraction layer

 

Extracting meaningful value from data depends on three disciplines: data engineering, business knowledge and data visualisation. To achieve all three, you either need a team of superhumans who can code in their sleep, have a nose for business, possess an expansive knowledge of their industry and adjacent industries, and combine supreme mathematical genius with excellent management and communication skills.

 

Or you have technology that abstracts away these challenges and creates a platform layer that does most of the computation in the background. This is where Pentaho’s data engineering, preparation and analytics platform performs some very powerful, deft manoeuvres.

 

But there is a caveat. Even if you eschew complexity and embrace a simplified data platform, you still need data-savvy people. These data scientists won’t have to train for three years to memorise the finer points of Hadoop, but they will need to understand Big Data challenges.

 

Pentaho provides the method and points businesses in the right direction, but they still need to uncover what questions to ask, and what kind of answers to expect. In a previous blog, I explored how businesses can equip themselves with the right skills for the job.

 

Big data successes

 

While Big Data projects may stall or fail for any of the above reasons, we are starting to see more of them succeed and transform businesses, mainly thanks to the huge strides made in stripping out front-end complexity through abstraction-layer technology.

 

Let’s take the Financial Industry Regulatory Authority, Inc. (FINRA), a private self-regulatory organization (SRO) and the largest independent regulator for all US-based securities firms. Since using Pentaho, the financial watchdog has been able to find the right ‘needles’ in its growing data ‘haystack’. Analysts can now access any data in FINRA’s multi-petabyte data lake to identify trading violations in an automated fashion, making the process 10 to 100 times faster: a difference of seconds versus hours. FINRA achieved simplicity and more control of its data as a result. According to this article, FINRA ordered brokerages to return an estimated $96.2 million in funds they obtained through misconduct during 2015, nearly three times the 2014 total.

 

Similarly, through Pentaho, Caterpillar Marine Asset Intelligence demonstrated to one of its customers, which operates a fleet of eight ships, that shutting a tugboat’s engine down when idling for extended periods would save $2 million in wasted fuel fleet-wide every year.

 

Big Data projects don’t have to confound and confuse. They can bring breakthrough lightbulb moments, provided they’re grounded in simplicity. Let the technology do the difficult stuff – in all else, keep it simple.

 

If you want to find out more about Big Data, then a good starting point would be my webinar on the power of digital transformation, available online now.

A couple of weeks ago I had the privilege of visiting Mesosphere up in the city (San Francisco) for a few days. They were offering their newly created training course on DC/OS, and, not being terribly familiar with container orchestration, Apache Mesos and the surrounding technologies of DC/OS, I jumped at the chance to attend.

 

What I discovered was a very powerful technology that would allow us to quickly start building Big Data solutions on top of HSP (or even UCP, which, by the way, is being done in the Denver lab). There’s also a bigger picture here: by using HSP and Mesosphere DC/OS we can properly orchestrate, schedule and manage containers, which ultimately allows us to build distributed apps across HSP (or UCP) and run stateful (and/or stateless) workloads.

 

Now, I say ultimately as if this were somehow a future possibility. The reality is, Hitachi is already using this technology! How many people within or outside of Hitachi know of Hitachi Content Intelligence (HCI)? If you know of HCI, do you know what its underlying architecture is? Yes, you guessed it: Apache Mesos and Marathon, the same technology that underpins Mesosphere DC/OS! Now here’s the mind-blowing thing that will probably put me on the HCI engineering hit list… we could, theoretically, take HCI and distribute it on DC/OS running on top of HSP. What other apps could we distribute across DC/OS and HSP (or UCP)?

 

Image 1: Marathon performing container orchestration for HCI

Screen Shot 2016-08-26 at 11.28.53 AM.png

 

Image 2: Apache Mesos for HCI

Screen Shot 2016-08-26 at 11.29.07 AM.png

So with HCI, Apache Mesos and Marathon in mind, let’s take a step back and see what we can do with DC/OS on HSP.

As shown in the image below, it’s simply a matter of spinning up a set of CentOS 7.2 VMs in HSP and then deploying DC/OS from a bootstrap server. (https://dcos.io/docs/1.7/administration/installing/custom/)

 

Image 3: CentOS 7.2 VMs for DC/OS

Screen Shot 2016-08-24 at 11.07.02 AM.png

 

Once DC/OS was installed, I was able to quickly spin up Cassandra and HDFS.

 

Image 4: Frameworks Marathon, Cassandra and HDFS all healthy and running

Screen Shot 2016-08-25 at 8.53.57 AM.png

 

Next, I wanted to spin up an instance of Spark for our super-quick Big Data solution, but it turned out that Spark wouldn’t install properly. Well, that’s a drag…

 

Image 5: Installable frameworks including Spark.

Screen Shot 2016-08-25 at 8.54.39 AM.png

 

Being fairly new to this, I wasn’t quite able to figure out why Spark wouldn’t install, so I skipped it and decided to install a Hadoop Docker container instead. All I had to do was search Docker Hub for Hadoop, find a container from SequenceIQ, tell Marathon to create a new application, and point it at the Hadoop Docker image. Voilà! I had the beginnings of our Big Data solution on HSP. (A rough sketch of what that Marathon request can look like follows below.)
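For anyone who wants to try something similar, here is a rough Python sketch of submitting an application definition to Marathon's REST API (POST /v2/apps). The app id, resource sizes, network mode, Marathon address and image tag below are illustrative assumptions, not the exact definition used in the lab.

import json
import urllib.request

# Illustrative Marathon app definition: run a Hadoop image from Docker Hub
# as a single container instance. Adjust id, resources and image as needed.
app = {
    "id": "/hadoop-demo",
    "cpus": 2,
    "mem": 4096,
    "instances": 1,
    "container": {
        "type": "DOCKER",
        "docker": {"image": "sequenceiq/hadoop-docker", "network": "HOST"},
    },
}

# POST the definition to Marathon (assumed to listen on this host and port).
req = urllib.request.Request(
    "http://marathon.example.internal:8080/v2/apps",
    data=json.dumps(app).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
with urllib.request.urlopen(req) as resp:
    print(resp.status, resp.read().decode("utf-8"))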

 

Image 6: Hadoop running in a docker container

Screen Shot 2016-08-25 at 8.55.52 AM.png

 

Image 7: Resource usage of our starter Big Data solution on HSP and DC/OS.

Screen Shot 2016-08-25 at 8.53.32 AM.png

 

 
