Dec 5, 2013


Trend 6: Object Storage Casts Light on Dark Data

2014 will see more awareness of the existence of dark data and of its potential business value. Gartner defines dark data as the information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes (for example, analytics, business relationships and direct monetizing). Much of the focus of Big Data is on getting access to dark data.


One of the main reasons that organizations fail to use this data is that it is locked up in application or vendor silos. A good example of an industry trying to shed light on this type of data is healthcare, where the hottest topic is the vendor neutral archive. The need for a vendor neutral archive arises because different departments in a hospital (cardiology, radiology, orthopedics, etc.) acquire different vendor silos for medical images and documents and then cannot share information across these silos for the longitudinal care of a patient. Since imaging technology is constantly improving, a department may also acquire a newer system from another vendor and find that it cannot use the same analysis tools because the data formats are different.

In order to cast light on this type of dark data, we must be able to separate the front-end application processing from the back-end storage function to eliminate application/data silos. This is where object storage comes into play.

When you separate the data from the application, it is simply a bunch of bits unless you put it into a container with metadata that describes the data and policies that govern the data. This container becomes an object. It is independent of the application that created it and can be used by other applications through access to the metadata. Because it is separated from the application, the data does not go dark when the application becomes obsolete. The life of the data is not dictated by the life of the application. Hitachi’s object storage system is called Hitachi Content Platform (HCP).
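To make the idea of a container concrete, here is a minimal sketch in Python. It is an illustration only and not HCP's data model or API: the `StoredObject` class, the `is_retained` helper, and the field names (`department`, `modality`, `retain_until`) are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class StoredObject:
    """Hypothetical container: raw bits plus descriptive metadata and policy."""
    data: bytes
    metadata: dict = field(default_factory=dict)
    policy: dict = field(default_factory=dict)


def is_retained(obj: StoredObject, today: date) -> bool:
    """True while the object's (hypothetical) retention policy is still in force."""
    retain_until = obj.policy.get("retain_until")
    return retain_until is not None and today <= retain_until


# One application (say, a radiology system) writes the object...
scan = StoredObject(
    data=b"\x00\x01\x02",
    metadata={"department": "radiology", "modality": "MRI", "patient_id": "P-001"},
    policy={"retain_until": date(2024, 1, 1)},
)

# ...and a different application can interpret it using only the metadata.
summary = f"{scan.metadata['modality']} from {scan.metadata['department']}"
```

The point of the sketch is that nothing in the object refers back to the application that created it, so the object remains usable after that application is retired.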

Data that is ingested into HCP can be accessed by different applications through network protocols. The data stored in HCP can be found through a very scalable, high-performance search engine library whose features include fast indexing, ranked searching, boolean, phrase, and span queries, date-range searching, and extensions. Because dark data is not being updated, HCP eliminates the need for backup cycles by keeping a replica of the data on another HCP. A hash is taken when data is ingested so that immutability can be proven when the data is retrieved. Dedupe, compression, and encryption are all standard features. HCP is implemented on Hitachi Data Systems virtualization infrastructure, which enables the data to be independent of the infrastructure as well as of the application that created it.
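The hash-on-ingest idea can be sketched in a few lines of Python. This is not how HCP implements it: the choice of SHA-256, the in-memory `store` dictionary, and the `ingest` and `retrieve` function names are assumptions made for illustration.

```python
import hashlib


class IntegrityError(Exception):
    """Raised when stored data no longer matches its ingest-time hash."""


def ingest(store: dict, name: str, data: bytes) -> str:
    """Store the data together with a hash computed at ingest time."""
    digest = hashlib.sha256(data).hexdigest()  # SHA-256 is an assumption here
    store[name] = (data, digest)
    return digest


def retrieve(store: dict, name: str) -> bytes:
    """Return the data only if it still matches the hash taken at ingest."""
    data, expected = store[name]
    if hashlib.sha256(data).hexdigest() != expected:
        raise IntegrityError(f"{name} has changed since ingest")
    return data


store = {}
ingest(store, "scan-001", b"original image bytes")
unchanged = retrieve(store, "scan-001")  # passes the integrity check
```

If the stored bytes were altered after ingest, the recomputed hash would differ from the recorded one and `retrieve` would raise instead of returning the data.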


Enrico Signoretti is a great promoter of object storage and points out the value of object storage in his blog post.

However, in a recent blog post he points out that if you are not able to create metadata for the data, some of the advantages vanish and object storage may not be for everyone. Metadata is key to the creation and value of object storage. HCP is open to the use of standard protocols to ingest data and create metadata, and as a result there are many third-party vendors who can create metadata and ingest both the metadata and the data into HCP. HCP enables the metadata to be customized for an application and provides up to ten custom metadata fields so that apps and users can store their unique metadata separately from one another. As a result, many ISVs (Independent Software Vendors) have developed interfaces to HCP for easy ingestion and creation of metadata.

Not all object storage systems have these extended metadata functions. At its most basic, according to SearchStorage, an object storage system assigns each object a unique tag that allows a server or an end user to retrieve the object without needing to know its physical location. HCP enhances that “unique tag” with metadata that not only enables the object to be retrieved but also describes the content of the object, so that information about the object can sometimes be processed without ever having to access the object itself.
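As a rough illustration of the difference between a bare unique tag and a tag enriched with metadata, consider the toy store below. It is a sketch under stated assumptions and says nothing about HCP's internals: `TinyObjectStore`, its methods, and the `payload_reads` counter are hypothetical names, and a random UUID stands in for the unique tag.

```python
import uuid


class TinyObjectStore:
    """Toy store: objects are addressed by a unique tag, never by location."""

    def __init__(self):
        self._payloads = {}      # tag -> raw bytes
        self._metadata = {}      # tag -> descriptive metadata
        self.payload_reads = 0   # counts how often object content is touched

    def put(self, data: bytes, metadata: dict) -> str:
        tag = uuid.uuid4().hex   # the caller only ever sees this tag
        self._payloads[tag] = data
        self._metadata[tag] = dict(metadata)
        return tag

    def get(self, tag: str) -> bytes:
        self.payload_reads += 1
        return self._payloads[tag]

    def find(self, **criteria) -> list:
        """Answer a question from metadata alone, without reading any payload."""
        return [
            tag for tag, md in self._metadata.items()
            if all(md.get(key) == value for key, value in criteria.items())
        ]


objects = TinyObjectStore()
ecg = objects.put(b"ecg bytes", {"department": "cardiology"})
mri = objects.put(b"mri bytes", {"department": "radiology"})
radiology_tags = objects.find(department="radiology")
```

Here `find` identifies the radiology object while `payload_reads` stays at zero, which mirrors the claim that some processing can be done on the metadata without accessing the object itself.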

Below is a short list of programs and applications that support HCP. For specific applications contact your HDS representative or HDS channel partner.


Object stores will become a greater focus in 2014, and the main driver will be to shine a light on dark data to derive business value from existing data assets.

See the full list of my top ten trends for 2014 here.