EMC Software Solutions Blog


Store and Analyze Log Files


I read an interesting post by Scott Weiss titled "The Big Data Conundrum." Scott opens the post by saying:

"As data storage costs plummet, the world is storing data that would be impractical to keep just a few years ago. Think video camera feeds, web logs and GPS tracking information. We used to throw away or sample this data, but now we can store and explore it."

This echoes a sentiment I expressed in a post just two days earlier, where I said:

"The fundamental bet that businesses should be making today should be loosely based upon the ability to store extensive amounts of data efficiently, manage it with less resources, and extract value from the data that was otherwise not possible.  That also happens to be the definition of Big Data."

We're entering the store-everything era. The trend of storing data that we previously threw away will not just continue; it will grow. The data exhaust created by applications and devices such as web servers and routers contains valuable information, but it's not necessarily the sort that needs to be stored uncompressed and kept on tier-1 storage.

That's a great use case for Atmos, our scale-out cloud storage product.  Atmos is ideal as an archive platform for data such as web server logs that need to be retained and analyzed, but not necessarily stored on the fastest disk possible.  (We'll sell you that too if you really want it though.)

Developers can read and write log files (or other unstructured, file-based data) to Atmos using our REST API. One of the big differences between storing data on Atmos and storing it on ordinary NAS is that developers can control the actions applied to an object once it's stored, simply by tagging the object with descriptive key/value pairs.
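As a rough illustration of what tagging looks like from the client side, here is a minimal Python sketch that assembles (but does not send) a create-object request carrying key/value tags. The `/rest/objects` path and the `x-emc-meta` header name are assumptions based on my recollection of the Atmos REST conventions, so check them against the Atmos programmer's guide. The host name and both helper functions are hypothetical, and the authentication headers a real request needs are left out.

```python
from typing import Dict


def format_metadata_header(tags: Dict[str, str]) -> str:
    """Join key/value tags into one comma-separated header value."""
    for key, value in tags.items():
        # Commas and equals signs would make the header ambiguous.
        if "," in key or "=" in key or "," in value:
            raise ValueError(f"unsupported character in tag {key!r}")
    return ",".join(f"{key}={value}" for key, value in tags.items())


def build_create_request(host: str, body: bytes, tags: Dict[str, str]) -> dict:
    """Describe a create-object request without sending it.

    Assumed (not confirmed by this post): the /rest/objects path and
    the x-emc-meta header name. Authentication headers are omitted.
    """
    return {
        "method": "POST",
        "url": f"https://{host}/rest/objects",
        "headers": {
            "Content-Type": "application/octet-stream",
            "Content-Length": str(len(body)),
            "x-emc-meta": format_metadata_header(tags),
        },
        "body": body,
    }


# Hypothetical host and log line, for illustration only.
request = build_create_request(
    "atmos.example.com",
    b"127.0.0.1 - - GET /index.html 200\n",
    {"compression": "true"},
)
print(request["headers"]["x-emc-meta"])
```

The point of the sketch is that the application states its intent in the tag and nothing else. What the storage layer does with that tag is decided by policy on the server.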

Web server log files should, in most cases, be stored compressed. Compression is one of the actions that Atmos can automatically apply server-side to objects based on a configured policy. If the compression policy is configured to trigger when the metadata key is "compression" and the metadata value is "true", then any object written with this key/value pair will be stored compressed and decompressed on a read request. Automatically. Neat.

We even optimize the way the object is compressed so that the entire file need not be decompressed just to access a portion for read or update.  Balanced storage efficiency and performance?  Very neat. 
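This post doesn't describe how Atmos achieves this internally. One common way to get the same property is to compress an object in independent fixed-size chunks, so that a read only decompresses the chunks it overlaps. The sketch below shows that general technique with Python's `zlib`. The chunk size and function names are hypothetical, and it should not be read as the Atmos implementation.

```python
import zlib
from typing import List

CHUNK_SIZE = 64 * 1024  # hypothetical chunk size, chosen for illustration


def compress_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> List[bytes]:
    """Compress each fixed-size slice of the data independently."""
    return [
        zlib.compress(data[start:start + chunk_size])
        for start in range(0, len(data), chunk_size)
    ]


def read_range(chunks: List[bytes], offset: int, length: int,
               chunk_size: int = CHUNK_SIZE) -> bytes:
    """Return `length` bytes starting at `offset`, touching only the
    chunks that overlap the requested range."""
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")
    if length == 0:
        return b""
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    # Only chunks first..last are decompressed; the rest stay untouched.
    window = b"".join(zlib.decompress(c) for c in chunks[first:last + 1])
    start = offset - first * chunk_size
    return window[start:start + length]


log = b"127.0.0.1 - - GET /index.html 200\n" * 10000
chunks = compress_chunks(log)
assert read_range(chunks, 100000, 34) == log[100000:100034]
```

The trade-off is that smaller chunks make partial reads cheaper but usually compress a little less well, since each chunk is compressed without knowledge of its neighbours.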

Here's an example of a policy created server-side that identifies which key/value pairs will trigger compression:

[Image: compression policy]

With this particular policy, each object that is created and stored with the "compression = true" key/value pair will be compressed and stored by Atmos appropriately.
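To make the behavior concrete, here is a toy in-memory model of a policy like this one: objects whose metadata matches the trigger are compressed on write and transparently decompressed on read, and everything else is stored as-is. The class, the trigger constant, and the use of `zlib` are all hypothetical stand-ins. This is a sketch of the idea, not Atmos code.

```python
import zlib
from typing import Dict, Tuple

# Hypothetical policy trigger: the key/value pair that turns compression on.
COMPRESSION_TRIGGER = ("compression", "true")


class ToyObjectStore:
    """In-memory stand-in for a store that applies a compression policy."""

    def __init__(self) -> None:
        # object id -> (stored bytes, whether they are compressed)
        self._objects: Dict[str, Tuple[bytes, bool]] = {}

    def write(self, object_id: str, data: bytes,
              metadata: Dict[str, str]) -> None:
        key, value = COMPRESSION_TRIGGER
        compress = metadata.get(key) == value
        stored = zlib.compress(data) if compress else data
        self._objects[object_id] = (stored, compress)

    def read(self, object_id: str) -> bytes:
        # The caller always gets the original bytes back.
        stored, compressed = self._objects[object_id]
        return zlib.decompress(stored) if compressed else stored

    def stored_size(self, object_id: str) -> int:
        return len(self._objects[object_id][0])


store = ToyObjectStore()
log = b"127.0.0.1 - - GET /index.html 200\n" * 1000
store.write("tagged", log, {"compression": "true"})
store.write("untagged", log, {})
assert store.read("tagged") == store.read("untagged") == log
assert store.stored_size("tagged") < store.stored_size("untagged")
```

The writer and the reader never call a compression routine themselves. The only thing that differs between the two objects is the tag.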

Business logic drives storage policy.  It's the way it should work.


