Attention webscale aficionados, Twitter says it is planning to open source Storm, its Hadoop-like real-time data processing tool. In a blog post Thursday, the microblogging network said it plans to release the Storm code on Sept. 19 at the Strange Loop event in St. Louis, Mo.
The question is — does the world need another real-time data processing tool? After all there are many tools like HStreaming (using Hadoop), the open source S4 and StreamBase, but the overall analytics market (if you can call it a market) is already fragmented. The Storm code comes from Twitter’s acquisition of BackType last month and seems to be an effort to get folks comfortable parsing data on Twitter.
The post does an excellent job laying out use cases for Storm and hints at more to come. While the code can deal with distributed nodes and huge amounts of data a la Hadoop or Map Reduce, Storm handles jobs that are “infinite.” It’s not for a data processing job with an end point, it’s good for streams of data and continual processing. From the post by Nathan Marz:
Here’s a recap of the three broad use cases for Storm:
But wait! There’s more! At the end of the post we are assured that there’s more to Storm than the blog post has even defined, which we can learn more about next month at the Strange Loop event. From the post:
I’ve only scratched the surface on Storm. The “stream” concept at the core of Storm can be taken so much further than what I’ve shown here — I didn’t talk about things like multi-streams, implicit streams, or direct groupings. I showed two of Storm’s main abstractions, spouts and bolts, but I didn’t talk about Storm’s third, and possibly most powerful abstraction, the “state spout”. I didn’t show how you do distributed RPC over Storm, and I didn’t discuss Storm’s awesome automated deploy that lets you create a Storm cluster on EC2 with just the click of a button.
So for those anxious to test out a new method of crunching terabytes of real-time data on the fly, get thee to GitHub! And wait.
Home > Topik > Teknologi Informasi dan Komunikasi (Information Technology) > Real-Time Data Processing > Kliping >