Interpreting the Data: Parallel Analysis with Sawzall. Rob Pike, Sean Dorward, Robert Griesemer, Sean Quinlan. Google, Inc. Presented by Alexey. Scientific Programming Journal Special Issue. Sawzall is a new language that Google uses to write distributed, parallel data-processing programs for use on its clusters.

Author: Feshura Vudozilkree
Country: Luxembourg
Language: English (Spanish)
Genre: Technology
Published (Last): 20 March 2014


The paper is well written, with plenty of examples. It comes from Google, an organization known for its capacity for massive computation on data, and it describes a tool the company uses to solve day-to-day problems.

Sawzall is a statically typed language for processing very large amounts of data on multiple machines. It generally breaks the calculation into two phases: the first phase analyses the data, and the second phase aggregates the results.
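
The two phases can be sketched in a few lines of Python. This is only an illustrative emulation, not Sawzall syntax: the function names (`analyze`, `aggregate`, `run`) and the table names are assumptions made for the example. The first phase looks at each record in isolation and emits values; the second phase sums whatever was emitted.

```python
# Illustrative Python emulation of the two-phase model; this is not Sawzall syntax.
# All function and table names here are hypothetical.

def analyze(record):
    """Phase 1: examine one record in isolation and emit (table, value) pairs."""
    x = float(record)
    return [("count", 1), ("total", x), ("sum_of_squares", x * x)]


def aggregate(emitted):
    """Phase 2: combine everything emitted in phase 1, here by summing per table."""
    tables = {}
    for name, value in emitted:
        tables[name] = tables.get(name, 0) + value
    return tables


def run(records):
    emitted = []
    for record in records:  # each record is analyzed independently of the others
        emitted.extend(analyze(record))
    return aggregate(emitted)


result = run(["1.0", "2.0", "3.0"])
print(result)
```

Because `analyze` never sees more than one record, the per-record work could in principle run anywhere and in any order, and only the emitted values need to meet in the aggregation step.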


The calculation is divided into pieces and distributed, keeping the computation near the data. It runs on top of Google's infrastructure.
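
As a rough sketch of that idea, the Python below splits the input into pieces, computes a small partial result per piece, and merges the partial results. Threads stand in for separate machines, and the sharding scheme and function names (`split`, `process_piece`, `run_distributed`) are assumptions for illustration, not a description of Google's actual infrastructure.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split(records, pieces):
    """Divide the input into roughly equal pieces (hypothetical sharding scheme)."""
    return [records[i::pieces] for i in range(pieces)]


def process_piece(piece):
    """Runs where the piece lives and returns a compact partial word count."""
    return Counter(word for line in piece for word in line.split())


def run_distributed(records, pieces=3):
    # Threads stand in for the separate machines of a real cluster.
    with ThreadPoolExecutor(max_workers=pieces) as pool:
        partials = list(pool.map(process_piece, split(records, pieces)))
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # only the small partial counts are combined
    return merged


lines = ["a b a", "b c", "a", "c c b"]
totals = run_distributed(lines)
print(totals)
```

The merge works regardless of how the input is split because addition is associative and commutative, so the number of pieces does not change the final counts.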

Reading Paper — Interpreting the Data: Parallel Analysis in Sawzall – Bipin Upadhyaya

Protocol Buffers are used to describe the format of permanent records stored on disk. Software called the Workqueue handles the scheduling of jobs to run on a cluster of machines. The paper gives a detailed overview of the Sawzall programming language, with examples. The benchmark test cases are all CPU-bound.


However, the authors also say in the paper that the applications for this language are mostly IO-bound. It would have made sense to include some IO-bound examples that still demonstrate Sawzall's performance advantage.

Sawzall is also a level of abstraction above MapReduce, but it still appears to be a bit more restrictive than Pig Latin [1]. A Sawzall program has a fairly rigid structure consisting of a filtering phase (the map step) followed by an aggregation phase (the reduce step).
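
To make that rigid shape concrete, here is a minimal Python sketch under assumed names and data: `filter_phase`, `aggregation_phase`, `run_program`, the (country, bytes) record layout, and the threshold of 100 are all hypothetical. The filter sees one record at a time and may emit zero or more keyed values; the aggregator is the only place where values from different records are combined.

```python
def filter_phase(record):
    """Map step: sees one record and may emit zero or more (key, value) pairs."""
    country, nbytes = record
    if nbytes >= 100:  # hypothetical threshold chosen for the example
        yield country, nbytes


def aggregation_phase(pairs):
    """Reduce step: sums the emitted values per key."""
    totals = {}
    for key, value in pairs:
        totals[key] = totals.get(key, 0) + value
    return totals


def run_program(records):
    # The program's shape is fixed: filter every record, then aggregate.
    emitted = (pair for record in records for pair in filter_phase(record))
    return aggregation_phase(emitted)


log = [("LU", 120), ("ES", 40), ("LU", 300), ("ES", 150)]
by_country = run_program(log)
print(by_country)
```

Anything that needs to compare two records directly, such as a join, does not fit this shape, which is one way to read the restriction relative to Pig Latin.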


This was a little concerning, since errors can easily occur when terabytes of data are being processed.

Kamath, S Narayanam, C.

