Changeset 9 for sample/WordCountFromHBase.java

Timestamp:

Jun 13, 2008, 5:45:02 PM (17 years ago)

Author:

waue

Message:

comment

File:

1 edited

sample/WordCountFromHBase.java (modified) (4 diffs)

Legend:

: Unmodified
: Added
: Removed

sample/WordCountFromHBase.java

-                      r8
+                      r9
  * Editor: Waue Chen
  * From :  NCHC. Taiwan
- * Last Update Date: 06/10/2008
+ * Last Update Date: 06/13/2008
  */
 /**
  * Purpose :
- *  Store the result of WordCountIntoHbase.java from Hbase to Hadoop file system
+ *  Count words read from HBase, then store the result in the Hadoop file system
+ *
  * HowToUse :
- *  Make sure Hadoop file system and HBase are running correctly.
- *  Then run the program with BuildHTable.java after \
- *  modifying these setup parameters.
+ *  Make sure the Hadoop file system is running and HBase holds the correct data.
+ *  Running WordCountIntoHBase first is suggested.
+ *  Finally, modify these setup parameters and run.
+ *
  * Check Result:
+ *  inspect http://localhost:60070 in a web browser
+ *
+ *  inspect http://localhost:50070 in a web browser
  */
 …
       String line = Text.decode( ((ImmutableBytesWritable) cols.get(textcol) )
           .get() );
       //let us know what is "line"
       /*
 …
       // the result is the contents of merged files "
+      // StringTokenizer splits the line into words
       StringTokenizer itr = new StringTokenizer(line);
       // emit a count of one for every word
       while (itr.hasMoreTokens()) {
-        word.set(itr.nextToken());
+        // nextToken() returns the current token as a String and advances \
+        // to the next one; Text.set() stores that string in the Text object.
+        word.set(itr.nextToken());
+        // OutputCollector.collect(K key, V value) adds a key/value pair \
+        //  to the output.
         output.collect(word, one);
+      }
 …
     // reuse objects
     private final static IntWritable SumValue = new IntWritable();
+    // reduce() in this sample follows the same pattern as map(): \
+    //  it is an interface method that this class must implement. \
+    //  Its four parameter types here are (Text, Iterator<IntWritable>, \
+    //    OutputCollector<Text, IntWritable>, Reporter).
     public void reduce(Text key, Iterator<IntWritable> values,
         OutputCollector<Text, IntWritable> output, Reporter reporter)
         throws IOException {
-      // sum up values
+      // sum up the values
       int sum = 0;
-      while (values.hasNext()) {
-        sum += values.next().get();
+      // "key" is the word; the output value is the sum of its counts
+      // we loop on values.hasNext(), not the key, because one key has many values
+      while (values.hasNext()) {
+        // next() returns the current element and advances the iterator; \
+        //  IntWritable.get() converts the IntWritable to an int
+        sum += values.next().get();
+      }
+      // IntWritable.set(int) stores the int in the IntWritable
       SumValue.set(sum);
+      // since the output path is set in main, output.collect() writes
+      //  the data to the Hadoop file system
       output.collect(key, SumValue);
+    }
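
Reviewer's sketch: the comments added in this changeset describe a tokenize-then-sum flow. The plain-Java example below reproduces that flow without Hadoop or HBase, so the two loops can be tried in isolation. The class `WordCountSketch` and its methods `map`, `reduce`, and `countWords` are hypothetical names chosen for illustration. They are not part of sample/WordCountFromHBase.java, and the grouping step is only a stand-in for what the framework does between map and reduce.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical stand-alone sketch; not part of sample/WordCountFromHBase.java.
class WordCountSketch {

    // Map step: split one line on whitespace and return each word,
    // mirroring the loop that calls output.collect(word, one).
    static List<String> map(String line) {
        List<String> words = new ArrayList<>();
        StringTokenizer itr = new StringTokenizer(line);
        while (itr.hasMoreTokens()) {
            // nextToken() returns the current token and advances the tokenizer,
            // so it is called exactly once per loop iteration.
            words.add(itr.nextToken());
        }
        return words;
    }

    // Reduce step: sum all counts that were grouped under one key.
    static int reduce(Iterator<Integer> values) {
        int sum = 0;
        while (values.hasNext()) {
            sum += values.next();
        }
        return sum;
    }

    // Stand-in for the framework's grouping: collect a count of 1 per word,
    // then reduce each group to its total.
    static Map<String, Integer> countWords(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (String word : map(line)) {
                grouped.computeIfAbsent(word, k -> new ArrayList<>()).add(1);
            }
        }
        Map<String, Integer> totals = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            totals.put(e.getKey(), reduce(e.getValue().iterator()));
        }
        return totals;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("hello hbase hello", "hadoop hbase");
        // prints {hadoop=1, hbase=2, hello=2}
        System.out.println(countWords(lines));
    }
}
```

The reduce loop iterates over the values because each word (the key) arrives once with many counts attached, which is the point the new comment on `values.hasNext()` raises.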
