Changes between Version 1 and Version 2 of waue/2011/chukwa


Timestamp:
Jan 28, 2011, 7:16:26 PM
Author:
waue
{{{
#!html
<div style="text-align: center; color:#151B8D"><big style="font-weight: bold;"><big><big>
chukwa 0.4 + hadoop 0.20
</big></big></big></div> <div style="text-align: center; color:#7E2217"><big style="font-weight: bold;"><big>
Chukwa Study
</big></big></div>
}}}
[[PageOutline]]

= Chukwa Study =

As is well known, Hadoop runs on a distributed cluster that is shared by many users and groups, so at any moment many users may be accessing the NN or JT, operating on the distributed file system or on MapReduce, and using the cluster's machines for their storage and computation. As the number of Hadoop users grows, it becomes difficult for cluster operators to objectively assess the cluster's current state and trends. For example, the NN could run out of memory one day without anyone noticing, so hard data is needed to understand how Hadoop is actually running.

Chukwa makes use of the logs emitted by the cluster's daemons: NN, DN, JT, TT and the other processes all produce log output, since their code calls the log4j API to record logs. Where those logs are physically stored is configured in log4j.properties and can be a local file or a database. Chukwa takes over this part of the work, controlling how the logs are recorded and collected. Chukwa consists of the following components:

 || Agents ||  run on each machine and emit data || collect the logs of each process and send them to a collector ||
 || Collectors ||  receive data from the agents and write it to stable storage || save the data received from the agents onto HDFS ||
 || MapReduce jobs ||  parse and archive the data || analyze the data with MapReduce ||
 || HICC || the Hadoop Infrastructure Care Center; a web-portal style interface for displaying data || HICC presents the data ||

 * !DumpTool saves the results into a MySQL database
Chukwa must be set up and run on Linux, with a MySQL database installed. The Chukwa conf/ directory contains two SQL scripts, aggregator.sql and database_create_tables.sql, to import into MySQL; a running Hadoop HDFS environment is also required.
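
The import described above can be sketched with the mysql client (a hedged example: it assumes MySQL is reachable locally as root without a password, and that chukwa lives under /opt/chukwa as elsewhere on this page):

{{{
$ mysql -u root -e "CREATE DATABASE IF NOT EXISTS test;"
$ mysql -u root test < /opt/chukwa/conf/database_create_tables.sql
$ mysql -u root test < /opt/chukwa/conf/aggregator.sql
}}}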

[[Image(http://incubator.apache.org/chukwa/docs/r0.4.0/images/datapipeline.png)]]
[[Image(http://incubator.apache.org/chukwa/docs/r0.4.0/images/components.gif)]]

[http://blog.csdn.net/lance_123/archive/2011/01/23/6159325.aspx]

[http://blog.csdn.net/vozon/archive/2010/09/03/5861518.aspx]

{{{
#!html
<big style="font-weight: bold;"><big>
Chukwa Setup
</big></big>
}}}

 * hadoop 0.20.1 is already running [hadoop 0.21.0 will not work]

 * chukwa 0.4

 * ub1 is the jobtracker / namenode

 * ub2 is the chukwa server

{{{
$ mkdir /tmp/chukwa
$ chmod 777 /tmp/chukwa
$ sudo apt-get install sysstat
$ cd /opt/hadoop
$ cp /opt/chukwa/conf/hadoop-metrics.properties.template conf/hadoop-metrics.properties
$ cp conf/hadoop-metrics.properties conf/hadoop-metrics
$ cp /opt/chukwa/chukwa-hadoop-0.4.0-client.jar ./lib/
}}}
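
A quick way to confirm the buffer directory ends up world-writable (a minimal sketch; `stat -c` assumes GNU coreutils):

```shell
# create the spool directory, open its permissions, then print the octal mode
mkdir -p /tmp/chukwa
chmod 777 /tmp/chukwa
stat -c '%a' /tmp/chukwa   # prints: 777
```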

= conf/ =
Both machines must be configured.

== alert ==
{{{
#!text
waue@nchc.org.tw
}}}

== chukwa-collector-conf.xml ==
{{{
#!text
  <property>
    <name>writer.hdfs.filesystem</name>
...
    <description>The HTTP port number the collector will listen on</description>
  </property>
}}}

== jdbc.conf ==
Rename jdbc.conf.template to jdbc.conf:
{{{
#!text
demo=jdbc:mysql://ub2:3306/test?user=root
}}}

== nagios.properties ==
{{{
#!text
log4j.appender.NAGIOS.Host=ub2
}}}

== Keep the defaults; no changes needed ==
 * aggregator.sql
 * chukwa-demux-conf.xml
 * chukwa-log4j.properties
 * commons-logging.properties
 * database_create_tables.sql
 * log4j.properties
 * mdl.xml

== chukwa-env.sh ==
{{{
#!text
export JAVA_HOME=/usr/lib/jvm/java-6-sun
export HADOOP_HOME="/opt/hadoop"
export HADOOP_CONF_DIR="/opt/hadoop/conf"
}}}

== agents ==
agents.template ==> agents
{{{
#!text
ub1
ub2
}}}

== chukwa-agent-conf.xml ==
chukwa-agent-conf.xml.template ==> chukwa-agent-conf.xml
{{{
#!text
  <property>
    <name>chukwaAgent.tags</name>
...
    <description>The hostname of the agent on this node. Usually localhost, this is used by the chukwa instrumentation agent-control interface library</description>
  </property>
}}}

== collectors ==
{{{
#!text
ub1
ub2
}}}

== initial_adaptors ==

{{{
$ cp initial_adaptors.template initial_adaptors
}}}

== log4j.properties ==

$ vim hadoop/conf/log4j.properties

{{{
#!text
log4j.appender.DRFA=org.apache.log4j.net.SocketAppender
log4j.appender.DRFA.RemoteHost=ub2
...
log4j.appender.DRFA.layout=org.apache.log4j.PatternLayout
log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
}}}

= start =
Run this only on the server (ub2).

{{{
$ cd /opt/chukwa
$ bin/start-all.sh
}}}

= Set Up HICC =
Run this only on the server (ub2).

{{{
cd /opt/chukwa
bin/chukwa hicc
}}}

[http://localhost:8080]
name / pass = admin / admin

= mysql =

Install mysql, php, phpmyadmin, and apache2.

Create database: test
Import: conf/database_create_tables.sql
The rest has not been tested yet.

= Starting Adaptors =

The local agent speaks a simple text-based protocol, by default over port 9093. Suppose you want Chukwa to monitor system metrics, hadoop metrics, and hadoop logs on the localhost:

   1. Telnet to localhost 9093

The following steps are from an older version of chukwa and will fail:

   2. Type [without quotation marks] "add org.apache.hadoop.chukwa.datacollection.adaptor.sigar.SystemMetrics SystemMetrics 60 0"
   3. Type [without quotation marks] "add SocketAdaptor HadoopMetrics 9095 0"
   4. Type [without quotation marks] "add SocketAdaptor Hadoop 9096 0"
   5. Type "list" -- you should see the adaptor you just started, listed as running.
   6. Type "close" to break the connection.
     
If you don't have telnet, you can get the same effect using the netcat (nc) command line tool.
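
For instance, the same agent-protocol commands can be piped through nc instead of typed interactively (a sketch that assumes an agent listening on its default port 9093 on localhost):

{{{
$ printf 'add org.apache.hadoop.chukwa.datacollection.adaptor.sigar.SystemMetrics SystemMetrics 60 0\nlist\nclose\n' | nc localhost 9093
}}}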

= Set Up Cluster Aggregation Script =

For data analytics with pig, some additional environment setup is required. Pig does not use the same environment variable names as Hadoop, so make sure the following environment variables are set correctly:
     
   1. export PIG_CLASSPATH=$HADOOP_CONF_DIR:$HBASE_CONF_DIR
   2. Set up a cron job for "pig -Dpig.additional.jars=${HBASE_HOME}/hbase-0.20.6.jar:${PIG_PATH}/pig.jar ${CHUKWA_HOME}/script/pig/ClusterSummary.pig" to run periodically
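
Step 2's cron entry could look like the sketch below (hedged: the hourly schedule and the crontab install command are assumptions; the jar paths come from step 2 and must match your install):

```shell
# Cron entry for the periodic pig aggregation job; single quotes keep ${...}
# literal, so HBASE_HOME, PIG_PATH and CHUKWA_HOME must be defined in the crontab.
CRON_LINE='0 * * * * pig -Dpig.additional.jars=${HBASE_HOME}/hbase-0.20.6.jar:${PIG_PATH}/pig.jar ${CHUKWA_HOME}/script/pig/ClusterSummary.pig'
echo "$CRON_LINE"
# install with: (crontab -l 2>/dev/null; echo "$CRON_LINE") | crontab -
```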
    172 
    173 Set Up HICC
    174 
    175 The Hadoop Infrastructure Care Center (HICC) is the Chukwa web user interface. To set up HICC, do the following:
    176 
    177    1. bin/chukwa hicc
    178 
    179 Data visualization
    180 
    181    1.
    182 
    183       Point web browser to http://localhost:4080/hicc/jsp/graph_explorer.jsp
    184    2. The default user name and password is "demo" without quotes.
    185    3. System Metrics collected by Chukwa collector will be browsable through graph_explorer.jsp file.