Changes between Version 1 and Version 2 of waue/2011/chukwa
Timestamp: Jan 28, 2011, 7:16:26 PM
{{{
#!html
<div style="text-align: center; color:#151B8D"><big style="font-weight: bold;"><big><big>
chukwa 0.4 + hadoop 0.20
</big></big></big></div> <div style="text-align: center; color:#7E2217"><big style="font-weight: bold;"><big>
Chukwa Study
</big></big></div>
}}}
[[PageOutline]]

= Chukwa Study =

As is well known, Hadoop runs on a distributed cluster shared by many users and groups, so at any moment many users may be accessing the NN or JT, operating on the distributed file system or on MapReduce, and using the cluster's machines for their storage and computation. As the number of Hadoop users grows, it becomes hard for cluster operators to objectively assess the cluster's current state and trends; for example, the NN's memory might one day overflow without anyone noticing in advance. Hence the need for data showing how Hadoop is actually running.

Chukwa works from the logs emitted by the cluster's daemons: the NN, DN, JT, TT and other processes all write their logs through the log4j API, and where those logs are physically stored is determined by the log4j.properties configuration file; they can go to local files or to a database. Chukwa takes over this part of the work, controlling how these logs are recorded and collected. Chukwa consists of the following components:

|| Agents || run on each machine and emit data. || collect the logs of each process and send them to the collector ||
|| Collectors || receive data from the agents and write it to stable storage. || receive the data sent by the agents and save it to HDFS ||
|| MapReduce jobs || parse and archive the data. || analyze the data with MapReduce ||
|| HICC || the Hadoop Infrastructure Care Center; a web-portal style interface for displaying data. || HICC presents the data ||

 * !DumpTool downloads and saves the results to a MySQL database

Chukwa must be set up and run on Linux and requires a MySQL database. The Chukwa conf/ directory contains two SQL scripts, aggregator.sql and database_create_tables.sql, to import into MySQL. A running Hadoop HDFS environment is also required.

[[Image(http://incubator.apache.org/chukwa/docs/r0.4.0/images/datapipeline.png)]]
[[Image(http://incubator.apache.org/chukwa/docs/r0.4.0/images/components.gif)]]

[http://blog.csdn.net/lance_123/archive/2011/01/23/6159325.aspx]

[http://blog.csdn.net/vozon/archive/2010/09/03/5861518.aspx]

{{{
#!html
<big style="font-weight: bold;"><big>
Chukwa Setup
</big></big>
}}}

 * hadoop 0.20.1 is already running [hadoop 0.21.0 does not work]
 * chukwa 0.4
 * ub1 is the jobtracker / namenode
 * ub2 is the chukwa server

{{{
$ mkdir /tmp/chukwa
$ chmod 777 /tmp/chukwa
$ sudo apt-get install sysstat
$ cd /opt/hadoop
$ cp /opt/chukwa/conf/hadoop-metrics.properties.template conf/
$ cp conf/hadoop-metrics.properties conf/hadoop-metrics
$ cp /opt/chukwa/chukwa-hadoop-0.4.0-client.jar ./lib/
}}}

= conf/ =
Set these on both machines.

== alert ==
{{{
#!text
waue@nchc.org.tw
}}}

== chukwa-collector-conf.xml ==
{{{
#!text
<property>
<name>writer.hdfs.filesystem</name>
...
<description>The HTTP port number the collector will listen on</description>
</property>
}}}

== jdbc.conf ==
jdbc.conf.template ==> jdbc.conf
{{{
#!text
demo=jdbc:mysql://ub2:3306/test?user=root
}}}

== nagios.properties ==
{{{
#!text
log4j.appender.NAGIOS.Host=ub2
}}}
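The chukwa-collector-conf.xml excerpt above elides the property values. For reference, a filled-in version might look like the sketch below; the HDFS URL (hdfs://ub1:9000/) and port (8080) are assumptions based on a default Hadoop namenode address and the Chukwa 0.4 collector default, not values taken from this page.

{{{
#!text
<!-- hypothetical values; adjust to your namenode address and preferred port -->
<property>
<name>writer.hdfs.filesystem</name>
<value>hdfs://ub1:9000/</value>
<description>HDFS to dump to</description>
</property>
<property>
<name>chukwaCollector.http.port</name>
<value>8080</value>
<description>The HTTP port number the collector will listen on</description>
</property>
}}}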
== Defaults are fine (no changes needed) ==

 * aggregator.sql
 * chukwa-demux-conf.xml
 * chukwa-log4j.properties
 * commons-logging.properties
 * database_create_tables.sql
 * log4j.properties
 * mdl.xml

== chukwa-env.sh ==
{{{
#!text
export JAVA_HOME=/usr/lib/jvm/java-6-sun
export HADOOP_HOME="/opt/hadoop"
export HADOOP_CONF_DIR="/opt/hadoop/conf"
}}}

== agents ==
agents.template ==> agents
{{{
#!text
ub1
ub2
}}}

== chukwa-agent-conf.xml ==
chukwa-agent-conf.xml.template ==> chukwa-agent-conf.xml
{{{
#!text
<property>
<name>chukwaAgent.tags</name>
...
<description>The hostname of the agent on this node. Usually localhost, this is used by the chukwa instrumentation agent-control interface library</description>
</property>
}}}

== collectors ==
{{{
#!text
ub1
ub2
}}}

== initial_adaptors ==

$ cp initial_adaptors.template initial_adaptors

== log4j.properties ==

$ vim hadoop/conf/log4j.properties
{{{
#!text
log4j.appender.DRFA=org.apache.log4j.net.SocketAppender
log4j.appender.DRFA.RemoteHost=ub2
...
log4j.appender.DRFA.layout=org.apache.log4j.PatternLayout
log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
}}}
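Before starting the daemons, it can help to confirm that the copy steps above landed where expected. A minimal sketch, assuming this page's /opt/hadoop and /opt/chukwa layout (adjust the paths for your install):

```shell
# Check that the files the steps above should have produced exist.
# These three paths are this page's assumed layout, not fixed Chukwa paths.
for f in /opt/hadoop/lib/chukwa-hadoop-0.4.0-client.jar \
         /opt/hadoop/conf/hadoop-metrics \
         /opt/chukwa/conf/chukwa-agent-conf.xml
do
  if [ -e "$f" ]; then
    echo "ok: $f"
  else
    echo "MISSING: $f"
  fi
done
```

Each line reports ok or MISSING; the jar name matches the one copied into /opt/hadoop/lib earlier on this page.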
= start =
Run on the server only (ub2):

{{{
$ cd /opt/chukwa
$ bin/start-all.sh
}}}

= Set Up HICC =
Run on the server only (ub2):

{{{
$ cd /opt/chukwa
$ bin/chukwa hicc
}}}

[http://localhost:8080]
name / pass = admin / admin

= mysql =

Install mysql, php, phpmyadmin and apache2.

Create a database named test and import conf/database_create_tables.sql into it. The rest has not been worked out yet.

= Starting Adaptors =

The local agent speaks a simple text-based protocol, by default over port 9093. Suppose you want Chukwa to monitor system metrics, hadoop metrics, and hadoop logs on the localhost:

1. Telnet to localhost 9093
The adaptor commands below are from an older version of Chukwa and will produce errors:

2. Type [without quotation marks] "add org.apache.hadoop.chukwa.datacollection.adaptor.sigar.SystemMetrics SystemMetrics 60 0"
3. Type [without quotation marks] "add SocketAdaptor HadoopMetrics 9095 0"
4. Type [without quotation marks] "add SocketAdaptor Hadoop 9096 0"
5. Type "list" -- you should see the adaptors you just started, listed as running.
6. Type "close" to break the connection.

If you don't have telnet, you can get the same effect using the netcat (nc) command line tool.

= Set Up Cluster Aggregation Script =

For data analytics with Pig, some additional environment setup is required. Pig does not use the same environment variable names as Hadoop, so make sure the following environment is set up correctly:

1. export PIG_CLASSPATH=$HADOOP_CONF_DIR:$HBASE_CONF_DIR
2. Set up a cron job to run "pig -Dpig.additional.jars=${HBASE_HOME}/hbase-0.20.6.jar:${PIG_PATH}/pig.jar ${CHUKWA_HOME}/script/pig/ClusterSummary.pig" periodically.
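The cron job in step 2 might be written as the crontab entry sketched below. The 10-minute schedule and the profile sourcing are assumptions, not from this page; cron runs with a minimal environment, so HBASE_HOME, PIG_PATH and CHUKWA_HOME must either be exported by the sourced file or expanded to real paths in the entry.

{{{
#!text
# hypothetical crontab entry: run ClusterSummary.pig every 10 minutes
*/10 * * * * . /etc/profile; pig -Dpig.additional.jars=${HBASE_HOME}/hbase-0.20.6.jar:${PIG_PATH}/pig.jar ${CHUKWA_HOME}/script/pig/ClusterSummary.pig
}}}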