{{{ #!html
Cloud Computing Fundamentals Course (Hadoop Introduction, Installation, and Hands-on Examples)
}}}
[[PageOutline]]

= Course Information =
 * Time: 2010/04/27 (Tue) ~ 2010/04/28 (Wed), 09:30 ~ 16:30; 2 days, 12 hours in total
 * Location: National Center for High-performance Computing (NCHC), Hsinchu Headquarters (No. 7, R&D 6th Rd., Hsinchu Science Park, Hsinchu City 300) <[http://www.nchc.org.tw/tw/about/traffic/headquarter.php map]>, Computer Classroom B
 * [https://edu.nchc.org.tw/course/one_course_introduction.asp?lms_auto_course_id=1342&from_course_list_url=course_index Course information on the registration page]

= Course Outline =
== '''2010-04-27 (Tue)''' ==
|| Morning || Topic || Slides || Lab steps || Video / Supplementary material ||
|| 09:30~11:10 || [raw-attachment:wiki:NCHCCloudCourse100427:00.CourseOutline.pdf Course introduction] and [raw-attachment:wiki:NCHCCloudCourse100427:01.CloudIntro.pdf Introduction to cloud computing] || [raw-attachment:wiki:NCHCCloudCourse100427:00.CourseOutline.pdf Part-00], [raw-attachment:wiki:NCHCCloudCourse100427:01.CloudIntro.pdf Part-01] || || 1. [http://hunch.net/?p=249 Parallel Machine Learning Problems][[BR]]2. Image-processing reference: [raw-attachment:wiki:jazz/09-11-10:09-11-12_hadoop-tw-09.pdf 吳冠龍, Communications and Multimedia Lab, Dept. of CSIE, National Taiwan University][[BR]] Image Selection for Large-Scale Flickr Photos using Hadoop[[BR]]3. ACM paper: [http://portal.acm.org/citation.cfm?id=1631528 Canonical image selection ...] ||
|| 11:10~11:20 || Break || || || ||
|| 11:20~11:50 || [raw-attachment:wiki:NCHCCloudCourse100427:02.HadoopIntro.pdf Introduction to Hadoop] || [raw-attachment:wiki:NCHCCloudCourse100427:02.HadoopIntro.pdf Part-02] || || ||
|| 11:50~12:00 || [wiki:Hadoop_Lab1 Lab A: Hadoop single-node installation and basic operation] || || [wiki:Hadoop_Lab1 Lab A] || ||
|| Afternoon || Topic || Slides || Lab steps || Video / Supplementary material ||
|| 13:00~13:30 || [raw-attachment:wiki:NCHCCloudCourse100427:03.HadoopOverview.pdf Hadoop Overview] || [raw-attachment:wiki:NCHCCloudCourse100427:03.HadoopOverview.pdf Part-03] || || ||
|| 13:30~14:30 || [raw-attachment:wiki:NCHCCloudCourse100427:04.HDFS.pdf Introduction to the Hadoop Distributed File System] || [raw-attachment:wiki:NCHCCloudCourse100427:04.HDFS.pdf Part-04] || || ||
|| 14:30~14:40 || [wiki:Hadoop_Lab2 Lab B: Practical HDFS commands] || || [wiki:Hadoop_Lab2 Lab B] || ||
|| 14:40~14:50 || Break || || || ||
|| 14:50~15:40 || [raw-attachment:wiki:NCHCCloudCourse100427:05.MapReduce.pdf Introduction to MapReduce] || [raw-attachment:wiki:NCHCCloudCourse100427:05.MapReduce.pdf Part-05] || || [grid:wiki:jazz/09-04-14#MapReduce MapReduce implementations in different languages] ||
|| 15:40~16:00 || [wiki:Hadoop_Lab3 Lab C: Running basic MapReduce operations] || || [wiki:Hadoop_Lab3 Lab C] || ||
|| 16:00~16:25 || [raw-attachment:wiki:NCHCCloudCourse100427:05-5.HadoopSetupCommand.pdf Configuration parameters explained] || [raw-attachment:wiki:NCHCCloudCourse100427:05-5.HadoopSetupCommand.pdf Part-5.5] || || [http://forum.hadoop.tw/viewtopic.php?f=4&t=45 About master / slave settings] ||
|| 16:25~16:30 || [wiki:hadoopStop Stop the Hadoop services before going home] || || [wiki:hadoopStop Stop Hadoop] || ||

== '''2010-04-28 (Wed)''' ==
 * Please [wiki:hadoopReBuild start Hadoop] first.

|| Morning || Topic || Slides || Lab steps || Video / Supplementary material ||
|| 09:00~10:30 || [raw-attachment:wiki:NCHCCloudCourse100427:06.MR_Programing.pdf MapReduce programming] || [raw-attachment:wiki:NCHCCloudCourse100427:06.MR_Programing.pdf Part-06] || || ||
|| 10:30~11:00 || [wiki:Hadoop_Lab4 Lab D: Compiling and running Hadoop programs] || || [wiki:Hadoop_Lab4 Lab D] || 1. [wiki:Streaming Streaming usage] ||
|| 11:00~11:10 || Break || || || ||
|| 11:10~12:00 || [raw-attachment:wiki:NCHCCloudCourse100427:07.Nutch.pdf Hadoop application example: introduction to the Nutch search engine] || [raw-attachment:wiki:NCHCCloudCourse100427:07.Nutch.pdf Part-07] || [wiki:Hadoop_Lab6 Lab E] || ||
|| Afternoon || Topic || Slides || Lab steps || Video / Supplementary material ||
|| 13:00~14:00 || [raw-attachment:wiki:NCHCCloudCourse100427:08.HadoopCluster.pdf Hadoop cluster installation and configuration explained] || [raw-attachment:wiki:NCHCCloudCourse100427:08.HadoopCluster.pdf Part-08] || || Yahoo Hadoop Tutorial:[[BR]] [http://developer.yahoo.com/hadoop/tutorial/module7.html Module 7: Managing a Hadoop Cluster][[BR]]- Describes the Hadoop system parameters that can be tuned for small, medium, and large clusters ||
|| 14:00~15:00 || [wiki:Hadoop_Lab7 Lab F: Hadoop cluster installation] || || [wiki:Hadoop_Lab7 Lab F] || ||
|| 15:00~15:30 || [wiki:Hadoop_Lab8 Lab G: Advanced Hadoop cluster operations] || || [wiki:Hadoop_Lab8 Lab G] || ||
|| 15:30~15:40 || Break || || || ||
|| 15:40~16:30 || [wiki:Hadoop_Lab9 Lab H: Rapid Hadoop deployment with DRBL] || [raw-attachment:wiki:NCHCCloudCourse100225:09.HadoopDRBL.pdf Part-09] || [wiki:Hadoop_Lab9 Lab H] || [http://www.screentoaster.com/watch/stUklTQ0dIR1xYSF9eU1xcVVFS DRBL-Hadoop Live CD demo] (6 min) ||
|| || [raw-attachment:wiki:NCHCCloudCourse090428:10.Conclusions.pdf Course wrap-up] || [raw-attachment:wiki:NCHCCloudCourse100225:10.Conclusions.pdf Part-10] || || ||

= Supplementary Material =
 * Supplement: [wiki:Hadoop_Lab5 Developing Hadoop programs with Eclipse]
 * A !NetBeans-based MapReduce development environment - [http://www.hadoopstudio.org/ Hadoop Studio] - Karmasphere Studio for Hadoop is a MapReduce development environment (IDE) based on !NetBeans.
 * How to modify the Hadoop source code:
   * Depending on what you want to change, find the corresponding source code under hadoop-*/src (e.g. !FairScheduler, !NameNode, !DataNode).
   * After making your changes, go back to the hadoop-* directory and run ant to recompile.
   * Related forum discussion: [http://forum.hadoop.tw/viewtopic.php?f=7&t=19 How do I compile Hadoop's scheduler?]
 * [http://www.paolocorti.net/2009/12/06/using-mongodb-to-store-geographic-data/ Using MongoDb to store geographic data]
   * GIS (geographic information system) data sets are very large, and providing a distributed database that can serve queries over such data is a problem many cloud systems need to solve. MongoDB is one implementation of a NoSQL database.
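
The steps for modifying the Hadoop source listed above (locate the file under hadoop-*/src, edit it, then run ant from the hadoop-* directory) could be scripted roughly as follows. This is only a minimal sketch under stated assumptions: the helper names find_sources and rebuild are hypothetical, the search assumes each class lives in a file named after it with a .java extension, and rebuild defaults to a dry run that only returns the command instead of invoking ant.

```python
from pathlib import Path
import subprocess


def find_sources(hadoop_home, class_name):
    """Return every <class_name>.java found under <hadoop_home>/src, sorted.

    Hypothetical helper: it searches by file name only and does not
    check package names or file contents.
    """
    src_dir = Path(hadoop_home) / "src"
    return sorted(src_dir.rglob(f"{class_name}.java"))


def rebuild(hadoop_home, dry_run=True):
    """Recompile by running `ant` from the top-level Hadoop directory.

    With dry_run=True (the default) nothing is executed; the command
    that would be run is returned so it can be inspected first.
    """
    cmd = ["ant"]
    if not dry_run:
        # Assumes `ant` is installed and on PATH; raises on a failed build.
        subprocess.run(cmd, cwd=hadoop_home, check=True)
    return cmd
```

For example, find_sources(hadoop_home, "FairScheduler") lists candidate files to edit, where hadoop_home is the path of your own unpacked hadoop-* directory, and rebuild(hadoop_home, dry_run=False) then triggers the recompile.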