| 18 | | * Use demo.crawlzilla.info, configured to crawl two levels deep |
| | 18 | * Use demo.crawlzilla.info, configured to crawl two levels deep: |
| | 19 | {{{ |
| | 20 | Index name ril |
| | 21 | Search engine index path /home/crawler/crawlzilla/user/jazz/IDB/ril/index |
| | 22 | Search engine status OK |
| | 23 | Crawl depth 2 |
| | 24 | Created at 20111028-16:57:36 |
| | 25 | Run time 0:19:4 |
| | 26 | Start URL http://cloud.nchc.org.tw/~jazz/ril_export.html |
| | 27 | }}} |
| | 28 | * The statistics show the top fifty data sources I follow (rank, source, count): |
| | 29 | ||0||http://www.digitimes.com.tw||204|| |
| | 30 | ||1||http://www.bnext.com.tw||112|| |
| | 31 | ||2||http://groups.google.com||74|| |
| | 32 | ||3||http://www.theregister.co.uk||56|| |
| | 33 | ||4||http://highscalability.com||52|| |
| | 34 | ||5||http://www.ithome.com.tw||49|| |
| | 35 | ||6||http://www.cloudera.com||48|| |
| | 36 | ||7||http://gigaom.com||44|| |
| | 37 | ||8||http://www.networkworld.com||38|| |
| | 38 | ||9||http://en.wikipedia.org||38|| |
| | 39 | ||10||http://www.zdnet.com.tw||36|| |
| | 40 | ||11||http://www.howtoforge.com||33|| |
| | 41 | ||12||http://wiki.apache.org||32|| |
| | 42 | ||13||http://www.ibm.com||28|| |
| | 43 | ||14||http://nosql.mypopescu.com||28|| |
| | 44 | ||15||http://www.freegroup.org||28|| |
| | 45 | ||16||http://ajaxian.com||27|| |
| | 46 | ||17||http://www.linuxfordevices.com||25|| |
| | 47 | ||18||http://news.networkmagazine.com.tw||24|| |
| | 48 | ||19||http://ieeexplore.ieee.org||24|| |
| | 49 | ||20||http://insidehpc.com||23|| |
| | 50 | ||21||http://www.readwriteweb.com||23|| |
| | 51 | ||22||http://www.linux-mag.com||23|| |
| | 52 | ||23||http://www.nosqldatabases.com||21|| |
| | 53 | ||24||http://only-perception.blogspot.com||21|| |
| | 54 | ||25||http://www.inside.com.tw||19|| |
| | 55 | ||26||http://www.linkedin.com||19|| |
| | 56 | ||27||http://www.openfoundry.org||18|| |
| | 57 | ||28||http://www.sys-con.com||17|| |
| | 58 | ||29||http://www.hortonworks.com||16|| |
| | 59 | ||30||http://news.cnet.com||16|| |
| | 60 | ||31||http://people.debian.org.tw||16|| |
| | 61 | ||32||http://www.h-online.com||16|| |
| | 62 | ||33||http://www.slideshare.net||15|| |
| | 63 | ||34||http://blog.sematext.com||15|| |
| | 64 | ||35||http://packages.debian.org||14|| |
| | 65 | ||36||http://lwn.net||14|| |
| | 66 | ||37||http://sourceforge.net||14|| |
| | 67 | ||38||http://virtualization.info||13|| |
| | 68 | ||39||http://www.infoq.com||13|| |
| | 69 | ||40||http://radar.oreilly.com||13|| |
| | 70 | ||41||http://blog.gslin.org||13|| |
| | 71 | ||42||http://gevaperry.typepad.com||13|| |
| | 72 | ||43||http://www.cyberciti.biz||12|| |
| | 73 | ||44||http://blog.roodo.com||12|| |
| | 74 | ||45||http://www.libthomas.org||12|| |
| | 75 | ||46||http://www.runpc.com.tw||11|| |
| | 76 | ||47||http://blog.opennebula.org||11|| |
| | 77 | ||48||http://cloudsecurity.trendmicro.com||11|| |
| | 78 | ||49||http://developer.yahoo.com||11|| |
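
The page does not say how these counts were produced. As a minimal sketch, assuming the crawled links are available as a plain list of URLs, a per-host tally in the same `||rank||source||count||` layout could be computed as follows. The names `top_sources` and `to_wiki_rows` and the sample URLs are hypothetical and are not part of Crawlzilla.

```python
from collections import Counter
from urllib.parse import urlsplit


def top_sources(urls, n=50):
    """Return the n most common scheme://host prefixes among urls."""
    counts = Counter()
    for url in urls:
        parts = urlsplit(url)
        # Skip entries that are not absolute URLs (no scheme or host).
        if parts.scheme and parts.netloc:
            counts[f"{parts.scheme}://{parts.netloc}"] += 1
    return counts.most_common(n)


def to_wiki_rows(ranked):
    """Format (source, count) pairs as ||rank||source||count|| rows."""
    return [f"||{i}||{src}||{cnt}||" for i, (src, cnt) in enumerate(ranked)]


# Hypothetical sample input, not taken from the actual crawl.
sample = [
    "http://example.com/a",
    "http://example.com/b",
    "http://example.org/x",
]
rows = to_wiki_rows(top_sources(sample))
```

Grouping on scheme plus host matches the way the table lists sources, but whether the original statistics counted links, pages, or indexed documents is an assumption left open here.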