Changes between Version 16 and Version 17 of jazz/13-06-02


Timestamp:
Jun 2, 2013, 10:30:42 AM
Author:
jazz
Comment:

--

  • jazz/13-06-02

    v16 v17  
    1818== Solr / Lucene in Practice ==
    1919
    20  * Threat Connect - http://docs.trendmicro.com/all/ent/tc/en-us/tc_olh/abt-tc.html
     20 * Threat Connect (TC) - http://docs.trendmicro.com/all/ent/tc/en-us/tc_olh/abt-tc.html
    2121   - Sandbox Report - 1.2M reports / 2.4TB / Hadoop
    2222   - PAFI ( virus scan results ) - 50M reports / 514 GB / HBase
    23    - Census (? 300GB)
     23   - Census (? Index Size : 300GB)
    2424   - Sandbox VM - Windows (?) - pcap (network packet) / screenshot - 8GB/day, 3000 malware - stored in HDFS
    2525   - Similarity Search
    2626   - Logs are written out as a Lucene Index (?) by an MR Job or Pig, then imported into Solr (Index Size: 6GB)
    2727   - Drawback: incremental index updates are not possible (this also depends on whether the incrementally updated data can be isolated (incremental data update (?)))
    28    -
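     28   - Q1: Is Census an in-house system?
     29   - Q2: Is the Sandbox a Windows VM? Does malware deliberately avoid running in a VM?
     30   - Q3: Is the collected Sandbox data incremental in nature?
     31 * How to apply Solr / Lucene to Threat Connect (TC)
     32   - Q: Do we have to write our own Web UI (RESTful API)?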
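
On the Web UI / RESTful API question: Solr already answers queries over HTTP through its `/select` handler, so a custom UI may only need to build query URLs and render the JSON response. A minimal sketch of the URL-building part follows. The host, the core name `tc_reports` and the field `sha1` are hypothetical placeholders, not names taken from TC.

```python
from urllib.parse import urlencode


def build_select_url(base_url, core, query, rows=10):
    """Build a Solr /select URL that requests JSON results.

    q, rows and wt are standard Solr query parameters; the core name
    and field names passed in by the caller are assumptions.
    """
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{base_url.rstrip('/')}/{core}/select?{params}"


# Hypothetical core and field, for illustration only.
url = build_select_url("http://localhost:8983/solr", "tc_reports", "sha1:abc123")
print(url)
```

A front end could fetch this URL and read the matching documents from `response.docs` in the returned JSON. Whether that is enough for TC depends on the answers to the questions above.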