I translated the Cassandra slides!!

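The ACM programming contest is over, and I slacked off for a while afterward.
I'm declaring the period through about September my English-improvement months, and I'm going to study!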

Data Presentations Cassandra Sigmod

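I plan to read papers on cloud computing, so I'm starting with something easy.
This is a very rough translation and I don't fully understand the content yet... but this is just the start.
If anyone can help with the parts I left blank, left in English, or got wrong, I'd really appreciate it...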

Why Cassandra?
• Lots of data
– Copies of messages, reverse indices of messages, per user data.
• Many incoming requests resulting in a lot of random reads and random writes.
• No existing production ready solutions in the market meet these requirements.

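Why Cassandra?
・Lots of data
-Copies of messages, reverse indices of messages, and per-user data
・A large number of incoming requests, which leads to many random reads and random writes
・No production-ready solution on the market met these requirements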

Design Goals
• High availability
• Eventual consistency
– trade-off strong consistency in favor of high availability
• Incremental scalability
• Optimistic Replication
• “Knobs” to tune tradeoffs between consistency, durability and latency
• Low total cost of ownership
• Minimal administration

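Design goals
・High availability
・Eventual consistency
-Strong consistency is traded away in favor of high availability
・Incremental scalability
・Optimistic replication
・"Knobs" for tuning the tradeoffs among consistency, durability, and latency
・Low total cost of ownership
・Minimal administration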

Write Operations
• A client issues a write request to a random node in the Cassandra cluster.
• The “Partitioner” determines the nodes responsible for the data.
• Locally, write operations are logged and then applied to an in-memory version.
• Commit log is stored on a dedicated disk local to the machine.

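Write operations
・A client sends a write request to a random node in the Cassandra cluster.
・The "Partitioner" determines which nodes are responsible for the data.
・Locally, each write is first recorded in a log and then applied to an in-memory version of the data.
・The commit log is stored on a dedicated disk local to the machine.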
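
To make the "Partitioner" step concrete for myself, here is a minimal Python sketch. It assumes a consistent-hashing ring with MD5 tokens and a replication factor of 3. The slides do not say how the Partitioner actually works, so the `Ring` class, the node names, and `replicas_for` are all made up for illustration.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a position on the ring (assumption: MD5-based tokens)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Hypothetical partitioner: nodes sit on a hash ring, keys walk clockwise."""

    def __init__(self, nodes, replication_factor=3):
        self.replication_factor = min(replication_factor, len(nodes))
        # Sort nodes by token so the ring can be binary-searched.
        self.entries = sorted((token(n), n) for n in nodes)
        self.tokens = [t for t, _ in self.entries]

    def replicas_for(self, key: str):
        """Return the nodes responsible for `key`: the first node at or after
        the key's token, plus the next ones clockwise."""
        start = bisect.bisect_left(self.tokens, token(key)) % len(self.entries)
        return [
            self.entries[(start + i) % len(self.entries)][1]
            for i in range(self.replication_factor)
        ]


ring = Ring(["node-a", "node-b", "node-c", "node-d", "node-e"])
print(ring.replicas_for("user:42:inbox"))
```

Whichever node the client happens to contact can run the same lookup and forward the write, which is probably why the request can go to a random node in the first place.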

Write Properties
• No locks in the critical path
• Sequential disk access
• Behaves like a write back Cache
• Append support without read ahead
• Atomicity guarantee for a key
• “Always Writable”
– accept writes during failure scenarios

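Write properties
・No locks in the critical path
・Sequential disk access
・Behaves like a write-back cache
・Appends are supported without reading ahead
・Atomicity is guaranteed per key
・"Always writable"
-Writes are accepted even during failure scenarios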
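
Here is how I picture the local write path: log first, then update memory. This is only a sketch under my own assumptions. The `Node` class, the tab-separated log format, and the plain dict standing in for the in-memory version are hypothetical, and it leaves out things like flushing the in-memory data to disk later.

```python
import os
import tempfile


class Node:
    """Hypothetical single-node write path: commit log first, then memory."""

    def __init__(self, commit_log_path):
        # Append mode: every write lands at the end of the file,
        # so the disk is only ever accessed sequentially.
        self.log = open(commit_log_path, "a", encoding="utf-8")
        self.memtable = {}  # in-memory version of the data

    def write(self, key, column, value):
        # 1. Record the mutation in the commit log and force it to disk.
        #    (Toy format: values must not contain tabs or newlines.)
        self.log.write(f"{key}\t{column}\t{value}\n")
        self.log.flush()
        os.fsync(self.log.fileno())
        # 2. Apply it in memory; nothing is read back from disk.
        self.memtable.setdefault(key, {})[column] = value

    def close(self):
        self.log.close()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "commit.log")
    node = Node(path)
    node.write("user:42", "msg:1", "hello")
    node.write("user:42", "msg:2", "world")
    node.close()
    with open(path, encoding="utf-8") as f:
        logged = f.read().splitlines()

print(node.memtable)
print(logged)
```

Reads of recent data can be served from memory while the log only grows by appending, which is how I understand the "write-back cache" and "sequential disk access" bullets.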

Cluster Membership and Failure Detection
• Gossip protocol is used for cluster membership.
• Super lightweight with mathematically provable properties.
• State disseminated in O(logN) rounds where N is the number of nodes in the cluster.
• Every T seconds each member increments its heartbeat counter and selects one other member to send its list to.
• A member merges the list with its own list.

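Cluster membership and failure detection
・A gossip protocol is used for cluster membership
・It is very lightweight and has mathematically provable properties
・State spreads through the cluster in O(logN) rounds, where N is the number of nodes
・Every T seconds, each member increments its heartbeat counter and picks one other member to send its list to
・The member that receives a list merges it with its own list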
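
The heartbeat exchange is easier to follow as a small simulation. This sketch assumes that "merging" means keeping the highest heartbeat counter seen for each member, which the slides do not spell out. `Member`, `gossip_round`, and the 16-node cluster are hypothetical, and one loop iteration stands in for one period of T seconds.

```python
import random


class Member:
    """Hypothetical cluster member holding a heartbeat list (name -> counter)."""

    def __init__(self, name):
        self.name = name
        self.heartbeats = {name: 0}

    def tick(self):
        # Called every T seconds: bump our own heartbeat counter.
        self.heartbeats[self.name] += 1

    def merge(self, other_list):
        # Assumed merge rule: keep the highest counter seen for each member.
        for name, counter in other_list.items():
            if counter > self.heartbeats.get(name, -1):
                self.heartbeats[name] = counter


def gossip_round(members, rng):
    """Each member ticks, then sends a copy of its list to one random peer."""
    for m in members:
        m.tick()
        peer = rng.choice([p for p in members if p is not m])
        peer.merge(dict(m.heartbeats))


rng = random.Random(0)
members = [Member(f"n{i}") for i in range(16)]
rounds = 0
# Run until every member has heard of every other member (capped for safety).
while rounds < 1000 and any(len(m.heartbeats) < len(members) for m in members):
    gossip_round(members, rng)
    rounds += 1
print(f"all {len(members)} members know each other after {rounds} rounds")
```

Because members are processed one after another inside a round, information can travel more than one hop per round here, so the round count is only a rough feel for how quickly state spreads and not a check of the O(logN) claim.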

Accrual Failure Detector
• Valuable for system management, replication, load balancing etc.
• Defined as a failure detector that outputs a value, PHI, associated with each process.
• Also known as Adaptive Failure detectors - designed to adapt to changing network conditions.
• The value output, PHI, represents a suspicion level.
• Applications set an appropriate threshold, trigger suspicions and perform appropriate actions.
• In Cassandra the average time taken to detect a failure is 10-15 seconds with the PHI threshold set at 5.

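Accrual Failure Detector
・Valuable for system management, replication, load balancing, and so on
・Defined as a failure detector that outputs a value, PHI, for each monitored process
・Also known as an adaptive failure detector: it is designed to adapt to changing network conditions
・The output value, PHI, represents a suspicion level
・Applications set an appropriate threshold, raise a suspicion when it is crossed, and then take appropriate action
・In Cassandra, with the PHI threshold set to 5, detecting a failure takes 10-15 seconds on average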
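
The slides do not give a formula for PHI, so the sketch below fills one in under assumptions of my own. It models the gaps between heartbeats as exponentially distributed around their observed mean and takes PHI to be -log10 of the probability that the next heartbeat is still to come. `PhiAccrualDetector`, its window size, and the one-second heartbeat interval are hypothetical.

```python
import math
from collections import deque


class PhiAccrualDetector:
    """Hypothetical accrual failure detector for one monitored process."""

    def __init__(self, window=1000):
        self.intervals = deque(maxlen=window)  # recent heartbeat gaps, seconds
        self.last_heartbeat = None

    def heartbeat(self, now):
        """Record a heartbeat that arrived at time `now` (seconds)."""
        if self.last_heartbeat is not None:
            self.intervals.append(now - self.last_heartbeat)
        self.last_heartbeat = now

    def phi(self, now):
        """Suspicion level at time `now`."""
        if not self.intervals:
            return 0.0  # not enough history to suspect anything
        mean = sum(self.intervals) / len(self.intervals)
        elapsed = now - self.last_heartbeat
        # Assumed model: P(heartbeat arrives later than `elapsed`) = exp(-elapsed / mean).
        # PHI = -log10 of that probability, which simplifies to the line below.
        return elapsed / (mean * math.log(10))


detector = PhiAccrualDetector()
for t in range(10):  # heartbeats at t = 0, 1, ..., 9 seconds
    detector.heartbeat(float(t))

THRESHOLD = 5  # the PHI threshold mentioned in the slides
for now in (9.5, 12.0, 20.0, 25.0):
    phi = detector.phi(now)
    print(f"t={now:5.1f}s phi={phi:5.2f} suspected={phi > THRESHOLD}")
```

In this model PHI grows steadily while heartbeats are missing and drops back to 0 as soon as one arrives. Since the mean gap is measured from real arrivals, a slower network automatically makes the detector more patient, which I think is what "adaptive" means here.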

Properties of the Failure Detector
• If a process p is faulty, the suspicion level
Φ(t) → ∞ as t → ∞.
• If a process p is faulty, there is a time after which Φ(t) is monotonic increasing.
• A process p is correct ⇔ Φ(t) has an upper bound over an infinite execution.
• If process p is correct, then for any time T,
Φ(t) = 0 for some t >= T.

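Properties of the failure detector
・If process p is faulty, the suspicion level
Φ(t) → ∞ as t → ∞.
・If process p is faulty, there is a time after which Φ(t) is monotonically increasing.
・Process p is correct exactly when Φ(t) has an upper bound over an infinite execution.
・If process p is correct, then for any time T, Φ(t) returns to 0 at some time t >= T.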

Performance Benchmark
• Loading of data - limited by network bandwidth.
• Read performance for Inbox Search in production:

Performance benchmark
・Loading data: limited by network bandwidth
・Read performance of Inbox Search in production

Lessons Learnt
• Add fancy features only when absolutely required.
• Many types of failures are possible.
• Big systems need proper systems-level monitoring.
• Value simple designs

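Lessons learned
・Add fancy features only when they are absolutely necessary.
・Many kinds of failure can occur.
・Large systems need proper system-level monitoring.
・Place a high value on simple designs.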

Future work
• Atomicity guarantees across multiple keys
• Analysis support via Map/Reduce
• Distributed transactions
• Compression support
• Granular security via ACL’s

Future work
・Atomicity guarantees across multiple keys
・Support for analysis via Map/Reduce
・Distributed transactions
・Compression support
・Fine-grained security via ACLs

I'd like to read the BigTable and Dynamo papers too.
But I suppose understanding this one comes first.
That's all for today!!