Version: 1.0

Architecture

How to choose a storage tool

Data model
- Key/value
- Semi-structured
  - Column-oriented
  - Document-oriented
Storage model
- In-memory
- Persistent
Consistency model
- Strictly consistent
- Eventually consistent
Physical model
- Distributed
- Single machine
Read/write performance
- Are read and write performance equal, or is one favored over the other?
- Does it support range scans, or is it better suited to random reads?
Secondary indexes
- Does it support secondary indexes?
- How well does it support secondary indexes?
Failure handling
- How does it handle server failures?
- Is it able to continue operating?
- How does it back up and restore data when a server is replaced?
- How does it perform server decommissioning?
Compression
- Is the compression method pluggable?
- What types are available?
Load balancing
- Does it support load balancing?
- How does it balance load under high throughput?
Atomic read-modify-write
Locking, waits and dead-locks
- Is it wait-free, and therefore free of deadlocks?
Security
- ??
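
The "Atomic read-modify-write" item in the checklist above can be illustrated with a compare-and-set loop. This is only a minimal sketch under stated assumptions: VersionedStore, compare_and_set and atomic_increment are hypothetical names, not the API of any real storage tool, and the internal lock merely stands in for whatever atomicity guarantee a real store provides.

```python
import threading


class VersionedStore:
    """Toy in-memory key/value store with an atomic compare-and-set (hypothetical API)."""

    def __init__(self):
        self._data = {}                # key -> (version, value)
        self._lock = threading.Lock()  # stands in for the store's internal atomicity

    def get(self, key):
        """Return (version, value); a missing key reads as version 0, value None."""
        with self._lock:
            return self._data.get(key, (0, None))

    def compare_and_set(self, key, expected_version, new_value):
        """Write only if the key is still at expected_version; report success."""
        with self._lock:
            version, _ = self._data.get(key, (0, None))
            if version != expected_version:
                return False  # someone else wrote in between; caller must retry
            self._data[key] = (version + 1, new_value)
            return True


def atomic_increment(store, key):
    """Read-modify-write using optimistic retries instead of a client-held lock."""
    while True:
        version, value = store.get(key)
        new_value = (value or 0) + 1
        if store.compare_and_set(key, version, new_value):
            return new_value


if __name__ == "__main__":
    store = VersionedStore()

    def worker():
        for _ in range(100):
            atomic_increment(store, "hits")

    threads = [threading.Thread(target=worker) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(store.get("hits"))
```

Because a client never holds a lock between its read and its write, a slow or crashed client cannot block the others, which is why this style is relevant to the "Locking, waits and dead-locks" question as well.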

Technology risk
- Individual component risk
- Interaction between components
- Unfamiliarity with a technology used in designing the system
Team risk
- Knowledge level and strength of team
- Dependency on external team
- Potentially disruptive team member
Requirements
- Poorly defined requirements or poorly defined problem
- New technologies that team members have not worked with before

CAP
- Consistency: after a successful write, every copy of the data returns the most recent value
- Availability: the system responds to every request, with no downtime
- Partition tolerance: the system continues to serve data even when a network partition cuts off some of its nodes
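
The trade-off behind these three properties can be sketched with a toy model. The code below is an illustration only, not a real replication protocol: TwoNodeStore, its "CP"/"AP" modes and the partitioned flag are hypothetical names made up for this example. During a partition, the "CP" store refuses writes to stay consistent, while the "AP" store stays available and lets the replicas diverge.

```python
class TwoNodeStore:
    """Toy two-replica store that shows the choice forced by a network partition."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.values = [None, None]  # one value per replica
        self.partitioned = False    # True when the replicas cannot reach each other

    def write(self, node, value):
        """Write through replica `node`; return True if the write was accepted."""
        if not self.partitioned:
            self.values = [value, value]  # both replicas updated together
            return True
        if self.mode == "CP":
            return False                  # give up availability to stay consistent
        self.values[node] = value         # stay available; replicas now disagree
        return True

    def read(self, node):
        """Read whatever replica `node` currently holds."""
        return self.values[node]


if __name__ == "__main__":
    for mode in ("CP", "AP"):
        store = TwoNodeStore(mode)
        store.write(0, "v1")
        store.partitioned = True
        accepted = store.write(0, "v2")
        print(mode, accepted, store.read(0), store.read(1))
```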

Logs should capture who, what and when

Who: The human, system, or service account associated with the event (e.g., a web browser user agent or a user ID)
What: the event that happened and its related metadata
When: the timestamp of the event
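
A log record carrying these three fields might look like the sketch below. The function name build_log_event, the field names and the sample values are assumptions chosen for illustration; the only requirement taken from the notes is that each record answers who, what and when.

```python
import json
from datetime import datetime, timezone


def build_log_event(who, what, metadata=None, when=None):
    """Return a dict that records who did what, and when (UTC, ISO 8601)."""
    when = when or datetime.now(timezone.utc)
    return {
        "who": who,                  # user ID, service account, or user agent
        "what": what,                # the event name
        "metadata": metadata or {},  # details related to the event
        "when": when.isoformat(),    # timestamp of the event
    }


if __name__ == "__main__":
    event = build_log_event(
        who="user-42",
        what="login_failed",
        metadata={"reason": "bad_password"},
    )
    # One JSON object per line keeps the log easy to parse and search.
    print(json.dumps(event, sort_keys=True))
```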