- Operating systems (fundamentals)
- GFS (highly available, scalable, data replication, erasure coding)
- ext4 (disk layout, I/O scheduler, performance tuning): basic understanding
- btrfs (a newer file system that is still under active development; learn its disk layout, snapshots, and data integrity features; the btrfs wiki is a good starting point): basic understanding
- OS kernel (page cache): basic understanding
- Storage (fundamentals, design)
- In-memory key-value stores (Redis) (in-memory data structures, hash algorithms, distributed data partitioning algorithms, data availability)
- Distributed databases (HBase) (CAP, BASE, multi-version concurrency control, WAL, LSM tree)
- MySQL (InnoDB log, InnoDB data structure)
- Networking
- TCP/IP (three-way handshake, four-way connection teardown, TCP reset packets, TCP timeouts, TCP RTT, TCP MSL, TCP state machine)
- Zero copy (sendfile system call, the OS kernel's data copy path)
- RPC (epoll, select, and poll mechanisms; interrupt mechanism)
- Data distribution
- Routing
- Data migration (load balancing, multi-tier data storage across memory, SSD, and disk)
- Computing
- MapReduce (Hadoop, Spark) (RDD, shuffle, task schedulers in distributed computing frameworks, including FAIR, FIFO, CFQ)
- Stream computing (Flink, HERO, Spark Streaming) (differences between real-time computing and batch computing)
- Graph computing (GraphX, Dremel) (BSP and SSP models)
- High concurrency
- Producer-consumer pattern
- Lock-free data structures (linked list, map) (see the Wikipedia entry; consistency)
- CAS (overhead of volatile)
- Reentrancy (see the Wikipedia entry on reentrancy)
- Kafka
- raftor pattern
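
As a starting point for the producer-consumer item in the high-concurrency group, here is a minimal sketch in Python using a bounded `queue.Queue` shared by two threads. The names `producer`, `consumer`, `run_pipeline`, and `SENTINEL` are hypothetical and chosen only for illustration, and squaring each item merely stands in for real work.

```python
import queue
import threading

# Hypothetical end-of-stream marker; a unique object cannot collide with real items.
SENTINEL = object()


def producer(q, items):
    """Put every item on the queue, then signal that no more items will come."""
    for item in items:
        q.put(item)  # blocks while the queue is full, which gives back-pressure
    q.put(SENTINEL)


def consumer(q, results):
    """Take items off the queue until the end-of-stream marker arrives."""
    while True:
        item = q.get()  # blocks while the queue is empty
        if item is SENTINEL:
            break
        results.append(item * item)  # placeholder for real processing


def run_pipeline(items, capacity=4):
    """Run one producer and one consumer over a bounded queue and return the results."""
    q = queue.Queue(maxsize=capacity)
    results = []
    workers = [
        threading.Thread(target=producer, args=(q, items)),
        threading.Thread(target=consumer, args=(q, results)),
    ]
    for t in workers:
        t.start()
    for t in workers:
        t.join()
    return results
```

With a single producer and a single consumer, the FIFO queue preserves order, so `run_pipeline([1, 2, 3])` yields the squares in input order. The bounded capacity is the design choice worth studying: it keeps a fast producer from outrunning a slow consumer and exhausting memory.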