TODO: please write a 300-word paper review report that summarizes what you learned from this paper, what impresses you the most, what the technical flaws are, and how it compares to other systems you knew before reading the paper.

Read More

paper mashup pre-spark

idea on big data rdbms是为富矿设计的,比如ERP,financial,ACID是基本要求 互联网的数据比如clickstream,privacy每条不值钱。一定要大量合起来分析才有价值。

Read More

Serverless Computing

TODO: Pre-Meetup Questions from CSE 291 Q: Today’s serverless functions are stateless. How do you think different functions can share data and communicate? A: lambda is stateless, so the state must be some other place, such as DB, Shared File/object, MessageQueue/event stream or HTTP API call chain.

Read More

note on architect design db

池化技术 create db connection take time. 定期检测连pool connection is still good. 它们的创建过程都比较耗时,也比较消耗系统资源 。所以,我们把它们放在一个池子里统一管理起来,以达到 提升性能和资源复用的目的

Read More

reflection on the virtualization io

Paper Reflection: Q: Is virtio a full virtualization or a paravirtualization technique? What’s its main benefit? A: It is paravirtualization because the hosted OS was modified to talk to the hypervisor. Main benefit: simplify and offload virtualization workload to the guest OS. problem space from (n*m) => (n+m)

Read More

readyset or Partial State in Dataflow-Based Materialized Views

Although this query uses the relational model cleanly to fetch the data you require for your application, this is not a query you can run in a relational database. It’s just too expensive. You’d have to do something else to figure out the star count for your repositories. Maybe de-normalize the data. Migrate your database to keep a star count in the repository table. Or maybe keep the count in a Redis instance and update your application so that starring a repository writes to your database and to Redis.

Read More


RAM资源是有限的,必然有些应用程序需要委屈下,问题在于,到底委屈哪个? 以个体来讲,公平公正,程序正义是非常重要,且带来完全感的promise。 但是,公平不是唯一的尺度,比如swap是一个非常贵的操作。以总体来讲,要trade off:降低发生swap vs 公平公正

Read More

Chip ML design

ch1 Overview

when to use ML (1) lean (2) complex patterns with (3) existing data to (4) make predictions (5) on unseen data

Read More

dp 套路

时间序列类: 套路I: 状态与前一个时间序列的状态有关,可以是一维或者二维DP,第一维表示序列,第二维表示状态数量。 LC 198. House Robber LC 213. House Robber II LC 123. Best Time to Buy and Sell Stock III LC 309. Best Time to Buy and Sell Stock with Cooldown LC 376. Wiggle Subsequence LC 276. Paint Fence LC 487. Max Consecutive Ones II LC 1186. Maximum Subarray Sum with One Deletion 903.Valid Permutations for DI Sequence

Read More

线程 vs 协程?

  • 线程由中断和特定调用分出/让出时间片和触发上下文切换。切分的是对CPU的占有
  • 协程则由 yield/await 等语言结构或特定调用分出 continuations 并切换上下文。切分的是控制流。
Read More

why the design distributed system

  • Worked on maintaining and adding features to an enterprise-wide machine learning platform that is used by various other teams at Capital One to streamline machine learning processes
  • Built an internal API that provisioned and created Kubernetes applications for ML jobs. This was done to move away from using AWS EMR to reduce deployment time from 20 minutes to less than 1 minute.
  • Migrated the API component of our machine learning platform to a newer AWS account to prevent our subnet running out of IP addresses and reducing clutter
Read More

why the design db

为什么 Redis 选择单线程模型

  1. IO multiplexing这么香, 用select监控多个连接。
  2. mutil-thread 的data racing 太麻烦了
  3. web server bottleneck 是IO不是CPU
  4. 新版本有些可以多线程来做,UNLINK、FLUSHALL ASYNC 和 FLUSHDB ASYNC. 特点是没有data race。
Read More

c++ 常用算法

// <algorithm>
pos = find(iter.begin(), iter.end(), value) -vec.begin(); // 找元素
pos = search(content.begin(), content.end(), sub.begin(), sub.end()) - content.begin(); // 找子串
fill(iter.begin(), iter.end(), value); // 填充
reverse(iter.begin(), iter.end()); // 反转元素
rotate(iter.begin(), iter.begin() + cnt, iter.end()); // 轮转 cnt 位:1 2 3 => 2 3 1 (1 位)
unique(iter.begin(), iter.end()); // 连续元素去重
bool fun(value){}
partition(iter.begin(), iter.end(), fun); // 分组(轴点),快排基础,返回值为第二组开始点
partition_point(iter.begin(), iter.end(), fun); // 返回轴点位置
sort(iter.begin(), iter.end()); // 排序,从小到大
sort(iter.being(), iter.end(), bool_fun); // 排序,自定义,true => 在前
partial_sort(iter.begin(), iter.begin() + cnt, iter.end()); // 部分排序,全部的前 n 小排序好,如 0 1 2 7 8 6 5 9 4 3
nth_element(iter.begin(), iter.begin() + nth, iter.end()); // 将有序的第 n 个放到对应位置,且前面都小于等于,后面都大于等于,但不保证有序
lower_bound(iter.begin(), iter.end(), value); // 不小于 value 的第一个值位置
upper_bound(iter.begin(), iter.end(), value); // 大于 value 的第一个位置
binary_search(iter.begin(), iter.end(), value); // 二叉搜索元素是否存在
merge(iter1.begin(), iter1.end(), iter2.begin(), iter2.end(), dest.begin()); // 合并有序的两个迭代器
inplace_merge(iter.begin(), iter.begin() + mid, iter.end()); // 将前后有序的两部分合并
includes set_difference set_intersection set_symmertric_difference set_union; // 集合用函数
is_heap(iter.begin(), iter.end()); // 判断是否是大根堆
make_heap(iter.begin(), iter.end()); // 大根堆化
push_heap(iter.begin(), iter.end()); // 将末尾元素放入前 n - 1 个元素构成的堆
pop_heap(iter.begin(), iter.end()); // 将堆顶放到末尾
sort_heap(iter.begin(), iter.end()); // 将大根堆转为升序排列
max_element(iter.begin(), iter.end()); // argmax
min_element(iter.begin(), iter.end()); // argmin
minmax_element(iter.begin(), iter.end()); // 同时返回 min max 的位置
clamp(value, lo, hi); // C++17 限制在 [lo, hi] 内
next_permutation(iter.begin(), iter.end()); // 返回下一个排序,“排序”间以字典序比较大小,字典序最大时返回 false
prev_permutation(iter.begin(), iter.end()); // 与前者相反,返回字典序更小的下一个
accumulate(iter.begin(), iter.end(), init, binary_op); // 对迭代器做聚合操作,默认 add => sum
partial_sum(iter.begin(), iter.end(), dest.begin()); // 求前缀和,可以类似前者换函数做其他聚合操作
Read More

sort + pq 贪心解决问题

253. Meeting Rooms II 用pq思路

sort by begin, pq sort by end TODO https://leetcode.com/problems/meeting-rooms-ii/solution/

class Solution {
    int minMeetingRooms(vector<vector<int>>& intervals) {
        sort(intervals.begin(), intervals.end());
        priority_queue<int, vector<int>, std::greater<int>> pq;//min
        for (int i = 1; i < intervals.size(); i++) {
            if (pq.top() <= intervals[i][0]){
        return pq.size();
Read More

妙用BST Sweep time line

Sweep time line

和刚刚的diff问题一样,只是问题颗粒不在是每个index, 而是index of interest, 在输入sparse的情况下避免了一秒一秒的数,而是event发生了再处理, 利用了bst的里面的key也是sorted的特性 Tux, the Linux mascot

Read More

Dp Roadmap

时间序列类: 套路I: 状态与前一个时间序列的状态有关,可以是一维或者二维DP,第一维表示序列,第二维表示状态数量。 LC 198. House Robber LC 213. House Robber II LC 123. Best Time to Buy and Sell Stock III LC 309. Best Time to Buy and Sell Stock with Cooldown LC 376. Wiggle Subsequence LC 276. Paint Fence LC 487. Max Consecutive Ones II LC 1186. Maximum Subarray Sum with One Deletion 903.Valid Permutations for DI Sequence

Read More

Prefix Sum

prefix Sum 用于cache了加法结果,使得sum of range立刻可以得到结果, 注意前面要加0, 值得prefix存的是[)的方式, 每次query要prefix[right+1] - prefix[left]

Read More

leetcode contest biweekly 68, weekly 273

2115. Find All Possible Recipes from Given Supplies

it is basically topo sort, remember from->to when building the graph.

class Solution {
    vector<string> findAllRecipes(vector<string>& recipes, vector<vector<string>>& ingredients, vector<string>& supplies) {
        unordered_map<string, vector<string>> graph;
        unordered_map<string, int> indegree;
        for (int i = 0; i < recipes.size(); i++) {
            string to = recipes[i];
            for (auto& from: ingredients[i]) { 
                graph[from].push_back(to); //<<--
        queue<string> q;
        for (string& seed: supplies) {
        unordered_set<string> need(recipes.begin(), recipes.end());
        vector<string> order;
        while (!q.empty()) {
            string cur = q.front(); q.pop();
            if (need.count(cur)){
            for (string to: graph[cur]) {
                if (indegree[to] == 0) {
        return order;
Read More

12-25 leetcode retro

394. Decode String

  1. keep two stacks, one for int, one for all
  2. accumulate int/string
  3. [ => push k, decoded to stack
  4. ] => pop k, decoded, then wip string = decode+ str*k
Read More

Basic Calculator

class Solution {
    int calculate(string s) {
        queue<char> q;
        for (char c: s) {
            if (c != ' ') q.push(c);
        return eval(q);
    int eval(queue<char>& q) {
        stack<int> stk;
        int num = 0;
        char op = '+';
        while(!q.empty()) {
            char c = q.front();q.pop();
            if (c == ' ') continue;
            if (isdigit(c)) {
                num = num*10 + (c - '0');
            if (c == '(') {
                num = eval(q);
            // -  12  +
            // op num i
            //because we see another op
            // stk.top() op num
            //then we know what's num, and what's op
            if (!isdigit(c) || q.empty()){
                switch(op) {
                    case '+': stk.push(num); break;
                    case '-': stk.push(-num); break;
                    case '*': stk.top() *= num; break;
                    case '/': stk.top() /= num; break;
                op = c;
                num = 0;
            if (c == ')') {
        int ans = 0;
        while (!stk.empty()) {
            ans += stk.top(); stk.pop();
        return ans;
Read More

leetcode 12-04 retro

797. All Paths From Source to Target

basic dfs to find the end node note the backtracking won’t include the root node. so either add it first in the ans, move the mark/unmark to the scope of dfs

Read More

C++ and Singleton


Read More

rabbit hole of rpath and origin

a little background on dynamic linking

the job of dynamic linker and loader(ld.so) is to resolve the executable’s dependencies on shared libraries and to load the required ones at run-time. readelf .bin to find the NEDDED headers and search the DSO with a matching SONAME

Read More

大话 design pattern

What’s design pattern?

a design pattern is a general reusable solution to a commonly occurring problem within a given context in software design

Read More

How to make a debugger

When we debug, we want to forgot about all security issue, and just want to peek the state of running applicaion. well, first thing not to do is to put debugger and debuggie to same memory space:) so they have to in different process.

Read More

How to make a chat app


tldr It is a IM protocol that use XML, it is decentialized, and XML itself contain rounting info, and it is the server that doing all the heavy work :P

Read More

Linux Graphic Stack and Wayland

3D rendering with OpenGL

  • A library, “Mesa”, implements the OpenGL API. It uses card-specific drivers to translate the API into a hardware-specific form. If the driver uses Gallium internally, there’s a shared component that turns the OpenGL API into a common intermediate representation, TGSI. The API is passed through Gallium, and all the device-specific driver does is translate from TGSI into hardware commands,
  • libdrm uses special secret card-specific ioctls to talk to the Linux kernel
  • The Linux kernel, having special permissions, can allocate memory on and for the card.
  • Back out at the Mesa level, Mesa uses DRI2 to talk to Xorg to make sure that buffer flips and window positions, etc. are synchronized.
Read More