Blog

手里有一个工控机想读一下串口数据,结果装了 Ubuntu、CentOS 7 都不能正常显示用户界面,换了 CentOS 8 显示了一个灰色的界面,不能登录。

实际上我也很少在 Linux 下使用用户界面,只是现在想给别人用这个系统的浏览器,不得不尝试一下,再说遇到问题不解决总是疙瘩。

起初怀疑是驱动层的问题,查看的 drm 和 i915 的两个模块,尝试打印一些详细日志:在 grub 的 GRUB_CMDLINE_LINUX 加入一下配置项目例如 drm.debug=0x1e

实际上安装到 CentOS 8 的时候已经能够正常显示一些内容,而且 i5-4210U 已经是比较早的 CPU 了,核显的支持应该不会出问题。

考虑在 X Window System 相关的软件中寻找问题。X server 是 Linux 界面系统的服务端,负责实现 X 协议并与 drm 层交互等。

在 /var/log/Xorg.0.log 里找到了如下的日志:

[   114.183] (II) intel(0): Using Kernel Mode Setting driver: i915, version 1.6.0 20190619
[   114.195] (--) intel(0): Integrated Graphics Chipset: Intel(R) HD Graphics 4400
[   114.195] (--) intel(0): CPU: x86-64, sse2, sse3, ssse3, sse4.1, sse4.2, avx, avx2; using a maximum of 2 threads
[   114.196] (==) intel(0): Depth 24, (--) framebuffer bpp 32
[   114.196] (==) intel(0): RGB weight 888
[   114.196] (==) intel(0): Default visual is TrueColor
[   114.196] (II) intel(0): Output eDP1 using monitor section Monitor0
[   114.197] (**) intel(0): Found backlight control interface intel_backlight (type 'raw') for output eDP1
[   114.197] (II) intel(0): Enabled output eDP1
[   114.197] (II) intel(0): Output HDMI1 has no monitor section
[   114.197] (II) intel(0): Enabled output HDMI1
[   114.197] (II) intel(0): Output DP1 has no monitor section
[   114.197] (II) intel(0): Enabled output DP1
[   114.197] (II) intel(0): Output HDMI2 has no monitor section
[   114.197] (II) intel(0): Enabled output HDMI2
[   114.197] (--) intel(0): Using a maximum size of 256x256 for hardware cursors
[   114.197] (II) intel(0): Output VIRTUAL1 has no monitor section
[   114.197] (II) intel(0): Enabled output VIRTUAL1
[   114.197] (--) intel(0): Output eDP1 using initial mode 1920x1080 on pipe 0
[   114.197] (--) intel(0): Output HDMI1 using initial mode 1920x1080 on pipe 1
[   114.197] (==) intel(0): TearFree disabled
[   114.197] (==) intel(0): Using gamma correction (1.0, 1.0, 1.0)
[   114.197] (==) intel(0): DPI set to (96, 96)

可以看到实际输出到了两个端口 eDP1 和 HDMI。

我的外置显示器连接到了 HDM1。考虑到这种工控主机实际上是从笔记本 oem 方案修改简化的,一般来说 eDP 就是笔电的内置显示器输出口。

[root@bpc ~]# ls -l /sys/class/drm/
total 0
lrwxrwxrwx. 1 root root    0 Aug 18 21:11 card0 -> ../../devices/pci0000:00/0000:00:02.0/drm/card0
lrwxrwxrwx. 1 root root    0 Aug 18 21:11 card0-DP-1 -> ../../devices/pci0000:00/0000:00:02.0/drm/card0/card0-DP-1
lrwxrwxrwx. 1 root root    0 Aug 18 21:11 card0-eDP-1 -> ../../devices/pci0000:00/0000:00:02.0/drm/card0/card0-eDP-1
lrwxrwxrwx. 1 root root    0 Aug 18 21:11 card0-HDMI-A-1 -> ../../devices/pci0000:00/0000:00:02.0/drm/card0/card0-HDMI-A-1
lrwxrwxrwx. 1 root root    0 Aug 18 21:11 card0-HDMI-A-2 -> ../../devices/pci0000:00/0000:00:02.0/drm/card0/card0-HDMI-A-2
lrwxrwxrwx. 1 root root    0 Aug 18 21:11 renderD128 -> ../../devices/pci0000:00/0000:00:02.0/drm/renderD128
-r--r--r--. 1 root root 4096 Aug 18 21:11 version

工控机上输出只有一个 HDMI 和一个 VGA,但是实际上还有其他接口,执行以上命令可以看到显卡连接的所有接口。

在搜索 i5-4210U 问题的时候,在网上也看到挺多笔记本 Linux 显示的问题。其中很多建议禁用 LVDS-1。

实际上这也算一种过时的方案:旧版本的 Linux 可能不识别 eDP 这种接口,所以认为是 LVDS-1。在这里我直接禁用了 eDP-1:

GRUB_CMDLINE_LINUX="crashkernel=auto resume=/dev/mapper/cl-swap rd.lvm.lv=cl/root rd.lvm.lv=cl/swap rhgb quiet video=eDP-1:d video=HDMI-A-1:e"

另外一种就是修改 Xorg.conf 的方案(没有尝试):

Section "Device"
    Identifier "You can write anything here"
    Driver "<your Xorg display driver name here>"
    Option  "Monitor-eDP-1" "Nonexistent monitor"
EndSection

Section "Monitor"
    Identifier "Nonexistent monitor"
    Option "Enable" "false"
EndSection

参考:

https://access.redhat.com/discussions/3211901

另外,在桌面不能用的情况下拿 VNC 试了试可以访问。

事后简单看了下 VNC Server 的原理,对于应用程序来说,Xvnc 相当于 X server。而渲染后的结果由 tcp 协议传输到了用户端。用户端也通过网络传输了用户的操作。

Linux 的图形界面似乎永远不是那么好用,还是用命令行比较舒服。


本文档主要讨论如下内容:

  1. 使用 Java Servlet 接口的服务端;
  2. 使用 Java HttpClient 工具的客户端;
  3. 使用 Curl 工具的 C 语言客户端;
  4. 文件名非 ASCII 字符问题

概述

根据 HTTP 协议的普适的实现,使用 Content-Type 和 Content-Disposition 两个 Header 属性,并将数据作为二进制流写入 Response Body。这样使用可以大部分已有的工具或者类库,便于开发和调试。

Header 写法如下:

Content-Type: application/octet-stream
Content-Disposition: attachment; filename="sample.txt"

这样实现, HTTP URL 直接在浏览器地址栏直接输入,下载的文件也会自动命名为 sample.txt。

使用 Java Servlet 接口的服务端

代码如下:

@WebServlet("/download")
public class DownloadServlet extends HttpServlet {
    private final int ARBITARY_SIZE = 1048;
 
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) 
      throws ServletException, IOException {
     
        resp.setContentType("text/plain");
        resp.setHeader("Content-disposition", "attachment; filename=sample.txt");
 
        try(InputStream in = req.getServletContext().getResourceAsStream("/WEB-INF/sample.txt");
          OutputStream out = resp.getOutputStream()) {
 
            byte[] buffer = new byte[ARBITARY_SIZE];
         
            int numBytesRead;
            while ((numBytesRead = in.read(buffer)) > 0) {
                out.write(buffer, 0, numBytesRead);
            }
        }
    }
}

使用 Java HttpClient 工具的客户端

HttpEntity entity = response.getEntity();
if (entity != null) {
    String name = response.getFirstHeader('Content-Disposition').getValue();
    String fileName = disposition.replaceFirst("(?i)^.*filename=\"([^\"]+)\".*$", "$1");
    FileOutputStream fos = new FileOutputStream("C:\\" + fileName);
    entity.writeTo(fos);
    fos.close();
}

使用 Curl 工具的 C 语言客户端

Curl 工具版本 7.21.2 以上,if you specify --remote-header-name / -J,下载的文件也会自动按照 Header 里的信息命名。

参考 https://github.com/curl/curl/blob/0c76795cafe0fccab41b3adc1be08cb81d55024f/src/toolcbhdr.c#L157。

文件名非 ASCII 字符问题

RFC 5987 中,Character Set and Language Encoding for Hypertext Transfer Protocol (HTTP) Header Field Parameters 章节讨论了字符集问题。

如下代码实现了 encode 的功能:

public static String rfc5987_encode(final String s) throws UnsupportedEncodingException {
    final byte[] s_bytes = s.getBytes("UTF-8");
    final int len = s_bytes.length;
    final StringBuilder sb = new StringBuilder(len << 1);
    final char[] digits = {'0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F'};
    final byte[] attr_char = {'!','#','$','&','+','-','.','0','1','2','3','4','5','6','7','8','9',           'A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z','^','_','`',                        'a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z','|', '~'};
    for (int i = 0; i < len; ++i) {
        final byte b = s_bytes[i];
        if (Arrays.binarySearch(attr_char, b) >= 0)
            sb.append((char) b);
        else {
            sb.append('%');
            sb.append(digits[0x0f & (b >>> 4)]);
            sb.append(digits[b & 0x0f]);
        }
    }

    return sb.toString();
}

也可以参考 https://github.com/spring-projects/spring-framework/blob/master/spring-web/src/main/java/org/springframework/http/ContentDisposition.java

附录

参考:

  1. Do I need Content-Type: application/octet-stream for file download? https://stackoverflow.com/questions/20508788/do-i-need-content-type-application-octet-stream-for-file-download
  2. Content-Disposition https://developer.mozilla.org/zh-CN/docs/Web/HTTP/Headers/Content-Disposition
  3. Example of Downloading file https://www.baeldung.com/servlet-download-file
  4. Curl to grab remote filename after following location https://stackoverflow.com/questions/6881034/curl-to-grab-remote-filename-after-following-location
  5. handling filename* parameters with spaces via RFC 5987 results in '+' in filenames
  6. Character Set and Language Encoding for Hypertext Transfer Protocol (HTTP) Header Field Parameters
一份错误代码

为了解决前端 SPA 与 Spring 的结合问题,引入了一段代码,主要功能是使所有非 /api/** 形式的请求都映射到 index.html。

基本逻辑是重写  WebMvcConfigurer 的 addResourceHandlers 方法。填入一个实现了 ResourceResolver 的类。此类重写了如下两个方法。

/* 此处解析静态路径 */
Resource resolveResource(@Nullable HttpServletRequest request, String requestPath,
			List<? extends Resource> locations, ResourceResolverChain chain);

String resolveUrlPath(String resourcePath, List<? extends Resource> locations, ResourceResolverChain chain);

对这两个方法内的请求路径分别进行拦截,如果不符合 /api/** 的形式,就返回对应的静态资源或者是 index.html 。

虽然是抄来的代码用起来也挺正常。但是后边发现如果 /api/** 的 rest 请求如果未找到对应方法,不返回 404,而是直接返回 index.html。这明显不符合预期的效果。

private Resource resolve(String requestPath, List<? extends Resource> locations) {
  if (isIgnored(requestPath)) {
    return null;
  }
  if (isHandled(requestPath)) {
    return locations.stream()
        .map(loc -> createRelative(loc, requestPath))
        .filter(resource -> resource != null && resource.exists())
        .findFirst()
        .orElse(null);
  }
  return index;
}

private Resource createRelative(Resource resource, String relativePath) {
  try {
    return resource.createRelative(relativePath);
  } catch (IOException e) {
    return null;
  }
}

private boolean isIgnored(String path) {
  return ignoredPaths.contains(path);
}

private boolean isHandled(String path) {
  String extension = StringUtils.getFilenameExtension(path);
  return handledExtensions.stream().anyMatch(ext -> ext.equals(extension));
}

分析流程解析如下问题:

  1. ResourceResovler 如何生效?
  2. 请求经过 RequestMapping 和 ResourceResolver 的顺序?

提出问题之后突然有更多的问题。

例如 Component 扫描的流程、各类配置的加载流程、Boot 配置方式、DispatcherServlet 实现等。只能通过日志分析一些细节。

在 ResourceHandlerRegistry 的 getHandlerMapping 这个方法中,返回了一个 SimpleUrlHandlerMapping 的实例。HandlerMapping 是 存在与 DispatcherServlet 中的一个接口,用于映射 Url 和 Handler。

这部分应该是初始化的一个路径。

另外通过 debug 日志的分析,一般基于 RequestMapping 注解绑定的 Handler,也就是 RequestMappingHandlerMapping,解析顺序早于以上的 SimpleUrlHandlerMapping。如果这两个 Mapping 走完没有返回 404 而是一个 Resource,那么问题应该就在上面的代码中。

打了两行日志发现其实 isIgnored 函数写的并不正确。实际上应该这样:

private boolean isIgnored(String path) {
  return ignoredPaths.stream()
      .map(path::startsWith).reduce((one, more) -> one || more).orElse(false);
}

前边留了一堆关于 Spring 的疑问,留在这里有时间继续往下写。



一直想买个安卓手机折腾一下,最近正好有移动开发的活需要 Android 手机,就去买了一块红米 7,赶上 618 活动,狗东 749 大洋当天到货。

3G 运存,32G 存储基本够用,屏幕也够大,七百多买这么个手机感叹一下我们工业水平真是不错。但是,买完手机的时候页面上跳出来一个推荐是雅诗兰黛七百多的一瓶精华,我又觉得搞工业这些真是辛苦。

打开手机升级到 miui 10.3.2,样子还可以,虽然 UI 设计不能跟 Apple、Google 比,但也是社会主义初级阶段能达到的较高水平了。

简述要点:

  1. 手机打开一直没找到开发者模式,需要打开 设置 → 我的设备 → 全部参数,在 MIUI 版本 那一行连点 3 次打开开发者模式。现在貌似大部分手机都有这种操作。
  2. 按 音量下 + 电源 进入 fastboot 模式,此时连接电脑可以操作 bootloader 分区等。
  3. 按 音量上 + 电源 进入 recovery 模式。现在 root 一般由第二步骤中刷入的 twrp(TeamWin Recovery Project)进行,也就是刷入定制的 Recovery 程序,然后进行 root。
  4. 小米账号要绑定手机号,绑定手机并使用 7 天才能进行解锁,解锁软件需要从官方下载。
  5. 刷入 twrp 后,将 magisk 下载到手机存储中,进入 Recovery 模式,选择安装 magisk。



Paxos Made Simple 论文

这是 Paxos Made Simple 的主要部分,描述一致性问题并讲解为了保证一致性而应该遵循约束,进而提出算法的主要流程以及实现中可以优化的点。The next section shows that this consensus algorithm follows almost unavoidably from the properties we want it to satisfy.


2.1 The Problem
 
假设一组进程可以 propose value。
 
A consensus algorithm ensures that a single one among the proposed values is chosen. 一致性算法保证其中唯一一个值会被选中。
 
如果没有 propose value ,也没有值会被选中。如果一个值被选中,所有的进程都会接收到这个值。一致性的最基本要求如下:
 

  • Only a value that has been proposed may be chosen
  • Only a single value is chosen, and
  • A process never learns that a value has been chosen unless it actually has been. 如果没有被选中,进程不能获知一个值已经被选中。


We wont try to specify precise liveness requirements. 我们不会试图去精确的指定节点存活的要求。
 
However, the goal is to ensure that some proposed value is eventually chosen and, if a value has been chosen, then a process can eventually learn the value.
 
We let the three roles in the consensus algorithm be performed by three classes of agents: proposers, acceptors, and learners.
 
In an implementation, a single process may act as more than one agent, but the mapping from agents to processes does not concern us here. 在算法的实现中,一个进程可以扮演多个角色,但是角色与进程的映射关系在这里不需要我们关心。
 
Assume that agents can communicate with one another by sending messages.
 
We use the customary asynchronous, non-Byzantine model, in which:
 
Agents operate at arbitrary speed, may fail by stopping, and may restart. 角色的操作可以是任意速率,也可能中断,可能重置。
 

  • Since all agents may fail after a value is chosen and then restart, a solution is impossible unless some information can be remembered by an agent that has failed and restarted. 由于所有的角色可能在选中值之后失效重启,一些信息在角色失效或者重启之后必须仍然存在。
  • Messages can take arbitrarily long to be delivered, can be duplicated, and can be lost, but they are not corrupted.


 
2.2  Choosing a Value
 
The easiest way to choose a value is to have a single acceptor agent. A proposer sends a proposal to the acceptor, who chooses the first proposed value that it receives. Although simple, this solution is unsatisfactory because the failure of the acceptor makes any further progress impossible.
 
So, let’s try another way of choosing a value. Instead of a single acceptor, let’s use multiple acceptor agents. A proposer sends a proposed value to a set of acceptors. An acceptor may accept the proposed value. The value is chosen when a large enough set of acceptors have accepted it. How large is large enough?
 
To ensure that only a single value is chosen, we can let a large enough set consist of any majority of the agents. Because any two majorities have at least one acceptor in common, this works if an acceptor can accept at most one value. 为了保证只有一个值被选中,那么我们选择 majority 的角色。因为在一个 acceptor 只能接收至多一个值的情况下,两个 majorities 至少有一个 acceptor 是共有的。
 
(There is an obvious generalization of a majority that has been observed in numerous papers, apparently starting with [3].)
 
In the absence of failure or message loss, we want a value to be chosen even if only one value is proposed by a single proposer. 在故障节点缺失和消息丢失的情况下,我们希望只有一个值会被选中,就算是只有一个 Proposer 提议了一个值。
 
This suggests the requirement:
 
P1. An acceptor must accept the first proposal that it receives. 必须接收第一个 Proposal。
 
But this requirement raises a problem. Several values could be proposed by different proposers at about the same time, leading to a situation in which every acceptor has accepted a value, but no single value is accepted by a majority of them. 多个值同时被多个 Proposer 提出,没有一个达到 majority。
 
Even with just two proposed values, if each is accepted by about half the acceptors, failure of a single acceptor could make it impossible to learn which of the values was chosen.
 
P1 and the requirement that a value is chosen only when it is accepted by a majority of acceptors imply that an acceptor must be allowed to accept more than one proposal. P1 和 一个选定的值必须被 majority 接收表明一个 acceptor 允许接收多个 Proposal。
 
We keep track of the different proposals that an acceptor may accept by assigning a (natural) number to each proposal, so a proposal consists of a proposal number and a value.
 
To prevent confusion, we require that different proposals have different numbers.
 
How this is achieved depends on the implementation, so for now we just assume it. 怎样达到这个效果取决于实现,我们只是假设可以这样。
 
A value is chosen when a single proposal with that value has been accepted by a majority of the acceptors.  In that case, we say that the proposal (as well as its value) has been chosen. 一个 Proposal 被 acceptor 中的大多数 accept,那么可以说这个Proposal(也就是 value)被 chosen。
 
We can allow multiple proposals to be chosen, but we must guarantee that all chosen proposals have the same value. 我们允许多个 Proposal 被选中,但是必须保证多个 Proposal 中的值是一样的。
 
By induction on the proposal number, it suffices to guarantee: 基于对 Proposal 序数的归纳,容易保证以下条件
 
P2. If a proposal with value v is chosen, then every higher-numbered proposal that is chosen has value v. 如果一个 值为 v 的 Proposal 被选中,那么更高序号的被选中的 Proposal 都有值 v.
 
Since numbers are totally ordered, condition P2 guarantees the crucial safety property that only a single value is chosen.
 
To be chosen, a proposal must be accepted by at least one acceptor. So, we can satisfy P2 by satisfying:
 
P2a . If a proposal with value v is chosen, then every higher-numbered proposal accepted by any acceptor has value v.
 
We still maintain P1 to ensure that some proposal is chosen.
 
Because communication is asynchronous, a proposal could be chosen with some particular acceptor c never having received any proposal. Suppose a new proposer “wakes up” and issues a higher-numbered proposal with a different value.
 
P1 requires c to accept this proposal, violating P2a . Maintaining both P1 and P2a requires strengthening P2a to:
 
P2b . If a proposal with value v is chosen, then every higher-numbered proposal issued by any proposer has value v.
 
Since a proposal must be issued by a proposer before it can be accepted by an acceptor, P2b implies P2a , which in turn implies P2. 因为一个 Proposal 在被 Acceptor 接收前必须由一个 Proposor 提出,P2b 隐含 P2a,P2a 隐含 P2.
 
To discover how to satisfy P2b , let’s consider how we would prove that it holds. 为了找到如何满足 P2b,我们来证明它是成立的。
 
We would assume that some proposal with number m and value v is chosen and show that any proposal issued with number n > m also has value v. 我们假设 序号为 m,值为 v 的 Proposal 已经被 chosen,并且 任何 序号 n > m 的 Proposal 同样包含值 v。
 
We would make the proof easier by using induction on n, so we can prove that proposal number n has value v under the additional assumption that every proposal issued with a number in m . .(n − 1) has value v, where i . . j denotes the set of numbers from i through j.  对 n 归纳可以简化证明,根据条件:每个提出的议案(编号从 m 到 n-1)的决议都是 v,我们可以证明编号为 n 的议案的决议是v。
 
For the proposal numbered m to be chosen, there must be some set C consisting of a majority of acceptors such that every acceptor in C accepted it. 对于选择的议案(编号为 m),必定存在一个集合 C(acceptor的多数派),C 中的每个 acceptor 都批准了该议案。
 
Combining this with the induction assumption, the hypothesis that m is chosen implies: 结合归纳假设,m 被选择这一前提意味着:
 
Every acceptor in C has accepted a proposal with number in m . .(n − 1), and every proposal with number in m . .(n − 1) accepted by any acceptor has value v.
 
Since any set S consisting of a majority of acceptors contains at least one member of C , we can conclude that a proposal numbered n has value v by ensuring that the following invariant is maintained:
 
P2c . For any v and n, if a proposal with value v and number n is issued, then there is a set S consisting of a majority of acceptors such that either (a) no acceptor in S has accepted any proposal numbered less than n, or (b) v is the value of the highest-numbered proposal among all proposals numbered less than n accepted by the acceptors in S. (a) 要么 S 中没有 acceptor 接收过序数小于 n 的 proposal,(b) 要么 v 是在 S 中的 acceptor 接收的所有小于 n 的 序数最大的 proposal 的值。
 
We can therefore satisfy P2b by maintaining the invariance of P2c .
 
To maintain the invariance of P2c , a proposer that wants to issue a proposal numbered n must learn the highest-numbered proposal with number less than n, ,if any, that has been or will be accepted by each acceptor in some majority of acceptors. 为了维护 P2c,Proposer 想要提出一个序数为 n 的 Proposal 时,必须知道小于 n 的最大序号的 proposal,如果有,那么这个 Proposal 已经或者将要被 chosen。
 
Learning about proposals already accepted is easy enough; predicting future acceptances is hard.
 
Instead of trying to predict the future, the proposer controls it by extracting a promise that there won`t be any such acceptances. Proposer 通过获得一个 Promise 保证后面不会出现那种 获得多数 accept 的情况(也就是保证 P2c 中的 a).
 
In other words, the proposer requests that the acceptors not accept any more proposals numbered less than n. This leads to the following algorithm for issuing proposals. 换句话说,Proposer 要求 Acceptors 不会接收 序号小于 n 的 Proposal。
 
This leads to the following algorithm for issuing proposals.
 

  1. A proposer chooses a new proposal number n and sends a request to each member of some set of acceptors, asking it to respond with: Proposer 选择一个新的 Proposal 并将请求发送给 acceptor 的时候,要求 acceptor 响应如下:


 
(a) A promise never again to accept a proposal numbered less than n, and 不在接受小于该 Proposal 序数的请求
(b) The proposal with the highest number less than n that it has accepted, if any. 如果曾经 accept 任意序数大于当前请求的 Proposal,就将其返回。
 
I will call such a request a prepare request with number n 这个可以称为 prepare request
 

  1. If the proposer receives the requested responses from a majority of the acceptors, then it can issue a proposal with number n and value v, where v is the value of the highest-numbered proposal among the responses, or is any value selected by the proposer if the responders reported no proposals. 如果 Proposer 接收到多数 acceptor 的响应,那么它就可以提出一个 序数 n 以及值为 v 的 Proposal,v 可能是所有响应里序数最大的那一个的值,如果没有响应之前的 Proposal,也可以是自己提出的值。


 
A proposer issues a proposal by sending, to some set of acceptors, a request that the proposal be accepted. (This need not be the same set of acceptors that responded to the initial requests.) Let’s call this an accept request. Proposer 发送 Proposal 到一组 acceptor(不一定是响应初始请求的那一组),请求他们接受 Proposal。我们称之为 accept request.
 
This describes a proposer’s algorithm. What about an acceptor? It can receive two kinds of requests from proposers: prepare requests and accept requests.
 
An acceptor can ignore any request without compromising safety. acceptor 可以忽略任何请求,不必让步于 safety(safety 可以认为是保障达成一致性,不太好翻译)。
 
So, we need to say only when it is allowed to respond to a request. It can always respond to a prepare request. It can respond to an accept request, accepting the proposal, iff it has not promised not to. In other words:
 
P1a . An acceptor can accept a proposal numbered n iff it has not responded to a prepare request having a number greater than n. acceptor 可以接受序数为 n 的 proposal,如果它之前没有回复任何序数大于 n 的 prepare request.
 
Observe that P1a subsumes P1
 
We now have a complete algorithm for choosing a value that satisfies the required safety properties—assuming unique proposal numbers. 假设我们实现了不重复的 Proposal 序数,现在已经有一个可以获取一个值的完整算法。
 
The final algorithm is obtained by making one small optimization. 最终的算法只是在这基础之上做了一个小小的优化。
 
Suppose an acceptor receives a prepare request numbered n, but it has already responded to a prepare request numbered greater than n, thereby promising not to accept any new proposal numbered n. There is then no reason for the acceptor to respond to the new prepare request, since it will not accept the proposal numbered n that the proposer wants to issue. So we have the acceptor ignore such a prepare request. We also have it ignore a prepare request for a proposal it has already accepted.
 
With this optimization, an acceptor needs to remember only the highestnumbered proposal that it has ever accepted and the number of the highestnumbered prepare request to which it has responded. Because P2c must be kept invariant regardless of failures, an acceptor must remember this information even if it fails and then restarts. Note that the proposer can always abandon a proposal and forget all about it—as long as it never tries to issue another proposal with the same number.
 
Putting the actions of the proposer and acceptor together, we see that the algorithm operates in the following two phases.
 
Phase 1. (a) A proposer selects a proposal number n and sends a prepare request with number n to a majority of acceptors. (b) If an acceptor receives a prepare request with number n greater than that of any prepare request to which it has already responded, then it responds to the request with a promise not to accept any more proposals numbered less than n and with the highest-numbered proposal (if any) that it has accepted.
 
Phase 2. (a) If the proposer receives a response to its prepare requests (numbered n) from a majority of acceptors, then it sends an accept request to each of those acceptors for a proposal numbered n with a value v, where v is the value of the highest-numbered proposal among the responses, or is any value if the responses reported no proposals. (b) If an acceptor receives an accept request for a proposal numbered n, it accepts the proposal unless it has already responded to a prepare request having a number greater than n.
 
A proposer can make multiple proposals, so long as it follows the algorithm for each one. It can abandon a proposal in the middle of the protocol at any time. (Correctness is maintained, even though requests and/or responses for the proposal may arrive at their destinations long after the proposal was abandoned.) It is probably a good idea to abandon a proposal if some proposer has begun trying to issue a higher-numbered one. Therefore, if an acceptor ignores a prepare or accept request because it has already received a prepare request with a higher number, then it should probably inform the proposer, who should then abandon its proposal. This is a performance optimization that does not affect correctness. 一个 proposer 可以生成多个 proposal,只要它能满足算法的要求。它可以在协议的任意时刻放弃 proposal。(正确性依然是得到满足的,即使请求或者回复在对应的 proposal 被放弃了很久之后才到达目的地)当其他的 proposer 已经开始发出更高 numbe r的 proposal 时,最好放弃当前的 proposal。因此。如果 acceptor 因为它已经接受了更高 number 的 prepare request 而忽略了其他的 prepare 或者 accept request,那么它应该通知对应的 proposer 放弃该 proposal。这是对性能优化,并不会影响正确性。
 
2.3 Learning a Chosen Value
 
To learn that a value has been chosen, a learner must find out that a proposal has been accepted by a majority of acceptors. The obvious algorithm is to have each acceptor, whenever it accepts a proposal, respond to all learners, sending them the proposal. This allows learners to find out about a chosen value as soon as possible, but it requires each acceptor to respond to each learner—a number of responses equal to the product of the number of acceptors and the number of learners. 为了获取一个已经被选中的 value,learner 必须要确定已经有一个 proposal 被 majority 接受了。最显而易见的算法就是让每个acceptor 在接受了一个 proposal 之后向所有的 learner 发送这个 proposal。这能让 learner 尽快地找到被选中的 value,但这需要 acceptor 对每个 learner 进行回复——回复的数量为 acceptor 和 learner 数量的乘积。
 
The assumption of non-Byzantine failures makes it easy for one learner to find out from another learner that a value has been accepted. 在 non-Byzantine 的假设,一个 learner 可以从另外一个 learn 得知一个值已经被接收。
 
We can have the acceptors respond with their acceptances to a distinguished learner, which in turn informs the other learners when a value has been chosen. This approach requires an extra round for all the learners to discover the chosen value. It is also less reliable, since the distinguished learner could fail. But it requires a number of responses equal only to the sum of the number of acceptors and the number of learners. 我们可以让 acceptor 将批准状况返回给一个主 learner,然后再由此 learner 通知其他 learner 有一个值已经被选定。这个方案要求额外的一轮通讯。这个过程同样不可靠的,因为主 learner 可能会挂掉。但是这样只要求等同于 acceptors 和 learner 数之和的相应。
 
More generally, the acceptors could respond with their acceptances to some set of distinguished learners, each of which can then inform all the learners when a value has been chosen. Using a larger set of distinguished learners provides greater reliability at the cost of greater communication complexity. 更通用的情况,acceptor 可以与一组主 learner 通讯,每一个都可以通知其他 learner 值被选中的消息。主 learner 的数量越多,通讯越复杂,过程越可靠。
 
Because of message loss, a value could be chosen with no learner ever finding out. The learner could ask the acceptors what proposals they have accepted, but failure of an acceptor could make it impossible to know whether or not a majority had accepted a particular proposal. In that case, learners will find out what value is chosen only when a new proposal is chosen. If a learner needs to know whether a value has been chosen, it can have a proposer issue a proposal, using the algorithm described above. 由于消息丢失,可能没有 learner 知道选择了一个决议。Learner 可以向 acceptor 询问批准的议案,但是由于 acceptor 的失效,可能难以得知多数派是否批准了一个议案。这样,learner 只能在新的议案被选择时才能知道 acceptor 选择的决议。如果 learner 需要知道是否已经选择了一个决议,它可以让 proposer 根据上面的算法提出一个议案【提出请求就有回应,并且新的提案的决议就是当前选择的决议】。
 
2.4 Progress
 
It’s easy to construct a scenario in which two proposers each keep issuing a sequence of proposals with increasing numbers, none of which are ever chosen. Proposer p completes phase 1 for a proposal number n1. Another proposer q then completes phase 1 for a proposal number n2 > n1. Proposer p’s phase 2 accept requests for a proposal numbered n1 are ignored because the acceptors have all promised not to accept any new proposal numbered less than n2. So, proposer p then begins and completes phase 1 for a new proposal number n3 > n2, causing the second phase 2 accept requests of proposer q to be ignored. And so on.
 
To guarantee progress, a distinguished proposer must be selected as the only one to try issuing proposals. If the distinguished proposer can communicate successfully with a majority of acceptors, and if it uses a proposal with number greater than any already used, then it will succeed in issuing a proposal that is accepted. By abandoning a proposal and trying again if it learns about some request with a higher proposal number, the distinguished proposer will eventually choose a high enough proposal number.
 
If enough of the system (proposer, acceptors, and communication network) is working properly, liveness can therefore be achieved by electing a single distinguished proposer. The famous result of Fischer, Lynch, and Patterson [1] implies that a reliable algorithm for electing a proposer must use either randomness or real time—for example, by using timeouts. However, safety is ensured regardless of the success or failure of the election.
 
2.5 The Implementation
 
The Paxos algorithm [5] assumes a network of processes. In its consensus algorithm, each process plays the role of proposer, acceptor, and learner. The algorithm chooses a leader, which plays the roles of the distinguished proposer and the distinguished learner. The Paxos consensus algorithm is precisely the one described above, where requests and responses are sent as ordinary messages. (Response messages are tagged with the corresponding proposal number to prevent confusion.) Stable storage, preserved during failures, is used to maintain the information that the acceptor must remember. An acceptor records its intended response in stable storage before actually sending the response.
 
All that remains is to describe the mechanism for guaranteeing that no two proposals are ever issued with the same number. Different proposers choose their numbers from disjoint sets of numbers, so two different proposers never issue a proposal with the same number. Each proposer remembers (in stable storage) the highest-numbered proposal it has tried to issue, and begins phase 1 with a higher proposal number than any it has already used.
 

最初并不能很好的理解 Paxos,大概是因为对于这个算法先入为主的理解,包括它要解决什么问题以及解决到什么程度。


讲述 Paxos 的文章很多,不少人都说算法抽象晦涩难以理解。其实并没有, In fact, it is among the simplest and most obvious of distributed algorithms.

我最开始认为他解决的问题是例如 客户端向一组服务器发送命令,在某一台宕机的情况下,其它服务器的依然能够按顺序执行这些命令。然而问题并没有那么复杂,就仅仅是在有多个角色提出多个值的情况下,大家都认同其中一个值(不论是哪一个都可以)。

另外这个算法不是对于 2PC 和 3PC 的改进,这两种协议保证的是分布事务一致性,和分布式系统的一致性说的不是一回事。不应该通过理解 2PC 或者 3PC 进而理解 Paxos。

多轮投票确认一个值,中间可能有失败的投票,也可能有多轮重复确认一个值,并不是说走完两段流程就可以达成一致。这一点在 The Part-Time Parliament 用表格表现的很清楚。 而且在两段流程结束后,还需要其他的角色来确认达成了一致,并结束这一组投票。

另外如果要实现这个算法,还需要很多改进,并且根据实际情况补充细节。例如上边所说的状态机实现。

SEH 基本结构

程序初始化,包含 SEH 保存 handler 指针的操作,展示 SEH 在内存中的基本结构。

程序的入口地址 DWORD AddressOfEntryPoint 11EB。

004011EB     55                    push ebp                       ; ebp 存至 esp 0018FF8C 所指的位置,ESP - 4
004011EC     8BEC                  mov ebp,esp                    ; esp 存至 ebp
004011EE     6A FF                 push -1                        ; -1 存至栈顶 esp
004011F0     68 B8604000           push CrackMe1.004060B8         ; rdata 部分,地址所指区域作用未知
004011F5     68 0C294000           push CrackMe1.0040290C         ; text 部分,EXCEPTION HANDLER
004011FA     64:A1 00000000        mov eax,dword ptr fs:[0]       ; fs 7EFDD000,值 0018FFC4,指向 SEH chain 尾部(值 FFFFFF),dword ptr,取指针指向的 32 位值(一般为指针)
00401200     50                    push eax
00401201     64:8925 00000000      mov dword ptr fs:[0],esp       ; 当前 esp 存入 fs 指向

程序开始是有 fs:[0] 指向 0018FFC4。

执行到最后,

7EFDD000 这个地址指向的是一个结构,第一个指针指向的是下一个 SEH 结构,第二个指针则指向处理函数。


分析 nt!KiSystemService

KiSystemService 是由用户态进内核态的系统调用,简单分析,写在这里。

进入函数之前

pushfd                                                                  // ebp+70
push Argument_1                                                         // ebp+6c
push call ip                                                            // ebp+68

call

nt!KiSystemService:
80542451 6a00            push    0                                      // ebp+64
80542453 55              push    ebp                                    // ebp+60
80542454 53              push    ebx                                    // ebp+5c
80542455 56              push    esi                                    // ebp+58
80542456 57              push    edi                                    // ebp+54
80542457 0fa0            push    fs                                     // ebp+50
80542459 bb30000000      mov     ebx,30h                         
8054245e 668ee3          mov     fs,bx
80542461 64ff3500000000  push    dword ptr fs:[0]                       // ebp+4c

fs的值设为30h,根据 System Programming Guide 3.4.1 Segment Selectors 的定义,实际上是 GDTR 第6个值,解析这个数据得到虚拟地址。 直接用dg命令查看段选择子的值。

0: kd> dg 30h
                                  P Si Gr Pr Lo
Sel    Base     Limit     Type    l ze an es ng Flags
---- -------- -------- ---------- - -- -- -- -- --------
0030 ffdff000 00001fff Data RW Ac 0 Bg Pg P  Nl 00000c93

然后得到这个base地址,据说 这个地址是存放着一个 _KPCR(Processor Control Region) 结构,第一个成员指向 _NT_TIB 的一个地址,存放当前CPU的各种信息,该结构体最后一个成员是0x120 _KPRCB,它的第4个成员指向 线程控制块 CurrentThread。

80542468 64c70500000000ffffffff mov dword ptr fs:[0],0FFFFFFFFh
80542473 648b3524010000  mov     esi,dword ptr fs:[124h]               // CurrentThread _KTHREAD
8054247a ffb640010000    push    dword ptr [esi+140h]                  // _KTHREAD->PreviousMode
80542480 83ec48          sub     esp,48h
80542483 8b5c246c        mov     ebx,dword ptr [esp+6Ch]               // Argument_1
80542487 83e301          and     ebx,1
8054248a 889e40010000    mov     byte ptr [esi+140h],bl                // 修改 PreviousMode
80542490 8bec            mov     ebp,esp
80542492 8b9e34010000    mov     ebx,dword ptr [esi+134h]              // _KTHREAD->TrapFrame

TrapFrame nt!_KTRAP_FRAME

0: kd> dt nt!_KTRAP_FRAME 
   +0x000 DbgEbp           : Uint4B
   +0x004 DbgEip           : Uint4B
   +0x008 DbgArgMark       : Uint4B
   +0x00c DbgArgPointer    : Uint4B
   ...

据说 是指中断、自陷、异常进入内核后,在堆栈上形成的一种数据结构。

80542498 895d3c          mov     dword ptr [ebp+3Ch],ebx               // 存起来
8054249b 89ae34010000    mov     dword ptr [esi+134h],ebp              // 你明白吧
805424a1 fc              cld                                           // Clear Direction Flag
805424a2 8b5d60          mov     ebx,dword ptr [ebp+60h]               // 开头保存的那个 ebp
805424a5 8b7d68          mov     edi,dword ptr [ebp+68h]               // eip
805424a8 89550c          mov     dword ptr [ebp+0Ch],edx               // Argument_1
805424ab c74508000ddbba  mov     dword ptr [ebp+8],0BADB0D00h          // DbgArgMark 据说是个 Magic Number
805424b2 895d00          mov     dword ptr [ebp],ebx
805424b5 897d04          mov     dword ptr [ebp+4],edi
805424b8 f6462cff        test    byte ptr [esi+2Ch],0FFh               // _KTHREAD->DebugActive
805424bc 0f858afeffff    jne     nt!Dr_kss_a (8054234c)

如果老夫没有看错的话,这个应该是保存TrapFrame吧。咳咳。

nt!Dr_kss_a
8054234c f7457000000200  test    dword ptr [ebp+70h],20000h            // EFLAGS VM Virtual-8086 Mode
80542353 750d            jne     nt!Dr_kss_a+0x16 (80542362)           

nt!Dr_kss_a+0x9:
80542355 f7456c01000000  test    dword ptr [ebp+6Ch],1
8054235c 0f8460010000    je      nt!KiSystemService+0x71 (805424c2)

nt!Dr_kss_a+0x16:
80542362 0f21c3          mov     ebx,dr0
80542365 0f21c9          mov     ecx,dr1
80542368 0f21d7          mov     edi,dr2
8054236b 895d18          mov     dword ptr [ebp+18h],ebx
8054236e 894d1c          mov     dword ptr [ebp+1Ch],ecx
80542371 897d20          mov     dword ptr [ebp+20h],edi
80542374 0f21db          mov     ebx,dr3
80542377 0f21f1          mov     ecx,dr6
8054237a 0f21ff          mov     edi,dr7
8054237d 895d24          mov     dword ptr [ebp+24h],ebx
80542380 894d28          mov     dword ptr [ebp+28h],ecx
80542383 33db            xor     ebx,ebx
80542385 897d2c          mov     dword ptr [ebp+2Ch],edi
80542388 0f23fb          mov     dr7,ebx
8054238b 648b3d20000000  mov     edi,dword ptr fs:[20h]
80542392 8b9ff8020000    mov     ebx,dword ptr [edi+2F8h]
80542398 8b8ffc020000    mov     ecx,dword ptr [edi+2FCh]
8054239e 0f23c3          mov     dr0,ebx
805423a1 0f23c9          mov     dr1,ecx
805423a4 8b9f00030000    mov     ebx,dword ptr [edi+300h]
805423aa 8b8f04030000    mov     ecx,dword ptr [edi+304h]
805423b0 0f23d3          mov     dr2,ebx
805423b3 0f23d9          mov     dr3,ecx
805423b6 8b9f08030000    mov     ebx,dword ptr [edi+308h]
805423bc 8b8f0c030000    mov     ecx,dword ptr [edi+30Ch]
805423c2 0f23f3          mov     dr6,ebx
805423c5 0f23f9          mov     dr7,ecx
805423c8 e9f5000000      jmp     nt!KiSystemService+0x71 (805424c2)

接下来。

nt!KiSystemService+0x71:
805424c2 fb              sti                                          // Set Interrupt Flag
805424c3 e9e7000000      jmp     nt!KiFastCallEntry+0x8f (805425af)

nt!KiFastCallEntry+0x8f:
805425af 8bf8            mov     edi,eax                              // eax 里保存着 SSDT 表的编号
805425b1 c1ef08          shr     edi,8                                // edi/256
805425b4 83e730          and     edi,30h                              // 12-14位
805425b7 8bcf            mov     ecx,edi                              // edi 验证 shadow ssdt
805425b9 03bee0000000    add     edi,dword ptr [esi+0E0h]             // _KTHREAD->ServiceTable
805425bf 8bd8            mov     ebx,eax
805425c1 25ff0f0000      and     eax,0FFFh                            // 低12位
805425c6 3b4708          cmp     eax,dword ptr [edi+8]                // ServiceTable->NumberOfServices
805425c9 0f8333fdffff    jae     nt!KiBBTUnexpectedRange (80542302)

nt!KiFastCallEntry+0xaf:
805425cf 83f910          cmp     ecx,10h
805425d2 751b            jne     nt!KiFastCallEntry+0xcf (805425ef)

nt!KiFastCallEntry+0xb4:
805425d4 648b0d18000000  mov     ecx,dword ptr fs:[18h]               // _NT_TIB _TEB 当前线程控制块
805425db 33db            xor     ebx,ebx
805425dd 0b99700f0000    or      ebx,dword ptr [ecx+0F70h]            // _PEB GdiBatchCount
805425e3 740a            je      nt!KiFastCallEntry+0xcf (805425ef)

nt!KiFastCallEntry+0xc5:
805425e5 52              push    edx
805425e6 50              push    eax
805425e7 ff1548d75580    call    dword ptr [nt!KeGdiFlushUserBatch (8055d748)]
805425ed 58              pop     eax
805425ee 5a              pop     edx

nt!KiFastCallEntry+0xcf:
805425ef 64ff0538060000  inc     dword ptr fs:[638h]                  // _KPRCB->KeSystemCalls
805425f6 8bf2            mov     esi,edx                              // 
805425f8 8b5f0c          mov     ebx,dword ptr [edi+0Ch]              // ParamTableBase
805425fb 33c9            xor     ecx,ecx
805425fd 8a0c18          mov     cl,byte ptr [eax+ebx]                // Params
80542600 8b3f            mov     edi,dword ptr [edi]                  // ServiceTable 第一个函数地址
80542602 8b1c87          mov     ebx,dword ptr [edi+eax*4]            // 计算函数地址
80542605 2be1            sub     esp,ecx                              // param num 
80542607 c1e902          shr     ecx,2
8054260a 8bfc            mov     edi,esp
8054260c 3b3534315680    cmp     esi,dword ptr [nt!MmUserProbeAddress (80563134)]  // 7fff0000
80542612 0f83a8010000    jae     nt!KiSystemCallExit2+0x9f (805427c0)

nt!KiFastCallEntry+0xf8:
80542618 f3a5            rep movs dword ptr es:[edi],dword ptr [esi]  // 复制到esp
8054261a ffd3            call    ebx                                  // 调用

nt!KiFastCallEntry+0xfc:
8054261c 8be5            mov     esp,ebp

nt!KiFastCallEntry+0xfe:
8054261e 648b0d24010000  mov     ecx,dword ptr fs:[124h]              // kthread
80542625 8b553c          mov     edx,dword ptr [ebp+3Ch]              // 
80542628 899134010000    mov     dword ptr [ecx+134h],edx             // 恢复_KTRAP_FRAME
8054262e fa              cli
8054262f f7457000000200  test    dword ptr [ebp+70h],20000h
80542636 7506            jne     nt!KiServiceExit+0x10 (8054263e)

基本流程就这样。还有一些错误处理的分支没有写出来,有时间再看吧。

SSH 端口转发

SSH的三种代理方式。


正向代理,也就是说访问本地,即可访问远程主机上的内容,监听在本地端口,

ssh -L 0.0.0.0:13306:127.0.0.1:3306 user@remote.ip -Nv

这样访问本地端口上13306,也能连接到的远程主机上的MySQL。

反向代理,监听远程主机上的端口,访问远程主机上的端口,即相当于访问本地(内网)端口,相当于网站做负载均衡的时候,访问外部地址,连接到集群里的服务器。

ssh -R 127.0.0.1:1022:127.0.0.1:22 user@remote.ip -Nv

这样访问远程主机的1022端口可以连接到本地的SSH。

动态代理比较简单,就是建立一个socks代理,其实socks协议也是可以bind的。

ssh -D 127.0.0.1:1080 user@remote.ip -Nv

这个最常用。

autossh 可以断线重连,

ssh -L 30168:127.0.0.1:30168 -R 30168:127.0.0.1:30169 -CNg -D 1080 -p 22 -l user@remote.ip

也就是这样了。