# Binutils Package

Author: Dylan Clark

Date: October 16, 2019

A basic fact: the CPU executes machine-language (binary) instructions, not source code!

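# Compilation

What is compilation? It is the process of translating a program from its source-code form, written in some language, into machine code. [1] Machine code is usually stored in a specific format, called an executable file or binary file. On Linux and BSD systems, that format is ELF (Executable and Linkable Format).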

# readelf

Example. Use readelf to view the ELF header of the ls command:

(base) huang@xian /home/huang/git/yitiduojie_source/high [55]$ readelf -h /bin/ls
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              DYN (Shared object file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x5850
  Start of program headers:          64 (bytes into file)
  Start of section headers:          132000 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         9
  Size of section headers:           64 (bytes)
  Number of section headers:         28
  Section header string table index: 27
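
The fields printed above come from a fixed-layout header at the start of the file. As a minimal sketch (not how readelf itself is implemented), the 64-byte ELF64 header can be decoded with Python's `struct` module. The function name `parse_elf64_header` and the dictionary keys are illustrative choices, and the sketch assumes a little-endian ELF64 file like the one shown here.

```python
import struct

# Fields that follow the 16-byte e_ident block in an ELF64 header
# (little endian): e_type, e_machine, e_version, e_entry, e_phoff,
# e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum, e_shentsize,
# e_shnum, e_shstrndx. Together with e_ident this is 64 bytes.
ELF64_REST = struct.Struct("<HHIQQQIHHHHHH")

# Names readelf uses for the common e_type values.
ELF_TYPES = {0: "NONE", 1: "REL", 2: "EXEC", 3: "DYN", 4: "CORE"}


def parse_elf64_header(data: bytes) -> dict:
    """Decode the first 64 bytes of a little-endian ELF64 file."""
    if len(data) < 64:
        raise ValueError("need at least 64 bytes for an ELF64 header")
    ident = data[:16]
    if ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file: bad magic number")
    if ident[4] != 2 or ident[5] != 1:
        raise ValueError("this sketch only handles little-endian ELF64")
    (e_type, e_machine, e_version, e_entry, e_phoff, e_shoff, e_flags,
     e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum,
     e_shstrndx) = ELF64_REST.unpack_from(data, 16)
    return {
        "type": ELF_TYPES.get(e_type, hex(e_type)),
        "machine": e_machine,      # 62 is x86-64
        "version": e_version,
        "entry": e_entry,
        "phoff": e_phoff,          # start of program headers
        "shoff": e_shoff,          # start of section headers
        "flags": e_flags,
        "ehsize": e_ehsize,
        "phentsize": e_phentsize,
        "phnum": e_phnum,
        "shentsize": e_shentsize,
        "shnum": e_shnum,
        "shstrndx": e_shstrndx,
    }
```

To try it on a real file, pass in the first 64 bytes, for example `parse_elf64_header(open("/bin/ls", "rb").read(64))`.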

Use the ldd command to see which system libraries the ls command depends on:

(base) huang@xian /home/huang/git/yitiduojie_source/high [56]$ ldd /bin/ls
	linux-vdso.so.1 (0x00007fffc07a2000)
	libselinux.so.1 => /lib/x86_64-linux-gnu/libselinux.so.1 (0x00007f2ab86b2000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f2ab82c1000)
	libpcre.so.3 => /lib/x86_64-linux-gnu/libpcre.so.3 (0x00007f2ab804f000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f2ab7e4b000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f2ab8afc000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f2ab7c2c000)

Take /lib/x86_64-linux-gnu/libc.so.6 from that list and inspect it with readelf:

(base) huang@xian /home/huang [9]$ readelf -h /lib/x86_64-linux-gnu/libc.so.6
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 03 00 00 00 00 00 00 00 00 
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - GNU
  ABI Version:                       0
  Type:                              DYN (Shared object file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x21cb0
  Start of program headers:          64 (bytes into file)
  Start of section headers:          2025872 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         10
  Size of section headers:           64 (bytes)
  Number of section headers:         73
  Section header string table index: 72

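As shown, /lib/x86_64-linux-gnu/libc.so.6 is a DYN (shared object) file. A shared library like this is not meant to be run as a standalone program: it must be used by an executable file that internally calls functions made available by the library. Note that the DYN type by itself does not rule out execution; /bin/ls above is also reported as DYN, because position-independent executables (PIE) carry the same type.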
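
The Magic line in the two dumps differs in a single byte (`00` versus `03`), and that byte is what readelf reports as OS/ABI. As a rough sketch, the 16 identification bytes can be decoded as follows. The helper name `describe_ident` is hypothetical, and the lookup tables are deliberately partial: they cover only the values that appear in this document plus the 32-bit and big-endian counterparts.

```python
# Partial lookup tables for the e_ident bytes; values not listed here
# are reported as "unknown" or "other" rather than guessed.
ELF_CLASS = {1: "ELF32", 2: "ELF64"}
ELF_DATA = {1: "little endian", 2: "big endian"}
ELF_OSABI = {0: "UNIX - System V", 3: "UNIX - GNU"}


def describe_ident(ident: bytes) -> dict:
    """Decode the 16-byte e_ident block that readelf prints as Magic."""
    if len(ident) < 16 or ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF identification block")
    return {
        "class": ELF_CLASS.get(ident[4], "unknown"),
        "data": ELF_DATA.get(ident[5], "unknown"),
        "version": ident[6],
        "osabi": ELF_OSABI.get(ident[7], "other"),
        "abi_version": ident[8],
    }


# The two Magic lines from the readelf output in this section.
ls_ident = bytes.fromhex("7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00")
libc_ident = bytes.fromhex("7f 45 4c 46 02 01 01 03 00 00 00 00 00 00 00 00")

print(describe_ident(ls_ident))
print(describe_ident(libc_ident))
```

Both files decode as ELF64 and little endian; only the OS/ABI entry differs between them.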

# ldd

Prints the shared libraries that a program depends on, as in the /bin/ls example above.

# size

Lists section sizes and the total size.

# strings

Prints the strings of printable characters in files.
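
The idea behind strings can be approximated in a few lines. The sketch below is a simplification, not the GNU implementation: it treats only the ASCII bytes 0x20 to 0x7e as printable and uses a minimum run length of 4, which matches the default of GNU strings. The function name `extract_strings` is made up for this example.

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> list:
    """Return every run of at least min_len printable ASCII bytes."""
    # Build a bytes pattern such as rb"[\x20-\x7e]{4,}".
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


# A small made-up buffer: text fragments separated by non-printable bytes.
sample = b"\x00\x01hello\x00ab\x00/lib64/ld-linux-x86-64.so.2\x00\xff"
print(extract_strings(sample))
```

The two-character run `ab` is dropped because it is shorter than the minimum length; calling `extract_strings(sample, min_len=2)` would keep it.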

# objdump

Displays information from object files.

# strip

Discards symbols from object files.

# addr2line

Converts addresses into file names and line numbers.

# nm

Lists symbols from object files.

# Notes

[0] Gaurav Kamathe, 9 essential GNU binutils tools.

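[1] In a TED talk, R. Antonsen put forward the idea that the essence of understanding is the ability to change one's perspective. A program is like what Antonsen calls a "pattern", and its forms in different languages are merely different perspectives on it. They are in fact equivalent.

[2] GCC, the GNU Compiler Collection: its job is to translate preprocessed source code (for example, the output of the C preprocessor, cpp) into assembly language.

[3] Assembler: its purpose is to translate assembly-language instructions into machine code and to produce an object file with the .o extension. On Linux this is usually done with the as command: # as hello.s -o hello.o

[4] Object files (extension .o) differ from executable files. An executable file often needs external functions from system libraries.

[5] The ld command is a linker. Its purpose is to link multiple object files together and produce an executable file. (How ld performs linking will be covered later.)

[6] The readelf command reports a large amount of information about a binary file, for example whether it is in 64-bit or 32-bit ELF format.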