12.9 命令行参数

  如果我们的 hex 程序能读取通过命令行传给它的输入输出文件名, 也就是如果它能处理命令行参数的话, 那它就更好用了。 但是, 这些参数在哪里呢?

  在 UNIX® 系统启动程序之前, 它会将一些数据 push 到堆栈中, 接着跳转到程序的 _start 标签。 是的, 我说的是跳转而不是调用。 这意味着, 这些数据可以通过读取 [esp+offset], 或更简单的 pop 得到。

  �顶的值包含了命令行参数的个数。 传统上它叫做 argc, 表示 “argument count”。

  接下来是 argc 个命令行参数。 传统上这些称为 argv, 表示 “argument value(s)”。 这样, 我们便得到了 argv[0]argv[1]...argv[argc-1]。 这些并不是实际的参数, 而是指向这些参数的指针, 也就是实际参数的内存地址。 参数本身是以 NUL-结尾的字符串形式存放的。

  argv 表以一 NULL 指针结束, 这个指针的值就是 0。 还有一些其它的数据, 但前面这些已经足以让我们达到目的了。

注意: If you have come from the MS-DOS® programming environment, the main difference is that each argument is in a separate string. The second difference is that there is no practical limit on how many arguments there can be.

  Armed with this knowledge, we are almost ready for the next version of hex.asm. First, however, we need to add a few lines to system.inc:

  First, we need to add two new entries to our list of system call numbers:

%define    SYS_open    5
%define SYS_close   6

  Then we add two new macros at the end of the file:

%macro sys.open    0
    system  SYS_open
%endmacro

%macro  sys.close   0
    system  SYS_close
%endmacro

  Here, then, is our modified source code:

%include   'system.inc'

%define BUFSIZE 2048

section .data
fd.in   dd  stdin
fd.out  dd  stdout
hex db  '0123456789ABCDEF'

section .bss
ibuffer resb    BUFSIZE
obuffer resb    BUFSIZE

section .text
align 4
err:
    push    dword 1     ; return failure
    sys.exit

align 4
global  _start
_start:
    add esp, byte 8 ; discard argc and argv[0]

    pop ecx
    jecxz   .init       ; no more arguments

    ; ECX contains the path to input file
    push    dword 0     ; O_RDONLY
    push    ecx
    sys.open
    jc  err     ; open failed

    add esp, byte 8
    mov [fd.in], eax

    pop ecx
    jecxz   .init       ; no more arguments

    ; ECX contains the path to output file
    push    dword 420   ; file mode (644 octal)
    push    dword 0200h | 0400h | 01h
    ; O_CREAT | O_TRUNC | O_WRONLY
    push    ecx
    sys.open
    jc  err

    add esp, byte 12
    mov [fd.out], eax

.init:
    sub eax, eax
    sub ebx, ebx
    sub ecx, ecx
    mov edi, obuffer

.loop:
    ; read a byte from input file or stdin
    call    getchar

    ; convert it to hex
    mov dl, al
    shr al, 4
    mov al, [hex+eax]
    call    putchar

    mov al, dl
    and al, 0Fh
    mov al, [hex+eax]
    call    putchar

    mov al, ' '
    cmp dl, 0Ah
    jne .put
    mov al, dl

.put:
    call    putchar
    cmp al, dl
    jne .loop
    call    write
    jmp short .loop

align 4
getchar:
    or  ebx, ebx
    jne .fetch

    call    read

.fetch:
    lodsb
    dec ebx
    ret

read:
    push    dword BUFSIZE
    mov esi, ibuffer
    push    esi
    push    dword [fd.in]
    sys.read
    add esp, byte 12
    mov ebx, eax
    or  eax, eax
    je  .done
    sub eax, eax
    ret

align 4
.done:
    call    write       ; flush output buffer

    ; close files
    push    dword [fd.in]
    sys.close

    push    dword [fd.out]
    sys.close

    ; return success
    push    dword 0
    sys.exit

align 4
putchar:
    stosb
    inc ecx
    cmp ecx, BUFSIZE
    je  write
    ret

align 4
write:
    sub edi, ecx    ; start of buffer
    push    ecx
    push    edi
    push    dword [fd.out]
    sys.write
    add esp, byte 12
    sub eax, eax
    sub ecx, ecx    ; buffer is empty now
    ret

  In our .data section we now have two new variables, fd.in and fd.out. We store the input and output file descriptors here.

  In the .text section we have replaced the references to stdin and stdout with [fd.in] and [fd.out].

  The .text section now starts with a simple error handler, which does nothing but exit the program with a return value of 1. The error handler is before _start so we are within a short distance from where the errors occur.

  Naturally, the program execution still begins at _start. First, we remove argc and argv[0] from the stack: They are of no interest to us (in this program, that is).

  We pop argv[1] to ECX. This register is particularly suited for pointers, as we can handle NULL pointers with jecxz. If argv[1] is not NULL, we try to open the file named in the first argument. Otherwise, we continue the program as before: Reading from stdin, writing to stdout. If we fail to open the input file (e.g., it does not exist), we jump to the error handler and quit.

  If all went well, we now check for the second argument. If it is there, we open the output file. Otherwise, we send the output to stdout. If we fail to open the output file (e.g., it exists and we do not have the write permission), we, again, jump to the error handler.

  The rest of the code is the same as before, except we close the input and output files before exiting, and, as mentioned, we use [fd.in] and [fd.out].

  Our executable is now a whopping 768 bytes long.

  Can we still improve it? Of course! Every program can be improved. Here are a few ideas of what we could do:

  I shall leave these enhancements as an exercise to the reader: You already know everything you need to know to implement them.

本文档和其它文档可从这里下载:ftp://ftp.FreeBSD.org/pub/FreeBSD/doc/.

如果对于FreeBSD有问题,请先阅读文档,如不能解决再联系<questions@FreeBSD.org>.
关于本文档的问题请发信联系 <doc@FreeBSD.org>.