diff --git a/doc/developer_ibm_com_articles_l_gas_nasm.txt b/doc/developer_ibm_com_articles_l_gas_nasm.txt
new file mode 100644
index 0000000..8d0850d
--- /dev/null
+++ b/doc/developer_ibm_com_articles_l_gas_nasm.txt
@@ -0,0 +1,544 @@
+Linux assemblers: A comparison of GAS and NASM
+A side-by-side look at GNU Assembler (GAS) and Netwide Assembler (NASM)
+By Ram Narayan
+Published October 17, 2007
+
+Introduction
+Unlike other languages, assembly programming involves understanding the
+processor architecture of the machine that is being programmed. Assembly
+programs are not at all portable and are often cumbersome to maintain and
+understand, and can often contain a large number of lines of code. But with
+these limitations comes the advantage of speed and size of the runtime binary
+that executes on that machine.
+
+Though much information is already available on assembly level programming on
+Linux, this article aims to more specifically show the differences between
+syntaxes in a way that will help you more easily convert from one flavor of
+assembly to another. The article evolved from my own quest to improve at
+this conversion.
+
+This article uses a series of program examples. Each program illustrates some
+feature and is followed by a discussion and comparison of the syntaxes.
+Although it’s not possible to cover every difference that exists between
+NASM and GAS, I do try to cover the main points and provide a foundation for
+further investigation. And for those already familiar with both NASM and GAS,
+you might still find something useful here, such as macros.
+
+This article assumes you have at least a basic understanding of assembly
+terminology and have programmed with an assembler using Intel® syntax,
+perhaps using NASM on Linux or Windows. This article does not teach how to
+type code into an editor or how to assemble and link. You should be familiar
+with the Linux operating system (any Linux distribution will do; I used Red
+Hat and Slackware) and basic GNU tools such as gcc and ld, and you should be
+programming on an x86 machine.
+
+Now I’ll describe what this article does and does not cover.
+
+Building the examples
+
+Assembling:
+GAS:
+as -o program.o program.s
+
+NASM:
+nasm -f elf -o program.o program.asm
+
+Linking (common to both kinds of assembler):
+ld -o program program.o
+
+Linking when an external C library is to be used:
+ld --dynamic-linker /lib/ld-linux.so.2 -lc -o program program.o
+
+This article covers:
+
+- Basic syntactical differences between NASM and GAS
+- Common assembly level constructs such as variables, loops, labels, and macros
+- A bit about calling external C routines and using functions
+- Assembly mnemonic differences and usage
+- Memory addressing methods
+
+This article does not cover:
+
+- The processor instruction set
+- Various forms of macros and other constructs particular to an assembler
+- Assembler directives peculiar to either NASM or GAS
+- Features that are not commonly used or are found only in one assembler but
+  not in the other
+
+For more information, refer to the official assembler manuals, as those are
+the most complete sources of information.
+
+Basic structure
+Listing 1 shows a minimal program that does nothing but exit with an exit code
+of 2. It illustrates the basic structure of an assembly program for both GAS
+and NASM.
+
+Line | NASM                                   | GAS
+001  | ; Text segment begins                  | # Text segment begins
+002  | section .text                          | .section .text
+003  |                                        |
+004  |     global _start                      |     .globl _start
+005  |                                        |
+006  | ; Program entry point                  | # Program entry point
+007  | _start:                                | _start:
+008  |                                        |
+009  | ; Put the code number for system call  | # Put the code number for system call
+010  |     mov eax, 1                         |     movl $1, %eax
+011  |                                        |
+012  | ; Return value                         | /* Return value */
+013  |     mov ebx, 2                         |     movl $2, %ebx
+014  |                                        |
+015  | ; Call the OS                          | # Call the OS
+016  |     int 80h                            |     int $0x80
+Listing 1. A program that exits with an exit code of 2
+
+Now for a bit of explanation.
+
+One of the biggest differences between NASM and GAS is the syntax. GAS uses
+the AT&T syntax, a relatively archaic syntax that is specific to GAS and some
+older assemblers, whereas NASM uses the Intel syntax, supported by a majority
+of assemblers such as TASM and MASM. (Modern versions of GAS do support a
+directive called .intel_syntax, which allows the use of Intel syntax with GAS.)
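+
+As a minimal sketch of that directive, the GAS side of Listing 1 could be
+rewritten in Intel syntax as follows. The noprefix argument (which drops the %
+from register names), the file name, and the build commands in the comments
+are not from this article; the --32 and -m elf_i386 options are an assumption
+that matters only on a 64-bit host.
+
+```asm
+# exit2_intel.s (hypothetical file name)
+# Possible build: as --32 -o exit2_intel.o exit2_intel.s
+#                 ld -m elf_i386 -o exit2_intel exit2_intel.o
+
+        .intel_syntax noprefix  # Intel operand order, registers without %
+
+        .section .text
+        .globl _start
+
+_start:
+        mov     eax, 1          # system call number for exit
+        mov     ebx, 2          # exit code
+        int     0x80            # call the OS
+```
+
+Directives still begin with a dot and comments still begin with #, so this is
+GAS with Intel-style operands rather than NASM source.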
+
+The following are some of the major differences summarized from the GAS manual:
+
+- AT&T and Intel syntax use the opposite order for source and destination
+  operands. For example:
+
+      Intel: mov eax, 4
+      AT&T:  movl $4, %eax
+
+- In AT&T syntax, immediate operands are preceded by $; in Intel syntax,
+  immediate operands are not. For example:
+
+      Intel: push 4
+      AT&T:  pushl $4
+
+- In AT&T syntax, register operands are preceded by %; in Intel syntax, they
+  are not.
+
+- In AT&T syntax, the size of memory operands is determined from the last
+  character of the opcode name. Opcode suffixes of b, w, and l specify byte
+  (8-bit), word (16-bit), and long (32-bit) memory references. Intel syntax
+  accomplishes this by prefixing memory operands (not the opcodes themselves)
+  with byte ptr, word ptr, and dword ptr. Thus:
+
+      Intel: mov al, byte ptr foo
+      AT&T:  movb foo, %al
+
+  (The ptr form is the MASM/TASM spelling. NASM drops ptr and requires square
+  brackets around memory operands: mov al, byte [foo].)
+
+- Immediate form long jumps and calls are lcall/ljmp $section, $offset in AT&T
+  syntax; the Intel syntax is call/jmp far section:offset. The far return
+  instruction is lret $stack-adjust in AT&T syntax, whereas Intel uses ret far
+  stack-adjust.
+
+In both assemblers, the names of registers remain the same, but the syntax
+for using them is different, as is the syntax for addressing modes. In
+addition, assembler directives in GAS begin with a “.”, but not in NASM.
+
+The .text section holds the program’s executable code. The global (also .globl
+or .global in GAS) keyword is used to make a symbol visible to the linker and
+available to other linking object modules. On the NASM side of Listing 1,
+global _start marks the symbol _start as a visible identifier so the linker
+knows where to jump into the program and begin execution. The same holds on
+the GAS side: in both cases it is the linker (ld) that looks for _start as the
+default entry point of a program. In GAS a label must end with a colon; in NASM
+the colon is optional, although it is customary for code labels.
+
+Software interrupts are a way to inform the OS that its services are required.
+The int instruction in line 16 does this job in our program. Both GAS and NASM
+use the same mnemonic for interrupts. GAS uses the 0x prefix to specify a hex
+number, whereas the NASM listing uses the h suffix (NASM also accepts the 0x
+prefix). Because immediate operands are prefixed with $ in GAS, 80 hex is
+$0x80.
+
+int $0x80 (or 80h in NASM) is used to invoke Linux and request a service. The
+service code is present in the EAX register. A value of 1 (for the Linux exit
+system call) is stored in EAX to request that the program exit. Register EBX
+contains the exit code (2, in our case), a number that is returned to the OS.
+(You can track this number by typing echo $? at the command prompt.)
+
+Finally, a word about comments. GAS supports C style (/* */), C++ style (//),
+and shell style (#) comments. NASM supports single-line comments that begin
+with the “;” character.
+
+Variables and accessing memory
+This section begins with an example program that finds the largest of three
+numbers.
+
+Line | NASM                                   | GAS
+001  | ; Data section begins                  | // Data section begins
+002  | section .data                          | .section .data
+003  |                                        |
+004  | var1 dd 40                             | var1:
+005  |                                        |     .int 40
+006  | var2 dd 20                             | var2:
+007  |                                        |     .int 20
+008  | var3 dd 30                             | var3:
+009  |                                        |     .int 30
+010  |                                        |
+011  | section .text                          | .section .text
+012  |                                        |
+013  |     global _start                      |     .globl _start
+014  |                                        |
+015  | _start:                                | _start:
+016  |                                        |
+017  | ; Move the contents of variables       | # move the contents of variables
+018  |     mov ecx, [var1]                    |     movl (var1), %ecx
+019  |     cmp ecx, [var2]                    |     cmpl (var2), %ecx
+020  |     jg check_third_var                 |     jg check_third_var
+021  |     mov ecx, [var2]                    |     movl (var2), %ecx
+022  |                                        |
+023  | check_third_var:                       | check_third_var:
+024  |     cmp ecx, [var3]                    |     cmpl (var3), %ecx
+025  |     jg _exit                           |     jg _exit
+026  |     mov ecx, [var3]                    |     movl (var3), %ecx
+027  |                                        |
+028  | _exit:                                 | _exit:
+029  |     mov eax, 1                         |     movl $1, %eax
+030  |     mov ebx, ecx                       |     movl %ecx, %ebx
+031  |     int 80h                            |     int $0x80
+Listing 2. A program that finds the maximum of three numbers
+
+You can see several differences above in the declaration of memory variables.
+NASM uses the dd, dw, and db directives to declare 32-, 16-, and 8-bit
+numbers, respectively, whereas GAS uses .long (or .int, as in Listing 2),
+.short (or .word), and .byte for the same purpose. GAS has other directives
+too, such as .ascii, .asciz, and .string. In GAS, you declare variables just
+like other labels (using a colon), but in NASM you simply type a variable name
+(without the colon) before the memory allocation directive (dd, dw, etc.),
+followed by the value of the variable.
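+
+As a rough side-by-side of those directives, the following GAS data section
+declares one value of each size. The label names are hypothetical, and the
+NASM counterparts in the comments are there only for orientation.
+
+```asm
+        .section .data
+wide:   .long   40              # 32 bits; NASM: wide dd 40
+wide2:  .int    40              # also 32 bits on x86, same as .long
+half:   .short  40              # 16 bits; NASM: half dw 40 (.word also works)
+tiny:   .byte   40              # 8 bits;  NASM: tiny db 40
+raw:    .ascii  "abc"           # three bytes, no terminator
+cstr:   .asciz  "abc"           # four bytes: the text plus a trailing zero
+```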
+
+Line 18 in Listing 2 illustrates how the contents of a memory variable are
+read. NASM uses square brackets to fetch the value stored at a memory
+location: [var1]. Without the brackets, var1 stands for the address itself.
+GAS uses parentheses for the same value: (var1); a bare var1 (with no $
+prefix) is equivalent in GAS. The use of other addressing modes is covered
+later in this article.
+
+Using macros
+Listing 3 illustrates the concepts of this section; it accepts the user’s
+name as input and returns a greeting.
+
+Line | NASM                                   | GAS
+001  | section .data                          | .section .data
+002  |                                        |
+003  | prompt_str db 'Enter your name: '      | prompt_str:
+004  |                                        |     .ascii "Enter Your Name: "
+005  | ; $ is the location counter            | pstr_end:
+006  | STR_SIZE equ $ - prompt_str            |     .set STR_SIZE, pstr_end - prompt_str
+007  |                                        |
+008  | greet_str db 'Hello '                  | greet_str:
+009  |                                        |     .ascii "Hello "
+010  | GSTR_SIZE equ $ - greet_str            | gstr_end:
+011  |                                        |     .set GSTR_SIZE, gstr_end - greet_str
+012  |                                        |
+013  |                                        |
+014  | section .bss                           | .section .bss
+015  |                                        |
+016  | ; Reserve 32 bytes of memory           | // Reserve 32 bytes of memory
+017  | buff resb 32                           |     .lcomm buff, 32
+018  |                                        |
+019  | ; A macro with two parameters          | // A macro with two parameters
+020  | ; Implements the write system call     | // implements the write system call
+021  | %macro write 2                         | .macro write str, str_size
+022  |     mov eax, 4                         |     movl $4, %eax
+023  |     mov ebx, 1                         |     movl $1, %ebx
+024  |     mov ecx, %1                        |     movl \str, %ecx
+025  |     mov edx, %2                        |     movl \str_size, %edx
+026  |     int 80h                            |     int $0x80
+027  | %endmacro                              | .endm
+028  |                                        |
+029  |                                        |
+030  | ; Implements the read system call      | // Implements the read system call
+031  | %macro read 2                          | .macro read buff, buff_size
+032  |     mov eax, 3                         |     movl $3, %eax
+033  |     mov ebx, 0                         |     movl $0, %ebx
+034  |     mov ecx, %1                        |     movl \buff, %ecx
+035  |     mov edx, %2                        |     movl \buff_size, %edx
+036  |     int 80h                            |     int $0x80
+037  | %endmacro                              | .endm
+038  |                                        |
+039  | section .text                          | .section .text
+040  |                                        |
+041  |     global _start                      |     .globl _start
+042  |                                        |
+043  | _start:                                | _start:
+044  |                                        |
+045  |     write prompt_str, STR_SIZE         |     write $prompt_str, $STR_SIZE
+046  |     read buff, 32                      |     read $buff, $32
+047  |                                        |
+048  | ; Read returns the length in eax       | // Read returns the length in eax
+049  |     push eax                           |     pushl %eax
+050  |                                        |
+051  | ; Print the hello text                 | // Print the hello text
+052  |     write greet_str, GSTR_SIZE         |     write $greet_str, $GSTR_SIZE
+053  |                                        |
+054  |     pop edx                            |     popl %edx
+055  |                                        |
+056  | ; edx = length returned by read        | // edx = length returned by read
+057  |     write buff, edx                    |     write $buff, %edx
+058  |                                        |
+059  | _exit:                                 | _exit:
+060  |     mov eax, 1                         |     movl $1, %eax
+061  |     mov ebx, 0                         |     movl $0, %ebx
+062  |     int 80h                            |     int $0x80
+Listing 3. A program to read a string and display a greeting to the user
+
+The heading for this section promises a discussion of macros, and both NASM
+and GAS certainly support them. But before we get into macros, a few other
+features are worth comparing.
+
+Listing 3 illustrates the concept of uninitialized memory, defined using the
+.bss section directive (line 14). BSS stands for “block storage segment”
+(originally, “block started by symbol”), and the memory reserved in the
+BSS section is initialized to zero when the program starts. Objects in the BSS
+section have only a name and a size, and no value. Unlike variables in the
+data segment, variables declared in the BSS section take up no space in the
+executable file; their memory is set aside only when the program is loaded.
+
+NASM uses the resb, resw, and resd keywords to allocate byte, word, and dword
+space in the BSS section. GAS, on the other hand, uses the .lcomm keyword to
+allocate byte-level space. Notice the way the variable name is declared in
+both versions of the program. In NASM the variable name precedes the resb (or
+resw or resd) keyword, followed by the amount of space to be reserved, whereas
+in GAS the variable name follows the .lcomm keyword, which is then followed by
+a comma and then the amount of space to be reserved. This shows the difference:
+
+NASM: varname resb size
+
+GAS: .lcomm varname, size
+
+Listing 3 also introduces the concept of a location counter (line 6). NASM
+provides two special tokens for reading it: $ evaluates to the address of the
+current line, and $$ to the address of the start of the current section. GAS
+exposes the location counter as the special symbol . (dot), but the GAS side
+of Listing 3 takes another route and uses labels to mark the next storage
+location (data, instruction, etc.).
+
+For example, to calculate the length of a string, you would use the following
+idiom in NASM:
+
+prompt_str db 'Enter your name: '
+STR_SIZE equ $ - prompt_str     ; $ is the location counter
+
+The $ gives the current value of the location counter, and subtracting the
+value of the label (all variable names are labels) from this location counter
+gives the number of bytes present between the declaration of the label and the
+current location. The equ directive is used to set the value of the variable
+STR_SIZE to the expression following it. A similar idiom in GAS looks like
+this:
+
+prompt_str:
+    .ascii "Enter Your Name: "
+pstr_end:
+    .set STR_SIZE, pstr_end - prompt_str
+
+The end label (pstr_end) gives the next location address, and subtracting the
+starting label address gives the size. Also note the use of .set to initialize
+the value of the symbol STR_SIZE to the expression following the comma. A
+corresponding .equ can also be used. Unlike NASM’s equ, a symbol defined with
+.set can be assigned a new value later in the file; NASM has no directive with
+exactly that behavior (its closest counterpart, %assign, works in the
+preprocessor).
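+
+GAS can also compute the length without an end label. The sketch below relies
+on the special symbol . (dot), which the GAS manual defines as the current
+address; it is offered as an alternative to the label-based idiom, not as what
+Listing 3 does.
+
+```asm
+        .section .data
+prompt_str:
+        .ascii  "Enter Your Name: "
+        .set    STR_SIZE, . - prompt_str    # . plays the role of NASM's $
+```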
+
+As I mentioned, Listing 3 uses macros (line 21). Different macro techniques
+exist in NASM and GAS, including single-line macros and macro overloading, but
+I only deal with the basic type here. A common use of macros in assembly is
+clarity. Instead of typing the same piece of code again and again, you can
+create reusable macros that both avoid this repetition and enhance the look
+and readability of the code by reducing clutter.
+
+NASM users might be familiar with declaring macros using the %macro directive
+and ending them with an %endmacro directive. A %macro directive is followed by
+the macro name. After the macro name comes a count, the number of macro
+arguments the macro is supposed to have. In NASM, macro arguments are numbered
+sequentially starting with 1. That is, the first argument to a macro is %1,
+the second is %2, the third is %3, and so on. For example:
+
+%macro macroname 2
+    mov eax, %1
+    mov ebx, %2
+%endmacro
+
+This creates a macro with two arguments, the first being %1 and the second
+being %2. Thus, a call to the above macro would look something like this:
+
+macroname 5, 6
+
+Macros can also be created without arguments, in which case the count is 0
+(for example, %macro macroname 0).
+
+Now let’s take a look at how GAS uses macros. GAS provides the .macro and
+.endm directives to create macros. A .macro directive is followed by a macro
+name, which may or may not have arguments. In GAS, macro arguments are given
+by name. For example:
+
+.macro macroname arg1, arg2
+    movl \arg1, %eax
+    movl \arg2, %ebx
+.endm
+
+A backslash precedes the name of each argument of the macro when the name is
+actually used inside a macro. If this is not done, the assembler treats the
+names as ordinary symbols rather than as arguments, and the linker typically
+reports them as undefined.
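+
+Named arguments in GAS can also be given default values. The following sketch
+is an illustration only: the macro name sys_exit and its argument code are
+made up for this example.
+
+```asm
+# A macro with one named argument; 0 is used when the caller passes nothing
+        .macro  sys_exit code=0
+        movl    $1, %eax        # system call number for exit
+        movl    $\code, %ebx    # \code is replaced by the argument text
+        int     $0x80
+        .endm
+
+        .section .text
+        .globl _start
+_start:
+        sys_exit 3              # expands to the three instructions, with $3
+```
+
+Writing sys_exit with no argument would expand with $0 instead.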
+
+Functions, external routines, and the stack
+The example program for this section implements a selection sort on an array
+of integers.
+
+Line | NASM                                         | GAS
+001  | section .data                                | .section .data
+002  |                                              |
+003  | array db 89, 10, 67, 1, 4, 27, 12, 34, 86, 3 | array:
+004  |                                              |     .byte 89, 10, 67, 1, 4, 27, 12, 34, 86, 3
+005  |                                              |
+006  | ARRAY_SIZE equ $ - array                     | array_end:
+007  |                                              |     .equ ARRAY_SIZE, array_end - array
+008  |                                              |
+009  | array_fmt db " %d", 0                        | array_fmt:
+010  |                                              |     .asciz " %d"
+011  |                                              |
+012  | usort_str db "unsorted array:", 0            | usort_str:
+013  |                                              |     .asciz "unsorted array:"
+014  |                                              |
+015  | sort_str db "sorted array:", 0               | sort_str:
+016  |                                              |     .asciz "sorted array:"
+017  |                                              |
+018  | newline db 10, 0                             | newline:
+019  |                                              |     .asciz "\n"
+020  |                                              |
+021  |                                              |
+022  | section .text                                | .section .text
+023  |                                              |
+024  |     extern puts                              |
+025  |                                              |
+026  |     global _start                            |     .globl _start
+027  |                                              |
+028  | _start:                                      | _start:
+029  |                                              |
+030  |     push usort_str                           |     pushl $usort_str
+031  |     call puts                                |     call puts
+032  |     add esp, 4                               |     addl $4, %esp
+033  |                                              |
+034  |     push ARRAY_SIZE                          |     pushl $ARRAY_SIZE
+035  |     push array                               |     pushl $array
+036  |     push array_fmt                           |     pushl $array_fmt
+037  |     call print_array10                       |     call print_array10
+038  |     add esp, 12                              |     addl $12, %esp
+039  |                                              |
+040  |     push ARRAY_SIZE                          |     pushl $ARRAY_SIZE
+041  |     push array                               |     pushl $array
+042  |     call sort_routine20                      |     call sort_routine20
+043  |                                              |
+044  | ; Adjust the stack pointer                   | # Adjust the stack pointer
+045  |     add esp, 8                               |     addl $8, %esp
+046  |                                              |
+047  |     push sort_str                            |     pushl $sort_str
+048  |     call puts                                |     call puts
+049  |     add esp, 4                               |     addl $4, %esp
+050  |                                              |
+051  |     push ARRAY_SIZE                          |     pushl $ARRAY_SIZE
+052  |     push array                               |     pushl $array
+053  |     push array_fmt                           |     pushl $array_fmt
+054  |     call print_array10                       |     call print_array10
+055  |     add esp, 12                              |     addl $12, %esp
+056  |     jmp _exit                                |     jmp _exit
+057  |                                              |
+058  |     extern printf                            |
+059  |                                              |
+060  | print_array10:                               | print_array10:
+061  |     push ebp                                 |     pushl %ebp
+062  |     mov ebp, esp                             |     movl %esp, %ebp
+063  |     sub esp, 4                               |     subl $4, %esp
+064  |     mov edx, [ebp + 8]                       |     movl 8(%ebp), %edx
+065  |     mov ebx, [ebp + 12]                      |     movl 12(%ebp), %ebx
+066  |     mov ecx, [ebp + 16]                      |     movl 16(%ebp), %ecx
+067  |                                              |
+068  |     mov esi, 0                               |     movl $0, %esi
+069  |                                              |
+070  | push_loop:                                   | push_loop:
+071  |     mov [ebp - 4], ecx                       |     movl %ecx, -4(%ebp)
+072  |     mov edx, [ebp + 8]                       |     movl 8(%ebp), %edx
+073  |     xor eax, eax                             |     xorl %eax, %eax
+074  |     mov al, byte [ebx + esi]                 |     movb (%ebx, %esi, 1), %al
+075  |     push eax                                 |     pushl %eax
+076  |     push edx                                 |     pushl %edx
+077  |     call printf                              |     call printf
+078  |     add esp, 8                               |     addl $8, %esp
+079  |     mov ecx, [ebp - 4]                       |     movl -4(%ebp), %ecx
+080  |     inc esi                                  |     incl %esi
+081  |     loop push_loop                           |     loop push_loop
+082  |                                              |
+083  |     push newline                             |     pushl $newline
+084  |     call printf                              |     call printf
+085  |     add esp, 4                               |     addl $4, %esp
+086  |     mov esp, ebp                             |     movl %ebp, %esp
+087  |     pop ebp                                  |     popl %ebp
+088  |     ret                                      |     ret
+089  |                                              |
+090  |                                              |
+091  | sort_routine20:                              | sort_routine20:
+092  |     push ebp                                 |     pushl %ebp
+093  |     mov ebp, esp                             |     movl %esp, %ebp
+094  |                                              |
+095  | ; Allocate a word of space in stack          | # Allocate a word of space in stack
+096  |     sub esp, 4                               |     subl $4, %esp
+097  |                                              |
+098  | ; Get the address of the array               | # Get the address of the array
+099  |     mov ebx, [ebp + 8]                       |     movl 8(%ebp), %ebx
+100  |                                              |
+101  | ; Store array size                           | # Store array size
+102  |     mov ecx, [ebp + 12]                      |     movl 12(%ebp), %ecx
+103  |     dec ecx                                  |     decl %ecx
+104  |                                              |
+105  | ; Prepare for outer loop here                | # Prepare for outer loop here
+106  |     xor esi, esi                             |     xorl %esi, %esi
+107  |                                              |
+108  | outer_loop:                                  | outer_loop:
+109  | ; This stores the min index                  | # This stores the min index
+110  |     mov [ebp - 4], esi                       |     movl %esi, -4(%ebp)
+111  |     mov edi, esi                             |     movl %esi, %edi
+112  |     inc edi                                  |     incl %edi
+113  |                                              |
+114  | inner_loop:                                  | inner_loop:
+115  |     cmp edi, ARRAY_SIZE                      |     cmpl $ARRAY_SIZE, %edi
+116  |     jge swap_vars                            |     jge swap_vars
+117  |     xor al, al                               |     xorb %al, %al
+118  |     mov edx, [ebp - 4]                       |     movl -4(%ebp), %edx
+119  |     mov al, byte [ebx + edx]                 |     movb (%ebx, %edx, 1), %al
+120  |     cmp byte [ebx + edi], al                 |     cmpb %al, (%ebx, %edi, 1)
+121  |     jge check_next                           |     jge check_next
+122  |                                              |
+123  |     mov [ebp - 4], edi                       |     movl %edi, -4(%ebp)
+124  | check_next:                                  | check_next:
+125  |     inc edi                                  |     incl %edi
+126  |     jmp inner_loop                           |     jmp inner_loop
+127  |                                              |
+128  | swap_vars:                                   | swap_vars:
+129  |     mov edi, [ebp - 4]                       |     movl -4(%ebp), %edi
+130  |     mov dl, byte [ebx + edi]                 |     movb (%ebx, %edi, 1), %dl
+131  |     mov al, byte [ebx + esi]                 |     movb (%ebx, %esi, 1), %al
+132  |     mov byte [ebx + esi], dl                 |     movb %dl, (%ebx, %esi, 1)
+133  |     mov byte [ebx + edi], al                 |     movb %al, (%ebx, %edi, 1)
+134  |                                              |
+135  |     inc esi                                  |     incl %esi
+136  |     loop outer_loop                          |     loop outer_loop
+137  |                                              |
+138  |     mov esp, ebp                             |     movl %ebp, %esp
+139  |     pop ebp                                  |     popl %ebp
+140  |     ret                                      |     ret
+141  |                                              |
+142  | _exit:                                       | _exit:
+143  |     mov eax, 1                               |     movl $1, %eax
+144  |     mov ebx, 0                               |     movl $0, %ebx
+145  |     int 80h                                  |     int $0x80
+Listing 4. Implementation of selection sort on an integer array
+
+Listing 4 might look overwhelming at first, but in fact it’s very simple.
+The listing introduces the concept of functions, various memory addressing
+schemes, the stack and the use of a library function. The program sorts an
+array of 10 numbers and uses the external C library functions puts and printf
+to print out the entire contents of the unsorted and sorted array. For
+modularity and to introduce the concept of functions, the sort routine itself
+is implemented as a separate procedure along with the array print routine.
+Let’s deal with them one by one.
+
+After the data declarations, the program execution begins with a call to puts
+(line 31). The puts function displays a string on the console. Its only
+argument is the address of the string to be displayed, which is passed on to
+it by pushing the address of the string in the stack (line 30).
+
+In NASM, any label that is not part of our program and needs to be resolved
+during link time must be declared in advance, which is the function of the
+extern keyword (line 24). GAS has no such requirement, because it treats every
+undefined symbol as external. After this, the
+address of the string usort_str is pushed onto the stack (line 30). In NASM, a
+memory variable such as usort_str represents the address of the memory
+location itself, and thus a call such as push usort_str actually pushes the
+address on top of the stack. In GAS, on the other hand, the variable usort_str
+must be prefixed with $, so that it is treated as an immediate address. If
+it’s not prefixed with $, the actual bytes represented by the memory
+variable are pushed onto the stack instead of the address.
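+
+The difference between the two GAS forms can be seen in a small sketch. The
+symbol name value and the use of the exit code to show the result are choices
+made for this example only.
+
+```asm
+        .section .data
+value:  .long   42
+
+        .section .text
+        .globl _start
+_start:
+        pushl   $value          # with $: pushes the address of value
+        pushl   value           # without $: pushes the 4 bytes stored there
+        popl    %ebx            # ebx = 42, the contents of value
+        addl    $4, %esp        # discard the address pushed first
+        movl    $1, %eax        # system call number for exit
+        int     $0x80           # exit code is 42
+```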
+
+Since each push moves the stack pointer down by a dword, the caller restores
+the stack pointer after puts returns by adding 4 (the size of a dword) to it
+(line 32).
+
+Three arguments are now pushed onto the stack, and the print_array10 function
+is called (line 37). Functions are declared the same way in both NASM and GAS.
+They are nothing but labels, which are invoked using the call instruction.
+
+Immediately after a call instruction, ESP points to the return address, and
+esp + 4 holds the first argument. Once the function has saved the caller’s
+frame pointer (push ebp) and copied ESP into EBP, everything sits one dword
+higher relative to EBP: ebp + 4 holds the return address, and ebp + 8 holds
+the first argument to the function. All subsequent arguments are accessed by
+adding the size of a dword to that offset (that is, ebp + 12, ebp + 16, and so
+on).
+
+Once inside a function, a local stack frame is created by saving ebp and then
+copying esp to ebp (line 62). You can also allocate space for local variables
+as is done in the program (line 63). You do this by subtracting the number of
+bytes required from esp. After sub esp, 4, the address ebp - 4 refers to the 4
+bytes allocated for a local variable, and this can continue as long as there
+is enough space in the stack to accommodate your local variables.
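+
+The frame layout can be tried out in isolation with a short GAS sketch. The
+function max2 and its two constant arguments are hypothetical; the prologue,
+the argument offsets, and the caller’s stack cleanup follow the same pattern
+as Listing 4.
+
+```asm
+        .section .text
+        .globl _start
+
+# max2(a, b): returns the larger of two signed dwords in %eax
+max2:
+        pushl   %ebp            # save the caller's frame pointer
+        movl    %esp, %ebp      # 0(%ebp) = saved ebp, 4(%ebp) = return address
+        movl    8(%ebp), %eax   # first argument (a)
+        cmpl    12(%ebp), %eax  # compare a with the second argument (b)
+        jge     1f              # keep a if a >= b
+        movl    12(%ebp), %eax  # otherwise the result is b
+1:
+        movl    %ebp, %esp      # release any local space
+        popl    %ebp            # restore the caller's frame pointer
+        ret
+
+_start:
+        pushl   $7              # second argument is pushed first
+        pushl   $5              # first argument
+        call    max2
+        addl    $8, %esp        # caller removes both arguments
+        movl    %eax, %ebx      # result (7) becomes the exit code
+        movl    $1, %eax        # system call number for exit
+        int     $0x80
+```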
+
+Listing 4 illustrates the base indirect addressing mode (line 64), so called
+because you start with a base address and add an offset to it to arrive at a
+final address. On the NASM side of the listing, [ebp + 8] is one such example,
+as is [ebp - 4] (line 71). In GAS, the addressing is a bit more terse:
+8(%ebp) and -4(%ebp), respectively.
+
+In the print_array10 routine, you can see another kind of addressing mode
+being used after the push_loop label (line 74). The line is represented in
+NASM and GAS, respectively, like so:
+
+NASM: mov al, byte [ebx + esi]
+
+GAS: movb (%ebx, %esi, 1), %al
+
+This addressing mode is the base indexed addressing mode. Here, there are
+three entities: one is the base address, the second is the index register, and
+the third is the multiplier. A memory address alone does not say how many
+bytes are to be accessed, so the assembler needs another way to determine the
+operand size. NASM uses the byte operator to tell the assembler that a byte of
+data is to be moved (here the 8-bit register al already implies it, so byte is
+optional). In GAS the size comes from the b, w, or l suffix in the mnemonic
+(for example, movb); the multiplier only scales the index register and does
+not affect the operand size. The syntax of GAS can seem somewhat complex when
+first encountered.
+
+The general form of base indexed addressing in GAS is as follows:
+
+%segment:ADDRESS (, index, multiplier)
+
+or
+
+%segment:(offset, index, multiplier)
+
+or
+
+%segment:ADDRESS(base, index, multiplier)
+
+The final address is calculated using this formula:
+
+ADDRESS or offset + base + index * multiplier
+
+The multiplier can be 1, 2, 4, or 8 and normally matches the size of the
+elements being indexed: 1 to step through bytes, 2 for words, and 4 for
+dwords. Of course, NASM uses a simpler syntax. Thus, the above in NASM would
+be represented like so:
+
+[segment:ADDRESS or offset + base + index * multiplier]
+
+A prefix of byte, word, or dword is used before this memory address to access
+1, 2, or 4 bytes of memory, respectively.
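+
+To make the GAS form concrete, here is a small sketch that sums a table of
+four dwords using a base register, an index register, and a multiplier of 4.
+The table contents and label names are invented for the example, and the
+equivalent NASM operand is given in a comment.
+
+```asm
+        .section .data
+table:  .long   10, 20, 30, 40          # four dwords
+
+        .section .text
+        .globl _start
+_start:
+        movl    $table, %ebx            # base: address of the table
+        xorl    %esi, %esi              # index: element number, starting at 0
+        xorl    %eax, %eax              # running sum
+sum_loop:
+        addl    (%ebx, %esi, 4), %eax   # NASM: add eax, [ebx + esi * 4]
+        incl    %esi
+        cmpl    $4, %esi
+        jl      sum_loop                # repeat while esi < 4
+
+        movl    %eax, %ebx              # the sum (100) becomes the exit code
+        movl    $1, %eax                # system call number for exit
+        int     $0x80
+```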
+
+Leftovers
+Line | NASM                                   | GAS
+001  | section .data                          | .section .data
+002  |                                        |
+003  | ; Command table to store at most       | // Command table to store at most
+004  | ; 10 command line arguments            | // 10 command line arguments
+005  | cmd_tbl:                               | cmd_tbl:
+006  |     %rep 10                            |     .rept 10
+007  |         dd 0                           |         .long 0
+008  |     %endrep                            |     .endr
+009  |                                        |
+010  | section .text                          | .section .text
+011  |                                        |
+012  |     global _start                      |     .globl _start
+013  |                                        |
+014  | _start:                                | _start:
+015  | ; Set up the stack frame               | // Set up the stack frame
+016  |     mov ebp, esp                       |     movl %esp, %ebp
+017  |                                        |
+018  | ; Top of stack contains the            | // Top of stack contains the
+019  | ; number of command line arguments.    | // number of command line arguments.
+020  | ; The default value is 1               | // The default value is 1
+021  |     mov ecx, [ebp]                     |     movl (%ebp), %ecx
+022  |                                        |
+023  | ; Exit if arguments are more than 10   | // Exit if arguments are more than 10
+024  |     cmp ecx, 10                        |     cmpl $10, %ecx
+025  |     jg _exit                           |     jg _exit
+026  |                                        |
+027  |     mov esi, 1                         |     movl $1, %esi
+028  |     mov edi, 0                         |     movl $0, %edi
+029  | ; Store the command line arguments     | // Store the command line arguments
+030  | ; in the command table                 | // in the command table
+031  | store_loop:                            | store_loop:
+032  |     mov eax, [ebp + esi * 4]           |     movl (%ebp, %esi, 4), %eax
+033  |     mov [cmd_tbl + edi * 4], eax       |     movl %eax, cmd_tbl( , %edi, 4)
+034  |     inc esi                            |     incl %esi
+035  |     inc edi                            |     incl %edi
+036  |     loop store_loop                    |     loop store_loop
+037  |                                        |
+038  |     mov ecx, edi                       |     movl %edi, %ecx
+039  |     mov esi, 0                         |     movl $0, %esi
+040  |                                        |
+041  |     extern puts                        |
+042  |                                        |
+043  | print_loop:                            | print_loop:
+044  | ; Make some local space                | // Make some local space
+045  |     sub esp, 4                         |     subl $4, %esp
+046  | ; puts function corrupts ecx           | // puts function corrupts ecx
+047  |     mov [ebp - 4], ecx                 |     movl %ecx, -4(%ebp)
+048  |     mov eax, [cmd_tbl + esi * 4]       |     movl cmd_tbl( , %esi, 4), %eax
+049  |     push eax                           |     pushl %eax
+050  |     call puts                          |     call puts
+051  |     add esp, 4                         |     addl $4, %esp
+052  |     mov ecx, [ebp - 4]                 |     movl -4(%ebp), %ecx
+053  |     inc esi                            |     incl %esi
+054  |     loop print_loop                    |     loop print_loop
+055  |                                        |
+056  |     jmp _exit                          |     jmp _exit
+057  |                                        |
+058  | _exit:                                 | _exit:
+059  |     mov eax, 1                         |     movl $1, %eax
+060  |     mov ebx, 0                         |     movl $0, %ebx
+061  |     int 80h                            |     int $0x80
+Listing 5. A program that reads command line arguments, stores them in memory,
+and prints them
+
+Listing 5 shows a construct that repeats instructions in assembly. Naturally
+enough, it’s called the repeat construct. In GAS, the repeat construct is
+started using the .rept directive (line 6). This directive has to be closed
+using an .endr directive (line 8). .rept is followed by a count in GAS that
+specifies the number of times the expression enclosed inside the .rept/.endr
+construct is to be repeated. Any instruction placed inside this construct is
+equivalent to writing that instruction count number of times, each on a
+separate line.
+
+For example, for a count of 3:
+
+.rept 3
+    movl $2, %eax
+.endr
+
+This is equivalent to:
+
+movl $2, %eax
+movl $2, %eax
+movl $2, %eax
+
+In NASM, a similar construct is used at the preprocessor level. It begins with
+the %rep directive and ends with %endrep. The %rep directive is followed by an
+expression (unlike in GAS where the .rept directive is followed by a count):
+
+%rep <expression>
+    nop
+%endrep
+
+There is also an alternative in NASM, the times directive. Unlike %rep, it
+works at the assembler level and repeats a single instruction or data
+definition, but it, too, is followed by an expression. For example, the above
+%rep construct is equivalent to this:
+
+times <expression> nop
+
+And this:
+
+%rep 3
+    mov eax, 2
+%endrep
+
+is equivalent to this:
+
+times 3 mov eax, 2
+
+and both are equivalent to this:
+
+mov eax, 2
+mov eax, 2
+mov eax, 2
+
+In Listing 5, the .rept (or %rep) directive is used to create a memory data
+area for 10 double words. The command line arguments are then accessed one by
+one from the stack and stored in the memory area until the command table gets
+full.
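+
+The repeat construct becomes more useful when each repetition emits a
+different value. The following GAS sketch is not taken from Listing 5: it
+combines .rept with a symbol that is reassigned through .set on every pass,
+and the names n and squares are hypothetical.
+
+```asm
+        .section .data
+        .set    n, 0            # assembly-time counter
+squares:
+        .rept   4
+        .long   n * n           # emits 0, 1, 4, 9 on successive passes
+        .set    n, n + 1        # .set may give the symbol a new value
+        .endr
+```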
+
+As for command line arguments, they are accessed similarly with both
+assemblers. When the program starts, the dword at the top of the stack ([esp])
+holds the number of command line arguments supplied to the program, which is 1
+by default (for no command line arguments). esp + 4 holds a pointer to the
+first argument, which is always the name of the program as it was invoked from
+the command line. esp + 8, esp + 12, and so on hold pointers to the subsequent
+command line arguments.
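+
+A quick way to observe this layout is a program that returns the argument
+count as its exit code. This is a minimal GAS sketch for 32-bit Linux, written
+for this purpose rather than taken from Listing 5.
+
+```asm
+        .section .text
+        .globl _start
+_start:
+        movl    (%esp), %ebx    # argument count, including the program name
+        movl    4(%esp), %ecx   # pointer to the program name (unused here)
+        movl    $1, %eax        # system call number for exit
+        int     $0x80           # exit code = argument count
+```
+
+Running the program with two extra arguments and then typing echo $? should
+print 3.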
+
+Also watch the way the memory command table is being accessed on both sides in
+Listing 5. Here, memory indirect addressing mode (line 33) is used to access
+the command table along with an offset in ESI (and EDI) and a multiplier.
+Thus, [cmd_tbl + esi * 4] in NASM is equal to cmd_tbl(, %esi, 4) in GAS.
+
+Conclusion
+Even though the differences between these two assemblers are substantial,
+it’s not that difficult to convert from one form to another. You might find
+that the AT&T syntax seems at first difficult to understand, but once
+mastered, it’s as simple as the Intel syntax.
+