• 2011-06-10

    APUE: Memory Layout of a C Program


    Historically, a C program has been composed of the following pieces:

    • Text segment, the machine instructions that the CPU executes. Usually, the text segment is sharable so that only a single copy needs to be in memory for frequently executed programs, such as text editors, the C compiler, the shells, and so on. Also, the text segment is often read-only, to prevent a program from accidentally modifying its instructions.
    • Initialized data segment, usually called simply the data segment, containing variables that are specifically initialized in the program. For example, the C declaration
          int   maxcount = 99;
      

      appearing outside any function causes this variable to be stored in the initialized data segment with its initial value.

    • Uninitialized data segment, often called the “bss” segment, named after an ancient assembler operator that stood for “block started by symbol.” Data in this segment is initialized by the kernel to arithmetic 0 or null pointers before the program starts executing. The C declaration
          long  sum[1000];
      

      appearing outside any function causes this variable to be stored in the uninitialized data segment.

    • Stack, where automatic variables are stored, along with information that is saved each time a function is called. Each time a function is called, the address of where to return to and certain information about the caller’s environment, such as some of the machine registers, are saved on the stack. The newly called function then allocates room on the stack for its automatic and temporary variables. This is how recursive functions in C can work. Each time a recursive function calls itself, a new stack frame is used, so one set of variables doesn’t interfere with the variables from another instance of the function.
    • Heap, where dynamic memory allocation usually takes place. Historically, the heap has been located between the uninitialized data and the stack.

    Figure 7.6 shows the typical arrangement of these segments. This is a logical picture of how a program looks; there is no requirement that a given implementation arrange its memory in this fashion. Nevertheless, this gives us a typical arrangement to describe. With Linux on a 32-bit Intel x86 processor, the text segment starts at location 0x08048000, and the bottom of the stack starts just below 0xC0000000. (The stack grows from higher-numbered addresses to lower-numbered addresses on this particular architecture.) The unused virtual address space between the top of the heap and the top of the stack is large.

    Figure 7.6. Typical memory arrangement

    Several more segment types exist in an a.out, containing the symbol table, debugging information, linkage tables for dynamic shared libraries, and the like. These additional sections don’t get loaded as part of the program’s image executed by a process.

    Note from Figure 7.6 that the contents of the uninitialized data segment are not stored in the program file on disk. This is because the kernel sets it to 0 before the program starts running. The only portions of the program that need to be saved in the program file are the text segment and the initialized data.

    The size(1) command reports the sizes (in bytes) of the text, data, and bss segments. For example:

        $ size /usr/bin/cc /bin/sh
           text     data   bss     dec     hex   filename
          79606     1536   916   82058   1408a   /usr/bin/cc
         619234    21120 18260  658614   a0cb6   /bin/sh
    

    The fourth and fifth columns are the total of the three sizes, displayed in decimal and hexadecimal, respectively.

    ——

    All text copied from Advanced Programming in the UNIX Environment.

    Posted by ideawu at 2011-06-10 20:10:55