stdarg.h
tags: stdarg.h
The stdarg.h header provides the functionality for marching through each
argument of a function when you don't know how many arguments there will be
until run time. When you see a function with an ellipsis as the last
"parameter", that function is probably using stdarg.h to process each argument.
The printf function is just one example.
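As a quick illustration of what such a function looks like, here is a minimal sketch. The name sum_ints and the convention of passing the count as the first argument are assumptions made for this example; printf learns the count from its format string instead.

```c
#include <assert.h>
#include <stdarg.h>

/* Adds up `count` int arguments that follow the named parameter.
 * The caller supplies `count` because the callee has no other way
 * to learn how many arguments were passed. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;
    int i;

    assert(count >= 0);
    va_start(ap, count);            /* begin after the last named parameter */
    for (i = 0; i < count; i++)
        total += va_arg(ap, int);   /* fetch the next argument as an int */
    va_end(ap);                     /* finish before returning */
    return total;
}
```

A call such as sum_ints(3, 1, 2, 3) walks the three trailing arguments and adds them together.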
Stack Frames
First, we need to talk about stack frames and how they are constructed. If
you've never heard of a stack frame then I recommend reading the Wikipedia
article on call frames. This explanation is specific to the x86-64
architecture so keep that in mind as this may not be the way a different
architecture handles things. A stack frame refers to the portion of stack memory
that pertains to a specific function. The stack frame will contain function
arguments (if they weren't passed in registers), the return address, the saved
value of the previous base pointer, and then any local variables for the function. The
base pointer is used to access both the arguments that were passed on the stack
as well as the local variables for the function. There is a specific register,
rbp, which will contain the current base pointer. Modifying this register
without saving it could have serious consequences because you no longer have an
easy reference point for accessing the contents of your stack frame. For a
visual representation of this, the Wikipedia page I mentioned previously
has a good picture or you can look at page 16 of the x86_64 ABI. Keep in
mind that these pictures will seem to contradict each other, but this is only
because the Wiki page shows the stack with low memory at the top while the ABI
shows the stack with high memory at the top.
va_list
stdarg.h specifies a single type for holding the meta-data necessary for
advancing through the argument list. Because this type is architecture specific,
the Standard doesn't give any more information aside from it being "a type
suitable for holding the information needed by the macros va_start, va_arg, and
va_end." We can look to the x86_64 ABI for an answer though.
Each specific architecture will have its own way of performing argument passing.
For example, the i386 architecture pushes every argument onto the stack before
calling a function. Unfortunately, x86_64 is not so simple. The ABI describes
in detail the method through which parameters should be passed and even provides
the following definition for the va_list type:
typedef struct {
    unsigned int gp_offset;
    unsigned int fp_offset;
    void *overflow_arg_area;
    void *reg_save_area;
} va_list[1];
Due to the way that x86_64 performs parameter passing, it is necessary to keep
track of a few things so that we can access different types of parameters since
they are stored in different places. The first of these is reg_save_area, which
is a pointer to the register save area. The register save area is a location on
the stack which holds copies of arguments that were passed in registers. The
reg_save_area pointer holds the location of the first register copy (which is
always a copy of the rdi register). There is a limit to the amount of
information that can be passed through registers and the excess arguments are
passed directly on the stack. The overflow_arg_area member is a pointer to the
first of these excess arguments.
Now we have pointers to each group of arguments on the stack but we
need to keep track of which argument is next. The gp_offset member is an
offset from the beginning of the reg_save_area that points to the next argument
that was passed in the general purpose registers. The next argument could be
retrieved by accessing what is on the stack at reg_save_area + gp_offset.
However, what happens when we exhaust the arguments that were passed in
registers? We set gp_offset to 48, which indicates that "register arguments
have been exhausted", and then we would retrieve the next argument by accessing
what is on the stack at overflow_arg_area. The value 48 comes from the
number of general purpose registers used for argument passing: rdi, rsi, rdx,
rcx, r8, and r9 give us six 8-byte registers totaling 48 bytes on
the stack for the copies of each register. Therefore, when gp_offset is 48
or greater, the location being accessed is no longer valid. overflow_arg_area
works differently than reg_save_area because it should always point to the
next argument to be retrieved; it needs to be updated every time an argument is
retrieved. Lastly, there is the fp_offset member which works in the same way
as gp_offset except it is the offset to the next argument passed in a
floating-point register. This offset will have a value of 48 to 304, where
the 304 indicates that all floating-point arguments passed in registers have
been exhausted and the next one should be retrieved by using
overflow_arg_area.
That is a little bit complicated but overall it's not too bad. The issue comes
in with passing structures as arguments because a structure, on x86_64, may be
passed partially through registers and partially on the stack. This results in
a lot of extra logic in order to properly rebuild the entire structure from
different parts of the stack. We'll talk more about this later.
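The offset arithmetic described in this section can be sketched in a few lines of C. The macro and function names below are invented for this illustration and are not part of the ABI. The sizes are the ones quoted in the text: six 8-byte general purpose copies, followed by 16-byte floating-point copies starting at offset 48.

```c
#include <assert.h>

/* Sizes as described in the text; the names are hypothetical. */
#define GP_SLOT_SIZE  8u
#define GP_SLOT_COUNT 6u                              /* rdi, rsi, rdx, rcx, r8, r9 */
#define GP_AREA_SIZE  (GP_SLOT_SIZE * GP_SLOT_COUNT)  /* 48 */
#define FP_SLOT_SIZE  16u

/* Offset of the copy of the index-th general purpose argument register. */
unsigned int gp_slot_offset(unsigned int index)
{
    assert(index < GP_SLOT_COUNT);
    return index * GP_SLOT_SIZE;
}

/* Offset of the copy of the index-th floating-point argument register;
 * these copies start right after the general purpose copies. */
unsigned int fp_slot_offset(unsigned int index)
{
    return GP_AREA_SIZE + index * FP_SLOT_SIZE;
}

/* Nonzero once gp_offset says the general purpose copies are used up. */
int gp_exhausted(unsigned int gp_offset)
{
    return gp_offset >= GP_AREA_SIZE;
}
```

With these definitions, gp_slot_offset(5) is 40 (the copy of r9), fp_slot_offset(0) is 48, and gp_exhausted(48) is true.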
va_start
The va_start macro is used to initialize a va_list structure so that it can
be used to retrieve arguments. You must call va_start before accessing any
arguments; otherwise you won't know where they are. The x86_64 ABI gives us the
information we need in order to initialize the va_list structure. This macro
takes a va_list structure as well as the identifier of the rightmost parameter
in the variable parameter list (i.e. the parameter directly before the ellipsis).
First, we need to figure out where the register save area is. Looking at figure
3.33 in the ABI, we see that the register save area has a total size of 304
bytes which we calculate using the last register copy offset, 288, plus the
size of the last register copy, 16 bytes for a floating-point register. This
value should also be familiar from fp_offset because when it is set to 304, we
know that we can't retrieve floating-point arguments from the register save area
anymore. The compiler takes care of building the register save area for us, so
we can be sure that 304 bytes of space will be taken up on the stack directly in
between the previous rbp value and the local function variables. Since we know
that the base pointer register, rbp, holds the stack location immediately after
our register save area, we can subtract the size of the register save area to
get the location we need.
reg_save_area \(= rbp - 304\)
Second, we need the location of the first argument passed on the stack, which
should be in the stack location which follows the return address of the current
stack frame (which is directly after the saved rbp). This location will be 16
bytes after the saved rbp:
overflow_arg_area \(= rbp + 16\)
Figure 3.33 from the ABI shows us the offsets to the first general purpose
register as well as the first floating-point register, 0 and 48, respectively.
gp_offset \(= 0\)
fp_offset \(= 48\)
That's all we need for va_start to initialize the va_list structure.
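To see these initial values on a real compiler, the sketch below copies a freshly started va_list into a look-alike structure. It assumes GCC or Clang targeting the System V x86_64 ABI, where va_list has the layout given earlier. The names va_list_mirror and snapshot_after_start are made up for this example, and peeking inside a va_list like this is not portable. Note that one named parameter already occupies rdi, so gp_offset is expected to start at 8 here rather than 0.

```c
#include <assert.h>
#include <stdarg.h>
#include <string.h>

/* Hypothetical mirror of the ABI's va_list element, used only for peeking. */
struct va_list_mirror {
    unsigned int gp_offset;
    unsigned int fp_offset;
    void *overflow_arg_area;
    void *reg_save_area;
};

/* Copies the state of a freshly started va_list into *out.
 * Returns 1 if the System V x86_64 layout was assumed and copied,
 * or 0 on any other target, where *out is left untouched. */
int snapshot_after_start(struct va_list_mirror *out, ...)
{
#if defined(__x86_64__) && defined(__LP64__) && !defined(_WIN32)
    va_list ap;

    assert(out != NULL);
    assert(sizeof(va_list) == sizeof *out);
    va_start(ap, out);
    memcpy(out, ap, sizeof *out);   /* ap decays to a pointer to its one element */
    va_end(ap);
    return 1;
#else
    (void)out;                      /* layout unknown here, so do nothing */
    return 0;
#endif
}
```

On a matching target, the copied fp_offset should be 48 no matter what is passed after the named parameter, since no named parameter uses a floating-point register.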
va_arg
The va_arg macro expands to an expression that has the type and value of the
next argument passed to a function. You must provide this macro with the
va_list struct initialized from a call to va_start and the type of the
argument that you want to be retrieved. va_arg will update the va_list
structure appropriately so that a subsequent call to va_arg will work as
expected.
Because of the default argument promotions, char and short arguments arrive as
int and float arguments arrive as double, so int and double are the
representative primitive cases for va_arg (pointers and wider integer types
travel through the general purpose registers in the same way as int). For an
int type we can just use reg_save_area + gp_offset to get the next int argument,
or if gp_offset is greater than or equal to 48 we would read from
overflow_arg_area. Likewise, for double arguments we would use
reg_save_area + fp_offset or overflow_arg_area if fp_offset was 304 or
greater.
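The int and double cases can be simulated without touching a real va_list. The sketch below works on a hand-built stand-in structure and plain byte buffers, so it runs anywhere. The names arg_cursor, next_int, and next_double are hypothetical, the limits 48 and 304 are the ones used in this article, and advancing the overflow area by 8 bytes per argument is an assumption that only holds for these two types.

```c
#include <assert.h>
#include <string.h>

#define GP_LIMIT 48u    /* per the text: general purpose copies end here */
#define FP_LIMIT 304u   /* per the text: floating-point copies end here */

/* Hypothetical stand-in for the ABI's va_list element. */
struct arg_cursor {
    unsigned int gp_offset;
    unsigned int fp_offset;
    void *overflow_arg_area;
    void *reg_save_area;
};

/* Reads `size` bytes from the overflow area, then steps past one 8-byte slot. */
static void take_overflow(struct arg_cursor *ap, void *dst, size_t size)
{
    assert(size <= 8);
    memcpy(dst, ap->overflow_arg_area, size);
    ap->overflow_arg_area = (char *)ap->overflow_arg_area + 8;
}

/* Next int: from the register copies while any remain, else from overflow. */
int next_int(struct arg_cursor *ap)
{
    int value;

    if (ap->gp_offset < GP_LIMIT) {
        memcpy(&value, (char *)ap->reg_save_area + ap->gp_offset, sizeof value);
        ap->gp_offset += 8;         /* each general purpose copy is 8 bytes */
    } else {
        take_overflow(ap, &value, sizeof value);
    }
    return value;
}

/* Next double: same idea, using fp_offset and 16-byte register copies. */
double next_double(struct arg_cursor *ap)
{
    double value;

    if (ap->fp_offset < FP_LIMIT) {
        memcpy(&value, (char *)ap->reg_save_area + ap->fp_offset, sizeof value);
        ap->fp_offset += 16;        /* each floating-point copy is 16 bytes */
    } else {
        take_overflow(ap, &value, sizeof value);
    }
    return value;
}
```

The important design point is that the two offsets only ever grow, while overflow_arg_area is a moving pointer that is bumped after every read.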
For structures, things get pretty complicated since parts of the structure may
be passed in general purpose registers, some parts may be passed in
floating-point registers, and some parts may be passed directly on the stack. A
structure can contain multiple members, each of varying types, and this is what
makes it difficult to return a structure with va_arg
on this architecture.
There isn't a good way to dynamically determine the type of the next member
within a structure from the standpoint of a C library, so we need to cheat a
little bit for this macro.
The ABI even hints at this and states that "The va_arg
macro is usually
implemented as a compiler builtin and expanded in simplified forms for each
particular type." gcc provides this builtin in the form of the macro
__builtin_va_arg
and we'll use it like so:
#define va_arg(ap, type) __builtin_va_arg((ap), type)
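From the caller's side, the benefit of leaving this work to the compiler is that a structure comes back from va_arg in one piece. The following sketch uses a made-up structure, sample, with one integer member and one floating-point member, which is the kind of mix described above. The function name total_scaled is also invented for this example.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical structure mixing an integer and a floating-point member. */
struct sample {
    long count;
    double scale;
};

/* Sums count * scale over `n` struct sample arguments passed by value. */
double total_scaled(int n, ...)
{
    va_list ap;
    double total = 0.0;
    int i;

    assert(n >= 0);
    va_start(ap, n);
    for (i = 0; i < n; i++) {
        /* The compiler works out where each member was passed. */
        struct sample s = va_arg(ap, struct sample);
        total += (double)s.count * s.scale;
    }
    va_end(ap);
    return total;
}
```

Nothing in the function body needs to know which registers or stack slots held the members of each structure.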
va_end
The last macro, va_end, must ensure that a normal return will happen from a
function which called va_start. For the x86_64 architecture this macro doesn't
actually need to do anything since va_start and va_arg only modify the va_list
structure and not the stack or registers themselves. However, I would choose to
zero out each member of the va_list struct just to ensure that va_start would
need to be called before using va_arg again.
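The pairing of va_start and va_end matters most when a function wants to walk its arguments more than once. Here is a minimal sketch with an invented function, count_above_mean, that makes two passes and calls va_start again only after va_end has closed the first pass.

```c
#include <assert.h>
#include <stdarg.h>

/* Counts how many of the `count` int arguments are larger than their mean. */
int count_above_mean(int count, ...)
{
    va_list ap;
    long sum = 0;
    int above = 0;
    int i;

    assert(count >= 0);

    va_start(ap, count);                /* first pass: add everything up */
    for (i = 0; i < count; i++)
        sum += va_arg(ap, int);
    va_end(ap);                         /* the first traversal ends here */

    va_start(ap, count);                /* second pass needs a fresh va_start */
    for (i = 0; i < count; i++) {
        long value = va_arg(ap, int);
        if (value * count > sum)        /* value > mean, written without division */
            above++;
    }
    va_end(ap);
    return above;
}
```

For the arguments 1, 2, 3, 10 the sum is 16, so only 10 lies above the mean of 4.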
Cheating
Like I mentioned in the va_arg section, we needed to cheat a little bit for
one of the macros. Since we are going to cheat on one of the macros, we also
need to cheat on the rest of them to ensure that everything works together
properly. gcc provides builtin versions of the va_list type as well as all
three macros, so our implementation will look like so:
#ifndef _STDARG_H
#define _STDARG_H

/* Defer to the compiler's builtins for the type and all three macros. */
typedef __builtin_va_list va_list;

#define va_start(ap, parmN) __builtin_va_start((ap), (parmN))
#define va_arg(ap, type)    __builtin_va_arg((ap), type)
#define va_end(ap)          __builtin_va_end((ap))

#endif /* _STDARG_H */
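A short sketch that exercises all four names from the header may help as a sanity check. It is written against whatever stdarg.h the compiler finds, which is assumed to behave the same as the header above. The functions vsum_longs and sum_longs are invented for this example and follow the same split as vprintf and printf: one function takes a va_list, the other takes the ellipsis and forwards to it.

```c
#include <assert.h>
#include <stdarg.h>

/* Takes an already started va_list, the way vprintf does. */
long vsum_longs(int count, va_list ap)
{
    long total = 0;
    int i;

    for (i = 0; i < count; i++)
        total += va_arg(ap, long);
    return total;
}

/* Takes the ellipsis, starts the list, and forwards it. */
long sum_longs(int count, ...)
{
    va_list ap;
    long total;

    assert(count >= 0);
    va_start(ap, count);
    total = vsum_longs(count, ap);
    va_end(ap);                     /* the function that started the list ends it */
    return total;
}
```

Callers must pass arguments that really are long, for example sum_longs(3, 1L, 2L, 3L), because va_arg trusts the type it is given.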
Freestanding Environment
Now that we've completed the stdarg.h header we have all the headers necessary
for a freestanding environment. A freestanding environment is one in which a C
program can run without an underlying operating system. The freestanding
execution environment requires all of the architecture specific header files:
float.h, limits.h, stdarg.h, and stddef.h. This means that the rest of
the header files (and any backing .c files) can be written in an architecture
agnostic way and shouldn't require any assembly.