- \input texinfo
- @c Copyright @copyright{} 2022 Richard Stallman and Free Software Foundation, Inc.
- (The work of Trevis Rothwell and Nelson Beebe has been assigned or
- licensed to the FSF.)
- @c move alignment later?
- @setfilename ./c
- @settitle GNU C Language Manual
- @documentencoding UTF-8
- @smallbook
- @synindex vr fn
- @copying
- Copyright @copyright{} 2022 Richard Stallman and Free Software Foundation, Inc.
- (The work of Trevis Rothwell and Nelson Beebe has been assigned or
- licensed to the FSF.)
- @quotation
- Permission is granted to copy, distribute and/or modify this document
- under the terms of the GNU Free Documentation License, Version 1.3 or
- any later version published by the Free Software Foundation; with the
- Invariant Sections being ``GNU General Public License,'' with the
- Front-Cover Texts being ``A GNU Manual,'' and with the Back-Cover
- Texts as in (a) below. A copy of the license is included in the
- section entitled ``GNU Free Documentation License.''
- (a) The FSF's Back-Cover Text is: ``You have the freedom to copy and
- modify this GNU manual. Buying copies from the FSF supports it in
- developing GNU and promoting software freedom.''
- @end quotation
- @end copying
- @dircategory Programming
- @direntry
- * C: (c). GNU C Language Intro and Reference Manual
- @end direntry
- @titlepage
- @sp 6
- @center @titlefont{GNU C}
- @center @titlefont{Language Intro}
- @center @titlefont{and}
- @center @titlefont{Reference Manual}
- @sp 4
- @c @center @value{EDITION} Edition
- @sp 5
- @center Richard Stallman
- @center and
- @center Trevis Rothwell
- @center plus Nelson Beebe
- @center on floating point
- @page
- @vskip 0pt plus 1filll
- @insertcopying
- @sp 2
- WILL BE Published by the Free Software Foundation @*
- 51 Franklin Street, Fifth Floor @*
- Boston, MA 02110-1301 USA @*
- ISBN ?-??????-??-?
- @ignore
- @sp 1
- Cover art by J. Random Artist
- @end ignore
- @end titlepage
- @summarycontents
- @contents
- @node Top
- @ifnottex
- @top GNU C Manual
- @end ifnottex
- @iftex
- @top Preface
- @end iftex
- This manual explains the C language for use with the GNU Compiler
- Collection (GCC) on the GNU/Linux system and other systems. We refer
- to this dialect as GNU C. If you already know C, you can use this as
- a reference manual.
- If you understand basic concepts of programming but know nothing about
- C, you can read this manual sequentially from the beginning to learn
- the C language.
- If you are a beginner to programming, we recommend you first learn a
- language with automatic garbage collection and no explicit pointers,
- rather than starting with C@. Good choices include Lisp, Scheme,
- Python and Java. C's explicit pointers mean that programmers must be
- careful to avoid certain kinds of errors.
- C is a venerable language; it was first used in 1973. The GNU C
- Compiler, which was subsequently extended into the GNU Compiler
- Collection, was first released in 1987. Other important languages
- were designed based on C, so knowing C gives you a useful base
- for learning C@t{++}, C#, Java, Scala, D, Go, and more.
- The special advantage of C is that it is fairly simple while allowing
- close access to the computer's hardware, which previously required
- writing in assembler language to describe the individual machine
- instructions. Some have called C a ``high-level assembler language''
- because of its explicit pointers and lack of automatic management of
- storage. As one wag put it, ``C combines the power of assembler
- language with the convenience of assembler language.'' However, C is
- far more portable, and much easier to read and write, than assembler
- language.
- This manual focuses on the GNU C language supported by the GNU
- Compiler Collection, version ???. When a construct may be absent or
- work differently in other C compilers, we say so. When it is not part
- of ISO standard C, we say it is a ``GNU C extension,'' because it is
- useful to know that; however, other dialects and standards are not the
- focus of this manual. We keep those notes short, unless it is vital
- to say more. For the same reason, we hardly mention C@t{++} or other
- languages that the GNU Compiler Collection supports.
- Some aspects of the meaning of C programs depend on the target
- platform: which computer, and which operating system, the compiled
- code will run on. Where this is the case, we say so.
- The C language provides no built-in facilities for performing such
- common operations as input/output, memory management, string
- manipulation, and the like. Instead, these facilities are defined in
- a standard library, which is automatically available in every C
- program. @xref{Top, The GNU C Library, , libc, The GNU C Library
- Reference Manual}.
- This manual incorporates the former GNU C Preprocessor Manual, which
- was among the earliest GNU Manuals. It also uses some text from the
- earlier GNU C Manual that was written by Trevis Rothwell and James
- Youngman.
- GNU C has many obscure features, each one either for historical
- compatibility or meant for very special situations. We have left them
- to a companion manual, the GNU C Obscurities Manual, which will be
- published digitally later.
- @menu
- * The First Example:: Getting started with basic C code.
- * Complete Program:: A whole example program
- that can be compiled and run.
- * Storage:: Basic layout of storage; bytes.
- * Beyond Integers:: Exploring different numeric types.
- * Lexical Syntax:: The various lexical components of C programs.
- * Arithmetic:: Numeric computations.
- * Assignment Expressions:: Storing values in variables.
- * Execution Control Expressions:: Expressions combining values in various ways.
- * Binary Operator Grammar:: An overview of operator precedence.
- * Order of Execution:: The order of program execution.
- * Primitive Types:: More details about primitive data types.
- * Constants:: Explicit constant values:
- details and examples.
- * Type Size:: The memory space occupied by a type.
- * Pointers:: Creating and manipulating memory pointers.
- * Structures:: Compound data types built
- by grouping other types.
- * Arrays:: Creating and manipulating arrays.
- * Enumeration Types:: Sets of integers with named values.
- * Defining Typedef Names:: Using @code{typedef} to define type names.
- * Statements:: Controlling program flow.
- * Variables:: Details about declaring, initializing,
- and using variables.
- * Type Qualifiers:: Mark variables for certain intended uses.
- * Functions:: Declaring, defining, and calling functions.
- * Compatible Types:: How to tell if two types are compatible
- with each other.
- * Type Conversions:: Converting between types.
- * Scope:: Different categories of identifier scope.
- * Preprocessing:: Using the GNU C preprocessor.
- * Integers in Depth:: How integer numbers are represented.
- * Floating Point in Depth:: How floating-point numbers are represented.
- * Compilation:: How to compile multi-file programs.
- * Directing Compilation:: Operations that affect compilation
- but don't change the program.
- Appendices
- * Type Alignment:: Where in memory a type can validly start.
- * Aliasing:: Accessing the same data in two types.
- * Digraphs:: Two-character aliases for some characters.
- * Attributes:: Specifying additional information
- in a declaration.
- * Signals:: Fatal errors triggered in various scenarios.
- * GNU Free Documentation License:: The license for this manual.
- * Symbol Index:: Keyword and symbol index.
- * Concept Index:: Detailed topical index.
- @detailmenu
- --- The Detailed Node Listing ---
- * Recursive Fibonacci:: Writing a simple function recursively.
- * Stack:: Each function call uses space in the stack.
- * Iterative Fibonacci:: Writing the same function iteratively.
- * Complete Example:: Turn the simple function into a full program.
- * Complete Explanation:: Explanation of each part of the example.
- * Complete Line-by-Line:: Explaining each line of the example.
- * Compile Example:: Using GCC to compile the example.
- * Float Example:: A function that uses floating-point numbers.
- * Array Example:: A function that works with arrays.
- * Array Example Call:: How to call that function.
- * Array Example Variations:: Different ways to write the call example.
- Lexical Syntax
- * English:: Write programs in English!
- * Characters:: The characters allowed in C programs.
- * Whitespace:: The particulars of whitespace characters.
- * Comments:: How to include comments in C code.
- * Identifiers:: How to form identifiers (names).
- * Operators/Punctuation:: Characters used as operators or punctuation.
- * Line Continuation:: Splitting one line into multiple lines.
- * Digraphs:: Two-character substitutes for some characters.
- Arithmetic
- * Basic Arithmetic:: Addition, subtraction, multiplication,
- and division.
- * Integer Arithmetic:: How C performs arithmetic with integer values.
- * Integer Overflow:: When an integer value exceeds the range
- of its type.
- * Mixed Mode:: Calculating with both integer values
- and floating-point values.
- * Division and Remainder:: How integer division works.
- * Numeric Comparisons:: Comparing numeric values for
- equality or order.
- * Shift Operations:: Shift integer bits left or right.
- * Bitwise Operations:: Bitwise conjunction, disjunction, negation.
- Assignment Expressions
- * Simple Assignment:: The basics of storing a value.
- * Lvalues:: Expressions into which a value can be stored.
- * Modifying Assignment:: Shorthand for changing an lvalue's contents.
- * Increment/Decrement:: Shorthand for incrementing and decrementing
- an lvalue's contents.
- * Postincrement/Postdecrement:: Accessing then incrementing or decrementing.
- * Assignment in Subexpressions:: How to avoid ambiguity.
- * Write Assignments Separately:: Write assignments as separate statements.
- Execution Control Expressions
- * Logical Operators:: Logical conjunction, disjunction, negation.
- * Logicals and Comparison:: Logical operators with comparison operators.
- * Logicals and Assignments:: Assignments with logical operators.
- * Conditional Expression:: An if/else construct inside expressions.
- * Comma Operator:: Build a sequence of subexpressions.
- Order of Execution
- * Reordering of Operands:: Operations in C are not necessarily computed
- in the order they are written.
- * Associativity and Ordering:: Some associative operations are performed
- in a particular order; others are not.
- * Sequence Points:: Some guarantees about the order of operations.
- * Postincrement and Ordering:: Ambiguous execution order with postincrement.
- * Ordering of Operands:: Evaluation order of operands
- and function arguments.
- * Optimization and Ordering:: Compiler optimizations can reorder operations
- only if doing so has no impact on program results.
- Primitive Data Types
- * Integer Types:: Description of integer types.
- * Floating-Point Data Types:: Description of floating-point types.
- * Complex Data Types:: Description of complex number types.
- * The Void Type:: A type indicating no value at all.
- * Other Data Types:: A brief summary of other types.
- Constants
- * Integer Constants:: Literal integer values.
- * Integer Const Type:: Types of literal integer values.
- * Floating Constants:: Literal floating-point values.
- * Imaginary Constants:: Literal imaginary number values.
- * Invalid Numbers:: Avoiding preprocessing number misconceptions.
- * Character Constants:: Literal character values.
- * Unicode Character Codes:: Unicode characters represented
- in either UTF-16 or UTF-32.
- * Wide Character Constants:: Literal character values larger than 8 bits.
- * String Constants:: Literal string values.
- * UTF-8 String Constants:: Literal UTF-8 string values.
- * Wide String Constants:: Literal string values made up of
- 16- or 32-bit characters.
- Pointers
- * Address of Data:: Using the ``address-of'' operator.
- * Pointer Types:: For each type, there is a pointer type.
- * Pointer Declarations:: Declaring variables with pointer types.
- * Pointer Type Designators:: Designators for pointer types.
- * Pointer Dereference:: Accessing what a pointer points at.
- * Null Pointers:: Pointers which do not point to any object.
- * Invalid Dereference:: Dereferencing null or invalid pointers.
- * Void Pointers:: Totally generic pointers, can cast to any.
- * Pointer Comparison:: Comparing memory address values.
- * Pointer Arithmetic:: Computing memory address values.
- * Pointers and Arrays:: Using pointer syntax instead of array syntax.
- * Pointer Arithmetic Low Level:: More about computing memory address values.
- * Pointer Increment/Decrement:: Incrementing and decrementing pointers.
- * Pointer Arithmetic Drawbacks:: A common pointer bug to watch out for.
- * Pointer-Integer Conversion:: Converting pointer types to integer types.
- * Printing Pointers:: Using @code{printf} for a pointer's value.
- Structures
- * Referencing Fields:: Accessing field values in a structure object.
- * Dynamic Memory Allocation:: Allocating space for objects
- while the program is running.
- * Field Offset:: Memory layout of fields within a structure.
- * Structure Layout:: Planning the memory layout of fields.
- * Packed Structures:: Packing structure fields as close as possible.
- * Bit Fields:: Dividing integer fields
- into fields with fewer bits.
- * Bit Field Packing:: How bit fields pack together in integers.
- * const Fields:: Making structure fields immutable.
- * Zero Length:: Zero-length array as a variable-length object.
- * Flexible Array Fields:: Another approach to variable-length objects.
- * Overlaying Structures:: Casting one structure type
- over an object of another structure type.
- * Structure Assignment:: Assigning values to structure objects.
- * Unions:: Viewing the same object in different types.
- * Packing With Unions:: Using a union type to pack various types into
- the same memory space.
- * Cast to Union:: Casting a value of one of the union's alternative
- types to the type of the union itself.
- * Structure Constructors:: Building new structure objects.
- * Unnamed Types as Fields:: Fields' types do not always need names.
- * Incomplete Types:: Types which have not been fully defined.
- * Intertwined Incomplete Types:: Defining mutually-recursive structure types.
- * Type Tags:: Scope of structure and union type tags.
- Arrays
- * Accessing Array Elements:: How to access individual elements of an array.
- * Declaring an Array:: How to name and reserve space for a new array.
- * Strings:: A string in C is a special case of array.
- * Incomplete Array Types:: Naming, but not allocating, a new array.
- * Limitations of C Arrays:: Arrays are not first-class objects.
- * Multidimensional Arrays:: Arrays of arrays.
- * Constructing Array Values:: Assigning values to an entire array at once.
- * Arrays of Variable Length:: Declaring arrays of non-constant size.
- Statements
- * Expression Statement:: Evaluate an expression, as a statement,
- usually done for a side effect.
- * if Statement:: Basic conditional execution.
- * if-else Statement:: Multiple branches for conditional execution.
- * Blocks:: Grouping multiple statements together.
- * return Statement:: Return a value from a function.
- * Loop Statements:: Repeatedly executing a statement or block.
- * switch Statement:: Multi-way conditional choices.
- * switch Example:: A plausible example of using @code{switch}.
- * Duffs Device:: A special way to use @code{switch}.
- * Case Ranges:: Ranges of values for @code{switch} cases.
- * Null Statement:: A statement that does nothing.
- * goto Statement:: Jump to another point in the source code,
- identified by a label.
- * Local Labels:: Labels with limited scope.
- * Labels as Values:: Getting the address of a label.
- * Statement Exprs:: A series of statements used as an expression.
- Variables
- * Variable Declarations:: Name a variable and reserve space for it.
- * Initializers:: Assigning initial values to variables.
- * Designated Inits:: Assigning initial values to array elements
- at particular array indices.
- * Auto Type:: Obtaining the type of a variable.
- * Local Variables:: Variables declared in function definitions.
- * File-Scope Variables:: Variables declared outside of
- function definitions.
- * Static Local Variables:: Variables declared within functions,
- but with permanent storage allocation.
- * Extern Declarations:: Declaring a variable
- which is allocated somewhere else.
- * Allocating File-Scope:: When is space allocated
- for file-scope variables?
- * auto and register:: Historically used storage directions.
- * Omitting Types:: The bad practice of declaring variables
- with implicit type.
- Type Qualifiers
- * const:: Variables whose values don't change.
- * volatile:: Variables whose values may be accessed
- or changed outside of the control of
- this program.
- * restrict Pointers:: Restricted pointers for code optimization.
- * restrict Pointer Example:: Example of how that works.
- Functions
- * Function Definitions:: Writing the body of a function.
- * Function Declarations:: Declaring the interface of a function.
- * Function Calls:: Using functions.
- * Function Call Semantics:: Call-by-value argument passing.
- * Function Pointers:: Using references to functions.
- * The main Function:: Where execution of a GNU C program begins.
- Type Conversions
- * Explicit Type Conversion:: Casting a value from one type to another.
- * Assignment Type Conversions:: Automatic conversion by assignment operation.
- * Argument Promotions:: Automatic conversion of function parameters.
- * Operand Promotions:: Automatic conversion of arithmetic operands.
- * Common Type:: When operand types differ, which one is used?
- Scope
- * Scope:: Different categories of identifier scope.
- Preprocessing
- * Preproc Overview:: Introduction to the C preprocessor.
- * Directives:: The form of preprocessor directives.
- * Preprocessing Tokens:: The lexical elements of preprocessing.
- * Header Files:: Including one source file in another.
- * Macros:: Macro expansion by the preprocessor.
- * Conditionals:: Controlling whether to compile some lines
- or ignore them.
- * Diagnostics:: Reporting warnings and errors.
- * Line Control:: Reporting source line numbers.
- * Null Directive:: A preprocessing no-op.
- Integers in Depth
- * Integer Representations:: How integer values appear in memory.
- * Maximum and Minimum Values:: Value ranges of integer types.
- Floating Point in Depth
- * Floating Representations:: How floating-point values appear in memory.
- * Floating Type Specs:: Precise details of memory representations.
- * Special Float Values:: Infinity, Not a Number, and Subnormal Numbers.
- * Invalid Optimizations:: Don't mess up non-numbers and signed zeros.
- * Exception Flags:: Handling certain conditions in floating point.
- * Exact Floating-Point:: Not all floating calculations lose precision.
- * Rounding:: When a floating result can't be represented
- exactly in the floating-point type in use.
- * Rounding Issues:: Avoid magnifying rounding errors.
- * Significance Loss:: Subtracting numbers that are almost equal.
- * Fused Multiply-Add:: Taking advantage of a special floating-point
- instruction for faster execution.
- * Error Recovery:: Determining rounding errors.
- * Exact Floating Constants:: Precisely specified floating-point numbers.
- * Handling Infinity:: When floating calculation is out of range.
- * Handling NaN:: When floating calculation is undefined.
- * Signed Zeros:: Positive zero vs. negative zero.
- * Scaling by the Base:: A useful exact floating-point operation.
- * Rounding Control:: Specifying some rounding behaviors.
- * Machine Epsilon:: The smallest number you can add to 1.0
- and get a sum which is larger than 1.0.
- * Complex Arithmetic:: Details of arithmetic with complex numbers.
- * Round-Trip Base Conversion:: What happens between base-2 and base-10.
- * Further Reading:: References for floating-point numbers.
- Directing Compilation
- * Pragmas:: Controlling compilation of some constructs.
- * Static Assertions:: Compile-time tests for conditions.
- @end detailmenu
- @end menu
- @node The First Example
- @chapter The First Example
- This chapter presents the source code for a very simple C program and
- uses it to explain a few features of the language. If you already
- know the basic points of C presented in this chapter, you can skim it
- or skip it.
- @menu
- * Recursive Fibonacci:: Writing a simple function recursively.
- * Stack:: Each function call uses space in the stack.
- * Iterative Fibonacci:: Writing the same function iteratively.
- @end menu
- @node Recursive Fibonacci
- @section Example: Recursive Fibonacci
- @cindex recursive Fibonacci function
- @cindex Fibonacci function, recursive
- To introduce the most basic features of C, let's look at code for a
- simple mathematical function that does calculations on integers. This
- function calculates the @var{n}th number in the Fibonacci series, in
- which each number is the sum of the previous two: 1, 1, 2, 3, 5, 8,
- 13, 21, 34, 55, @dots{}.
- @example
- int
- fib (int n)
- @{
- if (n <= 2) /* @r{This avoids infinite recursion.} */
- return 1;
- else
- return fib (n - 1) + fib (n - 2);
- @}
- @end example
- This very simple program illustrates several features of C:
- @itemize @bullet
- @item
- A function definition, whose first two lines constitute the function
- header. @xref{Function Definitions}.
- @item
- A function parameter @code{n}, referred to as the variable @code{n}
- inside the function body. @xref{Function Parameter Variables}.
- A function definition uses parameters to refer to the argument
- values provided in a call to that function.
- @item
- Arithmetic. C programs add with @samp{+} and subtract with
- @samp{-}. @xref{Arithmetic}.
- @item
- Numeric comparisons. The operator @samp{<=} tests for ``less than or
- equal.'' @xref{Numeric Comparisons}.
- @item
- Integer constants written in base 10.
- @xref{Integer Constants}.
- @item
- A function call. The function call @code{fib (n - 1)} calls the
- function @code{fib}, passing as its argument the value @code{n - 1}.
- @xref{Function Calls}.
- @item
- A comment, which starts with @samp{/*} and ends with @samp{*/}. The
- comment has no effect on the execution of the program. Its purpose is
- to provide explanations to people reading the source code. Including
- comments in the code is tremendously important---they provide
- background information so others can understand the code more quickly.
- @xref{Comments}.
- @item
- Two kinds of statements, the @code{return} statement and the
- @code{if}@dots{}@code{else} statement. @xref{Statements}.
- @item
- Recursion. The function @code{fib} calls itself; that is called a
- @dfn{recursive call}. These are valid in C, and quite common.
- The @code{fib} function would not be useful if it didn't return.
- Thus, recursive definitions, to be of any use, must avoid infinite
- recursion.
- This function definition prevents infinite recursion by specially
- handling the case where @code{n} is two or less. Thus the maximum
- depth of recursive calls is less than @code{n}.
- @end itemize
- @menu
- * Function Header:: The function's name and how it is called.
- * Function Body:: Declarations and statements that implement the function.
- @end menu
- @node Function Header
- @subsection Function Header
- @cindex function header
- In our example, the first two lines of the function definition are the
- @dfn{header}. Its purpose is to state the function's name and say how
- it is called:
- @example
- int
- fib (int n)
- @end example
- @noindent
- says that the function returns an integer (type @code{int}), its name is
- @code{fib}, and it takes one argument named @code{n} which is also an
- integer. (Data types will be explained later, in @ref{Primitive Types}.)
- @node Function Body
- @subsection Function Body
- @cindex function body
- @cindex recursion
- The rest of the function definition is called the @dfn{function body}.
- Like every function body, this one starts with @samp{@{}, ends with
- @samp{@}}, and contains zero or more @dfn{statements} and
- @dfn{declarations}. Statements specify actions to take, whereas
- declarations define names of variables, functions, and so on. Each
- statement and each declaration ends with a semicolon (@samp{;}).
- Statements and declarations often contain @dfn{expressions}; an
- expression is a construct whose execution produces a @dfn{value} of
- some data type, but may also take actions through ``side effects''
- that alter subsequent execution. A statement, by contrast, does not
- have a value; it affects further execution of the program only through
- the actions it takes.
- This function body contains no declarations, and just one statement,
- but that one is a complex statement in that it contains nested
- statements. This function uses two kinds of statements:
- @table @code
- @item return
- The @code{return} statement makes the function return immediately.
- It looks like this:
- @example
- return @var{value};
- @end example
- Its meaning is to compute the expression @var{value} and exit the
- function, making it return whatever value that expression produced.
- For instance,
- @example
- return 1;
- @end example
- @noindent
- returns the integer 1 from the function, and
- @example
- return fib (n - 1) + fib (n - 2);
- @end example
- @noindent
- returns a value computed by performing two function calls
- as specified and adding their results.
- @item @code{if}@dots{}@code{else}
- The @code{if}@dots{}@code{else} statement is a @dfn{conditional}.
- Each time it executes, it chooses one of its two substatements to execute
- and ignores the other. It looks like this:
- @example
- if (@var{condition})
- @var{if-true-statement}
- else
- @var{if-false-statement}
- @end example
- Its meaning is to compute the expression @var{condition} and, if it's
- ``true,'' execute @var{if-true-statement}. Otherwise, execute
- @var{if-false-statement}. @xref{if-else Statement}.
- Inside the @code{if}@dots{}@code{else} statement, @var{condition} is
- simply an expression. It's considered ``true'' if its value is
- nonzero. (A comparison operation, such as @code{n <= 2}, produces the
- value 1 if it's ``true'' and 0 if it's ``false.'' @xref{Numeric
- Comparisons}.) Thus,
- @example
- if (n <= 2)
- return 1;
- else
- return fib (n - 1) + fib (n - 2);
- @end example
- @noindent
- first tests whether the value of @code{n} is less than or equal to 2.
- If so, the expression @code{n <= 2} has the value 1. So execution
- continues with the statement
- @example
- return 1;
- @end example
- @noindent
- Otherwise, execution continues with this statement:
- @example
- return fib (n - 1) + fib (n - 2);
- @end example
- Each of these statements ends the execution of the function and
- provides a value for it to return. @xref{return Statement}.
- @end table
- Calculating @code{fib} using ordinary integers in C works only for
- @var{n} < 47, because the value of @code{fib (47)} is too large to fit
- in type @code{int}. The addition operation that tries to add
- @code{fib (46)} and @code{fib (45)} cannot deliver the correct result.
- This occurrence is called @dfn{integer overflow}.
- Overflow can manifest itself in various ways, but one thing that can't
- possibly happen is to produce the correct value, since that can't fit
- in the space for the value. @xref{Integer Overflow}.
- @xref{Functions}, for a full explanation about functions.
- @node Stack
- @section The Stack, And Stack Overflow
- @cindex stack
- @cindex stack frame
- @cindex stack overflow
- @cindex recursion, drawbacks of
- Recursion has a drawback: there are limits to how many nested function
- calls a program can make. In C, each function call allocates a block
- of memory which it uses until the call returns. C allocates these
- blocks consecutively within a large area of memory known as the
- @dfn{stack}, so we refer to the blocks as @dfn{stack frames}.
- The size of the stack is limited; if the program tries to use too
- much, that causes the program to fail because the stack is full. This
- is called @dfn{stack overflow}.
- @cindex crash
- @cindex segmentation fault
- Stack overflow on GNU/Linux typically manifests itself as the
- @dfn{signal} named @code{SIGSEGV}, also known as a ``segmentation
- fault.'' By default, this signal terminates the program immediately,
- rather than letting the program try to recover, or reach an expected
- ending point. (We commonly say in this case that the program
- ``crashes.'') @xref{Signals}.
- It is inconvenient to observe a crash by passing too large
- an argument to recursive Fibonacci, because the program would run a
- long time before it crashes. This algorithm is simple but
- ridiculously slow: in calculating @code{fib (@var{n})}, the number of
- (recursive) calls @code{fib (1)} or @code{fib (2)} that it makes equals
- the final result.
- However, you can observe stack overflow very quickly if you use
- this function instead:
- @example
- int
- fill_stack (int n)
- @{
- if (n <= 1) /* @r{This limits the depth of recursion.} */
- return 1;
- else
- return fill_stack (n - 1);
- @}
- @end example
- Under gNewSense GNU/Linux on the Lemote Yeeloong, without optimization
- and using the default configuration, an experiment showed there is
- enough stack space to do 261906 nested calls to that function. One
- more, and the stack overflows and the program crashes. On another
- platform, with a different configuration, or with a different
- function, the limit might be bigger or smaller.
- @node Iterative Fibonacci
- @section Example: Iterative Fibonacci
- @cindex iterative Fibonacci function
- @cindex Fibonacci function, iterative
- Here's a much faster algorithm for computing the same Fibonacci
- series. It is faster for two reasons. First, it uses @dfn{iteration}
- (that is, repetition or looping) rather than recursion, so it doesn't
- take time for a large number of function calls. But mainly, it is
- faster because the number of repetitions is small---only @code{@var{n}}.
- @c If you change this, change the duplicate in node Example of for.
- @example
- int
- fib (int n)
- @{
- int last = 1; /* @r{Initial value is @code{fib (1)}.} */
- int prev = 0; /* @r{Initial value controls @code{fib (2)}.} */
- int i;
- for (i = 1; i < n; ++i)
- /* @r{If @code{n} is 1 or less, the loop runs zero times,} */
- /* @r{since @code{i < n} is false the first time.} */
- @{
- /* @r{Now @code{last} is @code{fib (@code{i})}}
- @r{and @code{prev} is @code{fib (@code{i} @minus{} 1)}.} */
- /* @r{Compute @code{fib (@code{i} + 1)}.} */
- int next = prev + last;
- /* @r{Shift the values down.} */
- prev = last;
- last = next;
- /* @r{Now @code{last} is @code{fib (@code{i} + 1)}}
- @r{and @code{prev} is @code{fib (@code{i})}.}
- @r{But that won't stay true for long,}
- @r{because we are about to increment @code{i}.} */
- @}
- return last;
- @}
- @end example
- This definition computes @code{fib (@var{n})} in a time proportional
- to @code{@var{n}}. The comments in the definition explain how it works: it
- advances through the series, always keeps the last two values in
- @code{last} and @code{prev}, and adds them to get the next value.
- Here are the additional C features that this definition uses:
- @table @asis
- @item Internal blocks
- Within a function, wherever a statement is called for, you can write a
- @dfn{block}. It looks like @code{@{ @r{@dots{}} @}} and contains zero or
- more statements and declarations. (You can also use additional
- blocks as statements in a block.)
- The function body also counts as a block, which is why it can contain
- statements and declarations.
- @xref{Blocks}.
- @item Declarations of local variables
- This function body contains declarations as well as statements. There
- are three declarations directly in the function body, as well as a
- fourth declaration in an internal block. Each starts with @code{int}
- because it declares a variable whose type is integer. One declaration
- can declare several variables, but each of these declarations is
- simple and declares just one variable.
- Variables declared inside a block (either a function body or an
- internal block) are @dfn{local variables}. These variables exist only
- within that block; their names are not defined outside the block, and
- exiting the block deallocates their storage. This example declares
- four local variables: @code{last}, @code{prev}, @code{i}, and
- @code{next}.
- The most basic local variable declaration looks like this:
- @example
- @var{type} @var{variablename};
- @end example
- For instance,
- @example
- int i;
- @end example
- @noindent
- declares the local variable @code{i} as an integer.
- @xref{Variable Declarations}.
- @item Initializers
- When you declare a variable, you can also specify its initial value,
- like this:
- @example
- @var{type} @var{variablename} = @var{value};
- @end example
- For instance,
- @example
- int last = 1;
- @end example
- @noindent
- declares the local variable @code{last} as an integer (type
- @code{int}) and starts it off with the value 1. @xref{Initializers}.
- @item Assignment
- An assignment is a specific kind of expression, written with the @samp{=}
- operator, that stores a new value in a variable or other place. Thus,
- @example
- @var{variable} = @var{value}
- @end example
- @noindent
- is an expression that computes @code{@var{value}} and stores the value in
- @code{@var{variable}}. @xref{Assignment Expressions}.
- @item Expression statements
- An expression statement is an expression followed by a semicolon.
- That computes the value of the expression, then ignores the value.
- An expression statement is useful when the expression changes some
- data or has other side effects---for instance, with function calls, or
- with assignments as in this example. @xref{Expression Statement}.
- Using an expression with no side effects in an expression statement is
- pointless except in very special cases. For instance, the expression
- statement @code{x;} would examine the value of @code{x} and ignore it.
- That is not useful.
- @item Increment operator
- The increment operator is @samp{++}. @code{++i} is an
- expression that is short for @code{i = i + 1}.
- @xref{Increment/Decrement}.
- @item @code{for} statements
- A @code{for} statement is a clean way of executing a statement
- repeatedly---a @dfn{loop} (@pxref{Loop Statements}). Specifically,
- @example
- for (i = 1; i < n; ++i)
- @var{body}
- @end example
- @noindent
- means to start by doing @code{i = 1} (set @code{i} to one) to prepare
- for the loop. The loop itself consists of
- @itemize @bullet
- @item
- Testing @code{i < n} and exiting the loop if that's false.
- @item
- Executing @var{body}.
- @item
- Advancing the loop (executing @code{++i}, which increments @code{i}).
- @end itemize
- The net result is to execute @var{body} with 1 in @code{i},
- then with 2 in @code{i}, and so on, stopping just before the repetition
- where @code{i} would equal @code{n}.
- The body of the @code{for} statement must be one and only one
- statement. You can't write two statements in a row there; if you try
- to, only the first of them will be treated as part of the loop.
- The way to put multiple statements in those places is to group them
- with a block, and that's what we do in this example.
- @end table
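- As a rough sketch of the steps listed above, the following code places
- the @code{for} version of @code{fib} next to a variant that writes the
- same steps out one by one with a @code{while} statement (covered later
- in this manual). The name @code{fib_spelled_out} is made up for this
- illustration; it is not part of the original example.

```c
/* Iterative Fibonacci, written with a for statement as in the text.  */
int
fib (int n)
{
  int last = 1;
  int prev = 0;
  int i;

  for (i = 1; i < n; ++i)
    {
      int next = prev + last;
      prev = last;
      last = next;
    }
  return last;
}

/* Hypothetical equivalent: the same steps written out one by one.  */
int
fib_spelled_out (int n)
{
  int last = 1;
  int prev = 0;
  int i;

  i = 1;                        /* Prepare for the loop.  */
  while (i < n)                 /* Test; exit the loop if false.  */
    {
      int next = prev + last;   /* Execute the body.  */
      prev = last;
      last = next;
      ++i;                      /* Advance the loop.  */
    }
  return last;
}
```

- Both functions should return the same value for any @code{n}; when
- @code{n} is 1 or less, neither loop body runs and the result is 1.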
- @node Complete Program
- @chapter A Complete Program
- @cindex complete example program
- @cindex example program, complete
- It's all very well to write a Fibonacci function, but you cannot run
- it by itself. It is a useful piece of code, but it is not a complete
- program.
- In this chapter we present a complete program that contains the
- @code{fib} function. This example shows how to make the program
- start, how to make it finish, how to do computation, and how to print
- a result.
- @menu
- * Complete Example:: Turn the simple function into a full program.
- * Complete Explanation:: Explanation of each part of the example.
- * Complete Line-by-Line:: Explaining each line of the example.
- * Compile Example:: Using GCC to compile the example.
- @end menu
- @node Complete Example
- @section Complete Program Example
- Here is the complete program that uses the simple, recursive version
- of the @code{fib} function (@pxref{Recursive Fibonacci}):
- @example
- #include <stdio.h>
- int
- fib (int n)
- @{
- if (n <= 2) /* @r{This avoids infinite recursion.} */
- return 1;
- else
- return fib (n - 1) + fib (n - 2);
- @}
- int
- main (void)
- @{
- printf ("Fibonacci series item %d is %d\n",
- 20, fib (20));
- return 0;
- @}
- @end example
- @noindent
- This program prints a message that shows the value of @code{fib (20)}.
- Now for an explanation of what that code means.
- @node Complete Explanation
- @section Complete Program Explanation
- @ifnottex
- Here's the explanation of the code of the example in the
- previous section.
- @end ifnottex
- This sample program prints a message that shows the value of @code{fib
- (20)}, and exits with code 0 (which stands for successful execution).
- Every C program is started by running the function named @code{main}.
- Therefore, the example program defines a function named @code{main} to
- provide a way to start it. Whatever that function does is what the
- program does. @xref{The main Function}.
- The @code{main} function is the first one called when the program
- runs, but it doesn't come first in the example code. The order of the
- function definitions in the source code makes no difference to the
- program's meaning.
- The initial call to @code{main} always passes certain arguments, but
- @code{main} does not have to pay attention to them. To ignore those
- arguments, define @code{main} with @code{void} as the parameter list.
- (@code{void} as a function's parameter list normally means ``call with
- no arguments,'' but @code{main} is a special case.)
- The function @code{main} returns 0 because that is
- the conventional way for @code{main} to indicate successful execution.
- It could instead return a positive integer to indicate failure, and
- some utility programs have specific conventions for the meaning of
- certain numeric @dfn{failure codes}. @xref{Values from main}.
- @cindex @code{printf}
- The simplest way to print text in C is by calling the @code{printf}
- function, so here we explain what that does.
- @cindex standard output
- The first argument to @code{printf} is a @dfn{string constant}
- (@pxref{String Constants}) that is a template for output. The
- function @code{printf} copies most of that string directly as output,
- including the newline character at the end of the string, which is
- written as @samp{\n}. The output goes to the program's @dfn{standard
- output} destination, which in the usual case is the terminal.
- @samp{%} in the template introduces a code that substitutes other text
- into the output. Specifically, @samp{%d} means to take the next
- argument to @code{printf} and substitute it into the text as a decimal
- number. (The argument for @samp{%d} must be of type @code{int}; if it
- isn't, @code{printf} will malfunction.) So the output is a line that
- looks like this:
- @example
- Fibonacci series item 20 is 6765
- @end example
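- To experiment with how @samp{%d} substitutes values into the template,
- one possible sketch uses the standard library function @code{snprintf},
- which accepts the same template codes as @code{printf} but stores the
- text in an array of characters instead of printing it. The helper name
- @code{format_fib_message} is an assumption made for this illustration.

```c
#include <stdio.h>

/* Hypothetical helper: build the message in BUF instead of printing it.
   SIZE is the number of bytes available in BUF.  The return value is
   the number of characters in the complete message.  */
int
format_fib_message (char *buf, size_t size, int n, int value)
{
  return snprintf (buf, size, "Fibonacci series item %d is %d\n",
                   n, value);
}
```

- Calling @code{format_fib_message (buf, sizeof buf, 20, 6765)} with a
- sufficiently large @code{buf} should store the same line shown above,
- including the newline at the end.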
- This program does not contain a definition for @code{printf} because
- it is defined by the C library, which makes it available in all C
- programs. However, each program does need to @dfn{declare}
- @code{printf} so it will be called correctly. The @code{#include}
- line takes care of that; it includes a @dfn{header file} called
- @file{stdio.h} into the program's code. That file is provided by the
- operating system and it contains declarations for the many standard
- input/output functions in the C library, one of which is
- @code{printf}.
- Don't worry about header files for now; we'll explain them later in
- @ref{Header Files}.
- The first argument of @code{printf} does not have to be a string
- constant; it can be any string (@pxref{Strings}). However, using a
- constant is the most common case.
- To learn more about @code{printf} and other facilities of the C
- library, see @ref{Top, The GNU C Library, , libc, The GNU C Library
- Reference Manual}.
- @node Complete Line-by-Line
- @section Complete Program, Line by Line
- Here's the same example, explained line by line.
- @strong{Beginners, do you find this helpful or not?
- Would you prefer a different layout for the example?
- Please tell rms@@gnu.org.}
- @example
- #include <stdio.h> /* @r{Include declaration of usual} */
- /* @r{I/O functions such as @code{printf}.} */
- /* @r{Most programs need these.} */
- int /* @r{This function returns an @code{int}.} */
- fib (int n) /* @r{Its name is @code{fib};} */
- /* @r{its argument is called @code{n}.} */
- @{ /* @r{Start of function body.} */
- /* @r{This stops the recursion from being infinite.} */
- if (n <= 2) /* @r{If @code{n} is 1 or 2,} */
- return 1; /* @r{make @code{fib} return 1.} */
- else /* @r{otherwise, add the two previous} */
- /* @r{Fibonacci numbers.} */
- return fib (n - 1) + fib (n - 2);
- @}
- int /* @r{This function returns an @code{int}.} */
- main (void) /* @r{Start here; ignore arguments.} */
- @{ /* @r{Print message with numbers in it.} */
- printf ("Fibonacci series item %d is %d\n",
- 20, fib (20));
- return 0; /* @r{Terminate program, report success.} */
- @}
- @end example
- @node Compile Example
- @section Compiling the Example Program
- @cindex compiling
- @cindex executable file
- To run a C program requires converting the source code into an
- @dfn{executable file}. This is called @dfn{compiling} the program,
- and the command to do that using GNU C is @command{gcc}.
- This example program consists of a single source file. If we
- call that file @file{fib1.c}, the complete command to compile it is
- this:
- @example
- gcc -g -O -o fib1 fib1.c
- @end example
- @noindent
- Here, @option{-g} says to generate debugging information, @option{-O}
- says to optimize at the basic level, and @option{-o fib1} says to put
- the executable program in the file @file{fib1}.
- To run the program, use its file name as a shell command.
- For instance,
- @example
- ./fib1
- @end example
- @noindent
- However, unless you are sure the program is correct, you should
- expect to need to debug it. So use this command,
- @example
- gdb fib1
- @end example
- @noindent
- which starts the GDB debugger (@pxref{Sample Session, Sample Session,
- A Sample GDB Session, gdb, Debugging with GDB}) so you can run and
- debug the executable program @code{fib1}.
- @xref{Compilation}, for an introduction to compiling more complex
- programs which consist of more than one source file.
- @node Storage
- @chapter Storage and Data
- @cindex bytes
- @cindex storage organization
- @cindex memory organization
- Storage in C programs is made up of units called @dfn{bytes}. On
- nearly all computers, a byte consists of 8 bits, but there are a few
- peculiar computers (mostly ``embedded controllers'' for very small
- systems) where a byte is longer than that. This manual does not try
- to explain the peculiarity of those computers; we assume that a byte
- is 8 bits.
- Every C data type is made up of a certain number of bytes; that number
- is the data type's @dfn{size}. @xref{Type Size}, for details. The
- types @code{signed char} and @code{unsigned char} are one byte long;
- use those types to operate on data byte by byte. @xref{Signed and
- Unsigned Types}. You can refer to a series of consecutive bytes as an
- array of @code{char} elements; that's what an ASCII string looks like
- in memory. @xref{String Constants}.
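- As a small sketch of these ideas, the functions below look at a string
- as a series of bytes. The function names are made up for this
- illustration. The code relies on one fact not stated above: a string
- in C ends with a null byte (value 0), which occupies storage too.

```c
/* Count the bytes in TEXT that come before the null byte
   that marks the end of the string.  */
int
count_string_bytes (char text[])
{
  int i = 0;

  while (text[i] != 0)
    ++i;
  return i;
}

/* Size in bytes of the string constant "GNU":
   three letters plus the terminating null byte.  */
int
size_of_gnu_constant (void)
{
  return (int) sizeof "GNU";
}

/* Size in bytes of the type unsigned char; this is always 1.  */
int
size_of_unsigned_char (void)
{
  return (int) sizeof (unsigned char);
}
```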
- @node Beyond Integers
- @chapter Beyond Integers
- So far we've presented programs that operate on integers. In this
- chapter we'll present examples of handling non-integral numbers and
- arrays of numbers.
- @menu
- * Float Example:: A function that uses floating-point numbers.
- * Array Example:: A function that works with arrays.
- * Array Example Call:: How to call that function.
- * Array Example Variations:: Different ways to write the call example.
- @end menu
- @node Float Example
- @section An Example with Non-Integer Numbers
- @cindex floating point example
- Here's a function that operates on and returns @dfn{floating point}
- numbers that don't have to be integers. Floating point represents a
- number as a fraction together with a power of 2. (For more detail,
- @pxref{Floating-Point Data Types}.) This example calculates the
- average of three floating point numbers that are passed to it as
- arguments:
- @example
- double
- average_of_three (double a, double b, double c)
- @{
- return (a + b + c) / 3;
- @}
- @end example
- The values of the parameters @var{a}, @var{b} and @var{c} do not have to be
- integers, and even when they happen to be integers, most likely their
- average is not an integer.
- @code{double} is the usual data type in C for calculations on
- floating-point numbers.
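- To see why the data type matters here, the sketch below pairs the
- function with a hypothetical integer counterpart,
- @code{int_average_of_three}, which is not part of the original
- example. Division of two @code{int} values discards the fractional
- part, whereas division involving a @code{double} keeps it.

```c
/* The function from the text: the division is done in floating point.  */
double
average_of_three (double a, double b, double c)
{
  return (a + b + c) / 3;
}

/* Hypothetical integer counterpart: dividing one int by another
   discards the fractional part of the quotient.  */
int
int_average_of_three (int a, int b, int c)
{
  return (a + b + c) / 3;
}
```

- With the arguments 1, 2 and 4, the @code{double} version should return
- a value close to 2.333, while the @code{int} version returns 2.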
- To print a @code{double} with @code{printf}, we must use @samp{%f}
- instead of @samp{%d}:
- @example
- printf ("Average is %f\n",
- average_of_three (1.1, 9.8, 3.62));
- @end example
- The code that calls @code{printf} must pass a @code{double} for
- printing with @samp{%f} and an @code{int} for printing with @samp{%d}.
- If the argument has the wrong type, @code{printf} will produce garbage
- output.
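- One way to meet this requirement when the value at hand is an
- @code{int} is to convert it explicitly with a cast, a feature this
- manual explains later. The sketch below uses the standard function
- @code{snprintf}, which takes the same template codes as @code{printf}
- but stores the text in an array; the helper name
- @code{format_count_as_double} is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: format COUNT, an int, using %f.
   The cast converts the int to double, the type that %f expects.  */
int
format_count_as_double (char *buf, size_t size, int count)
{
  return snprintf (buf, size, "Count is %f\n", (double) count);
}
```

- Without the cast, the argument would have type @code{int} and would
- not match @samp{%f}.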
- Here's a complete program that computes the average of three
- specific numbers and prints the result:
- @example
- #include <stdio.h>
- double
- average_of_three (double a, double b, double c)
- @{
- return (a + b + c) / 3;
- @}
- int
- main (void)
- @{
- printf ("Average is %f\n",
- average_of_three (1.1, 9.8, 3.62));
- return 0;
- @}
- @end example
- From now on we will not present a @code{main} function with each example.
- Instead we encourage you to write them for yourself when you want
- to test executing some code.
- @node Array Example
- @section An Example with Arrays
- @cindex array example
- A function to take the average of three numbers is very specific and
- limited. A more general function would take the average of any number
- of numbers. That requires passing the numbers in an array. An array
- is an object in memory that contains a series of values of the same
- data type. This section presents the basic concepts and use of arrays
- through an example; for the full explanation, see @ref{Arrays}.
- Here's a function definition to take the average of several
- floating-point numbers, passed as type @code{double}. The first
- parameter, @code{length}, specifies how many numbers are passed. The
- second parameter, @code{input_data}, is an array that holds those
- numbers.
- @example
- double
- avg_of_double (int length, double input_data[])
- @{
- double sum = 0;
- int i;
- for (i = 0; i < length; i++)
- sum = sum + input_data[i];
- return sum / length;
- @}
- @end example
- This introduces the expression to refer to an element of an array:
- @code{input_data[i]} means the element at index @code{i} in
- @code{input_data}. The index of the element can be any expression
- with an integer value; in this case, the expression is @code{i}.
- @xref{Accessing Array Elements}.
- @cindex zero-origin indexing
- The lowest valid index in an array is 0, @emph{not} 1, and the highest
- valid index is one less than the number of elements. (This is known
- as @dfn{zero-origin indexing}.)
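- A minimal sketch of zero-origin indexing follows. The function names
- @code{first_element} and @code{last_element} are made up for this
- illustration.

```c
/* Return the first element of DATA; its index is 0.  */
double
first_element (double data[])
{
  return data[0];
}

/* Return the last of the LENGTH elements of DATA;
   its index is LENGTH - 1, not LENGTH.  */
double
last_element (int length, double data[])
{
  return data[length - 1];
}
```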
- This example also introduces the way to declare that a function
- parameter is an array. Such declarations are modeled after the syntax
- for an element of the array. Just as @code{double foo} declares that
- @code{foo} is of type @code{double}, @code{double input_data[]}
- declares that each element of @code{input_data} is of type
- @code{double}. Therefore, @code{input_data} itself has type ``array
- of @code{double}.''
- When declaring an array parameter, it's not necessary to say how long
- the array is. In this case, the parameter @code{input_data} has no
- length information. That's why the function needs another parameter,
- @code{length}, for the caller to provide that information to the
- function @code{avg_of_double}.
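- Since the function must trust the @code{length} it is given, a
- cautious variant might check that value before dividing by it. The
- following is only a sketch; the name @code{avg_of_double_checked} and
- the decision to return 0 for an empty array are assumptions made here,
- not part of the original example.

```c
/* Hypothetical variant of avg_of_double that refuses an empty array.
   Returning 0 in that case is an arbitrary choice made for this sketch.  */
double
avg_of_double_checked (int length, double input_data[])
{
  double sum = 0;
  int i;

  if (length <= 0)
    return 0;
  for (i = 0; i < length; i++)
    sum = sum + input_data[i];
  return sum / length;
}
```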
- @node Array Example Call
- @section Calling the Array Example
- To call the function @code{avg_of_double} requires making an
- array and then passing it as an argument. Here is an example.
- @example
- @{
- /* @r{The array of values to average.} */
- double nums_to_average[5];
- /* @r{The average, once we compute it.} */
- double average;
- /* @r{Fill in elements of @code{nums_to_average}.} */
- nums_to_average[0] = 58.7;
- nums_to_average[1] = 5.1;
- nums_to_average[2] = 7.7;
- nums_to_average[3] = 105.2;
- nums_to_average[4] = -3.14159;
- average = avg_of_double (5, nums_to_average);
- /* @r{@dots{}now make use of @code{average}@dots{}} */
- @}
- @end example
- This shows an array subscripting expression again, this time
- on the left side of an assignment, storing a value into an
- element of an array.
- It also shows how to declare a local variable that is an array:
- @code{double nums_to_average[5];}. Since this declaration allocates the
- space for the array, it needs to know the array's length. You can
- specify the length with any expression whose value is an integer, but
- in this declaration the length is a constant, the integer 5.
- The name of the array, when used by itself as an expression, stands
- for the address of the array's data, and that's what gets passed to
- the function @code{avg_of_double} in @code{avg_of_double (5,
- nums_to_average)}.
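- One consequence, sketched below, is that a function can store values
- into the caller's array through its array parameter. The helper name
- @code{fill_with} is hypothetical.

```c
/* Hypothetical helper: store VALUE in each of the LENGTH elements
   of DATA.  DATA refers to the caller's array, not to a copy.  */
void
fill_with (int length, double data[], double value)
{
  int i;

  for (i = 0; i < length; i++)
    data[i] = value;
}
```

- After the call @code{fill_with (5, nums_to_average, 0)}, every element
- of the caller's @code{nums_to_average} would hold 0.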
- We can make the code easier to maintain by avoiding the need to write
- 5, the array length, when calling @code{avg_of_double}. That way, if
- we change the array to include more elements, we won't have to change
- that call. One way to do this is with the @code{sizeof} operator:
- @example
- average = avg_of_double ((sizeof (nums_to_average)
- / sizeof (nums_to_average[0])),
- nums_to_average);
- @end example
- This computes the number of elements in @code{nums_to_average} by dividing
- its total size by the size of one element. @xref{Type Size}, for more
- details of using @code{sizeof}.
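- Some programmers wrap this computation in a macro so that it has a
- name. The sketch below assumes a macro called @code{ARRAY_LENGTH} and
- a wrapper function @code{average_of_samples}; both names are made up
- here, and macros themselves are explained much later in this manual.

```c
/* Hypothetical macro: the number of elements in the array ARRAY.
   It gives a meaningful result only for a real array,
   not for an array parameter of a function.  */
#define ARRAY_LENGTH(array) (sizeof (array) / sizeof ((array)[0]))

/* The function from the earlier example.  */
double
avg_of_double (int length, double input_data[])
{
  double sum = 0;
  int i;

  for (i = 0; i < length; i++)
    sum = sum + input_data[i];
  return sum / length;
}

/* Average five sample values; the length appears only
   in the declaration of the array.  */
double
average_of_samples (void)
{
  double nums_to_average[5];

  nums_to_average[0] = 58.7;
  nums_to_average[1] = 5.1;
  nums_to_average[2] = 7.7;
  nums_to_average[3] = 105.2;
  nums_to_average[4] = -3.14159;

  return avg_of_double ((int) ARRAY_LENGTH (nums_to_average),
                        nums_to_average);
}
```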
- We don't show in this example what happens after storing the result of
- @code{avg_of_double} in the variable @code{average}. Presumably
- more code would follow that uses that result somehow. (Why compute
- the average and not use it?) But that isn't part of this topic.
- @node Array Example Variations
- @section Variations for Array Example
- The code to call @code{avg_of_double} has two declarations that
- start with the same data type:
- @example
- /* @r{The array of values to average.} */
- double nums_to_average[5];
- /* @r{The average, once we compute it.} */
- double average;
- @end example
- In C, you can combine the two, like this:
- @example
- double nums_to_average[5], average;
- @end example
- This declares @code{nums_to_average} so each of its elements is a
- @code{double}, and @code{average} so that it simply is a
- @code{double}.
- However, while you @emph{can} combine them, that doesn't mean you
- @emph{should}. If it is useful to write comments about the variables,
- and usually it is, then it's clearer to keep the declarations separate
- so you can put a comment on each one.
- We set all of the elements of the array @code{nums_to_average} with
- assignments, but it is more convenient to use an initializer in the
- declaration:
- @example
- @{
- /* @r{The array of values to average.} */
- double nums_to_average[]
- = @{ 58.7, 5.1, 7.7, 105.2, -3.14159 @};
- /* @r{The average, once we compute it.} */
- double average = avg_of_double ((sizeof (nums_to_average)
- / sizeof (nums_to_average[0])),
- nums_to_average);
- /* @r{@dots{}now make use of @code{average}@dots{}} */
- @}
- @end example
- The array initializer is a comma-separated list of values, delimited
- by braces. @xref{Initializers}.
- Note that the declaration does not specify a size for
- @code{nums_to_average}, so the size is determined from the
- initializer. There are five values in the initializer, so
- @code{nums_to_average} gets length 5. If we add another element to
- the initializer, @code{nums_to_average} will have six elements.
- Because the code computes the number of elements from the size of
- the array, using @code{sizeof}, the program will operate on all the
- elements in the initializer, regardless of how many those are.
- @node Lexical Syntax
- @chapter Lexical Syntax
- @cindex lexical syntax
- @cindex token
- To start the full description of the C language, we explain the
- lexical syntax and lexical units of C code. The lexical units of a
- programming language are known as @dfn{tokens}. This chapter covers
- all the tokens of C except for constants, which are covered in a later
- chapter (@pxref{Constants}). One vital kind of token is the
- @dfn{identifier} (@pxref{Identifiers}), which is used for names of any
- kind.
- @menu
- * English:: Write programs in English!
- * Characters:: The characters allowed in C programs.
- * Whitespace:: The particulars of whitespace characters.
- * Comments:: How to include comments in C code.
- * Identifiers:: How to form identifiers (names).
- * Operators/Punctuation:: Characters used as operators or punctuation.
- * Line Continuation:: Splitting one line into multiple lines.
- @end menu
- @node English
- @section Write Programs in English!
- In principle, you can write the function and variable names in a
- program, and the comments, in any human language. C allows any kinds
- of characters in comments, and you can put non-ASCII characters into
- identifiers with a special prefix. However, to enable programmers in
- all countries to understand and develop the program, it is best given
- today's circumstances to write identifiers and comments in
- English.
- English is the one language that programmers in all countries
- generally study. If a program's names are in English, most
- programmers in Bangladesh, Belgium, Bolivia, Brazil, and Bulgaria can
- understand them. Most programmers in those countries can speak
- English, or at least read it, but they do not read each other's
- languages at all. In India, with so many languages, two programmers
- may have no common language other than English.
- If you don't feel confident in writing English, do the best you can,
- and follow each English comment with a version in a language you
- write better; add a note asking others to translate that to English.
- Someone will eventually do that.
- The program's user interface is a different matter. We don't need to
- choose one language for that; it is easy to support multiple languages
- and let each user choose the language to use. This requires writing
- the program to support localization of its interface. (The
- @code{gettext} package exists to support this; @pxref{Message
- Translation, The GNU C Library, , libc, The GNU C Library Reference
- Manual}.) Then a community-based translation effort can provide
- support for all the languages users want to use.
- @node Characters
- @section Characters
- @cindex character set
- @cindex Unicode
- @c ??? How to express ¶?
- GNU C source files are usually written in the
- @url{https://en.wikipedia.org/wiki/ASCII,,ASCII} character set, which
- was defined in the 1960s for English. However, they can also include
- Unicode characters represented in the
- @url{https://en.wikipedia.org/wiki/UTF-8,,UTF-8} multibyte encoding.
- This makes it possible to represent accented letters such as @samp{á},
- as well as other scripts such as Arabic, Chinese, Cyrillic, Hebrew,
- Japanese, and Korean.@footnote{On some obscure systems, GNU C uses
- UTF-EBCDIC instead of UTF-8, but that is not worth describing in this
- manual.}
- In C source code, non-ASCII characters are valid in comments, in wide
- character constants (@pxref{Wide Character Constants}), and in string
- constants (@pxref{String Constants}).
- @c ??? valid in identifiers?
- Another way to specify non-ASCII characters in constants (character or
- string) and identifiers is with an escape sequence starting with
- backslash, specifying the intended Unicode character. (@xref{Unicode
- Character Codes}.) This specifies non-ASCII characters without
- putting a real non-ASCII character in the source file itself.
- C accepts two-character aliases called @dfn{digraphs} for certain
- characters. @xref{Digraphs}.
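- As a hedged illustration of such an escape sequence, the function
- below compares a string written with @samp{\u00e1} against the two
- bytes that encode @samp{á} in UTF-8. It assumes the compiler encodes
- strings in UTF-8 when the program runs, which is the usual default for
- GNU C; the function name is made up for this sketch.

```c
#include <string.h>

/* Return nonzero if the escape \u00e1 produces the same string as
   the UTF-8 bytes for a-acute, which are 0xC3 followed by 0xA1.
   This assumes strings are encoded in UTF-8 at run time.  */
int
escape_matches_utf8 (void)
{
  return strcmp ("\u00e1", "\xc3\xa1") == 0;
}
```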
- @node Whitespace
- @section Whitespace
- @cindex whitespace characters in source files
- @cindex space character in source
- @cindex tab character in source
- @cindex formfeed in source
- @cindex linefeed in source
- @cindex newline in source
- @cindex carriage return in source
- @cindex vertical tab in source
- Whitespace means characters that exist in a file but appear blank in a
- printed listing of a file (or traditionally did appear blank, several
- decades ago). The C language requires whitespace in order to separate
- two consecutive identifiers, or to separate an identifier from a
- numeric constant. Other than that, and a few special situations
- described later, whitespace is optional; you can put it in when you
- wish, to make the code easier to read.
- Space and tab in C code are treated as whitespace characters. So are
- line breaks. You can represent a line break with the newline
- character (also called @dfn{linefeed} or LF), CR (carriage return), or
- the CRLF sequence (two characters: carriage return followed by a
- newline character).
- The @dfn{formfeed} character, Control-L, was traditionally used to
- divide a file into pages. It is still used this way in source code,
- and the tools that generate nice printouts of source code still start
- a new page after each ``formfeed'' character. Dividing code into
- pages separated by formfeed characters is a good way to break it up
- into comprehensible pieces and show other programmers where they start
- and end.
- The @dfn{vertical tab} character, Control-K, was traditionally used to
- make printing advance down to the next section of a page. We know of
- no particular reason to use it in source code, but it is still
- accepted as whitespace in C.
- Comments are also syntactically equivalent to whitespace.
- @ifinfo
- @xref{Comments}.
- @end ifinfo
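- To illustrate how little whitespace C actually requires, this sketch
- writes the same function twice. Both function names are made up for
- the illustration.

```c
/* Written with only the whitespace that C requires: one space
   after each int and one after return.  */
int times_plus_one(int a,int b){return a*b+1;}

/* The same function, laid out for human readers.  */
int
times_plus_one_spaced (int a, int b)
{
  return a * b + 1;
}
```

- The two definitions mean the same thing to the compiler; only the
- second is pleasant to read.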
- @node Comments
- @section Comments
- @cindex comments
- A comment encapsulates text that has no effect on the program's
- execution or meaning.
- The purpose of comments is to explain the code to people who read it.
- Writing good comments for your code is tremendously important---they
- should provide background information that helps programmers
- understand the reasons why the code is written the way it is. You,
- returning to the code six months from now, will need the help of these
- comments to remember why you wrote it this way.
- Outdated comments that become incorrect are counterproductive, so part
- of the software developer's responsibility is to update comments as
- needed to correspond with changes to the program code.
- C allows two kinds of comment syntax, the traditional style and the
- C@t{++} style. A traditional C comment starts with @samp{/*} and ends
- with @samp{*/}. For instance,
- @example
- /* @r{This is a comment in traditional C syntax.} */
- @end example
- A traditional comment can contain @samp{/*}, but these delimiters do
- not nest as pairs. The first @samp{*/} ends the comment regardless of
- whether it contains @samp{/*} sequences.
- @example
- /* @r{This} /* @r{is a comment} */ But this is not! */
- @end example
- A @dfn{line comment} starts with @samp{//} and ends at the end of the line.
- For instance,
- @example
- // @r{This is a comment in C@t{++} style.}
- @end example
- Line comments do nest, in effect, because @samp{//} inside a line
- comment is part of that comment:
- @example
- // @r{this whole line is} // @r{one comment}
- This is code, not comment.
- @end example
- It is safe to put line comments inside block comments, or vice versa.
- @example
- @group
- /* @r{traditional comment}
- // @r{contains line comment}
- @r{more traditional comment}
- */ text here is not a comment
- // @r{line comment} /* @r{contains traditional comment} */
- @end group
- @end example
- But beware of commenting out one end of a traditional comment with a line
- comment. The delimiter @samp{/*} doesn't start a comment if it occurs
- inside an already-started comment.
- @example
- @group
- // @r{line comment} /* @r{That would ordinarily begin a block comment.}
- Oops! The line comment has ended;
- this isn't a comment any more. */
- @end group
- @end example
- Comments are not recognized within string constants. @t{@w{"/* blah
- */"}} is the string constant @samp{@w{/* blah */}}, not an empty
- string.
- In this manual we show the text in comments in a variable-width font,
- for readability, but this font distinction does not exist in source
- files.
- A comment is syntactically equivalent to whitespace, so it always
- separates tokens. Thus,
- @example
- @group
- int/* @r{comment} */foo;
- @r{is equivalent to}
- int foo;
- @end group
- @end example
- @noindent
- but clean code always uses real whitespace to separate the comment
- visually from surrounding code.
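- The following sketch gathers two of the points above into code that
- can be compiled. The function names are hypothetical.

```c
#include <string.h>

/* Return the length of a string constant that merely looks like a
   comment.  All ten of its characters are part of the string.  */
int
comment_lookalike_length (void)
{
  return (int) strlen ("/* blah */");
}

/* A comment separates tokens just as whitespace does, although
   clean code would put real whitespace around it.  */
int
comment_as_separator (void)
{
  int/* comment */foo = 5;
  return foo;   // A line comment ends at the end of this line.
}
```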
- @node Identifiers
- @section Identifiers
- @cindex identifiers
- An @dfn{identifier} (name) in C is a sequence of letters and digits,
- as well as @samp{_}, that does not start with a digit. Most compilers
- also allow @samp{$}. An identifier can be as long as you like; for
- example,
- @example
- int anti_dis_establishment_arian_ism;
- @end example
- @cindex case of letters in identifiers
- Letters in identifiers are case-sensitive in C; thus, @code{a}
- and @code{A} are two different identifiers.
- @cindex keyword
- @cindex reserved words
- Identifiers in C are used as variable names, function names, typedef
- names, enumeration constants, type tags, field names, and labels.
- Certain identifiers in C are @dfn{keywords}, which means they have
- specific syntactic meanings. Keywords in C are @dfn{reserved words},
- meaning you cannot use them in any other way. For instance, you can't
- define a variable or function named @code{return} or @code{if}.
- You can also include other characters, even non-ASCII characters, in
- identifiers by writing their Unicode character names, which start with
- @samp{\u} or @samp{\U}, in the identifier name. @xref{Unicode
- Character Codes}. However, it is usually a bad idea to use non-ASCII
- characters in identifiers, and when they are written in English, they
- never need non-ASCII characters. @xref{English}.
- Whitespace is required to separate two consecutive identifiers, or to
- separate an identifier from a preceding or following numeric
- constant.
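- A brief sketch of these rules: the three local variables below have
- distinct names, because case is significant and because digits and
- @samp{_} may appear after the first character. The names are made up
- for this illustration.

```c
/* Three different identifiers; total and Total differ only in case.  */
int
distinct_names (void)
{
  int total = 1;
  int Total = 20;
  int total_2 = 300;

  return total + Total + total_2;
}
```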
- @node Operators/Punctuation
- @section Operators and Punctuation
- @cindex operators
- @cindex punctuation
- Here we describe the lexical syntax of operators and punctuation in C.
- The specific operators of C and their meanings are presented in
- subsequent chapters.
- Most operators in C consist of one or two characters that can't be
- used in identifiers. The characters used for operators in C are
- @samp{!~^&|*/%+-=<>,.?:}.
- Some operators are a single character. For instance, @samp{-} is the
- operator for negation (with one operand) and the operator for
- subtraction (with two operands).
- Some operators are two characters. For example, @samp{++} is the
- increment operator. Recognition of multicharacter operators works by
- grouping together as many consecutive characters as can constitute one
- operator.
- For instance, the character sequence @samp{++} is always interpreted
- as the increment operator; therefore, if we want to write two
- consecutive instances of the operator @samp{+}, we must separate them
- with a space so that they do not combine as one token. Applying the
- same rule, @code{a+++++b} is always tokenized as @code{@w{a++ ++ +
- b}}, not as @code{@w{a++ + ++b}}, even though the latter could be part
- of a valid C program and the former could not (since @code{a++}
- is not an lvalue and thus can't be the operand of @code{++}).
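As a small illustration of the longest-match rule, here is a sketch using the shorter sequence @code{+++}, which tokenizes as @code{++} followed by @code{+}. The function name @code{sum_then_bump} is hypothetical, chosen only for this example.

```c
/* Hypothetical helper: shows that `(*a)+++b' groups as
   `((*a)++) + b', because `++' is recognized before `+'.  */
int
sum_then_bump (int *a, int b)
{
  /* Adds the old value of *a to b, then *a is incremented.  */
  return (*a)+++b;
}
```

Called with @code{*a} equal to 1 and @code{b} equal to 2, this should return 3 and leave @code{*a} equal to 2. Real code should write the spaces explicitly.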
- A few C operators are keywords rather than special characters. They
- include @code{sizeof} (@pxref{Type Size}) and @code{_Alignof}
- (@pxref{Type Alignment}).
- The characters @samp{;@{@}[]()} are used for punctuation and grouping.
- Semicolon (@samp{;}) ends a statement. Braces (@samp{@{} and
- @samp{@}}) begin and end a block at the statement level
- (@pxref{Blocks}), and surround the initializer (@pxref{Initializers})
- for a variable with multiple elements or components (such as arrays or
- structures).
- Square brackets (@samp{[} and @samp{]}) do array indexing, as in
- @code{array[5]}.
- Parentheses are used in expressions for explicit nesting of
- expressions (@pxref{Basic Arithmetic}), around the parameter
- declarations in a function declaration or definition, and around the
- arguments in a function call, as in @code{printf ("Foo %d\n", i)}
- (@pxref{Function Calls}). Several kinds of statements also use
- parentheses as part of their syntax---for instance, @code{if}
- statements, @code{for} statements, @code{while} statements, and
- @code{switch} statements. @xref{if Statement}, and following
- sections.
- Parentheses are also required around the operand of the operator
- keywords @code{sizeof} and @code{_Alignof} when the operand is a data
- type rather than a value. @xref{Type Size}.
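The following sketch contrasts the two operand forms of @code{sizeof}. The function name @code{sizeof_forms_agree} is an assumption made for illustration only.

```c
#include <stddef.h>

/* Hypothetical helper: compares sizeof applied to a type
   with sizeof applied to a value of that type.  */
int
sizeof_forms_agree (void)
{
  int x = 0;
  size_t from_type = sizeof (int);  /* Type operand: parentheses required.  */
  size_t from_value = sizeof x;     /* Value operand: parentheses optional.  */
  return from_type == from_value;
}
```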
- @node Line Continuation
- @section Line Continuation
- @cindex line continuation
- @cindex continuation of lines
- The sequence of a backslash and a newline is ignored absolutely
- anywhere in a C program. This makes it possible to split a single
- source line into multiple lines in the source file. GNU C tolerates
- and ignores other whitespace between the backslash and the newline.
- In particular, it always ignores a CR (carriage return) character
- there, in case some text editor decided to end the line with the CRLF
- sequence.
- The main use of line continuation in C is for macro definitions that
- would be inconveniently long for a single line (@pxref{Macros}).
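For instance, a macro definition might be split across lines roughly like this. The macro @code{SUM_OF_SQUARES} and the function that uses it are hypothetical names invented for this sketch.

```c
/* Hypothetical macro split across lines with backslash-newline.
   The compiler treats the three lines as one logical line.  */
#define SUM_OF_SQUARES(x, y) \
  (((x) * (x))               \
   + ((y) * (y)))

int
sum_of_squares (int x, int y)
{
  return SUM_OF_SQUARES (x, y);
}
```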
- It is possible to continue a line comment onto another line with
- backslash-newline. You can put backslash-newline in the middle of an
- identifier, even a keyword, or an operator. You can even split
- @samp{/*}, @samp{*/}, and @samp{//} onto multiple lines with
- backslash-newline. Here's an ugly example:
- @example
- @group
- /\
- *
- */ fo\
- o +\
- = 1\
- 0;
- @end group
- @end example
- @noindent
- That's equivalent to @samp{/* */ foo += 10;}.
- Don't do those things in real programs, since they make code hard to
- read.
- @strong{Note:} For the sake of using certain tools on the source code, it is
- wise to end every source file with a newline character which is not
- preceded by a backslash, so that it really ends the last line.
- @node Arithmetic
- @chapter Arithmetic
- @cindex arithmetic operators
- @cindex operators, arithmetic
- @c ??? Duplication with other sections -- get rid of that?
- Arithmetic operators in C attempt to be as similar as possible to the
- abstract arithmetic operations, but it is impossible to do this
- perfectly. Numbers in a computer have a finite range of possible
- values, and non-integer values have a limit on their possible
- accuracy. Nonetheless, in most cases you will encounter no surprises
- in using @samp{+} for addition, @samp{-} for subtraction, and @samp{*}
- for multiplication.
- Each C operator has a @dfn{precedence}, which is its rank in the
- grammatical order of the various operators. The operators with the
- highest precedence grab adjoining operands first; these expressions
- then become operands for operators of lower precedence. We give some
- information about precedence of operators in this chapter where we
- describe the operators; for the full explanation, see @ref{Binary
- Operator Grammar}.
- The arithmetic operators always @dfn{promote} their operands before
- operating on them. This means converting narrow integer data types to
- a wider data type (@pxref{Operand Promotions}). If you are just
- learning C, don't worry about this yet.
- Given two operands that have different types, most arithmetic
- operations convert them both to their @dfn{common type}. For
- instance, if one is @code{int} and the other is @code{double}, the
- common type is @code{double}. (That's because @code{double} can
- represent all the values that an @code{int} can hold, but not vice
- versa.) For the full details, see @ref{Common Type}.
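One way to observe promotion and the common type is to look at the size of an arithmetic result. This is only a sketch; the two function names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: size of the sum of two char values.  */
size_t
size_of_char_sum (char a, char b)
{
  /* Both operands are promoted to int before the addition,
     so the result has type int.  */
  return sizeof (a + b);
}

/* Hypothetical helper: size of the sum of an int and a double.  */
size_t
size_of_mixed_sum (int i, double d)
{
  /* The common type of int and double is double.  */
  return sizeof (i + d);
}
```

The first function should return @code{sizeof (int)} and the second @code{sizeof (double)}.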
- @menu
- * Basic Arithmetic:: Addition, subtraction, multiplication,
- and division.
- * Integer Arithmetic:: How C performs arithmetic with integer values.
- * Integer Overflow:: When an integer value exceeds the range
- of its type.
- * Mixed Mode:: Calculating with both integer values
- and floating-point values.
- * Division and Remainder:: How integer division works.
- * Numeric Comparisons:: Comparing numeric values for equality or order.
- * Shift Operations:: Shift integer bits left or right.
- * Bitwise Operations:: Bitwise conjunction, disjunction, negation.
- @end menu
- @node Basic Arithmetic
- @section Basic Arithmetic
- @cindex addition operator
- @cindex subtraction operator
- @cindex multiplication operator
- @cindex division operator
- @cindex negation operator
- @cindex operator, addition
- @cindex operator, subtraction
- @cindex operator, multiplication
- @cindex operator, division
- @cindex operator, negation
- Basic arithmetic in C is done with the usual binary operators of
- algebra: addition (@samp{+}), subtraction (@samp{-}), multiplication
- (@samp{*}) and division (@samp{/}). The unary operator @samp{-} is
- used to change the sign of a number. The unary @code{+} operator also
- exists; it yields its operand unaltered.
- @samp{/} is the division operator, but dividing integers may not give
- the result you expect. Its value is an integer, which is not equal to
- the mathematical quotient when that is a fraction. Use @samp{%} to
- get the corresponding integer remainder when necessary.
- @xref{Division and Remainder}. Floating-point division yields a value
- as close as possible to the mathematical quotient.
- These operators use algebraic syntax with the usual algebraic
- precedence rule (@pxref{Binary Operator Grammar}) that multiplication
- and division are done before addition and subtraction, but you can use
- parentheses to explicitly specify how the operators nest. They are
- left-associative (@pxref{Associativity and Ordering}). Thus,
- @example
- -a + b - c + d * e / f
- @end example
- @noindent
- is equivalent to
- @example
- (((-a) + b) - c) + ((d * e) / f)
- @end example
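To check the equivalence on concrete values, the two forms can be wrapped in functions and compared. The function names below are hypothetical, and the operands are assumed to be integers with @code{f} nonzero.

```c
/* Hypothetical helper: relies on precedence and associativity.  */
int
implicit_nesting (int a, int b, int c, int d, int e, int f)
{
  return -a + b - c + d * e / f;
}

/* Hypothetical helper: spells out the same nesting.  */
int
explicit_nesting (int a, int b, int c, int d, int e, int f)
{
  return (((-a) + b) - c) + ((d * e) / f);
}
```

With the arguments 1, 2, 3, 4, 5, 6, both should return 1, since @code{4 * 5 / 6} is 3 in integer arithmetic.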
- @node Integer Arithmetic
- @section Integer Arithmetic
- @cindex integer arithmetic
- Each of the basic arithmetic operations in C has two variants for
- integers: @dfn{signed} and @dfn{unsigned}. The choice is determined
- by the data types of their operands.
- Each integer data type in C is either @dfn{signed} or @dfn{unsigned}.
- A signed type can hold a range of positive and negative numbers, with
- zero near the middle of the range. An unsigned type can hold only
- nonnegative numbers; its range starts with zero and runs upward.
- The most basic integer types are @code{int}, which normally can hold
- numbers from @minus{}2,147,483,648 to 2,147,483,647, and @code{unsigned
- int}, which normally can hold numbers from 0 to 4,294,967,295. (This
- assumes @code{int} is 32 bits wide, always true for GNU C on real
- computers but not always on embedded controllers.) @xref{Integer
- Types}, for full information about integer types.
- When a basic arithmetic operation is given two signed operands, it
- does signed arithmetic. Given two unsigned operands, it does
- unsigned arithmetic.
- If one operand is @code{unsigned int} and the other is @code{int}, the
- operator treats them both as unsigned. More generally, the common
- type of the operands determines whether the operation is signed or
- not. @xref{Common Type}.
- Printing the results of unsigned arithmetic with @code{printf} using
- @samp{%d} can produce surprising results for values far away from
- zero. Even though the rules above say that the computation was done
- with unsigned arithmetic, the printed result may appear to be signed!
- The explanation is that the bit pattern resulting from addition,
- subtraction or multiplication is actually the same for signed and
- unsigned operations. The difference is only in the data type of the
- result, which affects the @emph{interpretation} of the result bit pattern,
- and whether the arithmetic operation can overflow (see the next section).
- But @samp{%d} doesn't know its argument's data type. It sees only the
- value's bit pattern, and it is defined to interpret that as
- @code{signed int}. To print it as unsigned requires using @samp{%u}
- instead of @samp{%d}. @xref{Formatted Output, The GNU C Library, ,
- libc, The GNU C Library Reference Manual}.
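A minimal sketch of formatting an unsigned value correctly follows. It uses the standard function @code{snprintf}; the wrapper name @code{format_unsigned} is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: writes VALUE into BUF as decimal text
   and returns the number of characters needed.  */
int
format_unsigned (char *buf, size_t size, unsigned int value)
{
  /* %u matches unsigned int, so the bit pattern is
     interpreted as a nonnegative number.  */
  return snprintf (buf, size, "%u", value);
}
```

Assuming a 32-bit @code{unsigned int}, passing 4294967295 produces ten digits with no minus sign.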
- Arithmetic in C never operates directly on narrow integer types (those
- with fewer bits than @code{int}; @pxref{Narrow Integers}). Instead it
- ``promotes'' them to @code{int}. @xref{Operand Promotions}.
- @node Integer Overflow
- @section Integer Overflow
- @cindex integer overflow
- @cindex overflow, integer
- When the mathematical value of an arithmetic operation doesn't fit in
- the range of the data type in use, that's called @dfn{overflow}.
- When it happens in integer arithmetic, it is @dfn{integer overflow}.
- Integer overflow happens only in arithmetic operations. Type conversion
- operations, by definition, do not cause overflow, not even when the
- result can't fit in its new type. @xref{Integer Conversion}.
- Signed numbers use two's-complement representation, in which the most
- negative number lacks a positive counterpart (@pxref{Integers in
- Depth}). Thus, the unary @samp{-} operator on a signed integer can
- overflow.
- @menu
- * Unsigned Overflow:: Overflow in unsigned integer arithmetic.
- * Signed Overflow:: Overflow in signed integer arithmetic.
- @end menu
- @node Unsigned Overflow
- @subsection Overflow with Unsigned Integers
- Unsigned arithmetic in C ignores overflow; it produces the true result
- modulo the @var{n}th power of 2, where @var{n} is the number of bits
- in the data type. We say it ``truncates'' the true result to the
- lowest @var{n} bits.
- A true result that is negative, when taken modulo the @var{n}th power
- of 2, yields a positive number. For instance,
- @example
- unsigned int x = 1;
- unsigned int y;
- y = -x;
- @end example
- @noindent
- causes overflow because the negative number @minus{}1 can't be stored
- in an unsigned type. The actual result, which is @minus{}1 modulo the
- @var{n}th power of 2, is one less than the @var{n}th power of 2. That
- is the largest value that the unsigned data type can store. For a
- 32-bit @code{unsigned int}, the value is 4,294,967,295. @xref{Maximum
- and Minimum Values}.
- Adding that number to itself, as here,
- @example
- unsigned int z;
- z = y + y;
- @end example
- @noindent
- ought to yield 8,589,934,590; however, that is again too large to fit,
- so overflow truncates the value to 4,294,967,294. If that were a
- signed integer, it would mean @minus{}2, which (not by coincidence)
- equals @minus{}1 + @minus{}1.
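The two steps above can be sketched as functions and compared against @code{UINT_MAX} from @file{limits.h}. The function names are hypothetical.

```c
#include <limits.h>

/* Hypothetical helper: negation in unsigned arithmetic.  */
unsigned int
negate_unsigned (unsigned int x)
{
  return -x;      /* The true result modulo the nth power of 2.  */
}

/* Hypothetical helper: addition that may wrap around.  */
unsigned int
double_unsigned (unsigned int y)
{
  return y + y;   /* Truncated to the lowest n bits on overflow.  */
}
```

Here @code{negate_unsigned (1)} should equal @code{UINT_MAX}, and doubling that should give @code{UINT_MAX - 1}.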
- @node Signed Overflow
- @subsection Overflow with Signed Integers
- @cindex compiler options for integer overflow
- @cindex integer overflow, compiler options
- @cindex overflow, compiler options
- For signed integers, the result of overflow in C is @emph{in
- principle} undefined, meaning that anything whatsoever could happen.
- Therefore, C compilers can do optimizations that treat the overflow
- case with total unconcern. (Since the result of overflow is undefined
- in principle, one cannot claim that these optimizations are
- erroneous.)
- @strong{Watch out:} These optimizations can do surprising things. For
- instance,
- @example
- int i;
- @r{@dots{}}
- if (i < i + 1)
- x = 5;
- @end example
- @noindent
- could be optimized to do the assignment unconditionally, because the
- @code{if}-condition is always true if @code{i + 1} does not overflow.
- GCC offers compiler options to control handling signed integer
- overflow. These options operate per module; that is, each module
- behaves according to the options it was compiled with.
- These two options specify particular ways to handle signed integer
- overflow, other than the default way:
- @table @option
- @item -fwrapv
- Make signed integer operations well-defined, like unsigned integer
- operations: they produce the @var{n} low-order bits of the true
- result. The highest of those @var{n} bits is the sign bit of the
- result. With @option{-fwrapv}, these out-of-range operations are not
- considered overflow, so (strictly speaking) integer overflow never
- happens.
- The option @option{-fwrapv} enables some optimizations based on the
- defined values of out-of-range results. In GCC 8, it disables
- optimizations that are based on assuming signed integer operations
- will not overflow.
- @item -ftrapv
- Generate a signal @code{SIGFPE} when signed integer overflow occurs.
- This terminates the program unless the program handles the signal.
- @xref{Signals}.
- @end table
- One other option is useful for finding where overflow occurs:
- @ignore
- @item -fno-strict-overflow
- Disable optimizations that are based on assuming signed integer
- operations will not overflow.
- @end ignore
- @table @option
- @item -fsanitize=signed-integer-overflow
- Output a warning message at run time when signed integer overflow
- occurs. This checks the @samp{+}, @samp{*}, and @samp{-} operators.
- This takes priority over @option{-ftrapv}.
- @end table
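Independently of these options, a program can avoid signed overflow by testing the operands before adding. The following is one possible sketch, not the only approach; the name @code{checked_add} and its calling convention are assumptions made for this example.

```c
#include <limits.h>

/* Hypothetical helper: stores A + B in *RESULT and returns 1,
   or returns 0 without adding if the sum would not fit in int.  */
int
checked_add (int a, int b, int *result)
{
  /* These comparisons cannot themselves overflow, because
     INT_MAX - b is in range when b > 0 and INT_MIN - b is
     in range when b < 0.  */
  if ((b > 0 && a > INT_MAX - b)
      || (b < 0 && a < INT_MIN - b))
    return 0;
  *result = a + b;
  return 1;
}
```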
- @node Mixed Mode
- @section Mixed-Mode Arithmetic
- Mixing integers and floating-point numbers in a basic arithmetic
- operation converts the integers automatically to floating point.
- In most cases, this gives exactly the desired results.
- But sometimes it matters precisely where the conversion occurs.
- If @code{i} and @code{j} are integers, @code{(i + j) * 2.0} adds them
- as integers, then converts the sum to floating point for the
- multiplication. If the addition gets an overflow, that is not
- equivalent to converting both integers to floating point and then
- adding them. You can get the latter result by explicitly converting
- the integers, as in @code{((double) i + (double) j) * 2.0}.
- @xref{Explicit Type Conversion}.
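Here is a sketch of the explicitly converted form as a function. The name @code{scaled_sum} is hypothetical, and the sample values in the note below assume a 32-bit @code{int}.

```c
/* Hypothetical helper: adds I and J in floating point,
   then doubles the sum.  */
double
scaled_sum (int i, int j)
{
  /* Each operand is converted before the addition, so the
     sum is computed as a double and is not limited to the
     range of int.  */
  return ((double) i + (double) j) * 2.0;
}
```

With 2147483647 and 1 as arguments, the integer sum would overflow, but this function should return 4294967296.0.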
- @c Eggert's report
- Adding or multiplying several values, including some integers and some
- floating point, does the operations left to right. Thus, @code{3.0 +
- i + j} converts @code{i} to floating point, then adds 3.0, then
- converts @code{j} to floating point and adds that. You can specify a
- different order using parentheses: @code{3.0 + (i + j)} adds @code{i}
- and @code{j} first and then adds that result (converting to floating
- point) to 3.0. In this respect, C differs from other languages, such
- as Fortran.
- @node Division and Remainder
- @section Division and Remainder
- @cindex remainder operator
- @cindex modulus
- @cindex operator, remainder
- Division of integers in C rounds the result to an integer. The result
- is always rounded towards zero.
- @example
- 16 / 3 @result{} 5
- -16 / 3 @result{} -5
- 16 / -3 @result{} -5
- -16 / -3 @result{} 5
- @end example
- @noindent
- To get the corresponding remainder, use the @samp{%} operator:
- @example
- 16 % 3 @result{} 1
- -16 % 3 @result{} -1
- 16 % -3 @result{} 1
- -16 % -3 @result{} -1
- @end example
- @noindent
- @samp{%} has the same operator precedence as @samp{/} and @samp{*}.
- From the rounded quotient and the remainder, you can reconstruct
- the dividend, like this:
- @example
- int
- original_dividend (int divisor, int quotient, int remainder)
- @{
- return divisor * quotient + remainder;
- @}
- @end example
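The relationship can be checked for any sign combination with a sketch like this one. The function name is hypothetical; the divisor is assumed to be nonzero and the division is assumed not to overflow.

```c
/* Hypothetical helper: returns 1 if the quotient and remainder
   of DIVIDEND by DIVISOR reconstruct DIVIDEND.  */
int
division_identity_holds (int dividend, int divisor)
{
  int quotient = dividend / divisor;    /* Rounded toward zero.  */
  int remainder = dividend % divisor;   /* Same sign as the dividend, or zero.  */
  return divisor * quotient + remainder == dividend;
}
```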
- To do unrounded division, use floating point. If only one operand is
- floating point, @samp{/} converts the other operand to floating
- point.
- @example
- 16.0 / 3 @result{} 5.333333333333333
- 16 / 3.0 @result{} 5.333333333333333
- 16.0 / 3.0 @result{} 5.333333333333333
- 16 / 3 @result{} 5
- @end example
- The remainder operator @samp{%} is not allowed for floating-point
- operands, because it is not needed. The concept of remainder makes
- sense for integers because the result of division of integers has to
- be an integer. For floating point, the result of division is a
- floating-point number, in other words a fraction, which will differ
- from the exact result only by a very small amount.
- There are functions in the standard C library to calculate remainders
- from integer-valued division of floating-point numbers.
- @xref{Remainder Functions, The GNU C Library, , libc, The GNU C Library
- Reference Manual}.
- Integer division overflows in one specific case: dividing the smallest
- negative value for the data type (@pxref{Maximum and Minimum Values})
- by @minus{}1. That's because the correct result, which is the
- corresponding positive number, does not fit (@pxref{Integer Overflow})
- in the same number of bits. On some computers now in use, this always
- causes a signal @code{SIGFPE} (@pxref{Signals}), the same behavior
- that the option @option{-ftrapv} specifies (@pxref{Signed Overflow}).
- Division by zero leads to unpredictable results---depending on the
- type of computer, it might cause a signal @code{SIGFPE}, or it might
- produce a numeric result.
- @cindex division by zero
- @cindex zero, division by
- @strong{Watch out:} Make sure the program does not divide by zero. If
- you can't prove that the divisor is not zero, test whether it is zero,
- and skip the division if so.
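A guarded division might look like the sketch below, which also rejects the one overflowing case described above. The name @code{safe_divide} and its return convention are assumptions made for illustration.

```c
#include <limits.h>

/* Hypothetical helper: stores DIVIDEND / DIVISOR in *QUOTIENT
   and returns 1, or returns 0 if the division must be skipped.  */
int
safe_divide (int dividend, int divisor, int *quotient)
{
  if (divisor == 0)
    return 0;                   /* Division by zero.  */
  if (dividend == INT_MIN && divisor == -1)
    return 0;                   /* The result would not fit in int.  */
  *quotient = dividend / divisor;
  return 1;
}
```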
- @node Numeric Comparisons
- @section Numeric Comparisons
- @cindex numeric comparisons
- @cindex comparisons
- @cindex operators, comparison
- @cindex equal operator
- @cindex not-equal operator
- @cindex less-than operator
- @cindex greater-than operator
- @cindex less-or-equal operator
- @cindex greater-or-equal operator
- @cindex operator, equal
- @cindex operator, not-equal
- @cindex operator, less-than
- @cindex operator, greater-than
- @cindex operator, less-or-equal
- @cindex operator, greater-or-equal
- @cindex truth value
- There are two kinds of comparison operators: @dfn{equality} and
- @dfn{ordering}. Equality comparisons test whether two expressions
- have the same value. The result is a @dfn{truth value}: a number that
- is 1 for ``true'' and 0 for ``false.''
- @example
- a == b /* @r{Test for equal.} */
- a != b /* @r{Test for not equal.} */
- @end example
- The equality comparison is written @code{==} because plain @code{=}
- is the assignment operator.
- Ordering comparisons test which operand is greater or less. Their
- results are truth values. These are the ordering comparisons of C:
- @example
- a < b /* @r{Test for less-than.} */
- a > b /* @r{Test for greater-than.} */
- a <= b /* @r{Test for less-than-or-equal.} */
- a >= b /* @r{Test for greater-than-or-equal.} */
- @end example
- For any integers @code{a} and @code{b}, exactly one of the comparisons
- @code{a < b}, @code{a == b} and @code{a > b} is true, just as in
- mathematics. However, if @code{a} and @code{b} are special floating
- point values (not ordinary numbers), all three can be false.
- @xref{Special Float Values}, and @ref{Invalid Optimizations}.
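The floating-point exception to this rule can be seen with a not-a-number value. This sketch uses the @code{NAN} macro from @file{math.h}; both function names are hypothetical.

```c
#include <math.h>

/* Hypothetical helper: returns 1 if none of the three
   comparisons between A and B is true.  */
int
no_ordering_holds (double a, double b)
{
  return !(a < b) && !(a == b) && !(a > b);
}

/* Hypothetical helper: returns a quiet not-a-number value.  */
double
not_a_number (void)
{
  return NAN;
}
```

For ordinary numbers @code{no_ordering_holds} should return 0; when either argument is a not-a-number value it should return 1.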
- @node Shift Operations
- @section Shift Operations
- @cindex shift operators
- @cindex operators, shift
- @cindex shift count
- @dfn{Shifting} an integer means moving the bit values to the left or
- right within the bits of the data type. Shifting is defined only for
- integers. Here's the way to write it:
- @example
- /* @r{Left shift.} */
- 5 << 2 @result{} 20
- /* @r{Right shift.} */
- 5 >> 2 @result{} 1
- @end example
- @noindent
- The left operand is the value to be shifted, and the right operand
- says how many bits to shift it (the @dfn{shift count}). The left
- operand is promoted (@pxref{Operand Promotions}), so shifting never
- operates on a narrow integer type; it's always either @code{int} or
- wider. The value of the shift operator has the same type as the
- promoted left operand.
- @menu
- * Bits Shifted In:: How shifting makes new bits to shift in.
- * Shift Caveats:: Caveats of shift operations.
- * Shift Hacks:: Clever tricks with shift operations.
- @end menu
- @node Bits Shifted In
- @subsection Shifting Makes New Bits
- A shift operation shifts towards one end of the number and has to
- generate new bits at the other end.
- Shifting left one bit must generate a new least significant bit. It
- always brings in zero there. It is equivalent to multiplying by the
- appropriate power of 2. For example,
- @example
- 5 << 3 @r{is equivalent to} 5 * 2*2*2
- -10 << 4 @r{is equivalent to} -10 * 2*2*2*2
- @end example
- The meaning of shifting right depends on whether the data type is
- signed or unsigned (@pxref{Signed and Unsigned Types}). For a signed
- data type, it performs ``arithmetic shift,'' which keeps the number's
- sign unchanged by duplicating the sign bit. For an unsigned data
- type, it performs ``logical shift,'' which always shifts in zeros at
- the most significant bit.
- In both cases, shifting right one bit is division by two, rounding
- towards negative infinity. For example,
- @example
- (unsigned) 19 >> 2 @result{} 4
- (unsigned) 20 >> 2 @result{} 5
- (unsigned) 21 >> 2 @result{} 5
- @end example
- For negative left operand @code{a}, @code{a >> 1} is not equivalent to
- @code{a / 2}. They both divide by 2, but @samp{/} rounds toward
- zero.
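The difference shows up with an odd negative value. This sketch assumes GNU C, where right shift of a signed value is an arithmetic shift; the function names are hypothetical.

```c
/* Hypothetical helper: halves A with a right shift.  */
int
halve_by_shift (int a)
{
  return a >> 1;   /* Rounds toward negative infinity.  */
}

/* Hypothetical helper: halves A with division.  */
int
halve_by_division (int a)
{
  return a / 2;    /* Rounds toward zero.  */
}
```

For @minus{}7, the shift should give @minus{}4 while the division gives @minus{}3. For nonnegative values the two agree.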
- The shift count must be zero or greater. Shifting by a negative
- number of bits gives machine-dependent results.
- @node Shift Caveats
- @subsection Caveats for Shift Operations
- @strong{Warning:} If the shift count is greater than or equal to the
- width in bits of the first operand, the results are machine-dependent.
- Logically speaking, the ``correct'' value would be either @minus{}1 (for
- right shift of a negative number) or 0 (in all other cases), but what
- it really generates is whatever the machine's shift instruction does in
- that case. So unless you can prove that the second operand is not too
- large, write code to check it at run time.
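Such a run-time check could be sketched as follows for an unsigned left shift. The name @code{checked_shift_left} is hypothetical, and returning 0 for an oversized count is a design choice of this example, matching the ``correct'' value described above.

```c
#include <limits.h>

/* Hypothetical helper: shifts VALUE left by COUNT bits,
   yielding 0 when COUNT is at least the width of the type.  */
unsigned int
checked_shift_left (unsigned int value, unsigned int count)
{
  unsigned int width = sizeof (value) * CHAR_BIT;
  if (count >= width)
    return 0;
  return value << count;
}
```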
- @strong{Warning:} Never rely on how the shift operators relate in
- precedence to other arithmetic binary operators. Programmers don't
- remember these precedences, and won't understand the code. Always use
- parentheses to explicitly specify the nesting, like this:
- @example
- a + (b << 5) /* @r{Shift first, then add.} */
- (a + b) << 5 /* @r{Add first, then shift.} */
- @end example
- Note: according to the C standard, shifting of signed values isn't
- guaranteed to work properly when the value shifted is negative, or
- becomes negative during the operation of shifting left. However, only
- pedants have a reason to be concerned about this; only computers with
- strange shift instructions could plausibly do this wrong. In GNU C,
- the operation always works as expected.
- @node Shift Hacks
- @subsection Shift Hacks
- You can use the shift operators for various useful hacks. For
- example, given a date specified by day of the month @code{d}, month
- @code{m}, and year @code{y}, you can store the entire date in a single
- integer @code{date}:
- @example
- unsigned int d = 12;
- unsigned int m = 6;
- unsigned int y = 1983;
- unsigned int date = (((y << 4) + m) << 5) + d;
- @end example
- @noindent
- To extract the original day, month, and year out of
- @code{date}, use a combination of shift and remainder.
- @example
- d = date % 32;
- m = (date >> 5) % 16;
- y = date >> 9;
- @end example
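Packaged as functions, the packing and unpacking steps might look like this sketch. The function names are hypothetical; the layout (5 bits for the day, 4 bits for the month) follows the example above.

```c
/* Hypothetical helper: packs day D, month M and year Y.  */
unsigned int
pack_date (unsigned int d, unsigned int m, unsigned int y)
{
  return (((y << 4) + m) << 5) + d;
}

/* Hypothetical helpers: extract each field again.  */
unsigned int
date_day (unsigned int date)
{
  return date % 32;          /* Lowest 5 bits.  */
}

unsigned int
date_month (unsigned int date)
{
  return (date >> 5) % 16;   /* Next 4 bits.  */
}

unsigned int
date_year (unsigned int date)
{
  return date >> 9;          /* Everything above.  */
}
```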
- @code{-1 << LOWBITS} is a clever way to make an integer whose
- @code{LOWBITS} lowest bits are all 0 and the rest are all 1.
- @code{-(1 << LOWBITS)} is equivalent to that, due to associativity of
- multiplication, since negating a value is equivalent to multiplying it
- by @minus{}1.
- @node Bitwise Operations
- @section Bitwise Operations
- @cindex bitwise operators
- @cindex operators, bitwise
- @cindex negation, bitwise
- @cindex conjunction, bitwise
- @cindex disjunction, bitwise
- Bitwise operators operate on integers, treating each bit independently.
- They are not allowed for floating-point types.
- The examples in this section use binary constants, starting with
- @samp{0b} (@pxref{Integer Constants}). They stand for 32-bit integers
- of type @code{int}.
- @table @code
- @item ~@code{a}
- Unary operator for bitwise negation; this changes each bit of
- @code{a} from 1 to 0 or from 0 to 1.
- @example
- ~0b10101000 @result{} 0b11111111111111111111111101010111
- ~0 @result{} 0b11111111111111111111111111111111
- ~0b11111111111111111111111111111111 @result{} 0
- ~ (-1) @result{} 0
- @end example
- It is useful to remember that @code{~@var{x} + 1} equals
- @code{-@var{x}}, for integers, and @code{~@var{x}} equals
- @code{-@var{x} - 1}. The last example above shows this with @minus{}1
- as @var{x}.
- @item @code{a} & @code{b}
- Binary operator for bitwise ``and'' or ``conjunction.'' Each bit in
- the result is 1 if that bit is 1 in both @code{a} and @code{b}.
- @example
- 0b10101010 & 0b11001100 @result{} 0b10001000
- @end example
- @item @code{a} | @code{b}
- Binary operator for bitwise ``or'' (``inclusive or'' or
- ``disjunction''). Each bit in the result is 1 if that bit is 1 in
- either @code{a} or @code{b}.
- @example
- 0b10101010 | 0b11001100 @result{} 0b11101110
- @end example
- @item @code{a} ^ @code{b}
- Binary operator for bitwise ``xor'' (``exclusive or''). Each bit in
- the result is 1 if that bit is 1 in exactly one of @code{a} and @code{b}.
- @example
- 0b10101010 ^ 0b11001100 @result{} 0b01100110
- @end example
- @end table
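The relationship between bitwise negation and arithmetic negation noted above can be tried out with a sketch like this. The function names are hypothetical, and two's-complement representation is assumed.

```c
/* Hypothetical helper: computes -X using bitwise negation.
   X must not be the most negative int, which would overflow.  */
int
negate_via_complement (int x)
{
  return ~x + 1;
}

/* Hypothetical helper: computes ~X using arithmetic negation.  */
int
complement_via_negation (int x)
{
  return -x - 1;
}
```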
- To understand the effect of these operators on signed integers, keep
- in mind that all modern computers use two's-complement representation
- (@pxref{Integer Representations}) for negative integers. This means
- that the highest bit of the number indicates the sign; it is 1 for a
- negative number and 0 for a positive number. In a negative number,
- the value in the other bits @emph{increases} as the number gets closer
- to zero, so that @code{0b111@r{@dots{}}111} is @minus{}1 and
- @code{0b100@r{@dots{}}000} is the most negative possible integer.
- @strong{Warning:} C defines a precedence ordering for the bitwise
- binary operators, but you should never rely on it. Likewise, never
- rely on how bitwise binary operators relate in precedence to the
- arithmetic and shift binary operators. Other programmers don't
- remember this precedence ordering, so always use parentheses to
- explicitly specify the nesting.
- For example, suppose @code{offset} is an integer that specifies
- the offset within shared memory of a table, except that its bottom few
- bits (@code{LOWBITS} says how many) are special flags. Here's
- how to get just that offset and add it to the base address.
- @example
- shared_mem_base + (offset & (-1 << LOWBITS))
- @end example
- Thanks to the outer set of parentheses, we don't need to know whether
- @samp{&} has higher precedence than @samp{+}. Thanks to the inner
- set, we don't need to know whether @samp{&} has higher precedence than
- @samp{<<}. But we can rely on all unary operators to have higher
- precedence than any binary operator, so we don't need parentheses
- around the left operand of @samp{<<}.
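As a variation, the masking step alone can be sketched with unsigned operands, building the mask as @code{~0u << lowbits} rather than @code{-1 << LOWBITS}. That substitution, the function name @code{clear_low_bits}, and the requirement that @code{lowbits} be less than the width of @code{unsigned int} are all assumptions of this example.

```c
/* Hypothetical helper: returns OFFSET with its LOWBITS lowest
   bits cleared.  LOWBITS must be less than the width in bits
   of unsigned int.  */
unsigned int
clear_low_bits (unsigned int offset, unsigned int lowbits)
{
  /* ~0u has all bits 1; shifting it left brings in zeros
     at the bottom, giving the mask.  */
  return offset & (~0u << lowbits);
}
```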
- @node Assignment Expressions
- @chapter Assignment Expressions
- @cindex assignment expressions
- @cindex operators, assignment
- As a general concept in programming, an @dfn{assignment} is a
- construct that stores a new value into a place where values can be
- stored---for instance, in a variable. Such places are called
- @dfn{lvalues} (@pxref{Lvalues}) because they are locations that hold a value.
- An assignment in C is an expression because it has a value; we call
- it an @dfn{assignment expression}. A simple assignment looks like
- @example
- @var{lvalue} = @var{value-to-store}
- @end example
- @noindent
- We say it assigns the value of the expression @var{value-to-store} to
- the location @var{lvalue}, or that it stores @var{value-to-store}
- there. You can think of the ``l'' in ``lvalue'' as standing for
- ``left,'' since that's what you put on the left side of the assignment
- operator.
- However, that's not the only way to use an lvalue, and not all lvalues
- can be assigned to. To use the lvalue in the left side of an
- assignment, it has to be @dfn{modifiable}. In C, that means it was
- not declared with the type qualifier @code{const} (@pxref{const}).
- The value of the assignment expression is that of @var{lvalue} after
- the new value is stored in it. This means you can use an assignment
- inside other expressions. Assignment operators are right-associative
- so that
- @example
- x = y = z = 0;
- @end example
- @noindent
- is equivalent to
- @example
- x = (y = (z = 0));
- @end example
- This is the only useful way for them to associate;
- the other way,
- @example
- ((x = y) = z) = 0;
- @end example
- @noindent
- would be invalid since an assignment expression such as @code{x = y}
- is not valid as an lvalue.
- @strong{Warning:} Write parentheses around an assignment if you nest
- it inside another expression, unless that is a conditional expression,
- or comma-separated series, or another assignment.
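The sketch below shows a chained assignment and a parenthesized assignment nested in a larger expression. The function names are hypothetical.

```c
/* Hypothetical helper: chained assignment stores 0 in all
   three variables, working from right to left.  */
int
chained_assignment_sum (void)
{
  int x = 1, y = 2, z = 3;
  x = y = z = 0;
  return x + y + z;
}

/* Hypothetical helper: uses the value of an assignment.  */
int
nested_assignment (void)
{
  int i;
  int j = (i = 5) + 1;   /* The assignment's value is 5, so j is 6.  */
  return i + j;
}
```

The first function should return 0 and the second 11.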
- @menu
- * Simple Assignment:: The basics of storing a value.
- * Lvalues:: Expressions into which a value can be stored.
- * Modifying Assignment:: Shorthand for changing an lvalue's contents.
- * Increment/Decrement:: Shorthand for incrementing and decrementing
- an lvalue's contents.
- * Postincrement/Postdecrement:: Accessing then incrementing or decrementing.
- * Assignment in Subexpressions:: How to avoid ambiguity.
- * Write Assignments Separately:: Write assignments as separate statements.
- @end menu
- @node Simple Assignment
- @section Simple Assignment
- @cindex simple assignment
- @cindex assignment, simple
- A @dfn{simple assignment expression} computes the value of the right
- operand and stores it into the lvalue on the left. Here is a simple
- assignment expression that stores 5 in @code{i}:
- @example
- i = 5
- @end example
- @noindent
- We say that this is an @dfn{assignment to} the variable @code{i} and
- that it @dfn{assigns} @code{i} the value 5. It has no semicolon
- because it is an expression (so it has a value). Adding a semicolon
- at the end would make it a statement (@pxref{Expression Statement}).
- Here is another example of a simple assignment expression. Its
- operands are not simple, but the kind of assignment done here is
- simple assignment.
- @example
- x[foo ()] = y + 6
- @end example
- A simple assignment with two different numeric data types converts the
- right operand value to the lvalue's type, if possible. It can convert
- any numeric type to any other numeric type.
- Simple assignment is also allowed on some non-numeric types: pointers
- (@pxref{Pointers}), structures (@pxref{Structure Assignment}), and
- unions (@pxref{Unions}).
- @strong{Warning:} Assignment is not allowed on arrays because
- there are no array values in C; C variables can be arrays, but these
- arrays cannot be manipulated as wholes. @xref{Limitations of C
- Arrays}.
- @xref{Assignment Type Conversions}, for the complete rules about data
- types used in assignments.
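Since structures can be assigned, one way to copy a small array as a whole is to place it inside a structure. This is only a sketch; the type @code{struct int_triple} and the function name are hypothetical.

```c
/* Hypothetical type: a structure whose only field is an array.  */
struct int_triple
{
  int elements[3];
};

/* Hypothetical helper: copies one structure to another with
   simple assignment and sums the copied elements.  */
int
copy_and_sum (void)
{
  struct int_triple source = { { 1, 2, 3 } };
  struct int_triple destination;

  destination = source;   /* Copies the array field too.  */
  return (destination.elements[0]
          + destination.elements[1]
          + destination.elements[2]);
}
```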
- @node Lvalues
- @section Lvalues
- @cindex lvalues
- An expression that identifies a memory space that holds a value is
- called an @dfn{lvalue}, because it is a location that can hold a value.
- The standard kinds of lvalues are:
- @itemize @bullet
- @item
- A variable.
- @item
- A pointer-dereference expression (@pxref{Pointer Dereference}) using
- unary @samp{*}.
- @item
- A structure field reference (@pxref{Structures}) using @samp{.}, if
- the structure value is an lvalue.
- @item
- A structure field reference using @samp{->}. This is always an lvalue
- since @samp{->} implies pointer dereference.
- @item
- A union alternative reference (@pxref{Unions}), on the same conditions
- as for structure fields.
- @item
- An array-element reference using @samp{[@r{@dots{}}]}, if the array
- is an lvalue.
- @end itemize
- If an expression's outermost operation is any other operator, that
- expression is not an lvalue. Thus, the variable @code{x} is an
- lvalue, but @code{x + 0} is not, even though these two expressions
- compute the same value (assuming @code{x} is a number).
- An array can be an lvalue (the rules above determine whether it is
- one), but using the array in an expression converts it automatically
- to a pointer to the first element. The result of this conversion is
- not an lvalue. Thus, if the variable @code{a} is an array, you can't
- use @code{a} by itself as the left operand of an assignment. But you
- can assign to an element of @code{a}, such as @code{a[0]}. That is an
- lvalue since @code{a} is an lvalue.
- @node Modifying Assignment
- @section Modifying Assignment
- @cindex modifying assignment
- @cindex assignment, modifying
- You can abbreviate the common construct
- @example
- @var{lvalue} = @var{lvalue} + @var{expression}
- @end example
- @noindent
- as
- @example
- @var{lvalue} += @var{expression}
- @end example
- This is known as a @dfn{modifying assignment}. For instance,
- @example
- i = i + 5;
- i += 5;
- @end example
- @noindent
- shows two statements that are equivalent. The first uses
- simple assignment; the second uses modifying assignment.
- Modifying assignment works with any binary arithmetic operator. For
- instance, you can subtract something from an lvalue like this,
- @example
- @var{lvalue} -= @var{expression}
- @end example
- @noindent
- or multiply it by a certain amount like this,
- @example
- @var{lvalue} *= @var{expression}
- @end example
- @noindent
- or shift it by a certain amount like this.
- @example
- @var{lvalue} <<= @var{expression}
- @var{lvalue} >>= @var{expression}
- @end example
- In most cases, this feature adds no power to the language, but it
- provides substantial convenience. Also, when @var{lvalue} contains
- code that has side effects, the simple assignment performs those side
- effects twice, while the modifying assignment performs them once. For
- instance,
- @example
- x[foo ()] = x[foo ()] + 5;
- @end example
- @noindent
- calls @code{foo} twice, and it could return different values each
- time. If @code{foo ()} returns 1 the first time and 3 the second
- time, then the effect could be to add @code{x[3]} and 5 and store the
- result in @code{x[1]}, or to add @code{x[1]} and 5 and store the
- result in @code{x[3]}. We don't know which of the two it will do,
- because C does not specify which call to @code{foo} is computed first.
- Such a statement is not well defined, and shouldn't be used.
- By contrast,
- @example
- x[foo ()] += 5;
- @end example
- @noindent
- is well defined: it calls @code{foo} only once to determine which
- element of @code{x} to adjust, and it adjusts that element by adding 5
- to it.
- @node Increment/Decrement
- @section Increment and Decrement Operators
- @cindex increment operator
- @cindex decrement operator
- @cindex operator, increment
- @cindex operator, decrement
- @cindex preincrement expression
- @cindex predecrement expression
- The operators @samp{++} and @samp{--} are the @dfn{increment} and
- @dfn{decrement} operators. When used on a numeric value, they add or
- subtract 1. We don't consider them assignments, but they are
- equivalent to assignments.
- Using @samp{++} or @samp{--} as a prefix, before an lvalue, is called
- @dfn{preincrement} or @dfn{predecrement}. This adds or subtracts 1
- and the result becomes the expression's value. For instance,
- @example
- #include <stdio.h> /* @r{Declares @code{printf}.} */
- int
- main (void)
- @{
- int i = 5;
- printf ("%d\n", i);
- printf ("%d\n", ++i);
- printf ("%d\n", i);
- return 0;
- @}
- @end example
- @noindent
- prints lines containing 5, 6, and 6 again. The expression @code{++i}
- increments @code{i} from 5 to 6, and has the value 6, so the output
- from @code{printf} on that line says @samp{6}.
- Using @samp{--} instead, for predecrement,
- @example
- #include <stdio.h> /* @r{Declares @code{printf}.} */
- int
- main (void)
- @{
- int i = 5;
- printf ("%d\n", i);
- printf ("%d\n", --i);
- printf ("%d\n", i);
- return 0;
- @}
- @end example
- @noindent
- prints three lines that contain (respectively) @samp{5}, @samp{4}, and
- again @samp{4}.
- @node Postincrement/Postdecrement
- @section Postincrement and Postdecrement
- @cindex postincrement expression
- @cindex postdecrement expression
- @cindex operator, postincrement
- @cindex operator, postdecrement
- Using @samp{++} or @samp{--} @emph{after} an lvalue does something
- peculiar: it gets the value directly out of the lvalue and @emph{then}
- increments or decrements it. Thus, the value of @code{i++} is the same
- as the value of @code{i}, but @code{i++} also increments @code{i} ``a
- little later.'' This is called @dfn{postincrement} or
- @dfn{postdecrement}.
- For example,
- @example
- #include <stdio.h> /* @r{Declares @code{printf}.} */
- int
- main (void)
- @{
- int i = 5;
- printf ("%d\n", i);
- printf ("%d\n", i++);
- printf ("%d\n", i);
- return 0;
- @}
- @end example
- @noindent
- prints lines containing 5, again 5, and 6. The expression @code{i++}
- has the value 5, which is the value of @code{i} at the time,
- but it increments @code{i} from 5 to 6 just a little later.
- How much later is ``just a little later''? That is flexible. The
- increment has to happen by the next @dfn{sequence point}. In simple cases,
- that means by the end of the statement. @xref{Sequence Points}.
- If a unary operator precedes a postincrement or postdecrement expression,
- the increment nests inside:
- @example
- -a++ @r{is equivalent to} -(a++)
- @end example
- That's the only order that makes sense; @code{-a} is not an lvalue, so
- it can't be incremented.
- @node Assignment in Subexpressions
- @section Pitfall: Assignment in Subexpressions
- @cindex assignment in subexpressions
- @cindex subexpressions, assignment in
- In C, the order of computing parts of an expression is not fixed.
- Aside from a few special cases, the operations can be computed in any
- order. If one part of the expression has an assignment to @code{x}
- and another part of the expression uses @code{x}, the result is
- unpredictable because that use might be computed before or after the
- assignment.
- Here's an example of ambiguous code:
- @example
- x = 20;
- printf ("%d %d\n", x, x = 4);
- @end example
- @noindent
- If the second argument, @code{x}, is computed before the third argument,
- @code{x = 4}, the second argument's value will be 20. If they are
- computed in the other order, the second argument's value will be 4.
- Here's one way to make that code unambiguous:
- @example
- y = 20;
- printf ("%d %d\n", y, x = 4);
- @end example
- Here's another way, with the other meaning:
- @example
- x = 4;
- printf ("%d %d\n", x, x);
- @end example
- This issue applies to all kinds of assignments, and to the increment
- and decrement operators, which are equivalent to assignments.
- @xref{Order of Execution}, for more information about this.
- However, it can be useful to write assignments inside an
- @code{if}-condition or @code{while}-test along with logical operators.
- @xref{Logicals and Assignments}.
- @node Write Assignments Separately
- @section Write Assignments in Separate Statements
- It is often convenient to write an assignment inside an
- @code{if}-condition, but that can reduce the readability of the
- program. Here's an example of what to avoid:
- @example
- if (x = advance (x))
- @r{@dots{}}
- @end example
- The idea here is to advance @code{x} and test if the value is nonzero.
- However, readers might miss the fact that it uses @samp{=} and not
- @samp{==}. In fact, writing @samp{=} where @samp{==} was intended
- inside a condition is a common error, so GNU C can give warnings when
- @samp{=} appears in a way that suggests it's an error.
- It is much clearer to write the assignment as a separate statement, like this:
- @example
- x = advance (x);
- if (x != 0)
- @r{@dots{}}
- @end example
- @noindent
- This makes it unmistakably clear that @code{x} is assigned a new value.
- Another method is to use the comma operator (@pxref{Comma Operator}),
- like this:
- @example
- if (x = advance (x), x != 0)
- @r{@dots{}}
- @end example
- @noindent
- However, putting the assignment in a separate statement is usually clearer
- unless the assignment is very short, because it reduces nesting.
- @node Execution Control Expressions
- @chapter Execution Control Expressions
- @cindex execution control expressions
- @cindex expressions, execution control
- This chapter describes the C operators that combine expressions to
- control which of those expressions execute, or in which order.
- @menu
- * Logical Operators:: Logical conjunction, disjunction, negation.
- * Logicals and Comparison:: Logical operators with comparison operators.
- * Logicals and Assignments:: Assignments with logical operators.
- * Conditional Expression:: An if/else construct inside expressions.
- * Comma Operator:: Build a sequence of subexpressions.
- @end menu
- @node Logical Operators
- @section Logical Operators
- @cindex logical operators
- @cindex operators, logical
- @cindex conjunction operator
- @cindex disjunction operator
- @cindex negation operator, logical
- The @dfn{logical operators} combine truth values, which are normally
- represented in C as numbers. Any expression with a numeric value is a
- valid truth value: zero means false, and any other value means true.
- A pointer type is also meaningful as a truth value; a null pointer
- (which is zero) means false, and a non-null pointer means true
- (@pxref{Pointer Types}). The value of a logical operator is always 1
- or 0 and has type @code{int} (@pxref{Integer Types}).
- The logical operators are used mainly in the condition of an @code{if}
- statement, or in the end test in a @code{for} statement or
- @code{while} statement (@pxref{Statements}). However, they are valid
- in any context where an integer-valued expression is allowed.
- @table @samp
- @item ! @var{exp}
- Unary operator for logical ``not.'' The value is 1 (true) if
- @var{exp} is 0 (false), and 0 (false) if @var{exp} is nonzero (true).
- @strong{Warning:} if @var{exp} is anything but an lvalue or a
- function call, you should write parentheses around it.
- @item @var{left} && @var{right}
- The logical ``and'' binary operator computes @var{left} and, if necessary,
- @var{right}. If both of the operands are true, the @samp{&&} expression
- gives the value 1 (which is true). Otherwise, the @samp{&&} expression
- gives the value 0 (false). If @var{left} yields a false value,
- that determines the overall result, so @var{right} is not computed.
- @item @var{left} || @var{right}
- The logical ``or'' binary operator computes @var{left} and, if necessary,
- @var{right}. If at least one of the operands is true, the @samp{||} expression
- gives the value 1 (which is true). Otherwise, the @samp{||} expression
- gives the value 0 (false). If @var{left} yields a true value,
- that determines the overall result, so @var{right} is not computed.
- @end table
- @strong{Warning:} never rely on the relative precedence of @samp{&&}
- and @samp{||}. When you use them together, always use parentheses to
- specify explicitly how they nest, as shown here:
- @example
- if ((r != 0 && x % r == 0)
- ||
- (s != 0 && x % s == 0))
- @end example
- @node Logicals and Comparison
- @section Logical Operators and Comparisons
- The most common thing to use inside the logical operators is a
- comparison. Conveniently, @samp{&&} and @samp{||} have lower
- precedence than comparison operators and arithmetic operators, so we
- can write expressions like this without parentheses and get the
- nesting that is natural: two comparison operations that must both be
- true.
- @example
- if (r != 0 && x % r == 0)
- @end example
- @noindent
- This example also shows how it is useful that @samp{&&} guarantees to
- skip the right operand if the left one turns out false. Because of
- that, this code never tries to divide by zero.
- This is equivalent:
- @example
- if (r && x % r == 0)
- @end example
- @noindent
- A truth value is simply a number, so @code{r}
- as a truth value tests whether it is nonzero.
- But @code{r}'s meaning is not a truth value---it is a number to divide by.
- So it is better style to write the explicit @code{!= 0}.
- Here's another equivalent way to write it:
- @example
- if (!(r == 0) && x % r == 0)
- @end example
- @noindent
- This illustrates the unary @samp{!} operator, and the need to
- write parentheses around its operand.
- @node Logicals and Assignments
- @section Logical Operators and Assignments
- There are cases where assignments nested inside the condition can
- actually make a program @emph{easier} to read. Here is an example
- using a hypothetical type @code{list} which represents a list; it
- tests whether the list has at least two links, using hypothetical
- functions, @code{nonempty} which is true if the argument is a nonempty
- list, and @code{list_next} which advances from one list link to the
- next. We assume that a list is never a null pointer, so that the
- assignment expressions are always ``true.''
- @example
- if (nonempty (list)
- && (temp1 = list_next (list))
- && nonempty (temp1)
- && (temp2 = list_next (temp1)))
- @r{@dots{}} /* @r{use @code{temp1} and @code{temp2}} */
- @end example
- @noindent
- Here we get the benefit of the @samp{&&} operator, to avoid executing
- the rest of the code if a call to @code{nonempty} says ``false.'' The
- only natural place to put the assignments is among those calls.
- It would be possible to rewrite this as several statements, but that
- could make it much more cumbersome. On the other hand, when the test
- is even more complex than this one, splitting it into multiple
- statements might be necessary for clarity.
- If an empty list is a null pointer, we can dispense with calling
- @code{nonempty}:
- @example
- if ((temp1 = list_next (list))
- && (temp2 = list_next (temp1)))
- @r{@dots{}}
- @end example
- @node Conditional Expression
- @section Conditional Expression
- @cindex conditional expression
- @cindex expression, conditional
- C has a conditional expression that selects one of two expressions
- to compute and get the value from. It looks like this:
- @example
- @var{condition} ? @var{iftrue} : @var{iffalse}
- @end example
- @menu
- * Conditional Rules:: Rules for the conditional operator.
- * Conditional Branches:: About the two branches in a conditional.
- @end menu
- @node Conditional Rules
- @subsection Rules for Conditional Operator
- The first operand, @var{condition}, should be a value that can be
- compared with zero---a number or a pointer. If it is true (nonzero),
- then the conditional expression computes @var{iftrue} and its value
- becomes the value of the conditional expression. Otherwise the
- conditional expression computes @var{iffalse} and its value becomes
- the value of the conditional expression. The conditional expression
- always computes just one of @var{iftrue} and @var{iffalse}, never both
- of them.
- Here's an example: the absolute value of a number @code{x}
- can be written as @code{(x >= 0 ? x : -x)}.
- @strong{Warning:} The conditional expression operators have rather low
- syntactic precedence. Except when the conditional expression is used
- as an argument in a function call, write parentheses around it. For
- clarity, always write parentheses around it if it extends across more
- than one line.
- Assignment operators and the comma operator (@pxref{Comma Operator})
- have lower precedence than conditional expression operators, so write
- parentheses around those when they appear inside a conditional
- expression. @xref{Order of Execution}.
- @node Conditional Branches
- @subsection Conditional Operator Branches
- @cindex branches of conditional expression
- We call @var{iftrue} and @var{iffalse} the @dfn{branches} of the
- conditional.
- The two branches should normally have the same type, but a few
- exceptions are allowed. If they are both numeric types, the
- conditional converts both to their common type (@pxref{Common Type}).
- With pointers (@pxref{Pointers}), the two values can be pointers to
- nearly compatible types (@pxref{Compatible Types}). In this case, the
- result type is a similar pointer whose target type combines all the
- type qualifiers (@pxref{Type Qualifiers}) of both branches.
- If one branch has type @code{void *} and the other is a pointer to an
- object (not to a function), the conditional converts the @code{void *}
- branch to the type of the other.
- If one branch is an integer constant with value zero and the other is
- a pointer, the conditional converts zero to the pointer's type.
- In GNU C, you can omit @var{iftrue} in a conditional expression. In
- that case, if @var{condition} is nonzero, its value becomes the value of
- the conditional expression, after conversion to the common type.
- Thus,
- @example
- x ? : y
- @end example
- @noindent
- has the value of @code{x} if that is nonzero; otherwise, the value of
- @code{y}.
- @cindex side effect in ?:
- @cindex ?: side effect
- Omitting @var{iftrue} is useful when @var{condition} has side effects.
- In that case, writing that expression twice would carry out the side
- effects twice, but writing it once does them just once. For example,
- if we suppose that the function @code{next_element} advances a pointer
- variable to point to the next element in a list and returns the new
- pointer,
- @example
- next_element () ? : default_pointer
- @end example
- @noindent
- is a way to advance the pointer and use its new value if it isn't
- null, but use @code{default_pointer} if that is null. We must not do
- it this way,
- @example
- next_element () ? next_element () : default_pointer
- @end example
- @noindent
- because it would advance the pointer a second time.
- @node Comma Operator
- @section Comma Operator
- @cindex comma operator
- @cindex operator, comma
- The comma operator stands for sequential execution of expressions.
- The value of the comma expression comes from the last expression in
- the sequence; the previous expressions are computed only for their
- side effects. It looks like this:
- @example
- @var{exp1}, @var{exp2} @r{@dots{}}
- @end example
- @noindent
- You can bundle any number of expressions together this way, by putting
- commas between them.
- @menu
- * Uses of Comma:: When to use the comma operator.
- * Clean Comma:: Clean use of the comma operator.
- * Avoid Comma:: When to not use the comma operator.
- @end menu
- @node Uses of Comma
- @subsection The Uses of the Comma Operator
- With commas, you can put several expressions into a place that
- requires just one expression---for example, in the header of a
- @code{for} statement. This statement
- @example
- for (i = 0, j = 10, k = 20; i < n; i++)
- @end example
- @noindent
- contains three assignment expressions, to initialize @code{i}, @code{j}
- and @code{k}. The syntax of @code{for} requires just one expression
- for initialization; to include three assignments, we use commas to
- bundle them into a single larger expression, @code{i = 0, j = 10, k =
- 20}. This technique is also useful in the loop-advance expression,
- the last of the three inside the @code{for} parentheses.
- In the @code{for} statement and the @code{while} statement
- (@pxref{Loop Statements}), a comma provides a way to perform some side
- effect before the loop-exit test. For example,
- @example
- while (printf ("At the test, x = %d\n", x), x != 0)
- @end example
- @node Clean Comma
- @subsection Clean Use of the Comma Operator
- Always write parentheses around a series of comma operators, except
- when it is at top level in an expression statement, or within the
- parentheses of an @code{if}, @code{for}, @code{while}, or @code{switch}
- statement (@pxref{Statements}). For instance, in
- @example
- for (i = 0, j = 10, k = 20; i < n; i++)
- @end example
- @noindent
- the commas between the assignments are clear because they are between
- a parenthesis and a semicolon.
- The arguments in a function call are also separated by commas, but that is
- not an instance of the comma operator. Note the difference between
- @example
- foo (4, 5, 6)
- @end example
- @noindent
- which passes three arguments to @code{foo} and
- @example
- foo ((4, 5, 6))
- @end example
- @noindent
- which uses the comma operator and passes just one argument
- (with value 6).
- @strong{Warning:} don't use the comma operator around an argument
- of a function unless it helps understand the code. When you do so,
- don't put part of another argument on the same line. Instead, add a
- line break to make the parentheses around the comma operator easier to
- see, like this.
- @example
- foo ((mumble (x, y), frob (z)),
- *p)
- @end example
- @node Avoid Comma
- @subsection When Not to Use the Comma Operator
- You can use a comma in any subexpression, but in most cases it only
- makes the code confusing, and it is clearer to raise all but the last
- of the comma-separated expressions to a higher level. Thus, instead
- of this:
- @example
- x = (y += 4, 8);
- @end example
- @noindent
- it is much clearer to write this:
- @example
- y += 4, x = 8;
- @end example
- @noindent
- or this:
- @example
- y += 4;
- x = 8;
- @end example
- Use commas only in the cases where there is no clearer alternative
- involving multiple statements.
- By contrast, don't hesitate to use commas in the expansion in a macro
- definition. The trade-offs of code clarity are different in that
- case, because the @emph{use} of the macro may improve overall clarity
- so much that the ugliness of the macro's @emph{definition} is a small
- price to pay. @xref{Macros}.
- @node Binary Operator Grammar
- @chapter Binary Operator Grammar
- @cindex binary operator grammar
- @cindex grammar, binary operator
- @cindex operator precedence
- @cindex precedence, operator
- @cindex left-associative
- @dfn{Binary operators} are those that take two operands, one
- on the left and one on the right.
- Apart from the assignment operators, which associate right to left,
- all the binary operators in C are syntactically left-associative.
- This means that @w{@code{a @var{op} b @var{op} c}} means @w{@code{(a
- @var{op} b) @var{op} c}}. However, you should only write repeated
- operators without parentheses using @samp{+}, @samp{-}, @samp{*} and
- @samp{/}, because those cases are clear from algebra. So it is ok to
- write @code{a + b + c} or @code{a - b - c}, but never @code{a == b ==
- c} or @code{a % b % c}.
- Each C operator has a @dfn{precedence}, which is its rank in the
- grammatical order of the various operators. The operators with the
- highest precedence grab adjoining operands first; these expressions
- then become operands for operators of lower precedence.
- The precedence order of operators in C is fully specified, so any
- combination of operations leads to a well-defined nesting. We state
- only part of the full precedence ordering here because it is bad
- practice for C code to depend on the other cases. For cases not
- specified in this chapter, always use parentheses to make the nesting
- explicit.@footnote{Personal note from Richard Stallman: I wrote GCC without
- remembering anything about the C precedence order beyond what's stated
- here. I studied the full precedence table to write the parser, and
- promptly forgot it again. If you need to look up the full precedence order
- to understand some C code, fix the code with parentheses so nobody else
- needs to do that.}
- You can depend on this subsequence of the precedence ordering
- (stated from highest precedence to lowest):
- @enumerate
- @item
- Component access (@samp{.} and @samp{->}).
- @item
- Unary postfix operators.
- @item
- Unary prefix operators.
- @item
- Multiplication, division, and remainder (they have the same precedence).
- @item
- Addition and subtraction (they have the same precedence).
- @item
- Comparisons---but watch out!
- @item
- Logical operators @samp{&&} and @samp{||}---but watch out!
- @item
- Conditional expression with @samp{?} and @samp{:}.
- @item
- Assignments.
- @item
- Sequential execution (the comma operator, @samp{,}).
- @end enumerate
- Two of the lines in the above list say ``but watch out!'' That means
- that the line covers operators with subtly different precedence.
- Never depend on the grammar of C to decide how two comparisons nest;
- instead, always use parentheses to specify their nesting.
- You can let several @samp{&&} operators associate, or several
- @samp{||} operators, but always use parentheses to show how @samp{&&}
- and @samp{||} nest with each other. @xref{Logical Operators}.
- There is one other precedence ordering that code can depend on:
- @enumerate
- @item
- Unary postfix operators.
- @item
- Bitwise and shift operators---but watch out!
- @item
- Conditional expression with @samp{?} and @samp{:}.
- @end enumerate
- The caveat for bitwise and shift operators is like that for logical
- operators: you can let multiple uses of one bitwise operator
- associate, but always use parentheses to control nesting of dissimilar
- operators.
- These lists do not specify any precedence ordering between the bitwise
- and shift operators of the second list and the binary operators above
- conditional expressions in the first list. When they come together,
- parenthesize them. @xref{Bitwise Operations}.
- @node Order of Execution
- @chapter Order of Execution
- @cindex order of execution
- The order of execution of a C program is not always obvious, and not
- necessarily predictable. This chapter describes what you can count on.
- @menu
- * Reordering of Operands:: Operations in C are not necessarily computed
- in the order they are written.
- * Associativity and Ordering:: Some associative operations are performed
- in a particular order; others are not.
- * Sequence Points:: Some guarantees about the order of operations.
- * Postincrement and Ordering:: Ambiguous execution order with postincrement.
- * Ordering of Operands:: Evaluation order of operands
- and function arguments.
- * Optimization and Ordering:: Compiler optimizations can reorder operations
- only if it has no impact on program results.
- @end menu
- @node Reordering of Operands
- @section Reordering of Operands
- @cindex ordering of operands
- @cindex reordering of operands
- @cindex operand execution ordering
- The C language does not necessarily carry out operations within an
- expression in the order they appear in the code. For instance, in
- this expression,
- @example
- foo () + bar ()
- @end example
- @noindent
- @code{foo} might be called first or @code{bar} might be called first.
- If @code{foo} updates a datum and @code{bar} uses that datum, the
- results can be unpredictable.
- The unpredictable order of computation of subexpressions also makes a
- difference when one of them contains an assignment. We already saw
- this example of bad code,
- @example
- x = 20;
- printf ("%d %d\n", x, x = 4);
- @end example
- @noindent
- in which the second argument, @code{x}, has a different value
- depending on whether it is computed before or after the assignment in
- the third argument.
- @node Associativity and Ordering
- @section Associativity and Ordering
- @cindex associativity and ordering
- An associative binary operator, such as @code{+}, when used repeatedly
- can combine any number of operands. The operands' values may be
- computed in any order.
- If the values are integers and overflow can be ignored, they may be
- combined in any order. Thus, given four functions that return
- @code{unsigned int}, calling them and adding their results as here
- @example
- (foo () + bar ()) + (baz () + quux ())
- @end example
- @noindent
- may add up the results in any order.
- By contrast, arithmetic on signed integers, with overflow significant,
- is not really associative (@pxref{Integer Overflow}). Thus, the
- additions must be done in the order specified, obeying parentheses and
- left-association. That means computing @code{(foo () + bar ())} and
- @code{(baz () + quux ())} first (in either order), then adding the
- two.
- The same applies to arithmetic on floating-point values, since that
- too is not really associative. However, the GCC option
- @option{-funsafe-math-optimizations} allows the compiler to change the
- order of calculation when an associative operation (associative in
- exact mathematics) combines several operands. The option takes effect
- when compiling a module (@pxref{Compilation}). Changing the order
- of association can enable the program to pipeline the floating point
- operations.
- In all these cases, the four function calls can be done in any order.
- There is no right or wrong about that.
- @node Sequence Points
- @section Sequence Points
- @cindex sequence points
- @cindex full expression
- There are some points in the code where C makes limited guarantees
- about the order of operations. These are called @dfn{sequence
- points}. Here is where they occur:
- @itemize @bullet
- @item
- At the end of a @dfn{full expression}; that is to say, an expression
- that is not part of a larger expression. All side effects specified
- by that expression are carried out before execution moves
- on to subsequent code.
- @item
- At the end of the first operand of certain operators: @samp{,},
- @samp{&&}, @samp{||}, and @samp{?:}. All side effects specified by
- that expression are carried out before any execution of the
- next operand.
- The commas that separate arguments in a function call are @emph{not}
- comma operators, and they do not create sequence points. The rule
- for function arguments and the rule for operands are different
- (@pxref{Ordering of Operands}).
- @item
- Just before calling a function. All side effects specified by the
- argument expressions are carried out before calling the function.
- If the function to be called is not constant---that is, if it is
- computed by an expression---all side effects in that expression are
- carried out before calling the function.
- @end itemize
- The ordering imposed by a sequence point applies locally to a limited
- range of code, as stated above in each case. For instance, the
- ordering imposed by the comma operator does not apply to code outside
- that comma operator. Thus, in this code,
- @example
- (x = 5, foo (x)) + x * x
- @end example
- @noindent
- the sequence point of the comma operator orders @code{x = 5} before
- @code{foo (x)}, but @code{x * x} could be computed before or after
- them.
- @node Postincrement and Ordering
- @section Postincrement and Ordering
- @cindex postincrement and ordering
- @cindex ordering and postincrement
- Ordering requirements are loose with the postincrement and
- postdecrement operations (@pxref{Postincrement/Postdecrement}), which
- specify side effects to happen ``a little later.'' They must happen
- before the next sequence point, but that still leaves room for various
- meanings. In this expression,
- @example
- z = x++ - foo ()
- @end example
- @noindent
- it's unpredictable whether @code{x} gets incremented before or after
- calling the function @code{foo}. If @code{foo} refers to @code{x},
- it might see the old value or it might see the incremented value.
- In this perverse expression,
- @example
- x = x++
- @end example
- @noindent
- @code{x} will certainly be incremented but the incremented value may
- not stick. If the incrementation of @code{x} happens after the
- assignment to @code{x}, the incremented value will remain in place.
- But if the incrementation happens first, the assignment will overwrite
- that with the not-yet-incremented value, so the expression as a whole
- will leave @code{x} unchanged.
- @node Ordering of Operands
- @section Ordering of Operands
- @cindex ordering of operands
- @cindex operand ordering
- Operands and arguments can be computed in any order, but there are limits to
- this intermixing in GNU C:
- @itemize @bullet
- @item
- The operands of a binary arithmetic operator can be computed in either
- order, but they can't be intermixed: one of them has to come first,
- followed by the other. Any side effects in the operand that's computed
- first are executed before the other operand is computed.
- @item
- That applies to assignment operators too, except that in simple assignment
- the previous value of the left operand is unused.
- @item
- The arguments in a function call can be computed in any order, but
- they can't be intermixed. Thus, one argument is fully computed, then
- another, and so on until they are all done. Any side effects in one argument
- are executed before computation of another argument begins.
- @end itemize
- These rules don't cover side effects caused by postincrement and
- postdecrement operators---those can be deferred up to the next
- sequence point.
- If you want to get pedantic, the fact is that GCC can reorder the
- computations in many other ways provided that doesn't alter the result
- of running the program. However, because they don't alter the result
- of running the program, they are negligible, unless you are concerned
- with the values in certain variables at various times as seen by other
- processes. In those cases, you can use @code{volatile} to prevent
- optimizations that would make them behave strangely. @xref{volatile}.
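- When the order of two operands matters, we can compute each one in a
- statement of its own. The following sketch assumes a hypothetical
- function @code{next_value} whose result depends on how many times it
- has been called.

```c
static int counter = 0;

/* Hypothetical function with a side effect: each call returns
   the next integer in sequence.  */
static int
next_value (void)
{
  return ++counter;
}

/* In next_value () - next_value (), either call could run first.
   Separate statements fix the order of the two calls.  */
int
first_minus_second (void)
{
  int first = next_value ();
  int second = next_value ();
  return first - second;
}
```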
- @node Optimization and Ordering
- @section Optimization and Ordering
- @cindex optimization and ordering
- @cindex ordering and optimization
- Sequence points limit the compiler's freedom to reorder operations
- arbitrarily, but optimizations can still reorder them if the compiler
- concludes that this won't alter the results. Thus, in this code,
- @example
- x++;
- y = z;
- x++;
- @end example
- @noindent
- there is a sequence point after each statement, so the code is
- supposed to increment @code{x} once before the assignment to @code{y}
- and once after. However, incrementing @code{x} has no effect on
- @code{y} or @code{z}, and setting @code{y} can't affect @code{x}, so
- the code could be optimized into this:
- @example
- y = z;
- x += 2;
- @end example
- Normally that has no effect except to make the program faster. But
- there are special situations where it can cause trouble due to things
- that the compiler cannot know about, such as shared memory. To limit
- optimization in those places, use the @code{volatile} type qualifier
- (@pxref{volatile}).
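- As a rough sketch of that advice, suppose the variable being
- incremented is watched from elsewhere. The names @code{progress},
- @code{copy_with_progress} and @code{current_progress} are hypothetical.
- This shows only the qualifier's effect on the compiler; it is not a
- complete synchronization scheme.

```c
/* Hypothetical counter that something outside this code is assumed
   to watch, for instance through shared memory.  */
static volatile int progress;

/* Because progress is volatile, the compiler must perform each
   access to it, in order relative to the other accesses to it;
   it may not merge the two updates into a single progress += 2.  */
void
copy_with_progress (int *y, int z)
{
  progress = progress + 1;
  *y = z;
  progress = progress + 1;
}

/* Read the counter back.  */
int
current_progress (void)
{
  return progress;
}
```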
- @node Primitive Types
- @chapter Primitive Data Types
- @cindex primitive types
- @cindex types, primitive
- This chapter describes all the primitive data types of C---that is,
- all the data types that aren't built up from other types. They
- include the types @code{int} and @code{double} that we've already covered.
- @menu
- * Integer Types:: Description of integer types.
- * Floating-Point Data Types:: Description of floating-point types.
- * Complex Data Types:: Description of complex number types.
- * The Void Type:: A type indicating no value at all.
- * Other Data Types:: A brief summary of other types.
- * Type Designators:: Referring to a data type abstractly.
- @end menu
- These types are all made up of bytes (@pxref{Storage}).
- @node Integer Types
- @section Integer Data Types
- @cindex integer types
- @cindex types, integer
- Here we describe all the integer types and their basic
- characteristics. @xref{Integers in Depth}, for more information about
- the bit-level integer data representations and arithmetic.
- @menu
- * Basic Integers:: Overview of the various kinds of integers.
- * Signed and Unsigned Types:: Integers can either hold both negative and
- non-negative values, or only non-negative.
- * Narrow Integers:: When to use smaller integer types.
- * Integer Conversion:: Casting a value from one integer type
- to another.
- * Boolean Type:: An integer type for boolean values.
- * Integer Variations:: Sizes of integer types can vary
- across platforms.
- @end menu
- @node Basic Integers
- @subsection Basic Integers
- @findex char
- @findex int
- @findex short int
- @findex long int
- @findex long long int
- Integer data types in C can be signed or unsigned. An unsigned type
- can represent only positive numbers and zero. A signed type can
- represent both positive and negative numbers, in a range spread almost
- equally on both sides of zero.
- Aside from signedness, the integer data types vary in size: how many
- bytes long they are. The size determines how many different integer
- values the type can hold.
- Here's a list of the signed integer data types, with the sizes they
- have on most computers. Each has a corresponding unsigned type; see
- @ref{Signed and Unsigned Types}.
- @table @code
- @item signed char
- One byte (8 bits). This integer type is used mainly for integers that
- represent characters, as part of arrays or other data structures.
- @item short
- @itemx short int
- Two bytes (16 bits).
- @item int
- Four bytes (32 bits).
- @item long
- @itemx long int
- Four bytes (32 bits) or eight bytes (64 bits), depending on the
- platform. Typically it is 32 bits on 32-bit computers
- and 64 bits on 64-bit computers, but there are exceptions.
- @item long long
- @itemx long long int
- Eight bytes (64 bits). Supported in GNU C in the 1980s, and
- incorporated into standard C as of ISO C99.
- @end table
- You can omit @code{int} when you use @code{long} or @code{short}.
- This is harmless and customary.
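- To see the sizes on a particular platform, we can compute them with
- @code{sizeof}. This is a small sketch; the macro @code{BITS_OF} and
- the function @code{sizes_are_nondecreasing} are hypothetical names,
- not part of C.

```c
#include <limits.h>   /* CHAR_BIT: the number of bits in a byte */

/* Size in bits of the type named by TYPE.  */
#define BITS_OF(type) (sizeof (type) * CHAR_BIT)

/* Nonzero if each signed type in the table is at least as large
   as the one listed before it, which holds on ordinary platforms.  */
int
sizes_are_nondecreasing (void)
{
  return (BITS_OF (signed char) <= BITS_OF (short)
          && BITS_OF (short) <= BITS_OF (int)
          && BITS_OF (int) <= BITS_OF (long)
          && BITS_OF (long) <= BITS_OF (long long));
}
```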
- @node Signed and Unsigned Types
- @subsection Signed and Unsigned Types
- @cindex signed types
- @cindex unsigned types
- @cindex types, signed
- @cindex types, unsigned
- @findex signed
- @findex unsigned
- An unsigned integer type can represent only positive numbers and zero.
- A signed type can represent both positive and negative numbers, in a
- range spread almost equally on both sides of zero. For instance,
- @code{unsigned char} holds numbers from 0 to 255 (on most computers),
- while @code{signed char} holds numbers from @minus{}128 to 127. Each of
- these types holds 256 different possible values, since they are both 8
- bits wide.
- Write @code{signed} or @code{unsigned} before the type keyword to
- specify a signed or an unsigned type. However, the integer types
- other than @code{char} are signed by default; with them, @code{signed}
- is a no-op.
- Plain @code{char} may be signed or unsigned; this depends on the
- compiler, the machine in use, and its operating system.
- In many programs, it makes no difference whether @code{char} is
- signed. When it does matter, don't leave it to chance; write
- @code{signed char} or @code{unsigned char}.@footnote{Personal note from
- Richard Stallman: Eating with hackers at a fish restaurant, I ordered
- Arctic Char. When my meal arrived, I noted that the chef had not
- signed it. So I complained, ``This char is unsigned---I wanted a
- signed char!'' Or rather, I would have said this if I had thought of
- it fast enough.}
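- A program can find out how plain @code{char} behaves by looking at
- @code{CHAR_MIN} from @file{limits.h}. The sketch below uses two
- hypothetical helper functions, @code{plain_char_is_signed} and
- @code{char_code}.

```c
#include <limits.h>

/* Nonzero if plain char is a signed type on this platform.  */
int
plain_char_is_signed (void)
{
  return CHAR_MIN < 0;
}

/* Convert a stored character to a nonnegative code, whichever
   way plain char happens to behave.  */
int
char_code (char c)
{
  return (unsigned char) c;
}
```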
- @node Narrow Integers
- @subsection Narrow Integers
- The types that are narrower than @code{int} are rarely used for
- ordinary variables---we declare them @code{int} instead. This is
- because C converts those narrower types to @code{int} for any
- arithmetic. There is literally no reason to declare a local variable
- @code{char}, for instance.
- In particular, if the value is really a character, you should declare
- the variable @code{int}. Not @code{char}! Using that narrow type can
- force the compiler to truncate values for conversion, which is a
- waste. Furthermore, some functions return either a character value,
- or @minus{}1 for ``no character.'' Using @code{int} keeps those
- values distinct.
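- Here is a sketch of why @code{int} matters for such functions. The
- reader function @code{next_byte} is hypothetical; it stands in for
- any function that returns either a character code or @minus{}1.

```c
#include <stddef.h>

/* Hypothetical reader: return the next byte of BUF as a nonnegative
   value, or -1 for "no character" once *POS reaches LEN.  */
int
next_byte (const unsigned char *buf, size_t len, size_t *pos)
{
  if (*pos >= len)
    return -1;
  return buf[(*pos)++];
}

/* Count the bytes in BUF.  The variable c is an int, so a byte
   with all bits set stays distinct from the end marker -1.  */
size_t
count_bytes (const unsigned char *buf, size_t len)
{
  size_t pos = 0, count = 0;
  int c;

  while ((c = next_byte (buf, len, &pos)) != -1)
    count++;
  return count;
}
```

- If @code{c} were declared @code{char}, a byte with all bits set
- could compare equal to @minus{}1 on a platform where @code{char} is
- signed, and the loop would stop too early.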
- The narrow integer types are useful as parts of other objects, such as
- arrays and structures. Compare these array declarations, whose sizes
- on 32-bit processors are shown:
- @example
- signed char ac[1000]; /* @r{1000 bytes} */
- short as[1000]; /* @r{2000 bytes} */
- int ai[1000]; /* @r{4000 bytes} */
- long long all[1000]; /* @r{8000 bytes} */
- @end example
- In addition, character strings must be made up of @code{char}s,
- because that's what all the standard library string functions expect.
- Thus, array @code{ac} could be used as a character string, but the
- others could not be.
- @node Integer Conversion
- @subsection Conversion among Integer Types
- C converts between integer types implicitly in many situations. It
- converts the narrow integer types, @code{char} and @code{short}, to
- @code{int} whenever they are used in arithmetic. Assigning a new
- value to an integer variable (or other lvalue) converts the value to
- the variable's type.
- You can also convert one integer type to another explicitly with a
- @dfn{cast} operator. @xref{Explicit Type Conversion}.
- The process of conversion to a wider type is straightforward: the
- value is unchanged. The only exception is when converting a negative
- value (in a signed type, obviously) to a wider unsigned type. In that
- case, the result is a positive value: the original bits are kept and
- the added high-order bits are all 1
- (@pxref{Integers in Depth}).
- @cindex truncation
- Converting to a narrower type, also called @dfn{truncation}, involves
- discarding some of the value's bits. This is not considered overflow
- (@pxref{Integer Overflow}) because loss of significant bits is a
- normal consequence of truncation. Likewise for conversion between
- signed and unsigned types of the same width.
- More information about conversion for assignment is in
- @ref{Assignment Type Conversions}. For conversion for arithmetic,
- see @ref{Argument Promotions}.
- @node Boolean Type
- @subsection Boolean Type
- @cindex boolean type
- @cindex type, boolean
- @findex bool
- The unsigned integer type @code{bool} holds truth values: its possible
- values are 0 and 1. Converting any nonzero value to @code{bool}
- results in 1. For example:
- @example
- bool a = 0;
- bool b = 1;
- bool c = 4; /* @r{Stores the value 1 in @code{c}.} */
- @end example
- Unlike @code{int}, @code{bool} is not a keyword. It is defined in
- the header file @file{stdbool.h}.
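- The sketch below contrasts conversion to @code{bool} with conversion
- to a narrow integer type. The function names @code{as_bool} and
- @code{as_byte} are hypothetical, and the second comment assumes 8-bit
- bytes.

```c
#include <stdbool.h>

/* Conversion to bool yields 1 for every nonzero value.  */
bool
as_bool (int value)
{
  return value;
}

/* Conversion to unsigned char keeps the low-order bits instead,
   so a nonzero value such as 256 becomes 0 with 8-bit bytes.  */
unsigned char
as_byte (int value)
{
  return (unsigned char) value;
}
```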
- @node Integer Variations
- @subsection Integer Variations
- The integer types of C have standard @emph{names}, but what they
- @emph{mean} varies depending on the kind of platform in use:
- which kind of computer, which operating system, and which compiler.
- It may even depend on the compiler options used.
- Plain @code{char} may be signed or unsigned; this depends on the
- platform, too. Even for GNU C, there is no general rule.
- In theory, all of the integer types' sizes can vary. @code{char} is
- always considered one ``byte'' for C, but it is not necessarily an
- 8-bit byte; on some platforms it may be more than 8 bits. ISO C
- specifies only that none of these types is narrower than the ones
- above it in the list in @ref{Basic Integers}, and that @code{short}
- has at least 16 bits.
- It is possible that in the future GNU C will support platforms where
- @code{int} is 64 bits long. In practice, however, on today's real
- computers, there is little variation; you can rely on the table
- given previously (@pxref{Basic Integers}).
- To be completely sure of the size of an integer type,
- use the types @code{int16_t}, @code{int32_t} and @code{int64_t}.
- Their corresponding unsigned types add @samp{u} at the front.
- To define these, include the header file @file{stdint.h}.
- The GNU C Compiler compiles for some embedded controllers that use two
- bytes for @code{int}. On some, @code{int} is just one ``byte,'' and
- so is @code{short int}---but that ``byte'' may contain 16 bits or even
- 32 bits. These processors can't support an ordinary operating system
- (they may have their own specialized operating systems), and most C
- programs do not try to support them.
- @node Floating-Point Data Types
- @section Floating-Point Data Types
- @cindex floating-point types
- @cindex types, floating-point
- @findex double
- @findex float
- @findex long double
- @dfn{Floating point} is the binary analogue of scientific notation:
- internally it represents a number as a fraction and a binary exponent; the
- value is that fraction multiplied by the specified power of 2.
- For instance, to represent 6, the fraction would be 0.75 and the
- exponent would be 3; together they stand for the value @math{0.75 * 2@sup{3}},
- meaning 0.75 * 8. The value 1.5 would use 0.75 as the fraction and 1
- as the exponent. The value 0.75 would use 0.75 as the fraction and 0
- as the exponent. The value 0.375 would use 0.75 as the fraction and
- -1 as the exponent.
- These binary exponents are used by machine instructions. You can
- write a floating-point constant this way if you wish, using
- hexadecimal; but normally we write floating-point numbers in decimal.
- @xref{Floating Constants}.
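- The standard library functions @code{frexp} and @code{ldexp},
- declared in @file{math.h}, split a value into such a fraction and
- exponent and put them back together. The wrapper names
- @code{fraction_of} and @code{rebuild} in this sketch are hypothetical.

```c
#include <math.h>

/* Split VALUE into a fraction and a power of 2, so that VALUE
   equals the fraction times 2 to the power *EXPONENT.  For a
   nonzero finite VALUE, the fraction's magnitude is at least 0.5
   and less than 1.  */
double
fraction_of (double value, int *exponent)
{
  return frexp (value, exponent);
}

/* The reverse operation: FRACTION times 2 to the power EXPONENT.  */
double
rebuild (double fraction, int exponent)
{
  return ldexp (fraction, exponent);
}
```

- For instance, @code{fraction_of (6.0, &e)} returns 0.75 and sets
- @code{e} to 3, matching the first example in the text.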
- C has three floating-point data types:
- @table @code
- @item double
- ``Double-precision'' floating point, which uses 64 bits. This is the
- normal floating-point type, and modern computers normally do
- their floating-point computations in this type, or some wider type.
- Except when there is a special reason to do otherwise, this is the
- type to use for floating-point values.
- @item float
- ``Single-precision'' floating point, which uses 32 bits. It is useful
- for floating-point values stored in structures and arrays, to save
- space when the full precision of @code{double} is not needed. In
- addition, single-precision arithmetic is faster on some computers, and
- occasionally that is useful. But not often---most programs don't use
- the type @code{float}.
- C would be cleaner if @code{float} were the name of the type we
- use for most floating-point values; however, for historical reasons,
- that's not so.
- @item long double
- ``Extended-precision'' floating point is either 80-bit or 128-bit
- precision, depending on the machine in use. On some machines, which
- have no floating-point format wider than @code{double}, this is
- equivalent to @code{double}.
- @end table
- Floating-point arithmetic raises many subtle issues. @xref{Floating
- Point in Depth}, for more information.
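- The header file @file{float.h} describes the precision of each type
- on the platform in use. In this sketch, the three function names are
- hypothetical; the macros they return are standard.

```c
#include <float.h>

/* Number of digits, in base FLT_RADIX (normally 2), that the
   fraction of each floating-point type can hold.  */
int
fraction_digits_float (void)
{
  return FLT_MANT_DIG;
}

int
fraction_digits_double (void)
{
  return DBL_MANT_DIG;
}

int
fraction_digits_long_double (void)
{
  return LDBL_MANT_DIG;
}
```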
- @node Complex Data Types
- @section Complex Data Types
- @cindex complex numbers
- @cindex types, complex
- @cindex @code{_Complex} keyword
- @cindex @code{__complex__} keyword
- @findex _Complex
- @findex __complex__
- Complex numbers can include both a real part and an imaginary part.
- The numeric constants covered above have real-numbered values. An
- imaginary-valued constant is an ordinary real-valued constant followed
- by @samp{i}.
- To declare numeric variables as complex, use the @code{_Complex}
- keyword.@footnote{For compatibility with older versions of GNU C, the
- keyword @code{__complex__} is also allowed. Going forward, however,
- use the @code{_Complex} keyword, which standard C has defined since ISO C99.} The
- standard C complex data types are floating point,
- @example
- _Complex float foo;
- _Complex double bar;
- _Complex long double quux;
- @end example
- @noindent
- but GNU C supports integer complex types as well.
- Since @code{_Complex} is a keyword just like @code{float} and
- @code{double} and @code{long}, the keywords can appear in any order,
- but the order shown above seems most logical.
- GNU C supports constants for complex values; for instance, @code{4.0 +
- 3.0i} has the value 4 + 3i as type @code{_Complex double}.
- @xref{Imaginary Constants}.
- To pull the real and imaginary parts of the number back out, GNU C
- provides the keywords @code{__real__} and @code{__imag__}:
- @example
- _Complex double foo = 4.0 + 3.0i;
- double a = __real__ foo; /* @r{@code{a} is now 4.0.} */
- double b = __imag__ foo; /* @r{@code{b} is now 3.0.} */
- @end example
- @noindent
- Standard C does not include these keywords, and instead relies on
- functions declared in @file{complex.h} for accessing the real and
- imaginary parts of a complex number: @code{crealf}, @code{creal}, and
- @code{creall} extract the real part of a float, double, or long double
- complex number, respectively; @code{cimagf}, @code{cimag}, and
- @code{cimagl} extract the imaginary part.
- @cindex complex conjugation
- GNU C also defines @samp{~} as an operator for complex conjugation,
- which means negating the imaginary part of a complex number:
- @example
- _Complex double foo = 4.0 + 3.0i;
- _Complex double bar = ~foo; /* @r{@code{bar} is now 4 @minus{} 3i.} */
- @end example
- @noindent
- For standard C compatibility, you can use the appropriate library
- function: @code{conjf}, @code{conj}, or @code{conjl}.
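- For comparison, here is a sketch that sticks to standard C, using
- the macros @code{complex} and @code{I} from @file{complex.h}. The
- function names are hypothetical. On some systems, programs that use
- functions from @file{complex.h} must be linked with the math library.

```c
#include <complex.h>

/* Build the complex number RE + IM i using the standard macro I.  */
double complex
make_complex (double re, double im)
{
  return re + im * I;
}

/* Standard counterpart of __real__.  */
double
real_part (double complex z)
{
  return creal (z);
}

/* Standard counterpart of __imag__.  */
double
imag_part (double complex z)
{
  return cimag (z);
}

/* Standard counterpart of the GNU C ~ operator.  */
double complex
conjugate (double complex z)
{
  return conj (z);
}
```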
- @node The Void Type
- @section The Void Type
- @cindex void type
- @cindex type, void
- @findex void
- The data type @code{void} is a dummy---it allows no operations. It
- really means ``no value at all.'' When a function is meant to return
- no value, we write @code{void} for its return type. Then
- @code{return} statements in that function should not specify a value
- (@pxref{return Statement}). Here's an example:
- @example
- void
- print_if_positive (double x, double y)
- @{
- if (x <= 0)
- return;
- if (y <= 0)
- return;
- printf ("Next point is (%f,%f)\n", x, y);
- @}
- @end example
- A @code{void}-returning function is comparable to what some other languages
- call a ``procedure'' instead of a ``function.''
- @c ??? Already presented
- @c @samp{%f} in an output template specifies to format a @code{double} value
- @c as a decimal number, using a decimal point if needed.
- @node Other Data Types
- @section Other Data Types
- Beyond the primitive types, C provides several ways to construct new
- data types. For instance, you can define @dfn{pointers}, values that
- represent the addresses of other data (@pxref{Pointers}). You can
- define @dfn{structures}, as in many other languages
- (@pxref{Structures}), and @dfn{unions}, which specify multiple ways
- to look at the same memory space (@pxref{Unions}). @dfn{Enumerations}
- are collections of named integer codes (@pxref{Enumeration Types}).
- @dfn{Array types} in C are used for allocating space for objects,
- but C does not permit operating on an array value as a whole. @xref{Arrays}.
- @node Type Designators
- @section Type Designators
- @cindex type designator
- Some C constructs require a way to designate a specific data type
- independent of any particular variable or expression which has that
- type. The way to do this is with a @dfn{type designator}. The
- constructs that need one include casts (@pxref{Explicit Type
- Conversion}) and @code{sizeof} (@pxref{Type Size}).
- We also use type designators to talk about the type of a value in C,
- so you will see many type designators in this manual. When we say,
- ``The value has type @code{int},'' @code{int} is a type designator.
- To make the designator for any type, imagine a variable declaration
- for a variable of that type and delete the variable name and the final
- semicolon.
- For example, to designate the type of full-word integers, we start
- with the declaration for a variable @code{foo} with that type,
- which is this:
- @example
- int foo;
- @end example
- @noindent
- Then we delete the variable name @code{foo} and the semicolon, leaving
- @code{int}---exactly the keyword used in such a declaration.
- Therefore, the type designator for this type is @code{int}.
- What about long unsigned integers? From the declaration
- @example
- unsigned long int foo;
- @end example
- @noindent
- we determine that the designator is @code{unsigned long int}.
- Following this procedure, the designator for any primitive type is
- simply the set of keywords which specifies that type in a declaration.
- The same is true for compound types such as structures, unions, and
- enumerations.
- Designators for pointer types do follow the rule of deleting the
- variable name and semicolon, but the result is not so simple.
- @xref{Pointer Type Designators}, as part of the chapter about
- pointers. @xref{Array Type Designators}, for designators for array
- types.
- To understand what type a designator stands for, imagine a variable
- name inserted into the right place in the designator to make a valid
- declaration. What type would that variable be declared as? That is the
- type the designator designates.
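- Here is a short sketch of the two constructs mentioned above, each
- containing a type designator. The function names @code{average} and
- @code{size_of_unsigned_long} are hypothetical.

```c
#include <stddef.h>

/* The type designator double appears inside a cast, which
   converts SUM before the division.  */
double
average (int sum, int count)
{
  return (double) sum / count;
}

/* The type designator unsigned long int is the operand of sizeof.  */
size_t
size_of_unsigned_long (void)
{
  return sizeof (unsigned long int);
}
```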
- @node Constants
- @chapter Constants
- @cindex constants
- A @dfn{constant} is an expression that stands for a specific value by
- explicitly representing the desired value. C allows constants for
- numbers, characters, and strings. We have already seen numeric and
- string constants in the examples.
- @menu
- * Integer Constants:: Literal integer values.
- * Integer Const Type:: Types of literal integer values.
- * Floating Constants:: Literal floating-point values.
- * Imaginary Constants:: Literal imaginary number values.
- * Invalid Numbers:: Avoiding preprocessing number misconceptions.
- * Character Constants:: Literal character values.
- * String Constants:: Literal string values.
- * UTF-8 String Constants:: Literal UTF-8 string values.
- * Unicode Character Codes:: Unicode characters represented
- in either UTF-16 or UTF-32.
- * Wide Character Constants:: Literal character values larger than 8 bits.
- * Wide String Constants:: Literal string values made up of
- 16- or 32-bit characters.
- @end menu
- @node Integer Constants
- @section Integer Constants
- @cindex integer constants
- @cindex constants, integer
- An integer constant consists of a number to specify the value,
- followed optionally by suffix letters to specify the data type.
- The simplest integer constants are numbers written in base 10
- (decimal), such as @code{5}, @code{77}, and @code{403}. A decimal
- constant cannot start with the character @samp{0} (zero) because
- that makes the constant octal.
- You can get the effect of a negative integer constant by putting a
- minus sign at the beginning. Grammatically speaking, that is an
- arithmetic expression rather than a constant, but it behaves just like
- a true constant.
- Integer constants can also be written in octal (base 8), hexadecimal
- (base 16), or binary (base 2). An octal constant starts with the
- character @samp{0} (zero), followed by any number of octal digits
- (@samp{0} to @samp{7}):
- @example
- 0 // @r{zero}
- 077 // @r{63}
- 0403 // @r{259}
- @end example
- @noindent
- Pedantically speaking, the constant @code{0} is an octal constant, but
- we can think of it as decimal; it has the same value either way.
- A hexadecimal constant starts with @samp{0x} (upper or lower case)
- followed by hex digits (@samp{0} to @samp{9}, as well as @samp{a}
- through @samp{f} in upper or lower case):
- @example
- 0xff // @r{255}
- 0XA0 // @r{160}
- 0xffFF // @r{65535}
- @end example
- @cindex binary integer constants
- A binary constant starts with @samp{0b} (upper or lower case) followed
- by bits (each represented by the characters @samp{0} or @samp{1}):
- @example
- 0b101 // @r{5}
- @end example
- Binary constants are a GNU C extension, not part of the C standard.
- Sometimes a space is needed after an integer constant to avoid
- lexical confusion with the following tokens. @xref{Invalid Numbers}.
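- As a quick check of the notations described in this section, the
- following sketch compares each constant with its decimal value. The
- function name @code{constants_match} is hypothetical.

```c
/* Nonzero if every constant has the decimal value given in the
   text.  The 0b form is the binary notation described above.  */
int
constants_match (void)
{
  return (077 == 63
          && 0403 == 259
          && 0xff == 255
          && 0XA0 == 160
          && 0xffFF == 65535
          && 0b101 == 5);
}
```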
- @node Integer Const Type
- @section Integer Constant Data Types
- @cindex integer constant data types
- @cindex constant data types, integer
- @cindex types of integer constants
- The type of an integer constant is normally @code{int}, if the value
- fits in that type, but here are the complete rules. The type
- of an integer constant is the first one in this sequence that can
- properly represent the value,
- @enumerate
- @item
- @code{int}
- @item
- @code{unsigned int}
- @item
- @code{long int}
- @item
- @code{unsigned long int}
- @item
- @code{long long int}
- @item
- @code{unsigned long long int}
- @end enumerate
- @noindent
- and that isn't excluded by the following rules.
- If the constant has @samp{l} or @samp{L} as a suffix, that excludes the
- first two types (non-@code{long}).
- If the constant has @samp{ll} or @samp{LL} as a suffix, that excludes
- the first four types (non-@code{long long}).
- If the constant has @samp{u} or @samp{U} as a suffix, that excludes
- the signed types.
- Otherwise, if the constant is decimal, that excludes the unsigned
- types.
- @c ### This said @code{unsigned int} is excluded.
- @c ### See 17 April 2016
- Here are some examples of the suffixes.
- @example
- 3000000000u // @r{three billion as @code{unsigned int}.}
- 0LL // @r{zero as a @code{long long int}.}
- 0403l // @r{259 as a @code{long int}.}
- @end example
- Suffixes in integer constants are rarely used. When the precise type
- is important, it is cleaner to convert explicitly (@pxref{Explicit
- Type Conversion}).
- @xref{Integer Types}.
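- One way to observe these rules is with @code{_Generic}, an ISO C11
- feature that selects a result according to the type of an
- expression. In this sketch, the macro @code{TYPE_NAME} and the
- function @code{suffix_example} are hypothetical names.

```c
/* Name of the type of expression X; handles only the six types
   that an integer constant can have.  */
#define TYPE_NAME(x) _Generic ((x),                     \
    int: "int",                                         \
    unsigned int: "unsigned int",                       \
    long int: "long int",                               \
    unsigned long int: "unsigned long int",             \
    long long int: "long long int",                     \
    unsigned long long int: "unsigned long long int")

/* Type names for a few suffixed constants.  */
const char *
suffix_example (int which)
{
  switch (which)
    {
    case 0:
      return TYPE_NAME (5u);      /* u excludes the signed types */
    case 1:
      return TYPE_NAME (0LL);     /* LL excludes the first four types */
    default:
      return TYPE_NAME (0403l);   /* l excludes the first two types */
    }
}
```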
- @node Floating Constants
- @section Floating-Point Constants
- @cindex floating-point constants
- @cindex constants, floating-point
- A floating-point constant must have either a decimal point, an
- exponent-of-ten, or both; they distinguish it from an integer
- constant.
- To indicate an exponent, write @samp{e} or @samp{E}. The exponent
- value follows. It is always written as a decimal number; it can
- optionally start with a sign. The exponent @var{n} means to multiply
- the constant's value by ten to the @var{n}th power.
- Thus, @samp{1500.0}, @samp{15e2}, @samp{15e+2}, @samp{15.0e2},
- @samp{1.5e+3}, @samp{.15e4}, and @samp{15000e-1} are seven ways of
- writing a floating-point number whose value is 1500. They are all
- equivalent.
- Here are more examples with decimal points:
- @example
- 1.0
- 1000.
- 3.14159
- .05
- .0005
- @end example
- For each of them, here are some equivalent constants written with
- exponents:
- @example
- 1e0, 1.0000e0
- 100e1, 100e+1, 100E+1, 1e3, 10000e-1
- 3.14159e0
- 5e-2, .0005e+2, 5E-2, .0005E2
- .05e-2
- @end example
- A floating-point constant normally has type @code{double}. You can
- force it to type @code{float} by adding @samp{f} or @samp{F}
- at the end. For example,
- @example
- 3.14159f
- 3.14159e0f
- 1000.f
- 100E1F
- .0005f
- .05e-2f
- @end example
- Likewise, @samp{l} or @samp{L} at the end forces the constant
- to type @code{long double}.
- You can use exponents in hexadecimal floating constants, but since
- @samp{e} would be interpreted as a hexadecimal digit, the character
- @samp{p} or @samp{P} (for ``power'') indicates an exponent.
- The exponent in a hexadecimal floating constant is a possibly-signed
- decimal integer that specifies a power of 2 (@emph{not} 10 or 16) to
- multiply into the number.
- Here are some examples:
- @example
- @group
- 0xAp2 // @r{40 in decimal}
- 0xAp-1 // @r{5 in decimal}
- 0x2.0Bp4 // @r{32.6875 in decimal}
- 0xE.2p3 // @r{113 in decimal}
- 0x123.ABCp0 // @r{291.6708984375 in decimal}
- 0x123.ABCp4 // @r{4666.734375 in decimal}
- 0x100p-8 // @r{1}
- 0x10p-4 // @r{1}
- 0x1p+4 // @r{16}
- 0x1p+8 // @r{256}
- @end group
- @end example
- @xref{Floating-Point Data Types}.
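- The equivalences in this section can be checked with a sketch like
- the one below. The function names are hypothetical. The decimal
- comparisons assume that the compiler converts decimal constants with
- correct rounding; the hexadecimal ones are exact by construction.

```c
/* Nonzero if the decimal spellings of 1500 all compare equal.  */
int
decimal_forms_agree (void)
{
  return (1500.0 == 15e2
          && 15e2 == 15e+2
          && 15.0e2 == 1.5e+3
          && .15e4 == 15000e-1
          && 1500.0 == .15e4);
}

/* Nonzero if the hexadecimal constants have the expected values.
   The exponent after p is a power of 2.  */
int
hex_forms_agree (void)
{
  return (0xAp2 == 40.0
          && 0xAp-1 == 5.0
          && 0x100p-8 == 1.0
          && 0x1p+4 == 16.0
          && 0x123.ABCp4 == 4666.734375);
}
```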
- @node Imaginary Constants
- @section Imaginary Constants
- @cindex imaginary constants
- @cindex complex constants
- @cindex constants, imaginary
- A complex number consists of a real part plus an imaginary part.
- (Either or both parts may be zero.) This section explains how to
- write numeric constants with imaginary values. By adding these to
- ordinary real-valued numeric constants, we can make constants with
- complex values.
- The simple way to write an imaginary-number constant is to attach the
- suffix @samp{i} or @samp{I}, or @samp{j} or @samp{J}, to an integer or
- floating-point constant. For example, @code{2.5fi} has type
- @code{_Complex float} and @code{3i} has type @code{_Complex int}.
- The four alternative suffix letters are all equivalent.
- @cindex _Complex_I
- The other way to write an imaginary constant is to multiply a real
- constant by @code{_Complex_I}, which represents the imaginary number
- i. Standard C doesn't support suffixing with @samp{i} or @samp{j}, so
- this clunky way is needed.
- To write a complex constant with a nonzero real part and a nonzero
- imaginary part, write the two separately and add them, like this:
- @example
- 4.0 + 3.0i
- @end example
- @noindent
- That gives the value 4 + 3i, with type @code{_Complex double}.
- Such a sum can include multiple real constants, or none. Likewise, it
- can include multiple imaginary constants, or none. For example:
- @example
- _Complex double foo, bar, quux;
- foo = 2.0i + 4.0 + 3.0i; /* @r{Imaginary part is 5.0.} */
- bar = 4.0 + 12.0; /* @r{Imaginary part is 0.0.} */
- quux = 3.0i + 15.0i; /* @r{Real part is 0.0.} */
- @end example
- @xref{Complex Data Types}.
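- For code that must stick to standard C, the same values can be
- written with @code{_Complex_I}, which @file{complex.h} defines.
- This is a sketch; the function names are hypothetical.

```c
#include <complex.h>   /* defines _Complex_I and the macro complex */

/* Standard C spelling of 4.0 + 3.0i.  */
double complex
four_plus_three_i (void)
{
  return 4.0 + 3.0 * _Complex_I;
}

/* A sum with two imaginary terms; its imaginary part is 5.0.  */
double complex
two_imaginary_terms (void)
{
  return 2.0 * _Complex_I + 4.0 + 3.0 * _Complex_I;
}
```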
- @node Invalid Numbers
- @section Invalid Numbers
- Some number-like constructs which are not really valid as numeric
- constants are treated as numbers in preprocessing directives. If
- these constructs appear outside of preprocessing, they are erroneous.
- @xref{Preprocessing Tokens}.
- Sometimes we need to insert spaces to separate tokens so that they
- won't be combined into a single number-like construct. For example,
- @code{0xE+12} is a preprocessing number that is not a valid numeric
- constant, so it is a syntax error. If what we want is the three
- tokens @code{@w{0xE + 12}}, we have to use those spaces as separators.
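- In a function, the corrected form looks like this. The function name
- @code{hex_e_plus_twelve} is hypothetical.

```c
/* With the spaces, this is three tokens: 0xE, +, and 12.
   Without them, 0xE+12 would be a single preprocessing number
   that is not a valid constant.  */
int
hex_e_plus_twelve (void)
{
  return 0xE + 12;
}
```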
- @node Character Constants
- @section Character Constants
- @cindex character constants
- @cindex constants, character
- @cindex escape sequence
- A @dfn{character constant} is written with single quotes, as in
- @code{'@var{c}'}. In the simplest case, @var{c} is a single ASCII
- character that the constant should represent. The constant has type
- @code{int}, and its value is the character code of that character.
- For instance, @code{'a'} represents the character code for the letter
- @samp{a}: 97, that is.
- To put the @samp{'} character (single quote) in the character
- constant, @dfn{quote} it with a backslash (@samp{\}). This character
- constant looks like @code{'\''}. This sort of sequence, starting with
- @samp{\}, is called an @dfn{escape sequence}---the backslash character
- here functions as a kind of @dfn{escape character}.
- To put the @samp{\} character (backslash) in the character constant,
- quote it likewise with @samp{\} (another backslash). This character
- constant looks like @code{'\\'}.
- @cindex bell character
- @cindex @samp{\a}
- @cindex backspace
- @cindex @samp{\b}
- @cindex tab (ASCII character)
- @cindex @samp{\t}
- @cindex vertical tab
- @cindex @samp{\v}
- @cindex formfeed
- @cindex @samp{\f}
- @cindex newline
- @cindex @samp{\n}
- @cindex return (ASCII character)
- @cindex @samp{\r}
- @cindex escape (ASCII character)
- @cindex @samp{\e}
- Here are all the escape sequences that represent specific
- characters in a character constant. The numeric values shown are
- the corresponding ASCII character codes, as decimal numbers.
- @example
- '\a' @result{} 7 /* @r{alarm, @kbd{CTRL-g}} */
- '\b' @result{} 8 /* @r{backspace, @key{BS}, @kbd{CTRL-h}} */
- '\t' @result{} 9 /* @r{tab, @key{TAB}, @kbd{CTRL-i}} */
- '\n' @result{} 10 /* @r{newline, @kbd{CTRL-j}} */
- '\v' @result{} 11 /* @r{vertical tab, @kbd{CTRL-k}} */
- '\f' @result{} 12 /* @r{formfeed, @kbd{CTRL-l}} */
- '\r' @result{} 13 /* @r{carriage return, @key{RET}, @kbd{CTRL-m}} */
- '\e' @result{} 27 /* @r{escape character, @key{ESC}, @kbd{CTRL-[}} */
- '\\' @result{} 92 /* @r{backslash character, @kbd{\}} */
- '\'' @result{} 39 /* @r{singlequote character, @kbd{'}} */
- '\"' @result{} 34 /* @r{doublequote character, @kbd{"}} */
- '\?' @result{} 63 /* @r{question mark, @kbd{?}} */
- @end example
- @samp{\e} is a GNU C extension; to stick to standard C, write @samp{\33}.
- You can also write octal and hex character codes as
- @samp{\@var{octalcode}} or @samp{\x@var{hexcode}}. Decimal is not an
- option here, so octal codes do not need to start with @samp{0}.
- The character constant's value has type @code{int}. However, the
- character code is treated initially as a @code{char} value, which is
- then converted to @code{int}. If the character code is greater than
- 127 (@code{0177} in octal), the resulting @code{int} may be negative
- on a platform where the type @code{char} is 8 bits long and signed.
- @node String Constants
- @section String Constants
- @cindex string constants
- @cindex constants, string
- A @dfn{string constant} represents a series of characters. It starts
- with @samp{"} and ends with @samp{"}; in between are the contents of
- the string. Quoting special characters such as @samp{"}, @samp{\} and
- newline in the contents works in string constants as in character
- constants. In a string constant, @samp{'} does not need to be quoted.
- A string constant defines an array of characters which contains the
- specified characters followed by the null character (code 0). Using
- the string constant is equivalent to using the name of an array with
- those contents. In simple cases, the length in bytes of the string
- constant is one greater than the number of characters written in it.
- As with any array in C, using the string constant in an expression
- converts the array to a pointer (@pxref{Pointers}) to the array's
- first element (@pxref{Accessing Array Elements}). This pointer will
- have type @code{char *} because it points to an element of type
- @code{char}. @code{char *} is an example of a type designator for a
- pointer type (@pxref{Pointer Type Designators}). That type is used
- for strings generally, not just the strings expressed as constants
- in a program.
- Thus, the string constant @code{"Foo!"} is almost
- equivalent to declaring an array like this
- @example
- char string_array_1[] = @{'F', 'o', 'o', '!', '\0' @};
- @end example
- @noindent
- and then using @code{string_array_1} in the program. There
- are two differences, however:
- @itemize @bullet
- @item
- The string constant doesn't define a name for the array.
- @item
- The string constant is probably stored in a read-only area of memory.
- @end itemize
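- The following sketch illustrates the array nature of a string constant.
- The helper names are hypothetical; the pointer is declared
- @code{const char *} here only to make accidental writes a compile-time
- error, which is a choice of this example.

```c
#include <stddef.h>

/* Hypothetical helper: bytes occupied by the constant "Foo!",
   counting the terminating null character.  */
size_t
foo_constant_size (void)
{
  return sizeof "Foo!";
}

/* Hypothetical helper: fetch one character of the constant through
   the pointer that the array converts to.  */
char
foo_constant_char (int index)
{
  const char *p = "Foo!";   /* const: the contents may be read-only.  */
  return p[index];
}
```

- With these definitions, @code{foo_constant_size ()} is 5 and
- @code{foo_constant_char (4)} is the terminating null character.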
- Newlines are not allowed in the text of a string constant. The motive
- for this prohibition is to catch the error of omitting the closing
- @samp{"}. To put a newline in a constant string, write it as
- @samp{\n} in the string constant.
- A real null character in the source code inside a string constant
- causes a warning. To put a null character in the middle of a string
- constant, write @samp{\0} or @samp{\000}.
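- As a small sketch of the effect of an embedded null character (the
- helper names are hypothetical): the array still holds every character
- written, but library functions such as @code{strlen} stop at the first
- null.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: total bytes in the array, including the
   embedded null and the terminating null.  */
size_t
embedded_null_size (void)
{
  return sizeof "ab\0cd";
}

/* Hypothetical helper: strlen stops at the first null character.  */
size_t
embedded_null_strlen (void)
{
  return strlen ("ab\0cd");
}
```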
- Consecutive string constants are effectively concatenated. Thus,
- @example
- "Fo" "o!" @r{is equivalent to} "Foo!"
- @end example
- This is useful for writing a string containing multiple lines,
- like this:
- @example
- "This message is so long that it needs more than\n"
- "a single line of text. C does not allow a newline\n"
- "to represent itself in a string constant, so we have to\n"
- "write \\n to put it in the string. For readability of\n"
- "the source code, it is advisable to put line breaks in\n"
- "the source where they occur in the contents of the\n"
- "constant.\n"
- @end example
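- Here is a short sketch showing that concatenated constants form one
- array. The names @code{message}, @code{message_size} and
- @code{message_length} are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Two adjacent string constants become a single array.  */
static const char message[] =
  "first line\n"
  "second line\n";

/* Hypothetical helper: bytes in the array, with the final null.  */
size_t
message_size (void)
{
  return sizeof message;
}

/* Hypothetical helper: characters before the final null.  */
size_t
message_length (void)
{
  return strlen (message);
}
```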
- The sequence of a backslash and a newline is ignored anywhere
- in a C program, and that includes inside a string constant.
- Thus, you can write multi-line string constants this way:
- @example
- "This is another way to put newlines in a string constant\n\
- and break the line after them in the source code."
- @end example
- @noindent
- However, concatenation is the recommended way to do this.
- You can also write perverse string constants like this,
- @example
- "Fo\
- o!"
- @end example
- @noindent
- but don't do that---write it like this instead:
- @example
- "Foo!"
- @end example
- Be careful to avoid passing a string constant to a function that
- modifies the string it receives. The memory where the string constant
- is stored may be read-only, which would cause a fatal @code{SIGSEGV}
- signal that normally terminates the program (@pxref{Signals}). Even
- worse, the memory may not be read-only. Then the function might
- modify the string constant, thus spoiling the contents of other string
- constants that are supposed to contain the same value and are unified
- by the compiler.
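- One way to stay safe, sketched here with hypothetical function names, is
- to initialize a modifiable array from the string constant and pass that
- array to the function that modifies its argument.

```c
#include <ctype.h>

/* Hypothetical function that modifies the string it receives.  */
void
upcase_in_place (char *s)
{
  for (; *s != '\0'; s++)
    *s = (char) toupper ((unsigned char) *s);
}

/* Hypothetical caller: BUFFER is an ordinary array whose contents
   are copied from the constant, so modifying it is valid.  */
char
first_upcased_char (void)
{
  char buffer[] = "foo!";
  upcase_in_place (buffer);
  return buffer[0];
}
```

- Calling @code{upcase_in_place ("foo!")} directly would be the mistake
- described above.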
- @node UTF-8 String Constants
- @section UTF-8 String Constants
- @cindex UTF-8 String Constants
- Writing @samp{u8} immediately before a string constant, with no
- intervening space, means to represent that string in UTF-8 encoding as
- a sequence of bytes. UTF-8 represents ASCII characters with a single
- byte, and represents non-ASCII Unicode characters (codes 128 and up)
- as multibyte sequences. Here is an example of a UTF-8 constant:
- @example
- u8"A cónstàñt"
- @end example
- This constant occupies 13 bytes plus the terminating null,
- because each of the accented letters is a two-byte sequence.
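- The byte count can be checked with @code{sizeof}, as in this sketch.
- The function name is hypothetical, and the accented letters are written
- as @samp{\u} escapes so that the result does not depend on the encoding
- of the source file. It assumes a compiler that supports the
- @samp{u8} prefix.

```c
#include <stddef.h>

/* Hypothetical helper: size in bytes of the UTF-8 constant
   "A constant" with three accented letters, including the
   terminating null.  */
size_t
utf8_constant_size (void)
{
  return sizeof u8"A c\u00F3nst\u00E0\u00F1t";
}
```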
- Concatenating an ordinary string with a UTF-8 string conceptually
- produces another UTF-8 string. However, if the ordinary string
- contains character codes 128 and up, the results cannot be relied on.
- @node Unicode Character Codes
- @section Unicode Character Codes
- @cindex Unicode character codes
- @cindex universal character names
- You can specify Unicode characters, for individual character constants
- or as part of string constants (@pxref{String Constants}), using
- escape sequences. Use the @samp{\u} escape sequence with a 16-bit
- hexadecimal Unicode character code. If the code value is too big for
- 16 bits, use the @samp{\U} escape sequence with a 32-bit hexadecimal
- Unicode character code. (These codes are called @dfn{universal
- character names}.) For example,
- @example
- \u6C34 /* @r{16-bit code (UTF-16)} */
- \U0010ABCD /* @r{32-bit code (UTF-32)} */
- @end example
- @noindent
- One way to use these is in UTF-8 string constants (@pxref{UTF-8 String
- Constants}). For instance,
- @example
- u8"fóó \u6C34 \U0010ABCD"
- @end example
- You can also use them in wide character constants (@pxref{Wide
- Character Constants}), like this:
- @example
- u'\u6C34' /* @r{16-bit code} */
- U'\U0010ABCD' /* @r{32-bit code} */
- @end example
- @noindent
- and in wide string constants (@pxref{Wide String Constants}), like
- this:
- @example
- u"\u6C34\u6C33" /* @r{16-bit code} */
- U"\U0010ABCD" /* @r{32-bit code} */
- @end example
- Codes in the range of @code{D800} through @code{DFFF} are not valid
- in Unicode. Codes less than @code{00A0} are also forbidden, except for
- @code{0024}, @code{0040}, and @code{0060}; the forbidden codes are
- ASCII characters or control characters, which you can write directly
- or specify with other escape sequences (@pxref{Character Constants}).
- @node Wide Character Constants
- @section Wide Character Constants
- @cindex wide character constants
- @cindex constants, wide character
- A @dfn{wide character constant} represents characters with more than 8
- bits of character code. This is an obscure feature that we need to
- document but that you probably won't ever use. If you're just
- learning C, you may as well skip this section.
- The original C wide character constant looks like @samp{L} (upper
- case!) followed immediately by an ordinary character constant (with no
- intervening space). Its data type is @code{wchar_t}, which is an
- alias defined in @file{stddef.h} for one of the standard integer
- types. Depending on the platform, it could be 16 bits or 32 bits. If
- it is 16 bits, these character constants use the UTF-16 form of
- Unicode; if 32 bits, UTF-32.
- There are also Unicode wide character constants which explicitly
- specify the width. These constants start with @samp{u} or @samp{U}
- instead of @samp{L}. @samp{u} specifies a 16-bit Unicode wide
- character constant, and @samp{U} a 32-bit Unicode wide character
- constant. Their types are, respectively, @code{char16_t} and
- @w{@code{char32_t}}; they are declared in the header file
- @file{uchar.h}. These character constants are valid even if
- @file{uchar.h} is not included, but some uses of them may be
- inconvenient without including it to declare those type names.
- The character represented in a wide character constant can be an
- ordinary ASCII character. @code{L'a'}, @code{u'a'} and @code{U'a'}
- are all valid, and they are all equal to @code{'a'}.
- In all three kinds of wide character constants, you can write a
- non-ASCII Unicode character in the constant itself; the constant's
- value is the character's Unicode character code. Or you can specify
- the Unicode character with an escape sequence (@pxref{Unicode
- Character Codes}).
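- The sketch below checks these statements. The function names are
- hypothetical, and the values are returned as plain integers so that no
- header file is needed. It assumes a compiler that accepts the
- @samp{u} and @samp{U} prefixes.

```c
/* Hypothetical helper: nonzero if all three kinds of wide
   character constant for 'a' equal the ordinary constant.  */
int
wide_a_equals_a (void)
{
  return L'a' == 'a' && u'a' == 'a' && U'a' == 'a';
}

/* Hypothetical helpers: the value of each constant is the
   character's Unicode character code.  */
unsigned long
code_16 (void)
{
  return u'\u6C34';
}

unsigned long
code_32 (void)
{
  return U'\U0010ABCD';
}
```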
- @node Wide String Constants
- @section Wide String Constants
- @cindex wide string constants
- @cindex constants, wide string
- A @dfn{wide string constant} stands for an array of 16-bit or 32-bit
- characters. They are rarely used; if you're just
- learning C, you may as well skip this section.
- There are three kinds of wide string constants, which differ in the
- data type used for each character in the string. Each wide string
- constant is equivalent to an array of integers, but the data type of
- those integers depends on the kind of wide string. Using the constant
- in an expression will convert the array to a pointer to its first
- element, as usual for arrays in C (@pxref{Accessing Array Elements}).
- For each kind of wide string constant, we state here what type that
- pointer will be.
- @table @code
- @item char16_t
- This is a 16-bit Unicode wide string constant: each element is a
- 16-bit Unicode character code with type @code{char16_t}, so the string
- has the pointer type @code{char16_t@ *}. (That is a type designator;
- @pxref{Pointer Type Designators}.) The constant is written as
- @samp{u} (which must be lower case) followed (with no intervening
- space) by a string constant with the usual syntax.
- @item char32_t
- This is a 32-bit Unicode wide string constant: each element is a
- 32-bit Unicode character code, and the string has type @code{char32_t@ *}.
- It's written as @samp{U} (which must be upper case) followed (with no
- intervening space) by a string constant with the usual syntax.
- @item wchar_t
- This is the original kind of wide string constant. It's written as
- @samp{L} (which must be upper case) followed (with no intervening
- space) by a string constant with the usual syntax, and the string has
- type @code{wchar_t@ *}.
- The width of the data type @code{wchar_t} depends on the target
- platform, which makes this kind of wide string somewhat less useful
- than the newer kinds.
- @end table
- @code{char16_t} and @code{char32_t} are declared in the header file
- @file{uchar.h}. @code{wchar_t} is declared in @file{stddef.h}.
- Consecutive wide string constants of the same kind concatenate, just
- like ordinary string constants. A wide string constant concatenated
- with an ordinary string constant results in a wide string constant.
- You can't concatenate two wide string constants of different kinds.
- You also can't concatenate a wide string constant (of any kind) with a
- UTF-8 string constant.
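- As a rough illustration (hypothetical function names, and assuming a
- compiler that accepts the @samp{u} and @samp{U} prefixes), dividing the
- size of a wide string constant by the size of one element gives the
- number of elements, including the terminating null element.

```c
#include <stddef.h>

/* Hypothetical helper: two 16-bit characters plus the null.  */
size_t
utf16_element_count (void)
{
  return sizeof u"\u6C34\u6C33" / sizeof (u"\u6C34\u6C33"[0]);
}

/* Hypothetical helper: one 32-bit character plus the null.  */
size_t
utf32_element_count (void)
{
  return sizeof U"\U0010ABCD" / sizeof (U"\U0010ABCD"[0]);
}
```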
- @node Type Size
- @chapter Type Size
- @cindex type size
- @cindex size of type
- @findex sizeof
- Each data type has a @dfn{size}, which is the number of bytes
- (@pxref{Storage}) that it occupies in memory. To refer to the size in
- a C program, use @code{sizeof}. There are two ways to use it:
- @table @code
- @item sizeof @var{expression}
- This gives the size of @var{expression}, based on its data type. It
- does not calculate the value of @var{expression}, only its size, so if
- @var{expression} includes side effects or function calls, they do not
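- happen. Therefore, @code{sizeof} is a compile-time operation that has
- zero run-time cost. The one exception is an operand whose type is a
- variable-length array: its size is computed at run time.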
- A value that is a bit field (@pxref{Bit Fields}) is not allowed as an
- operand of @code{sizeof}.
- For example,
- @example
- double a;
- i = sizeof a + 10;
- @end example
- @noindent
- sets @code{i} to 18 on most computers because @code{a} occupies 8 bytes.
- Here's how to determine the number of elements in an array
- @code{array}:
- @example
- (sizeof array / sizeof array[0])
- @end example
- @noindent
- The expression @code{sizeof array} gives the size of the array, not
- the size of a pointer to an element. However, if @var{expression} is
- a function parameter that was declared as an array, that
- variable really has a pointer type (@pxref{Array Parm Pointer}), so
- the result is the size of that pointer.
- @item sizeof (@var{type})
- This gives the size of @var{type}.
- For example,
- @example
- i = sizeof (double) + 10;
- @end example
- @noindent
- is equivalent to the previous example.
- You can't apply @code{sizeof} to an incomplete type (@pxref{Incomplete
- Types}), nor @code{void}. Using it on a function type gives 1 in GNU
- C, which makes adding an integer to a function pointer work as desired
- (@pxref{Pointer Arithmetic}).
- @end table
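- The two points made above about @code{sizeof @var{expression}} can be
- sketched as follows. The macro @code{ARRAY_LENGTH} and the function
- names are hypothetical, not part of any standard header.

```c
#include <stddef.h>

/* Hypothetical macro: number of elements in an array.  It gives a
   wrong answer if ARRAY is really a pointer.  */
#define ARRAY_LENGTH(array) (sizeof (array) / sizeof ((array)[0]))

/* Hypothetical helper: count the elements of a local array.  */
size_t
count_elements (void)
{
  double values[7];
  return ARRAY_LENGTH (values);
}

/* Hypothetical helper: the operand of sizeof is not evaluated,
   so the increment never happens and N keeps its value.  */
int
operand_not_evaluated (void)
{
  int n = 3;
  size_t size = sizeof (n++);
  (void) size;
  return n;
}
```

- Some compilers warn about the side effect inside @code{sizeof}, since
- it is almost always a mistake in real code.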
- @strong{Warning}: When you use @code{sizeof} with a type
- instead of an expression, you must write parentheses around the type.
- @strong{Warning}: When applying @code{sizeof} to the result of a cast
- (@pxref{Explicit Type Conversion}), you must write parentheses around
- the cast expression to avoid an ambiguity in the grammar of C@.
- Specifically,
- @example
- sizeof (int) -x
- @end example
- @noindent
- parses as
- @example
- (sizeof (int)) - x
- @end example
- @noindent
- If what you want is
- @example
- sizeof ((int) -x)
- @end example
- @noindent
- you must write it that way, with parentheses.
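- Here is a small sketch of the two parses, using hypothetical function
- names. The first function subtracts @code{x} from the size of
- @code{int}; the second gives the size of the cast expression.

```c
#include <stddef.h>

/* Hypothetical helper: parses as (sizeof (int)) - x.  */
size_t
size_minus_x (int x)
{
  return sizeof (int) -x;
}

/* Hypothetical helper: the size of the value of the cast.  */
size_t
size_of_cast (int x)
{
  return sizeof ((int) -x);
}
```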
- The data type of the value of the @code{sizeof} operator is always one
- of the unsigned integer types; which one of those types depends on the
- machine. The header file @file{stddef.h} defines the typedef name
- @code{size_t} as an alias for this type. @xref{Defining Typedef
- Names}.
- @node Pointers
- @chapter Pointers
- @cindex pointers
- Among high-level languages, C is rather low level, close to the
- machine. This is mainly because it has explicit @dfn{pointers}. A
- pointer value is the numeric address of data in memory. The type of
- data to be found at that address is specified by the data type of the
- pointer itself. The unary operator @samp{*} gets the data that a
- pointer points to---this is called @dfn{dereferencing the pointer}.
- C also allows pointers to functions, but since there are some
- differences in how they work, we treat them later. @xref{Function
- Pointers}.
- @menu
- * Address of Data:: Using the ``address-of'' operator.
- * Pointer Types:: For each type, there is a pointer type.
- * Pointer Declarations:: Declaring variables with pointer types.
- * Pointer Type Designators:: Designators for pointer types.
- * Pointer Dereference:: Accessing what a pointer points at.
- * Null Pointers:: Pointers which do not point to any object.
- * Invalid Dereference:: Dereferencing null or invalid pointers.
- * Void Pointers:: Totally generic pointers, can cast to any.
- * Pointer Comparison:: Comparing memory address values.
- * Pointer Arithmetic:: Computing memory address values.
- * Pointers and Arrays:: Using pointer syntax instead of array syntax.
- * Pointer Arithmetic Low Level:: More about computing memory address values.
- * Pointer Increment/Decrement:: Incrementing and decrementing pointers.
- * Pointer Arithmetic Drawbacks:: A common pointer bug to watch out for.
- * Pointer-Integer Conversion:: Converting pointer types to integer types.
- * Printing Pointers:: Using @code{printf} for a pointer's value.
- @end menu
- @node Address of Data
- @section Address of Data
- @cindex address-of operator
- The most basic way to make a pointer is with the ``address-of''
- operator, @samp{&}. Let's suppose we have these variables available:
- @example
- int i;
- double a[5];
- @end example
- Now, @code{&i} gives the address of the variable @code{i}---a pointer
- value that points to @code{i}'s location---and @code{&a[3]} gives the
- address of the element 3 of @code{a}. (It is actually the fourth
- element in the array, since the first element has index 0.)
- The address-of operator is unusual because it operates on a place to
- store a value (an lvalue, @pxref{Lvalues}), not on the value currently
- stored there. (The left argument of a simple assignment is unusual in
- the same way.) You can use it on any lvalue except a bit field
- (@pxref{Bit Fields}) or a constructor (@pxref{Structure
- Constructors}).
- @node Pointer Types
- @section Pointer Types
- For each data type @var{t}, there is a type for pointers to type
- @var{t}. For these variables,
- @example
- int i;
- double a[5];
- @end example
- @itemize @bullet
- @item
- @code{i} has type @code{int}; we say
- @code{&i} is a ``pointer to @code{int}.''
- @item
- @code{a} has type @code{double[5]}; we say @code{&a} is a ``pointer to
- arrays of five @code{double}s.''
- @item
- @code{a[3]} has type @code{double}; we say @code{&a[3]} is a ``pointer
- to @code{double}.''
- @end itemize
- @node Pointer Declarations
- @section Pointer-Variable Declarations
- The way to declare that a variable @code{foo} points to type @var{t} is
- @example
- @var{t} *foo;
- @end example
- To remember this syntax, think ``if you dereference @code{foo}, using
- the @samp{*} operator, what you get is type @var{t}. Thus, @code{foo}
- points to type @var{t}.''
- Thus, we can declare variables that hold pointers to these three
- types, like this:
- @example
- int *ptri; /* @r{Pointer to @code{int}.} */
- double *ptrd; /* @r{Pointer to @code{double}.} */
- double (*ptrda)[5]; /* @r{Pointer to @code{double[5]}.} */
- @end example
- @samp{int *ptri;} means, ``if you dereference @code{ptri}, you get an
- @code{int}.'' @samp{double (*ptrda)[5];} means, ``if you dereference
- @code{ptrda}, then subscript it by an integer less than 5, you get a
- @code{double}.'' The parentheses express the point that you would
- dereference it first, then subscript it.
- Contrast the last one with this:
- @example
- double *aptrd[5]; /* @r{Array of five pointers to @code{double}.} */
- @end example
- @noindent
- Because subscripting has higher syntactic precedence than @samp{*},
- you would subscript @code{aptrd}, then dereference it. Therefore, it
- declares an array of pointers, not a pointer.
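- One way to see the difference between the two declarations is to
- compare sizes, as in this sketch. The function names are hypothetical;
- the variables are never dereferenced, because @code{sizeof} does not
- evaluate its operand.

```c
#include <stddef.h>

/* Hypothetical helper: size of one pointer to an array.  */
size_t
pointer_to_array_size (void)
{
  double (*ptrda)[5];
  return sizeof ptrda;
}

/* Hypothetical helper: size of what that pointer points to,
   an array of five double values.  */
size_t
pointed_to_array_size (void)
{
  double (*ptrda)[5];
  return sizeof *ptrda;
}

/* Hypothetical helper: size of an array of five pointers.  */
size_t
array_of_pointers_size (void)
{
  double *aptrd[5];
  return sizeof aptrd;
}
```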
- @node Pointer Type Designators
- @section Pointer-Type Designators
- Every type in C has a designator; you make it by deleting the variable
- name and the semicolon from a declaration (@pxref{Type
- Designators}). Here are the designators for the pointer
- types of the example declarations in the previous section:
- @example
- int * /* @r{Pointer to @code{int}.} */
- double * /* @r{Pointer to @code{double}.} */
- double (*)[5] /* @r{Pointer to @code{double[5]}.} */
- @end example
- Remember, to understand what type a designator stands for, imagine the
- variable name that would be in the declaration, and figure out what
- type it would declare that variable with. @code{double (*)[5]} can
- only come from @code{double (*@var{variable})[5]}, so it's a pointer
- which, when dereferenced, gives an array of 5 @code{double}s.
- @node Pointer Dereference
- @section Dereferencing Pointers
- @cindex dereferencing pointers
- @cindex pointer dereferencing
- The main use of a pointer value is to @dfn{dereference it} (access the
- data it points at) with the unary @samp{*} operator. For instance,
- @code{*&i} is the value at @code{i}'s address---which is just
- @code{i}. The two expressions are equivalent, provided @code{&i} is
- valid.
- A pointer-dereference expression whose type is data (not a function)
- is an lvalue.
- Pointers become really useful when we store them somewhere and use
- them later. Here's a simple example to illustrate the practice:
- @example
- @{
- int i;
- int *ptr;
- ptr = &i;
- i = 5;
- @r{@dots{}}
- return *ptr; /* @r{Returns 5, fetched from @code{i}.} */
- @}
- @end example
- This shows how to declare the variable @code{ptr} as type
- @code{int *} (pointer to @code{int}), store a pointer value into it
- (pointing at @code{i}), and use it later to get the value of the
- object it points at (the value in @code{i}).
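- Because a pointer-dereference expression is an lvalue, a function can
- use pointers to modify variables that belong to its caller. Here is a
- sketch with hypothetical function names.

```c
/* Hypothetical function: exchange the values of the two int
   objects that A and B point to.  */
void
swap_ints (int *a, int *b)
{
  int temp = *a;
  *a = *b;      /* *a is an lvalue, so it can be assigned.  */
  *b = temp;
}

/* Hypothetical caller: swap two local variables and return the
   new value of the first one.  */
int
first_after_swap (int x, int y)
{
  swap_ints (&x, &y);
  return x;
}
```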
- @node Null Pointers
- @section Null Pointers
- @cindex null pointers
- @cindex pointers, null
- @c ???stdio loads sttddef
- A pointer value can be @dfn{null}, which means it does not point to
- any object. The cleanest way to get a null pointer is by writing
- @code{NULL}, a standard macro defined in @file{stddef.h}. You can
- also do it by casting 0 to the desired pointer type, as in
- @code{(char *) 0}. (The cast operator performs explicit type conversion;
- @pxref{Explicit Type Conversion}.)
- You can store a null pointer in any lvalue whose data type
- is a pointer type:
- @example
- char *foo;
- foo = NULL;
- @end example
- These two, if consecutive, can be combined into a declaration with
- initializer,
- @example
- char *foo = NULL;
- @end example
- You can also explicitly cast @code{NULL} to the specific pointer type
- you want---it makes no difference.
- @example
- char *foo;
- foo = (char *) NULL;
- @end example
- To test whether a pointer is null, compare it with zero or
- @code{NULL}, as shown here:
- @example
- if (p != NULL)
- /* @r{@code{p} is not null.} */
- operate (p);
- @end example
- Since testing a pointer for not being null is basic and frequent, all
- but beginners in C will understand the conditional without need for
- @code{!= NULL}:
- @example
- if (p)
- /* @r{@code{p} is not null.} */
- operate (p);
- @end example
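- A typical use of the null test, sketched with a hypothetical function
- name, is to guard a dereference so that a null pointer is handled
- as a special case.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of the string S, or 0 if S is a
   null pointer.  */
size_t
length_or_zero (const char *s)
{
  if (s)
    /* s is not null, so it is safe to examine the string.  */
    return strlen (s);
  return 0;
}
```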
- @node Invalid Dereference
- @section Dereferencing Null or Invalid Pointers
- Trying to dereference a null pointer is an error. On most platforms,
- it causes a signal, usually @code{SIGSEGV}
- (@pxref{Signals}).
- @example
- char *foo = NULL;
- c = *foo; /* @r{This causes a signal and terminates.} */
- @end example
- @noindent
- The same happens when you dereference a pointer that has the wrong
- alignment for the target data type (on most types of computer), or
- that points to a part of memory that has not been allocated in the
- process's address space.
- The signal terminates the program, unless the program has arranged to
- handle the signal (@pxref{Signal Handling, The GNU C Library, , libc,
- The GNU C Library Reference Manual}).
- However, the signal might not happen if the dereference is optimized
- away. In the example above, if you don't subsequently use the value
- of @code{c}, GCC might optimize away the code for @code{*foo}. You
- can prevent such optimization using the @code{volatile} qualifier, as
- shown here:
- @example
- volatile char *p;
- volatile char c;
- c = *p;
- @end example
- You can use this to test whether @code{p} points to unallocated
- memory. Set up a signal handler first, so the signal won't terminate
- the program.
- @node Void Pointers
- @section Void Pointers
- @cindex void pointers
- @cindex pointers, void
- The peculiar type @code{void *}, a pointer whose target type is
- @code{void}, is used often in C@. It represents a pointer to
- we-don't-say-what. Thus,
- @example
- void *numbered_slot_pointer (int);
- @end example
- @noindent
- declares a function @code{numbered_slot_pointer} that takes an
- integer parameter and returns a pointer, but we don't say what type of
- data it points to.
- With type @code{void *}, you can pass the pointer around and test
- whether it is null. However, dereferencing it gives a @code{void}
- value that can't be used (@pxref{The Void Type}). To dereference the
- pointer, first convert it to some other pointer type.
- Assignments convert @code{void *} automatically to any other pointer
- type, if the left operand has a pointer type; for instance,
- @example
- @{
- int *p;
- /* @r{Converts return value to @code{int *}.} */
- p = numbered_slot_pointer (5);
- @r{@dots{}}
- @}
- @end example
- Passing an argument of type @code{void *} for a parameter that has a
- pointer type also converts. For example, supposing the function
- @code{hack} is declared to require type @code{float *} for its
- argument, this will convert the @code{void *} value to that type.
- @example
- /* @r{Declare @code{hack} that way.}
- @r{We assume it is defined somewhere else.} */
- void hack (float *);
- @dots{}
- /* @r{Now call @code{hack}.} */
- @{
- /* @r{Converts return value of @code{numbered_slot_pointer}}
- @r{to @code{float *} to pass it to @code{hack}.} */
- hack (numbered_slot_pointer (5));
- @r{@dots{}}
- @}
- @end example
- You can also convert to another pointer type with an explicit cast
- (@pxref{Explicit Type Conversion}), like this:
- @example
- (int *) numbered_slot_pointer (5)
- @end example
- Here is an example which decides at run time which pointer
- type to convert to:
- @example
- void
- extract_int_or_double (void *ptr, bool its_an_int)
- @{
- if (its_an_int)
- handle_an_int (*(int *)ptr);
- else
- handle_a_double (*(double *)ptr);
- @}
- @end example
- The expression @code{*(int *)ptr} means to convert @code{ptr}
- to type @code{int *}, then dereference it.
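- Here is a self-contained sketch of the same technique. All the function
- names are hypothetical; instead of calling handler functions, it
- returns the value it fetched, converted to @code{double}.

```c
#include <stdbool.h>

/* Hypothetical helper: fetch the value PTR points to, treating it
   as an int or as a double according to ITS_AN_INT.  */
double
value_as_double (void *ptr, bool its_an_int)
{
  if (its_an_int)
    return *(int *) ptr;
  else
    return *(double *) ptr;
}

/* Hypothetical callers: &n and &d convert to void * when passed.  */
double
fetch_int_example (void)
{
  int n = 42;
  return value_as_double (&n, true);
}

double
fetch_double_example (void)
{
  double d = 2.5;
  return value_as_double (&d, false);
}
```

- The caller is responsible for passing the right flag: nothing in a
- @code{void *} value records the type of the data it points to.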
- @node Pointer Comparison
- @section Pointer Comparison
- @cindex pointer comparison
- @cindex comparison, pointer
- Two pointer values are equal if they point to the same location, or if
- they are both null. You can test for this with @code{==} and
- @code{!=}. Here's a trivial example:
- @example
- @{
- int i;
- int *p, *q;
- p = &i;
- q = &i;
- if (p == q)
- printf ("This will be printed.\n");
- if (p != q)
- printf ("This won't be printed.\n");
- @}
- @end example
- Ordering comparisons such as @code{>} and @code{>=} operate on
- pointers by converting them to unsigned integers. The C standard says
- the two pointers must point within the same object in memory, but on
- GNU/Linux systems these operations simply compare the numeric values
- of the pointers.
- The pointer values to be compared should in principle have the same type, but
- they are allowed to differ in limited cases. First of all, if the two
- pointers' target types are nearly compatible (@pxref{Compatible
- Types}), the comparison is allowed.
- If one of the operands is @code{void *} (@pxref{Void Pointers}) and
- the other is another pointer type, the comparison operator converts
- the @code{void *} pointer to the other type so as to compare them.
- (In standard C, this is not allowed if the other type is a function
- pointer type, but that works in GNU C@.)
- Comparison operators also allow comparing the integer 0 with a pointer
- value. This works by converting 0 to a null pointer of the same type
- as the other operand.
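- The following sketch exercises the cases described in this section:
- comparing against @code{void *}, comparing with the integer 0, and
- ordering two pointers into the same array. The function names are
- hypothetical.

```c
/* Hypothetical helper: compare a void * with an int *.  */
int
same_location (const void *generic, const int *specific)
{
  return generic == specific;
}

/* Hypothetical helper: 0 converts to a null pointer here.  */
int
is_null (const int *p)
{
  return p == 0;
}

/* Hypothetical helper: element 0 comes before element 1 of the
   same array, so its address compares less.  */
int
first_precedes_second (void)
{
  int array[2];
  return &array[0] < &array[1];
}

/* Hypothetical caller for same_location.  */
int
same_location_example (void)
{
  int i;
  int j;
  return same_location (&i, &i) && !same_location (&i, &j);
}
```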
- @node Pointer Arithmetic
- @section Pointer Arithmetic
- @cindex pointer arithmetic
- @cindex arithmetic, pointer
- Adding an integer (positive or negative) to a pointer is valid in C@.
- It assumes that the pointer points to an element in an array, and
- advances or retracts the pointer across as many array elements as the
- integer specifies. Here is an example, in which adding a positive
- integer advances the pointer to a later element in the same array.
- @example
- void
- incrementing_pointers ()
- @{
- int array[5] = @{ 45, 29, 104, -3, 123456 @};
- int elt0, elt1, elt4;
- int *p = &array[0];
- /* @r{Now @code{p} points at element 0. Fetch it.} */
- elt0 = *p;
- ++p;
- /* @r{Now @code{p} points at element 1. Fetch it.} */
- elt1 = *p;
- p += 3;
- /* @r{Now @code{p} points at element 4 (the last). Fetch it.} */
- elt4 = *p;
- printf ("elt0 %d elt1 %d elt4 %d.\n",
- elt0, elt1, elt4);
- /* @r{Prints elt0 45 elt1 29 elt4 123456.} */
- @}
- @end example
- Here's an example where adding a negative integer retracts the pointer
- to an earlier element in the same array.
- @example
- void
- decrementing_pointers ()
- @{
- int array[5] = @{ 45, 29, 104, -3, 123456 @};
- int elt0, elt3, elt4;
- int *p = &array[4];
- /* @r{Now @code{p} points at element 4 (the last). Fetch it.} */
- elt4 = *p;
- --p;
- /* @r{Now @code{p} points at element 3. Fetch it.} */
- elt3 = *p;
- p -= 3;
- /* @r{Now @code{p} points at element 0. Fetch it.} */
- elt0 = *p;
- printf ("elt0 %d elt3 %d elt4 %d.\n",
- elt0, elt3, elt4);
- /* @r{Prints elt0 45 elt3 -3 elt4 123456.} */
- @}
- @end example
- If one pointer value was made by adding an integer to another
- pointer value, it should be possible to subtract the pointer values
- and recover that integer. That works too in C@.
- @example
- void
- subtract_pointers ()
- @{
- int array[5] = @{ 45, 29, 104, -3, 123456 @};
- int *p0, *p3, *p4;
- int *p = &array[4];
- /* @r{Now @code{p} points at element 4 (the last). Save the value.} */
- p4 = p;
- --p;
- /* @r{Now @code{p} points at element 3. Save the value.} */
- p3 = p;
- p -= 3;
- /* @r{Now @code{p} points at element 0. Save the value.} */
- p0 = p;
- printf ("%td, %td, %td, %td\n",
- p4 - p0, p0 - p0, p3 - p0, p0 - p3);
- /* @r{Prints 4, 0, 3, -3.} */
- @}
- @end example
- The addition operation does not know where arrays are. All it does is
- add the integer (multiplied by object size) to the value of the
- pointer. When the initial pointer and the result point into a single
- array, the result is well-defined.
- @strong{Warning:} Only experts should do pointer arithmetic involving pointers
- into different memory objects.
- The difference between two pointers has type @code{int}, or
- @code{long} if necessary (@pxref{Integer Types}). The clean way to
- declare it is to use the typedef name @code{ptrdiff_t} defined in the
- file @file{stddef.h}.
- This definition of pointer subtraction is consistent with
- pointer-integer addition, in that @code{(p3 - p1) + p1} equals
- @code{p3}, as in ordinary algebra.
- In standard C, addition and subtraction are not allowed on @code{void
- *}, since the target type's size is not defined in that case.
- Likewise, they are not allowed on pointers to function types.
- However, these operations work in GNU C, and the ``size of the target
- type'' is taken as 1.
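- Pointer subtraction is a convenient way to turn a pointer back into an
- array index. This sketch uses hypothetical function names and declares
- the difference as @code{ptrdiff_t}.

```c
#include <stddef.h>

/* Hypothetical helper: index of the first element of ARRAY equal
   to VALUE, or -1 if there is none.  */
ptrdiff_t
index_of (const int *array, size_t count, int value)
{
  for (const int *p = array; p < array + count; p++)
    if (*p == value)
      return p - array;   /* Both pointers are in the same array.  */
  return -1;
}

/* Hypothetical caller using the array from the examples above.  */
ptrdiff_t
index_of_example (int value)
{
  static const int array[5] = { 45, 29, 104, -3, 123456 };
  return index_of (array, 5, value);
}
```

- To print such a value with @code{printf}, use the @samp{%td}
- conversion, which matches the type @code{ptrdiff_t}.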
- @node Pointers and Arrays
- @section Pointers and Arrays
- @cindex pointers and arrays
- @cindex arrays and pointers
- The clean way to refer to an array element is
- @code{@var{array}[@var{index}]}. Another, complicated way to do the
- same job is to get the address of that element as a pointer, then
- dereference it: @code{* (&@var{array}[0] + @var{index})} (or
- equivalently @code{* (@var{array} + @var{index})}). This first gets a
- pointer to element zero, then increments it with @code{+} to point to
- the desired element, then gets the value from there.
- That pointer-arithmetic construct is the @emph{definition} of square
- brackets in C@. @code{@var{a}[@var{b}]} means, by definition,
- @code{*(@var{a} + @var{b})}. This definition uses @var{a} and @var{b}
- symmetrically, so one must be a pointer and the other an integer; it
- does not matter which comes first.
- Since indexing with square brackets is defined in terms of addition
- and dereference, that too is symmetrical. Thus, you can write
- @code{3[array]} and it is equivalent to @code{array[3]}. However, it
- would be foolish to write @code{3[array]}, since it has no advantage
- and could confuse people who read the code.
- It may seem like a discrepancy that the definition @code{*(@var{a} +
- @var{b})} requires a pointer, but @code{array[3]} uses an array value
- instead. Why is this valid? The name of the array, when used by
- itself as an expression (other than in @code{sizeof}), stands for a
- pointer to the array's zeroth element. Thus, @code{array + 3}
- converts @code{array} implicitly to @code{&array[0]}, and the result
- is a pointer to element 3, equivalent to @code{&array[3]}.
- Since square brackets are defined in terms of such addition,
- @code{array[3]} first converts @code{array} to a pointer. That's why
- it works to use an array directly in that construct.
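- This sketch, with a hypothetical function name, checks that all the
- forms discussed in this section refer to the same element.

```c
/* Hypothetical helper: nonzero if every way of writing
   "element 3 of array" gives the same value.  */
int
all_forms_agree (void)
{
  int array[5] = { 45, 29, 104, -3, 123456 };

  return (array[3] == *(array + 3)
          && array[3] == *(&array[0] + 3)
          && array[3] == 3[array]   /* Valid, but don't write this.  */
          && &array[3] == array + 3);
}
```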
- @node Pointer Arithmetic Low Level
- @section Pointer Arithmetic at Low Level
- @cindex pointer arithmetic, low level
- @cindex low level pointer arithmetic
- The behavior of pointer arithmetic is theoretically defined only when
- the pointer values all point within one object allocated in memory.
- But the addition and subtraction operators can't tell whether the
- pointer values are all within one object. They don't know where
- objects start and end. So what do they really do?
- Adding pointer @var{p} to integer @var{i} treats @var{p} as a memory
- address, which is in fact an integer---call it @var{pint}. It treats
- @var{i} as a number of elements of the type that @var{p} points to.
- These elements' sizes add up to @code{@var{i} * sizeof (*@var{p})}.
- So the sum, as an integer, is @code{@var{pint} + @var{i} * sizeof
- (*@var{p})}. This value is reinterpreted as a pointer like @var{p}.
- If the starting pointer value @var{p} and the result do not point at
- parts of the same object, the operation is not officially legitimate,
- and C code is not ``supposed'' to do it. But you can do it anyway,
- and it gives precisely the results described by the procedure above.
- In some special situations it can do something useful, but non-wizards
- should avoid it.
- Here's a function to offset a pointer value @emph{as if} it pointed to
- an object of any given size, by explicitly performing that calculation:
- @example
- #include <stdint.h>
- void *
- ptr_add (void *p, int i, int objsize)
- @{
- intptr_t p_address = (intptr_t) p;
- intptr_t totalsize = (intptr_t) i * objsize;
- intptr_t new_address = p_address + totalsize;
- return (void *) new_address;
- @}
- @end example
- @noindent
- @cindex @code{intptr_t}
- This does the same job as @code{@var{p} + @var{i}} with the proper
- pointer type for @var{p}. It uses the type @code{intptr_t}, which is
- defined in the header file @file{stdint.h}. (In practice, @code{long
- long} would always work, but it is cleaner to use @code{intptr_t}.)
- @node Pointer Increment/Decrement
- @section Pointer Increment and Decrement
- @cindex pointer increment and decrement
- @cindex incrementing pointers
- @cindex decrementing pointers
- The @samp{++} operator adds 1 to a variable. We have seen it for
- integers (@pxref{Increment/Decrement}), but it works for pointers too.
- For instance, suppose we have a series of positive integers,
- terminated by a zero, and we want to add them all up.
- @example
- int
- sum_array_till_0 (int *p)
- @{
- int sum = 0;
- for (;;)
- @{
- /* @r{Fetch the next integer.} */
- int next = *p++;
- /* @r{Exit the loop if it's 0.} */
- if (next == 0)
- break;
- /* @r{Add it into running total.} */
- sum += next;
- @}
- return sum;
- @}
- @end example
- @noindent
- The statement @samp{break;} will be explained further on (@pxref{break
- Statement}). Used in this way, it immediately exits the surrounding
- @code{for} statement.
- @code{*p++} parses as @code{*(p++)}, because a postfix operator always
- takes precedence over a prefix operator. Therefore, it dereferences
- @code{p}, and increments @code{p} afterwards. Incrementing a variable
- means adding 1 to it, as in @code{p = p + 1}. Since @code{p} is a
- pointer, adding 1 to it advances it by the width of the datum it
- points to---in this case, one @code{int}. Therefore, each iteration
- of the loop picks up the next integer from the series and puts it into
- @code{next}.
- This @code{for}-loop has no initialization expression, since @code{p}
- and @code{sum} are already initialized. It has no end-test, since the
- @samp{break;} statement will exit it. It needs no expression to
- advance it, since the loop body does that by incrementing @code{p}.
- Thus, those three expressions after @code{for} are left empty.
- Another way to write this function is by keeping the parameter value unchanged
- and using indexing to access the integers in the table.
- @example
- int
- sum_array_till_0_indexing (int *p)
- @{
- int i;
- int sum = 0;
- for (i = 0; ; i++)
- @{
- /* @r{Fetch the next integer.} */
- int next = p[i];
- /* @r{Exit the loop if it's 0.} */
- if (next == 0)
- break;
- /* @r{Add it into running total.} */
- sum += next;
- @}
- return sum;
- @}
- @end example
- In this program, instead of advancing @code{p}, we advance @code{i}
- and add it to @code{p}. (Recall that @code{p[i]} means @code{*(p +
- i)}.) Either way, it uses the same address to get the next integer.
- It makes no difference in this program whether we write @code{i++} or
- @code{++i}, because the value is not used. All that matters is the
- effect, to increment @code{i}.
- The @samp{--} operator also works on pointers; it can be used
- to scan backwards through an array, like this:
- @example
- int
- after_last_nonzero (int *p, int len)
- @{
- /* @r{Set up @code{q} to point just after the last array element.} */
- int *q = p + len;
- while (q != p)
- /* @r{Step @code{q} back until it reaches a nonzero element.} */
- if (*--q != 0)
- /* @r{Return the index of the element after that nonzero.} */
- return q - p + 1;
- return 0;
- @}
- @end example
- That function returns the length of the nonzero part of the
- array specified by its arguments; that is, the index of the
- first zero of the run of zeros at the end.
- @node Pointer Arithmetic Drawbacks
- @section Drawbacks of Pointer Arithmetic
- @cindex drawbacks of pointer arithmetic
- @cindex pointer arithmetic, drawbacks
- Pointer arithmetic is clean and elegant, but it is also the cause of a
- major security flaw in the C language. Theoretically, it is only
- valid to adjust a pointer within one object allocated as a unit in
- memory. However, if you unintentionally adjust a pointer across the
- bounds of the object and into some other object, the system has no way
- to detect this error.
- A bug which does that can easily result in clobbering part of another
- object. For example, with @code{array[-1]} you can read or write the
- nonexistent element before the beginning of an array---probably part
- of some other data.
- Combining pointer arithmetic with casts between pointer types, you can
- create a pointer that fails to be properly aligned for its type. For
- example,
- @example
- int a[2];
- char *pa = (char *)a;
- int *p = (int *)(pa + 1);
- @end example
- @noindent
- gives @code{p} a value pointing to an ``integer'' that includes part
- of @code{a[0]} and part of @code{a[1]}. Dereferencing that with
- @code{*p} can cause a fatal @code{SIGSEGV} signal or it can return the
- contents of that badly aligned @code{int} (@pxref{Signals}). If it
- ``works,'' it may be quite slow. It can also cause aliasing
- confusions (@pxref{Aliasing}).
- @strong{Warning:} Using improperly aligned pointers is risky---don't do it
- unless it is really necessary.
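- When data really does sit at an address that is not aligned for its
- type, one common way to avoid creating a badly aligned pointer is to
- copy the bytes into a properly aligned variable with @code{memcpy}.
- The following sketch assumes that approach; the name
- @code{read_int_unaligned} is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: fetch an int stored at BASE + OFFSET, where
   that address need not be aligned for int.  The bytes are copied
   into VALUE, which the compiler aligns properly, so no misaligned
   int pointer is ever dereferenced.  */
int
read_int_unaligned (const void *base, size_t offset)
{
  int value;
  assert (base != NULL);
  memcpy (&value, (const char *) base + offset, sizeof value);
  return value;
}
```

- Only @code{char} pointers are used for the address arithmetic, and
- @code{char} has no alignment requirement.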
- @node Pointer-Integer Conversion
- @section Pointer-Integer Conversion
- @cindex pointer-integer conversion
- @cindex conversion between pointers and integers
- @cindex @code{uintptr_t}
- On modern computers, an address is simply a number. It occupies the
- same space as some size of integer. In C, you can convert a pointer
- to the appropriate integer types and vice versa, without losing
- information. The appropriate integer types are @code{uintptr_t} (an
- unsigned type) and @code{intptr_t} (a signed type). Both are defined
- in @file{stdint.h}.
- For instance,
- @example
- #include <stdint.h>
- #include <stdio.h>
- void
- print_pointer (void *ptr)
- @{
- uintptr_t converted = (uintptr_t) ptr;
- printf ("Pointer value is 0x%x\n",
- (unsigned int) converted);
- @}
- @end example
- @noindent
- The specification @samp{%x} in the template (the first argument) for
- @code{printf} means to represent this argument using hexadecimal
- notation. It's cleaner to use @code{uintptr_t}, since hexadecimal
- printing treats the number as unsigned, but it won't actually matter:
- all @code{printf} gets to see is the series of bits in the number.
- @strong{Warning:} Converting pointers to integers is risky---don't do
- it unless it is really necessary.
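- On machines where a pointer is wider than @code{unsigned int}, the
- cast in @code{print_pointer} above discards the high-order bits of
- the address. One way to avoid that, sketched here, is the macro
- @code{PRIxPTR} from @file{inttypes.h}, which expands to the
- @code{printf} conversion suited to @code{uintptr_t}. The function
- name @code{format_pointer} and its buffer parameters are assumptions
- made for this illustration.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: write the numeric value of PTR into BUF,
   in hexadecimal.  Return what snprintf returns, which is the
   length of the complete text.  */
int
format_pointer (char *buf, size_t size, void *ptr)
{
  uintptr_t converted = (uintptr_t) ptr;
  assert (buf != NULL && size > 0);
  return snprintf (buf, size, "0x%" PRIxPTR, converted);
}
```

- Because the whole @code{uintptr_t} value is printed, converting the
- text back to an integer recovers the original address.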
- @node Printing Pointers
- @section Printing Pointers
- To print the numeric value of a pointer, use the @samp{%p} specifier.
- For example:
- @example
- #include <stdio.h>
- void
- print_pointer (void *ptr)
- @{
- printf ("Pointer value is %p\n", ptr);
- @}
- @end example
- The specification @samp{%p} works with any pointer type. It prints
- @samp{0x} followed by the address in hexadecimal, printed as the
- appropriate unsigned integer type.
- @node Structures
- @chapter Structures
- @cindex structures
- @findex struct
- @cindex fields in structures
- A @dfn{structure} is a user-defined data type that holds various
- @dfn{fields} of data. Each field has a name and a data type specified
- in the structure's definition.
- Here we define a structure suitable for storing a linked list of
- integers. Each list item will hold one integer, plus a pointer
- to the next item.
- @example
- struct intlistlink
- @{
- int datum;
- struct intlistlink *next;
- @};
- @end example
- The structure definition has a @dfn{type tag} so that the code can
- refer to this structure. The type tag here is @code{intlistlink}.
- The definition refers recursively to the same structure through that
- tag.
- You can define a structure without a type tag, but then you can't
- refer to it again. That is useful only in some special contexts, such
- as inside a @code{typedef} or a @code{union}.
- The contents of the structure are specified by the @dfn{field
- declarations} inside the braces. Each field in the structure needs a
- declaration there. The fields in one structure definition must have
- distinct names, but these names do not conflict with any other names
- in the program.
- A field declaration looks just like a variable declaration. You can
- combine field declarations with the same beginning, just as you can
- combine variable declarations.
- This structure has two fields. One, named @code{datum}, has type
- @code{int} and will hold one integer in the list. The other, named
- @code{next}, is a pointer to another @code{struct intlistlink}
- which would be the rest of the list. In the last list item, it would
- be @code{NULL}.
- This structure definition is recursive, since the type of the
- @code{next} field refers to the structure type. Such recursion is not
- a problem; in fact, you can use the type @code{struct intlistlink *}
- before the definition of the type @code{struct intlistlink} itself.
- That works because pointers to all kinds of structures really look the
- same at the machine level.
- After defining the structure, you can declare a variable of type
- @code{struct intlistlink} like this:
- @example
- struct intlistlink foo;
- @end example
- The structure definition itself can serve as the beginning of a
- variable declaration, so you can declare variables immediately after,
- like this:
- @example
- struct intlistlink
- @{
- int datum;
- struct intlistlink *next;
- @} foo;
- @end example
- @noindent
- But that is ugly. It is almost always clearer to separate the
- definition of the structure from its uses.
- Declaring a structure type inside a block (@pxref{Blocks}) limits
- the scope of the structure type name to that block. That means the
- structure type is recognized only within that block. Declaring it in
- a function parameter list, as here,
- @example
- int f (struct foo @{int a, b;@} parm);
- @end example
- @noindent
- (assuming that @code{struct foo} is not already defined) limits the
- scope of the structure type @code{struct foo} to that parameter list;
- that is basically useless, so it triggers a warning.
- Standard C requires at least one field in a structure.
- GNU C does not require this.
- @menu
- * Referencing Fields:: Accessing field values in a structure object.
- * Dynamic Memory Allocation:: Allocating space for objects
- while the program is running.
- * Field Offset:: Memory layout of fields within a structure.
- * Structure Layout:: Planning the memory layout of fields.
- * Packed Structures:: Packing structure fields as close as possible.
- * Bit Fields:: Dividing integer fields
- into fields with fewer bits.
- * Bit Field Packing:: How bit fields pack together in integers.
- * const Fields:: Making structure fields immutable.
- * Zero Length:: Zero-length array as a variable-length object.
- * Flexible Array Fields:: Another approach to variable-length objects.
- * Overlaying Structures:: Casting one structure type
- over an object of another structure type.
- * Structure Assignment:: Assigning values to structure objects.
- * Unions:: Viewing the same object in different types.
- * Packing With Unions:: Using a union type to pack various types into
- the same memory space.
- * Cast to Union:: Casting a value of one of the union's alternative
- types to the type of the union itself.
- * Structure Constructors:: Building new structure objects.
- * Unnamed Types as Fields:: Fields' types do not always need names.
- * Incomplete Types:: Types which have not been fully defined.
- * Intertwined Incomplete Types:: Defining mutually-recursive structure types.
- * Type Tags:: Scope of structure and union type tags.
- @end menu
- @node Referencing Fields
- @section Referencing Structure Fields
- @cindex referencing structure fields
- @cindex structure fields, referencing
- To make a structure useful, there has to be a way to examine and store
- its fields. The @samp{.} (period) operator does that; its use looks
- like @code{@var{object}.@var{field}}.
- Given this structure and variable,
- @example
- struct intlistlink
- @{
- int datum;
- struct intlistlink *next;
- @};
- struct intlistlink foo;
- @end example
- @noindent
- you can write @code{foo.datum} and @code{foo.next} to refer to the two
- fields in the value of @code{foo}. These fields are lvalues, so you
- can store values into them, and read the values out again.
- Most often, structures are dynamically allocated (see the next
- section), and we refer to the objects via pointers.
- @code{(*p).@var{field}} is somewhat cumbersome, so there is an
- abbreviation: @code{p->@var{field}}. For instance, assume the program
- contains this declaration:
- @example
- struct intlistlink *ptr;
- @end example
- @noindent
- You can write @code{ptr->datum} and @code{ptr->next} to refer
- to the two fields in the object that @code{ptr} points to.
- If a unary operator precedes an expression using @samp{->},
- the @samp{->} nests inside:
- @example
- -ptr->datum @r{is equivalent to} -(ptr->datum)
- @end example
- You can intermix @samp{->} and @samp{.} without parentheses,
- as shown here:
- @example
- struct @{ double d; struct intlistlink l; @} foo;
- @r{@dots{}}foo.l.next->next->datum@r{@dots{}}
- @end example
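- To show @samp{->} at work, here is a sketch of a function that walks
- a list of @code{struct intlistlink} objects and adds up the
- @code{datum} fields. The function name @code{sum_intlist} is made up
- for this example.

```c
#include <assert.h>
#include <stddef.h>

struct intlistlink
{
  int datum;
  struct intlistlink *next;
};

/* Hypothetical helper: add up the datum fields of the list
   that starts at PTR.  An empty list (NULL) sums to 0.  */
int
sum_intlist (const struct intlistlink *ptr)
{
  int sum = 0;
  while (ptr != NULL)
    {
      /* ptr->datum means the same as (*ptr).datum.  */
      assert (ptr->datum == (*ptr).datum);
      sum += ptr->datum;
      /* Step to the following link, or NULL at the end.  */
      ptr = ptr->next;
    }
  return sum;
}
```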
- @node Dynamic Memory Allocation
- @section Dynamic Memory Allocation
- @cindex dynamic memory allocation
- @cindex memory allocation, dynamic
- @cindex allocating memory dynamically
- To allocate an object dynamically, call the library function
- @code{malloc} (@pxref{Basic Allocation, The GNU C Library,, libc, The GNU C Library
- Reference Manual}). Here is how to allocate an object of type
- @code{struct intlistlink}. To make this code work, include the file
- @file{stdlib.h}, like this:
- @example
- #include <stddef.h> /* @r{Defines @code{NULL}.} */
- #include <stdlib.h> /* @r{Declares @code{malloc}.} */
- @dots{}
- struct intlistlink *
- alloc_intlistlink (void)
- @{
- struct intlistlink *p;
- p = malloc (sizeof (struct intlistlink));
- if (p == NULL)
- fatal ("Ran out of storage");
- /* @r{Initialize the contents.} */
- p->datum = 0;
- p->next = NULL;
- return p;
- @}
- @end example
- @noindent
- @code{malloc} returns @code{void *}, so the assignment to @code{p}
- will automatically convert it to type @code{struct intlistlink *}.
- The return value of @code{malloc} is always sufficiently aligned
- (@pxref{Type Alignment}) that it is valid for any data type.
- The test for @code{p == NULL} is necessary because @code{malloc}
- returns a null pointer if it cannot get any storage. We assume that
- the program defines the function @code{fatal} to report a fatal error
- to the user.
- Here's how to add one more integer to the front of such a list:
- @example
- struct intlistlink *mylist = NULL;
- void
- add_to_mylist (int my_int)
- @{
- struct intlistlink *p = alloc_intlistlink ();
- p->datum = my_int;
- p->next = mylist;
- mylist = p;
- @}
- @end example
- The way to free the objects is by calling @code{free}. Here's
- a function to free all the links in one of these lists:
- @example
- void
- free_intlist (struct intlistlink *p)
- @{
- while (p)
- @{
- struct intlistlink *q = p;
- p = p->next;
- free (q);
- @}
- @}
- @end example
- We must extract the @code{next} pointer from the object before freeing
- it, because @code{free} can clobber the data that was in the object.
- For the same reason, the program must not use the list any more after
- freeing its elements. To make sure it won't, it is best to clear out
- the variable where the list was stored, like this:
- @example
- free_intlist (mylist);
- mylist = NULL;
- @end example
- @node Field Offset
- @section Field Offset
- @cindex field offset
- @cindex structure field offset
- @cindex offset of structure fields
- To determine the offset of a given field @var{field} in a structure
- type @var{type}, use the macro @code{offsetof}, which is defined in
- the file @file{stddef.h}. It is used like this:
- @example
- offsetof (@var{type}, @var{field})
- @end example
- Here is an example:
- @example
- struct foo
- @{
- int element;
- struct foo *next;
- @};
- offsetof (struct foo, next)
- /* @r{That is 4 where pointers take 4 bytes; it is usually 8 where they take 8.} */
- @end example
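- The value of @code{offsetof} is simply the distance in bytes from the
- start of the structure to the field. The following sketch measures
- that distance with pointer arithmetic on an actual object, so the two
- can be compared; @code{next_field_distance} is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

struct foo
{
  int element;
  struct foo *next;
};

/* Hypothetical helper: return the distance in bytes from the start
   of *OBJ to its next field, measured with char pointers.  */
size_t
next_field_distance (const struct foo *obj)
{
  assert (obj != NULL);
  return (size_t) ((const char *) &obj->next - (const char *) obj);
}
```

- For any object of type @code{struct foo}, the result equals
- @code{offsetof (struct foo, next)}.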
- @node Structure Layout
- @section Structure Layout
- @cindex structure layout
- @cindex layout of structures
- The rest of this chapter covers advanced topics about structures. If
- you are just learning C, you can skip it.
- The precise layout of a @code{struct} type is crucial when using it to
- overlay hardware registers, to access data structures in shared
- memory, or to assemble and disassemble packets for network
- communication. It is also important for avoiding memory waste when
- the program makes many objects of that type. However, the layout
- depends on the target platform. Each platform has conventions for
- structure layout, which compilers need to follow.
- Here are the conventions used on most platforms.
- The structure's fields appear in the structure layout in the order
- they are declared. When possible, consecutive fields occupy
- consecutive bytes within the structure. However, if a field's type
- demands more alignment than it would get that way, C gives it the
- alignment it requires by leaving a gap after the previous field.
- Once all the fields have been laid out, it is possible to determine
- the structure's alignment and size. The structure's alignment is the
- maximum alignment of any of the fields in it. Then the structure's
- size is rounded up to a multiple of its alignment. That may require
- leaving a gap at the end of the structure.
- Here are some examples, where we assume that @code{char} has size and
- alignment 1 (always true), and @code{int} has size and alignment 4
- (true on most kinds of computers):
- @example
- struct foo
- @{
- char a, b;
- int c;
- @};
- @end example
- @noindent
- This structure occupies 8 bytes, with an alignment of 4. @code{a} is
- at offset 0, @code{b} is at offset 1, and @code{c} is at offset 4.
- There is a gap of 2 bytes before @code{c}.
- Contrast that with this structure:
- @example
- struct foo
- @{
- char a;
- int c;
- char b;
- @};
- @end example
- This structure has size 12 and alignment 4. @code{a} is at offset 0,
- @code{c} is at offset 4, and @code{b} is at offset 8. There are two
- gaps: three bytes before @code{c}, and three bytes at the end.
- These two structures have the same contents at the C level, but one
- takes 8 bytes and the other takes 12 bytes due to the ordering of the
- fields. A reliable way to avoid this sort of wastage is to order the
- fields by size, biggest fields first.
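- The two layouts can be compared in code with @code{sizeof} and
- @code{offsetof}. In this sketch the two structures get distinct,
- made-up tags (@code{chars_first} and @code{int_between}) so that both
- can appear in one file, and the checks assume a platform that follows
- the conventions described above.

```c
#include <assert.h>
#include <stddef.h>

/* The first ordering from this section, under a made-up tag.  */
struct chars_first
{
  char a, b;
  int c;
};

/* The second ordering, under another made-up tag.  */
struct int_between
{
  char a;
  int c;
  char b;
};

/* Hypothetical self-check of the layout rules.  It does not assume
   any particular size or alignment for int.  */
void
check_layout (void)
{
  /* Fields appear in the order they are declared.  */
  assert (offsetof (struct chars_first, a) < offsetof (struct chars_first, b));
  assert (offsetof (struct chars_first, b) < offsetof (struct chars_first, c));
  assert (offsetof (struct int_between, c) < offsetof (struct int_between, b));

  /* Under the conventions above, putting the int between
     the two chars never saves space.  */
  assert (sizeof (struct int_between) >= sizeof (struct chars_first));
}
```

- With a 4-byte @code{int} aligned to 4, the two sizes are 8 and 12, as
- computed above.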
- @node Packed Structures
- @section Packed Structures
- @cindex packed structures
- @cindex @code{__attribute__((packed))}
- In GNU C you can force a structure to be laid out with no gaps by
- adding @code{__attribute__((packed))} after @code{struct} (or at the
- end of the structure type declaration). Here's an example:
- @example
- struct __attribute__((packed)) foo
- @{
- char a;
- int c;
- char b;
- @};
- @end example
- Without @code{__attribute__((packed))}, this structure occupies 12
- bytes (as described in the previous section), assuming 4-byte
- alignment for @code{int}. With @code{__attribute__((packed))}, it is
- only 6 bytes long---the sum of the lengths of its fields.
- Use of @code{__attribute__((packed))} often results in fields that
- don't have the normal alignment for their types. Taking the address
- of such a field can result in an invalid pointer because of its
- improper alignment. Dereferencing such a pointer can cause a
- @code{SIGSEGV} signal on a machine that doesn't, in general, allow
- unaligned pointers.
- @xref{Attributes}.
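- One way to stay clear of that problem, sketched below for GNU C, is
- to access a packed field by name through the structure rather than
- through a pointer to the field itself. The compiler then knows the
- field may be unaligned. The tag @code{packed_foo} and the function
- @code{get_c} are names invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Same fields as in the example above; requires GNU C
   (or a compiler that accepts the same attribute).  */
struct __attribute__((packed)) packed_foo
{
  char a;
  int c;
  char b;
};

/* Hypothetical accessor: read field c by name.  No int pointer to
   the field is formed, so no misaligned int * is dereferenced.  */
int
get_c (const struct packed_foo *obj)
{
  assert (obj != NULL);
  return obj->c;
}
```

- The size of @code{struct packed_foo} is the sum of the sizes of its
- fields, with no gaps.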
- @node Bit Fields
- @section Bit Fields
- @cindex bit fields
- A structure field declaration with an integer type can specify the
- number of bits the field should occupy. We call that a @dfn{bit
- field}. These are useful because consecutive bit fields are packed
- into a larger storage unit. For instance,
- @example
- unsigned char opcode: 4;
- @end example
- @noindent
- specifies that this field takes just 4 bits.
- Since it is unsigned, its possible values range
- from 0 to 15. A signed field with 4 bits, such as this,
- @example
- signed char small: 4;
- @end example
- @noindent
- can hold values from -8 to 7.
- You can subdivide a single byte into those two parts by writing
- @example
- unsigned char opcode: 4;
- signed char small: 4;
- @end example
- @noindent
- in the structure. With bit fields, these two numbers fit into
- a single @code{char}.
- Here's how to declare a one-bit field that can hold either 0 or 1:
- @example
- unsigned char special_flag: 1;
- @end example
- You can also use the @code{bool} type for bit fields:
- @example
- bool special_flag: 1;
- @end example
- Except when using @code{bool} (which is always unsigned,
- @pxref{Boolean Type}), always specify @code{signed} or @code{unsigned}
- for a bit field. There is a default, if that's not specified: the bit
- field is signed if plain @code{char} is signed, except that the option
- @option{-funsigned-bitfields} forces unsigned as the default. But it
- is cleaner not to depend on this default.
- Bit fields are special in that you cannot take their address with
- @samp{&}. They are not stored with the size and alignment appropriate
- for the specified type, so they cannot be addressed through pointers
- to that type.
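- Here is a sketch that puts the two 4-bit fields shown earlier into a
- structure and stores values in them. The structure tag
- @code{instruction} and the function @code{make_instruction} are
- hypothetical, and the code relies on GNU C accepting @code{char}
- types for bit fields.

```c
#include <assert.h>

struct instruction
{
  unsigned char opcode: 4;   /* Holds 0 to 15.  */
  signed char small: 4;      /* Holds -8 to 7.  */
};

/* Hypothetical constructor: build an instruction from two values.
   OPCODE must be in the range 0 to 15 and SMALL in the range
   -8 to 7, so that each value fits in its 4-bit field.  */
struct instruction
make_instruction (unsigned int opcode, int small)
{
  struct instruction insn;
  assert (opcode <= 15);
  assert (small >= -8 && small <= 7);
  insn.opcode = opcode;
  insn.small = small;
  return insn;
}
```

- Bit fields are read and written like any other field; only taking
- their address is ruled out.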
- @node Bit Field Packing
- @section Bit Field Packing
- Programs to communicate with low-level hardware interfaces need to
- define bit fields laid out to match the hardware data. This section
- explains how to do that.
- Consecutive bit fields are packed together, but each bit field must
- fit within a single object of its specified type. In this example,
- @example
- unsigned short a : 3, b : 3, c : 3, d : 3, e : 3;
- @end example
- @noindent
- all five fields fit consecutively into one two-byte @code{short}.
- They need 15 bits, and one @code{short} provides 16. By contrast,
- @example
- unsigned char a : 3, b : 3, c : 3, d : 3, e : 3;
- @end example
- @noindent
- needs three bytes. It fits @code{a} and @code{b} into one
- @code{char}, but @code{c} won't fit in that @code{char} (they would
- add up to 9 bits). So @code{c} and @code{d} go into a second
- @code{char}, leaving a gap of two bits between @code{b} and @code{c}.
- Then @code{e} needs a third @code{char}. By contrast,
- @example
- unsigned char a : 3, b : 3;
- unsigned int c : 3;
- unsigned char d : 3, e : 3;
- @end example
- @noindent
- needs only two bytes: the type @code{unsigned int}
- allows @code{c} to straddle bytes that are in the same word.
- You can leave a gap of a specified number of bits by defining a
- nameless bit field. This looks like @code{@var{type} : @var{nbits};}.
- It is allocated space in the structure just as a named bit field would
- be allocated.
- You can force the following bit field to advance to the following
- aligned memory object with @code{@var{type} : 0;}.
- Both of these constructs can syntactically share @var{type} with
- ordinary bit fields. This example illustrates both:
- @example
- unsigned int a : 5, : 3, b : 5, : 0, c : 5, : 3, d : 5;
- @end example
- @noindent
- It puts @code{a} and @code{b} into one @code{int}, with a 3-bit gap
- between them. Then @code{: 0} advances to the next @code{int},
- so @code{c} and @code{d} fit into that one.
- These rules for packing bit fields apply to most target platforms,
- including all the usual real computers. A few embedded controllers
- have special layout rules.
- @node const Fields
- @section @code{const} Fields
- @cindex const fields
- @cindex structure fields, constant
- @c ??? Is this a C standard feature?
- A structure field declared @code{const} cannot be assigned to
- (@pxref{const}). For instance, let's define this modified version of
- @code{struct intlistlink}:
- @example
- struct intlistlink_ro /* @r{``ro'' for read-only.} */
- @{
- const int datum;
- struct intlistlink *next;
- @};
- @end example
- This structure can be used to prevent part of the code from modifying
- the @code{datum} field:
- @example
- /* @r{@code{p} has type @code{struct intlistlink *}.}
- @r{Convert it to @code{struct intlistlink_ro *}.} */
- struct intlistlink_ro *q
- = (struct intlistlink_ro *) p;
- q->datum = 5; /* @r{Error!} */
- p->datum = 5; /* @r{Valid since @code{*p} is}
- @r{not a @code{struct intlistlink_ro}.} */
- @end example
- A @code{const} field can get a value in two ways: by initialization of
- the whole structure, and by making a pointer-to-structure point to an object
- in which that field already has a value.
- Any @code{const} field in a structure type makes assignment impossible
- for structures of that type (@pxref{Structure Assignment}). That is
- because structure assignment works by assigning the structure's
- fields, one by one.
- @node Zero Length
- @section Arrays of Length Zero
- @cindex array of length zero
- @cindex zero-length arrays
- @cindex length-zero arrays
- GNU C allows zero-length arrays. They are useful as the last element
- of a structure that is really a header for a variable-length object.
- Here's an example, where we construct a variable-size structure
- to hold a line which is @code{this_length} characters long:
- @example
- struct line @{
- int length;
- char contents[0];
- @};
- struct line *thisline
- = ((struct line *)
- malloc (sizeof (struct line)
- + this_length));
- thisline->length = this_length;
- @end example
- In ISO C90, we would have to give @code{contents} a length of 1, which
- means either wasting space or complicating the argument to @code{malloc}.
- @node Flexible Array Fields
- @section Flexible Array Fields
- @cindex flexible array fields
- @cindex array fields, flexible
- The C99 standard adopted a more complex equivalent of zero-length
- array fields. It's called a @dfn{flexible array}, and it's indicated
- by omitting the length, like this:
- @example
- struct line
- @{
- int length;
- char contents[];
- @};
- @end example
- The flexible array has to be the last field in the structure, and there
- must be other fields before it.
- Under the C standard, a structure with a flexible array can't be part
- of another structure, and can't be an element of an array.
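- Allocating an object with a flexible array works much like the
- zero-length case: @code{sizeof} does not count the flexible array, so
- the program adds the space it wants for the array contents. Here is
- a minimal sketch; the function @code{make_line} is made up for this
- example, and it returns a null pointer on failure rather than
- reporting an error.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct line
{
  int length;
  char contents[];
};

/* Hypothetical helper: allocate a struct line holding a copy of
   the LEN characters at TEXT.  Return NULL if malloc fails.
   The caller releases the result with free.  */
struct line *
make_line (const char *text, int len)
{
  struct line *l;
  assert (text != NULL && len >= 0);
  /* sizeof does not count the flexible array, so add room for it.  */
  l = malloc (sizeof (struct line) + (size_t) len);
  if (l == NULL)
    return NULL;
  l->length = len;
  memcpy (l->contents, text, (size_t) len);
  return l;
}
```

- The contents are not null-terminated here; the @code{length} field
- records how many characters are valid.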
- GNU C allows static initialization of flexible array fields. The effect
- is to ``make the array long enough'' for the initializer.
- @example
- struct f1 @{ int x; int y[]; @} f1
- = @{ 1, @{ 2, 3, 4 @} @};
- @end example
- @noindent
- This defines a structure variable named @code{f1}
- whose type is @code{struct f1}. In C, a variable name or function name
- never conflicts with a structure type tag.
- Omitting the flexible array field's size lets the initializer
- determine it. This is allowed only when the flexible array is defined
- in the outermost structure and you declare a variable of that
- structure type. For example:
- @example
- struct foo @{ int x; int y[]; @};
- struct bar @{ struct foo z; @};
- struct foo a = @{ 1, @{ 2, 3, 4 @} @}; // @r{Valid.}
- struct bar b = @{ @{ 1, @{ 2, 3, 4 @} @} @}; // @r{Invalid.}
- struct bar c = @{ @{ 1, @{ @} @} @}; // @r{Valid.}
- struct foo d[1] = @{ @{ 1, @{ 2, 3, 4 @} @} @}; // @r{Invalid.}
- @end example
- @node Overlaying Structures
- @section Overlaying Different Structures
- @cindex overlaying structures
- @cindex structures, overlaying
- Be careful about using different structure types to refer to the same
- memory within one function, because GNU C can optimize code assuming
- it never does that. @xref{Aliasing}. Here's an example of the kind of
- aliasing that can cause the problem:
- @example
- struct a @{ int size; char *data; @};
- struct b @{ int size; char *data; @};
- struct a foo;
- struct a *p = &foo;
- struct b *q = (struct b *) &foo;
- @end example
- Here @code{q} points to the same memory that the variable @code{foo}
- occupies, but they have two different types. The two types
- @code{struct a} and @code{struct b} are defined alike, but they are
- not the same type. Interspersing references using the two types,
- like this,
- @example
- p->size = 0;
- q->size = 1;
- x = p->size;
- @end example
- @noindent
- allows GNU C to assume that @code{p->size} is still zero when it is
- copied into @code{x}. The compiler ``knows'' that @code{q} points to
- a @code{struct b} and this cannot overlap with a @code{struct a}.
- Other compilers might also do this optimization. The ISO C standard
- considers such code erroneous, precisely so that this optimization
- will be valid.
- @node Structure Assignment
- @section Structure Assignment
- @cindex structure assignment
- @cindex assigning structures
- Assignment operating on a structure type copies the structure. The
- left and right operands must have the same type. Here is an example:
- @example
- #include <stddef.h> /* @r{Defines @code{NULL}.} */
- #include <stdlib.h> /* @r{Declares @code{malloc}.} */
- @r{@dots{}}
- struct point @{ double x, y; @};
- struct point *
- copy_point (struct point point)
- @{
- struct point *p
- = (struct point *) malloc (sizeof (struct point));
- if (p == NULL)
- fatal ("Out of memory");
- *p = point;
- return p;
- @}
- @end example
- Notionally, assignment on a structure type works by copying each of
- the fields. Thus, if any of the fields has the @code{const}
- qualifier, that structure type does not allow assignment:
- @example
- struct point @{ const double x, y; @};
- struct point a, b;
- a = b; /* @r{Error!} */
- @end example
- @xref{Assignment Expressions}.
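- Since assignment copies the whole structure, the two objects are
- independent afterward. The following sketch checks that; the
- function name @code{check_assignment_copies} is hypothetical.

```c
#include <assert.h>

struct point { double x, y; };

/* Hypothetical self-check: after b = a, changing b
   leaves a untouched.  */
void
check_assignment_copies (void)
{
  struct point a = { 1.0, 2.0 };
  struct point b;

  b = a;        /* Copies both fields.  */
  b.x = 5.0;    /* Changes only b.  */

  assert (a.x == 1.0);   /* a still has its old value.  */
  assert (b.x == 5.0);
  assert (b.y == 2.0);   /* y was copied along with x.  */
}
```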
- @node Unions
- @section Unions
- @cindex unions
- @findex union
- A @dfn{union type} defines alternative ways of looking at the same
- piece of memory. Each alternative view is defined with a data type,
- and identified by a name. A union definition looks like this:
- @example
- union @var{name}
- @{
- @var{alternative declarations}@r{@dots{}}
- @};
- @end example
- Each alternative declaration looks like a structure field declaration.
- For instance,
- @example
- union number
- @{
- long int integer;
- double floating_point;
- @};
- @end example
- @noindent
- lets you store either an integer (type @code{long int}) or a floating
- point number (type @code{double}) in the same place in memory. The
- length and alignment of the union type are the maximum of all the
- alternatives---they do not have to be the same. In this union
- example, @code{double} probably takes more space than @code{long int},
- but that doesn't cause a problem in programs that use the union in the
- normal way.
- The members don't have to be different in data type. Sometimes
- each member pertains to a way the data will be used. For instance,
- @example
- union datum
- @{
- double latitude;
- double longitude;
- double height;
- double weight;
- int continent;
- @};
- @end example
- This union holds one of several kinds of data; most kinds are
- floating-point numbers, but the value can also be a code for a
- continent, which is an integer. You @emph{could} use one member of type @code{double} to
- access all the values which have that type, but the different member
- names will make the program clearer.
- The alignment of a union type is the maximum of the alignments of the
- alternatives. The size of the union type is the maximum of the sizes
- of the alternatives, rounded up to a multiple of the alignment
- (because every type's size must be a multiple of its alignment).
- All the union alternatives start at the address of the union itself.
- If an alternative is shorter than the union as a whole, it occupies
- the first part of the union's storage, leaving the last part unused
- @emph{for that alternative}.
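- As a minimal sketch of these layout rules, the following function
- checks them for the @code{union number} type shown earlier (with the
- second member spelled @code{floating}). The function name
- @code{union_layout_ok} is made up for this illustration.

```c
union number
{
  long int integer;
  double floating;
};

/* Return 1 if the layout rules hold for union number, else 0.  */
int
union_layout_ok (void)
{
  union number n;

  /* The size is at least that of the largest alternative.  */
  if (sizeof (union number) < sizeof (long int)
      || sizeof (union number) < sizeof (double))
    return 0;

  /* The size is a multiple of the alignment.  */
  if (sizeof (union number) % _Alignof (union number) != 0)
    return 0;

  /* Every alternative starts at the address of the union itself.  */
  return ((void *) &n == (void *) &n.integer
          && (void *) &n == (void *) &n.floating);
}
```

- On any conforming implementation this function should return 1.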
- @strong{Warning:} if the code stores data using one union alternative
- and accesses it with another, the results depend on the kind of
- computer in use. Only wizards should try to do this. However, when
- you need to do this, a union is a clean way to do it.
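- One common way to stay on the safe side is to record which
- alternative was stored and read back only through that one. The
- sketch below assumes a wrapper structure and helper functions
- (@code{tagged_number}, @code{make_integer}, @code{make_floating},
- @code{number_as_double}) that are invented for this example; they are
- not part of any library.

```c
enum number_kind { KIND_INTEGER, KIND_FLOATING };

struct tagged_number
{
  enum number_kind kind;        /* Says which alternative is valid.  */
  union
  {
    long int integer;
    double floating;
  } value;
};

struct tagged_number
make_integer (long int i)
{
  struct tagged_number n;
  n.kind = KIND_INTEGER;
  n.value.integer = i;
  return n;
}

struct tagged_number
make_floating (double d)
{
  struct tagged_number n;
  n.kind = KIND_FLOATING;
  n.value.floating = d;
  return n;
}

/* Read back through the same alternative that was stored.  */
double
number_as_double (struct tagged_number n)
{
  if (n.kind == KIND_INTEGER)
    return (double) n.value.integer;
  return n.value.floating;
}
```

- The @code{kind} field costs a little space, but it means the code
- never has to guess which alternative currently holds the data.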
- Assignment works on any union type by copying the entire value.
- @node Packing With Unions
- @section Packing With Unions
- Sometimes we design a union with the intention of packing various
- kinds of objects into a certain amount of memory space. For example,
- @example
- union bytes8
- @{
- long long big_int_elt;
- double double_elt;
- struct @{ int first, second; @} two_ints;
- struct @{ void *first, *second; @} two_ptrs;
- @};
- union bytes8 *p;
- @end example
- This union makes it possible to look at 8 bytes of data that @code{p}
- points to as a single 8-byte integer (@code{p->big_int_elt}), as a
- single floating-point number (@code{p->double_elt}), as a pair of
- integers (@code{p->two_ints.first} and @code{p->two_ints.second}), or
- as a pair of pointers (@code{p->two_ptrs.first} and
- @code{p->two_ptrs.second}).
- To pack storage with such a union makes assumptions about the sizes of
- all the types involved. This particular union was written expecting a
- pointer to have the same size as @code{int}. On a machine where one
- pointer takes 8 bytes, the code using this union probably won't work
- as expected. The union, as such, will function correctly---if you
- store two values through @code{two_ints} and extract them through
- @code{two_ints}, you will get the same integers back---but the part of
- the program that expects the union to be 8 bytes long could
- malfunction, or at least use too much space.
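- A program that depends on the 8-byte assumption can test it rather
- than trust it. The following sketch repeats the @code{union bytes8}
- definition and adds a helper, @code{bytes8_fits}, whose name is
- invented for this example.

```c
union bytes8
{
  long long big_int_elt;
  double double_elt;
  struct { int first, second; } two_ints;
  struct { void *first, *second; } two_ptrs;
};

/* Return 1 if the union really occupies 8 bytes on this machine.  */
int
bytes8_fits (void)
{
  return sizeof (union bytes8) == 8;
}

/* A stricter alternative is to reject the build outright:
     _Static_assert (sizeof (union bytes8) == 8, "bytes8 is not 8 bytes");
   It is left as a comment so that this sketch still compiles on
   machines where a pointer takes 8 bytes.  */
```

- On a machine with 8-byte pointers, @code{bytes8_fits} is expected to
- return 0, because the @code{two_ptrs} alternative alone needs 16 bytes.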
- The above example shows one case where a @code{struct} type with no
- tag can be useful. Another way to get effectively the same result
- is with arrays as members of the union:
- @example
- union eight_bytes
- @{
- long long big_int_elt;
- double double_elt;
- int two_ints[2];
- void *two_ptrs[2];
- @};
- @end example
- @node Cast to Union
- @section Cast to a Union Type
- @cindex cast to a union
- @cindex union, casting to a
- In GNU C, you can explicitly cast any of the alternative types to the
- union type; for instance,
- @example
- (union eight_bytes) (long long) 5
- @end example
- @noindent
- makes a value of type @code{union eight_bytes} which gets its contents
- through the alternative named @code{big_int_elt}.
- The value being cast must exactly match the type of the alternative,
- so this is not valid:
- @example
- (union eight_bytes) 5 /* @r{Error! 5 is @code{int}.} */
- @end example
- A cast to union type looks like any other cast, except that the type
- specified is a union type. You can specify the type either with
- @code{union @var{tag}} or with a typedef name (@pxref{Defining
- Typedef Names}).
- Using the cast as the right-hand side of an assignment to a variable of
- union type is equivalent to storing in an alternative of the union:
- @example
- union foo u;
- u = (union foo) x @r{means} u.i = x
- u = (union foo) y @r{means} u.d = y
- @end example
- You can also use the union cast as a function argument:
- @example
- void hack (union foo);
- @r{@dots{}}
- hack ((union foo) x);
- @end example
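- The examples above do not show the definition of @code{union foo}.
- The sketch below assumes it has an @code{int} alternative @code{i}
- and a @code{double} alternative @code{d}; the function names
- @code{stored_int} and @code{cast_demo} are invented here. Since the
- cast is a GNU C extension, this needs a compiler that supports it.

```c
/* Assumed definition; the text does not show the real one.  */
union foo
{
  int i;
  double d;
};

/* Return the value held in alternative i.  */
int
stored_int (union foo u)
{
  return u.i;
}

int
cast_demo (int x)
{
  /* x has type int, which exactly matches alternative i,
     so the cast stores x through i.  */
  return stored_int ((union foo) x);
}
```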
- @node Structure Constructors
- @section Structure Constructors
- @cindex structure constructors
- @cindex constructors, structure
- You can construct a structure value by writing its type in
- parentheses, followed by an initializer that would be valid in a
- declaration for that type. For instance, given this declaration,
- @example
- struct foo @{int a; char b[2];@} structure;
- @end example
- @noindent
- you can create a @code{struct foo} value as follows:
- @example
- ((struct foo) @{x + y, 'a', 0@})
- @end example
- @noindent
- This specifies @code{x + y} for field @code{a},
- the character @samp{a} for field @code{b}'s element 0,
- and the null character for field @code{b}'s element 1.
- The parentheses around that constructor are not necessary, but we
- recommend writing them to make the nesting of the containing
- expression clearer.
- You can also show the nesting of the two by writing it like
- this:
- @example
- ((struct foo) @{x + y, @{'a', 0@} @})
- @end example
- Each of those is equivalent to writing the following statement
- expression (@pxref{Statement Exprs}):
- @example
- (@{
- struct foo temp = @{x + y, 'a', 0@};
- temp;
- @})
- @end example
- You can also create a union value this way, but it is not especially
- useful since that is equivalent to doing a cast:
- @example
- ((union whosis) @{@var{value}@})
- @r{is equivalent to}
- ((union whosis) (@var{value}))
- @end example
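- A typical use of a structure constructor is to pass a structure to a
- function without declaring a variable for it first. Here is a small
- sketch; the functions @code{sum_fields} and @code{constructor_demo}
- are hypothetical names chosen for this example.

```c
struct foo
{
  int a;
  char b[2];
};

/* Add up all the fields of f.  */
int
sum_fields (struct foo f)
{
  return f.a + f.b[0] + f.b[1];
}

int
constructor_demo (int x, int y)
{
  /* Field a gets x + y; b[0] gets 'a'; b[1] gets the null character.  */
  return sum_fields ((struct foo) { x + y, { 'a', 0 } });
}
```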
- @node Unnamed Types as Fields
- @section Unnamed Types as Fields
- @cindex unnamed structures
- @cindex unnamed unions
- @cindex structures, unnamed
- @cindex unions, unnamed
- A structure or a union can contain, as fields,
- unnamed structures and unions. Here's an example:
- @example
- struct
- @{
- int a;
- union
- @{
- int b;
- float c;
- @};
- int d;
- @} foo;
- @end example
- @noindent
- You can access the fields of the unnamed union within @code{foo} as if they
- were individual fields at the same level as the union definition:
- @example
- foo.a = 42;
- foo.b = 47;
- foo.c = 5.25; // @r{Overwrites the value in @code{foo.b}}.
- foo.d = 314;
- @end example
- Avoid using field names that could cause ambiguity. For example, with
- this definition:
- @example
- struct
- @{
- int a;
- struct
- @{
- int a;
- float b;
- @};
- @} foo;
- @end example
- @noindent
- it is impossible to tell what @code{foo.a} refers to. GNU C reports
- an error when a definition is ambiguous in this way.
- @node Incomplete Types
- @section Incomplete Types
- @cindex incomplete types
- @cindex types, incomplete
- A type that has not been fully defined is called an @dfn{incomplete
- type}. Structure and union types are incomplete when the code makes a
- forward reference, such as @code{struct foo}, before defining the
- type. An array type is incomplete when its length is unspecified.
- You can't use an incomplete type to declare a variable or field, or
- use it for a function parameter or return type. The operators
- @code{sizeof} and @code{_Alignof} give errors when used on an
- incomplete type.
- However, you can define a pointer to an incomplete type, and declare a
- variable or field with such a pointer type. In general, you can do
- everything with such pointers except dereference them. For example:
- @example
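- /* @r{Forward declaration, so both functions refer to the same type.} */
- struct mysterious_value;
- extern void bar (struct mysterious_value *);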
- void
- foo (struct mysterious_value *arg)
- @{
- bar (arg);
- @}
- @r{@dots{}}
- @{
- struct mysterious_value *p, **q;
- p = *q;
- foo (p);
- @}
- @end example
- @noindent
- These examples are valid because the code doesn't try to understand
- what @code{p} points to; it just passes the pointer around.
- (Presumably @code{bar} is defined in some other file that really does
- have a definition for @code{struct mysterious_value}.) However,
- dereferencing the pointer would get an error; that requires a
- definition for the structure type.
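- The same idea can be shown within one file: code written before the
- structure is defined can pass pointers around, and only code written
- after the definition can look inside. In this sketch, the type
- @code{struct counter} and all the function names are invented for
- illustration.

```c
#include <stdlib.h>

struct counter;                 /* Incomplete type from here on.  */

struct counter *counter_new (void);
int counter_next (struct counter *c);
void counter_free (struct counter *c);

/* Valid while the type is incomplete: the pointer is only passed on.  */
int
take_two (struct counter *c)
{
  counter_next (c);
  return counter_next (c);
}

/* The definition completes the type.  */
struct counter
{
  int value;
};

struct counter *
counter_new (void)
{
  struct counter *c = malloc (sizeof *c);
  if (c != NULL)
    c->value = 0;
  return c;
}

/* Dereferencing is allowed here because the type is now complete.  */
int
counter_next (struct counter *c)
{
  return ++c->value;
}

void
counter_free (struct counter *c)
{
  free (c);
}
```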
- @node Intertwined Incomplete Types
- @section Intertwined Incomplete Types
- When several structure types contain pointers to each other, you can
- define the types in any order because pointers to types that come
- later are pointers to incomplete types, which are allowed. Here is
- an example.
- @example
- /* @r{An employee record points to a group.} */
- struct employee
- @{
- char *name;
- @r{@dots{}}
- struct group *group; /* @r{incomplete type.} */
- @r{@dots{}}
- @};
- /* @r{An employee list points to employees.} */
- struct employee_list
- @{
- struct employee *this_one;
- struct employee_list *next; /* @r{incomplete type.} */
- @r{@dots{}}
- @};
- /* @r{A group points to one employee_list.} */
- struct group
- @{
- char *name;
- @r{@dots{}}
- struct employee_list *employees;
- @r{@dots{}}
- @};
- @end example
- @node Type Tags
- @section Type Tags
- @cindex type tags
- The name that follows @code{struct} (@pxref{Structures}), @code{union}
- (@pxref{Unions}), or @code{enum} (@pxref{Enumeration Types}) is called
- a @dfn{type tag}. In C, a type tag never conflicts with a variable
- name or function name; the type tags have a separate @dfn{name space}.
- Thus, there is no name conflict in this code:
- @example
- struct pair @{ int a, b; @};
- int pair = 1;
- @end example
- @noindent
- nor in this one:
- @example
- struct pair @{ int a, b; @} pair;
- @end example
- @noindent
- where @code{pair} is both a structure type tag and a variable name.
- However, @code{struct}, @code{union}, and @code{enum} share the same
- name space of tags, so this is a conflict:
- @example
- struct pair @{ int a, b; @};
- enum pair @{ c, d @};
- @end example
- @noindent
- and so is this:
- @example
- struct pair @{ int a, b; @};
- struct pair @{ int c, d; @};
- @end example
- When the code defines a type tag inside a block, the tag's scope is
- limited to that block (as for local variables). Two definitions for
- one type tag do not conflict if they are in different scopes; rather,
- each is valid in its scope. For example,
- @example
- struct pair @{ int a, b; @};
- void
- pair_up_doubles (int len, double array[])
- @{
- struct pair @{ double a, b; @};
- @r{@dots{}}
- @}
- @end example
- @noindent
- has two definitions for @code{struct pair} which do not conflict. The
- one inside the function applies only within the definition of
- @code{pair_up_doubles}. Within its scope, that definition
- @dfn{shadows} the outer definition.
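- One way to observe the shadowing is to compare @code{sizeof} inside
- and outside the function. This is only a sketch; the function names
- @code{outer_pair_size} and @code{inner_pair_size} are made up here.

```c
#include <stddef.h>

struct pair { int a, b; };

/* Here struct pair means the file-scope definition.  */
size_t
outer_pair_size (void)
{
  return sizeof (struct pair);
}

size_t
inner_pair_size (void)
{
  /* This definition shadows the outer one until the block ends.  */
  struct pair { double a, b; };
  return sizeof (struct pair);
}
```

- The two functions report the sizes of two different types that
- happen to share the tag @code{pair}.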
- If @code{struct pair} appears inside the function body, before the
- inner definition, it refers to the outer definition---the only one
- that has been seen at that point. Thus, in this code,
- @example
- struct pair @{ int a, b; @};
- void
- pair_up_doubles (int len, double array[])
- @{
- struct two_pairs @{ struct pair *p, *q; @};
- struct pair @{ double a, b; @};
- @r{@dots{}}
- @}
- @end example
- @noindent
- the structure @code{two_pairs} has pointers to the outer definition of
- @code{struct pair}, which is probably not desirable.
- To prevent that, you can write @code{struct pair;} inside the function
- body as a variable declaration with no variables. This is a
- @dfn{forward declaration} of the type tag @code{pair}: it makes the
- type tag local to the current block, with the details of the type to
- come later. Here's an example:
- @example
- void
- pair_up_doubles (int len, double array[])
- @{
- /* @r{Forward declaration for @code{pair}.} */
- struct pair;
- struct two_pairs @{ struct pair *p, *q; @};
- /* @r{Give the details.} */
- struct pair @{ double a, b; @};
- @r{@dots{}}
- @}
- @end example
- However, the cleanest practice is to avoid shadowing type tags.
- @node Arrays
- @chapter Arrays
- @cindex array
- @cindex elements of arrays
- An @dfn{array} is a data object that holds a series of @dfn{elements},
- all of the same data type. Each element is identified by its numeric
- @var{index} within the array.
- We presented arrays of numbers in the sample programs early in this
- manual (@pxref{Array Example}). However, arrays can have elements of
- any data type, including pointers, structures, unions, and other
- arrays.
- If you know another programming language, you may suppose that you know all
- about arrays, but C arrays have special quirks, so in this chapter we
- collect all the information about arrays in C@.
- The elements of a C array are allocated consecutively in memory,
- with no gaps between them. Each element is aligned as required
- for its data type (@pxref{Type Alignment}).
- @menu
- * Accessing Array Elements:: How to access individual elements of an array.
- * Declaring an Array:: How to name and reserve space for a new array.
- * Strings:: A string in C is a special case of array.
- * Array Type Designators:: Referring to a specific array type.
- * Incomplete Array Types:: Naming, but not allocating, a new array.
- * Limitations of C Arrays:: Arrays are not first-class objects.
- * Multidimensional Arrays:: Arrays of arrays.
- * Constructing Array Values:: Assigning values to an entire array at once.
- * Arrays of Variable Length:: Declaring arrays of non-constant size.
- @end menu
- @node Accessing Array Elements
- @section Accessing Array Elements
- @cindex accessing array elements
- @cindex array elements, accessing
- If the variable @code{a} is an array, the @var{n}th element of
- @code{a} is @code{a[@var{n}]}. You can use that expression to access
- an element's value or to assign to it:
- @example
- x = a[5];
- a[6] = 1;
- @end example
- @noindent
- Since the variable @code{a} is an lvalue, @code{a[@var{n}]} is also an
- lvalue.
- The lowest valid index in an array is 0, @emph{not} 1, and the highest
- valid index is one less than the number of elements.
- The C language does not check whether array indices are in bounds, so
- if the code uses an out-of-range index, it will access memory outside the
- array.
- @strong{Warning:} Using only valid index values in C is the
- programmer's responsibility.
- Array indexing in C is not a primitive operation: it is defined in
- terms of pointer arithmetic and dereferencing. Now that we know
- @emph{what} @code{a[i]} does, we can ask @emph{how} @code{a[i]} does
- its job.
- In C, @code{@var{x}[@var{y}]} is an abbreviation for
- @code{*(@var{x}+@var{y})}. Thus, @code{a[i]} really means
- @code{*(a+i)}. @xref{Pointers and Arrays}.
- When an expression with array type (such as @code{a}) appears as part
- of a larger C expression, it is converted automatically to a pointer
- to element zero of that array. For instance, @code{a} in an
- expression is equivalent to @code{&a[0]}. Thus, @code{*(a+i)} is
- computed as @code{*(&a[0]+i)}.
- Now we can analyze how that expression gives us the desired element of
- the array. It makes a pointer to element 0 of @code{a}, advances it
- by the value of @code{i}, and dereferences that pointer.
- Another equivalent way to write the expression is @code{(&a[0])[i]}.
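- The equivalences described above can be checked directly. The
- following sketch uses a small local array; the function name
- @code{index_forms_agree} is invented for this example, and @code{i}
- is assumed to be a valid index from 0 to 3.

```c
/* Return 1 if all the ways of writing a[i] give the same element.  */
int
index_forms_agree (int i)
{
  int a[4] = { 10, 20, 30, 40 };

  return (a[i] == *(a + i)           /* Definition of indexing.  */
          && a[i] == *(&a[0] + i)    /* a converted to &a[0].  */
          && a[i] == (&a[0])[i]);    /* Indexing the pointer.  */
}
```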
- @node Declaring an Array
- @section Declaring an Array
- @cindex declaring an array
- @cindex array, declaring
- To make an array declaration, write @code{[@var{length}]} after the
- name being declared. This construct is valid in the declaration of a
- variable, a function parameter, a function value type (the value can't
- be an array, but it can be a pointer to one), a structure field, or a
- union alternative.
- The surrounding declaration specifies the element type of the array;
- that can be any type of data, but not @code{void} or a function type.
- For instance,
- @example
- double a[5];
- @end example
- @noindent
- declares @code{a} as an array of 5 @code{double}s.
- @example
- struct foo bstruct[length];
- @end example
- @noindent
- declares @code{bstruct} as an array of @code{length} objects of type
- @code{struct foo}. A variable array size like this is allowed when
- the array is not file-scope.
- Other declaration constructs can nest within the array declaration
- construct. For instance:
- @example
- struct foo *b[length];
- @end example
- @noindent
- declares @code{b} as an array of @code{length} pointers to
- @code{struct foo}. This shows that the length need not be a constant
- (@pxref{Arrays of Variable Length}).
- @example
- double (*c)[5];
- @end example
- @noindent
- declares @code{c} as a pointer to an array of 5 @code{double}s, and
- @example
- char *(*f (int))[5];
- @end example
- @noindent
- declares @code{f} as a function taking an @code{int} argument and
- returning a pointer to an array of 5 strings (pointers to
- @code{char}s).
- @example
- double aa[5][10];
- @end example
- @noindent
- declares @code{aa} as an array of 5 elements, each of which is an
- array of 10 @code{double}s. This shows how to declare a
- multidimensional array in C (@pxref{Multidimensional Arrays}).
- All these declarations specify the array's length, which is needed in
- these cases in order to allocate storage for the array.
- @node Strings
- @section Strings
- @cindex string
- A string in C is a sequence of elements of type @code{char},
- terminated with the null character, the character with code zero.
- Programs often need to use strings with specific, fixed contents. To
- write one in a C program, use a @dfn{string constant} such as
- @code{"Take me to your leader!"}. The data type of a string constant
- is @code{char *}. For the full syntactic details of writing string
- constants, see @ref{String Constants}.
- To declare a place to store a non-constant string, declare an array of
- @code{char}. Keep in mind that it must include one extra @code{char}
- for the terminating null. For instance,
- @example
- char text[] = @{ 'H', 'e', 'l', 'l', 'o', 0 @};
- @end example
- @noindent
- declares an array named @samp{text} with six elements---five letters
- and the terminating null character. An equivalent way to get the same
- result is this,
- @example
- char text[] = "Hello";
- @end example
- @noindent
- which copies the elements of the string constant, including @emph{its}
- terminating null character.
- @example
- char message[200];
- @end example
- @noindent
- declares an array long enough to hold a string of 199 ASCII characters
- plus the terminating null character.
- When you store a string into @code{message} be sure to check or prove
- that the length does not exceed its size. For example,
- @example
- void
- set_message (char *text)
- @{
- int i;
- for (i = 0; i < sizeof (message); i++)
- @{
- message[i] = text[i];
- if (text[i] == 0)
- return;
- @}
- fatal_error ("Message is too long for `message'");
- @}
- @end example
- It's easy to do this with the standard library function
- @code{strncpy}, which fills out the whole destination array (up to a
- specified length) with null characters. Thus, if the last character
- of the destination is not null, the string did not fit. Many system
- libraries, including the GNU C library, hand-optimize @code{strncpy}
- to run faster than an explicit @code{for}-loop.
- Here's what the code looks like:
- @example
- void
- set_message (char *text)
- @{
- strncpy (message, text, sizeof (message));
- if (message[sizeof (message) - 1] != 0)
- fatal_error ("Message is too long for `message'");
- @}
- @end example
- @xref{String and Array Utilities, The GNU C Library, , libc, The GNU C
- Library Reference Manual}, for more information about the standard
- library functions for operating on strings.
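- The examples above call @code{fatal_error}, which this manual does
- not define. As a self-contained variation on the same
- @code{strncpy} technique, the sketch below reports the problem with
- a return value instead. The function @code{copy_checked} and its
- parameters are hypothetical.

```c
#include <string.h>

/* Copy src into dest, which has room for size chars.
   Return 0 if the whole string fit, or -1 if it did not.  */
int
copy_checked (char *dest, size_t size, const char *src)
{
  if (size == 0)
    return -1;

  /* strncpy pads the rest of dest with null characters.  */
  strncpy (dest, src, size);

  /* A non-null last character means the string did not fit.  */
  if (dest[size - 1] != 0)
    {
      dest[size - 1] = 0;       /* Keep dest a valid string.  */
      return -1;
    }
  return 0;
}
```

- With a six-element destination, @code{"Hello"} fits and
- @code{"Hello!"} does not, since the latter needs seven elements
- including its terminating null.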
- You can avoid putting a fixed length limit on strings you construct or
- operate on by allocating the space for them dynamically.
- @xref{Dynamic Memory Allocation}.
- @node Array Type Designators
- @section Array Type Designators
- Every C type has a type designator, which you make by deleting the
- variable name and the semicolon from a declaration (@pxref{Type
- Designators}). The designators for array types follow this rule, but
- they may appear surprising.
- @example
- @r{type} int a[5]; @r{designator} int [5]
- @r{type} double a[5][3]; @r{designator} double [5][3]
- @r{type} struct foo *a[5]; @r{designator} struct foo *[5]
- @end example
- @node Incomplete Array Types
- @section Incomplete Array Types
- @cindex incomplete array types
- @cindex array types, incomplete
- An array is equivalent, for most purposes, to a pointer to its zeroth
- element. When that is true, the length of the array is irrelevant.
- The length needs to be known only for allocating space for the array, or
- for @code{sizeof} and @code{typeof} (@pxref{Auto Type}). Thus, in some
- contexts C allows an array declaration to omit the length:
- @itemize @bullet
- @item
- An @code{extern} declaration says how to refer to a variable allocated
- elsewhere. It does not need to allocate space for the variable,
- so if it is an array, you can omit the length. For example,
- @example
- extern int foo[];
- @end example
- @item
- When declaring a function parameter as an array, the argument value
- passed to the function is really a pointer to the array's zeroth
- element. Since this value does not say how long the array really is,
- there is no need to declare the length. For example,
- @example
- int
- func (int foo[])
- @end example
- @end itemize
- These declarations are examples of @dfn{incomplete} array types, types
- that are not fully specified. The incompleteness makes no difference
- for accessing elements of the array, but it matters for some other
- things. For instance, @code{sizeof} is not allowed on an incomplete
- type.
- With multidimensional arrays, only the first dimension can be omitted:
- @example
- extern struct chesspiece *funnyboard[][8];
- @end example
- In other words, the code doesn't have to say how many rows there are,
- but it must state how big each row is.
- @node Limitations of C Arrays
- @section Limitations of C Arrays
- @cindex limitations of C arrays
- @cindex first-class object
- Arrays have quirks in C because they are not ``first-class objects'':
- there is no way in C to operate on an array as a unit.
- The other composite objects in C, structures and unions, are
- first-class objects: a C program can copy a structure or union value
- in an assignment, or pass one as an argument to a function, or make a
- function return one. You can't do those things with an array in C@.
- That is because a value you can operate on never has an array type.
- An expression in C can have an array type, but that doesn't produce
- the array as a value. Instead it is converted automatically to a
- pointer to the array's element at index zero. The code can operate
- on the pointer, and through that on individual elements of the array,
- but it can't get and operate on the array as a unit.
- There are three exceptions to this conversion rule, but none of them
- offers a way to operate on the array as a whole.
- First, @samp{&} applied to an expression with array type gives you the
- address of the array, as a pointer-to-array type. However, you can't operate on the
- whole array that way---if you apply @samp{*} to get the array back,
- that expression converts, as usual, to a pointer to its zeroth
- element.
- Second, the operators @code{sizeof}, @code{_Alignof}, and
- @code{typeof} do not convert the array to a pointer; they leave it as
- an array. But they don't operate on the array's data---they only give
- information about its type.
- Third, a string constant used as an initializer for an array is not
- converted to a pointer---rather, the declaration copies the
- @emph{contents} of that string in that one special case.
- You @emph{can} copy the contents of an array, just not with an
- assignment operator. You can do it by calling the library function
- @code{memcpy} or @code{memmove} (@pxref{Copying and Concatenation, The
- GNU C Library, , libc, The GNU C Library Reference Manual}). Also,
- when a structure contains just an array, you can copy that structure.
- An array itself is an lvalue if it is a declared variable, or part of
- a structure or union that is an lvalue. When you construct an array
- from elements (@pxref{Constructing Array Values}), that array is not
- an lvalue.
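- The two ways of copying an array mentioned above can be sketched as
- follows. The structure type @code{struct int4} and the function
- @code{copy_demo} are invented for this example.

```c
#include <string.h>

/* A structure that contains just an array.  */
struct int4
{
  int elt[4];
};

int
copy_demo (void)
{
  int src[4] = { 1, 2, 3, 4 };
  int dst[4];
  struct int4 a = { { 5, 6, 7, 8 } };
  struct int4 b;

  /* Writing dst = src; would be an error, so copy the bytes.  */
  memcpy (dst, src, sizeof dst);

  /* Structure assignment copies the array inside the structure.  */
  b = a;

  return dst[3] + b.elt[3];
}
```

- Note that @code{sizeof dst} gives the size of the whole array here
- because @code{sizeof} does not convert the array to a pointer.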
- @node Multidimensional Arrays
- @section Multidimensional Arrays
- @cindex multidimensional arrays
- @cindex array, multidimensional
- Strictly speaking, all arrays in C are unidimensional. However, you
- can create an array of arrays, which is more or less equivalent to a
- multidimensional array. For example,
- @example
- struct chesspiece *board[8][8];
- @end example
- @noindent
- declares an array of 8 arrays of 8 pointers to @code{struct
- chesspiece}. This data type could represent the state of a chess
- game. To access one square's contents requires two array index
- operations, one for each dimension. For instance, you can write
- @code{board[row][column]}, assuming @code{row} and @code{column}
- are variables with integer values in the proper range.
- How does C understand @code{board[row][column]}? First of all,
- @code{board} is converted automatically to a pointer to the zeroth
- element (at index zero) of @code{board}. Adding @code{row} to that
- makes it point to the desired element. Thus, @code{board[row]}'s
- value is an element of @code{board}---an array of 8 pointers.
- However, as an expression with array type, it is converted
- automatically to a pointer to the array's zeroth element. The second
- array index operation, @code{[column]}, accesses the chosen element
- from that array.
- As this shows, pointer-to-array types are meaningful in C@.
- You can declare a variable that points to a row in a chess board
- like this:
- @example
- struct chesspiece *(*rowptr)[8];
- @end example
- @noindent
- This points to an array of 8 pointers to @code{struct chesspiece}.
- You can assign to it as follows:
- @example
- rowptr = &board[5];
- @end example
- The dimensions don't have to be equal in length. Here we declare
- @code{statepop} as an array to hold the population of each state in
- the United States for each year since 1900:
- @example
- #define NSTATES 50
- @{
- int nyears = current_year - 1900 + 1;
- int statepop[NSTATES][nyears];
- @r{@dots{}}
- @}
- @end example
- The variable @code{statepop} is an array of @code{NSTATES} subarrays,
- each indexed by the year (counting from 1900). Thus, to get the
- element for a particular state and year, we must subscript it first
- by the number that indicates the state, and second by the index for
- the year:
- @example
- statepop[state][year - 1900]
- @end example
- @cindex array, layout in memory
- The subarrays within the multidimensional array are allocated
- consecutively in memory, and within each subarray, its elements are
- allocated consecutively in memory. The most efficient way to process
- all the elements in the array is to scan the last subscript in the
- innermost loop. This means consecutive accesses go to consecutive
- memory locations, which optimizes use of the processor's memory cache.
- For example:
- @example
- int total = 0;
- float average;
- for (int state = 0; state < NSTATES; ++state)
- @{
- for (int year = 0; year < nyears; ++year)
- @{
- total += statepop[state][year];
- @}
- @}
- average = (float) total / nyears;
- @end example
- C's layout for multidimensional arrays is different from Fortran's
- layout. In Fortran, a multidimensional array is not an array of
- arrays; rather, multidimensional arrays are a primitive feature, and
- it is the first index that varies most rapidly between consecutive
- memory locations. Thus, the memory layout of a 50x114 array in C
- matches that of a 114x50 array in Fortran.
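- To make the C layout concrete, the following sketch checks that
- element @code{[i][j]} of a small array sits @code{i * COLS + j}
- elements past the start of the array. The macro names and the
- function @code{row_major_offset_ok} are chosen just for this example.

```c
#include <stddef.h>

#define ROWS 3
#define COLS 4

/* Return 1 if element [i][j] lies (i * COLS + j) elements past the
   start of the array, as C's layout predicts.  */
int
row_major_offset_ok (int i, int j)
{
  int m[ROWS][COLS];

  /* Measure the distance in bytes from the start of the whole array.  */
  size_t byte_offset = (size_t) ((char *) &m[i][j] - (char *) m);

  return byte_offset == (size_t) (i * COLS + j) * sizeof (int);
}
```

- The last subscript, @code{j}, is the one that moves between adjacent
- memory locations, which is why it belongs in the innermost loop.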
- @node Constructing Array Values
- @section Constructing Array Values
- @cindex constructing array values
- @cindex array values, constructing
- You can construct an array from elements by writing them inside
- braces, and preceding all that with the array type's designator in
- parentheses. There is no need to specify the array length, since the
- number of elements determines that. The constructor looks like this:
- @example
- (@var{elttype}[]) @{ @var{elements} @};
- @end example
- Here is an example, which constructs an array of string pointers:
- @example
- (char *[]) @{ "x", "y", "z" @};
- @end example
- That's equivalent in effect to declaring an array with the same
- initializer, like this:
- @example
- char *array[] = @{ "x", "y", "z" @};
- @end example
- and then using the array.
- If all the elements are simple constant expressions, or made up of
- such, then the compound literal can be coerced to a pointer to its
- zeroth element and used to initialize a file-scope variable
- (@pxref{File-Scope Variables}), as shown here:
- @example
- char **foo = (char *[]) @{ "x", "y", "z" @};
- @end example
- @noindent
- The data type of @code{foo} is @code{char **}, which is a pointer
- type, not an array type. The declaration is equivalent to defining
- and then using an array-type variable:
- @example
- char *nameless_array[] = @{ "x", "y", "z" @};
- char **foo = &nameless_array[0];
- @end example
- @node Arrays of Variable Length
- @section Arrays of Variable Length
- @cindex array of variable length
- @cindex variable-length arrays
- In GNU C, you can declare variable-length arrays like any other
- arrays, but with a length that is not a constant expression. The
- storage is allocated at the point of declaration and deallocated when
- the block scope containing the declaration exits. For example:
- @example
- #include <stdio.h> /* @r{Defines @code{FILE}.} */
- #include <string.h> /* @r{Declares @code{strlen}, @code{strcpy}, @code{strcat}.} */
- FILE *
- concat_fopen (char *s1, char *s2, char *mode)
- @{
- char str[strlen (s1) + strlen (s2) + 1];
- strcpy (str, s1);
- strcat (str, s2);
- return fopen (str, mode);
- @}
- @end example
- @noindent
- (This uses some standard library functions; see @ref{String and Array
- Utilities, , , libc, The GNU C Library Reference Manual}.)
- The length of an array is computed once when the storage is allocated
- and is remembered for the scope of the array in case it is used in
- @code{sizeof}.
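- The following sketch illustrates that the length is fixed when the
- array is allocated: changing the variable used in the declaration
- afterward has no effect on @code{sizeof}. The function name
- @code{vla_length} is invented, and @code{n} is assumed to be
- positive and small.

```c
#include <stddef.h>

/* Return the number of elements in a variable-length array
   declared with length n.  */
size_t
vla_length (int n)
{
  double v[n];          /* Length is computed here, once.  */

  n *= 2;               /* This does not resize v.  */

  return sizeof v / sizeof v[0];
}
```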
- @strong{Warning:} don't allocate a variable-length array if the size
- might be very large (more than 100,000), or in a recursive function,
- because that is likely to cause stack overflow. Allocate the array
- dynamically instead (@pxref{Dynamic Memory Allocation}).
- Jumping or breaking out of the scope of the array name deallocates the
- storage. Jumping into the scope is not allowed; that gives an error
- message.
- You can also use variable-length arrays as arguments to functions:
- @example
- struct entry
- tester (int len, char data[len][len])
- @{
- @r{@dots{}}
- @}
- @end example
- As usual, a function argument declared with an array type
- is really a pointer to an array that already exists.
- Calling the function does not allocate the array, so there's no
- particular danger of stack overflow in using this construct.
- To pass the array first and the length afterward, use a forward
- declaration in the function's parameter list (another GNU extension).
- For example,
- @example
- struct entry
- tester (int len; char data[len][len], int len)
- @{
- @r{@dots{}}
- @}
- @end example
- The @code{int len} before the semicolon is a @dfn{parameter forward
- declaration}, and it serves the purpose of making the name @code{len}
- known when the declaration of @code{data} is parsed.
- You can write any number of such parameter forward declarations in the
- parameter list. They can be separated by commas or semicolons, but
- the last one must end with a semicolon, which is followed by the
- ``real'' parameter declarations. Each forward declaration must match
- a ``real'' declaration in parameter name and data type. ISO C11 does
- not support parameter forward declarations.
- @node Enumeration Types
- @chapter Enumeration Types
- @cindex enumeration types
- @cindex types, enumeration
- @cindex enumerator
- An @dfn{enumeration type} represents a limited set of integer values,
- each with a name. It is effectively equivalent to a primitive integer
- type.
- Suppose we have a list of possible emotional states to store in an
- integer variable. We can give names to these alternative values with
- an enumeration:
- @example
- enum emotion_state @{ neutral, happy, sad, worried,
- calm, nervous @};
- @end example
- @noindent
- (Never mind that this is a simplistic way to classify emotional states;
- it's just a code example.)
- The names inside the enumeration are called @dfn{enumerators}. The
- enumeration type defines them as constants, and their values are
- consecutive integers; @code{neutral} is 0, @code{happy} is 1,
- @code{sad} is 2, and so on. Alternatively, you can specify values for
- the enumerators explicitly like this:
- @example
- enum emotion_state @{ neutral = 2, happy = 5,
- sad = 20, worried = 10,
- calm = -5, nervous = -300 @};
- @end example
- Each enumerator which does not specify a value gets value zero
- (if it is at the beginning) or the next consecutive integer.
- @example
- /* @r{@code{neutral} is 0 by default,}
- @r{and @code{worried} is 21 by default.} */
- enum emotion_state @{ neutral,
- happy = 5, sad = 20, worried,
- calm = -5, nervous = -300 @};
- @end example
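- As a small sketch of these defaulting rules, the following repeats the
- enumeration and adds a helper, @code{emotion_value}, that converts an
- enumerator to @code{int} so the assigned values can be inspected. The
- helper name is hypothetical and is used only for this illustration.

```c
/* Same enumeration as above: an enumerator without an explicit value
   gets zero (if it is first) or one more than the preceding one.  */
enum emotion_state { neutral,
                     happy = 5, sad = 20, worried,
                     calm = -5, nervous = -300 };

/* Hypothetical helper: expose an enumerator's integer value.  */
int
emotion_value (enum emotion_state e)
{
  return (int) e;
}
```

- Under these rules, @code{emotion_value (neutral)} is 0 and
- @code{emotion_value (worried)} is 21.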
- If an enumerator is obsolete, you can specify that using it should
- cause a warning, by including an attribute in the enumerator's
- declaration. Here is how @code{happy} would look with this
- attribute:
- @example
- happy __attribute__
- ((deprecated
- ("impossible under plutocratic rule")))
- = 5,
- @end example
- @xref{Attributes}.
- You can declare variables with the enumeration type:
- @example
- enum emotion_state feelings_now;
- @end example
- In the C code itself, this is equivalent to declaring the variable
- @code{int}. (If all the enumeration values are nonnegative, it is
- equivalent to @code{unsigned int}.) However, declaring it with the
- enumeration type has an advantage in debugging, because GDB knows it
- should display the current value of the variable using the
- corresponding name. If the variable's type is @code{int}, GDB can
- only show the value as a number.
- The identifier that follows @code{enum} is called a @dfn{type tag}
- since it distinguishes different enumeration types. Type tags are in
- a separate name space and belong to scopes like most other names in C@.
- @xref{Type Tags}, for explanation.
- You can predeclare an @code{enum} type tag, just as you can a structure
- or union type tag, like this:
- @example
- enum foo;
- @end example
- @noindent
- The @code{enum} type is incomplete until you finish defining it.
- You can optionally include a trailing comma at the end of a list of
- enumeration values:
- @example
- enum emotion_state @{ neutral, happy, sad, worried,
- calm, nervous, @};
- @end example
- @noindent
- This is useful in some macro definitions, since it enables you to
- assemble the list of enumerators without knowing which one is last.
- The extra comma does not change the meaning of the enumeration in any
- way.
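- One way the trailing comma helps with macros can be sketched as
- follows. The macro names @code{EMOTION_LIST}, @code{AS_ENUMERATOR} and
- @code{AS_STRING}, and the array @code{emotion_names}, are assumptions
- made up for this illustration. Each expansion emits its own comma, so
- the list needs no special case for its last entry.

```c
/* Hypothetical list macro: applies X to each emotion name.  */
#define EMOTION_LIST(X) \
  X (neutral)           \
  X (happy)             \
  X (sad)

/* Each expansion ends with a comma, including the last one.  */
#define AS_ENUMERATOR(name) name,
#define AS_STRING(name) #name,

/* Expands to: enum emotion_state { neutral, happy, sad, };  */
enum emotion_state { EMOTION_LIST (AS_ENUMERATOR) };

/* Array initializers accept a trailing comma as well.  */
static const char *const emotion_names[] = { EMOTION_LIST (AS_STRING) };
```

- Adding an emotion then means adding one line to @code{EMOTION_LIST};
- the enumeration and the name table stay in step.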
- @node Defining Typedef Names
- @chapter Defining Typedef Names
- @cindex typedef names
- @findex typedef
- You can define a data type keyword as an alias for any type, and then
- use the alias syntactically like a built-in type keyword such as
- @code{int}. You do this using @code{typedef}, so these aliases are
- also called @dfn{typedef names}.
- @code{typedef} is followed by text that looks just like a variable
- declaration, but instead of declaring variables it defines data type
- keywords.
- Here's how to define @code{fooptr} as a typedef alias for the type
- @code{struct foo *}, then declare @code{x} and @code{y} as variables
- with that type:
- @example
- typedef struct foo *fooptr;
- fooptr x, y;
- @end example
- @noindent
- That declaration is equivalent to the following one:
- @example
- struct foo *x, *y;
- @end example
- You can define a typedef alias for any type. For instance, this makes
- @code{frobcount} an alias for type @code{int}:
- @example
- typedef int frobcount;
- @end example
- @noindent
- This doesn't define a new type distinct from @code{int}. Rather,
- @code{frobcount} is another name for the type @code{int}. Once the
- variable is declared, it makes no difference which name the
- declaration used.
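- A minimal sketch of what ``another name for the type'' means in
- practice: the address of a @code{frobcount} variable is an ordinary
- @code{int *}, so it can be passed without a cast to a function that
- expects one. The function names @code{bump} and
- @code{count_two_frobs} are hypothetical.

```c
typedef int frobcount;

/* Takes a pointer to plain int.  */
void
bump (int *counter)
{
  ++*counter;
}

/* A frobcount is an int, so &f has type int * and needs no cast.  */
frobcount
count_two_frobs (void)
{
  frobcount f = 0;
  bump (&f);
  bump (&f);
  return f;
}
```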
- There is a syntactic difference, however, between @code{frobcount} and
- @code{int}: A typedef name cannot be used with
- @code{signed}, @code{unsigned}, @code{long} or @code{short}. It has
- to specify the type all by itself. So you can't write this:
- @example
- unsigned frobcount f1; /* @r{Error!} */
- @end example
- But you can write this:
- @example
- typedef unsigned int unsigned_frobcount;
- unsigned_frobcount f1;
- @end example
- In other words, a typedef name is not an alias for @emph{a keyword}
- such as @code{int}. It stands for a @emph{type}, and that could be
- the type @code{int}.
- Typedef names are in the same name space as functions and variables, so
- you can't use the same name for a typedef and a function, or a typedef
- and a variable. When a typedef is declared inside a code block, it is
- in scope only in that block.
- @strong{Warning:} Avoid defining typedef names that end in @samp{_t},
- because many of these have standard meanings.
- You can redefine a typedef name to the exact same type as its first
- definition, but you cannot redefine a typedef name to a
- different type, even if the two types are compatible. For example, this
- is valid:
- @example
- typedef int frobcount;
- typedef int frotzcount;
- typedef frotzcount frobcount;
- typedef frobcount frotzcount;
- @end example
- @noindent
- because each typedef name is always defined with the same type
- (@code{int}), but this is not valid:
- @example
- enum foo @{f1, f2, f3@};
- typedef enum foo frobcount;
- typedef int frobcount;
- @end example
- @noindent
- Even though the type @code{enum foo} is compatible with @code{int},
- they are not the @emph{same} type.
- @node Statements
- @chapter Statements
- @cindex statements
- A @dfn{statement} specifies computations to be done for effect; it
- does not produce a value, as an expression would. In general a
- statement ends with a semicolon (@samp{;}), but blocks (which are
- statements, more or less) are an exception to that rule.
- @ifnottex
- @xref{Blocks}.
- @end ifnottex
- The places to use statements are inside a block, and inside a
- complex statement. A @dfn{complex statement} contains one or two
- components that are nested statements. Each such component must
- consist of one and only one statement. The way to put multiple
- statements in such a component is to group them into a @dfn{block}
- (@pxref{Blocks}), which counts as one statement.
- The following sections describe the various kinds of statement.
- @menu
- * Expression Statement:: Evaluate an expression, as a statement,
- usually done for a side effect.
- * if Statement:: Basic conditional execution.
- * if-else Statement:: Multiple branches for conditional execution.
- * Blocks:: Grouping multiple statements together.
- * return Statement:: Return a value from a function.
- * Loop Statements:: Repeatedly executing a statement or block.
- * switch Statement:: Multi-way conditional choices.
- * switch Example:: A plausible example of using @code{switch}.
- * Duffs Device:: A special way to use @code{switch}.
- * Case Ranges:: Ranges of values for @code{switch} cases.
- * Null Statement:: A statement that does nothing.
- * goto Statement:: Jump to another point in the source code,
- identified by a label.
- * Local Labels:: Labels with limited scope.
- * Labels as Values:: Getting the address of a label.
- * Statement Exprs:: A series of statements used as an expression.
- @end menu
- @node Expression Statement
- @section Expression Statement
- @cindex expression statement
- @cindex statement, expression
- The most common kind of statement in C is an @dfn{expression statement}.
- It consists of an expression followed by a
- semicolon. The expression's value is discarded, so the expressions
- that are useful are those that have side effects: assignment
- expressions, increment and decrement expressions, and function calls.
- Here are examples of expression statements:
- @smallexample
- x = 5; /* @r{Assignment expression.} */
- p++; /* @r{Increment expression.} */
- printf ("Done\n"); /* @r{Function call expression.} */
- *p; /* @r{Cause @code{SIGSEGV} signal if @code{p} is null.} */
- x + y; /* @r{Useless statement without effect.} */
- @end smallexample
- In very unusual circumstances we use an expression statement
- whose purpose is to get a fault if an address is invalid:
- @smallexample
- volatile char *p;
- @r{@dots{}}
- *p; /* @r{Cause signal if @code{p} is null.} */
- @end smallexample
- If the target of @code{p} is not declared @code{volatile}, the
- compiler might optimize away the memory access, since it knows that
- the value isn't really used. @xref{volatile}.
- @node if Statement
- @section @code{if} Statement
- @cindex @code{if} statement
- @cindex statement, @code{if}
- @findex if
- An @code{if} statement computes an expression to decide
- whether to execute the following statement or not.
- It looks like this:
- @example
- if (@var{condition})
- @var{execute-if-true}
- @end example
- The first thing this does is compute the value of @var{condition}. If
- that is true (nonzero), then it executes the statement
- @var{execute-if-true}. If the value of @var{condition} is false
- (zero), it doesn't execute @var{execute-if-true}; instead, it does
- nothing.
- This is a @dfn{complex statement} because it contains a component
- @var{execute-if-true} that is a nested statement. It must be one
- and only one statement. The way to put multiple statements there is
- to group them into a @dfn{block} (@pxref{Blocks}).
- @node if-else Statement
- @section @code{if-else} Statement
- @cindex @code{if}@dots{}@code{else} statement
- @cindex statement, @code{if}@dots{}@code{else}
- @findex else
- An @code{if}-@code{else} statement computes an expression to decide
- which of two nested statements to execute.
- It looks like this:
- @example
- if (@var{condition})
- @var{if-true-substatement}
- else
- @var{if-false-substatement}
- @end example
- The first thing this does is compute the value of @var{condition}. If
- that is true (nonzero), then it executes the statement
- @var{if-true-substatement}. If the value of @var{condition} is false
- (zero), then it executes the statement @var{if-false-substatement} instead.
- This is a @dfn{complex statement} because it contains components
- @var{if-true-substatement} and @var{if-false-substatement} that are
- nested statements. Each must be one and only one statement. The way
- to put multiple statements in such a component is to group them into a
- @dfn{block} (@pxref{Blocks}).
- @node Blocks
- @section Blocks
- @cindex block
- @cindex compound statement
- A @dfn{block} is a construct that contains multiple statements of any
- kind. It begins with @samp{@{} and ends with @samp{@}}, and has a
- series of statements and declarations in between. Another name for
- blocks is @dfn{compound statements}.
- Is a block a statement? Yes and no. It doesn't @emph{look} like a
- normal statement---it does not end with a semicolon. But you can
- @emph{use} it like a statement; anywhere that a statement is required
- or allowed, you can write a block and consider that block a statement.
- So far it seems that a block is a kind of statement with an unusual
- syntax. But that is not entirely true: a function body is also a
- block, and that block is definitely not a statement. The text after a
- function header is not treated as a statement; only a function body is
- allowed there, and nothing else would be meaningful there.
- In a formal grammar we would have to choose---either a block is a kind
- of statement or it is not. But this manual is meant for humans, not
- for parser generators. The clearest answer for humans is, ``a block
- is a statement, in some ways.''
- @cindex nested block
- @cindex internal block
- A block that isn't a function body is called an @dfn{internal block}
- or a @dfn{nested block}. You can put a nested block directly inside
- another block, but more often the nested block is inside some complex
- statement, such as a @code{for} statement or an @code{if} statement.
- There are two uses for nested blocks in C:
- @itemize @bullet
- @item
- To specify the scope for local declarations. For instance, a local
- variable's scope is the rest of the innermost containing block.
- @item
- To write a series of statements where, syntactically, one statement is
- called for. For instance, the @var{execute-if-true} of an @code{if}
- statement is one statement. To put multiple statements there, they
- have to be wrapped in a block, like this:
- @example
- if (x < 0)
- @{
- printf ("x was negative\n");
- x = -x;
- @}
- @end example
- @end itemize
- This example (repeated from above) shows a nested block which serves
- both purposes: it includes two statements (plus a declaration) in the
- body of a @code{while} statement, and it provides the scope for the
- declaration of @code{q}.
- @example
- void
- free_intlist (struct intlistlink *p)
- @{
- while (p)
- @{
- struct intlistlink *q = p;
- p = p->next;
- free (q);
- @}
- @}
- @end example
- @node return Statement
- @section @code{return} Statement
- @cindex @code{return} statement
- @cindex statement, @code{return}
- @findex return
- The @code{return} statement makes the containing function return
- immediately. It has two forms. This one specifies no value to
- return:
- @example
- return;
- @end example
- @noindent
- That form is meant for functions whose return type is @code{void}
- (@pxref{The Void Type}). You can also use it in a function that
- returns nonvoid data, but that's a bad idea, since it makes the
- function return garbage.
- The form that specifies a value looks like this:
- @example
- return @var{value};
- @end example
- @noindent
- which computes the expression @var{value} and makes the function
- return that. If necessary, the value undergoes type conversion to
- the function's declared return value type, which works like
- assigning the value to a variable of that type.
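- The conversion can be sketched with a function whose return
- expression has a different type from its declared return type. The
- name @code{whole_part} is hypothetical. The example assumes the
- argument's whole part fits in an @code{int}.

```c
/* The return expression has type double; it is converted to int
   as if by assignment, which discards the fractional part.  */
int
whole_part (double x)
{
  return x;
}
```

- Here @code{whole_part (3.9)} returns 3, just as @code{int i = 3.9;}
- would store 3 in @code{i}.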
- @node Loop Statements
- @section Loop Statements
- @cindex loop statements
- @cindex statements, loop
- @cindex iteration
- You can use a loop statement when you need to execute a series of
- statements repeatedly, making an @dfn{iteration}. C provides several
- different kinds of loop statements, described in the following
- subsections.
- Every kind of loop statement is a complex statement because it contains a
- component, here called @var{body}, which is a nested statement.
- Most often the body is a block.
- @menu
- * while Statement:: Loop as long as a test expression is true.
- * do-while Statement:: Execute a loop once, with further looping
- as long as a test expression is true.
- * break Statement:: End a loop immediately.
- * for Statement:: Iterative looping.
- * Example of for:: An example of iterative looping.
- * Omitted for-Expressions:: for-loop expression options.
- * for-Index Declarations:: for-loop declaration options.
- * continue Statement:: Begin the next cycle of a loop.
- @end menu
- @node while Statement
- @subsection @code{while} Statement
- @cindex @code{while} statement
- @cindex statement, @code{while}
- @findex while
- The @code{while} statement is the simplest loop construct.
- It looks like this:
- @example
- while (@var{test})
- @var{body}
- @end example
- Here, @var{body} is a statement (often a nested block) to repeat, and
- @var{test} is the test expression that controls whether to repeat it again.
- Each iteration of the loop starts by computing @var{test}. If it is
- true (nonzero), the loop executes @var{body} and then starts over; if
- it is false (zero), the loop ends and execution continues past it.
- Here's an example of advancing to the last structure in a chain of
- structures chained through the @code{next} field:
- @example
- #include <stddef.h> /* @r{Defines @code{NULL}.} */
- @r{@dots{}}
- while (chain->next != NULL)
- chain = chain->next;
- @end example
- @noindent
- This code assumes the chain isn't empty to start with; if the chain is
- empty (that is, if @code{chain} is a null pointer), the code gets a
- @code{SIGSEGV} signal trying to dereference that null pointer (@pxref{Signals}).
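- One way to guard against the empty chain is to test for it before
- the loop. The following is a sketch only; the structure
- @code{struct link} and the function @code{last_link} are hypothetical
- names chosen for this illustration.

```c
#include <stddef.h>   /* Defines NULL.  */

/* Hypothetical node type with the next field the text assumes.  */
struct link
{
  struct link *next;
  int value;
};

/* Return the last structure in the chain, or NULL if it is empty.  */
struct link *
last_link (struct link *chain)
{
  if (chain == NULL)
    return NULL;
  while (chain->next != NULL)
    chain = chain->next;
  return chain;
}
```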
- @node do-while Statement
- @subsection @code{do-while} Statement
- @cindex @code{do}--@code{while} statement
- @cindex statement, @code{do}--@code{while}
- @findex do
- The @code{do}--@code{while} statement is a simple loop construct that
- performs the test at the end of the iteration.
- @example
- do
- @var{body}
- while (@var{test});
- @end example
- Here, @var{body} is a statement (possibly a block) to repeat, and
- @var{test} is an expression that controls whether to repeat it again.
- Each iteration of the loop starts by executing @var{body}. Then it
- computes @var{test} and, if it is true (nonzero), that means to go
- back and start over with @var{body}. If @var{test} is false (zero),
- then the loop stops repeating and execution moves on past it.
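- Testing at the end suits tasks where the body must run at least
- once. As a sketch, counting the decimal digits of a number is such a
- task, since even zero has one digit. The function name
- @code{count_digits} is hypothetical.

```c
/* Count the decimal digits of n.  The body runs before the first
   test, so 0 is correctly reported as having one digit.  */
int
count_digits (unsigned int n)
{
  int digits = 0;
  do
    {
      digits++;
      n /= 10;
    }
  while (n != 0);
  return digits;
}
```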
- @node break Statement
- @subsection @code{break} Statement
- @cindex @code{break} statement
- @cindex statement, @code{break}
- @findex break
- The @code{break} statement looks like @samp{break;}. Its effect is to
- exit immediately from the innermost loop construct or @code{switch}
- statement (@pxref{switch Statement}).
- For example, this loop advances @code{p} until the next null
- character or newline.
- @example
- while (*p)
- @{
- /* @r{End loop if we have reached a newline.} */
- if (*p == '\n')
- break;
- p++;
- @}
- @end example
- When there are nested loops, the @code{break} statement exits from the
- innermost loop containing it.
- @example
- struct list_if_tuples
- @{
- struct list_if_tuples *next;
- int length;
- data *contents;
- @};
- void
- process_all_elements (struct list_if_tuples *list)
- @{
- while (list)
- @{
- /* @r{Process all the elements in this node's vector,}
- @r{stopping when we reach one that is null.} */
- for (int i = 0; i < list->length; i++)
- @{
- /* @r{Null element terminates this node's vector.} */
- if (list->contents[i] == NULL)
- /* @r{Exit the @code{for} loop.} */
- break;
- /* @r{Operate on the next element.} */
- process_element (list->contents[i]);
- @}
- list = list->next;
- @}
- @}
- @end example
- C has no multi-level @code{break}. To exit directly from an outer loop
- while inside an inner one, use @code{goto} (@pxref{goto Statement}).
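- As a sketch of leaving two nested loops at once, the following
- searches a small two-dimensional array and jumps past both loops as
- soon as it finds the target. The names @code{ROWS}, @code{COLS} and
- @code{find_value}, and the label @code{done}, are hypothetical.

```c
#define ROWS 3
#define COLS 4

/* Search grid for target.  If found, store its position in *row and
   *col and return 1; otherwise return 0.  */
int
find_value (const int grid[ROWS][COLS], int target, int *row, int *col)
{
  int i, j;
  int found = 0;

  for (i = 0; i < ROWS; i++)
    for (j = 0; j < COLS; j++)
      if (grid[i][j] == target)
        {
          found = 1;
          /* A break here would leave only the inner loop.  */
          goto done;
        }

 done:
  if (found)
    {
      *row = i;
      *col = j;
    }
  return found;
}
```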
- @node for Statement
- @subsection @code{for} Statement
- @cindex @code{for} statement
- @cindex statement, @code{for}
- @findex for
- A @code{for} statement uses three expressions written inside a
- parenthetical group to define the repetition of the loop. The first
- expression says how to prepare to start the loop. The second says how
- to test, before each iteration, whether to continue looping. The
- third says how to advance, at the end of an iteration, for the next
- iteration. All together, it looks like this:
- @example
- for (@var{start}; @var{continue-test}; @var{advance})
- @var{body}
- @end example
- The first thing the @code{for} statement does is compute @var{start}.
- The next thing it does is compute the expression @var{continue-test}.
- If that expression is false (zero), the @code{for} statement finishes
- immediately, so @var{body} is executed zero times.
- However, if @var{continue-test} is true (nonzero), the @code{for}
- statement executes @var{body}, then @var{advance}. Then it loops back
- to the not-quite-top to test @var{continue-test} again. But it does
- not compute @var{start} again.
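- The order of evaluation described above can be sketched by writing
- the same loop twice, once with @code{for} and once with @code{while}.
- The function names are hypothetical, and the correspondence shown
- holds only for bodies that do not use @code{continue}.

```c
/* Sum the integers from 1 to n using a for statement.  */
int
sum_with_for (int n)
{
  int total = 0;
  int i;
  for (i = 1; i <= n; i++)
    total += i;
  return total;
}

/* The same computation with the three parts written out.  */
int
sum_with_while (int n)
{
  int total = 0;
  int i;
  i = 1;                /* start: computed once.  */
  while (i <= n)        /* continue-test: before each iteration.  */
    {
      total += i;       /* body.  */
      i++;              /* advance: after the body.  */
    }
  return total;
}
```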
- @node Example of for
- @subsection Example of @code{for}
- Here is the @code{for} statement from the iterative Fibonacci
- function:
- @example
- int i;
- for (i = 1; i < n; ++i)
- /* @r{If @code{n} is 1 or less, the loop runs zero times,} */
- /* @r{since @code{i < n} is false the first time.} */
- @{
- /* @r{Now @var{last} is @code{fib (@var{i})}}
- @r{and @var{prev} is @code{fib (@var{i} @minus{} 1)}.} */
- /* @r{Compute @code{fib (@var{i} + 1)}.} */
- int next = prev + last;
- /* @r{Shift the values down.} */
- prev = last;
- last = next;
- /* @r{Now @var{last} is @code{fib (@var{i} + 1)}}
- @r{and @var{prev} is @code{fib (@var{i})}.}
- @r{But that won't stay true for long,}
- @r{because we are about to increment @var{i}.} */
- @}
- @end example
- In this example, @var{start} is @code{i = 1}, meaning set @code{i} to
- 1. @var{continue-test} is @code{i < n}, meaning keep repeating the
- loop as long as @code{i} is less than @code{n}. @var{advance} is
- @code{++i}, meaning increment @code{i} by 1. The body is a block
- that contains a declaration and two statements.
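- For context, here is a sketch of a complete function built around
- that loop. The initial values of @code{last} and @code{prev} are
- assumptions chosen to agree with the loop's comments, taking
- @code{fib (1)} as 1 and @code{fib (0)} as 0; the function as defined
- elsewhere in this manual may differ in detail.

```c
/* Return the nth Fibonacci number for n >= 1.  */
int
fib (int n)
{
  int last = 1;   /* Assumed initial value: fib (1).  */
  int prev = 0;   /* Assumed initial value: fib (0).  */
  int i;

  for (i = 1; i < n; ++i)
    {
      /* Compute fib (i + 1), then shift the values down.  */
      int next = prev + last;
      prev = last;
      last = next;
    }
  return last;
}
```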
- @node Omitted for-Expressions
- @subsection Omitted @code{for}-Expressions
- A fully fleshed-out @code{for} statement contains all these parts,
- @example
- for (@var{start}; @var{continue-test}; @var{advance})
- @var{body}
- @end example
- @noindent
- but you can omit any of the three expressions inside the parentheses.
- The parentheses and the two semicolons are required syntactically, but
- the expressions between them may be missing. A missing expression
- means this loop doesn't use that particular feature of the @code{for}
- statement.
- Instead of using @var{start}, you can do the loop preparation
- before the @code{for} statement: the effect is the same. So we
- could have written the beginning of the previous example this way:
- @example
- int i = 1;
- for (; i < n; ++i)
- @end example
- @noindent
- instead of this way:
- @example
- int i;
- for (i = 1; i < n; ++i)
- @end example
- Omitting @var{continue-test} means the loop runs forever (or until
- something else causes exit from it). Statements inside the loop can
- test conditions for termination and use @samp{break;} to exit. This
- is more flexible since you can put those tests anywhere in the loop,
- not solely at the beginning.
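- As a sketch of a loop with no @var{continue-test}, the following
- adds up the elements of an array until it meets a zero, testing for
- the end in the middle of the body. The function name
- @code{sum_until_zero} is hypothetical, and the code assumes the array
- really does contain a terminating zero.

```c
/* Sum the elements of values up to, but not including, the first
   zero element.  */
int
sum_until_zero (const int *values)
{
  int total = 0;
  for (;;)
    {
      int v = *values++;
      /* The exit test sits in the middle of the body.  */
      if (v == 0)
        break;
      total += v;
    }
  return total;
}
```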
- Putting an expression in @var{advance} is almost equivalent to writing
- it at the end of the loop body; it does almost the same thing. The
- only difference is for the @code{continue} statement (@pxref{continue
- Statement}). So we could have written this:
- @example
- for (i = 1; i < n;)
- @{
- @r{@dots{}}
- ++i;
- @}
- @end example
- @noindent
- instead of this:
- @example
- for (i = 1; i < n; ++i)
- @{
- @r{@dots{}}
- @}
- @end example
- The choice is mainly a matter of what is more readable for
- programmers. However, there is also a syntactic difference:
- @var{advance} is an expression, not a statement. It can't include
- loops, blocks, declarations, etc.
- @node for-Index Declarations
- @subsection @code{for}-Index Declarations
- You can declare loop-index variables directly in the @var{start}
- portion of the @code{for}-loop, like this:
- @example
- for (int i = 0; i < n; ++i)
- @{
- @r{@dots{}}
- @}
- @end example
- This kind of @var{start} is limited to a single declaration; it can
- declare one or more variables, separated by commas, all of which are
- the same @var{basetype} (@code{int}, in this example):
- @example
- for (int i = 0, j = 1, *p = NULL; i < n; ++i, ++j, ++p)
- @{
- @r{@dots{}}
- @}
- @end example
- @noindent
- The scope of these variables is the @code{for} statement as a whole.
- See @ref{Variable Declarations} for an explanation of @var{basetype}.
- Variables declared in @code{for} statements should have initializers.
- Omitting the initialization gives the variables unpredictable initial
- values, so this code is erroneous.
- @example
- for (int i; i < n; ++i)
- @{
- @r{@dots{}}
- @}
- @end example
- @node continue Statement
- @subsection @code{continue} Statement
- @cindex @code{continue} statement
- @cindex statement, @code{continue}
- @findex continue
- The @code{continue} statement looks like @samp{continue;}, and its
- effect is to jump immediately to the end of the innermost loop
- construct. If it is a @code{for}-loop, the next thing that happens
- is to execute the loop's @var{advance} expression.
- For example, this loop increments @code{p} until the next null character
- or newline, and operates (in some way not shown) on all the characters
- in the line except for spaces. All it does with spaces is skip them.
- @example
- for (;*p; ++p)
- @{
- /* @r{End loop if we have reached a newline.} */
- if (*p == '\n')
- break;
- /* @r{Pay no attention to spaces.} */
- if (*p == ' ')
- continue;
- /* @r{Operate on the next character.} */
- @r{@dots{}}
- @}
- @end example
- @noindent
- Executing @samp{continue;} skips the rest of the loop body, but it does
- not skip the @var{advance} expression, @code{++p}.
- We could also write it like this:
- @example
- for (;*p; ++p)
- @{
- /* @r{Exit if we have reached a newline.} */
- if (*p == '\n')
- break;
- /* @r{Pay no attention to spaces.} */
- if (*p != ' ')
- @{
- /* @r{Operate on the next character.} */
- @r{@dots{}}
- @}
- @}
- @end example
- The advantage of using @code{continue} is that it reduces the
- depth of nesting.
- Contrast @code{continue} with the @code{break} statement. @xref{break
- Statement}.
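- To see both statements at work in one complete function, here is a
- sketch in which the unspecified ``operate on the character'' step is
- replaced by counting. The function name
- @code{count_nonspace_in_line} is hypothetical.

```c
/* Count the characters other than spaces in the first line of p.  */
int
count_nonspace_in_line (const char *p)
{
  int count = 0;
  for (; *p; ++p)
    {
      /* End the loop if we have reached a newline.  */
      if (*p == '\n')
        break;
      /* Skip spaces; ++p still runs before the next test.  */
      if (*p == ' ')
        continue;
      count++;
    }
  return count;
}
```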
- @node switch Statement
- @section @code{switch} Statement
- @cindex @code{switch} statement
- @cindex statement, @code{switch}
- @findex switch
- @findex case
- @findex default
- The @code{switch} statement selects code to run according to the value
- of an expression. The expression, in parentheses, follows the keyword
- @code{switch}. After that come all the cases to select among,
- inside braces. It looks like this:
- @example
- switch (@var{selector})
- @{
- @var{cases}@r{@dots{}}
- @}
- @end example
- A case can look like this:
- @example
- case @var{value}:
- @var{statements}
- break;
- @end example
- @noindent
- which means ``come here if @var{selector} happens to have the value
- @var{value},'' or like this (a GNU C extension):
- @example
- case @var{rangestart} ... @var{rangeend}:
- @var{statements}
- break;
- @end example
- @noindent
- which means ``come here if @var{selector} happens to have a value
- between @var{rangestart} and @var{rangeend} (inclusive).'' @xref{Case
- Ranges}.
- The values in @code{case} labels must reduce to integer constants.
- They can use arithmetic, and @code{enum} constants, but they cannot
- refer to data in memory, because they have to be computed at compile
- time. It is an error if two @code{case} labels specify the same
- value, or ranges that overlap, or if one is a range and the other is a
- value in that range.
- You can also define a default case to handle ``any other value,'' like
- this:
- @example
- default:
- @var{statements}
- break;
- @end example
- If the @code{switch} statement has no @code{default:} label, then it
- does nothing when the value matches none of the cases.
- The brace-group inside the @code{switch} statement is a block, and you
- can declare variables with that scope just as in any other block
- (@pxref{Blocks}). However, initializers in these declarations won't
- necessarily be executed every time the @code{switch} statement runs,
- so it is best to avoid giving them initializers.
- @code{break;} inside a @code{switch} statement exits immediately from
- the @code{switch} statement. @xref{break Statement}.
- If there is no @code{break;} at the end of the code for a case,
- execution continues into the code for the following case. This
- happens more often by mistake than intentionally, but since this
- feature is used in real code, we cannot eliminate it.
- @strong{Warning:} When one case is intended to fall through to the
- next, write a comment like @samp{falls through} to say it's
- intentional. That way, other programmers won't assume it was an error
- and ``fix'' it erroneously.
- Consecutive @code{case} labels could, pedantically, be considered
- an instance of falling through, but we don't consider or treat them that
- way because they won't confuse anyone.
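- Intentional falling through can be sketched with a function in which
- each level includes everything granted by the levels below it. The
- function name @code{flags_for_level} and the meaning of the flag bits
- are hypothetical.

```c
/* Return a bit mask in which level 3 implies level 2, and level 2
   implies level 1.  Any other level yields no flags.  */
unsigned int
flags_for_level (int level)
{
  unsigned int flags = 0;
  switch (level)
    {
    case 3:
      flags |= 4;
      /* Falls through */
    case 2:
      flags |= 2;
      /* Falls through */
    case 1:
      flags |= 1;
      break;
    default:
      break;
    }
  return flags;
}
```

- Without the comments, a reader could not tell whether the missing
- @code{break;} statements were deliberate.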
- @node switch Example
- @section Example of @code{switch}
- Here's an example of using the @code{switch} statement
- to distinguish among characters:
- @cindex counting vowels and punctuation
- @example
- struct vp @{ int vowels, punct; @};
- struct vp
- count_vowels_and_punct (char *string)
- @{
- int c;
- int vowels = 0;
- int punct = 0;
- /* @r{Don't change the parameter itself.} */
- /* @r{That helps in debugging.} */
- char *p = string;
- struct vp value;
- while (c = *p++)
- switch (c)
- @{
- case 'y':
- case 'Y':
- /* @r{We assume @code{y_is_consonant} will check surrounding
- letters to determine whether this y is a vowel.} */
- if (y_is_consonant (p - 1))
- break;
- /* @r{Falls through} */
- case 'a':
- case 'e':
- case 'i':
- case 'o':
- case 'u':
- case 'A':
- case 'E':
- case 'I':
- case 'O':
- case 'U':
- vowels++;
- break;
- case '.':
- case ',':
- case ':':
- case ';':
- case '?':
- case '!':
- case '\"':
- case '\'':
- punct++;
- break;
- @}
- value.vowels = vowels;
- value.punct = punct;
- return value;
- @}
- @end example
- @node Duffs Device
- @section Duff's Device
- @cindex Duff's device
- The cases in a @code{switch} statement can be inside other control
- constructs. For instance, we can use a technique known as @dfn{Duff's
- device} to optimize this simple function,
- @example
- void
- copy (char *to, char *from, int count)
- @{
- while (count > 0)
- *to++ = *from++, count--;
- @}
- @end example
- @noindent
- which copies memory starting at @var{from} to memory starting at
- @var{to}.
- Duff's device involves unrolling the loop so that it copies
- several characters each time around, and using a @code{switch} statement
- to enter the loop body at the proper point:
- @example
- void
- copy (char *to, char *from, int count)
- @{
- if (count <= 0)
- return;
- int n = (count + 7) / 8;
- switch (count % 8)
- @{
- do @{
- case 0: *to++ = *from++;
- case 7: *to++ = *from++;
- case 6: *to++ = *from++;
- case 5: *to++ = *from++;
- case 4: *to++ = *from++;
- case 3: *to++ = *from++;
- case 2: *to++ = *from++;
- case 1: *to++ = *from++;
- @} while (--n > 0);
- @}
- @}
- @end example
- @node Case Ranges
- @section Case Ranges
- @cindex case ranges
- @cindex ranges in case statements
- You can specify a range of consecutive values in a single @code{case} label,
- like this:
- @example
- case @var{low} ... @var{high}:
- @end example
- @noindent
- This has the same effect as the proper number of individual @code{case}
- labels, one for each integer value from @var{low} to @var{high}, inclusive.
- This feature is especially useful for ranges of ASCII character codes:
- @example
- case 'A' ... 'Z':
- @end example
- @strong{Be careful:} with integers, write spaces around the @code{...}
- to prevent it from being parsed wrong. For example, write this:
- @example
- case 1 ... 5:
- @end example
- @noindent
- rather than this:
- @example
- case 1...5:
- @end example
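- As a sketch, case ranges make a simple character classifier compact.
- This relies on the extension described in this section, so it is not
- portable to compilers that lack it, and it assumes a character set
- such as ASCII in which each range is contiguous. The function name
- @code{classify_char} and its return codes are hypothetical.

```c
/* Return 1 for an upper-case letter, 2 for a lower-case letter,
   3 for a digit, and 0 for anything else.  */
int
classify_char (int c)
{
  switch (c)
    {
    case 'A' ... 'Z':
      return 1;
    case 'a' ... 'z':
      return 2;
    case '0' ... '9':
      return 3;
    default:
      return 0;
    }
}
```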
- @node Null Statement
- @section Null Statement
- @cindex null statement
- @cindex statement, null
- A @dfn{null statement} is just a semicolon. It does nothing.
- A null statement is a placeholder for use where a statement is
- grammatically required, but there is nothing to be done. For
- instance, sometimes all the work of a @code{for}-loop is done in the
- @code{for}-header itself, leaving no work for the body. Here is an
- example that searches for the first newline in @code{array}:
- @example
- for (p = array; *p != '\n'; p++)
- ;
- @end example
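- The same pattern can be sketched as a complete function that finds
- the length of a string by scanning for its terminating null
- character. The function name @code{string_length} is hypothetical;
- real programs would normally call @code{strlen}.

```c
#include <stddef.h>   /* Defines size_t.  */

/* Return the number of characters before the terminating null.  */
size_t
string_length (const char *s)
{
  const char *p;
  /* All the work is in the header, so the body is a null statement.  */
  for (p = s; *p != '\0'; p++)
    ;
  return (size_t) (p - s);
}
```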
- @node goto Statement
- @section @code{goto} Statement and Labels
- @cindex @code{goto} statement
- @cindex statement, @code{goto}
- @cindex label
- @findex goto
- The @code{goto} statement looks like this:
- @example
- goto @var{label};
- @end example
- @noindent
- Its effect is to transfer control immediately to another part of the
- current function---where the label named @var{label} is defined.
- An ordinary label definition looks like this:
- @example
- @var{label}:
- @end example
- @noindent
- and it can appear before any statement. You can't use @code{default}
- as a label, since that has a special meaning for @code{switch}
- statements.
- An ordinary label doesn't need a separate declaration; defining it is
- enough.
- Here's an example of using @code{goto} to implement a loop
- equivalent to @code{do}--@code{while}:
- @example
- @{
- loop_restart:
- @var{body}
- if (@var{condition})
- goto loop_restart;
- @}
- @end example
- The name space of labels is separate from that of variables and functions.
- Thus, there is no error in using a single name in both ways:
- @example
- @{
- int foo; // @r{Variable @code{foo}.}
- foo: // @r{Label @code{foo}.}
- @var{body}
- if (foo > 0) // @r{Variable @code{foo}.}
- goto foo; // @r{Label @code{foo}.}
- @}
- @end example
- Blocks have no effect on ordinary labels; each label name is defined
- throughout the whole of the function it appears in. It looks strange to
- jump into a block with @code{goto}, but it works. For example,
- @example
- if (x < 0)
- goto negative;
- if (y < 0)
- @{
- negative:
- printf ("Negative\n");
- return;
- @}
- @end example
- If the @code{goto} jumps into the scope of a variable, it does not
- initialize the variable. For example, if @code{x} is negative,
- @example
- if (x < 0)
- goto negative;
- if (y < 0)
- @{
- int i = 5;
- negative:
- printf ("Negative, and i is %d\n", i);
- return;
- @}
- @end example
- @noindent
- prints junk because @code{i} was not initialized.
- If the block declares a variable-length automatic array, jumping into
- it gives a compilation error. However, jumping out of the scope of a
- variable-length array works fine, and deallocates its storage.
- A label can't come directly before a declaration, so the code can't
- jump directly to one. For example, this is not allowed:
- @example
- @{
- goto foo;
- foo:
- int x = 5;
- bar(&x);
- @}
- @end example
- @noindent
- The workaround is to add a statement, even an empty statement,
- directly after the label. For example:
- @example
- @{
- goto foo;
- foo:
- ;
- int x = 5;
- bar(&x);
- @}
- @end example
- Likewise, a label can't be the last thing in a block. The workaround
- is the same: add a semicolon after the label.
- These unnecessary restrictions on labels make no sense, and ought in
- principle to be removed; but they do only a little harm since labels
- and @code{goto} are rarely the best way to write a program.
- These examples are all artificial; it would be more natural to
- write them in other ways, without @code{goto}. For instance,
- the clean way to write the example that prints @samp{Negative} is this:
- @example
- if (x < 0 || y < 0)
- @{
- printf ("Negative\n");
- return;
- @}
- @end example
- @noindent
- It is hard to construct simple examples where @code{goto} is actually
- the best way to write a program. Its rare good uses tend to be in
- complex code, thus not apt for the purpose of explaining the meaning
- of @code{goto}.
- The only good time to use @code{goto} is when it makes the code
- simpler than any alternative. Jumping backward is rarely desirable,
- because usually the other looping and control constructs give simpler
- code. Using @code{goto} to jump forward is more often desirable, for
- instance when a function needs to do some processing in an error case
- and errors can occur at various different places within the function.
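The forward-jumping error case might look like the following sketch. The structure @code{name_pair} and the function @code{make_name_pair} are hypothetical names used only for illustration. Each failure jumps forward to one label, where everything allocated so far is released.

```c
#include <stdlib.h>
#include <string.h>

struct name_pair
{
  char *first;
  char *last;
};

/* Store heap copies of FIRST and LAST in *OUT.  Return 0 on
   success, -1 on failure.  Every error path jumps forward to
   the same cleanup code.  */
int
make_name_pair (struct name_pair *out, const char *first, const char *last)
{
  char *f = NULL;
  char *l = NULL;

  if (out == NULL || first == NULL || last == NULL)
    goto fail;

  f = malloc (strlen (first) + 1);
  if (f == NULL)
    goto fail;
  strcpy (f, first);

  l = malloc (strlen (last) + 1);
  if (l == NULL)
    goto fail;
  strcpy (l, last);

  out->first = f;
  out->last = l;
  return 0;

 fail:
  free (f);   /* free (NULL) does nothing, so this is safe.  */
  free (l);
  return -1;
}
```

Without @code{goto}, each of the three checks would need its own copy of the cleanup code, or the function would need nested @code{if} statements.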
- @node Local Labels
- @section Locally Declared Labels
- @cindex local labels
- @cindex macros, local labels
- @findex __label__
- In GNU C you can declare @dfn{local labels} in any nested block
- scope. A local label is used in a @code{goto} statement just like an
- ordinary label, but you can only reference it within the block in
- which it was declared.
- A local label declaration looks like this:
- @example
- __label__ @var{label};
- @end example
- @noindent
- or
- @example
- __label__ @var{label1}, @var{label2}, @r{@dots{}};
- @end example
- Local label declarations must come at the beginning of the block,
- before any ordinary declarations or statements.
- The label declaration declares the label @emph{name}, but does not define
- the label itself. That's done in the usual way, with
- @code{@var{label}:}, before one of the statements in the block.
- The local label feature is useful for complex macros. If a macro
- contains nested loops, a @code{goto} can be useful for breaking out of
- them. However, an ordinary label whose scope is the whole function
- cannot be used: if the macro can be expanded several times in one
- function, the label will be multiply defined in that function. A
- local label avoids this problem. For example:
- @example
- #define SEARCH(value, array, target) \
- do @{ \
- __label__ found; \
- __auto_type _SEARCH_target = (target); \
- __auto_type _SEARCH_array = (array); \
- int i, j; \
- for (i = 0; i < max; i++) \
- for (j = 0; j < max; j++) \
- if (_SEARCH_array[i][j] == _SEARCH_target) \
- @{ (value) = i; goto found; @} \
- (value) = -1; \
- found:; \
- @} while (0)
- @end example
- This could also be written using a statement expression
- (@pxref{Statement Exprs}):
- @example
- #define SEARCH(array, target) \
- (@{ \
- __label__ found; \
- __auto_type _SEARCH_target = (target); \
- __auto_type _SEARCH_array = (array); \
- int i, j; \
- int value; \
- for (i = 0; i < max; i++) \
- for (j = 0; j < max; j++) \
- if (_SEARCH_array[i][j] == _SEARCH_target) \
- @{ value = i; goto found; @} \
- value = -1; \
- found: \
- value; \
- @})
- @end example
- Ordinary labels are visible throughout the function where they are
- defined, and only in that function. However, explicitly declared
- local labels of a block are visible in nested functions declared
- within that block. @xref{Nested Functions}, for details.
- @xref{goto Statement}.
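The following sketch shows a macro with a local label expanded twice in one function, which would fail with an ordinary label. The names @code{FIND_ROW}, @code{sample_grid} and @code{find_two_rows} are hypothetical, and the grid dimensions are passed explicitly as an assumption of this example.

```c
/* Set RESULT to the index of the first row of GRID (ROWS by COLS)
   that contains TARGET, or to -1 if no row does.  The label is
   local to the macro's block, so the macro can be expanded more
   than once in one function.  */
#define FIND_ROW(result, grid, rows, cols, target)          \
  do {                                                      \
    __label__ found;                                        \
    __auto_type fr_target_ = (target);                      \
    int fr_i_, fr_j_;                                       \
    for (fr_i_ = 0; fr_i_ < (rows); fr_i_++)                \
      for (fr_j_ = 0; fr_j_ < (cols); fr_j_++)              \
        if ((grid)[fr_i_][fr_j_] == fr_target_)             \
          { (result) = fr_i_; goto found; }                 \
    (result) = -1;                                          \
   found:;                                                  \
  } while (0)

static const int sample_grid[3][3]
  = { { 1, 2, 3 }, { 4, 5, 6 }, { 7, 8, 9 } };

/* Expands FIND_ROW twice; each expansion gets its own `found'.  */
void
find_two_rows (int a, int b, int *row_a, int *row_b)
{
  FIND_ROW (*row_a, sample_grid, 3, 3, a);
  FIND_ROW (*row_b, sample_grid, 3, 3, b);
}
```

If @code{found} were an ordinary label, the second expansion would define it a second time in @code{find_two_rows}, which is an error.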
- @node Labels as Values
- @section Labels as Values
- @cindex labels as values
- @cindex computed gotos
- @cindex goto with computed label
- @cindex address of a label
- In GNU C, you can get the address of a label defined in the current
- function (or a local label defined in the containing function) with
- the unary operator @samp{&&}. The value has type @code{void *}. This
- value is a constant and can be used wherever a constant of that type
- is valid. For example:
- @example
- void *ptr;
- @r{@dots{}}
- ptr = &&foo;
- @end example
- To use these values requires a way to jump to one. This is done
- with the computed goto statement@footnote{The analogous feature in
- Fortran is called an assigned goto, but that name seems inappropriate in
- C, since you can do more with label addresses than store them in special label
- variables.}, @code{goto *@var{exp};}. For example,
- @example
- goto *ptr;
- @end example
- @noindent
- Any expression of type @code{void *} is allowed.
- @xref{goto Statement}.
- @menu
- * Label Value Uses:: Examples of using label values.
- * Label Value Caveats:: Limitations of label values.
- @end menu
- @node Label Value Uses
- @subsection Label Value Uses
- One use for label-valued constants is to initialize a static array to
- serve as a jump table:
- @example
- static void *array[] = @{ &&foo, &&bar, &&hack @};
- @end example
- Then you can select a label with indexing, like this:
- @example
- goto *array[i];
- @end example
- @noindent
- Note that this does not check whether the subscript is in bounds---array
- indexing in C never checks that.
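Since nothing checks the subscript, the code has to do so itself. Here is a minimal sketch that does; the function name @code{count_word} and its labels are hypothetical. It clamps the index before the computed @code{goto}.

```c
/* Map a small count to a word.  The index is clamped first,
   because the computed goto itself does no bounds checking.  */
const char *
count_word (unsigned int n)
{
  static void *const table[] = { &&zero, &&one, &&many };
  const unsigned int last = sizeof table / sizeof table[0] - 1;

  if (n > last)
    n = last;   /* Anything out of range is treated as "many".  */
  goto *table[n];

 zero:
  return "zero";
 one:
  return "one";
 many:
  return "many";
}
```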
- You can make the table entries offsets instead of addresses
- by subtracting one label from the others. Here is an example:
- @example
- static const int array[] = @{ &&foo - &&foo, &&bar - &&foo,
- &&hack - &&foo @};
- goto *(&&foo + array[i]);
- @end example
- @noindent
- Using offsets is preferable in shared libraries, as it avoids the need
- for dynamic relocation of the array elements; therefore, the array can
- be read-only.
- An array of label values or offsets serves a purpose much like that of
- the @code{switch} statement. The @code{switch} statement is cleaner,
- so use @code{switch} by preference when feasible.
- Another use of label values is in an interpreter for threaded code.
- The labels within the interpreter function can be stored in the
- threaded code for super-fast dispatching.
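As a rough sketch of interpreter dispatch, the function below runs a tiny bytecode whose opcodes index a table of label addresses. The three opcodes and all the names are hypothetical. Real threaded code would store the label addresses directly in the instruction stream and so skip the table lookup; this sketch keeps the table for brevity.

```c
/* A three-instruction bytecode, for illustration only.  */
enum opcode { OP_INC, OP_DOUBLE, OP_HALT };

/* Run CODE starting with accumulator ACC and return the final
   value.  CODE must contain only valid opcodes and must end with
   OP_HALT; nothing here checks that.  */
int
run_bytecode (const unsigned char *code, int acc)
{
  /* One label address per opcode, in opcode order.  */
  static void *const dispatch[] = { &&op_inc, &&op_double, &&op_halt };
  const unsigned char *pc = code;

#define NEXT() goto *dispatch[*pc++]

  NEXT ();

 op_inc:
  acc++;
  NEXT ();
 op_double:
  acc *= 2;
  NEXT ();
 op_halt:
  return acc;

#undef NEXT
}

/* Run a fixed sample program: ((1 + 1) * 2) + 1.  */
int
run_sample_program (void)
{
  static const unsigned char program[]
    = { OP_INC, OP_DOUBLE, OP_INC, OP_HALT };

  return run_bytecode (program, 1);
}
```

Each instruction ends by jumping straight to the code for the next one, so there is no central loop or @code{switch} to return to.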
- @node Label Value Caveats
- @subsection Label Value Caveats
- Jumping to a label defined in another function does not work.
- It can cause unpredictable results.
- The best way to avoid this is to store label values only in
- automatic variables, or static variables whose names are declared
- within the function. Never pass them as arguments.
- @cindex cloning
- An optimization known as @dfn{cloning} generates multiple simplified
- variants of a function's code, for use with specific fixed arguments.
- Using label values in certain ways, such as saving the address in one
- call to the function and using it again in another call, would make cloning
- give incorrect results. These functions must disable cloning.
- Inlining calls to the function would also result in multiple copies of
- the code, each with its own value of the same label. Using the label
- in a computed goto is no problem, because the computed goto inhibits
- inlining. However, using the label value in some other way, such as
- an indication of where an error occurred, would be optimized wrong.
- These functions must disable inlining.
- To prevent inlining or cloning of a function, specify
- @code{__attribute__((__noinline__,__noclone__))} in its definition.
- @xref{Attributes}.
- When a function uses a label value in a static variable initializer,
- that automatically prevents inlining or cloning the function.
- @node Statement Exprs
- @section Statements and Declarations in Expressions
- @cindex statements inside expressions
- @cindex declarations inside expressions
- @cindex expressions containing statements
- @c the above section title wrapped and causes an underfull hbox.. i
- @c changed it from "within" to "in". --mew 4feb93
- A block enclosed in parentheses can be used as an expression in GNU
- C@. This provides a way to use local variables, loops and switches within
- an expression. We call it a @dfn{statement expression}.
- Recall that a block is a sequence of statements
- surrounded by braces. In this construct, parentheses go around the
- braces. For example:
- @example
- (@{ int y = foo (); int z;
- if (y > 0) z = y;
- else z = - y;
- z; @})
- @end example
- @noindent
- is a valid (though slightly more complex than necessary) expression
- for the absolute value of @code{foo ()}.
- The last statement in the block should be an expression statement; an
- expression followed by a semicolon, that is. The value of this
- expression serves as the value of the statement expression. If the last
- statement is anything else, the statement expression has type
- @code{void}, and so has no value.
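To illustrate how the final expression statement supplies the value, here is a small sketch with a loop inside a statement expression. The function name @code{digit_count} is hypothetical.

```c
/* Number of decimal digits in N, computed by a loop inside an
   expression.  The final `count;' supplies the value.  */
int
digit_count (unsigned int n)
{
  int digits = ({ int count = 1;
                  while (n >= 10)
                    {
                      n /= 10;
                      count++;
                    }
                  count; });
  return digits;
}
```

The variable @code{count} exists only inside the statement expression; only its final value escapes.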
- This feature is mainly useful in making macro definitions compute each
- operand exactly once. @xref{Macros and Auto Type}.
- Statement expressions are not allowed in expressions that must be
- constant, such as the value for an enumerator, the width of a
- bit-field, or the initial value of a static variable.
- Jumping into a statement expression---with @code{goto}, or using a
- @code{switch} statement outside the statement expression---is an
- error. With a computed @code{goto} (@pxref{Labels as Values}), the
- compiler can't detect the error, but it still won't work.
- Jumping out of a statement expression is permitted, but since
- subexpressions in C are not computed in a strict order, it is
- unpredictable which other subexpressions will have been computed by
- then. For example,
- @example
- foo (), ((@{ bar1 (); goto a; 0; @}) + bar2 ()), baz();
- @end example
- @noindent
- calls @code{foo} and @code{bar1} before it jumps, and never
- calls @code{baz}, but may or may not call @code{bar2}. If @code{bar2}
- does get called, that occurs after @code{foo} and before @code{bar1}.
- @node Variables
- @chapter Variables
- @cindex variables
- Every variable used in a C program needs to be made known by a
- @dfn{declaration}. It can be used only after it has been declared.
- It is an error to declare a variable name more than once in the same
- scope; an exception is that @code{extern} declarations and tentative
- definitions can coexist with another declaration of the same
- variable.
- Variables can be declared anywhere within a block or file. (Older
- versions of C required that all variable declarations within a block
- occur before any statements.)
- Variables declared within a function or block are @dfn{local} to
- it. This means that the variable name is visible only until the end
- of that function or block, and the memory space is allocated only
- while control is within it.
- Variables declared at the top level in a file are called @dfn{file-scope}.
- They are assigned fixed, distinct memory locations, so they retain
- their values for the whole execution of the program.
- @menu
- * Variable Declarations:: Name a variable and reserve space for it.
- * Initializers:: Assigning initial values to variables.
- * Designated Inits:: Assigning initial values to array elements
- at particular array indices.
- * Auto Type:: Obtaining the type of a variable.
- * Local Variables:: Variables declared in function definitions.
- * File-Scope Variables:: Variables declared outside of
- function definitions.
- * Static Local Variables:: Variables declared within functions,
- but with permanent storage allocation.
- * Extern Declarations:: Declaring a variable
- which is allocated somewhere else.
- * Allocating File-Scope:: When is space allocated
- for file-scope variables?
- * auto and register:: Historically used storage directions.
- * Omitting Types:: The bad practice of declaring variables
- with implicit type.
- @end menu
- @node Variable Declarations
- @section Variable Declarations
- @cindex variable declarations
- @cindex declaration of variables
- Here's what a variable declaration looks like:
- @example
- @var{keywords} @var{basetype} @var{decorated-variable} @r{[}= @var{init}@r{]};
- @end example
- The @var{keywords} specify how to handle the scope of the variable
- name and the allocation of its storage. Most declarations have
- no keywords because the defaults are right for them.
- C allows these keywords to come before or after @var{basetype}, or
- even in the middle of it as in @code{unsigned static int}, but don't
- do that---it would surprise other programmers. Always write the
- keywords first.
- The @var{basetype} can be any of the predefined types of C, or a type
- keyword defined with @code{typedef}. It can also be @code{struct
- @var{tag}}, @code{union @var{tag}}, or @code{enum @var{tag}}. In
- addition, it can include type qualifiers such as @code{const} and
- @code{volatile} (@pxref{Type Qualifiers}).
- In the simplest case, @var{decorated-variable} is just the variable
- name. That declares the variable with the type specified by
- @var{basetype}. For instance,
- @example
- int foo;
- @end example
- @noindent
- uses @code{int} as the @var{basetype} and @code{foo} as the
- @var{decorated-variable}. It declares @code{foo} with type
- @code{int}.
- @example
- struct tree_node foo;
- @end example
- @noindent
- declares @code{foo} with type @code{struct tree_node}.
- @menu
- * Declaring Arrays and Pointers:: Declaration syntax for variables of
- array and pointer types.
- * Combining Variable Declarations:: More than one variable declaration
- in a single statement.
- @end menu
- @node Declaring Arrays and Pointers
- @subsection Declaring Arrays and Pointers
- @cindex declaring arrays and pointers
- @cindex array, declaring
- @cindex pointers, declaring
- To declare a variable that is an array, write
- @code{@var{variable}[@var{length}]} for @var{decorated-variable}:
- @example
- int foo[5];
- @end example
- To declare a variable that has a pointer type, write
- @code{*@var{variable}} for @var{decorated-variable}:
- @example
- struct list_elt *foo;
- @end example
- These constructs nest. For instance,
- @example
- int foo[3][5];
- @end example
- @noindent
- declares @code{foo} as an array of 3 arrays of 5 integers each,
- @example
- struct list_elt *foo[5];
- @end example
- @noindent
- declares @code{foo} as an array of 5 pointers to structures, and
- @example
- struct list_elt **foo;
- @end example
- @noindent
- declares @code{foo} as a pointer to a pointer to a structure.
- @example
- int **(*foo[30])(int, double);
- @end example
- @noindent
- declares @code{foo} as an array of 30 pointers to functions
- (@pxref{Function Pointers}), each of which must accept two arguments
- (one @code{int} and one @code{double}) and return a value of type @code{int **}.
- @example
- void
- bar (int size)
- @{
- int foo[size];
- @r{@dots{}}
- @}
- @end example
- @noindent
- declares @code{foo} as an array of integers with a size specified at
- run time when the function @code{bar} is called.
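The nesting rules can be tried out with a short sketch like this one. All the names in it are hypothetical. It declares an array of pointers to functions and uses @code{sizeof} to confirm how a two-dimensional array is laid out.

```c
static int add (int a, int b) { return a + b; }
static int subtract (int a, int b) { return a - b; }

/* An array of 2 pointers to functions, each taking two ints
   and returning int.  */
static int (*const operations[2]) (int, int) = { add, subtract };

/* WHICH must be 0 or 1; it is not checked.  */
int
apply_operation (int which, int a, int b)
{
  return operations[which] (a, b);
}

/* An array of 3 arrays of 5 ints has 3 rows...  */
int
grid_rows (void)
{
  int grid[3][5];
  return (int) (sizeof grid / sizeof grid[0]);
}

/* ...and 15 ints in all.  */
int
grid_cells (void)
{
  int grid[3][5];
  return (int) (sizeof grid / sizeof grid[0][0]);
}
```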
- @node Combining Variable Declarations
- @subsection Combining Variable Declarations
- @cindex combining variable declarations
- @cindex variable declarations, combining
- @cindex declarations, combining
- When multiple declarations have the same @var{keywords} and
- @var{basetype}, you can combine them using commas. Thus,
- @example
- @var{keywords} @var{basetype}
- @var{decorated-variable-1} @r{[}= @var{init1}@r{]},
- @var{decorated-variable-2} @r{[}= @var{init2}@r{]};
- @end example
- @noindent
- is equivalent to
- @example
- @var{keywords} @var{basetype}
- @var{decorated-variable-1} @r{[}= @var{init1}@r{]};
- @var{keywords} @var{basetype}
- @var{decorated-variable-2} @r{[}= @var{init2}@r{]};
- @end example
- Here are some simple examples:
- @example
- int a, b;
- int a = 1, b = 2;
- int a, *p, array[5];
- int a = 0, *p = &a, array[5] = @{1, 2@};
- @end example
- @noindent
- In the last two examples, @code{a} is an @code{int}, @code{p} is a
- pointer to @code{int}, and @code{array} is an array of 5 @code{int}s.
- Since the initializer for @code{array} specifies only two elements,
- the other three elements are initialized to zero.
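One consequence worth seeing in running code is that the @samp{*} belongs to each @var{decorated-variable}, not to the @var{basetype}. In this sketch (the function name is hypothetical), @code{q} is a pointer but @code{r} is a plain @code{int}.

```c
int
combined_declaration_demo (void)
{
  int a = 7, *p = &a, array[5] = { 1, 2 };
  int *q = &a, r = 3;   /* q is a pointer to int; r is an int.  */

  /* 7 + 7 + 3 + 2 + 0: array[4] was filled in with zero.  */
  return *p + *q + r + array[1] + array[4];
}
```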
- @node Initializers
- @section Initializers
- @cindex initializers
- A variable's declaration, unless it is @code{extern}, should also
- specify its initial value. For numeric and pointer-type variables,
- the initializer is an expression for the value. If necessary, it is
- converted to the variable's type, just as in an assignment.
- You can also initialize a local structure-type (@pxref{Structures}) or
- local union-type (@pxref{Unions}) variable this way, from an
- expression whose value has the same type. But you can't initialize an
- array this way (@pxref{Arrays}), since arrays are not first-class
- objects in C (@pxref{Limitations of C Arrays}) and there is no array
- assignment.
- You can initialize arrays and structures componentwise,
- with a list of the elements or components. You can initialize
- a union with any one of its alternatives.
- @itemize @bullet
- @item
- A component-wise initializer for an array consists of element values
- surrounded by @samp{@{@r{@dots{}}@}}. If the values in the initializer
- don't cover all the elements in the array, the remaining elements are
- initialized to zero.
- You can omit the size of the array when you declare it, and let
- the initializer specify the size:
- @example
- int array[] = @{ 3, 9, 12 @};
- @end example
- @item
- A component-wise initializer for a structure consists of field values
- surrounded by @samp{@{@r{@dots{}}@}}. Write the field values in the same
- order as the fields are declared in the structure. If the values in
- the initializer don't cover all the fields in the structure, the
- remaining fields are initialized to zero.
- @item
- The initializer for a union-type variable has the form @code{@{
- @var{value} @}}, where @var{value} initializes the @emph{first alternative}
- in the union definition.
- @end itemize
- For an array of arrays, a structure containing arrays, an array of
- structures, etc., you can nest these constructs. For example,
- @example
- struct point @{ double x, y; @};
- struct point series[]
- = @{ @{0, 0@}, @{1.5, 2.8@}, @{99, 100.0004@} @};
- @end example
- You can omit a pair of inner braces if they contain the right
- number of elements for the sub-value they initialize, so that
- no elements or fields need to be filled in with zeros.
- But don't do that very much, as it gets confusing.
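The rules above can be checked with a short sketch. The type and variable names below are hypothetical, chosen only for this illustration.

```c
#include <stddef.h>

struct point { double x, y; };
struct segment { struct point from, to; int tag; };
union number { int i; double d; };

/* Length 4, taken from the initializer.  */
static const int primes[] = { 2, 3, 5, 7 };

/* Elements 2 through 5 are initialized to zero.  */
static const int padded[6] = { 1, 2 };

/* Nested braces for the two points; tag is initialized to zero.  */
static const struct segment unit = { { 0, 0 }, { 1, 0 } };

/* Initializes the first alternative, i.  */
static const union number answer = { 42 };

size_t
primes_length (void)
{
  return sizeof primes / sizeof primes[0];
}
```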
- An array of @code{char} can be initialized using a string constant.
- Recall that the string constant includes an implicit null character at
- the end (@pxref{String Constants}). Using a string constant as
- initializer means to use its contents as the initial values of the
- array elements. Here are examples:
- @example
- char text[6] = "text!"; /* @r{Includes the null.} */
- char text[5] = "text!"; /* @r{Excludes the null.} */
- char text[] = "text!"; /* @r{Gets length 6.} */
- char text[]
- = @{ 't', 'e', 'x', 't', '!', 0 @}; /* @r{same as above.} */
- char text[] = @{ "text!" @}; /* @r{Braces are optional.} */
- @end example
- @noindent
- and this kind of initializer can be nested inside braces to initialize
- structures or arrays that contain a @code{char}-array.
- In like manner, you can use a wide string constant to initialize
- an array of @code{wchar_t}.
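The difference between including and excluding the null matters when the array is later treated as a string. Here is a small sketch with hypothetical names that makes the three cases observable. Some compilers can be asked to warn about the second declaration.

```c
#include <stddef.h>

static char with_null[6] = "text!";      /* Last element is the null.  */
static char without_null[5] = "text!";   /* No room for the null.  */
static char sized_by_init[] = "text!";   /* Length 6.  */

size_t
sized_by_init_length (void)
{
  return sizeof sized_by_init;
}

/* Return 1 if the last of the N elements of ARRAY is a null
   character, 0 otherwise.  N must be at least 1.  */
int
ends_with_null (const char *array, size_t n)
{
  return array[n - 1] == '\0';
}
```

An array like @code{without_null} is valid, but it is not a string, so it must not be passed to functions such as @code{strlen} that look for the terminating null.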
- @node Designated Inits
- @section Designated Initializers
- @cindex initializers with labeled elements
- @cindex labeled elements in initializers
- @cindex case labels in initializers
- @cindex designated initializers
- In a complex structure or long array, it's useful to indicate
- which field or element we are initializing.
- To designate specific array elements during initialization, include
- the array index in brackets, and an assignment operator, for each
- element:
- @example
- int foo[10] = @{ [3] = 42, [7] = 58 @};
- @end example
- @noindent
- This does the same thing as:
- @example
- int foo[10] = @{ 0, 0, 0, 42, 0, 0, 0, 58, 0, 0 @};
- @end example
- The array initialization can include non-designated element values
- alongside designated indices; these follow the expected ordering
- of the array initialization, so that
- @example
- int foo[10] = @{ [3] = 42, 43, 44, [7] = 58 @};
- @end example
- @noindent
- does the same thing as:
- @example
- int foo[10] = @{ 0, 0, 0, 42, 43, 44, 0, 58, 0, 0 @};
- @end example
- Note that you can only use constant expressions as array index values,
- not variables.
- If you need to initialize a subsequence of sequential array elements to
- the same value, you can specify a range:
- @example
- int foo[100] = @{ [0 ... 19] = 42, [20 ... 99] = 43 @};
- @end example
- @noindent
- Using a range this way is a GNU C extension.
- When subsequence ranges overlap, each element is initialized by the
- last specification that applies to it. Thus, this initialization is
- equivalent to the previous one.
- @example
- int foo[100] = @{ [0 ... 99] = 43, [0 ... 19] = 42 @};
- @end example
- @noindent
- as the second overrides the first for elements 0 through 19.
- The value used to initialize a range of elements is evaluated only
- once, for the first element in the range. So for example, this code
- @example
- int random_values[100]
- = @{ [0 ... 99] = get_random_number() @};
- @end example
- @noindent
- would initialize all 100 elements of the array @code{random_values} to
- the same value---probably not what is intended.
- Similarly, you can initialize specific fields of a structure variable
- by specifying the field name prefixed with a dot:
- @example
- struct point @{ int x; int y; @};
- struct point foo = @{ .y = 42 @};
- @end example
- @noindent
- The same syntax works for union variables as well:
- @example
- union int_double @{ int i; double d; @};
- union int_double foo = @{ .d = 34 @};
- @end example
- @noindent
- This converts the integer value 34 to a double and stores it
- in the union variable @code{foo}.
- You can designate both array elements and structure elements in
- the same initialization; for example, here's an array of point
- structures:
- @example
- struct point point_array[10] = @{ [4].y = 32, [6].y = 39 @};
- @end example
- Along with the capability to specify particular array and structure
- elements to initialize comes the possibility of initializing the same
- element more than once:
- @example
- int foo[10] = @{ [4] = 42, [4] = 98 @};
- @end example
- @noindent
- In such a case, the last initialization value is retained.
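Here is a sketch that combines several of these features. The enumeration and array names are hypothetical. It indexes an array by enumerators, and it mixes a range, an override and a positional value in one initializer.

```c
enum color { RED, GREEN, BLUE, COLOR_COUNT };

/* Designators tie each name to its enumerator, so the order in
   which the entries are written does not matter.  */
static const char *const color_names[COLOR_COUNT] = {
  [BLUE] = "blue",
  [RED] = "red",
  [GREEN] = "green",
};

/* A range (GNU C extension) sets every element to 1, then [3]
   is overridden, and the positional 43 lands in element 4.  */
static const int weights[10] = { [0 ... 9] = 1, [3] = 42, 43 };

/* C must be RED, GREEN or BLUE; it is not checked.  */
const char *
color_name (enum color c)
{
  return color_names[c];
}
```

Some compilers can warn about overridden initializers such as @code{[3]} here, since an override is often a mistake.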
- @node Auto Type
- @section Referring to a Type with @code{__auto_type}
- @findex __auto_type
- @findex typeof
- @cindex macros, types of arguments
- You can declare a variable copying the type from
- the initializer by using @code{__auto_type} instead of a particular type.
- Here's an example:
- @example
- #define max(a,b) \
- (@{ __auto_type _a = (a); \
- __auto_type _b = (b); \
- _a > _b ? _a : _b; @})
- @end example
- This defines @code{_a} to be of the same type as @code{a}, and
- @code{_b} to be of the same type as @code{b}. This is a useful thing
- to do in a macro that ought to be able to handle any type of data
- (@pxref{Macros and Auto Type}).
- The original GNU C method for obtaining the type of a value is to use
- @code{typeof}, which takes as an argument either a value or the name of
- a type. The previous example could also be written as:
- @example
- #define max(a,b) \
- (@{ typeof(a) _a = (a); \
- typeof(b) _b = (b); \
- _a > _b ? _a : _b; @})
- @end example
- @code{typeof} is more flexible than @code{__auto_type}; however, the
- principal use case for @code{typeof} is in variable declarations with
- initialization, which is exactly what @code{__auto_type} handles.
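The sketch below shows both forms at work, using hypothetical names. The macro uses @code{__auto_type} so that each argument is evaluated exactly once. The second function uses @code{__typeof__}, an alternate spelling of @code{typeof} that GNU C also accepts, for a declaration with no initializer, which @code{__auto_type} cannot handle.

```c
#define MAX(a, b)                          \
  ({ __auto_type max_a_ = (a);             \
     __auto_type max_b_ = (b);             \
     max_a_ > max_b_ ? max_a_ : max_b_; })

int
max_evaluates_once (void)
{
  int i = 3;
  int m = MAX (i++, 2);   /* i++ is evaluated only once.  */

  return m * 10 + i;      /* Here m is 3 and i is 4.  */
}

double
sum_with_scratch (double x)
{
  __typeof__ (x) scratch[2];   /* Array of 2 double, uninitialized.  */

  scratch[0] = x;
  scratch[1] = 2 * x;
  return scratch[0] + scratch[1];
}
```

With a naive definition such as @code{((a) > (b) ? (a) : (b))}, the call @code{MAX (i++, 2)} would increment @code{i} twice.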
- @node Local Variables
- @section Local Variables
- @cindex local variables
- @cindex variables, local
- Declaring a variable inside a function definition (@pxref{Function
- Definitions}) makes the variable name @dfn{local} to the containing
- block---that is, the containing pair of braces. More precisely, the
- variable's name is visible starting just after where it appears in the
- declaration, and its visibility continues until the end of the block.
- Local variables in C are generally @dfn{automatic} variables: each
- variable's storage exists only from the declaration to the end of the
- block. Execution of the declaration allocates the storage, computes
- the initial value, and stores it in the variable. The end of the
- block deallocates the storage.@footnote{Due to compiler optimizations,
- allocation and deallocation don't necessarily really happen at
- those times.}
- @strong{Warning:} Two declarations for the same local variable
- in the same scope are an error.
- @strong{Warning:} Automatic variables are stored in the run-time stack.
- The total space for the program's stack may be limited; therefore,
- in using very large arrays, it may be necessary to allocate
- them in some other way to stop the program from crashing.
- @strong{Warning:} If the declaration of an automatic variable does not
- specify an initial value, the variable starts out containing garbage.
- In this example, the value printed could be anything at all:
- @example
- @{
- int i;
- printf ("Print junk %d\n", i);
- @}
- @end example
- In a simple test program, that statement is likely to print 0, simply
- because every process starts with memory zeroed. But don't rely on it
- to be zero---that is erroneous.
- @strong{Note:} Make sure to store a value into each local variable (by
- assignment, or by initialization) before referring to its value.
- @node File-Scope Variables
- @section File-Scope Variables
- @cindex file-scope variables
- @cindex global variables
- @cindex variables, file-scope
- @cindex variables, global
- A variable declaration at the top level in a file (not inside a
- function definition) declares a @dfn{file-scope variable}. Loading a
- program allocates the storage for all the file-scope variables in it,
- and initializes them too.
- Each file-scope variable is either @dfn{static} (limited to one
- compilation module) or @dfn{global} (shared with all compilation
- modules in the program). To make the variable static, write the
- keyword @code{static} at the start of the declaration. Omitting
- @code{static} makes the variable global.
- The initial value for a file-scope variable can't depend on the
- contents of storage, and can't call any functions.
- @example
- int foo = 5; /* @r{Valid.} */
- int bar = foo; /* @r{Invalid!} */
- int bar = sin (1.0); /* @r{Invalid!} */
- @end example
- But it can use the address of another file-scope variable:
- @example
- int foo;
- int *bar = &foo; /* @r{Valid.} */
- int arr[5];
- int *bar3 = &arr[3]; /* @r{Valid.} */
- int *bar4 = arr + 4; /* @r{Valid.} */
- @end example
- It is valid for a module to have multiple declarations for a
- file-scope variable, as long as they are all global or all static, but
- at most one declaration can specify an initial value for it.
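These rules can be seen together in a short sketch; all the variable and function names are hypothetical.

```c
/* Static: visible only in this compilation module.  */
static int module_private = 5;

/* Two declarations of one global variable; only the second
   specifies an initial value.  */
int shared_total;
int shared_total = 2 * 3 + 1;   /* A constant expression is valid.  */

int table[5];
int *third_entry = &table[3];   /* An address is valid too.  */

int
read_module_private (void)
{
  return module_private;
}
```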
- @node Static Local Variables
- @section Static Local Variables
- @cindex static local variables
- @cindex variables, static local
- @findex static
- The keyword @code{static} in a local variable declaration says to
- allocate the storage for the variable permanently, just like a
- file-scope variable, even if the declaration is within a function.
- Here's an example:
- @example
- int
- increment_counter ()
- @{
- static int counter = 0;
- return ++counter;
- @}
- @end example
- The scope of the name @code{counter} runs from the declaration to the
- end of the containing block, just like an automatic local variable,
- but its storage is permanent, so the value persists from one call to
- the next. As a result, each call to @code{increment_counter}
- returns a different, unique value.
- The initial value of a static local variable has the same limitations
- as for file-scope variables: it can't depend on the contents of
- storage or call any functions. It can use the address of a file-scope
- variable or a static local variable, because those addresses are
- determined before the program runs.
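The following sketch, with hypothetical names, shows a static local variable that keeps its value across calls and another whose initial value is the address of a file-scope variable.

```c
static int shared_slot;

int
count_calls (void)
{
  static int calls = 0;   /* Initialized once, not on every call.  */

  return ++calls;
}

int *
slot_address (void)
{
  /* Valid: the address is determined before the program runs.
     An initializer such as `count_calls ()' would be invalid
     here, because it calls a function.  */
  static int *const slot = &shared_slot;

  return slot;
}
```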
- @node Extern Declarations
- @section @code{extern} Declarations
- @cindex @code{extern} declarations
- @cindex declarations, @code{extern}
- @findex extern
- An @code{extern} declaration is used to refer to a global variable
- whose principal declaration comes elsewhere---in the same module, or in
- another compilation module. It looks like this:
- @example
- extern @var{basetype} @var{decorated-variable};
- @end example
- Its meaning is that, in the current scope, the variable name refers to
- the file-scope variable of that name---which needs to be declared in a
- non-@code{extern}, non-@code{static} way somewhere else.
- For instance, if one compilation module has this global variable
- declaration
- @example
- int error_count = 0;
- @end example
- @noindent
- then other compilation modules can specify this
- @example
- extern int error_count;
- @end example
- @noindent
- to allow reference to the same variable.
- The usual place to write an @code{extern} declaration is at top level
- in a source file, but you can write an @code{extern} declaration
- inside a block to make a global or static file-scope variable
- accessible in that block.
- Since an @code{extern} declaration does not allocate space for the
- variable, it can omit the size of an array:
- @example
- extern int array[];
- @end example
- You can use @code{array} normally in all contexts where it is
- converted automatically to a pointer. However, to use it as the
- operand of @code{sizeof} is an error, since the size is unknown.
- It is valid to have multiple @code{extern} declarations for the same
- variable, even in the same scope, if they give the same type. They do
- not conflict---they agree. For an array, it is legitimate for some
- @code{extern} declarations to specify the size while others omit it.
- However, if two declarations give different sizes, that is an error.
- Likewise, you can use @code{extern} declarations at file scope
- (@pxref{File-Scope Variables}) followed by an ordinary global
- (non-static) declaration of the same variable. They do not conflict,
- because they say compatible things about the same meaning of the variable.
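Normally the @code{extern} declarations and the principal declaration live in different files. To keep this sketch self-contained, it puts them in one file, which is also valid. The names are hypothetical.

```c
extern int error_count;   /* Declaration only; allocates nothing.  */
extern int samples[];     /* Size omitted; sizeof is invalid here.  */

void
record_error (void)
{
  error_count++;
}

int
first_sample (void)
{
  return samples[0];
}

/* The principal declarations, which allocate the storage.  */
int error_count = 0;
int samples[3] = { 10, 20, 30 };

int
errors_so_far (void)
{
  extern int error_count;   /* Block-scope extern: same variable.  */

  return error_count;
}
```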
- @node Allocating File-Scope
- @section Allocating File-Scope Variables
- @cindex allocating file-scope variables
- @cindex file-scope variables, allocating
- Some file-scope declarations allocate space for the variable, and some
- don't.
- A file-scope declaration with an initial value @emph{must} allocate
- space for the variable; if there are two such declarations for the
- same variable, even in different compilation modules, they conflict.
- An @code{extern} declaration @emph{never} allocates space for the variable.
- If all the top-level declarations of a certain variable are
- @code{extern}, the variable never gets memory space. If that variable
- is used anywhere in the program, the use will be reported as an error,
- saying that the variable is not defined.
- @cindex tentative definition
- A file-scope declaration without an initial value is called a
- @dfn{tentative definition}. This is a strange hybrid: it @emph{can}
- allocate space for the variable, but does not insist. So it causes no
- conflict, no error, if the variable has another declaration that
- allocates space for it, perhaps in another compilation module. But if
- nothing else allocates space for the variable, the tentative
- definition will do it. Any number of compilation modules can declare
- the same variable in this way, and that is sufficient for all of them
- to use the variable.
- @c @opindex -fno-common
- @c @opindex --warn_common
- In programs that are very large or have many contributors, it may be
- wise to adopt the convention of never using tentative definitions.
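- You can use the compilation option @option{-fno-common} to make a
- variable that is tentatively defined in more than one compilation
- module a link-time error, or the linker option @option{--warn-common}
- to warn about such variables.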
- If a file-scope variable gets its space through a tentative
- definition, it starts out containing all zeros.
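- Here is a small sketch of tentative definitions within a single
- compilation module. The variable and function names are hypothetical.

```c
int hits;          /* Tentative definition. */
int hits;          /* Another one: no conflict. */

int limit;         /* Tentative definition... */
int limit = 10;    /* ...yields to this definition with an initial value. */

/* Return nonzero if hits still holds its initial all-zero value. */
int
hits_is_zero (void)
{
  return hits == 0;
}
```

- Nothing else allocates space for @code{hits}, so the tentative
- definition does, and the variable starts out as zero.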
- @node auto and register
- @section @code{auto} and @code{register}
- @cindex @code{auto} declarations
- @cindex @code{register} declarations
- @findex auto
- @findex register
- For historical reasons, you can write @code{auto} or @code{register}
- before a local variable declaration. @code{auto} merely emphasizes
- that the variable isn't static; it changes nothing.
- @code{register} suggests that the compiler store this variable in a
- register. However, GNU C ignores this suggestion, since it can
- choose the best variables to store in registers without any hints.
- It is an error to take the address of a variable declared
- @code{register}, so you cannot use the unary @samp{&} operator on it.
- If the variable is an array, you can't use it at all (other than as
- the operand of @code{sizeof}), which makes it rather useless.
- @node Omitting Types
- @section Omitting Types in Declarations
- @cindex omitting types in declarations
- The syntax of C traditionally allows omitting the data type in a
- declaration if it specifies a storage class, a type qualifier (see the
- next chapter), or @code{auto} or @code{register}. Then the type
- defaults to @code{int}. For example:
- @example
- auto foo = 42;
- @end example
- This is bad practice; if you see it, fix it.
- @node Type Qualifiers
- @chapter Type Qualifiers
- A declaration can include type qualifiers to advise the compiler
- about how the variable will be used. There are three different
- qualifiers, @code{const}, @code{volatile} and @code{restrict}. They
- pertain to different issues, so you can use more than one together.
- For instance, @code{const volatile} describes a value that the
- program is not allowed to change, but might have a different value
- each time the program examines it. (This might perhaps be a special
- hardware register, or part of shared memory.)
- If you are just learning C, you can skip this chapter.
- @menu
- * const:: Variables whose values don't change.
- * volatile:: Variables whose values may be accessed
- or changed outside of the control of
- this program.
- * restrict Pointers:: Restricted pointers for code optimization.
- * restrict Pointer Example:: Example of how that works.
- @end menu
- @node const
- @section @code{const} Variables and Fields
- @cindex @code{const} variables and fields
- @cindex variables, @code{const}
- @findex const
- You can mark a variable as ``constant'' by writing @code{const} in
- front of the declaration. This says to treat any assignment to that
- variable as an error. It may also permit some compiler
- optimizations---for instance, to fetch the value only once to satisfy
- multiple references to it. The construct looks like this:
- @example
- const double pi = 3.14159;
- @end example
- After this definition, the code can use the variable @code{pi}
- but cannot assign a different value to it.
- @example
- pi = 3.0; /* @r{Error!} */
- @end example
- Simple variables that are constant can be used for the same purposes
- as enumeration constants, and they are not limited to integers. The
- constantness of the variable propagates into pointers, too.
- A pointer type can specify that the @emph{target} is constant. For
- example, the pointer type @code{const double *} stands for a pointer
- to a constant @code{double}. That's the type that results from taking
- the address of @code{pi}. Such a pointer can't be dereferenced in the
- left side of an assignment.
- @example
- *(&pi) = 3.0; /* @r{Error!} */
- @end example
- Nonconstant pointers can be converted automatically to constant
- pointers, but not vice versa. For instance,
- @example
- const double *cptr;
- double *ptr;
- cptr = &pi; /* @r{Valid.} */
- cptr = ptr; /* @r{Valid.} */
- ptr = cptr; /* @r{Error!} */
- ptr = &pi; /* @r{Error!} */
- @end example
- This is not an ironclad protection against modifying the value. You
- can always cast the constant pointer to a nonconstant pointer type:
- @example
- ptr = (double *)cptr; /* @r{Valid.} */
- ptr = (double *)&pi; /* @r{Valid.} */
- @end example
- However, @code{const} provides a way to show that a certain function
- won't modify the data structure whose address is passed to it. Here's
- an example:
- @example
- int
- string_length (const char *string)
- @{
- int count = 0;
- while (*string++)
- count++;
- return count;
- @}
- @end example
- @noindent
- Using @code{const char *} for the parameter is a way of saying this
- function never modifies the memory of the string itself.
- In calling @code{string_length}, you can specify an ordinary
- @code{char *} since that can be converted automatically to @code{const
- char *}.
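- The following sketch shows both sides of this: a function that
- promises not to modify its argument, and a function that breaks the
- promise with a cast. The names @code{total} and @code{clear_first}
- are made up for the example. Note that in standard C, storing through
- such a cast is well defined only if the object pointed to was not
- itself defined as @code{const}.

```c
#include <stddef.h>

/* Sum COUNT values without modifying them. */
double
total (const double *values, size_t count)
{
  double sum = 0.0;
  for (size_t i = 0; i < count; i++)
    sum += values[i];
  return sum;
}

/* Cast away const to store through the pointer.  Well defined only
   when the object pointed to was not itself defined as const.  */
void
clear_first (const double *values)
{
  double *writable = (double *) values;
  writable[0] = 0.0;
}
```

- A caller can pass an ordinary @code{double} array to either function,
- since @code{double *} converts automatically to @code{const double *}.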
- @node volatile
- @section @code{volatile} Variables and Fields
- @cindex @code{volatile} variables and fields
- @cindex variables, @code{volatile}
- @findex volatile
- The GNU C compiler often performs optimizations that eliminate the
- need to write or read a variable. For instance,
- @example
- int foo;
- foo = 1;
- foo++;
- @end example
- @noindent
- might simply store the value 2 into @code{foo}, without ever storing 1.
- These optimizations can also apply to structure fields in some cases.
- If the memory containing @code{foo} is shared with another program,
- or if it is examined asynchronously by hardware, such optimizations
- could confuse the communication. Using @code{volatile} is one way
- to prevent them.
- Writing @code{volatile} with the type in a variable or field declaration
- says that the value may be examined or changed for reasons outside the
- control of the program at any moment. Therefore, the program must
- execute in a careful way to assure correct interaction with those
- accesses, whenever they may occur.
- The simplest use looks like this:
- @example
- volatile int lock;
- @end example
- This directs the compiler not to do certain common optimizations on
- use of the variable @code{lock}. All the reads and writes for a volatile
- variable or field are really done, and done in the order specified
- by the source code. Thus, this code:
- @example
- lock = 1;
- list = list->next;
- if (lock)
- lock_broken (&lock);
- lock = 0;
- @end example
- @noindent
- really stores the value 1 in @code{lock}, even though there is no
- sign it is really used, and the @code{if} statement reads and
- checks the value of @code{lock}, rather than assuming it is still 1.
- A limited amount of optimization can be done, in principle, on
- @code{volatile} variables and fields: multiple references between two
- sequence points (@pxref{Sequence Points}) can be simplified together.
- Use of @code{volatile} does not eliminate the flexibility in ordering
- the computation of the operands of most operators. For instance, in
- @code{lock + foo ()}, the order of accessing @code{lock} and calling
- @code{foo} is not specified, so they may be done in either order; the
- fact that @code{lock} is @code{volatile} has no effect on that.
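- Hardware registers and shared memory are hard to demonstrate in a
- short program, so this sketch uses a signal handler as a stand-in for
- something that changes a variable outside the normal flow of the
- code. It relies on the standard @code{signal} and @code{raise}
- functions and the type @code{sig_atomic_t}; the other names are
- hypothetical.

```c
#include <signal.h>

/* Written by the handler, read by ordinary code. */
static volatile sig_atomic_t got_signal = 0;

static void
on_signal (int signo)
{
  (void) signo;
  got_signal = 1;
}

/* Install the handler, raise the signal, and report the flag.
   Returns -1 if the handler could not be installed.  */
int
demo_volatile_flag (void)
{
  if (signal (SIGINT, on_signal) == SIG_ERR)
    return -1;
  got_signal = 0;
  raise (SIGINT);
  /* volatile makes this a real read of the variable.  */
  return got_signal;
}
```

- Because @code{got_signal} is @code{volatile}, the final read cannot be
- replaced by the value 0 that the function stored just before.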
- @node restrict Pointers
- @section @code{restrict}-Qualified Pointers
- @cindex @code{restrict} pointers
- @cindex pointers, @code{restrict}-qualified
- @findex restrict
- You can declare a pointer as ``restricted'' using the @code{restrict}
- type qualifier, like this:
- @example
- int *restrict p = x;
- @end example
- @noindent
- This enables better optimization of code that uses the pointer.
- If @code{p} is declared with @code{restrict}, and then the code
- references the object that @code{p} points to (using @code{*p} or
- @code{p[@var{i}]}), the @code{restrict} declaration promises that the
- code will not access that object in any other way---only through
- @code{p}.
- For instance, it means the code must not use another pointer
- to access the same space, as shown here:
- @example
- int *restrict p = @var{whatever};
- int *q = p;
- foo (*p, *q);
- @end example
- @noindent
- That contradicts the @code{restrict} promise by accessing the object
- that @code{p} points to using @code{q}, which bypasses @code{p}.
- Likewise, it must not do this:
- @example
- int *restrict p = @var{whatever};
- struct @{ int *a, *b; @} s;
- s.a = p;
- foo (*p, *s.a);
- @end example
- @noindent
- This example uses a structure field instead of the variable @code{q}
- to hold the other pointer, and that contradicts the promise just the
- same.
- The keyword @code{restrict} also promises that @code{p} won't point to
- the allocated space of any automatic or static variable. So the code
- must not do this:
- @example
- int a;
- int *restrict p = &a;
- foo (*p, a);
- @end example
- @noindent
- because that does direct access to the object (@code{a}) that @code{p}
- points to, which bypasses @code{p}.
- If the code makes such promises with @code{restrict} then breaks them,
- execution is unpredictable.
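- For contrast, here is a sketch of a use that keeps the promise. The
- function @code{add_into} is invented for illustration; its caller is
- expected to pass two arrays that do not overlap.

```c
#include <stddef.h>

/* Add each element of SRC into the matching element of DST.
   The caller promises the two arrays do not overlap.  */
void
add_into (size_t count, int *restrict dst, const int *restrict src)
{
  for (size_t i = 0; i < count; i++)
    dst[i] += src[i];
}
```

- Calling it as @code{add_into (3, a, b)} with two separate arrays is
- fine. Calling it as @code{add_into (3, a, a)} would break the promise.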
- @node restrict Pointer Example
- @section @code{restrict} Pointer Example
- Here are examples where @code{restrict} enables real optimization.
- In this example, @code{restrict} assures GCC that the array @code{out}
- points to does not overlap with the array @code{in} points to.
- @example
- void
- process_data (const char *in,
- char * restrict out,
- size_t size)
- @{
- size_t i;
- for (i = 0; i < size; i++)
- out[i] = in[i] + in[i + 1];
- @}
- @end example
- Here's a simple tree structure, where each tree node holds data of
- type @code{PAYLOAD} plus two subtrees.
- @example
- struct foo
- @{
- PAYLOAD payload;
- struct foo *left;
- struct foo *right;
- @};
- @end example
- Now here's a function to null out both pointers in the @code{left}
- subtree.
- @example
- void
- null_left (struct foo *a)
- @{
- a->left->left = NULL;
- a->left->right = NULL;
- @}
- @end example
- Since @code{*a} and @code{*a->left} have the same data type,
- they could legitimately alias (@pxref{Aliasing}). Therefore,
- the compiled code for @code{null_left} must read @code{a->left}
- again from memory when executing the second assignment statement.
- We can enable optimization, so that it does not need to read
- @code{a->left} again, by writing @code{null_left} in a less
- obvious way.
- @example
- void
- null_left (struct foo *a)
- @{
- struct foo *b = a->left;
- b->left = NULL;
- b->right = NULL;
- @}
- @end example
- A more elegant way to fix this is with @code{restrict}.
- @example
- void
- null_left (struct foo *restrict a)
- @{
- a->left->left = NULL;
- a->left->right = NULL;
- @}
- @end example
- Declaring @code{a} as @code{restrict} asserts that other pointers such
- as @code{a->left} will not point to the same memory space as @code{a}.
- Therefore, the memory location @code{a->left->left} cannot be the same
- memory as @code{a->left}. Knowing this, the compiled code may avoid
- reloading @code{a->left} for the second statement.
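- To try the @code{restrict} version, the pieces can be put together as
- in this sketch. It assumes @code{PAYLOAD} is @code{int}, which the
- text above does not specify.

```c
#include <stddef.h>

typedef int PAYLOAD;   /* Assumption: any payload type would do. */

struct foo
{
  PAYLOAD payload;
  struct foo *left;
  struct foo *right;
};

/* Null out both pointers in the left subtree of A. */
void
null_left (struct foo *restrict a)
{
  a->left->left = NULL;
  a->left->right = NULL;
}
```

- The caller keeps the promise as long as the node passed in and its
- left child are different objects.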
- @node Functions
- @chapter Functions
- @cindex functions
- We have already presented many examples of functions, so if you've
- read this far, you basically understand the concept of a function. It
- is vital, nonetheless, to have a chapter in the manual that collects
- all the information about functions.
- @menu
- * Function Definitions:: Writing the body of a function.
- * Function Declarations:: Declaring the interface of a function.
- * Function Calls:: Using functions.
- * Function Call Semantics:: Call-by-value argument passing.
- * Function Pointers:: Using references to functions.
- * The main Function:: Where execution of a GNU C program begins.
- * Advanced Definitions:: Advanced features of function definitions.
- * Obsolete Definitions:: Obsolete features still used
- in function definitions in old code.
- @end menu
- @node Function Definitions
- @section Function Definitions
- @cindex function definitions
- @cindex defining functions
- We have already presented many examples of function definitions. To
- summarize the rules, a function definition looks like this:
- @example
- @var{returntype}
- @var{functionname} (@var{parm_declarations}@r{@dots{}})
- @{
- @var{body}
- @}
- @end example
- The part before the open-brace is called the @dfn{function header}.
- Write @code{void} as the @var{returntype} if the function does
- not return a value.
- @menu
- * Function Parameter Variables:: Syntax and semantics
- of function parameters.
- * Forward Function Declarations:: Functions can only be called after
- they have been defined or declared.
- * Static Functions:: Limiting visibility of a function.
- * Arrays as Parameters:: Functions that accept array arguments.
- * Structs as Parameters:: Functions that accept structure arguments.
- @end menu
- @node Function Parameter Variables
- @subsection Function Parameter Variables
- @cindex function parameter variables
- @cindex parameter variables in functions
- @cindex parameter list
- A function parameter variable is a local variable (@pxref{Local
- Variables}) used within the function to store the value passed as an
- argument in a call to the function. Usually we say ``function
- parameter'' or ``parameter'' for short, not mentioning the fact that
- it's a variable.
- We declare these variables in the beginning of the function
- definition, in the @dfn{parameter list}. For example,
- @example
- fib (int n)
- @end example
- @noindent
- has a parameter list with one function parameter @code{n}, which has
- type @code{int}.
- Function parameter declarations differ from ordinary variable
- declarations in several ways:
- @itemize @bullet
- @item
- Inside the function definition header, commas separate parameter
- declarations, and each parameter needs a complete declaration
- including the type. For instance, if a function @code{foo} has two
- @code{int} parameters, write this:
- @example
- foo (int a, int b)
- @end example
- You can't share the common @code{int} between the two declarations:
- @example
- foo (int a, b) /* @r{Invalid!} */
- @end example
- @item
- A function parameter variable is initialized to whatever value is
- passed in the function call, so its declaration cannot specify an
- initial value.
- @item
- Writing an array type in a function parameter declaration has the
- effect of declaring it as a pointer. The size specified for the array
- has no effect at all, and we normally omit the size. Thus,
- @example
- foo (int a[5])
- foo (int a[])
- foo (int *a)
- @end example
- @noindent
- are equivalent.
- @item
- The scope of the parameter variables is the entire function body,
- notwithstanding the fact that they are written in the function header,
- which is just outside the function body.
- @end itemize
- If a function has no parameters, it would be most natural for the
- list of parameters in its definition to be empty. But that, in C, has
- a special meaning for historical reasons: ``Do not check that calls to
- this function have the right number of arguments.'' Thus,
- @example
- int
- foo ()
- @{
- return 5;
- @}
- int
- bar (int x)
- @{
- return foo (x);
- @}
- @end example
- @noindent
- would not report a compilation error in passing @code{x} as an
- argument to @code{foo}. By contrast,
- @example
- int
- foo (void)
- @{
- return 5;
- @}
- int
- bar (int x)
- @{
- return foo (x);
- @}
- @end example
- @noindent
- would report an error because @code{foo} is supposed to receive
- no arguments.
- @node Forward Function Declarations
- @subsection Forward Function Declarations
- @cindex forward function declarations
- @cindex function declarations, forward
- The order of the function definitions in the source code makes no
- difference, except that each function needs to be defined or declared
- before code uses it.
- The definition of a function also declares its name for the rest of
- the containing scope. But what if you want to call the function
- before its definition? To permit that, write a compatible declaration
- of the same function, before the first call. A declaration that
- prefigures a subsequent definition in this way is called a
- @dfn{forward declaration}. The function declaration can be at top
- @c ??? file scope
- level or within a block, and it applies until the end of the containing
- scope.
- @xref{Function Declarations}, for more information about these
- declarations.
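- Mutually recursive functions are a typical case where a forward
- declaration is needed. This sketch uses two hypothetical functions,
- @code{is_even} and @code{is_odd}, each of which calls the other.

```c
/* Forward declaration: is_odd is called before its definition. */
int is_odd (unsigned int n);

int
is_even (unsigned int n)
{
  return n == 0 ? 1 : is_odd (n - 1);
}

int
is_odd (unsigned int n)
{
  return n == 0 ? 0 : is_even (n - 1);
}
```

- Whichever function is defined first, the other one must be declared
- before that definition's body calls it.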
- @node Static Functions
- @subsection Static Functions
- @cindex static functions
- @cindex functions, static
- @findex static
- The keyword @code{static} in a function definition limits the
- visibility of the name to the current compilation module. (That's the
- same thing @code{static} does in variable declarations;
- @pxref{File-Scope Variables}.) For instance, if one compilation module
- contains this code:
- @example
- static int
- foo (void)
- @{
- @r{@dots{}}
- @}
- @end example
- @noindent
- then the code of that compilation module can call @code{foo} anywhere
- after the definition, but other compilation modules cannot refer to it
- at all.
- @cindex forward declaration
- @cindex static function, declaration
- To call @code{foo} before its definition, it needs a forward
- declaration, which should use @code{static} since the function
- definition does. For this function, it looks like this:
- @example
- static int foo (void);
- @end example
- It is generally wise to use @code{static} on the definitions of
- functions that won't be called from outside the same compilation
- module. This makes sure that calls are not added in other modules.
- If programmers decide to change the function's calling convention, or
- want to understand all the consequences of its use, they will only have to
- check for calls in the same compilation module.
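- A sketch of this convention follows. The helper @code{clamp} is
- static, while @code{clamp_percent} is meant to be called from other
- modules; both names are invented for the example.

```c
/* Forward declaration; static matches the definition below. */
static int clamp (int value, int low, int high);

/* Visible to other compilation modules. */
int
clamp_percent (int value)
{
  return clamp (value, 0, 100);
}

/* Visible only within this compilation module. */
static int
clamp (int value, int low, int high)
{
  if (value < low)
    return low;
  if (value > high)
    return high;
  return value;
}
```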
- @node Arrays as Parameters
- @subsection Arrays as Parameters
- @cindex arrays as parameters
- @cindex functions with array parameters
- Arrays in C are not first-class objects: it is impossible to copy
- them. So they cannot be passed as arguments like other values.
- @xref{Limitations of C Arrays}. Rather, array parameters work in
- a special way.
- @menu
- * Array Parm Pointer::
- * Passing Array Args::
- * Array Parm Qualifiers::
- @end menu
- @node Array Parm Pointer
- @subsubsection Array parameters are pointers
- Declaring a function parameter variable as an array really gives it a
- pointer type. C does this because an expression with array type, if
- used as an argument in a function call, is converted automatically to
- a pointer (to the zeroth element of the array). If you declare the
- corresponding parameter as an ``array'', it will work correctly with
- the pointer value that really gets passed.
- This relates to the fact that C does not check array bounds in access
- to elements of the array (@pxref{Accessing Array Elements}).
- For example, in this function,
- @example
- void
- clobber4 (int array[20])
- @{
- array[4] = 0;
- @}
- @end example
- @noindent
- the parameter @code{array}'s real type is @code{int *}; the specified
- length, 20, has no effect on the program. You can leave out the length
- and write this:
- @example
- void
- clobber4 (int array[])
- @{
- array[4] = 0;
- @}
- @end example
- @noindent
- or write the parameter declaration explicitly as a pointer:
- @example
- void
- clobber4 (int *array)
- @{
- array[4] = 0;
- @}
- @end example
- They are all equivalent.
- @node Passing Array Args
- @subsubsection Passing array arguments
- The function call passes this pointer by
- value, like all argument values in C@. However, the result is
- paradoxical in that the array itself is passed by reference: its
- contents are treated as shared memory---shared between the caller and
- the called function, that is. When @code{clobber4} assigns to element
- 4 of @code{array}, the effect is to alter element 4 of the array
- specified in the call.
- @example
- #include <stdio.h> /* @r{Declares @code{printf}.} */
- #include <stdlib.h> /* @r{Defines @code{EXIT_SUCCESS}.} */
- int
- main (void)
- @{
- int data[] = @{1, 2, 3, 4, 5, 6@};
- int i;
- /* @r{Show the initial value of element 4.} */
- for (i = 0; i < 6; i++)
- printf ("data[%d] = %d\n", i, data[i]);
- printf ("\n");
- clobber4 (data);
- /* @r{Show that element 4 has been changed.} */
- for (i = 0; i < 6; i++)
- printf ("data[%d] = %d\n", i, data[i]);
- printf ("\n");
- return EXIT_SUCCESS;
- @}
- @end example
- @noindent
- shows that @code{data[4]} has become zero after the call to
- @code{clobber4}.
- The array @code{data} has 6 elements, but passing it to a function
- whose argument type is written as @code{int [20]} is not an error,
- because that really stands for @code{int *}. The pointer that is the
- real argument carries no indication of the length of the array it
- points into. It is not required to point to the beginning of the
- array, either. For instance,
- @example
- clobber4 (data+1);
- @end example
- @noindent
- passes an ``array'' that starts at element 1 of @code{data}, and the
- effect is to zero @code{data[5]} instead of @code{data[4]}.
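- Since the pointer carries no length, a common convention is to pass
- the number of elements as a separate argument. This is a sketch with
- a hypothetical function @code{sum_elements}.

```c
#include <stddef.h>

/* Sum COUNT elements starting at ARRAY.  The length must be passed
   separately, since the pointer does not carry it.  */
int
sum_elements (const int array[], size_t count)
{
  int total = 0;
  for (size_t i = 0; i < count; i++)
    total += array[i];
  return total;
}
```

- With the six-element @code{data} array above, @code{sum_elements
- (data, 6)} adds every element, while @code{sum_elements (data + 1, 2)}
- adds only @code{data[1]} and @code{data[2]}.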
- If all calls to the function will provide an array of a particular
- size, you can specify the size of the array to be @code{static}:
- @example
- void
- clobber4 (int array[static 20])
- @r{@dots{}}
- @end example
- @noindent
- This is a promise to the compiler that the function will always be
- called with an array of at least 20 elements, so that the compiler can optimize
- code accordingly. If the code breaks this promise and calls the
- function with, for example, a shorter array, unpredictable things may
- happen.
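- Here is a complete function using this syntax, as a sketch. The
- function @code{sum_first_four} is hypothetical. Standard C states the
- promise as a minimum, so a longer array is acceptable too.

```c
/* The caller promises at least 4 accessible elements. */
int
sum_first_four (const int array[static 4])
{
  return array[0] + array[1] + array[2] + array[3];
}
```

- Passing a pointer into the middle of a larger array is also valid,
- provided at least four elements remain from that point on.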
- @node Array Parm Qualifiers
- @subsubsection Type qualifiers on array parameters
- You can use the type qualifiers @code{const}, @code{restrict}, and
- @code{volatile} with array parameters; for example:
- @example
- void
- clobber4 (volatile int array[20])
- @r{@dots{}}
- @end example
- @noindent
- denotes that @code{array} is equivalent to a pointer to a volatile
- @code{int}. Alternatively:
- @example
- void
- clobber4 (int array[const 20])
- @r{@dots{}}
- @end example
- @noindent
- makes the array parameter equivalent to a constant pointer to an
- @code{int}. If we want the @code{clobber4} function to succeed, it
- would not make sense to write
- @example
- void
- clobber4 (const int array[20])
- @r{@dots{}}
- @end example
- @noindent
- as this would tell the compiler that the parameter should point to an
- array of constant @code{int} values, and then we would not be able to
- store zeros in them.
- In a function with multiple array parameters, you can use @code{restrict}
- to tell the compiler that each array parameter passed in will be distinct:
- @example
- void
- foo (int array1[restrict 10], int array2[restrict 10])
- @r{@dots{}}
- @end example
- @noindent
- Using @code{restrict} promises the compiler that callers will
- not pass in the same array for more than one @code{restrict} array
- parameter. Knowing this enables the compiler to perform better code
- optimization. This is the same effect as using @code{restrict}
- pointers (@pxref{restrict Pointers}), but makes it clear when reading
- the code that an array of a specific size is expected.
- @node Structs as Parameters
- @subsection Functions That Accept Structure Arguments
- Structures in GNU C are first-class objects, so using them as function
- parameters and arguments works in the natural way. This function
- @code{swapfoo} takes a @code{struct foo} with two fields as argument,
- and returns a structure of the same type but with the fields
- exchanged.
- @example
- struct foo @{ int a, b; @};
- struct foo x;
- struct foo
- swapfoo (struct foo inval)
- @{
- struct foo outval;
- outval.a = inval.b;
- outval.b = inval.a;
- return outval;
- @}
- @end example
- This simpler definition of @code{swapfoo} avoids using a local
- variable to hold the result about to be returned, by using a structure
- constructor (@pxref{Structure Constructors}), like this:
- @example
- struct foo
- swapfoo (struct foo inval)
- @{
- return (struct foo) @{ inval.b, inval.a @};
- @}
- @end example
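- Because a structure argument is passed by value, the function works
- on its own copy. The following sketch, with an invented type
- @code{struct point} and function @code{shifted}, modifies its
- parameter without affecting the caller's structure.

```c
struct point { int x, y; };

/* P is a copy of the caller's structure, so changing it here
   leaves the caller's structure alone.  */
struct point
shifted (struct point p, int dx)
{
  p.x += dx;
  return p;
}
```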
- It is valid to define a structure type in a function's parameter list,
- as in
- @example
- int
- frob_bar (struct bar @{ int a, b; @} inval)
- @{
- @var{body}
- @}
- @end example
- @noindent
- and @var{body} can access the fields of @code{inval} since the
- structure type @code{struct bar} is defined for the whole function
- body. However, there is no way to create a @code{struct bar} argument
- to pass to @code{frob_bar}, except with kludges. As a result,
- defining a structure type in a parameter list is useless in practice.
- @node Function Declarations
- @section Function Declarations
- @cindex function declarations
- @cindex declaring functions
- To call a function, or use its name as a pointer, a @dfn{function
- declaration} for the function name must be in effect at that point in
- the code. The function's definition serves as a declaration of that
- function for the rest of the containing scope, but to use the function
- in code before the definition, or from another compilation module, a
- separate function declaration must precede the use.
- A function declaration looks like the start of a function definition.
- It begins with the return value type (@code{void} if none) and the
- function name, followed by argument declarations in parentheses
- (though these can sometimes be omitted). But that's as far as the
- similarity goes: instead of the function body, the declaration uses a
- semicolon.
- @cindex function prototype
- @cindex prototype of a function
- A declaration that specifies argument types is called a @dfn{function
- prototype}. You can include the argument names or omit them. The
- names, if included in the declaration, have no effect, but they may
- serve as documentation.
- This form of prototype specifies fixed argument types:
- @example
- @var{rettype} @var{function} (@var{argtypes}@r{@dots{}});
- @end example
- @noindent
- This form says the function takes no arguments:
- @example
- @var{rettype} @var{function} (void);
- @end example
- @noindent
- This form declares types for some arguments, and allows additional
- arguments whose types are not specified:
- @example
- @var{rettype} @var{function} (@var{argtypes}@r{@dots{}}, ...);
- @end example
- For a parameter that's an array of variable length, you can write
- its declaration with @samp{*} where the ``length'' of the array would
- normally go; for example, these are all equivalent.
- @example
- double maximum (int n, int m, double a[n][m]);
- double maximum (int n, int m, double a[*][*]);
- double maximum (int n, int m, double a[ ][*]);
- double maximum (int n, int m, double a[ ][m]);
- @end example
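- A definition matching these declarations might look like the sketch
- below. The body is an assumption, since only the declarations are
- given above. The @samp{*} form is for declarations only; the
- definition has to name the actual lengths.

```c
/* Return the largest element of the N-by-M array A.
   Assumes n and m are both at least 1.  */
double
maximum (int n, int m, double a[n][m])
{
  double best = a[0][0];
  for (int i = 0; i < n; i++)
    for (int j = 0; j < m; j++)
      if (a[i][j] > best)
        best = a[i][j];
  return best;
}
```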
- @noindent
- The old-fashioned form of declaration, which is not a prototype, says
- nothing about the types of arguments or how many they should be:
- @example
- @var{rettype} @var{function} ();
- @end example
- @strong{Warning:} Arguments passed to a function declared without a
- prototype are converted with the default argument promotions
- (@pxref{Argument Promotions}). Likewise for additional arguments whose
- types are unspecified.
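- The effect on additional arguments can be seen in this sketch of a
- function with a variable number of arguments. It uses the standard
- @code{<stdarg.h>} macros; the function @code{average} is hypothetical.
- A @code{float} passed as an additional argument is promoted to
- @code{double}, so the function must fetch it as @code{double}.

```c
#include <stdarg.h>

/* Average COUNT floating-point arguments.  Each one arrives as a
   double, even if the caller wrote a float.  */
double
average (int count, ...)
{
  va_list args;
  double sum = 0.0;

  va_start (args, count);
  for (int i = 0; i < count; i++)
    sum += va_arg (args, double);
  va_end (args);

  return count > 0 ? sum / count : 0.0;
}
```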
- Function declarations are usually written at the top level in a source file,
- but you can also put them inside code blocks. Then the function name
- is visible for the rest of the containing scope. For example:
- @example
- void
- foo (char *file_name)
- @{
- void save_file (char *);
- save_file (file_name);
- @}
- @end example
- If another part of the code tries to call the function
- @code{save_file}, this declaration won't be in effect there. So the
- function will get an implicit declaration of the form @code{extern int
- save_file ();}. That conflicts with the explicit declaration
- here, and the discrepancy generates a warning.
- The syntax of C traditionally allows omitting the data type in a
- function declaration if it specifies a storage class or a qualifier.
- Then the type defaults to @code{int}. For example:
- @example
- static foo (double x);
- @end example
- @noindent
- defaults the return type to @code{int}.
- This is bad practice; if you see it, fix it.
- Calling a function that is undeclared has the effect of creating an
- @dfn{implicit} declaration in the innermost containing scope,
- equivalent to this:
- @example
- extern int @var{function} ();
- @end example
- @noindent
- This declaration says that the function returns @code{int} but leaves
- its argument types unspecified. If that does not accurately fit the
- function, then the program @strong{needs} an explicit declaration of
- the function with argument types in order to call it correctly.
- Implicit declarations are deprecated, and a function call that creates one
- causes a warning.
- @node Function Calls
- @section Function Calls
- @cindex function calls
- @cindex calling functions
- Starting a program automatically calls the function named @code{main}
- (@pxref{The main Function}). Aside from that, a function does nothing
- except when it is @dfn{called}. That occurs during the execution of a
- function-call expression specifying that function.
- A function-call expression looks like this:
- @example
- @var{function} (@var{arguments}@r{@dots{}})
- @end example
- Most of the time, @var{function} is a function name. However, it can
- also be an expression with a function pointer value; that way, the
- program can determine at run time which function to call.
- The @var{arguments} are a series of expressions separated by commas.
- Each expression specifies one argument to pass to the function.
- The list of arguments in a function call looks just like use of the
- comma operator (@pxref{Comma Operator}), but the fact that it fills
- the parentheses of a function call gives it a different meaning.
- Here's an example of a function call, taken from an example near the
- beginning (@pxref{Complete Program}).
- @example
- printf ("Fibonacci series item %d is %d\n",
- 19, fib (19));
- @end example
- The three arguments given to @code{printf} are a constant string, the
- integer 19, and the integer returned by @code{fib (19)}.
- @node Function Call Semantics
- @section Function Call Semantics
- @cindex function call semantics
- @cindex semantics of function calls
- @cindex call-by-value
- The meaning of a function call is to compute the specified argument
- expressions, convert their values according to the function's
- declaration, then run the function giving it copies of the converted
- values. (This method of argument passing is known as
- @dfn{call-by-value}.) When the function finishes, the value it
- returns becomes the value of the function-call expression.
- Call-by-value implies that an assignment to the function argument
- variable has no direct effect on the caller. For instance,
- @example
- #include <stdlib.h> /* @r{Defines @code{EXIT_SUCCESS}.} */
- #include <stdio.h> /* @r{Declares @code{printf}.} */
- void
- subroutine (int x)
- @{
- x = 5;
- @}
- int
- main (void)
- @{
- int y = 20;
- subroutine (y);
- printf ("y is %d\n", y);
- return EXIT_SUCCESS;
- @}
- @end example
- @noindent
- prints @samp{y is 20}. Calling @code{subroutine} initializes @code{x}
- from the value of @code{y}, but this does not establish any other
- relationship between the two variables. Thus, the assignment to
- @code{x}, inside @code{subroutine}, changes only @emph{that} @code{x}.
- If an argument's type is specified by the function's declaration, the
- function call converts the argument expression to that type if
- possible. If the conversion is impossible, that is an error.
- If the function's declaration doesn't specify the type of that
- argument, then the @emph{default argument promotions} apply.
- @xref{Argument Promotions}.
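- As a small illustration of the conversion, consider a function whose
- prototype declares a @code{double} parameter but which is called with
- an @code{int} argument. The names @code{half} and
- @code{half_of_seven} are hypothetical.

```c
/* Hypothetical function whose prototype declares a double parameter.  */
double
half (double x)
{
  return x / 2;
}

/* Illustrative caller: the int argument 7 is converted to the double
   value 7.0 before the call, because the prototype says double.  */
double
half_of_seven (void)
{
  return half (7);
}
```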
- @node Function Pointers
- @section Function Pointers
- @cindex function pointers
- @cindex pointers to functions
- A function name refers to a fixed function. Sometimes it is useful to
- call a function to be determined at run time; to do this, you can use
- a @dfn{function pointer value} that points to the chosen function
- (@pxref{Pointers}).
- Pointer-to-function types can be used to declare variables and other
- data, including array elements, structure fields, and union
- alternatives. They can also be used for function arguments and return
- values. These types have the peculiarity that they are never
- converted automatically to @code{void *} or vice versa. However, you
- can do that conversion with a cast.
- @menu
- * Declaring Function Pointers:: How to declare a pointer to a function.
- * Assigning Function Pointers:: How to assign values to function pointers.
- * Calling Function Pointers:: How to call functions through pointers.
- @end menu
- @node Declaring Function Pointers
- @subsection Declaring Function Pointers
- @cindex declaring function pointers
- @cindex function pointers, declaring
- The declaration of a function pointer variable (or structure field)
- looks almost like a function declaration, except it has an additional
- @samp{*} just before the variable name. Proper nesting requires a
- pair of parentheses around the two of them. For instance, @code{int
- (*a) ();} says, ``Declare @code{a} as a pointer such that @code{*a} is
- an @code{int}-returning function.''
- Contrast these three declarations:
- @example
- /* @r{Declare a function returning @code{char *}.} */
- char *a (char *);
- /* @r{Declare a pointer to a function returning @code{char}.} */
- char (*a) (char *);
- /* @r{Declare a pointer to a function returning @code{char *}.} */
- char *(*a) (char *);
- @end example
- The possible argument types of the function pointed to are the same
- as in a function declaration. You can write a prototype
- that specifies all the argument types:
- @example
- @var{rettype} (*@var{function}) (@var{arguments}@r{@dots{}});
- @end example
- @noindent
- or one that specifies some and leaves the rest unspecified:
- @example
- @var{rettype} (*@var{function}) (@var{arguments}@r{@dots{}}, ...);
- @end example
- @noindent
- or one that says there are no arguments:
- @example
- @var{rettype} (*@var{function}) (void);
- @end example
- You can also write a non-prototype declaration that says
- nothing about the argument types:
- @example
- @var{rettype} (*@var{function}) ();
- @end example
- For example, here's a declaration for a variable that should
- point to some arithmetic function that operates on two @code{double}s:
- @example
- double (*binary_op) (double, double);
- @end example
- Structure fields, union alternatives, and array elements can be
- function pointers; so can parameter variables. The function pointer
- declaration construct can also be combined with other operators
- allowed in declarations. For instance,
- @example
- int **(*foo)();
- @end example
- @noindent
- declares @code{foo} as a pointer to a function that returns
- type @code{int **}, and
- @example
- int **(*foo[30])();
- @end example
- @noindent
- declares @code{foo} as an array of 30 pointers to functions that
- return type @code{int **}.
- @example
- int **(**foo)();
- @end example
- @noindent
- declares @code{foo} as a pointer to a pointer to a function that
- returns type @code{int **}.
- @node Assigning Function Pointers
- @subsection Assigning Function Pointers
- @cindex assigning function pointers
- @cindex function pointers, assigning
- Assuming we have declared the variable @code{binary_op} as in the
- previous section, giving it a value requires a suitable function to
- use. So let's define a function suitable for the variable to point
- to. Here's one:
- @example
- double
- double_add (double a, double b)
- @{
- return a+b;
- @}
- @end example
- Now we can give it a value:
- @example
- binary_op = double_add;
- @end example
- The target type of the function pointer must be upward compatible with
- the type of the function (@pxref{Compatible Types}).
- There is no need for @samp{&} in front of @code{double_add}.
- Using a function name such as @code{double_add} as an expression
- automatically converts it to the function's address, with the
- appropriate function pointer type. However, it is ok to use
- @samp{&} if you feel that is clearer:
- @example
- binary_op = &double_add;
- @end example
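- The choice of function can be made at run time. The sketch below
- assumes a second function, @code{double_multiply}, and a selector
- function, @code{choose_op}; both names are hypothetical. Note that
- @code{choose_op} is itself a function that returns a function pointer.

```c
double
double_add (double a, double b)
{
  return a + b;
}

/* Hypothetical second function with the same type.  */
double
double_multiply (double a, double b)
{
  return a * b;
}

/* Return a pointer to the function selected by SYMBOL,
   or a null pointer if SYMBOL is not recognized.  */
double (*choose_op (char symbol)) (double, double)
{
  if (symbol == '+')
    return double_add;         /* The name converts to its address.  */
  if (symbol == '*')
    return &double_multiply;   /* An explicit & means the same thing.  */
  return 0;
}
```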
- @node Calling Function Pointers
- @subsection Calling Function Pointers
- @cindex calling function pointers
- @cindex function pointers, calling
- To call the function specified by a function pointer, just write the
- function pointer value in a function call. For instance, here's a
- call to the function @code{binary_op} points to:
- @example
- binary_op (x, 5)
- @end example
- Since the data type of @code{binary_op} explicitly specifies type
- @code{double} for the arguments, the call converts @code{x} and 5 to
- @code{double}.
- The call conceptually dereferences the pointer @code{binary_op} to
- ``get'' the function it points to, and calls that function. If you
- wish, you can explicitly represent the dereference by writing the
- @code{*} operator:
- @example
- (*binary_op) (x, 5)
- @end example
- The @samp{*} reminds people reading the code that @code{binary_op} is
- a function pointer rather than the name of a specific function.
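- The two call syntaxes can be compared side by side when the function
- pointer arrives as a parameter. This is a sketch only; the helper
- names @code{call_plain} and @code{call_starred} are hypothetical.

```c
double
double_add (double a, double b)
{
  return a + b;
}

/* Call through OP without writing the dereference.  X and 5 are
   converted to double, as the pointer's type specifies.  */
double
call_plain (double (*op) (double, double), int x)
{
  return op (x, 5);
}

/* The same call with the dereference written explicitly.  */
double
call_starred (double (*op) (double, double), int x)
{
  return (*op) (x, 5);
}
```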
- @node The main Function
- @section The @code{main} Function
- @cindex @code{main} function
- @findex main
- Every complete executable program requires at least one function,
- called @code{main}, which is where execution begins. You do not have
- to explicitly declare @code{main}, though GNU C permits you to do so.
- Conventionally, @code{main} should be defined to follow one of these
- calling conventions:
- @example
- int main (void) @{@r{@dots{}}@}
- int main (int argc, char *argv[]) @{@r{@dots{}}@}
- int main (int argc, char *argv[], char *envp[]) @{@r{@dots{}}@}
- @end example
- @noindent
- Using @code{void} as the parameter list means that @code{main} does
- not use the arguments. You can write @code{char **argv} instead of
- @code{char *argv[]}, and likewise for @code{envp}, as the two
- constructs are equivalent.
- @ignore @c Not so at present
- Defining @code{main} in any other way generates a warning. Your
- program will still compile, but you may get unexpected results when
- executing it.
- @end ignore
- You can call @code{main} from C code, as you can call any other
- function, though that is an unusual thing to do. When you do that,
- you must write the call to pass arguments that match the parameters in
- the definition of @code{main}.
- The @code{main} function is not actually the first code that runs when
- a program starts. In fact, the first code that runs is system code
- from the file @file{crt0.o}. In Unix, this was hand-written assembler
- code, but in GNU we replaced it with C code. Its job is to find
- the arguments for @code{main} and call that.
- @menu
- * Values from main:: Returning values from the main function.
- * Command-line Parameters:: Accessing command-line parameters
- provided to the program.
- * Environment Variables:: Accessing system environment variables.
- @end menu
- @node Values from main
- @subsection Returning Values from @code{main}
- @cindex returning values from @code{main}
- @cindex success
- @cindex failure
- @cindex exit status
- When @code{main} returns, the process terminates. Whatever value
- @code{main} returns becomes the exit status which is reported to the
- parent process. While nominally the return value is of type
- @code{int}, in fact the exit status gets truncated to eight bits; if
- @code{main} returns the value 256, the exit status is 0.
- Normally, programs return only one of two values: 0 for success,
- and 1 for failure. For maximum portability, use the macro
- values @code{EXIT_SUCCESS} and @code{EXIT_FAILURE} defined in
- @file{stdlib.h}. Here's an example:
- @cindex @code{EXIT_FAILURE}
- @cindex @code{EXIT_SUCCESS}
- @example
- #include <stdlib.h> /* @r{Defines @code{EXIT_SUCCESS}} */
- /* @r{and @code{EXIT_FAILURE}.} */
- int
- main (void)
- @{
- @r{@dots{}}
- if (foo)
- return EXIT_SUCCESS;
- else
- return EXIT_FAILURE;
- @}
- @end example
- Some types of programs maintain special conventions for various return
- values; for example, comparison programs including @code{cmp} and
- @code{diff} return 1 to indicate a mismatch, and 2 to indicate that
- the comparison couldn't be performed.
- @node Command-line Parameters
- @subsection Accessing Command-line Parameters
- @cindex command-line parameters
- @cindex parameters, command-line
- If the program was invoked with any command-line arguments, it can
- access them through the arguments of @code{main}, @code{argc} and
- @code{argv}. (You can give these arguments any names, but the names
- @code{argc} and @code{argv} are customary.)
- The value of @code{argv} is an array containing all of the
- command-line arguments as strings, with the name of the command
- invoked as the first string. @code{argc} is an integer that says how
- many strings @code{argv} contains. Here is an example of accessing
- the command-line parameters, retrieving the program's name and
- checking for the standard @option{--version} and @option{--help} options:
- @example
- #include <string.h> /* @r{Declare @code{strcmp}.} */
- int
- main (int argc, char *argv[])
- @{
- char *program_name = argv[0];
- for (int i = 1; i < argc; i++)
- @{
- if (!strcmp (argv[i], "--version"))
- @{
- /* @r{Print version information and exit.} */
- @r{@dots{}}
- @}
- else if (!strcmp (argv[i], "--help"))
- @{
- /* @r{Print help information and exit.} */
- @r{@dots{}}
- @}
- @}
- @r{@dots{}}
- @}
- @end example
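- The option test can also be factored into a helper function, which
- keeps @code{main} short when there are many options. The following is
- a sketch under that assumption; the name @code{has_option} is
- hypothetical.

```c
#include <string.h>   /* Declares strcmp.  */

/* Hypothetical helper: return nonzero if OPTION appears among the
   command-line arguments, skipping argv[0] (the command name).  */
int
has_option (int argc, char *argv[], const char *option)
{
  for (int i = 1; i < argc; i++)
    if (!strcmp (argv[i], option))
      return 1;
  return 0;
}
```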
- @node Environment Variables
- @subsection Accessing Environment Variables
- @cindex environment variables
- You can optionally include a third parameter to @code{main}, another
- array of strings, to capture the environment variables available to
- the program. Unlike what happens with @code{argv}, there is no
- additional parameter for the count of environment variables; rather,
- the array of environment variables concludes with a null pointer.
- @example
- #include <stdio.h> /* @r{Declares @code{printf}.} */
- int
- main (int argc, char *argv[], char *envp[])
- @{
- /* @r{Print out all environment variables.} */
- int i = 0;
- while (envp[i])
- @{
- printf ("%s\n", envp[i]);
- i++;
- @}
- @}
- @end example
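- Since the array carries no count, code that needs the number of
- environment variables has to walk the array up to the null pointer.
- Here is a minimal sketch; the name @code{count_entries} is
- hypothetical.

```c
/* Hypothetical helper: count the strings in a null-terminated array
   of strings, such as envp.  */
int
count_entries (char *envp[])
{
  int count = 0;
  while (envp[count])
    count++;
  return count;
}
```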
- Another method of retrieving environment variables is to use the
- library function @code{getenv}, which is declared in @file{stdlib.h}.
- Using @code{getenv} does not require defining @code{main} to accept the
- @code{envp} pointer. For example, here is a program that fetches and prints
- the user's home directory (if defined):
- @example
- #include <stdlib.h> /* @r{Declares @code{getenv}.} */
- #include <stdio.h> /* @r{Declares @code{printf}.} */
- int
- main (void)
- @{
- char *home_directory = getenv ("HOME");
- if (home_directory)
- printf ("My home directory is: %s\n", home_directory);
- else
- printf ("My home directory is not defined!\n");
- @}
- @end example
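- Because @code{getenv} returns a null pointer for an undefined
- variable, it is often convenient to wrap it so that callers get a
- fallback value instead. The following sketch assumes a hypothetical
- helper named @code{env_or_default}.

```c
#include <stdlib.h>   /* Declares getenv.  */

/* Hypothetical helper: return the value of the environment variable
   NAME, or FALLBACK if that variable is not defined.  */
const char *
env_or_default (const char *name, const char *fallback)
{
  const char *value = getenv (name);
  return value ? value : fallback;
}
```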
- @node Advanced Definitions
- @section Advanced Function Features
- This section describes some advanced or obscure features for GNU C
- function definitions. If you are just learning C, you can skip the
- rest of this chapter.
- @menu
- * Variable-Length Array Parameters:: Functions that accept arrays
- of variable length.
- * Variable Number of Arguments:: Variadic functions.
- * Nested Functions:: Defining functions within functions.
- * Inline Function Definitions:: A function call optimization technique.
- @end menu
- @node Variable-Length Array Parameters
- @subsection Variable-Length Array Parameters
- @cindex variable-length array parameters
- @cindex array parameters, variable-length
- @cindex functions that accept variable-length arrays
- An array parameter can have variable length: simply declare the array
- type with a size that isn't constant. In a nested function, the
- length can refer to a variable defined in a containing scope. In any
- function, it can refer to a previous parameter, like this:
- @example
- struct entry
- tester (int len, char data[len][len])
- @{
- @r{@dots{}}
- @}
- @end example
- Alternatively, in function declarations (but not in function
- definitions), you can use @code{[*]} to denote that the array
- parameter is of a variable length, such that these two declarations
- mean the same thing:
- @example
- struct entry
- tester (int len, char data[len][len]);
- @end example
- @example
- struct entry
- tester (int len, char data[*][*]);
- @end example
- @noindent
- The two forms are equivalent in GNU C, but emphasizing that
- the array parameter is variable-length may be helpful to those
- studying the code.
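- Here is a complete small function of this kind, given as a sketch.
- The name @code{trace} and the choice of @code{int} elements are
- assumptions made for the illustration. The declaration uses
- @code{[*]}, while the definition names the length parameter.

```c
/* Declaration: [*] is allowed here, but not in the definition.  */
int trace (int len, int data[*][*]);

/* Hypothetical example: add up the main diagonal of a LEN-by-LEN
   matrix.  The dimensions of DATA come from the parameter LEN.  */
int
trace (int len, int data[len][len])
{
  int sum = 0;
  for (int i = 0; i < len; i++)
    sum += data[i][i];
  return sum;
}
```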
- You can also omit the length parameter, and instead use some other
- in-scope variable for the length in the function definition:
- @example
- struct entry
- tester (char data[*][*]);
- @r{@dots{}}
- int dataLength = 20;
- @r{@dots{}}
- struct entry
- tester (char data[dataLength][dataLength])
- @{
- @r{@dots{}}
- @}
- @end example
- @c ??? check text above
- @cindex parameter forward declaration
- In GNU C, to pass the array first and the length afterward, you can
- use a @dfn{parameter forward declaration}, like this:
- @example
- struct entry
- tester (int len; char data[len][len], int len)
- @{
- @r{@dots{}}
- @}
- @end example
- The @samp{int len} before the semicolon is the parameter forward
- declaration; it serves the purpose of making the name @code{len} known
- when the declaration of @code{data} is parsed.
- You can write any number of such parameter forward declarations in the
- parameter list. They can be separated by commas or semicolons, but
- the last one must end with a semicolon, which is followed by the
- ``real'' parameter declarations. Each forward declaration must match
- a subsequent ``real'' declaration in parameter name and data type.
- Standard C does not support parameter forward declarations.
- @node Variable Number of Arguments
- @subsection Variable-Length Parameter Lists
- @cindex variable-length parameter lists
- @cindex parameter lists, variable length
- @cindex function parameter lists, variable length
- @cindex variadic function
- A function that takes a variable number of arguments is called a
- @dfn{variadic function}. In C, a variadic function must specify at
- least one fixed argument with an explicitly declared data type.
- Additional arguments can follow, and can vary in both quantity and
- data type.
- In the function header, declare the fixed parameters in the normal
- way, then write a comma and an ellipsis: @samp{, ...}. Here is an
- example of a variadic function header:
- @example
- int add_multiple_values (int number, ...)
- @end example
- @cindex @code{va_list}
- @cindex @code{va_start}
- @cindex @code{va_end}
- The function body can refer to fixed arguments by their parameter
- names, but the additional arguments have no names. Accessing them in
- the function body uses certain standard macros. They are defined in
- the library header file @file{stdarg.h}, so the code must
- @code{#include} that file.
- In the body, write
- @example
- va_list ap;
- va_start (ap, @var{last_fixed_parameter});
- @end example
- @noindent
- This declares the variable @code{ap} (you can use any name for it)
- and then sets it up to point before the first additional argument.
- Then, to fetch the next consecutive additional argument, write this:
- @example
- va_arg (ap, @var{type})
- @end example
- After fetching all the additional arguments (or as many as need to be
- used), write this:
- @example
- va_end (ap);
- @end example
- Here's an example of a variadic function definition that adds any
- number of @code{int} arguments. The first (fixed) argument says how
- many more arguments follow.
- @example
- #include <stdarg.h> /* @r{Defines @code{va}@r{@dots{}} macros.} */
- @r{@dots{}}
- int
- add_multiple_values (int argcount, ...)
- @{
- int counter, total = 0;
- /* @r{Declare a variable of type @code{va_list}.} */
- va_list argptr;
- /* @r{Initialize that variable.} */
- va_start (argptr, argcount);
- for (counter = 0; counter < argcount; counter++)
- @{
- /* @r{Get the next additional argument.} */
- total += va_arg (argptr, int);
- @}
- /* @r{End use of the @code{argptr} variable.} */
- va_end (argptr);
- return total;
- @}
- @end example
- With GNU C, @code{va_end} is superfluous, but some other compilers
- might make @code{va_start} allocate memory so that calling
- @code{va_end} is necessary to avoid a memory leak. Before doing
- @code{va_start} again with the same variable, do @code{va_end}
- first.
- @cindex @code{va_copy}
- Because of this possible memory allocation, it is risky (in principle)
- to copy one @code{va_list} variable to another with assignment.
- Instead, use @code{va_copy}, which copies the substance but allocates
- separate memory in the variable you copy to. The call looks like
- @code{va_copy (@var{to}, @var{from})}, where both @var{to} and
- @var{from} should be variables of type @code{va_list}. In principle,
- do @code{va_end} on each of these variables before its scope ends.
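- One use for @code{va_copy} is to traverse the additional arguments
- twice. The sketch below makes the copy before fetching anything, so
- the second pass starts from the first additional argument. The
- function name @code{count_above_mean} is hypothetical.

```c
#include <stdarg.h>   /* Defines va_list, va_start, va_copy, etc.  */

/* Hypothetical example: return how many of the COUNT int arguments
   are greater than their mean.  */
int
count_above_mean (int count, ...)
{
  va_list first_pass, second_pass;
  double total = 0;
  int above = 0;

  if (count <= 0)
    return 0;

  va_start (first_pass, count);
  /* Save a second traversal position before fetching any argument.  */
  va_copy (second_pass, first_pass);

  /* First pass: compute the total.  */
  for (int i = 0; i < count; i++)
    total += va_arg (first_pass, int);
  va_end (first_pass);

  /* Second pass: compare each argument with the mean.  */
  double mean = total / count;
  for (int i = 0; i < count; i++)
    if (va_arg (second_pass, int) > mean)
      above++;
  va_end (second_pass);

  return above;
}
```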
- Since the additional arguments' types are not specified in the
- function's definition, the default argument promotions
- (@pxref{Argument Promotions}) apply to them in function calls. The
- function definition must take account of this; thus, if an argument
- was passed as @code{short}, the function should get it as @code{int}.
- If an argument was passed as @code{float}, the function should get it
- as @code{double}.
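- For instance, a variadic function meant to be called with
- @code{float} arguments has to fetch each one as @code{double}. This
- is a minimal sketch; the name @code{sum_reals} is hypothetical.

```c
#include <stdarg.h>   /* Defines va_list, va_start, va_arg, va_end.  */

/* Hypothetical example: add COUNT floating-point arguments.  */
double
sum_reals (int count, ...)
{
  va_list ap;
  double total = 0;

  va_start (ap, count);
  for (int i = 0; i < count; i++)
    /* Fetch as double, not float: the caller's float arguments
       were promoted to double when the call was made.  */
    total += va_arg (ap, double);
  va_end (ap);

  return total;
}
```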
- C has no mechanism to tell the variadic function how many arguments
- were passed to it, so its calling convention must give it a way to
- determine this. That's why @code{add_multiple_values} takes a fixed
- argument that says how many more arguments follow. Thus, you can
- call the function like this:
- @example
- sum = add_multiple_values (3, 12, 34, 190);
- /* @r{Value is 12+34+190.} */
- @end example
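- A count is not the only possible convention. Another common one is
- to end the argument list with a sentinel value, such as a null
- pointer. The sketch below assumes that convention; the name
- @code{total_length} is hypothetical, and every caller must remember
- to pass the terminating @code{(char *) NULL}.

```c
#include <stdarg.h>   /* Defines va_list, va_start, va_arg, va_end.  */
#include <stddef.h>   /* Defines size_t and NULL.  */
#include <string.h>   /* Declares strlen.  */

/* Hypothetical example: add up the lengths of the string arguments.
   The list of strings must end with a null pointer of type char *.  */
size_t
total_length (const char *first, ...)
{
  va_list ap;
  size_t total = 0;

  va_start (ap, first);
  for (const char *s = first; s != NULL; s = va_arg (ap, char *))
    total += strlen (s);
  va_end (ap);

  return total;
}
```

- A call looks like @code{total_length ("ab", "cde", (char *) NULL)}.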
- In GNU C, there is no actual need to use the @code{va_end} macro.
- In fact, it does nothing. It's used for compatibility with other
- compilers, when that matters.
- It is a mistake to access variables declared as @code{va_list} except
- in the specific ways described here. Just what that type consists of
- is an implementation detail, which could vary from one platform to
- another.
- @node Nested Functions
- @subsection Nested Functions
- @cindex nested functions
- @cindex functions, nested
- @cindex downward funargs
- @cindex thunks
- A @dfn{nested function} is a function defined inside another function.
- The nested function's name is local to the block where it is defined.
- For example, here we define a nested function named @code{square}, and
- call it twice:
- @example
- @group
- foo (double a, double b)
- @{
- double square (double z) @{ return z * z; @}
- return square (a) + square (b);
- @}
- @end group
- @end example
- The nested function can access all the variables of the containing
- function that are visible at the point of its definition. This is
- called @dfn{lexical scoping}. For example, here we show a nested
- function that uses an inherited variable named @code{offset}:
- @example
- @group
- bar (int *array, int offset, int size)
- @{
- int access (int *array, int index)
- @{ return array[index + offset]; @}
- int i;
- @r{@dots{}}
- for (i = 0; i < size; i++)
- @r{@dots{}} access (array, i) @r{@dots{}}
- @}
- @end group
- @end example
- Nested function definitions can appear wherever automatic variable
- declarations are allowed; that is, in any block, interspersed with the
- other declarations and statements in the block.
- The nested function's name is visible only within the parent block;
- the name's scope starts from its definition and continues to the end
- of the containing block. If the nested function's name
- is the same as the parent function's name, there will be
- no way to refer to the parent function inside the scope of the
- name of the nested function.
- Using @code{extern} or @code{static} on a nested function definition
- is an error.
- It is possible to call the nested function from outside the scope of its
- name by storing its address or passing the address to another function.
- You can do this safely, but you must be careful:
- @example
- @group
- hack (int *array, int size, int addition)
- @{
- void store (int index, int value)
- @{ array[index] = value + addition; @}
- intermediate (store, size);
- @}
- @end group
- @end example
- Here, the function @code{intermediate} receives the address of
- @code{store} as an argument. If @code{intermediate} calls @code{store},
- the arguments given to @code{store} are used to store into @code{array}.
- @code{store} also accesses @code{hack}'s local variable @code{addition}.
- It is safe for @code{intermediate} to call @code{store} because
- @code{hack}'s stack frame, with its arguments and local variables,
- continues to exist during the call to @code{intermediate}.
- Calling the nested function through its address after the containing
- function has exited is asking for trouble. If it is called after a
- containing scope level has exited, and if it refers to some of the
- variables that are no longer in scope, it will refer to memory
- containing junk or other data. It's not wise to take the risk.
- The GNU C Compiler implements taking the address of a nested function
- using a technique called @dfn{trampolines}. This technique was
- described in @cite{Lexical Closures for C@t{++}} (Thomas M. Breuel,
- USENIX C@t{++} Conference Proceedings, October 17--21, 1988).
- A nested function can jump to a label inherited from a containing
- function, provided the label was explicitly declared in the containing
- function (@pxref{Local Labels}). Such a jump returns instantly to the
- containing function, exiting the nested function that did the
- @code{goto} and any intermediate function invocations as well. Here
- is an example:
- @example
- @group
- bar (int *array, int offset, int size)
- @{
- /* @r{Explicitly declare the label @code{failure}.} */
- __label__ failure;
- int access (int *array, int index)
- @{
- if (index > size)
- /* @r{Exit this function,}
- @r{and return to @code{bar}.} */
- goto failure;
- return array[index + offset];
- @}
- @end group
- @group
- int i;
- @r{@dots{}}
- for (i = 0; i < size; i++)
- @r{@dots{}} access (array, i) @r{@dots{}}
- @r{@dots{}}
- return 0;
- /* @r{Control comes here from @code{access}
- if it does the @code{goto}.} */
- failure:
- return -1;
- @}
- @end group
- @end example
- To declare the nested function before its definition, use
- @code{auto} (which is otherwise meaningless for function declarations;
- @pxref{auto and register}). For example,
- @example
- bar (int *array, int offset, int size)
- @{
- auto int access (int *, int);
- @r{@dots{}}
- @r{@dots{}} access (array, i) @r{@dots{}}
- @r{@dots{}}
- int access (int *array, int index)
- @{
- @r{@dots{}}
- @}
- @r{@dots{}}
- @}
- @end example
- @node Inline Function Definitions
- @subsection Inline Function Definitions
- @cindex inline function definitions
- @cindex function definitions, inline
- @findex inline
- To declare a function inline, use the @code{inline} keyword in its
- definition. Here's a pair of simple inline functions; each takes a
- pointer to a @code{struct list} and returns one of its fields.
- @example
- struct list
- @{
- struct list *first, *second;
- @};
- inline struct list *
- list_first (struct list *p)
- @{
- return p->first;
- @}
- inline struct list *
- list_second (struct list *p)
- @{
- return p->second;
- @}
- @end example
- Optimized compilation can substitute the inline function's body for
- any call to it. This is called @emph{inlining} the function. It
- makes the code that contains the call run faster, significantly so if
- the inline function is small.
- Here's a function that uses @code{list_second}:
- @example
- int
- pairlist_length (struct list *l)
- @{
- int length = 0;
- while (l)
- @{
- length++;
- l = list_second (l);
- @}
- return length;
- @}
- @end example
- Substituting the code of @code{list_second} into the definition of
- @code{pairlist_length} results in this code, in effect:
- @example
- int
- pairlist_length (struct list *l)
- @{
- int length = 0;
- while (l)
- @{
- length++;
- l = l->second;
- @}
- return length;
- @}
- @end example
- Since the definition of @code{list_second} does not say @code{extern}
- or @code{static}, that definition is used only for inlining. It
- doesn't generate code that can be called at run time. If not all the
- calls to the function are inlined, there must be a definition of the
- same function name in another module for them to call.
- @cindex inline functions, omission of
- @c @opindex fkeep-inline-functions
- Adding @code{static} to an inline function definition means the
- function definition is limited to this compilation module. Also, it
- generates run-time code if necessary for the sake of any calls that
- were not inlined. If all calls are inlined then the function
- definition does not generate run-time code, but you can force
- generation of run-time code with the option
- @option{-fkeep-inline-functions}.
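- Here is a minimal sketch of a @code{static inline} function and a
- caller in the same compilation module. The names @code{square} and
- @code{sum_of_squares} are hypothetical. Whether the calls are
- actually inlined depends on the optimization level, but the program
- is correct either way.

```c
/* Hypothetical helper, private to this compilation module.  */
static inline int
square (int x)
{
  return x * x;
}

/* Each call to square is a candidate for inline substitution.  If a
   call is not inlined, the compiler generates run-time code for
   square in this module.  */
int
sum_of_squares (int a, int b)
{
  return square (a) + square (b);
}
```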
- @cindex extern inline function
- Specifying @code{extern} along with @code{inline} means the function is
- external and generates run-time code to be called from other
- separately compiled modules, as well as inlined. You can define the
- function as @code{inline} without @code{extern} in other modules so as
- to inline calls to the same function in those modules.
- Why are some calls not inlined? First of all, inlining is an
- optimization, so non-optimized compilation does not inline.
- Some calls cannot be inlined for technical reasons. Also, certain
- usages in a function definition can make it unsuitable for inline
- substitution. Among these usages are: variadic functions, use of
- @code{alloca}, use of computed goto (@pxref{Labels as Values}), and
- use of nonlocal goto. The option @option{-Winline} requests a warning
- when a function marked @code{inline} is unsuitable to be inlined. The
- warning explains what obstacle makes it unsuitable.
- Just because a call @emph{can} be inlined does not mean it
- @emph{should} be inlined. The GNU C compiler weighs costs and
- benefits to decide whether inlining a particular call is advantageous.
- You can force inlining of all calls to a given function that can be
- inlined, even in a non-optimized compilation, by specifying the
- @samp{always_inline} attribute for the function, like this:
- @example
- /* @r{Prototype.} */
- inline void foo (const char) __attribute__((always_inline));
- @end example
- @noindent
- This is a GNU C extension. @xref{Attributes}.
- A function call may be inlined even if not declared @code{inline} in
- special cases where the compiler can determine this is correct and
- desirable. For instance, when a static function is called only once,
- it will very likely be inlined. With @option{-flto}, link-time
- optimization, any function might be inlined. To absolutely prevent
- inlining of a specific function, specify
- @code{__attribute__((__noinline__))} in the function's definition.
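- The two attributes can be seen together in the following sketch.
- The function names @code{twice}, @code{add_one} and
- @code{twice_plus_one} are hypothetical; the attributes are GNU C
- extensions.

```c
/* Prototype requesting that calls be inlined whenever possible.  */
static inline int twice (int) __attribute__ ((always_inline));

static inline int
twice (int x)
{
  return 2 * x;
}

/* The noinline attribute asks the compiler never to inline calls
   to this function.  */
__attribute__ ((__noinline__)) int
add_one (int x)
{
  return x + 1;
}

/* Illustrative caller that uses both functions.  */
int
twice_plus_one (int x)
{
  return add_one (twice (x));
}
```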
- @node Obsolete Definitions
- @section Obsolete Function Features
- These features of function definitions are still used in old
- programs, but you shouldn't write code this way today.
- If you are just learning C, you can skip this section.
- @menu
- * Old GNU Inlining:: An older inlining technique.
- * Old-Style Function Definitions:: Original K&R style functions.
- @end menu
- @node Old GNU Inlining
- @subsection Older GNU C Inlining
- The GNU C spec for inline functions, before GCC version 5, defined
- @code{extern inline} on a function definition to mean to inline calls
- to it but @emph{not} generate code for the function that could be
- called at run time. By contrast, @code{inline} without @code{extern}
- meant to generate run-time code for the function. In effect, ISO
- incompatibly flipped the meanings of these two cases. We changed GCC
- in version 5 to adopt the ISO specification.
- Many programs still use these cases with the previous GNU C meanings.
- You can specify use of those meanings with the option
- @option{-fgnu89-inline}. You can also specify this for a single
- function with @code{__attribute__ ((gnu_inline))}. Here's an example:
- @example
- inline __attribute__ ((gnu_inline))
- void
- inc (int *a)
- @{
- (*a)++;
- @}
- @end example
- @node Old-Style Function Definitions
- @subsection Old-Style Function Definitions
- @cindex old-style function definitions
- @cindex function definitions, old-style
- @cindex K&R-style function definitions
- The syntax of C traditionally allows omitting the data type in a
- function declaration if it specifies a storage class or a qualifier.
- Then the type defaults to @code{int}. For example:
- @example
- static foo (double x);
- @end example
- @noindent
- defaults the return type to @code{int}. This is bad practice; if you
- see it, fix it.
- An @dfn{old-style} (or ``K&R'') function definition is the way
- function definitions were written in the 1980s. It looks like this:
- @example
- @var{rettype}
- @var{function} (@var{parmnames})
- @var{parm_declarations}
- @{
- @var{body}
- @}
- @end example
- In @var{parmnames}, only the parameter names are listed, separated by
- commas. Then @var{parm_declarations} declares their data types; these
- declarations look just like variable declarations. If a parameter is
- listed in @var{parmnames} but has no declaration, it is implicitly
- declared @code{int}.
- There is no reason to write a definition this way nowadays, but such
- definitions can still be seen in older GNU programs.
- An old-style variadic function definition looks like this:
- @example
- #include <varargs.h>
- int
- add_multiple_values (va_alist)
- va_dcl
- @{
- int argcount;
- int counter, total = 0;
- /* @r{Declare a variable of type @code{va_list}.} */
- va_list argptr;
- /* @r{Initialize that variable.} */
- va_start (argptr);
- /* @r{Get the first argument (fixed).} */
- argcount = va_arg (argptr, int);
- for (counter = 0; counter < argcount; counter++)
- @{
- /* @r{Get the next additional argument.} */
- total += va_arg (argptr, int);
- @}
- /* @r{End use of the @code{argptr} variable.} */
- va_end (argptr);
- return total;
- @}
- @end example
- Note that the old-style variadic function definition has no fixed
- parameter variables; all arguments must be obtained with
- @code{va_arg}.
- @node Compatible Types
- @chapter Compatible Types
- @cindex compatible types
- @cindex types, compatible
- Declaring a function or variable twice is valid in C only if the two
- declarations specify @dfn{compatible} types. In addition, some
- operations on pointers require operands to have compatible target
- types.
- In C, two different primitive types are never compatible. Likewise for
- the defined types @code{struct}, @code{union} and @code{enum}: two
- separately defined types are incompatible unless they are defined
- exactly the same way.
- However, there are a few cases where different types can be
- compatible:
- @itemize @bullet
- @item
- Every enumeration type is compatible with some integer type. In GNU
- C, the choice of integer type depends on the largest enumeration
- value.
- @c ??? Which one, in GCC?
- @c ??? ... it varies, depending on the enum values. Testing on
- @c ??? fencepost, it appears to use a 4-byte signed integer first,
- @c ??? then moves on to an 8-byte signed integer. These details
- @c ??? might be platform-dependent, as the C standard says that even
- @c ??? char could be used as an enum type, but it's at least true
- @c ??? that GCC chooses a type that is at least large enough to
- @c ??? hold the largest enum value.
- @item
- Array types are compatible if the element types are compatible
- and the sizes (when specified) match.
- @item
- Pointer types are compatible if the pointer target types are
- compatible.
- @item
- Function types that specify argument types are compatible if the
- return types are compatible and the argument types are compatible,
- argument by argument. In addition, they must all agree in whether
- they use @code{...} to allow additional arguments.
- @item
- Function types that don't specify argument types are compatible if the
- return types are.
- @item
- Function types that specify the argument types are compatible with
- function types that omit them, if the return types are compatible and
- the specified argument types are unaltered by the argument promotions
- (@pxref{Argument Promotions}).
- @end itemize
- In order for types to be compatible, they must agree in their type
- qualifiers. Thus, @code{const int} and @code{int} are incompatible.
- It follows that @code{const int *} and @code{int *} are incompatible
- too (they are pointers to types that are not compatible).
- If two types are compatible ignoring the qualifiers, we call them
- @dfn{nearly compatible}. (If they are array types, we ignore
- qualifiers on the element types.@footnote{This is a GNU C extension.})
- Comparison of pointers is valid if the pointers' target types are
- nearly compatible. Likewise, the two branches of a conditional
- expression may be pointers to nearly compatible target types.
- If two types are compatible ignoring the qualifiers, and the first
- type has all the qualifiers of the second type, we say the first is
- @dfn{upward compatible} with the second. Assignment of pointers
- requires the assigned pointer's target type to be upward compatible
- with the right operand (the new value)'s target type.
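- As a small illustration of upward compatibility, the following sketch
- assigns an @code{int *} to a @code{const int *}. The function name
- @code{read_through_const} is hypothetical.

```c
/* Return the value of a local int, read through a
   const-qualified pointer.  */
int
read_through_const (void)
{
  int value = 7;
  int *plain = &value;

  /* Valid: the left-hand target type, const int, has every
     qualifier of the right-hand target type, int.  */
  const int *readonly = plain;

  /* The reverse assignment, plain = readonly, would discard
     const, so the compiler diagnoses it.  */
  return *readonly;
}
```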
- @node Type Conversions
- @chapter Type Conversions
- @cindex type conversions
- @cindex conversions, type
- C converts between data types automatically when that seems clearly
- necessary. In addition, you can convert explicitly with a @dfn{cast}.
- @menu
- * Explicit Type Conversion:: Casting a value from one type to another.
- * Assignment Type Conversions:: Automatic conversion by assignment operation.
- * Argument Promotions:: Automatic conversion of function parameters.
- * Operand Promotions:: Automatic conversion of arithmetic operands.
- * Common Type:: When operand types differ, which one is used?
- @end menu
- @node Explicit Type Conversion
- @section Explicit Type Conversion
- @cindex cast
- @cindex explicit type conversion
- You can do explicit conversions using the unary @dfn{cast} operator,
- which is written as a type designator (@pxref{Type Designators}) in
- parentheses. For example, @code{(int)} is the operator to cast to
- type @code{int}. Here's an example of using it:
- @example
- @{
- double d = 5.5;
- printf ("Floating point value: %f\n", d);
- printf ("Truncated to integer: %d\n", (int) d);
- @}
- @end example
- Using @code{(int) d} passes an @code{int} value as argument to
- @code{printf}, so you can print it with @samp{%d}. Using just
- @code{d} without the cast would pass the value as @code{double}.
- That won't work at all with @samp{%d}; the results would be gibberish.
- To divide one integer by another without rounding,
- cast either of the integers to @code{double} first:
- @example
- (double) @var{dividend} / @var{divisor}
- @var{dividend} / (double) @var{divisor}
- @end example
- It is enough to cast one of them, because that forces the common type
- to @code{double} so the other will be converted automatically.
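- The difference can be seen by comparing the two kinds of division side
- by side. This is a minimal sketch; the function names are ours.

```c
/* Integer division discards the fractional part.  */
int
integer_quotient (int dividend, int divisor)
{
  return dividend / divisor;
}

/* Casting one operand makes the common type double, so the
   other operand is converted automatically.  */
double
exact_quotient (int dividend, int divisor)
{
  return (double) dividend / divisor;
}
```

- Here @code{integer_quotient (7, 2)} yields 3, while
- @code{exact_quotient (7, 2)} yields 3.5.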
- The valid cast conversions are:
- @itemize @bullet
- @item
- One numerical type to another.
- @item
- One pointer type to another.
- (Converting between pointers that point to functions
- and pointers that point to data is not standard C.)
- @item
- A pointer type to an integer type.
- @item
- An integer type to a pointer type.
- @item
- To a union type, from the type of any alternative in the union
- (@pxref{Unions}). (This is a GNU extension.)
- @item
- Anything, to @code{void}.
- @end itemize
- @node Assignment Type Conversions
- @section Assignment Type Conversions
- @cindex assignment type conversions
- Certain type conversions occur automatically in assignments
- and certain other contexts. These are the conversions
- assignments can do:
- @itemize @bullet
- @item
- Converting any numeric type to any other numeric type.
- @item
- Converting @code{void *} to any other pointer type
- (except pointer-to-function types).
- @item
- Converting any other pointer type to @code{void *}
- (except pointer-to-function types).
- @item
- Converting 0 (a null pointer constant) to any pointer type.
- @item
- Converting any pointer type to @code{bool}. (The result is
- 1 if the pointer is not null.)
- @item
- Converting between pointer types when the left-hand target type is
- upward compatible with the right-hand target type. @xref{Compatible
- Types}.
- @end itemize
- These type conversions occur automatically in certain contexts,
- which are:
- @itemize @bullet
- @item
- An assignment converts the type of the right-hand expression
- to the type wanted by the left-hand expression. For example,
- @example
- double i;
- i = 5;
- @end example
- @noindent
- converts 5 to @code{double}.
- @item
- A function call, when the function specifies the type for that
- argument, converts the argument value to that type. For example,
- @example
- void foo (double);
- foo (5);
- @end example
- @noindent
- converts 5 to @code{double}.
- @item
- A @code{return} statement converts the specified value to the type
- that the function is declared to return. For example,
- @example
- double
- foo ()
- @{
- return 5;
- @}
- @end example
- @noindent
- also converts 5 to @code{double}.
- @end itemize
- In all three contexts, if the conversion is impossible, that
- constitutes an error.
- @node Argument Promotions
- @section Argument Promotions
- @cindex argument promotions
- @cindex promotion of arguments
- When a function's definition or declaration does not specify the type
- of an argument, that argument is passed without conversion in whatever
- type it has, with these exceptions:
- @itemize @bullet
- @item
- Some narrow numeric values are @dfn{promoted} to a wider type. If the
- expression is a narrow integer, such as @code{char} or @code{short},
- the call converts it automatically to @code{int} (@pxref{Integer
- Types}).@footnote{On an embedded controller where @code{char}
- or @code{short} is the same width as @code{int}, @code{unsigned char}
- or @code{unsigned short} promotes to @code{unsigned int}, but that
- never occurs in GNU C on real computers.}
- In this example, the expression @code{c} is passed as an @code{int}:
- @example
- char c = '$';
- printf ("Character c is '%c'\n", c);
- @end example
- @item
- If the expression
- has type @code{float}, the call converts it automatically to
- @code{double}.
- @item
- An array as argument is converted to a pointer to its zeroth element.
- @item
- A function name as argument is converted to a pointer to that function.
- @end itemize
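- The promotions are easiest to observe in a variadic function, since the
- arguments after the fixed ones have no declared type. In this sketch
- the helper @code{sum_char_and_float} is hypothetical; it must fetch
- its arguments as @code{int} and @code{double}, because that is how a
- @code{char} and a @code{float} arrive.

```c
#include <stdarg.h>

/* Hypothetical helper: expects one char and one float after
   the fixed parameter, and returns their sum.  */
double
sum_char_and_float (int unused, ...)
{
  va_list ap;
  int c;
  double f;

  va_start (ap, unused);
  c = va_arg (ap, int);      /* The char was promoted to int.  */
  f = va_arg (ap, double);   /* The float was promoted to double.  */
  va_end (ap);

  return c + f;
}
```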
- @node Operand Promotions
- @section Operand Promotions
- @cindex operand promotions
- The operands in arithmetic operations undergo type conversion automatically.
- These @dfn{operand promotions} are the same as the argument promotions
- except without converting @code{float} to @code{double}. In other words,
- the operand promotions convert
- @itemize @bullet
- @item
- @code{char} or @code{short} (whether signed or not) to @code{int},
- @item
- an array to a pointer to its zeroth element, and
- @item
- a function name to a pointer to that function.
- @end itemize
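- One visible consequence, assuming @code{int} is wider than
- @code{char} (as it is in GNU C), is that arithmetic on narrow values
- does not wrap around at the narrow type's limit. The function name in
- this sketch is ours.

```c
/* Both operands are promoted to int before the addition, so
   the sum does not wrap around at 256.  */
int
add_bytes (unsigned char a, unsigned char b)
{
  return a + b;
}
```

- Thus @code{add_bytes (200, 100)} yields 300, even though neither 300
- nor any value above 255 fits in an 8-bit @code{unsigned char}.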
- @node Common Type
- @section Common Type
- @cindex common type
- Arithmetic binary operators (except the shift operators) convert their
- operands to the @dfn{common type} before operating on them.
- Conditional expressions also convert the two possible results to their
- common type. Here are the rules for determining the common type.
- If one of the numbers has a floating-point type and the other is an
- integer, the common type is that floating-point type. For instance,
- @example
- 5.6 * 2 @result{} 11.2 /* @r{a @code{double} value} */
- @end example
- If both are floating point, the type with the larger range is the
- common type.
- If both are integers but of different widths, the common type
- is the wider of the two.
- If they are integer types of the same width, the common type is
- unsigned if either operand is unsigned, and it's @code{long} if either
- operand is @code{long}. It's @code{long long} if either operand is
- @code{long long}.
- These rules apply to addition, subtraction, multiplication, division,
- remainder, comparisons, and bitwise operations. They also apply to
- the two branches of a conditional expression, and to the arithmetic
- done in a modifying assignment operation.
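- The same-width rule can be surprising in comparisons. The following
- sketch, with function names of our own, shows one integer case and
- one floating-point case.

```c
/* int and unsigned int have the same width, so the common
   type is unsigned int; -1 converts to a very large value.  */
int
minus_one_is_less_than (unsigned int u)
{
  int i = -1;
  return i < u;
}

/* int and double: the common type is double.  */
double
half_of (int n)
{
  return n / 2.0;
}
```

- Here @code{minus_one_is_less_than (1u)} yields 0, because the
- comparison is done in @code{unsigned int}, and @code{half_of (5)}
- yields 2.5.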
- @node Scope
- @chapter Scope
- @cindex scope
- @cindex block scope
- @cindex function scope
- @cindex function prototype scope
- Each definition or declaration of an identifier is visible
- in certain parts of the program, which is typically less than the whole
- of the program. The parts where it is visible are called its @dfn{scope}.
- Normally, declarations made at the top level in the source -- that is,
- not within any block or function definition -- are visible for the
- entire contents of the source file after that point. This is called
- @dfn{file scope} (@pxref{File-Scope Variables}).
- Declarations made within blocks of code, including within function
- definitions, are visible only within those blocks. This is called
- @dfn{block scope}. Here is an example:
- @example
- @group
- void
- foo (void)
- @{
- int x = 42;
- @}
- @end group
- @end example
- @noindent
- In this example, the variable @code{x} has block scope; it is visible
- only within the @code{foo} function definition block. Thus, other
- blocks could have their own variables, also named @code{x}, without
- any conflict between those variables.
- A variable declared inside a subblock has a scope limited to
- that subblock:
- @example
- @group
- void
- foo (void)
- @{
- @{
- int x = 42;
- @}
- // @r{@code{x} is out of scope here.}
- @}
- @end group
- @end example
- If a variable declared within a block has the same name as a variable
- declared outside of that block, the definition within the block
- takes precedence during its scope:
- @example
- @group
- int x = 42;
- void
- foo (void)
- @{
- int x = 17;
- printf ("%d\n", x);
- @}
- @end group
- @end example
- @noindent
- This prints 17, the value of the variable @code{x} declared in the
- function body block, rather than the value of the variable @code{x} at
- file scope. We say that the inner declaration of @code{x}
- @dfn{shadows} the outer declaration, for the extent of the inner
- declaration's scope.
- A declaration with block scope can be shadowed by another declaration
- with the same name in a subblock.
- @example
- @group
- void
- foo (void)
- @{
- char *x = "foo";
- @{
- int x = 42;
- @r{@dots{}}
- exit (x / 6);
- @}
- @}
- @end group
- @end example
- A function parameter's scope is the entire function body, but it can
- be shadowed. For example:
- @example
- @group
- int x = 42;
- void
- foo (int x)
- @{
- printf ("%d\n", x);
- @}
- @end group
- @end example
- @noindent
- This prints the value of @code{x} the function parameter, rather than
- the value of the file-scope variable @code{x}.
- Labels (@pxref{goto Statement}) have @dfn{function} scope: each label
- is visible for the whole of the containing function body, both before
- and after the label declaration:
- @example
- @group
- void
- foo (void)
- @{
- @r{@dots{}}
- goto bar;
- @r{@dots{}}
- @{ // @r{Subblock does not affect labels.}
- bar:
- @r{@dots{}}
- @}
- goto bar;
- @}
- @end group
- @end example
- Except for labels, a declared identifier is not
- visible to code before its declaration. For example:
- @example
- @group
- int x = 5;
- int y = x + 10;
- @end group
- @end example
- @noindent
- will work, but:
- @example
- @group
- int x = y + 10;
- int y = 5;
- @end group
- @end example
- @noindent
- cannot refer to the variable @code{y} before its declaration.
- @include cpp.texi
- @node Integers in Depth
- @chapter Integers in Depth
- This chapter explains the machine-level details of integer types: how
- they are represented as bits in memory, and the range of possible
- values for each integer type.
- @menu
- * Integer Representations:: How integer values appear in memory.
- * Maximum and Minimum Values:: Value ranges of integer types.
- @end menu
- @node Integer Representations
- @section Integer Representations
- @cindex integer representations
- @cindex representation of integers
- Modern computers store integer values as binary (base-2) numbers that
- occupy a single unit of storage, typically either as an 8-bit
- @code{char}, a 16-bit @code{short int}, a 32-bit @code{int}, or
- possibly, a 64-bit @code{long long int}. Whether a @code{long int} is
- a 32-bit or a 64-bit value is system dependent.@footnote{In theory,
- any of these types could have some other size, but it's not worth even
- a minute to cater to that possibility. It never happens on
- GNU/Linux.}
- @cindex @code{CHAR_BIT}
- The macro @code{CHAR_BIT}, defined in @file{limits.h}, gives the number
- of bits in type @code{char}. On any real operating system, the value
- is 8.
- The fixed sizes of numeric types necessarily limit their @dfn{range
- of values}, and the particular encoding of integers decides what that
- range is.
- @cindex two's-complement representation
- For unsigned integers, the entire space is used to represent a
- nonnegative value. Signed integers are stored using
- @dfn{two's-complement representation}: a signed integer with @var{n}
- bits has a range from @math{-2@sup{(@var{n} - 1)}} to @minus{}1 to 0
- to 1 to @math{+2@sup{(@var{n} - 1)} - 1}, inclusive. The leftmost, or
- high-order, bit is called the @dfn{sign bit}.
- @c ??? Needs correcting
- There is only one value that means zero, and the most negative number
- lacks a positive counterpart. As a result, negating that number
- causes overflow; in practice, its result is that number back again.
- For example, a two's-complement signed 8-bit integer can represent all
- integers from @minus{}128 to +127. We will revisit that
- peculiarity shortly.
- Decades ago, there were computers that didn't use two's-complement
- representation for integers (@pxref{Integers in Depth}), but they are
- long gone and not worth any effort to support.
- @c ??? Is this duplicate?
- When an arithmetic operation produces a value that is too big to
- represent, the operation is said to @dfn{overflow}. In C, integer
- overflow does not interrupt the control flow or signal an error.
- What it does depends on signedness.
- For unsigned arithmetic, the result of an operation that overflows is
- the @var{n} low-order bits of the correct value. If the correct value
- is representable in @var{n} bits, that is always the result;
- thus we often say that ``integer arithmetic is exact,'' omitting the
- crucial qualifying phrase ``as long as the exact result is
- representable.''
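- A minimal sketch of unsigned wraparound follows; the function names
- are ours.

```c
#include <limits.h>

/* Unsigned addition keeps the low-order bits of the exact sum.  */
unsigned int
wrapping_add (unsigned int a, unsigned int b)
{
  return a + b;
}

/* The exact sum UINT_MAX + 1 needs one more bit than
   unsigned int has; its low-order bits are all zero.  */
unsigned int
largest_plus_one (void)
{
  return wrapping_add (UINT_MAX, 1u);
}
```

- Here @code{largest_plus_one ()} yields 0, and
- @code{wrapping_add (UINT_MAX, 2u)} yields 1.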
- In principle, a C program should be written so that overflow never
- occurs for signed integers, but in GNU C you can specify various ways
- of handling such overflow (@pxref{Integer Overflow}).
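- One portable way to keep signed overflow from occurring is to test the
- operands before adding them. This is only a sketch of that idea, with
- a hypothetical function name; it is not one of the GNU C overflow
- options mentioned above.

```c
#include <limits.h>
#include <stdbool.h>

/* Return true if A + B cannot be represented in an int.
   The tests themselves never overflow.  */
bool
add_would_overflow (int a, int b)
{
  if (b > 0)
    return a > INT_MAX - b;
  return a < INT_MIN - b;
}
```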
- Integer representations are best understood by looking at a table for
- a tiny integer size; here are the possible values for an integer with
- three bits:
- @multitable @columnfractions .25 .25 .25 .25
- @headitem Unsigned @tab Signed @tab Bits @tab 2s Complement
- @item 0 @tab 0 @tab 000 @tab 000 (0)
- @item 1 @tab 1 @tab 001 @tab 111 (-1)
- @item 2 @tab 2 @tab 010 @tab 110 (-2)
- @item 3 @tab 3 @tab 011 @tab 101 (-3)
- @item 4 @tab -4 @tab 100 @tab 100 (-4)
- @item 5 @tab -3 @tab 101 @tab 011 (3)
- @item 6 @tab -2 @tab 110 @tab 010 (2)
- @item 7 @tab -1 @tab 111 @tab 001 (1)
- @end multitable
- The parenthesized decimal numbers in the last column represent the
- signed meanings of the two's-complement of the line's value. Recall
- that, in two's-complement encoding, the high-order bit is 0 when
- the number is nonnegative.
- We can now understand the peculiar behavior of negation of the
- most negative two's-complement integer: start with 0b100,
- invert the bits to get 0b011, and add 1: we get
- 0b100, the value we started with.
- We can also see overflow behavior in two's-complement:
- @example
- 3 + 1 = 0b011 + 0b001 = 0b100 = (-4)
- 3 + 2 = 0b011 + 0b010 = 0b101 = (-3)
- 3 + 3 = 0b011 + 0b011 = 0b110 = (-2)
- @end example
- @noindent
- A sum of two nonnegative signed values that overflows has a 1 in the
- sign bit, so the exact positive result is truncated to a negative
- value.
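- The three-bit table can be reproduced with a few lines of C that mask
- every result to three bits. This is an illustrative sketch; the
- function names are ours, and the patterns are held in ordinary
- @code{unsigned int} values.

```c
/* Interpret the low three bits of BITS as a two's-complement
   value in the range -4 to 3.  */
int
signed_value_3bit (unsigned int bits)
{
  bits &= 7u;
  return (bits & 4u) ? (int) bits - 8 : (int) bits;
}

/* Negate a three-bit pattern: invert the bits, add 1, and
   keep only three bits.  */
unsigned int
negate_3bit (unsigned int bits)
{
  return (~bits + 1u) & 7u;
}

/* Add two three-bit patterns, discarding any carry out.  */
unsigned int
add_3bit (unsigned int a, unsigned int b)
{
  return (a + b) & 7u;
}
```

- Here @code{negate_3bit (4)} yields 4 again, the pattern for
- @minus{}4, and @code{signed_value_3bit (add_3bit (3, 1))} yields
- @minus{}4, matching the first line of the overflow example.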
- @c =====================================================================
- @node Maximum and Minimum Values
- @section Maximum and Minimum Values
- @cindex maximum integer values
- @cindex minimum integer values
- @cindex integer ranges
- @cindex ranges of integer types
- @findex INT_MAX
- @findex UINT_MAX
- @findex SHRT_MAX
- @findex LONG_MAX
- @findex LLONG_MAX
- @findex USHRT_MAX
- @findex ULONG_MAX
- @findex ULLONG_MAX
- @findex CHAR_MAX
- @findex SCHAR_MAX
- @findex UCHAR_MAX
- For each primitive integer type, there is a standard macro defined in
- @file{limits.h} that gives the largest value that type can hold. For
- instance, for type @code{int}, the maximum value is @code{INT_MAX}.
- On a 32-bit computer, that is equal to 2,147,483,647. The
- maximum value for @code{unsigned int} is @code{UINT_MAX}, which on a
- 32-bit computer is equal to 4,294,967,295. Likewise, there are
- @code{SHRT_MAX}, @code{LONG_MAX}, and @code{LLONG_MAX}, and
- corresponding unsigned limits @code{USHRT_MAX}, @code{ULONG_MAX}, and
- @code{ULLONG_MAX}.
- Since there are three ways to specify a @code{char} type, there are
- also three limits: @code{CHAR_MAX}, @code{SCHAR_MAX}, and
- @code{UCHAR_MAX}.
- For each type that is or might be signed, there is another symbol that
- gives the minimum value it can hold. (Just replace @code{MAX} with
- @code{MIN} in the names listed above.) There is no minimum limit
- symbol for types specified with @code{unsigned} because the
- minimum for them is universally zero.
- @code{INT_MIN} is not the negative of @code{INT_MAX}. In
- two's-complement representation, the most negative number is 1 less
- than the negative of the most positive number. Thus, @code{INT_MIN}
- on a 32-bit computer has the value @minus{}2,147,483,648. You can't
- actually write the value that way in C, since it would overflow.
- That's a good reason to use @code{INT_MIN} to specify
- that value. Its definition is written to avoid overflow.
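- As a sketch of one overflow-free way to express that value, assuming
- two's-complement representation, negate @code{INT_MAX} and then
- subtract 1. The function name is ours.

```c
#include <limits.h>

/* Compute the most negative int without writing a constant
   that overflows: negate INT_MAX, then subtract 1.  */
int
most_negative_int (void)
{
  return -INT_MAX - 1;
}
```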
- @include fp.texi
- @node Compilation
- @chapter Compilation
- @cindex object file
- @cindex compilation module
- @cindex make rules
- Early in the manual we explained how to compile a simple C program
- that consists of a single source file (@pxref{Compile Example}).
- However, we handle only short programs that way. A typical C program
- consists of many source files, each of which is a separate
- @dfn{compilation module}---meaning that it has to be compiled
- separately.
- The full details of how to compile with GCC are documented in xxxx.
- @c ??? ref
- Here we give only a simple introduction.
- These are the commands to compile two compilation modules,
- @file{foo.c} and @file{bar.c}, with a command for each module:
- @example
- gcc -c -O -g foo.c
- gcc -c -O -g bar.c
- @end example
- @noindent
- In these commands, @option{-g} says to generate debugging information,
- @option{-O} says to do some optimization, and @option{-c} says to put
- the compiled code for that module into a corresponding @dfn{object
- file} and go no further. The object file for @file{foo.c} is called
- @file{foo.o}, and so on.
- If you wish, you can specify the additional options @option{-Wformat
- -Wparentheses -Wstrict-prototypes}, which request additional warnings.
- One reason to divide a large program into multiple compilation modules
- is to control how each module can access the internals of the others.
- When a module declares a function or variable @code{extern}, other
- modules can access it. The other functions and variables in
- a module can't be accessed from outside that module.
- The other reason for using multiple modules is so that changing
- one source file does not require recompiling all of them in order
- to try the modified program. Dividing a large program into many
- substantial modules in this way typically makes recompilation much faster.
- @cindex linking object files
- After you compile all the program's modules, in order to run the
- program you must @dfn{link} the object files into a combined
- executable, like this:
- @example
- gcc -o foo foo.o bar.o
- @end example
- @noindent
- In this command, @option{-o foo} specifies the file name for the
- executable file, and the other arguments are the object files to link.
- Always specify the executable file name in a command that generates
- one.
- Normally we don't run any of these commands directly. Instead we
- write a set of @dfn{make rules} for the program, then use the
- @command{make} program to recompile only the source files that need to
- be recompiled.
- @c ??? ref to make manual
- @node Directing Compilation
- @chapter Directing Compilation
- This chapter describes C constructs that don't alter the program's
- meaning @emph{as such}, but rather direct the compiler how to treat
- some aspects of the program.
- @menu
- * Pragmas:: Controlling compilation of some constructs.
- * Static Assertions:: Compile-time tests for conditions.
- @end menu
- @node Pragmas
- @section Pragmas
- A @dfn{pragma} is an annotation in a program that gives direction to
- the compiler.
- @menu
- * Pragma Basics:: Pragma syntax and usage.
- * Severity Pragmas:: Settings for compile-time pragma output.
- * Optimization Pragmas:: Controlling optimizations.
- @end menu
- @c See also @ref{Macro Pragmas}, which save and restore macro definitions.
- @node Pragma Basics
- @subsection Pragma Basics
- C defines two syntactical forms for pragmas, the line form and the
- token form. You can write any pragma in either form, with the same
- meaning.
- The line form is a line in the source code, like this:
- @example
- #pragma @var{line}
- @end example
- @noindent
- The line pragma has no effect on the parsing of the lines around it.
- This form has the drawback that it can't be generated by a macro expansion.
- The token form is a series of tokens; it can appear anywhere in the
- program between the other tokens.
- @example
- _Pragma (@var{stringconstant})
- @end example
- @noindent
- The pragma has no effect on the syntax of the tokens that surround it;
- thus, here's a pragma in the middle of an @code{if} statement:
- @example
- if _Pragma ("hello") (x > 1)
- @end example
- @noindent
- However, that's an unclear thing to do; for the sake of
- understandability, it is better to put a pragma on a line by itself
- and not embedded in the middle of another construct.
- Both forms of pragma have a textual argument. In a line pragma, the
- text is the rest of the line. The textual argument to @code{_Pragma}
- uses the same syntax as a C string constant: surround the text with
- two @samp{"} characters, and add a backslash before each @samp{"} or
- @samp{\} character in it.
- With either syntax, the textual argument specifies what to do.
- It begins with one or several words that specify the operation.
- If the compiler does not recognize them, it ignores the pragma.
- Here are the pragma operations supported in GNU C@.
- @c ??? Verify font for []
- @table @code
- @item #pragma GCC dependency "@var{file}" [@var{message}]
- @itemx _Pragma ("GCC dependency \"@var{file}\" [@var{message}]")
- Declares that the current source file depends on @var{file}, so GNU C
- compares the file times and gives a warning if @var{file} is newer
- than the current source file.
- This directive searches for @var{file} the way @code{#include}
- searches for a non-system header file.
- If @var{message} is given, the warning message includes that text.
- Examples:
- @example
- #pragma GCC dependency "parse.y"
- _Pragma ("GCC dependency \"/usr/include/time.h\" \
- rerun fixincludes")
- @end example
- @item #pragma GCC poison @var{identifiers}
- @itemx _Pragma ("GCC poison @var{identifiers}")
- Poisons the identifiers listed in @var{identifiers}.
- This is useful to make sure all mention of @var{identifiers} has been
- deleted from the program and that no reference to them creeps back in.
- If any of those identifiers appears anywhere in the source after the
- directive, it causes a compilation error. For example,
- @example
- #pragma GCC poison printf sprintf fprintf
- sprintf(some_string, "hello");
- @end example
- @noindent
- generates an error.
- If a poisoned identifier appears as part of the expansion of a macro
- that was defined before the identifier was poisoned, it will @emph{not}
- cause an error. Thus, system headers that define macros that use
- the identifier will not cause errors.
- For example,
- @example
- #define strrchr rindex
- _Pragma ("GCC poison rindex")
- strrchr(some_string, 'h');
- @end example
- @noindent
- does not cause a compilation error.
- @item #pragma GCC system_header
- @itemx _Pragma ("GCC system_header")
- Specify treating the rest of the current source file as if it came
- from a system header file. @xref{System Headers, System Headers,
- System Headers, gcc, Using the GNU Compiler Collection}.
- @item #pragma GCC warning @var{message}
- @itemx _Pragma ("GCC warning @var{message}")
- Equivalent to @code{#warning}. Its advantage is that the
- @code{_Pragma} form can be included in a macro definition.
- @item #pragma GCC error @var{message}
- @itemx _Pragma ("GCC error @var{message}")
- Equivalent to @code{#error}. Its advantage is that the
- @code{_Pragma} form can be included in a macro definition.
- @item #pragma message @var{message}
- @itemx _Pragma ("message @var{message}")
- Similar to @samp{GCC warning} and @samp{GCC error}, this simply prints an
- informational message, and could be used to include additional warning
- or error text without triggering more warnings or errors. (Note that
- unlike @samp{warning} and @samp{error}, @samp{message} does not include
- @samp{GCC} as part of the pragma.)
- @end table
- @node Severity Pragmas
- @subsection Severity Pragmas
- These pragmas control the severity of classes of diagnostics.
- You can specify the class of diagnostic with the GCC option that causes
- those diagnostics to be generated.
- @table @code
- @item #pragma GCC diagnostic error @var{option}
- @itemx _Pragma ("GCC diagnostic error @var{option}")
- For code following this pragma, treat diagnostics of the variety
- specified by @var{option} as errors. For example:
- @example
- _Pragma ("GCC diagnostic error \"-Wformat\"")
- @end example
- @noindent
- specifies to treat diagnostics enabled by the @option{-Wformat} option
- as errors rather than warnings.
- @item #pragma GCC diagnostic warning @var{option}
- @itemx _Pragma ("GCC diagnostic warning @var{option}")
- For code following this pragma, treat diagnostics of the variety
- specified by @var{option} as warnings. This overrides the
- @option{-Werror} option which says to treat warnings as errors.
- @item #pragma GCC diagnostic ignored @var{option}
- @itemx _Pragma ("GCC diagnostic ignored @var{option}")
- For code following this pragma, refrain from reporting any diagnostics
- of the variety specified by @var{option}.
- @item #pragma GCC diagnostic push
- @itemx _Pragma ("GCC diagnostic push")
- @itemx #pragma GCC diagnostic pop
- @itemx _Pragma ("GCC diagnostic pop")
- These pragmas maintain a stack of states for severity settings.
- @samp{GCC diagnostic push} saves the current settings on the stack,
- and @samp{GCC diagnostic pop} pops the last stack item and restores
- the current settings from that.
- @samp{GCC diagnostic pop} when the severity setting stack is empty
- restores the settings to what they were at the start of compilation.
- Here is an example:
- @example
- _Pragma ("GCC diagnostic error \"-Wformat\"")
- /* @r{@option{-Wformat} messages treated as errors. } */
- _Pragma ("GCC diagnostic push")
- _Pragma ("GCC diagnostic warning \"-Wformat\"")
- /* @r{@option{-Wformat} messages treated as warnings. } */
- _Pragma ("GCC diagnostic push")
- _Pragma ("GCC diagnostic ignored \"-Wformat\"")
- /* @r{@option{-Wformat} messages suppressed. } */
- _Pragma ("GCC diagnostic pop")
- /* @r{@option{-Wformat} messages treated as warnings again. } */
- _Pragma ("GCC diagnostic pop")
- /* @r{@option{-Wformat} messages treated as errors again. } */
- /* @r{This is an excess @samp{pop} that matches no @samp{push}. } */
- _Pragma ("GCC diagnostic pop")
- /* @r{@option{-Wformat} messages treated once again}
- @r{as specified by the GCC command-line options.} */
- @end example
- @end table
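- Because the @code{_Pragma} form can appear in a macro expansion, the
- push, change and pop steps can be wrapped in macros. The following is
- a sketch under that assumption; the macro names @code{DO_PRAGMA},
- @code{IGNORE_WARNING} and @code{RESTORE_WARNINGS} are hypothetical.
- Note that the option is written as a string constant, which the
- @samp{#} operator then escapes when it builds the pragma text.

```c
/* Hypothetical helper macros; the names are ours.  */
#define DO_PRAGMA(text) _Pragma (#text)
#define IGNORE_WARNING(option)            \
  DO_PRAGMA (GCC diagnostic push)         \
  DO_PRAGMA (GCC diagnostic ignored option)
#define RESTORE_WARNINGS DO_PRAGMA (GCC diagnostic pop)

IGNORE_WARNING ("-Wunused-variable")
int
answer (void)
{
  int scratch = 0;   /* Deliberately unused; no warning here.  */
  return 42;
}
RESTORE_WARNINGS
```

- After @code{RESTORE_WARNINGS}, unused-variable diagnostics return to
- whatever setting was in effect before @code{IGNORE_WARNING}.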
- @node Optimization Pragmas
- @subsection Optimization Pragmas
- These pragmas enable a particular optimization for specific function
- definitions. The settings take effect at the end of a function
- definition, so the clean place to use these pragmas is between
- function definitions.
- @table @code
- @item #pragma GCC optimize @var{optimization}
- @itemx _Pragma ("GCC optimize @var{optimization}")
- These pragmas enable the optimization @var{optimization} for the
- following functions. For example,
- @example
- _Pragma ("GCC optimize \"-fforward-propagate\"")
- @end example
- @noindent
- says to apply the @samp{forward-propagate} optimization to all
- following function definitions. Specifying optimizations for
- individual functions, rather than for the entire program, is rare but
- can be useful for getting around a bug in the compiler.
- If @var{optimization} does not correspond to a defined optimization
- option, the pragma is erroneous. To turn off an optimization, use the
- corresponding @samp{-fno-} option, such as
- @samp{-fno-forward-propagate}.
- @item #pragma GCC target @var{optimizations}
- @itemx _Pragma ("GCC target @var{optimizations}")
- The pragma @samp{GCC target} is similar to @samp{GCC optimize} but is
- used for platform-specific optimizations. Thus,
- @example
- _Pragma ("GCC target \"popcnt\"")
- @end example
- @noindent
- activates the optimization @samp{popcnt} for all
- following function definitions. This optimization is supported
- on a few common targets but not on others.
- @item #pragma GCC push_options
- @itemx _Pragma ("GCC push_options")
- The @samp{push_options} pragma saves on a stack the current settings
- specified with the @samp{target} and @samp{optimize} pragmas.
- @item #pragma GCC pop_options
- @itemx _Pragma ("GCC pop_options")
- The @samp{pop_options} pragma pops saved settings from that stack.
- Here's an example of using this stack.
- @example
- _Pragma ("GCC push_options")
- _Pragma ("GCC optimize \"forward-propagate\"")
- /* @r{Functions to compile}
- @r{with the @code{forward-propagate} optimization.} */
- _Pragma ("GCC pop_options")
- /* @r{Ends enablement of @code{forward-propagate}.} */
- @end example
- @item #pragma GCC reset_options
- @itemx _Pragma ("GCC reset_options")
- Clears all pragma-defined @samp{target} and @samp{optimize}
- optimization settings.
- @end table
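- The same stack can be used with the line form of the pragmas. The
- following sketch turns off one optimization for a single function and
- then restores the previous settings. The function @code{sum_up_to} is
- hypothetical, and the option is given as a string constant. A
- compiler that does not recognize these pragmas ignores them, so the
- function computes the same result either way.

```c
#pragma GCC push_options
#pragma GCC optimize ("-fno-forward-propagate")

/* Compiled with forward propagation turned off.  */
int
sum_up_to (int n)
{
  int i, total = 0;

  for (i = 1; i <= n; i++)
    total += i;
  return total;
}

#pragma GCC pop_options
/* Functions defined from here on use the earlier settings.  */
```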
- @node Static Assertions
- @section Static Assertions
- @cindex static assertions
- @findex _Static_assert
- You can add compile-time tests for necessary conditions into your
- code using @code{_Static_assert}. This can be useful, for example, to
- check that the compilation target platform supports the type sizes
- that the code expects. For example,
- @example
- _Static_assert ((sizeof (long int) >= 8),
- "long int needs to be at least 8 bytes");
- @end example
- @noindent
- reports a compile-time error if compiled on a system with long
- integers smaller than 8 bytes, with @samp{long int needs to be at
- least 8 bytes} as the error message.
- Since calls to @code{_Static_assert} are processed at compile time, the
- expression must be computable at compile time and the error message
- must be a literal string. The expression can refer to the sizes of
- variables, but can't refer to their values. For example, the
- following static assertion is invalid for two reasons:
- @example
- char *error_message
- = "long int needs to be at least 8 bytes";
- int size_of_long_int = sizeof (long int);
- _Static_assert (size_of_long_int == 8, error_message);
- @end example
- @noindent
- The expression @code{size_of_long_int == 8} isn't computable at
- compile time, and the error message isn't a literal string.
- You can, though, use preprocessor definition values with
- @code{_Static_assert}:
- @example
- #define LONG_INT_ERROR_MESSAGE "long int needs to be \
- at least 8 bytes"
- _Static_assert ((sizeof (long int) == 8),
- LONG_INT_ERROR_MESSAGE);
- @end example
- Static assertions are permitted wherever a statement or declaration is
- permitted, including at top level in the file, and also inside the
- definition of a type.
- @example
- union y
- @{
- int i;
- int *ptr;
- _Static_assert (sizeof (int *) == sizeof (int),
- "Pointer and int not same size");
- @};
- @end example
- @node Type Alignment
- @appendix Type Alignment
- @cindex type alignment
- @cindex alignment of type
- @findex _Alignof
- @findex __alignof__
- Code for device drivers and other communication with low-level
- hardware sometimes needs to be concerned with the alignment of
- data objects in memory.
- Each data type has a required @dfn{alignment}, always a power of 2,
- that says at which memory addresses an object of that type can validly
- start. A valid address for the type must be a multiple of its
- alignment. If a type's alignment is 1, that means it can validly
- start at any address. If a type's alignment is 2, that means it can
- only start at an even address. If a type's alignment is 4, that means
- it can only start at an address that is a multiple of 4.
- The alignment of a type (except @code{char}) can vary depending on the
- kind of computer in use. To refer to the alignment of a type in a C
- program, use @code{_Alignof}, whose syntax parallels that of
- @code{sizeof}. Like @code{sizeof}, @code{_Alignof} is a compile-time
- operation, and it doesn't compute the value of the expression used
- as its argument.
- Nominally, each integer and floating-point type has an alignment equal to
- the largest power of 2 that divides its size. Thus, @code{int} with
- size 4 has a nominal alignment of 4, and @code{long long int} with
- size 8 has a nominal alignment of 8.
- However, each kind of computer generally has a maximum alignment, and
- no type needs more alignment than that. If the computer's maximum
- alignment is 4 (which is common), then no type's alignment is more
- than 4.
- The size of any type is always a multiple of its alignment; that way,
- in an array whose elements have that type, all the elements are
- properly aligned if the first one is.
- These rules apply to all real computers today, but some embedded
- controllers have odd exceptions. We don't have references to cite for
- them.
- @c We can't cite a nonfree manual as documentation.
- Ordinary C code guarantees that every object of a given type is in
- fact aligned as that type requires.
- If the operand of @code{_Alignof} is a structure field, the value
- is the alignment it requires. It may have a greater alignment by
- coincidence, due to the other fields, but @code{_Alignof} is not
- concerned about that. @xref{Structures}.
- Older versions of GNU C used the keyword @code{__alignof__} for this,
- but now that the feature has been standardized, it is better
- to use the standard keyword @code{_Alignof}.
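- The alignment rules described in this appendix can be checked from a
- program. The following is only a sketch: @code{struct sample} and
- both function names are hypothetical, chosen for illustration.

```c
#include <stddef.h>

/* Hypothetical structure used only for illustration.  */
struct sample
{
  char tag;
  double value;
};

/* Return nonzero if N is a power of 2.  */
static int
is_power_of_2 (size_t n)
{
  return n != 0 && (n & (n - 1)) == 0;
}

/* Return nonzero if the rules stated above hold for double and
   for struct sample: each alignment is a power of 2, each size is
   a multiple of its alignment, and the structure is aligned at
   least as strictly as its double field.  */
int
alignment_rules_hold (void)
{
  return (is_power_of_2 (_Alignof (double))
          && is_power_of_2 (_Alignof (struct sample))
          && sizeof (double) % _Alignof (double) == 0
          && sizeof (struct sample) % _Alignof (struct sample) == 0
          && _Alignof (struct sample) >= _Alignof (double));
}
```

- Because @code{_Alignof} is evaluated at compile time, the same
- conditions could also be written as static assertions.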
- @findex _Alignas
- @findex __aligned__
- You can explicitly specify an alignment requirement for a particular
- variable or structure field by adding @code{_Alignas
- (@var{alignment})} to the declaration, where @var{alignment} is a
- power of 2 or a type name. For instance:
- @example
- char _Alignas (8) x;
- @end example
- @noindent
- or
- @example
- char _Alignas (double) x;
- @end example
- @noindent
- specifies that @code{x} must start on an address that is a multiple of
- 8. However, if @var{alignment} exceeds the maximum alignment for the
- machine, that maximum is how much alignment @code{x} will get.
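- One way to observe the effect of @code{_Alignas} is to test the
- address of the object. This sketch assumes the machine's maximum
- alignment is at least 8; the names @code{buffer}, @code{is_aligned}
- and @code{aligned_buffer_ok} are hypothetical.

```c
#include <stdint.h>

/* Hypothetical buffer, requested to start at a multiple of 8.  */
static _Alignas (8) char buffer[16];

/* Return nonzero if PTR is a multiple of ALIGNMENT,
   which must be a power of 2.  */
int
is_aligned (const void *ptr, uintptr_t alignment)
{
  return ((uintptr_t) ptr & (alignment - 1)) == 0;
}

/* Return nonzero if buffer starts on a multiple of 8.  The byte
   after the start then cannot be a multiple of 8, while the byte
   8 positions later must be.  */
int
aligned_buffer_ok (void)
{
  return (is_aligned (buffer, 8)
          && !is_aligned (buffer + 1, 8)
          && is_aligned (buffer + 8, 8));
}
```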
- The older GNU C syntax for this feature was to add
- @code{__attribute__ ((__aligned__ (@var{alignment})))} to the
- declaration, after the variable name. For instance:
- @example
- char x __attribute__ ((__aligned__ (8)));
- @end example
- @xref{Attributes}.
- @node Aliasing
- @appendix Aliasing
- @cindex aliasing (of storage)
- @cindex pointer type conversion
- @cindex type conversion, pointer
- We have already presented examples of casting a @code{void *} pointer
- to another pointer type, and casting another pointer type to
- @code{void *}.
- One common kind of pointer cast is guaranteed safe: casting the value
- returned by @code{malloc} and related functions (@pxref{Dynamic Memory
- Allocation}). It is safe because these functions do not save the
- pointer anywhere else; the only way the program will access the newly
- allocated memory is via the pointer just returned.
- In fact, C allows casting any pointer type to any other pointer type.
- Using this to access the same place in memory using two
- different data types is called @dfn{aliasing}.
- Aliasing is necessary in some programs that do sophisticated memory
- management, such as GNU Emacs, but most C programs don't need to do
- aliasing. When it isn't needed, @strong{stay away from it!} To do
- aliasing correctly requires following the rules stated below.
- Otherwise, the aliasing may result in malfunctions when the program
- runs.
- The rest of this appendix explains the pitfalls and rules of aliasing.
- @menu
- * Aliasing Alignment:: Memory alignment considerations for
- casting between pointer types.
- * Aliasing Length:: Type size considerations for
- casting between pointer types.
- * Aliasing Type Rules:: Even when type alignment and size matches,
- aliasing can still have surprising results.
- @end menu
- @node Aliasing Alignment
- @appendixsection Aliasing and Alignment
- In order for a type-converted pointer to be valid, it must have the
- alignment that the new pointer type requires. For instance, on most
- computers, @code{int} has alignment 4; the address of an @code{int}
- must be a multiple of 4. However, @code{char} has alignment 1, so the
- address of a @code{char} is usually not a multiple of 4. Taking the
- address of such a @code{char} and casting it to @code{int *} probably
- results in an invalid pointer. Trying to dereference it may cause a
- @code{SIGBUS} signal, depending on the platform in use (@pxref{Signals}).
- @example
- int foo (void)
- @{
- char i[4];
- int *p = (int *) &i[1]; /* @r{Misaligned pointer!} */
- return *p; /* @r{Crash!} */
- @}
- @end example
- This requirement is never a problem when casting the return value
- of @code{malloc} because that function always returns a pointer
- with as much alignment as any type can require.
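- When the bytes of an @code{int} may sit at an address that is not
- suitably aligned, one approach that avoids the invalid pointer is to
- copy them into a properly aligned object with @code{memcpy}. This is
- only a sketch; the function name @code{read_int_unaligned} is
- hypothetical.

```c
#include <string.h>

/* Return the int whose bytes are stored at BYTES.  BYTES need not
   be aligned for int, because the bytes are copied into VALUE
   rather than accessed through an int pointer.  */
int
read_int_unaligned (const char *bytes)
{
  int value;
  memcpy (&value, bytes, sizeof value);
  return value;
}
```

- The copy works at any address because @code{char} has alignment 1.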
- @node Aliasing Length
- @appendixsection Aliasing and Length
- When converting a pointer to a different pointer type, make sure the
- object it really points to is at least as long as the target of the
- converted pointer. For instance, suppose @code{p} has type @code{int
- *} and it's cast as follows:
- @example
- int *p;
- struct foo
- @{
- double d, e, f;
- @};
- struct foo *q = (struct foo *)p;
- q->f = 5.14159;
- @end example
- @noindent
- the value @code{q->f} will run past the end of the @code{int} that
- @code{p} points to. If @code{p} was initialized to the start of an
- array of type @code{int[6]}, the object is long enough for three
- @code{double}s. But if @code{p} points to something shorter,
- @code{q->f} will run on beyond the end of that, overlaying some other
- data. Storing that will garble that other data. Or it could extend
- past the end of memory space and cause a @code{SIGSEGV} signal
- (@pxref{Signals}).
- @node Aliasing Type Rules
- @appendixsection Type Rules for Aliasing
- C code that converts a pointer to a different pointer type can use the
- pointers to access the same memory locations with two different data
- types. If the same address is accessed with different types in a
- single control thread, optimization can make the code do surprising
- things (in effect, make it malfunction).
- Here's a concrete example where aliasing can change the code's
- behavior when it is optimized. We assume that @code{float} is 4 bytes
- long, like @code{int}, and so is every pointer. Thus, the structures
- @code{struct a} and @code{struct b} are both 8 bytes.
- @example
- #include <stdio.h>
- struct a @{ int size; char *data; @};
- struct b @{ float size; char *data; @};
- void sub (struct a *p, struct b *q)
- @{
- int x;
- p->size = 0;
- q->size = 1;
- x = p->size;
- printf("x =%d\n", x);
- printf("p->size =%d\n", (int)p->size);
- printf("q->size =%d\n", (int)q->size);
- @}
- int main(void)
- @{
- struct a foo;
- struct a *p = &foo;
- struct b *q = (struct b *) &foo;
- sub (p, q);
- @}
- @end example
- This code works as intended when compiled without optimization. All
- the operations are carried out sequentially as written. The code
- sets @code{x} to @code{p->size}, but what it actually gets is the
- bits of the floating point number 1, as type @code{int}.
- However, when optimizing, the compiler is allowed to assume
- (mistakenly, here) that @code{q} does not point to the same storage as
- @code{p}, because their data types are not allowed to alias.
- From this assumption, the compiler can deduce (falsely, here) that the
- assignment into @code{q->size} has no effect on the value of
- @code{p->size}, which must therefore still be 0. Thus, @code{x} will
- be set to 0.
- GNU C, following the C standard, @emph{defines} this optimization as
- legitimate. Code that misbehaves when optimized following these rules
- is, by definition, incorrect C code.
- The rules for storage aliasing in C are based on the two data types:
- the type of the object, and the type it is accessed through. The
- rules permit accessing part of a storage object of type @var{t} using
- only these types:
- @itemize @bullet
- @item
- @var{t}.
- @item
- A type compatible with @var{t}. @xref{Compatible Types}.
- @item
- A signed or unsigned version of one of the above.
- @item
- A qualified version of one of the above.
- @xref{Type Qualifiers}.
- @item
- An array, structure (@pxref{Structures}), or union type
- (@pxref{Unions}) that contains one of the above, either directly as a
- field or through multiple levels of fields. If @var{t} is
- @code{double}, this would include @code{struct s @{ union @{ double
- d[2]; int i[4]; @} u; int i; @};} because there's a @code{double}
- inside it somewhere.
- @item
- A character type.
- @end itemize
- What do these rules say about the example in this subsection?
- For @code{foo.size} (equivalently, @code{p->size}), @var{t} is
- @code{int}. The type @code{float} is not allowed as an aliasing type
- by those rules, so @code{q->size} is not supposed to alias with
- @code{p->size}. Based on that assumption, GNU C makes a
- permitted optimization that was not, in this case, consistent with
- what the programmer intended the program to do.
- Whether GCC actually performs type-based aliasing analysis depends on
- the details of the code. GCC has other ways to determine (in some cases)
- whether objects alias, and if it gets a reliable answer that way, it won't
- fall back on type-based heuristics.
- @c @opindex -fno-strict-aliasing
- The point of knowing the type-based aliasing rules is not to ensure
- that the optimization is done where it would be safe, but to ensure
- that it is @emph{not} done in a way that would break the
- program. You can turn off type-based aliasing analysis by giving GCC
- the option @option{-fno-strict-aliasing}.
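- When a program really needs the bits of an object as a different
- type, two approaches that stay within the rules listed above are
- copying the bytes with @code{memcpy} and examining them through a
- character type. The following sketch assumes @code{float} is 4
- bytes, as in the example above; both function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption for this sketch: float occupies 4 bytes.  */
_Static_assert (sizeof (float) == sizeof (uint32_t),
                "float must be 4 bytes for this sketch");

/* Return the bits of F as an unsigned integer.  Copying the bytes
   accesses neither object through the other's type.  */
uint32_t
float_bits (float f)
{
  uint32_t bits;
  memcpy (&bits, &f, sizeof bits);
  return bits;
}

/* Count the nonzero bytes of the SIZE-byte object at OBJECT.
   A character type may be used to access any object.  */
size_t
count_nonzero_bytes (const void *object, size_t size)
{
  const unsigned char *bytes = object;
  size_t count = 0;
  for (size_t i = 0; i < size; i++)
    if (bytes[i] != 0)
      count++;
  return count;
}
```

- Neither function depends on @option{-fno-strict-aliasing}, since
- neither accesses an object through a disallowed type.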
- @node Digraphs
- @appendix Digraphs
- @cindex digraphs
- C accepts aliases for certain characters. Apparently in the 1990s
- some computer systems had trouble inputting these characters, or
- trouble displaying them. These digraphs almost never appear in C
- programs nowadays, but we mention them for completeness.
- @table @samp
- @item <:
- An alias for @samp{[}.
- @item :>
- An alias for @samp{]}.
- @item <%
- An alias for @samp{@{}.
- @item %>
- An alias for @samp{@}}.
- @item %:
- An alias for @samp{#},
- used for preprocessing directives (@pxref{Directives}) and
- macros (@pxref{Macros}).
- @end table
- @node Attributes
- @appendix Attributes in Declarations
- @cindex attributes
- @findex __attribute__
- You can specify certain additional requirements in a declaration, to
- get fine-grained control over code generation, and helpful
- informational messages during compilation. We use a few attributes in
- code examples throughout this manual, including
- @table @code
- @item aligned
- The @code{aligned} attribute specifies a minimum alignment for a
- variable or structure field, measured in bytes:
- @example
- int foo __attribute__ ((aligned (8))) = 0;
- @end example
- @noindent
- This directs GNU C to allocate @code{foo} at an address that is a
- multiple of 8 bytes. However, you can't force an alignment bigger
- than the computer's maximum meaningful alignment.
- @item packed
- The @code{packed} attribute specifies to compact the fields of a
- structure by not leaving gaps between fields. For example,
- @example
- struct __attribute__ ((packed)) bar
- @{
- char a;
- int b;
- @};
- @end example
- @noindent
- allocates the integer field @code{b} at byte 1 in the structure,
- immediately after the character field @code{a}. The packed structure
- is just 5 bytes long (assuming @code{int} is 4 bytes) and its
- alignment is 1, that of @code{char}.
- @item deprecated
- Applicable to both variables and functions, the @code{deprecated}
- attribute tells the compiler to issue a warning if the variable or
- function is ever used in the source file.
- @example
- int old_foo __attribute__ ((deprecated));
- int old_quux () __attribute__ ((deprecated));
- @end example
- @item __noinline__
- The @code{__noinline__} attribute, in a function's declaration or
- definition, specifies never to inline calls to that function. All
- calls to that function, in a compilation unit where it has this
- attribute, will be compiled to invoke the separately compiled
- function. @xref{Inline Function Definitions}.
- @item __noclone__
- The @code{__noclone__} attribute, in a function's declaration or
- definition, specifies never to clone that function. Thus, there will
- be only one compiled version of the function. @xref{Label Value
- Caveats}, for more information about cloning.
- @item always_inline
- The @code{always_inline} attribute, in a function's declaration or
- definition, specifies to inline all calls to that function (unless
- something about the function makes inlining impossible). This applies
- to all calls to that function in a compilation unit where it has this
- attribute. @xref{Inline Function Definitions}.
- @item gnu_inline
- The @code{gnu_inline} attribute, in a function's declaration or
- definition, specifies to handle the @code{inline} keyword the way GNU
- C originally implemented it, many years before ISO C said anything
- about inlining. @xref{Inline Function Definitions}.
- @end table
- For full documentation of attributes, see the GCC manual.
- @xref{Attribute Syntax, Attribute Syntax, System Headers, gcc, Using
- the GNU Compiler Collection}.
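- The layout that @code{packed} produces can be confirmed with
- @code{sizeof}, @code{offsetof} and @code{_Alignof}. This is a sketch
- that repeats @code{struct bar} from above; @code{struct plain_bar}
- and the function names are hypothetical, added for comparison.

```c
#include <stddef.h>

/* The packed structure from the example above.  */
struct __attribute__ ((packed)) bar
{
  char a;
  int b;
};

/* The same fields without the attribute, for comparison.  */
struct plain_bar
{
  char a;
  int b;
};

/* Offset of field b within the packed structure.  */
size_t
packed_offset_of_b (void)
{
  return offsetof (struct bar, b);
}

/* Number of bytes saved by packing; zero if the unpacked
   layout has no gap between the fields.  */
size_t
bytes_saved_by_packing (void)
{
  return sizeof (struct plain_bar) - sizeof (struct bar);
}
```

- With a 4-byte @code{int} that has alignment 4, the unpacked
- structure has a 3-byte gap after @code{a}, which is what packing
- removes.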
- @node Signals
- @appendix Signals
- @cindex signal
- @cindex handler (for signal)
- @cindex @code{SIGSEGV}
- @cindex @code{SIGFPE}
- @cindex @code{SIGBUS}
- Some program operations bring about an error condition called a
- @dfn{signal}. These signals terminate the program, by default.
- There are various different kinds of signals, each with a name. We
- have seen several such error conditions throughout this manual:
- @table @code
- @item SIGSEGV
- This signal is generated when a program tries to read or write outside
- the memory that is allocated for it, or to write memory that can only
- be read. The name is an abbreviation for ``segmentation violation''.
- @item SIGFPE
- This signal indicates a fatal arithmetic error. The name is an
- abbreviation for ``floating-point exception'', but covers all types of
- arithmetic errors, including division by zero and overflow.
- @item SIGBUS
- This signal is generated when an invalid pointer is dereferenced,
- typically the result of dereferencing an uninitialized pointer. It is
- similar to @code{SIGSEGV}, except that @code{SIGSEGV} indicates
- invalid access to valid memory, while @code{SIGBUS} indicates an
- attempt to access an invalid address.
- @end table
- These kinds of signal allow the program to specify a function as a
- @dfn{signal handler}. When a signal has a handler, it doesn't
- terminate the program; instead it calls the handler.
- There are many other kinds of signal; here we list only those that
- come from run-time errors in C operations. The rest have to do with
- the functioning of the operating system. The GNU C Library Reference
- Manual gives more explanation about signals (@pxref{Program Signal
- Handling, The GNU C Library, , libc, The GNU C Library Reference
- Manual}).
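- As a minimal sketch of installing a signal handler, the following
- uses the standard functions @code{signal} and @code{raise}. The
- signal is simulated with @code{raise} rather than caused by a real
- error, and the names @code{record_signal} and
- @code{catch_simulated_signal} are hypothetical.

```c
#include <signal.h>

/* Number of the most recent signal seen by the handler.  */
static volatile sig_atomic_t last_signal = 0;

/* Handler: record which signal arrived, then return.  */
static void
record_signal (int signum)
{
  last_signal = signum;
}

/* Install record_signal for SIGNUM, simulate the signal with
   raise, restore the default action, and return the signal number
   the handler recorded.  Return -1 if the handler could not be
   installed.  */
int
catch_simulated_signal (int signum)
{
  last_signal = 0;
  if (signal (signum, record_signal) == SIG_ERR)
    return -1;
  raise (signum);
  signal (signum, SIG_DFL);
  return last_signal;
}
```

- Returning from the handler is well defined here only because the
- signal came from @code{raise}; a handler for a signal caused by a
- real run-time error generally should not simply return.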
- @node GNU Free Documentation License
- @appendix GNU Free Documentation License
- @include fdl.texi
- @node Symbol Index
- @unnumbered Index of Symbols and Keywords
- @printindex fn
- @node Concept Index
- @unnumbered Concept Index
- @printindex cp
- @bye