| .. _physical: |
| |
| Physical lines |
| -------------- |
| |
| A physical line is a sequence of characters terminated by an end-of-line |
| sequence. In source files, any of the standard platform line termination |
| sequences can be used - the Unix form using ASCII LF (linefeed), the Windows |
| form using the ASCII sequence CR LF (return followed by linefeed), or the old |
| Macintosh form using the ASCII CR (return) character. All of these forms can be |
| used equally, regardless of platform. |
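
As a small illustration of the rule above, the following sketch compiles one source string whose three lines end in LF, CR LF and a bare CR respectively. It assumes an interpreter whose ``compile()`` normalizes all three terminators in string input (documented for Python 3.2 and later); the file name ``<mixed-newlines>`` is only a placeholder.

```python
# Three assignments, each terminated by a different end-of-line sequence:
# LF (Unix), CR LF (Windows) and a bare CR (old Macintosh).
source = "a = 1\nb = 2\r\nc = 3\r"

namespace = {}
# Assumption: compile() accepts all three terminators in string input.
exec(compile(source, "<mixed-newlines>", "exec"), namespace)

total = namespace["a"] + namespace["b"] + namespace["c"]
```

All three assignments are executed, so every terminator was treated as the end of a physical line.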
| |
| When embedding Python, source code strings should be passed to Python APIs using |
| the standard C conventions for newline characters (the ``\n`` character, |
| representing ASCII LF, is the line terminator). |
| |
| |
| |
| If an encoding is declared, the encoding name must be recognized by Python. The |
| encoding is used for all lexical analysis, in particular to find the end of a |
| string, and to interpret the contents of Unicode literals. String literals are |
| converted to Unicode for syntactical analysis, then converted back to their |
| original encoding before interpretation starts. The encoding declaration must |
| appear on a line of its own. |
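
The effect of a declaration can be sketched by compiling source held as bytes. The example below is illustrative only: it assumes an interpreter in which ``compile()`` accepts a bytes object and honours the declaration on its first line, and the names ``source_bytes`` and ``decoded`` are hypothetical.

```python
# A two-line module held as raw bytes. Its first line declares Latin-1,
# so the single byte 0xE9 inside the literal is read as one character.
source_bytes = b"# -*- coding: latin-1 -*-\ns = '\xe9'\n"

namespace = {}
# Assumption: compile() honours the declaration found in bytes input.
exec(compile(source_bytes, "<latin-1-source>", "exec"), namespace)

decoded = namespace["s"]
```

Without the declaration, the same byte would fall outside the default source character set.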
| |
| .. XXX there should be a list of supported encodings. |
| |
| |
| .. _explicit-joining: |
| |
| Explicit line joining |
| --------------------- |
| |
| .. index:: |
| single: line joining |
| single: line continuation |
| single: backslash character |
| |
| Two or more physical lines may be joined into logical lines using backslash |
| characters (``\``), as follows: when a physical line ends in a backslash that is |
| not part of a string literal or comment, it is joined with the following forming |
| a single logical line, deleting the backslash and the following end-of-line |
| character. For example:: |
| |
|    if 1900 < year < 2100 and 1 <= month <= 12 \ |
|       and 1 <= day <= 31 and 0 <= hour < 24 \ |
|       and 0 <= minute < 60 and 0 <= second < 60:   # Looks like a valid date |
|            return 1 |
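
The fragment above is not runnable by itself because ``return`` must appear inside a function. A self-contained variant, using the hypothetical function name ``looks_like_valid_date``, might look like this:

```python
def looks_like_valid_date(year, month, day, hour, minute, second):
    """Range-check the fields of a date; the name is illustrative only."""
    # Each backslash is the last character on its physical line, so the
    # three physical lines below form a single logical line.
    return 1900 < year < 2100 and 1 <= month <= 12 \
        and 1 <= day <= 31 and 0 <= hour < 24 \
        and 0 <= minute < 60 and 0 <= second < 60
```

For example, ``looks_like_valid_date(2000, 2, 29, 12, 0, 0)`` is true, while a year of 1800 makes the whole condition false.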
| |
| A line ending in a backslash cannot carry a comment. A backslash does not |
| continue a comment. A backslash does not continue a token except for string |
| |
| Keywords |
| -------- |
| |
| .. index:: |
| single: keyword |
| single: reserved word |
| |
| The following identifiers are used as reserved words, or *keywords* of the |
| language, and cannot be used as ordinary identifiers. They must be spelled |
| exactly as written here:: |
| |
|    and       del       from      not       while |
|    as        elif      global    or        with |
|    assert    else      if        pass      yield |
|    break     except    import    print |
|    class     exec      in        raise |
|    continue  finally   is        return |
|    def       for       lambda    try |
| |
| .. % When adding keywords, use reswords.py for reformatting |
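
Rather than hard-coding the table, a program can ask the running interpreter which words it reserves. The sketch below uses the standard :mod:`keyword` module; the helper name ``is_not_reserved`` is hypothetical. Note that the set of keywords depends on the Python version (``print`` and ``exec``, for instance, are not reserved in Python 3).

```python
import keyword


def is_not_reserved(name):
    """Return True if the running interpreter does not reserve `name`."""
    return not keyword.iskeyword(name)


# keyword.kwlist holds every keyword known to this interpreter.
reserved_words = sorted(keyword.kwlist)
```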
| |
| .. versionchanged:: 2.4 |
| :const:`None` became a constant and is now recognized by the compiler as a name |
| for the built-in object :const:`None`. Although it is not a keyword, you cannot |
| assign a different object to it. |
| |
| .. versionchanged:: 2.5 |
| Both :keyword:`as` and :keyword:`with` are only recognized when the |
| Certain classes of identifiers (besides keywords) have special meanings. These |
| classes are identified by the patterns of leading and trailing underscore |
| characters: |
| |
| ``_*`` |
| Not imported by ``from module import *``. The special identifier ``_`` is used |
| in the interactive interpreter to store the result of the last evaluation; it is |
| stored in the :mod:`__builtin__` module. When not in interactive mode, ``_`` |
| has no special meaning and is not defined. See section :ref:`import`. |
| |
| .. note:: |
| |
|    The name ``_`` is often used in conjunction with internationalization; |
|    refer to the documentation for the :mod:`gettext` module for more |
|    information on this convention. |
| |
| ``__*__`` |
| System-defined names. These names are defined by the interpreter and its |
| implementation (including the standard library); applications should not expect |
| to define additional names using this convention. The set of names of this |
| class defined by Python may be extended in future versions. See section |
| :ref:`specialnames`. |
| |
| ``__*`` |
| Class-private names. Names in this category, when used within the context of a |
| class definition, are re-written to use a mangled form to help avoid name |
| clashes between "private" attributes of base and derived classes. See section |
| :ref:`atom-identifiers`. |
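
The mangling of class-private names can be observed directly. In this minimal sketch the class names ``Base`` and ``Derived`` and the attribute ``__token`` are hypothetical; each class stores its own copy of the attribute under a name prefixed with the class name.

```python
class Base(object):
    def __init__(self):
        # Inside Base, __token is rewritten to _Base__token.
        self.__token = "base"


class Derived(Base):
    def __init__(self):
        Base.__init__(self)
        # Inside Derived, the same spelling becomes _Derived__token,
        # so it does not overwrite the attribute set by Base.
        self.__token = "derived"


obj = Derived()
mangled_names = sorted(vars(obj))
```

Both attributes coexist on the instance, which is the name clash the mangled form is meant to avoid.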
| |
| |
| .. _literals: |
| |
| Literals |
| ======== |
| |
| .. index:: |
| |
| .. index:: single: string literal |
| |
| String literals are described by the following lexical definitions: |
| |
| .. index:: single: ASCII@ASCII |
| |
| .. productionlist:: |
|    stringliteral: [`stringprefix`](`shortstring` | `longstring`) |
|    stringprefix: "r" | "u" | "ur" | "R" | "U" | "UR" | "Ur" | "uR" |
|    shortstring: "'" `shortstringitem`* "'" | '"' `shortstringitem`* '"' |
|    longstring: "'''" `longstringitem`* "'''" |
|              : | '"""' `longstringitem`* '"""' |
|    shortstringitem: `shortstringchar` | `escapeseq` |
|    longstringitem: `longstringchar` | `escapeseq` |
|    shortstringchar: <any source character except "\" or newline or the quote> |
|    longstringchar: <any source character except "\"> |
|    escapeseq: "\" <any ASCII character> |
| |
| One syntactic restriction not indicated by these productions is that whitespace |
| is not allowed between the :token:`stringprefix` and the rest of the string |
| literal. The source character set is defined by the encoding declaration; it is |
| ASCII if no encoding declaration is given in the source file; see section |
| .. index:: |
| single: triple-quoted string |
| single: Unicode Consortium |
| single: string; Unicode |
| single: raw string |
| |
| In plain English: String literals can be enclosed in matching single quotes |
| (``'``) or double quotes (``"``). They can also be enclosed in matching groups |
| of three single or double quotes (these are generally referred to as |
| *triple-quoted strings*). The backslash (``\``) character is used to escape |
| characters that otherwise have a special meaning, such as newline, backslash |
| itself, or the quote character. String literals may optionally be prefixed with |
| a letter ``'r'`` or ``'R'``; such strings are called :dfn:`raw strings` and use |
| different rules for interpreting backslash escape sequences. A prefix of |
| ``'u'`` or ``'U'`` makes the string a Unicode string. Unicode strings use the |
| Unicode character set as defined by the Unicode Consortium and ISO 10646. Some |
| additional escape sequences, described below, are available in Unicode strings. |
| The two prefix characters may be combined; in this case, ``'u'`` must appear |
| before ``'r'``. |
| |
| In triple-quoted strings, unescaped newlines and quotes are allowed (and are |
| retained), except that three unescaped quotes in a row terminate the string. (A |
| "quote" is the character used to open the string, i.e. either ``'`` or ``"``.) |
| |
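The difference between plain, raw and triple-quoted literals can be sketched with a few assignments. The variable names are illustrative, and the ``ur`` prefix is left out because not every Python version accepts it.

```python
# In a plain literal, \n is an escape sequence for a single newline.
plain = 'a\nb'

# In a raw literal, the backslash and the letter n are both kept.
raw = r'a\nb'

# A triple-quoted literal may contain unescaped newlines and quotes;
# both are retained in the resulting string.
block = '''He said "hi",
then left.'''
```

Comparing ``len(plain)`` with ``len(raw)`` shows the effect of the prefix: the raw form is one character longer.
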
| Multilingual Plane (BMP) will be encoded using a surrogate pair if Python is |
| compiled to use 16-bit code units (the default). Individual code units which |
| form parts of a surrogate pair can be encoded using this escape sequence. |
| |
| (3) |
| As in Standard C, up to three octal digits are accepted. |
| |
| (4) |
| Unlike in Standard C, exactly two hex digits are required. |
| |
| (5) |
| In a string literal, hexadecimal and octal escapes denote the byte with the |
| given value; it is not necessary that the byte encodes a character in the source |
| character set. In a Unicode literal, these escapes denote a Unicode character |
| with the given value. |
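
Notes (3) and (4) can be checked with a few literals. This is a small sketch with illustrative variable names: an octal escape stops after at most three digits and a hexadecimal escape after exactly two, so any further digit is an ordinary character.

```python
# Three octal digits: 101 (octal) is 65, the code of the letter A.
octal_a = "\101"

# A fourth digit is not part of the escape; it follows as a plain "1".
octal_then_digit = "\1011"

# Two hex digits: 41 (hex) is also 65.
hex_a = "\x41"

# A third hex digit is likewise an ordinary character.
hex_then_digit = "\x411"
```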
| |
| .. index:: single: unrecognized escape sequence |
| .. index:: |
| single: number |
| single: numeric literal |
| single: integer literal |
| single: plain integer literal |
| single: long integer literal |
| single: floating point literal |
| single: hexadecimal literal |
| single: binary literal |
| single: octal literal |
| single: decimal literal |
| single: imaginary literal |
| single: complex; literal |
| |
| There are four types of numeric literals: plain integers, long integers, |
| floating point numbers, and imaginary numbers. There are no complex literals |
| (complex numbers can be formed by adding a real number and an imaginary number). |
| |
| Integer and long integer literals |
| --------------------------------- |
| |
| Integer and long integer literals are described by the following lexical |
| definitions: |
| |
| .. productionlist:: |
|    longinteger: `integer` ("l" | "L") |
|    integer: `decimalinteger` | `octinteger` | `hexinteger` | `bininteger` |
|    decimalinteger: `nonzerodigit` `digit`* | "0" |
|    octinteger: "0" ("o" | "O") `octdigit`+ | "0" `octdigit`+ |
|    hexinteger: "0" ("x" | "X") `hexdigit`+ |
|    bininteger: "0" ("b" | "B") `bindigit`+ |
|    nonzerodigit: "1"..."9" |
|    octdigit: "0"..."7" |
|    bindigit: "0" | "1" |
|    hexdigit: `digit` | "a"..."f" | "A"..."F" |
| |
| Although both lower case ``'l'`` and upper case ``'L'`` are allowed as suffix |
| for long integers, it is strongly recommended to always use ``'L'``, since the |
| letter ``'l'`` looks too much like the digit ``'1'``. |
| |
| Plain integer literals that are above the largest representable plain integer |
| (e.g., 2147483647 when using 32-bit arithmetic) are accepted as if they were |
| long integers instead. [#]_ There is no limit for long integer literals apart |
| from what can be stored in available memory. |
| |
| Some examples of plain integer literals (first row) and long integer literals |
| (second and third rows):: |
| |
|    7     2147483647                        0177 |
|    3L    79228162514264337593543950336L    0377L   0x100000000L |
|          79228162514264337593543950336             0xdeadbeef |
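
The values denoted by the different radix prefixes can be compared directly. This sketch deliberately uses only the ``0o``, ``0x`` and ``0b`` forms, which newer interpreters also accept; the bare leading-zero octal form (``0177``) and the ``L`` suffix are omitted because Python 3 rejects them. Variable names are illustrative.

```python
octal_value = 0o177          # 1*64 + 7*8 + 7
hex_value = 0xdeadbeef
binary_value = 0b101         # 4 + 0 + 1

# A literal beyond the plain integer range is still accepted and keeps
# its exact value; this one is 2 to the power 96.
big_value = 79228162514264337593543950336
```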
| |
| |
| .. _floating: |
| |
| Floating point literals |
| ----------------------- |
| |
| Floating point literals are described by the following lexical definitions: |
| |
| .. productionlist:: |
|    floatnumber: `pointfloat` | `exponentfloat` |
|    pointfloat: [`intpart`] `fraction` | `intpart` "." |
|    exponentfloat: (`intpart` | `pointfloat`) `exponent` |
|    intpart: `digit`+ |
|    fraction: "." `digit`+ |
|    exponent: ("e" | "E") ["+" | "-"] `digit`+ |
| |
| Note that the integer and exponent parts of floating point numbers can look like |
| octal integers, but are interpreted using radix 10. For example, ``077e010`` is |
| legal, and denotes the same number as ``77e10``. The allowed range of floating |
| point literals is implementation-dependent. Some examples of floating point |
| literals:: |
| |
|    3.14    10.    .001    1e100    3.14e-10    0e0 |
| |
| |
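
The radix-10 rule for the integer and exponent parts can be verified with the literal quoted above. The variable names in this short sketch are illustrative.

```python
# Leading zeros do not make the parts octal: both spell 77 times 10**10.
with_leading_zeros = 077e010
without_leading_zeros = 77e10

# The integer part or the fraction may be omitted, but not both.
trailing_point = 10.
leading_point = .001
```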
| .. _imaginary: |
| |
| Imaginary literals |
| ------------------ |
| |
| Imaginary literals are described by the following lexical definitions: |
| |
| .. productionlist:: |
|    imagnumber: (`floatnumber` | `intpart`) ("j" | "J") |
| |
| An imaginary literal yields a complex number with a real part of 0.0. Complex |
| numbers are represented as a pair of floating point numbers and have the same |
| restrictions on their range. To create a complex number with a nonzero real |
| part, add a floating point number to it, e.g., ``(3+4j)``. Some examples of |
| imaginary literals:: |
| |
|    3.14j   10.j    10j     .001j   1e100j  3.14e-10j |
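
A short sketch of how imaginary literals combine into complex numbers, using illustrative variable names. Note that ``3+4j`` is an expression (an integer literal added to an imaginary literal), not a single literal.

```python
# An imaginary literal alone yields a complex number with real part 0.0.
pure_imaginary = 10j

# A nonzero real part comes from adding a real number to the literal.
z = (3+4j)

magnitude = abs(z)           # sqrt(3**2 + 4**2)
```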
| |
| |
| .. _operators: |
| |
| Operators |
| ========= |
| |
| .. index:: single: operators |