| |
| In the following examples, input and output are distinguished by the presence or |
| absence of prompts (``>>>`` and ``...``): to repeat the example, you must type |
| everything after the prompt, when the prompt appears; lines that do not begin |
| with a prompt are output from the interpreter. Note that a secondary prompt on a |
| line by itself in an example means you must type a blank line; this is used to |
| end a multi-line command. |
| |
| .. % |
| .. % \footnote{ |
| .. % I'd prefer to use different fonts to distinguish input |
| .. % from output, but the amount of LaTeX hacking that would require |
| .. % is currently beyond my ability. |
| .. % } |
| |
| Many of the examples in this manual, even those entered at the interactive |
| prompt, include comments. Comments in Python start with the hash character, |
| ``#``, and extend to the end of the physical line. A comment may appear at the |
| start of a line or following whitespace or code, but not within a string |
| literal. A hash character within a string literal is just a hash character. |
| Since comments are to clarify code and are not interpreted by Python, they may |
| be omitted when typing in examples. |
| |
| Some examples:: |
| |
| # this is the first comment |
| SPAM = 1 # and this is the second comment |
| # ... and now a third! |
| STRING = "# This is not a comment." |
| |
| >>> x = y = z = 0 # Zero x, y and z |
| >>> x |
| 0 |
| >>> y |
| 0 |
| >>> z |
| 0 |
| |
| Variables must be "defined" (assigned a value) before they can be used, or an |
| error will occur:: |
| |
| >>> # try to access an undefined variable |
| ... n |
| Traceback (most recent call last): |
| File "<stdin>", line 1, in <module> |
| NameError: name 'n' is not defined |
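
The same failure can be reproduced in a script rather than at the prompt. The
following is a minimal sketch; the names ``undefined_name``, ``defined_name``
and ``message`` are illustrative assumptions and do not come from the example
above.

```python
# Looking up a name that was never assigned raises NameError.
try:
    undefined_name          # hypothetical name, deliberately never assigned
except NameError as exc:
    message = str(exc)

print(message)

# Once a value is assigned, the name can be used normally.
defined_name = 3
print(defined_name + 1)     # 4
```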
| |
| There is full support for floating point; operators with mixed type operands |
| convert the integer operand to floating point:: |
| |
| >>> 3 * 3.75 / 1.5 |
| 7.5 |
| >>> 7.0 / 2 |
| 3.5 |
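
As a small illustration of the conversion rule (the variable names are
assumptions made for this sketch), the result of a mixed expression is a float
even when its value happens to be a whole number:

```python
mixed = 3 * 3.75 / 1.5    # int * float / float
whole = 2 * 3.5           # the value is 7, but the type is still float

print(mixed)                       # 7.5
print(whole)                       # 7.0
print(isinstance(whole, float))    # True
```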
| |
| Note that newlines still need to be embedded in the string using ``\n``; the |
| newline following the trailing backslash is discarded. This example would print |
| the following:: |
| |
| This is a rather long string containing |
| several lines of text just as you would do in C. |
| Note that whitespace at the beginning of the line is significant. |
| |
| If we make the string literal a "raw" string, however, the ``\n`` sequences are |
| not converted to newlines, but the backslash at the end of the line, and the |
| newline character in the source, are both included in the string as data. Thus, |
| the example:: |
| |
| hello = r"This is a rather long string containing\n\ |
| several lines of text much as you would do in C." |
| |
| print hello |
| |
| would print:: |
| |
| This is a rather long string containing\n\ |
| several lines of text much as you would do in C. |
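
To make the difference concrete, a rough sketch (the names ``cooked`` and
``raw`` are illustrative) compares a normal literal with its raw counterpart.
In the normal literal ``\n`` becomes a single newline character; in the raw
literal it stays as two characters, a backslash followed by ``n``.

```python
cooked = "first\nsecond"    # \n is converted to one newline character
raw = r"first\nsecond"      # \n stays as a backslash followed by 'n'

print(len(cooked))          # 12
print(len(raw))             # 13
print("\n" in raw)          # False
```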
| |
| Or, strings can be surrounded in a pair of matching triple-quotes: ``"""`` or |
| ``'''``. End of lines do not need to be escaped when using triple-quotes, but |
| they will be included in the string. :: |
| |
| print """ |
| Usage: thingy [OPTIONS] |
| -h Display this usage message |
| -H hostname Hostname to connect to |
| """ |
| |
| produces the following output:: |
| |
| Usage: thingy [OPTIONS] |
| -h Display this usage message |
| -H hostname Hostname to connect to |
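
A hedged sketch of what "included in the string" means: the line breaks typed
inside triple quotes become ``\n`` characters in the value. The variable name
``usage`` is an assumption made for illustration.

```python
usage = """
Usage: thingy [OPTIONS]
     -h                        Display this usage message
"""

# The value starts and ends with a newline because the text begins on the
# line after the opening quotes and the closing quotes sit on their own line.
print(usage.startswith("\n"))   # True
print(usage.endswith("\n"))     # True
print(usage.count("\n"))        # 3
```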
| |
| The interpreter prints the result of string operations in the same way as they |
| are typed for input: inside quotes, and with quotes and other funny characters |
| escaped by backslashes, to show the precise value. The string is enclosed in |
| double quotes if the string contains a single quote and no double quotes, else |
| it's enclosed in single quotes. (The :keyword:`print` statement, described |
| later, can be used to write strings without quotes or escapes.) |
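
The quoting rule can be checked with the built-in :func:`repr`, which returns
the same text the interpreter echoes at the prompt. This is only a sketch; the
sample strings and variable names are made up for illustration.

```python
plain = repr("spam eggs")
with_single = repr("doesn't")
with_both = repr('"Isn\'t," she said.')

print(plain)         # 'spam eggs'
print(with_single)   # "doesn't"
print(with_both)     # '"Isn\'t," she said.'
```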
| |
| Slice indices have useful defaults; an omitted first index defaults to zero, an |
| omitted second index defaults to the size of the string being sliced. :: |
| |
| >>> word[:2] # The first two characters |
| 'He' |
| >>> word[2:] # Everything except the first two characters |
| 'lpA' |
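
One consequence of these defaults, sketched below, is that splitting a string
at any position and joining the two halves gives back the original string. The
value of ``word`` is assumed to be ``'HelpA'``, as the outputs above suggest.

```python
word = 'HelpA'   # assumed value, inferred from the outputs above

for i in range(len(word) + 1):
    # word[:i] runs from the start to i, word[i:] from i to the end.
    assert word[:i] + word[i:] == word

print(word[:2] + '|' + word[2:])   # He|lpA
```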
| |
| Unlike a C string, Python strings cannot be changed. Assigning to an indexed |
| position in the string results in an error:: |
| |
| >>> word[0] = 'x' |
| Traceback (most recent call last): |
| File "<stdin>", line 1, in ? |
| TypeError: object doesn't support item assignment |
| >>> word[:1] = 'Splat' |
| Traceback (most recent call last): |
| |
| >>> word[-100:] |
| 'HelpA' |
| >>> word[-10] # error |
| Traceback (most recent call last): |
| File "<stdin>", line 1, in ? |
| IndexError: string index out of range |
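
The contrast shown above, where an out-of-range slice bound is tolerated but
an out-of-range single index is not, can be sketched as follows. The variable
``error`` is an illustrative name.

```python
word = 'HelpA'

print(word[-100:])    # HelpA (the slice bound is truncated to the start)
print(word[1:100])    # elpA  (the slice bound is truncated to the end)
print(word[10:])      # prints an empty line: the slice is empty

try:
    word[-10]         # a single index is not truncated
except IndexError as exc:
    error = str(exc)

print(error)
```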
| |
| One way to remember how slices work is to think of the indices as pointing |
| *between* characters, with the left edge of the first character numbered 0. |
| Then the right edge of the last character of a string of *n* characters has |
| index *n*, for example:: |
| |
|  +---+---+---+---+---+ |
|  | H | e | l | p | A | |
|  +---+---+---+---+---+ |
|  0   1   2   3   4   5 |
| -5  -4  -3  -2  -1 |
| |
| The first row of numbers gives the position of the indices 0...5 in the string; |
| the second row gives the corresponding negative indices. The slice from *i* to |
| *j* consists of all characters between the edges labeled *i* and *j*, |
| respectively. |
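
The edge model can be checked directly. In this sketch (``i``, ``j`` and
``piece`` are illustrative names) the slice between edges 1 and 3 contains
exactly the two characters that lie between them, and the negative labels from
the diagram name the same edges.

```python
word = 'HelpA'
i, j = 1, 3

piece = word[i:j]                  # characters between edges 1 and 3
print(piece)                       # el
print(len(piece) == j - i)         # True, both indices are within bounds

# Edge -4 is the same position as edge 1, and edge -2 the same as edge 3.
print(word[-4:-2] == word[1:3])    # True
```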
| |
| For non-negative indices, the length of a slice is the difference of the |
| |
| >>> s = 'supercalifragilisticexpialidocious' |
| >>> len(s) |
| 34 |
| |
| |
| .. seealso:: |
| |
| :ref:`typesseq` |
| Strings, and the Unicode strings described in the next section, are |
| examples of *sequence types*, and support the common operations supported |
| by such types. |
| |
| :ref:`string-methods` |
| Both strings and Unicode strings support a large number of methods for |
| basic transformations and searching. |
| |
| :ref:`new-string-formatting` |
| Information about string formatting with :meth:`str.format` is described |
| here. |
| |
| :ref:`string-formatting` |
| The old formatting operations invoked when strings and Unicode strings are |
| the left operand of the ``%`` operator are described in more detail here. |
| |
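As a brief, hedged illustration of the two formatting styles referenced above
(the values and variable names are made up), both calls below build the same
string:

```python
name, count = 'spam', 3

new_style = '{0} x {1}'.format(name, count)   # str.format
old_style = '%s x %d' % (name, count)         # the % operator

print(new_style)                # spam x 3
print(new_style == old_style)   # True
```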
| |
| .. _tut-unicodestrings: |
| |
| Unicode Strings |
| --------------- |
| |
| .. sectionauthor:: Marc-Andre Lemburg <mal@lemburg.com> |
| |
| |
| Starting with Python 2.0 a new data type for storing text data is available to |
| the programmer: the Unicode object. It can be used to store and manipulate |
| Unicode data (see http://www.unicode.org/) and integrates well with the existing |
| string objects, providing auto-conversions where necessary. |
| |
| Unicode has the advantage of providing one ordinal for every character in every |
| script used in modern and ancient texts. Previously, there were only 256 |
| possible ordinals for script characters. Texts were typically bound to a code |
| page which mapped the ordinals to script characters. This led to a great deal |
| of confusion, especially with respect to internationalization (usually written as |
| ``i18n`` --- ``'i'`` + 18 characters + ``'n'``) of software. Unicode solves |
| these problems by defining one code page for all scripts. |
| |
| Other characters are interpreted by using their respective ordinal values |
| directly as Unicode ordinals. If you have literal strings in the standard |
| Latin-1 encoding that is used in many Western countries, you will find it |
| convenient that the lower 256 characters of Unicode are the same as the 256 |
| characters of Latin-1. |
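
The correspondence between Latin-1 and the first 256 Unicode ordinals can be
verified with a short sketch. It uses ``bytearray`` so that the same code runs
on Python 2.6+ and Python 3; the name ``matches`` is an assumption.

```python
# Decode each single byte 0..255 as Latin-1 and compare the resulting
# character's Unicode ordinal with the byte value.
matches = all(
    ord(bytearray([i]).decode('latin-1')) == i
    for i in range(256)
)

print(matches)   # True
```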
| |
| For experts, there is also a raw mode just like the one for normal strings. You |
| have to prefix the opening quote with 'ur' to have Python use the |
| *Raw-Unicode-Escape* encoding. It will only apply the above ``\uXXXX`` |
| conversion if there is an odd number of backslashes in front of the small |
| 'u'. :: |
| |
| >>> ur'Hello\u0020World !' |
| u'Hello World !' |
| >>> ur'Hello\\u0020World !' |
| u'Hello\\\\u0020World !' |
| |
| The raw mode is most useful when you have to enter lots of backslashes, as can |
| be necessary in regular expressions. |
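
A minimal sketch of why raw mode helps with regular expressions. It uses the
plain ``r`` prefix rather than ``ur`` so that it also runs on Python 3, where
the ``ur`` prefix is not accepted; the pattern and names are illustrative.

```python
import re

raw_pattern = r'(\d+)\.(\d+)'         # backslashes typed once
plain_pattern = '(\\d+)\\.(\\d+)'     # every backslash doubled

print(raw_pattern == plain_pattern)   # True

match = re.match(raw_pattern, '2.7')
print(match.groups())                 # ('2', '7')
```
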
| [123, 'bletch', 'xyzzy', 1234, 123, 'bletch', 'xyzzy', 1234] |
| >>> # Clear the list: replace all items with an empty list |
| >>> a[:] = [] |
| >>> a |
| [] |
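
Assigning to the full slice empties the existing list object rather than
creating a new one. The sketch below shows the difference from rebinding the
name; ``alias`` is a hypothetical second name introduced for illustration.

```python
a = [123, 'bletch', 'xyzzy', 1234]
alias = a             # a second name for the same list object

a[:] = []             # empties the existing list in place
print(alias)          # []

a = [1, 2]            # rebinding makes 'a' refer to a new list
print(alias)          # [] (the old object is unchanged)
print(alias is a)     # False
```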
| |
| The built-in function :func:`len` also applies to lists:: |
| |
| >>> a = ['a', 'b', 'c', 'd'] |
| >>> len(a) |
| 4 |
| |
| It is possible to nest lists (create lists containing other lists), for |
| example:: |
| |
| >>> q = [2, 3] |
| >>> p = [1, q, 4] |
| >>> len(p) |
| 3 |
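
A short sketch of how the nested list behaves: the inner list is stored by
reference, so ``p[1]`` and ``q`` are the same object and a change made through
one name is visible through the other.

```python
q = [2, 3]
p = [1, q, 4]

print(p[1])           # [2, 3]
print(p[1][0])        # 2

p[1].append('xtra')   # modifies the inner list in place
print(q)              # [2, 3, 'xtra']
print(p[1] is q)      # True
```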