| of bug fixes and improvements are always being submitted. A host of minor fixes, |
| a few optimizations, additional docstrings, and better error messages went into |
| 2.0; to list them all would be impossible, but they're certainly significant. |
| Consult the publicly-available CVS logs if you want to see the full list. This |
| progress is due to the five developers working for PythonLabs, who are now getting |
| paid to spend their days fixing bugs, and also due to the improved communication |
| resulting from moving to SourceForge. |
| |
n | .. % ====================================================================== |
n | .. ====================================================================== |
| |
| |
| What About Python 1.6? |
| ====================== |
| |
| Python 1.6 can be thought of as the Contractual Obligations Python release. |
| After the core development team left CNRI in May 2000, CNRI requested that a 1.6 |
| release be created, containing all the work on Python that had been performed at |
| and 2.0beta1 releases were made on the same day (September 5, 2000), the plan |
| being to finalize Python 2.0 within a month or so. If you have applications to |
| maintain, there seems little point in breaking things by moving to 1.6, fixing |
| them, and then having another round of breakage within a month by moving to 2.0; |
| you're better off just going straight to 2.0. Most of the really interesting |
| features described in this document are only in 2.0, because a lot of work was |
| done between May and September. |
| |
n | .. % ====================================================================== |
n | .. ====================================================================== |
| |
| |
| New Development Process |
| ======================= |
| |
| The most important change in Python 2.0 may not be to the code at all, but to |
| how Python is developed: in May 2000 the Python developers began using the tools |
| made available by SourceForge for storing source code, tracking bug reports, |
| and managing the queue of patch submissions. To report bugs or submit patches |
| for Python 2.0, use the bug tracking and patch manager tools available from |
n | Python's project page, located at `<http://sourceforge.net/projects/python/>`_. |
n | Python's project page, located at http://sourceforge.net/projects/python/. |
| |
| The most important of the services now hosted at SourceForge is the Python CVS |
| tree, the version-controlled repository containing the source code for Python. |
| Previously, only about 7 people had write access to the CVS |
| tree, and all patches had to be inspected and checked in by one of the people on |
| this short list. Obviously, this wasn't very scalable. By moving the CVS tree |
| to SourceForge, it became possible to grant write access to more people; as of |
| September 2000 there were 27 people able to check in changes, a fourfold |
| We intend PEPs to be the primary mechanisms for proposing new features, for |
| collecting community input on an issue, and for documenting the design decisions |
| that have gone into Python. The PEP author is responsible for building |
| consensus within the community and documenting dissenting opinions. |
| |
| Read the rest of PEP 1 for the details of the PEP editorial process, style, and |
| format. PEPs are kept in the Python CVS tree on SourceForge, though they're not |
| part of the Python 2.0 distribution, and are also available in HTML form from |
n | `<http://www.python.org/peps/>`_. As of September 2000, there are 25 PEPS, |
n | http://www.python.org/peps/. As of September 2000, there are 25 PEPS, ranging |
| ranging from PEP 201, "Lockstep Iteration", to PEP 225, "Elementwise/Objectwise |
| from PEP 201, "Lockstep Iteration", to PEP 225, "Elementwise/Objectwise |
| Operators". |
| |
n | .. % ====================================================================== |
n | .. ====================================================================== |
| |
| |
| Unicode |
| ======= |
| |
| The largest new feature in Python 2.0 is a new fundamental data type: Unicode |
| strings. Unicode uses 16-bit numbers to represent characters instead of the |
| 8-bit number used by ASCII, meaning that 65,536 distinct characters can be |
| In Python source code, Unicode strings are written as ``u"string"``. Arbitrary |
| Unicode characters can be written using a new escape sequence, ``\uHHHH``, where |
| *HHHH* is a 4-digit hexadecimal number from 0000 to FFFF. The existing |
| ``\xHHHH`` escape sequence can also be used, and octal escapes can be used for |
| characters up to U+01FF, which is represented by ``\777``. |
| |
| Unicode strings, just like regular strings, are an immutable sequence type. |
| They can be indexed and sliced, but not modified in place. Unicode strings have |
n | an :meth:`encode( [encoding] )` method that returns an 8-bit string in the |
n | an ``encode( [encoding] )`` method that returns an 8-bit string in the desired |
| desired encoding. Encodings are named by strings, such as ``'ascii'``, |
| encoding. Encodings are named by strings, such as ``'ascii'``, ``'utf-8'``, |
| ``'utf-8'``, ``'iso-8859-1'``, or whatever. A codec API is defined for |
| ``'iso-8859-1'``, or whatever. A codec API is defined for implementing and |
| implementing and registering new encodings that are then available throughout a |
| registering new encodings that are then available throughout a Python program. |
| Python program. If an encoding isn't specified, the default encoding is usually |
| If an encoding isn't specified, the default encoding is usually 7-bit ASCII, |
| 7-bit ASCII, though it can be changed for your Python installation by calling |
| though it can be changed for your Python installation by calling the |
| the :func:`sys.setdefaultencoding(encoding)` function in a customised version of |
| :func:`sys.setdefaultencoding(encoding)` function in a customised version of |
| :file:`site.py`. |
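As a rough illustration of the ``encode()`` method described above, the sketch below encodes one Unicode string in several encodings. It assumes that ``encode()`` accepts the same ``'strict'``, ``'replace'`` and ``'ignore'`` error-handling names as ``unicode()``; the variable names are made up for the example.

```python
# A Unicode string containing one non-ASCII character (U+00E9, e with acute).
text = u"caf\u00e9"

# Encoding yields an 8-bit string in the requested encoding.
utf8_bytes = text.encode("utf-8")          # two bytes for U+00E9
latin1_bytes = text.encode("iso-8859-1")   # one byte for U+00E9

# Characters that do not fit the target encoding are handled according
# to the error-handling argument (assumed to match unicode()'s names).
ascii_replaced = text.encode("ascii", "replace")
ascii_ignored = text.encode("ascii", "ignore")

# With the default strict handling, the same call raises an exception.
try:
    text.encode("ascii")
    strict_failed = False
except UnicodeError:
    strict_failed = True
```

Here ``utf8_bytes`` is one byte longer than ``latin1_bytes``, because UTF-8 needs two bytes for U+00E9 while ISO-8859-1 needs only one.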
| |
| Combining 8-bit and Unicode strings always coerces to Unicode, using the default |
| ASCII encoding; the result of ``'a' + u'bc'`` is ``u'abc'``. |
| |
| New built-in functions have been added, and existing built-ins modified to |
| support Unicode: |
| |
| * ``unichr(ch)`` returns a Unicode string 1 character long, containing the |
| character *ch*. |
| |
| * ``ord(u)``, where *u* is a 1-character regular or Unicode string, returns the |
| number of the character as an integer. |
| |
n | * ``unicode(string [, *encoding*] [, *errors*] )`` creates a Unicode string |
n | * ``unicode(string [, encoding] [, errors] )`` creates a Unicode string |
| from an 8-bit string. ``encoding`` is a string naming the encoding to use. The |
| ``errors`` parameter specifies the treatment of characters that are invalid for |
| the current encoding; passing ``'strict'`` as the value causes an exception to |
| be raised on any encoding error, while ``'ignore'`` causes errors to be silently |
| ignored and ``'replace'`` uses U+FFFD, the official replacement character, in |
| case of any problems. |
| |
| * The :keyword:`exec` statement, and various built-ins such as ``eval()``, |
| purpose, but they require a function as one of their arguments. This is fine if |
| there's an existing built-in function that can be passed directly, but if there |
| isn't, you have to create a little function to do the required work, and |
| Python's scoping rules make the result ugly if the little function needs |
| additional information. Take the first example in the previous paragraph, |
| finding all the strings in the list containing a given substring. You could |
| write the following to do it:: |
| |
n | # Given the list L, make a list of all strings |
n | # Given the list L, make a list of all strings |
| # containing the substring S. |
n | sublist = filter( lambda s, substring=S: |
n | sublist = filter( lambda s, substring=S: |
| string.find(s, substring) != -1, |
n | L) |
n | L) |
| |
| Because of Python's scoping rules, a default argument is used so that the |
| anonymous function created by the :keyword:`lambda` statement knows what |
| substring is being searched for. List comprehensions make this cleaner:: |
| |
| sublist = [ s for s in L if string.find(s, S) != -1 ] |
| |
| List comprehensions have the form:: |
| |
n | [ expression for expr in sequence1 |
n | [ expression for expr in sequence1 |
| for expr2 in sequence2 ... |
n | for exprN in sequenceN |
n | for exprN in sequenceN |
| if condition ] |
| |
| The :keyword:`for`...\ :keyword:`in` clauses contain the sequences to be |
| iterated over. The sequences do not have to be the same length, because they |
| are *not* iterated over in parallel, but from left to right; this is explained |
| more clearly in the following paragraphs. The elements of the generated list |
| will be the successive values of *expression*. The final :keyword:`if` clause |
| is optional; if present, *expression* is only evaluated and added to the result |
| comprehension below is a syntax error, while the second one is correct:: |
| |
| # Syntax error |
| [ x,y for x in seq1 for y in seq2] |
| # Correct |
| [ (x,y) for x in seq1 for y in seq2] |
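To make the left-to-right iteration order concrete, here is a small sketch comparing a two-clause comprehension with the nested loops it corresponds to. The sequences and variable names are illustrative only.

```python
seq1 = "abc"
seq2 = (1, 2)

# Two for...in clauses: every element of seq1 is paired with every
# element of seq2, so the sequences need not be the same length.
pairs = [(x, y) for x in seq1 for y in seq2]

# The equivalent nested loops; the leftmost clause is the outermost loop.
expected = []
for x in seq1:
    for y in seq2:
        expected.append((x, y))

# An optional trailing if clause filters the generated elements.
filtered = [(x, y) for x in seq1 for y in seq2 if y != 1]
```

The result has ``len(seq1) * len(seq2)`` elements, and ``filtered`` keeps only the pairs whose second item is not 1.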
| |
| The idea of list comprehensions originally comes from the functional programming |
n | language Haskell (`<http://www.haskell.org>`_). Greg Ewing argued most |
n | language Haskell (http://www.haskell.org). Greg Ewing argued most effectively |
| effectively for adding them to Python and wrote the initial list comprehension |
| for adding them to Python and wrote the initial list comprehension patch, which |
| patch, which was then discussed for a seemingly endless time on the python-dev |
| was then discussed for a seemingly endless time on the python-dev mailing list |
| mailing list and kept up-to-date by Skip Montanaro. |
| and kept up-to-date by Skip Montanaro. |
| |
n | .. % ====================================================================== |
n | .. ====================================================================== |
| |
| |
| Augmented Assignment |
| ==================== |
| |
| Augmented assignment operators, another long-requested feature, have been added |
| to Python 2.0. Augmented assignment operators include ``+=``, ``-=``, ``*=``, |
| and so forth. For example, the statement ``a += 2`` increments the value of the |
| |
| The full list of supported assignment operators is ``+=``, ``-=``, ``*=``, |
| ``/=``, ``%=``, ``**=``, ``&=``, ``|=``, ``^=``, ``>>=``, and ``<<=``. Python |
| classes can override the augmented assignment operators by defining methods |
| named :meth:`__iadd__`, :meth:`__isub__`, etc. For example, the following |
| :class:`Number` class stores a number and supports using += to create a new |
| instance with an incremented value. |
| |
n | .. % The empty groups below prevent conversion to guillemets. |
n | .. The empty groups below prevent conversion to guillemets. |
| |
| :: |
| |
| class Number: |
| def __init__(self, value): |
| self.value = value |
| def __iadd__(self, increment): |
n | return Number( self.value + increment) |
n | return Number( self.value + increment) |
| |
| n = Number(5) |
| n += 3 |
| print n.value |
| |
| The :meth:`__iadd__` special method is called with the value of the increment, |
| and should return a new instance with an appropriately modified value; this |
| return value is bound as the new value of the variable on the left-hand side. |
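One consequence of this rebinding can be seen with a second name for the same object. The sketch below reuses the :class:`Number` class from above; ``alias`` is a name introduced only for this illustration.

```python
class Number:
    def __init__(self, value):
        self.value = value
    def __iadd__(self, increment):
        # Return a new instance rather than modifying self.
        return Number(self.value + increment)

n = Number(5)
alias = n      # a second name bound to the same instance
n += 3         # roughly equivalent to: n = n.__iadd__(3)

# n is now bound to the new instance returned by __iadd__;
# alias still refers to the original, unmodified instance.
```

Because this :meth:`__iadd__` returns a fresh instance, ``alias.value`` stays at 5 while ``n.value`` becomes 8. A class whose :meth:`__iadd__` modified ``self`` and returned it would instead change both names at once.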
| |
| Augmented assignment operators were first introduced in the C programming |
| language, and most C-derived languages, such as :program:`awk`, C++, Java, Perl, |
| and PHP also support them. The augmented assignment patch was implemented by |
| Thomas Wouters. |
| |
n | .. % ====================================================================== |
n | .. ====================================================================== |
| |
| |
| String Methods |
| ============== |
| |
| Until now string-manipulation functionality was in the :mod:`string` module, |
| which was usually a front-end for the :mod:`strop` module written in C. The |
| addition of Unicode posed a difficulty for the :mod:`strop` module, because the |
| ``s.endswith(t)`` is equivalent to ``s[-len(t):] == t``. |
| |
| One other method which deserves special mention is :meth:`join`. The |
| :meth:`join` method of a string receives one parameter, a sequence of strings, |
| and is equivalent to the :func:`string.join` function from the old :mod:`string` |
| module, with the arguments reversed. In other words, ``s.join(seq)`` is |
| equivalent to the old ``string.join(seq, s)``. |
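A brief sketch of the two equivalences just described, using made-up sample values:

```python
words = ["spam", "eggs", "ham"]

# The separator string is the object whose method is called;
# the sequence of strings is the argument.
joined = ", ".join(words)

filename = "report.txt"
suffix = ".txt"

# endswith() gives the same answer as comparing the trailing slice.
by_method = filename.endswith(suffix)
by_slice = filename[-len(suffix):] == suffix
```

Here ``joined`` is ``'spam, eggs, ham'``, and ``by_method`` and ``by_slice`` are both true.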
| |
n | .. % ====================================================================== |
n | .. ====================================================================== |
| |
| |
| Garbage Collection of Cycles |
| ============================ |
| |
| The C implementation of Python uses reference counting to implement garbage |
| collection. Every Python object maintains a count of the number of references |
| pointing to itself, and adjusts the count as references are created or |
| implementation of the cycle detection approach was written by Toby Kelsey. The |
| current algorithm was suggested by Eric Tiedemann during a visit to CNRI, and |
| Guido van Rossum and Neil Schemenauer wrote two different implementations, which |
| were later integrated by Neil. Lots of other people offered suggestions along |
| the way; the March 2000 archives of the python-dev mailing list contain most of |
| the relevant discussion, especially in the threads titled "Reference cycle |
| collection for Python" and "Finalization again". |
| |
n | .. % ====================================================================== |
n | .. ====================================================================== |
| |
| |
| Other Core Changes |
| ================== |
| |
| Various minor changes have been made to Python's syntax and built-in functions. |
| None of the changes are very far-reaching, but they're handy conveniences. |
| |
| are isomorphic. See the thread "trashcan and PR#7" in the April 2000 archives of |
| the python-dev mailing list for the discussion leading up to this |
| implementation, and some useful relevant links. Note that comparisons can now |
| also raise exceptions. In earlier versions of Python, a comparison operation |
| such as ``cmp(a,b)`` would always produce an answer, even if a user-defined |
| :meth:`__cmp__` method encountered an error, since the resulting exception would |
| simply be silently swallowed. |
| |
n | .. % Starting URL: |
n | .. Starting URL: |
| .. % http://www.python.org/pipermail/python-dev/2000-April/004834.html |
| .. http://www.python.org/pipermail/python-dev/2000-April/004834.html |
| |
| Work has been done on porting Python to 64-bit Windows on the Itanium processor, |
| mostly by Trent Mick of ActiveState. (Confusingly, ``sys.platform`` is still |
| ``'win32'`` on Win64 because it seems that for ease of porting, MS Visual C++ |
| treats code as 32 bit on Itanium.) PythonWin also supports Windows CE; see the |
n | Python CE page at `<http://starship.python.net/crew/mhammond/ce/>`_ for more |
n | Python CE page at http://starship.python.net/crew/mhammond/ce/ for more |
| information. |
| |
| Another new platform is Darwin/MacOS X; initial support for it is in Python 2.0. |
| Dynamic loading works, if you specify "configure --with-dyld --with-suffix=.x". |
| Consult the README in the Python source distribution for more instructions. |
| |
| An attempt has been made to alleviate one of Python's warts, the often-confusing |
| :exc:`NameError` exception when code refers to a local variable before the |
| |
| Dictionaries have an odd new method, :meth:`setdefault(key, default)`, which |
| behaves similarly to the existing :meth:`get` method. However, if the key is |
| missing, :meth:`setdefault` both returns the value of *default* as :meth:`get` |
| would do, and also inserts it into the dictionary as the value for *key*. Thus, |
| the following lines of code:: |
| |
| if dict.has_key( key ): return dict[key] |
n | else: |
n | else: |
| dict[key] = [] |
| return dict[key] |
| |
| can be reduced to a single ``return dict.setdefault(key, [])`` statement. |
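A common use of this idiom is collecting values into lists keyed by some property. The following is a minimal sketch; ``group_by_first_letter`` is a hypothetical helper written for this example, not part of the library.

```python
def group_by_first_letter(words):
    """Map each initial letter to the list of words starting with it."""
    groups = {}
    for word in words:
        # setdefault() returns the existing list for this key, or
        # inserts a new empty list and returns that instead.
        groups.setdefault(word[0], []).append(word)
    return groups

index = group_by_first_letter(["apple", "avocado", "banana"])
```

Note that the default argument (here a new empty list) is evaluated on every call, whether or not the key is already present.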
| |
| The interpreter sets a maximum recursion depth in order to catch runaway |
| recursion before filling the C stack and causing a core dump or GPF. |
| Previously this limit was fixed when you compiled Python, but in 2.0 the maximum |
| recursion depth can be read and modified using :func:`sys.getrecursionlimit` and |
| :func:`sys.setrecursionlimit`. The default value is 1000, and a rough maximum |
| value for a given platform can be found by running a new script, |
| :file:`Misc/find_recursionlimit.py`. |
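A short sketch of reading and adjusting the limit follows. The ``depth`` function and the increment of 500 are arbitrary choices for illustration; a safe upper bound depends on the platform.

```python
import sys

def depth(n):
    """Recurse n levels deep and return the depth reached."""
    if n == 0:
        return 0
    return 1 + depth(n - 1)

old_limit = sys.getrecursionlimit()

# Raise the limit temporarily, then restore the previous value.
sys.setrecursionlimit(old_limit + 500)
try:
    new_limit = sys.getrecursionlimit()
    reached = depth(100)
finally:
    sys.setrecursionlimit(old_limit)
```

Restoring the old value in a ``finally`` clause keeps the change from leaking into unrelated code if the recursive call fails.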
| |
n | .. % ====================================================================== |
n | .. ====================================================================== |
| |
| |
| Porting to 2.0 |
| ============== |
| |
| New Python releases try hard to be compatible with previous releases, and the |
| record has been pretty good. However, some changes are considered useful |
| enough, usually because they fix initial design decisions that turned out to be |
| ``'8.1'``. |
| |
| The ``-X`` command-line option, which turned all standard exceptions into |
| strings instead of classes, has been removed; the standard exceptions will now |
| always be classes. The :mod:`exceptions` module containing the standard |
| exceptions was translated from Python to a built-in C module, written by Barry |
| Warsaw and Fredrik Lundh. |
| |
n | .. % Commented out for now -- I don't think anyone will care. |
n | .. Commented out for now -- I don't think anyone will care. |
| .. % The pattern and match objects provided by SRE are C types, not Python |
| The pattern and match objects provided by SRE are C types, not Python |
| .. % class instances as in 1.5. This means you can no longer inherit from |
| class instances as in 1.5. This means you can no longer inherit from |
| .. % \class{RegexObject} or \class{MatchObject}, but that shouldn't be much |
| \class{RegexObject} or \class{MatchObject}, but that shouldn't be much |
| .. % of a problem since no one should have been doing that in the first |
| of a problem since no one should have been doing that in the first |
| .. % place. |
| place. |
| .. % ====================================================================== |
| .. ====================================================================== |
| |
| |
| Extending/Embedding Changes |
| =========================== |
| |
| Some of the changes are under the covers, and will only be apparent to people |
| writing C extension modules or embedding a Python interpreter in a larger |
| application. If you aren't dealing with Python's C API, you can safely skip |
| of these functions takes a module object, a null-terminated C string containing |
| the name to be added, and a third argument for the value to be assigned to the |
| name. This third argument is, respectively, a Python object, a C long, or a C |
| string. |
| |
| A wrapper API was added for Unix-style signal handlers. :func:`PyOS_getsig` gets |
| a signal handler and :func:`PyOS_setsig` will set a new handler. |
| |
n | .. % ====================================================================== |
n | .. ====================================================================== |
| |
| |
| Distutils: Making Modules Easy to Install |
| ========================================= |
| |
| Before Python 2.0, installing modules was a tedious affair -- there was no way |
| to figure out automatically where Python is installed, or what compiler options |
| to use for extension modules. Software authors had to go through an arduous |
| places to override defaults -- separating the build from the install, building |
| or installing in non-default directories, and more. |
| |
| In order to use the Distutils, you need to write a :file:`setup.py` script. For |
| the simple case, when the software contains only .py files, a minimal |
| :file:`setup.py` can be just a few lines long:: |
| |
| from distutils.core import setup |
n | setup (name = "foo", version = "1.0", |
n | setup (name = "foo", version = "1.0", |
| py_modules = ["module1", "module2"]) |
| |
| The :file:`setup.py` file isn't much more complicated if the software consists |
| of a few packages:: |
| |
| from distutils.core import setup |
n | setup (name = "foo", version = "1.0", |
n | setup (name = "foo", version = "1.0", |
| packages = ["package", "package.subpackage"]) |
| |
| A C extension can be the most complicated case; here's an example taken from |
| the PyXML package:: |
| |
| from distutils.core import setup, Extension |
| |
| expat_extension = Extension('xml.parsers.pyexpat', |
n | define_macros = [('XML_NS', None)], |
n | define_macros = [('XML_NS', None)], |
| include_dirs = [ 'extensions/expat/xmltok', |
| include_dirs = [ 'extensions/expat/xmltok', |
| 'extensions/expat/xmlparse' ], |
| 'extensions/expat/xmlparse' ], |
| sources = [ 'extensions/pyexpat.c', |
| sources = [ 'extensions/pyexpat.c', |
| 'extensions/expat/xmltok/xmltok.c', |
| 'extensions/expat/xmltok/xmltok.c', |
| 'extensions/expat/xmltok/xmlrole.c', |
| 'extensions/expat/xmltok/xmlrole.c', ] |
| ] |
| ) |
n | setup (name = "PyXML", version = "0.5.4", |
n | setup (name = "PyXML", version = "0.5.4", |
| ext_modules =[ expat_extension ] ) |
| |
| The Distutils can also take care of creating source and binary distributions. |
| The "sdist" command, run by "``python setup.py sdist``", builds a source |
| distribution such as :file:`foo-1.0.tar.gz`. Adding new commands isn't |
| difficult; "bdist_rpm" and "bdist_wininst" commands have already been |
| contributed to create an RPM distribution and a Windows installer for the |
| software, respectively. Commands to create other distribution formats such as |
| Debian packages and Solaris :file:`.pkg` files are in various stages of |
| development. |
| |
| All this is documented in a new manual, *Distributing Python Modules*, that |
| joins the basic set of Python documentation. |
| |
n | .. % ====================================================================== |
n | .. ====================================================================== |
| |
| |
| XML Modules |
| =========== |
| |
| Python 1.5.2 included a simple XML parser in the form of the :mod:`xmllib` |
| module, contributed by Sjoerd Mullender. Since 1.5.2's release, two different |
| interfaces for processing XML have become common: SAX2 (version 2 of the Simple |
| the different :class:`Node` classes and their various methods. |
| |
| |
| Relationship to PyXML |
| --------------------- |
| |
| The XML Special Interest Group has been working on XML-related Python code for a |
| while. Its code distribution, called PyXML, is available from the SIG's Web |
n | pages at `<http://www.python.org/sigs/xml-sig/>`_. The PyXML distribution also |
n | pages at http://www.python.org/sigs/xml-sig/. The PyXML distribution also used |
| used the package name ``xml``. If you've written programs that used PyXML, |
| the package name ``xml``. If you've written programs that used PyXML, you're |
| you're probably wondering about its compatibility with the 2.0 :mod:`xml` |
| probably wondering about its compatibility with the 2.0 :mod:`xml` package. |
| package. |
| |
| The answer is that Python 2.0's :mod:`xml` package isn't compatible with PyXML, |
| but can be made compatible by installing a recent version of PyXML. Many |
| applications can get by with the XML support that is included with Python 2.0, |
| but more complicated applications will require that the full PyXML package |
| be installed. When installed, PyXML versions 0.6.0 or greater will replace the |
| :mod:`xml` package shipped with Python, and will be a strict superset of the |
| standard package, adding a bunch of additional features. Some of the additional |
| features in PyXML include: |
| |
| * 4DOM, a full DOM implementation from FourThought, Inc. |
| |
| * The xmlproc validating parser, written by Lars Marius Garshol. |
| |
| * The :mod:`sgmlop` parser accelerator module, written by Fredrik Lundh. |
| |
n | .. % ====================================================================== |
n | .. ====================================================================== |
| |
| |
| Module changes |
| ============== |
| |
| Lots of improvements and bugfixes were made to Python's extensive standard |
| library; some of the affected modules include :mod:`readline`, |
| :mod:`ConfigParser`, :mod:`cgi`, :mod:`calendar`, :mod:`posix`, :mod:`readline`, |
| are archives produced by :program:`PKZIP` on DOS/Windows or :program:`zip` on |
| Unix, not to be confused with :program:`gzip`\ -format files (which are |
| supported by the :mod:`gzip` module). (Contributed by James C. Ahlstrom.) |
| |
| * :mod:`imputil`: A module that provides a simpler way for writing customised |
| import hooks, in comparison to the existing :mod:`ihooks` module. (Implemented |
| by Greg Stein, with much discussion on python-dev along the way.) |
| |
n | .. % ====================================================================== |
n | .. ====================================================================== |
| |
| |
| IDLE Improvements |
| ================= |
| |
| IDLE is the official Python cross-platform IDE, written using Tkinter. Python |
| 2.0 includes IDLE 0.6, which adds a number of new features and improvements. A |
| partial list: |