Python source code encoding and unicode in 2.x.

Defining the source code encoding

Python 2.x assumes ASCII as the source code encoding if no other
encoding hints are given. Since Python 2.5, a non-ASCII byte in a
source file without such a hint is a SyntaxError.

To define a source code encoding, a magic comment must be placed
in the source file as either the first or the second line, such as:

      # coding=<encoding name>

or (using formats recognized by popular editors):

      # -*- coding: <encoding name> -*-

or:

      # vim: set fileencoding=<encoding name> :
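
As a rough illustration of what the declaration does, the sketch below
compiles a tiny module held in memory as Latin-1 bytes. The module
contents and the names source, namespace and text are made up for this
example. It relies on compile() honoring the magic comment when given a
byte string, and is written so that it also runs on Python 3.

```python
# Bytes of a two-line module saved as Latin-1: the byte 0xE0 is "a grave" there.
source = b"# -*- coding: latin-1 -*-\ntext = u'\xe0'\n"

# Compile and run the module in an empty namespace (hypothetical demo setup).
namespace = {}
exec(compile(source, "<demo>", "exec"), namespace)

# The magic comment told the parser how to decode the bytes of the literal,
# so the result should be the single code point U+00E0.
print(namespace["text"] == u"\u00e0")
```

Without the first line of that module, the parser would have no hint
about how to interpret the byte 0xE0.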

More precisely, the first or second line must match the regular
expression "^[ \t\v]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)".
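
The regular expression can be tried out directly with the re module.
This is only a sketch: the helper name declared_encoding is
hypothetical, and it applies nothing but the pattern quoted above to the
first two lines, without reproducing any other check the interpreter
performs.

```python
import re

# Pattern copied from the text above.
CODING_RE = re.compile(r"^[ \t\v]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def declared_encoding(source_lines):
    """Return the encoding named in the first two lines, or None."""
    for line in source_lines[:2]:
        match = CODING_RE.match(line)
        if match:
            return match.group(1)
    return None


print(declared_encoding(["#!/usr/bin/env python", "# -*- coding: utf-8 -*-"]))  # utf-8
print(declared_encoding(["# vim: set fileencoding=latin-1 :"]))                 # latin-1
print(declared_encoding(["import os", "x = 1", "# coding=utf-8"]))              # None
```

The last call returns None because the declaration sits on the third
line, which is never examined.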

Python 2.x unicode

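  • Unicode type:
    • basestring is the common base class of str and unicode
    • isinstance(obj, basestring) is equivalent to isinstance(obj, (str, unicode))
    • unicode(string[, encoding, errors]) decodes a byte string into a unicode object
  • Unicode literal:
    • u'àôî'
    • U'ÀÔÎ' (the prefix may be lowercase or uppercase)
    • \xhh escape: the character with the two-digit hex value hh
    • \uxxxx escape: the character with the four-digit hex code point xxxx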
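
The points above can be tried out with a short sketch. The names
string_types, raw and text are made up for this example, and the
try/except fallback is an assumption added only so the snippet also runs
on Python 3, where basestring no longer exists. Escapes are used instead
of accented characters so the snippet itself stays pure ASCII.

```python
try:
    string_types = basestring   # Python 2.x: base class of str and unicode
except NameError:
    string_types = str          # assumption: fallback so the sketch runs on 3.x

# UTF-8 bytes for the three characters U+00E0, U+00F4, U+00EE.
raw = b"\xc3\xa0\xc3\xb4\xc3\xae"

# On 2.x, unicode(raw, "utf-8") performs the same decoding step.
text = raw.decode("utf-8")

print(isinstance(text, string_types))  # True
print(len(raw))                        # 6 bytes
print(len(text))                       # 3 characters
print(text == u"\xe0\xf4\xee")         # True: \x escapes in a unicode literal
print(u"\xe0" == u"\u00e0")            # True: \x and \u name the same code point
```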

