Python Encoding and Unicode

Source code encoding and Unicode in Python 2.x.

Defining the Python source encoding

Python will default to ASCII as standard encoding if no other
encoding hints are given.

To define a source code encoding, a magic comment must
be placed into the source files either as first or second
line in the file, such as:

      # coding=<encoding name>

or (using formats recognized by popular editors)

      #!/usr/bin/python
      # -*- coding: <encoding name> -*-

or

      #!/usr/bin/python
      # vim: set fileencoding=<encoding name> :

More precisely, the first or second line must match the regular
expression "^[ \t\v]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)".
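
The rule above can be tried out with a small sketch. It applies the quoted pattern to the first two lines of a source text; the helper name declared_encoding and its default argument are hypothetical, chosen only for illustration, and this is not how the interpreter itself is implemented.

```python
import re

# Pattern quoted in the text above.
CODING_RE = re.compile(r"^[ \t\v]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def declared_encoding(source_text, default="ascii"):
    """Return the encoding named on line 1 or 2 of source_text, else default.

    Hypothetical helper for illustration; only the first two lines are checked.
    """
    for line in source_text.split("\n")[:2]:
        match = CODING_RE.match(line)
        if match:
            return match.group(1)
    return default


emacs_style = "#!/usr/bin/python\n# -*- coding: utf-8 -*-\nprint('hi')\n"
vim_style = "#!/usr/bin/python\n# vim: set fileencoding=latin-1 :\n"
too_late = "import os\nimport sys\n# coding=utf-8\n"

print(declared_encoding(emacs_style))
print(declared_encoding(vim_style))
print(declared_encoding(too_late))
```

A declaration on the third line or later is ignored, so the last call falls back to the default.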

Python 2.x unicode

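  • Unicode type:
    • basestring is the abstract base class of both str and unicode
    • isinstance(obj, basestring) is equivalent to isinstance(obj, (str, unicode))
    • unicode(string[, encoding, errors]) decodes a byte string into a unicode object
  • Unicode literal:
    • u'àôî'
    • U'ÀÔÎ'
    • \x escape followed by two hex digits, e.g. u'\xe0'
    • \u escape followed by four hex digits, e.g. u'\u00e0'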
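
The items above can be combined into a minimal sketch. The names is_text, raw, text and lossy are hypothetical. The version check is an assumption added only so that the snippet also runs on Python 3, where basestring and unicode no longer exist; on Python 2.x the two aliases are never created and the built-ins are used.

```python
# -*- coding: utf-8 -*-
import sys

if sys.version_info[0] >= 3:
    # Assumption for illustration only: Python 3 removed these names,
    # so alias them to str to let the sketch run there too.
    basestring = str
    unicode = str


def is_text(obj):
    """On Python 2.x this is True for both str and unicode objects."""
    return isinstance(obj, basestring)


# UTF-8 bytes for a-grave, o-circumflex, i-circumflex (the u'...' literal above).
raw = b'\xc3\xa0\xc3\xb4\xc3\xae'

text = unicode(raw, 'utf-8')              # decode with an explicit encoding
same = u'\xe0\xf4\xee'                    # \x escapes, two hex digits each
also = U'\u00e0\u00f4\u00ee'              # \u escapes, four hex digits each

# The errors argument controls what happens to bytes the codec rejects.
lossy = unicode(raw, 'ascii', 'replace')

print(text == same == also)
print(len(raw), len(text), len(lossy))
```

The byte string holds six bytes while the decoded object holds three characters, which is the usual reason to decode input to unicode early and encode it again only on output.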
