@@ -201,9 +201,63 @@ module. This process is called *pickling*.
Next, we retrieve the object using the `load` function of the `pickle`
module which returns the object. This process is called *unpickling*.
+## Unicode ##
+
+So far, when we have been writing and using strings, or reading and
+writing to a file, we have used simple English characters only. If we
+want to read and write text in other languages, we need to use the
+`unicode` type (a Python 2 type; in Python 3 every `str` is already
+Unicode), whose literals start with the character `u`:
+
+~~~
+>>> "hello world"
+'hello world'
+
+>>> type("hello world")
+str
+
+>>> u"hello world"
+u'hello world'
+
+>>> type(u"hello world")
+unicode
+~~~
+
+We use the `unicode` type instead of `str` to make sure that we
+handle non-English languages in our programs. However, when we read or
+write to a file or when we talk to other computers on the Internet, we
+need to convert our unicode strings into a format that can be sent and
+received, and a widely used format for this is "UTF-8". We can read
+and write in that format by passing a simple keyword argument when we
+open the file:
+
+~~~python
+# encoding=utf-8
+import io
+
+# io.open accepts `encoding` on both Python 2 and Python 3
+f = io.open("abc.txt", "wt", encoding="utf-8")
+f.write(u"नमस्ते दुनिया")
+f.close()
+
+text = io.open("abc.txt", encoding="utf-8").read()
+~~~
+
+How It Works:
+
+Whenever we write a program that uses Unicode literals like we have
+used above, we have to make sure that Python itself is told that our
+program uses UTF-8, so we put an `# encoding=utf-8` comment at the
+top of our program.
+
+Whenever we read from or write to a file, we specify `encoding="utf-8"`
+and then Python knows how to read or write the Unicode strings.
+
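As a rough illustration of the conversion described above, the following sketch encodes a string to UTF-8 bytes and decodes it back. It assumes Python 3, where `str` is already Unicode, and the variable names are only illustrative.

```python
# A minimal sketch, assuming Python 3: text is `str`, encoded data is `bytes`.
text = "नमस्ते दुनिया"

# Encoding turns the string into UTF-8 bytes suitable for a file or a network.
data = text.encode("utf-8")

# Decoding turns those bytes back into the original string.
round_trip = data.decode("utf-8")

print(type(text).__name__, len(text))  # length in characters (code points)
print(type(data).__name__, len(data))  # length in bytes
print(round_trip == text)
```

Passing `encoding="utf-8"` when opening a file asks Python to perform this same conversion on every read and write.
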
+You can learn more about this topic by reading the
+[Unicode HOWTO](http://docs.python.org/3/howto/unicode.html) and
+watching
+[Ned Batchelder's Pragmatic Unicode talk](http://nedbatchelder.com/text/unipain.html).
+
## Summary ##
-We have discussed various types of input/output and also file handling
-and using the pickle module.
+We have discussed various types of input/output, file handling,
+the pickle module and Unicode.
Next, we will explore the concept of exceptions.