Quick Start
Strings can be enclosed in single quotes or double quotes.
>>> 'spam eggs'  # single quotes
'spam eggs'
>>> 'doesn\'t'  # use \' to escape the single quote...
"doesn't"
>>> "doesn't"  # ...or use double quotes instead
"doesn't"
>>> '"Yes," he said.'
'"Yes," he said.'
>>> "\"Yes,\" he said."
'"Yes," he said.'
>>> '"Isn\'t," she said.'
'"Isn\'t," she said.'
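The interpreter echoes a string enclosed in quotes, with special characters escaped. It normally uses single quotes, and switches to double quotes when the string contains a single quote and no double quotes.
print() produces more readable output, omitting the enclosing quotes and rendering escaped characters: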
>>> '"Isn\'t," she said.'
'"Isn\'t," she said.'
>>> print('"Isn\'t," she said.')
"Isn't," she said.
>>> s = 'First line.\nSecond line.'  # \n means newline
>>> s  # without print(), \n is included in the output
'First line.\nSecond line.'
>>> print(s)  # with print(), \n produces a new line
First line.
Second line.
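The two displays can also be reproduced in a script: the interactive echo corresponds to the built-in repr(), while print() writes the string itself. A minimal sketch, with variable names chosen only for illustration:

```python
# Variable names here are illustrative, not from the original text.
s = 'First line.\nSecond line.'

echoed = repr(s)   # what the interactive prompt would show
printed = str(s)   # what print() writes (before its trailing newline)

print(echoed)      # the \n stays visible as a two-character escape
print(printed)     # the \n becomes a real line break

# repr() picks the quote style: double quotes only when the text
# contains a single quote and no double quotes.
quoted = repr("doesn't")
both = repr('"Isn\'t," she said.')
print(quoted)
print(both)
```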
Prefixing a string literal with r makes it a raw string, in which backslashes are not treated as escape characters:
>>> r'C:\Program Files\foo\bar\'
  File "<stdin>", line 1
    r'C:\Program Files\foo\bar\'
                               ^
SyntaxError: EOL while scanning string literal
>>> r'C:\Program Files\foo\bar''\\'
'C:\\Program Files\\foo\\bar\\'
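A raw string cannot end with a single backslash. In other words, the last character of a raw string cannot be a backslash unless you escape it, and in that case the escaping backslash also remains part of the string. If the last character (the one right before the closing quote) is an unescaped backslash, Python cannot tell whether the string ends there.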
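Two common workarounds for a path that must end in a backslash can be sketched as follows. The path is a made-up example and the variable names are illustrative only:

```python
# Hypothetical Windows-style path used only for illustration.
base = r'C:\Program Files\foo\bar'

# Option 1: append the trailing backslash as a normal (escaped) literal.
with_sep = base + '\\'

# Option 2: rely on adjacent-literal concatenation.
with_sep_literal = r'C:\Program Files\foo\bar' '\\'

# A raw string may end with an even number of backslashes,
# but every one of them stays in the string.
double = r'bar\\'

print(with_sep)
print(len(double))  # 5: 'b', 'a', 'r' and two backslashes
```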
Strings that span multiple lines usually use triple quotes, that is, three single quotes or three double quotes:
print("""\
Usage: thingy [OPTIONS]
     -h                        Display this usage message
     -H hostname               Hostname to connect to
""")
Usage: thingy [OPTIONS]
     -h                        Display this usage message
     -H hostname               Hostname to connect to

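Note the backslash right after the opening triple quote: because of it, the first newline is not included in the output. A backslash at the end of a line means the line continues.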
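The effect of that leading backslash is easiest to see by comparing two otherwise identical literals. A small sketch, with variable names made up for illustration:

```python
# Same text, with and without the backslash after the opening quotes.
with_backslash = """\
Usage: thingy [OPTIONS]
"""

without_backslash = """
Usage: thingy [OPTIONS]
"""

print(repr(with_backslash))     # starts directly with 'Usage'
print(repr(without_backslash))  # starts with '\n'
```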
Strings can be concatenated with the + operator and repeated with *:
>>> 3 * 'un' + 'ium'
'unununium'
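Adjacent string literals are concatenated automatically. This works only between literals, not with variables or expressions, which need the plus sign: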
>>> 'Py' 'thon'
'Python'
>>> prefix = 'Py'
>>> prefix 'thon'  # can't concatenate a variable and a string literal
  File "<stdin>", line 1
    prefix 'thon'
                ^
SyntaxError: invalid syntax
>>> ('un' * 3) 'ium'
  File "<stdin>", line 1
    ('un' * 3) 'ium'
                   ^
SyntaxError: invalid syntax
>>> prefix + 'thon'
'Python'
>>> # Adjacent-literal concatenation is handy for breaking up long strings.
>>> text = ('Put several strings within parentheses '
...         'to have them joined together.')
>>> text
'Put several strings within parentheses to have them joined together.'
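Automatic concatenation of adjacent literals has a well-known side effect: a forgotten comma in a list silently merges two items instead of raising an error. A short illustration with made-up data:

```python
# Illustrative data, not from the original text.
intended = ['alpha', 'beta', 'gamma']
missing_comma = ['alpha', 'beta' 'gamma']  # the two literals are merged

print(len(intended))       # 3
print(len(missing_comma))  # 2
print(missing_comma[1])    # betagamma
```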
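String subscripts, also called indices, work as in C: the first character has index 0. There is no separate character type; a character is simply a string of length 1. Indices may also be negative: -1 is the last character, -2 the second-last, and so on. Using an index that does not exist raises IndexError.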
>>> word = 'Python'
>>> word[0]  # character in position 0
'P'
>>> word[5]  # character in position 5
'n'
>>> word[-1]  # last character
'n'
>>> word[-2]  # second-last character
'o'
>>> word[-6]
'P'
>>> word[-16]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: string index out of range
>>> word[16]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: string index out of range
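The relationship between negative and positive indices, and one way to deal with IndexError, can be sketched like this. The helper char_at is hypothetical and exists only for this example:

```python
word = 'Python'
n = len(word)

# For 1 <= k <= n, word[-k] is the same character as word[n - k].
for k in range(1, n + 1):
    assert word[-k] == word[n - k]


def char_at(text, index, default=''):
    """Hypothetical helper: return text[index], or default if out of range."""
    try:
        return text[index]
    except IndexError:
        return default


print(char_at(word, -1))        # n
print(repr(char_at(word, 16)))  # ''
```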
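Strings also support slicing: two indices separated by a colon. The first index marks the start and is included; it defaults to 0. The second index marks the end and is excluded; it defaults to the length of the string. As a result, s[:i] + s[i:] always equals s.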
>>> word[0:2]  # characters from position 0 (included) to 2 (excluded)
'Py'
>>> word[2:5]  # characters from position 2 (included) to 5 (excluded)
'tho'
>>> word[:2] + word[2:]
'Python'
>>> word[:4] + word[4:]
'Python'
>>> word[:2]  # characters from the beginning to position 2 (excluded)
'Py'
>>> word[4:]  # characters from position 4 (included) to the end
'on'
>>> word[-2:]  # characters from the second-last (included) to the end
'on'
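The identity s[:i] + s[i:] == s can be checked mechanically for a range of split points, including ones that lie outside the string. A quick sketch:

```python
word = 'Python'

# s[:i] + s[i:] == s holds for every integer i, even out of range.
for i in range(-10, 11):
    assert word[:i] + word[i:] == word

# Omitted indices default to the start and the end of the string.
assert word[:] == word[0:len(word)] == word

head, tail = word[:2], word[2:]
print(head, tail)  # Py thon
```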
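One way to remember how slices work is to think of the indices as pointing between characters, with the left edge of the first character numbered 0. The right edge of the last character of a string of n characters then has index n. For example: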
 +---+---+---+---+---+---+
 | P | y | t | h | o | n |
 +---+---+---+---+---+---+
 0   1   2   3   4   5   6
-6  -5  -4  -3  -2  -1
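The first row of numbers gives the positions of the indices 0…6 in the string; the second row gives the corresponding negative indices. The slice from i to j consists of all characters between the boundaries labeled i and j.
For non-negative indices, the length of a slice is the difference of the indices, provided both are within bounds. For example, the length of word[1:3] is 2.
Out-of-range indices do not raise an error when used for slicing.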
>>> word[4:42]
'on'
>>> word[43:42]
''
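How out-of-range slice bounds get clamped can be inspected with the standard slice.indices() method, which reports the bounds actually used for a sequence of a given length. The helper slice_length below is a hypothetical name introduced only for this sketch:

```python
word = 'Python'
n = len(word)

# slice.indices(n) returns the (start, stop, step) actually used
# for a sequence of length n.
print(slice(4, 42).indices(n))   # (4, 6, 1)
print(slice(43, 42).indices(n))  # (6, 6, 1), an empty range


def slice_length(start, stop, length):
    """Hypothetical helper: number of characters in text[start:stop]."""
    lo, hi, _ = slice(start, stop).indices(length)
    return max(0, hi - lo)


print(slice_length(1, 3, n))    # 2
print(slice_length(4, 42, n))   # 2
print(slice_length(43, 42, n))  # 0
```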
Python strings are immutable. Assigning to an indexed position in a string raises an error:
>>> word[0] = 'J'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object does not support item assignment
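If you need a different string, create a new one; concatenation with the plus sign makes this simple and efficient. (Note: this operation is not efficient in Jython.)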
>>> 'J' + word[1:]
'Jython'
>>> word[:2] + 'py'
'Pypy'
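The same idea can be wrapped in a small function that returns a modified copy and leaves the original untouched. replace_char is a hypothetical helper written for this example. The last lines show str.join, a standard alternative when many pieces need to be combined:

```python
def replace_char(text, index, new_char):
    """Hypothetical helper: return a copy of text with text[index] replaced."""
    if not -len(text) <= index < len(text):
        raise IndexError('string index out of range')
    index %= len(text)  # normalize negative indices
    return text[:index] + new_char + text[index + 1:]


word = 'Python'
print(replace_char(word, 0, 'J'))  # Jython
print(word)                        # Python (the original is unchanged)

# When many pieces are combined, str.join builds the result in one call.
pieces = ['un'] * 3 + ['ium']
print(''.join(pieces))  # unununium
```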
The built-in function len() returns the length of a string:
>>> s = 'supercalifragilisticexpialidocious'
>>> len(s)
34
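For text that is not pure ASCII, note that len() counts characters (Unicode code points), not bytes. A brief sketch with a made-up sample string:

```python
# Sample text chosen only for illustration.
text = '你好, Python'

print(len(text))                  # 10 characters (code points)
print(len(text.encode('utf-8')))  # 14 bytes: each Chinese character takes 3
```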
References
- The Python Tutorial, Strings: https://docs.python.org/3/tutorial/introduction.html#strings
- String Methods: https://docs.python.org/3/library/stdtypes.html#string-methods
- String Formatting: https://docs.python.org/3/library/string.html#new-string-formatting
- String Formatting Operations: https://docs.python.org/2/library/stdtypes.html#string-formatting
Quiz
1. Which of the following string definitions is invalid?
A. r'C:\Program Files\foo\bar'
B. r'C:\Program Files\foo\bar'
C. r'C:\Program Files\foo\bar\'
D. r'C:\Program Files\foo\bar\\'
2. What is the result of min('abcd')?
A. a  B. b  C. c  D. d
3. What is the result of max('abcd3A')?
A. a  B. 3  C. A  D. d