Keeping Output Aligned in Python When Strings Contain Chinese Characters
When printing strings in Python, we often pad words with spaces or tabs so that the columns of a list or an ASCII table line up.
For example, consider the following ASCII table:
# ╔════╦════════╦═════════╦═══════╗
# ║ id ║ name   ║ course  ║ score ║
# ╠════╬════════╬═════════╬═══════╣
# ║ 1  ║ Alex   ║ English ║ 90    ║
# ║ 2  ║ Elaine ║ Math    ║ 92    ║
# ║ 3  ║ Tom    ║ Science ║ 88    ║
# ║ 4  ║ Sophia ║ History ║ 94    ║
# ╚════╩════════╩═════════╩═══════╝
This works well when every string is plain English, but once a string contains Chinese or other wide non-ASCII characters, the columns become hard to align:
World!!       ║  # 7 normal spaces
Hello你好     ║  # 5 normal spaces
Hello你好      ║  # 6 normal spaces
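The reason is that Chinese and other wide non-ASCII characters occupy more horizontal space than English letters (typically about two columns in a monospaced terminal), while len() still counts each of them as a single character. Padding based on character counts alone therefore drifts, and Chinese text needs special treatment.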
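The gap between character count and displayed width can be sketched with the standard unicodedata module. The helper name display_width is hypothetical, and the two-columns-per-wide-character rule is an assumption that holds in most terminals but not in every font; combining characters are not handled.

```python
import unicodedata


def display_width(text):
    """Estimate the number of terminal columns `text` occupies (hypothetical helper)."""
    width = 0
    for ch in text:
        # Assumption: "W" (wide) and "F" (fullwidth) characters take two columns,
        # everything else takes one.
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            width += 2
        else:
            width += 1
    return width


# Both strings have 7 characters, but their estimated widths differ.
print(len("Hello你好"), display_width("Hello你好"))  # 7 9
print(len("World!!"), display_width("World!!"))      # 7 7
```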
To align text, the gaps are usually padded with spaces. The ordinary space " " is an ASCII character with the Unicode code point U+0020. If the text contains Chinese characters or Chinese punctuation, you also need the Chinese full-width space, Unicode code point U+3000, to fill the blanks.
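As a quick check, the two kinds of space can be inspected with the standard unicodedata module. This is only an illustration of the two code points involved (U+0020 and U+3000, the full-width space used in the demonstration code of this article), not part of the alignment logic itself.

```python
import unicodedata

# Print the code point, official name, and East Asian Width class of each space.
for ch in (" ", "\u3000"):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.east_asian_width(ch))

# U+0020 SPACE Na
# U+3000 IDEOGRAPHIC SPACE F
```

The "F" (fullwidth) class is what makes U+3000 render as wide as a Chinese character in most fonts.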
Here is a simple demonstration. We count the Chinese characters in the string, then append one ordinary space for each Chinese character and one full-width space for each remaining character. In effect, every narrow character is paired with a wide space and every wide character with a narrow space, so strings with the same number of characters end up with the same displayed width.
The code below produces the following output:
Hello你好  　　　　　║  # 2 normal spaces + 5 full-width spaces
World!!　　　　　　　║  # 7 full-width spaces
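import re

# Matches CJK ideographs plus common Chinese and full-width punctuation.
re_chinese = re.compile(r"[\u4e00-\u9fa5\！\？\。\＂\＇\（\）\＊\＋\，\－\／\：\；\＜\＝\＞\＠\［\＼\］\＾\＿\｀\｛\｜\｝\～\｟\｠\、\〃\《\》\「\」\『\』\【\】\〔\〕\〖\〗\〘\〙\〚\〛\〜\〝\〞\〟\〰\〾\〿\–\—\‘\’\‛\“\”\„\‟\…\‧\﹏\．]")


def format_ascii(text):
    # Number of Chinese (wide) characters in the text.
    count = len(re_chinese.findall(text))
    # One ordinary space per Chinese character,
    # one full-width space (U+3000) per remaining character.
    return text + " " * count + "\u3000" * (len(text) - count)


print(format_ascii("Hello你好") + "║")
print(format_ascii("World!!") + "║")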
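The pairing trick above only lines up strings that contain the same number of characters. As an alternative sketch (not the method used in this article), strings of different lengths can be padded with ordinary spaces up to a target display width. The names display_width and pad_to_width are hypothetical, and the code assumes a monospaced terminal that renders wide characters at exactly two columns.

```python
import unicodedata


def display_width(text):
    """Estimated column count (hypothetical helper): wide/fullwidth chars count as 2."""
    return sum(2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1 for ch in text)


def pad_to_width(text, width):
    """Right-pad `text` with ordinary spaces up to `width` columns (hypothetical helper)."""
    # Text that is already wider than `width` is returned unchanged.
    return text + " " * max(0, width - display_width(text))


for word in ("World!!", "Hello你好", "你好"):
    print(pad_to_width(word, 14) + "║")
```

Under that assumption, "World!!" receives 7 spaces and "Hello你好" receives 5. In fonts where a Chinese character is not exactly twice the width of an ASCII character, this variant will still drift.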
That covers the Chinese string alignment problem we ran into during Python development. The approach meets our basic needs, though some details may have been overlooked. Better ideas are welcome.