python3: reading UCS-2 little endian(ut

Author: 小王同学123321 | Published: 2019-01-08 22:01

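I needed to analyze email data, so I saved the emails locally as html files.
After uploading the html files from Windows to Linux, vim's :set fileencoding command showed that the document is encoded as utf-16-le.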
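
The same check can be approximated from Python by looking at the first bytes of the file for a byte order mark (BOM). This is only a sketch: sniff_bom is a hypothetical helper, not part of any library, and it assumes the file actually starts with a BOM, which the post does not confirm. A file without a BOM returns None and still needs a manual check such as the vim command above.

```python
import codecs


def sniff_bom(data: bytes):
    """Guess a codec name from a leading BOM; return None if no BOM is found."""
    # UTF-32 is tested first because the UTF-32-LE BOM (FF FE 00 00)
    # begins with the same two bytes as the UTF-16-LE BOM (FF FE).
    candidates = [
        (codecs.BOM_UTF32_LE, 'utf-32-le'),
        (codecs.BOM_UTF32_BE, 'utf-32-be'),
        (codecs.BOM_UTF8, 'utf-8-sig'),
        (codecs.BOM_UTF16_LE, 'utf-16-le'),
        (codecs.BOM_UTF16_BE, 'utf-16-be'),
    ]
    for bom, name in candidates:
        if data.startswith(bom):
            return name
    return None


# Simulated file header: a UTF-16-LE BOM followed by the text "hi".
sample = codecs.BOM_UTF16_LE + 'hi'.encode('utf-16-le')
print(sniff_bom(sample))  # utf-16-le
```

For a real file, the first four bytes read in binary mode (open(filepath, 'rb').read(4)) are enough input for this helper.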

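from bs4 import BeautifulSoup


def parseFile(filepath):
    # vim reported the file as utf-16-le, so decode it with that codec.
    encoding = 'utf-16-le'
    try:
        # Python 3's built-in open() takes an encoding, so codecs.open is not needed.
        with open(filepath, 'r', encoding=encoding) as fp:
            # The 'lxml' parser requires the lxml package to be installed.
            soup = BeautifulSoup(fp, 'lxml')
            print(soup)
    except Exception as ex:
        # Python 3 syntax: "except ... as ex" and print() as a function.
        print('[ERROR]--', ex)


if __name__ == '__main__':
    filepath = './Signature.txt'
    parseFile(filepath)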
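
One detail worth knowing when picking the codec name: 'utf-16-le' decodes a leading BOM as an ordinary character (U+FEFF), while 'utf-16' consumes the BOM and uses it to choose the byte order. The sketch below illustrates the difference on simulated bytes; it assumes the file begins with a BOM, which is common for files saved on Windows as "UCS-2 little endian" but is not stated in the post.

```python
import codecs

# Simulated file contents: a UTF-16-LE BOM followed by a small html snippet.
raw = codecs.BOM_UTF16_LE + '<html>你好</html>'.encode('utf-16-le')

as_le = raw.decode('utf-16-le')  # BOM survives as the first character, U+FEFF
as_auto = raw.decode('utf-16')   # BOM is consumed and selects little endian

print(repr(as_le[0]))  # '\ufeff'
print(as_auto)         # <html>你好</html>
```

If the stray U+FEFF matters for later processing, one option is to pass 'utf-16' as the encoding, or to strip it with text.lstrip('\ufeff') after decoding.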
