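I Scraped Data on 《雪中悍刀行》 with Python and Finally Understood Why It Is So Popular

Introduction

Hello everyone, I'm J哥.

How do you find the video ID?

Project structure: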


I. Crawler:

II. Data processing

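# coding=gbk
import csv
import time

# Output file: one formatted timestamp per row
csvFile = open("data.csv", 'w', newline='', encoding='utf-8')
writer = csv.writer(csvFile)
csvRow = []
# print(csvRow)
# Input file: one Unix timestamp (in seconds) per line
f = open("time.txt", 'r', encoding='utf-8')
for line in f:
    csvRow = int(line)
    # print(csvRow)
    # Convert the timestamp to local time, then to a readable string
    timeArray = time.localtime(csvRow)
    csvRow = time.strftime("%Y-%m-%d %H:%M:%S", timeArray)
    print(csvRow)
    # Split on whitespace so the date and the time land in separate columns
    csvRow = csvRow.split()
    writer.writerow(csvRow)
f.close()
csvFile.close()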
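One thing to keep in mind with the script above: `time.localtime` uses the time zone of the machine that runs it, so the same timestamp can format differently elsewhere. The sketch below is a minimal alternative that pins the zone explicitly. The function name `format_timestamp` and the choice of UTC+8 are assumptions made for illustration; neither appears in the original script.

```python
from datetime import datetime, timedelta, timezone

# Assumption: timestamps should be displayed in China Standard Time (UTC+8).
CST = timezone(timedelta(hours=8))


def format_timestamp(ts, tz=CST):
    """Format a Unix timestamp (seconds) as 'YYYY-MM-DD HH:MM:SS' in the zone tz."""
    # int() accepts both numbers and strings such as a line read from time.txt
    return datetime.fromtimestamp(int(ts), tz=tz).strftime("%Y-%m-%d %H:%M:%S")


if __name__ == "__main__":
    print(format_timestamp("0"))
```

Calling `.split()` on the returned string gives the same two-column row (date, time) that the script writes to data.csv.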

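# coding=gbk
import csv

# Output file: one comment per row
csvFile = open("content.csv", 'w', newline='', encoding='utf-8')
writer = csv.writer(csvFile)
csvRow = []
# Input file: one scraped comment per line
f = open("content.txt", 'r', encoding='utf-8')
for line in f:
    # Split on whitespace; each piece becomes one CSV column
    csvRow = line.split()
    writer.writerow(csvRow)
f.close()
csvFile.close()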
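Both data-processing scripts share one pattern: read a text file line by line, turn each line into a list of fields, and write that list as a CSV row. The following is a minimal sketch of that pattern as a single helper, using `with` so both files are closed even if an error occurs. The name `lines_to_csv` and its `transform` parameter are hypothetical and do not come from the original code.

```python
import csv


def lines_to_csv(src_path, dst_path, transform=str.split):
    """Write one CSV row per line of src_path.

    transform maps a raw line to a list of fields; the default splits on
    whitespace, as the scripts above do.
    """
    rows_written = 0
    with open(src_path, "r", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        for line in src:
            writer.writerow(transform(line))
            rows_written += 1
    return rows_written
```

With the defaults, `lines_to_csv("content.txt", "content.csv")` should behave like the second script; the timestamp script corresponds to passing a `transform` that formats the line before splitting it.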

III. Data analysis

1. Making the word cloud

wc.py

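import numpy as np
import re
import jieba
from wordcloud import WordCloud
from matplotlib import pyplot as plt
from PIL import Image
# Install the packages above yourself; search online if you are unsure how.

f = open('content.txt', 'r', encoding='utf-8')  # Data source: the text to build the word cloud from
txt = f.read()  # Read the file
f.close()  # Close the file (a with block would do, but I did not bother changing it)

# For article-style text you need jieba word segmentation; you can also
# post-process the tokens yourself before generating the word cloud.
newtxt = re.sub(r"[A-Za-z0-9!%\[\],。]", "", txt)
print(newtxt)
words = jieba.lcut(newtxt)  # Note: words is not used below; generate() receives newtxt directly

img = Image.open(r'wc.jpg')  # The shape you want the cloud to take
img_array = np.array(img)

# Configuration; collocations=False avoids repeated words
wordcloud = WordCloud(
    background_color="white",
    width=1080,
    height=960,
    font_path="../文悦新青年.otf",
    max_words=150,
    scale=10,  # Sharpness
    max_font_size=100,
    mask=img_array,
    collocations=False).generate(newtxt)

plt.imshow(wordcloud)
plt.axis('off')
plt.show()
wordcloud.to_file('wc.png')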

Outline image: wc.jpg


Word cloud: result.png

(Note: English letters need to be filtered out here.)
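The filtering step can be tried in isolation. The sketch below reuses the character class from wc.py, written as a raw string; the function name `clean_text` is hypothetical. This class strips only ASCII letters, digits, and a handful of punctuation marks, so other symbols (the full-width comma, for example) pass through unchanged.

```python
import re

# Same character class as in wc.py: ASCII letters, digits, and ! % [ ] , 。
_NOISE = re.compile(r"[A-Za-z0-9!%\[\],。]")


def clean_text(text):
    """Remove ASCII letters, digits, and the listed punctuation from text."""
    return _NOISE.sub("", text)


if __name__ == "__main__":
    print(clean_text("abc雪中123悍刀行!"))
```

If more punctuation should be dropped before building the word cloud, the extra characters can be added to the character class.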

Output: DrawBar.html

Output chart

Summary


Posted by jnlyseo998998 on 2023-04-01 01:22:02; 38 views, 0 comments
