「[Off-topic] [Scraper] [Touhou]」

Board: Off-topic (灌水区)
Original poster: blue_space
Current replies: 12
Saved replies: 12
Posted: 2021/11/3 22:04
Last updated: 2023/11/4 01:29:07

blue_space (OP) 2021/11/3 22:04

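Noob here. I wanted to download the official Touhou music, so I went to thb, but there you have to download the tracks one at a time. Then I saw that a kind soul in the comments had posted scraper code for other people to run. The catch is that it has no comments and no indentation either, so what you get is the following jumble: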

import requests
from bs4 import BeautifulSoup
import re
import os
if __name__ == '__main__':
mainURL=BeautifulSoup(requests.get("https://thwiki.cc/\
%E5%8E%9F%E6%9B%B2%E5%88%97%E8%A1%A8",headers={
"User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:88.0) \
Gecko/20100101 Firefox/88.0"}).text, "html.parser")
n=[]
x=[]
for x_v in mainURL.find_all('ul'):
x.append(BeautifulSoup(str(x_v), "html.parser"))
for x_v2 in x:
x_v2=x_v2.find_all("a")
try:
for x_v3 in x_v2:
n.append("https://thwiki.cc"+x_v3.get("href"))
except:
pass
for n_v1 in n:
try:
n_req1=requests.get(n_v1,headers={"User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:88.0) \
Gecko/20100101 Firefox/88.0"})
except:
continue
n_bs1=BeautifulSoup(n_req1.text,"html.parser")
mus_link1=n_bs1.find("audio")
if mus_link1:
os.system("aria2c "+str(mus_link1.get("src")))
print(mus_link1)
print(len(n))

So could some kind soul help fix it up? Or, if you have an archive of the Touhou soundtracks, just sending me that would work too.
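For reference, here is a rough sketch of how the snippet might look once it is indented and commented. The nesting is a guess, since the original indentation is lost. To keep it runnable without extra packages, this version swaps `requests` and `BeautifulSoup` for the standard library's `urllib` and `html.parser`, and calls `aria2c` through `subprocess` rather than `os.system`. The helper names (`collect_links`, `find_audio_src`, `fetch`, `main`) are made up for illustration. It also assumes, like the original, that each track page exposes the file in the `src` attribute of an `<audio>` tag, which may not match the real page layout.

```python
import subprocess
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin

BASE_URL = "https://thwiki.cc"
INDEX_URL = BASE_URL + "/%E5%8E%9F%E6%9B%B2%E5%88%97%E8%A1%A8"
HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:88.0) "
                  "Gecko/20100101 Firefox/88.0"
}


class ListLinkParser(HTMLParser):
    """Collects the href of every <a> that sits inside a <ul>."""

    def __init__(self):
        super().__init__()
        self.ul_depth = 0
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "ul":
            self.ul_depth += 1
        elif tag == "a" and self.ul_depth > 0:
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)

    def handle_endtag(self, tag):
        if tag == "ul" and self.ul_depth > 0:
            self.ul_depth -= 1


class AudioSrcParser(HTMLParser):
    """Remembers the src of the first <audio> tag that has one."""

    def __init__(self):
        super().__init__()
        self.src = None

    def handle_starttag(self, tag, attrs):
        if tag == "audio" and self.src is None:
            self.src = dict(attrs).get("src")


def collect_links(index_html, base=BASE_URL):
    """Return absolute, de-duplicated URLs for links found in lists."""
    parser = ListLinkParser()
    parser.feed(index_html)
    seen, links = set(), []
    for href in parser.hrefs:
        url = urljoin(base, href)
        if url not in seen:
            seen.add(url)
            links.append(url)
    return links


def find_audio_src(page_html, page_url):
    """Return the absolute audio URL on a page, or None if there is none."""
    parser = AudioSrcParser()
    parser.feed(page_html)
    return urljoin(page_url, parser.src) if parser.src else None


def fetch(url):
    """Download a page as text, sending the browser-like User-Agent."""
    req = urllib.request.Request(url, headers=HEADERS)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")


def main():
    links = collect_links(fetch(INDEX_URL))
    for link in links:
        try:
            page = fetch(link)
        except (OSError, ValueError):
            continue  # skip pages that fail to load, as the original did
        src = find_audio_src(page, link)
        if src:
            # Assumes aria2c is installed and on PATH.
            subprocess.run(["aria2c", src], check=False)
            print(src)
    print(len(links))
```

Calling `main()` starts the crawl. Nothing runs on import, so the two parsing helpers can be tried on a saved HTML page first. Compared with the original, duplicate links from nested lists are dropped and a page with no audio `src` is skipped, not passed to `aria2c` as the string "None".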
