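Tutorial: JS + CSS decryption of chapter text on the web version of a certain Q reading site
Last edited by 洛璃 on 2025-1-31 13:36
This tutorial is for technical exchange only. It does not recover the readable chapter text (the font obfuscation remains unsolved), but the JS + CSS decryption flow, which is fairly simple, may be useful as a reference.
For certain reasons the tutorial cannot go into too much detail, but every core step is given.
Fetch the page source; fkconfig and content can be found near the end of it.
content is encrypted by JS. The JS that handles it can be found by capturing the network traffic in the browser. Wrapped in a minimal Node.js environment, it looks like this: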
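// Fake the browser globals that the captured script expects
window = global;
window.self = window;
window.outerHeight = 1030
window.innerHeight = 940
location = {
    protocol: "https:",
    hostname: "book.qq.com",
}
navigator = {
    userAgent: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36',
}
// ./work is the captured decryption script, saved locally
window.Fock = require('./work')
var beimu = {
    initFock: function initFock(fkConfig) {
        var fuid = fkConfig.fuid;
        window.Fock.setupUserKey(fuid);
        try {
            // fkp is a Base64-encoded script shipped in the page config
            eval(atob(fkConfig.fkp))
        } catch (e) {
            console.error("fkp error", e)
        }
    },
    showContent: function (fkConfig, data, e) {
        this.initFock(fkConfig);
        window.Fock.unlock(data, String(e), (function () {
            debugger;
            // NOTE: an index on `arguments` was probably lost when the code was posted;
            // as written, encodeURI(res) stringifies the whole arguments object.
            var res = arguments;
            // Percent-encode first so that btoa only sees Latin-1 characters
            console.log(btoa(encodeURI(res)))
        }))
    }
}
// The three placeholders below are filled in by the Python script before running
var fkConfig = __fk_config__
var content = '__page_content__'
var page_num = "__page_num__"
beimu.showContent(fkConfig, content, page_num)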
Then call the JS from Python:
import subprocess
import json
import base64
from urllib.parse import unquote

# fkConfig (dict) and content (str) are assumed to have been extracted
# from the page source in the first step; that code is not shown here.
with open('main_module.js', 'r', encoding='utf-8') as f:
    js_code = f.read()
page_num = 84
# Fill the placeholders at the end of the JS template
js_code = js_code.replace('__fk_config__', json.dumps(fkConfig))
js_code = js_code.replace('__page_content__', content)
js_code = js_code.replace('__page_num__', str(page_num))
with open('main.js', 'w', encoding='utf-8') as f:
    f.write(js_code)
# Run the script with Node and undo btoa(encodeURI(...))
res = subprocess.check_output(['node', 'main.js'], text=True).strip()
res = base64.b64decode(res).decode('utf-8')
res = unquote(res)
print(res)
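The last two decoding lines mirror the `btoa(encodeURI(...))` call on the JS side: `btoa` only accepts Latin-1 input, so the text is percent-encoded before it is Base64-encoded. A minimal sketch of that round trip follows. The helper names `js_side_encode` and `py_side_decode` are made up for illustration, and `quote` with this `safe` set is only an approximation of `encodeURI`.

```python
import base64
from urllib.parse import quote, unquote

# Punctuation that JavaScript's encodeURI leaves unescaped (letters and digits are always kept).
ENCODE_URI_SAFE = ";/?:@&=+$,#-_.!~*'()"


def js_side_encode(text: str) -> str:
    """Approximate btoa(encodeURI(text)) in Python (illustrative helper)."""
    percent_encoded = quote(text, safe=ENCODE_URI_SAFE)
    return base64.b64encode(percent_encoded.encode("ascii")).decode("ascii")


def py_side_decode(payload: str) -> str:
    """Undo the encoding above: Base64-decode, then percent-decode."""
    return unquote(base64.b64decode(payload).decode("utf-8"))


sample = '<p class="p0"><y4v>㺹</y4v></p>'
assert py_side_decode(js_side_encode(sample)) == sample
```

Because the payload is pure ASCII after percent-encoding, it survives being printed to stdout and read back through `subprocess` regardless of the console encoding.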
This yields the JS-decrypted content, which looks something like:
<p class="p0"><y4v>㺹</y4v><ykz>㐃</ykz><yht>妻</yht><yni>择</yni><yku>㐂</yku><y86>迅</y86>
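In this output every character sits in its own oddly named tag, and the tag names are what the stylesheet rules refer to later. As a rough illustration, the sketch below lists the (tag, character) pairs of a shortened copy of the sample using only the standard library. The closing `</p>` is added here so the fragment is complete, and the class name `TagTextCollector` is made up for this example.

```python
from html.parser import HTMLParser


class TagTextCollector(HTMLParser):
    """Collect (tag, text) pairs for elements that directly contain text."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._open_tag = None

    def handle_starttag(self, tag, attrs):
        # Remember the most recently opened tag
        self._open_tag = tag

    def handle_data(self, data):
        # Text is attributed to the tag that was opened last
        if self._open_tag is not None and data.strip():
            self.pairs.append((self._open_tag, data))

    def handle_endtag(self, tag):
        self._open_tag = None


sample = '<p class="p0"><y4v>㺹</y4v><ykz>㐃</ykz><yht>妻</yht></p>'
collector = TagTextCollector()
collector.feed(sample)
collector.close()
print(collector.pairs)
```

The document order seen here is not necessarily the reading order, which is why the stylesheet has to be consulted next.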
Next, look at the captured style.css and undo the CSS obfuscation.
Core code:
import re
import cssutils
from lxml import etree

# Regular expressions, written to tolerate multi-line rule text
order_pattern = re.compile(r"(\S+)\s*\{\s*order:\s*(\d+);?\s*\}")
xpath_pattern = re.compile(r"\.(\S+)\s(\S+)\s+\{")
css_file = cssutils.parseFile('style.css')
# html_text is the JS-decrypted fragment from the previous step
selector = etree.HTML(text=html_text)
name_list = set()
# NOTE: bracketed indices appear to have been lost from the posted code;
# the ones marked "restored" below are reconstructed from context.
for rule in css_file:
    x_matches = xpath_pattern.findall(rule.cssText)
    if len(x_matches) >= 1:
        class_name = x_matches[0][0]  # restored index
        tag_name = x_matches[0][1]    # restored index
        if "::" in tag_name:
            tag_a = tag_name.split("::")[1]     # pseudo-element name (restored index)
            tag_name = tag_name.split("::")[0]  # element name (restored index)
            # print(tag_a)
            name_list.add(tag_a)
            xpath_ste = '//*[@class="{}"]/{}'.format(class_name, tag_name)
            tag = selector.xpath(xpath_ste)[0]  # restored index
            if tag_a == 'before':
                # ::before prepends the `content` value to the element text
                t = tag.text
                if t is None:
                    t = ''
                c = re.search(r'content:\s(\S+)\n', rule.cssText)
                if c:
                    c = c.group(1)
                    if c.find('attr') != -1:
                        # content: attr(name) takes the text from an attribute
                        a = re.search(r'attr\((\S+)\)', c).group(1)
                        v = tag.attrib.get(a)
                        t = "{}{}".format(v, t)
                    else:
                        t = "{}{}".format(c.replace('"', ''), t)
                tag.text = t
            elif tag_a == 'first-letter':
                # ::first-letter with font-size: 0 hides the first character
                if rule.cssText.find('font-size: 0') != -1:
                    t = tag.text
                    tag.text = t[1:]  # restored slice (assumed from context)
                    # print(t)
            elif tag_a == 'after':
                # ::after appends the `content` value to the element text
                t = tag.text
                if t is None:
                    t = ''
                c = re.search(r'content:\s(\S+)\n', rule.cssText)
                if c:
                    c = c.group(1)
                    if c.find('attr') != -1:
                        a = re.search(r'attr\((\S+)\)', c).group(1)
                        v = tag.attrib.get(a)
                        t = "{}{}".format(t, v)
                    else:
                        t = "{}{}".format(t, c.replace('"', ''))
                tag.text = t
    else:
        # Class-only rules: font-size: 0 hides the whole element
        if rule.cssText.find('font-size: 0') != -1:
            class_name = re.search(r'\.(\S+)\s\{', rule.cssText).group(1)
            xpath_ste = '//*[@class="{}"]'.format(class_name)
            tag_list = selector.xpath(xpath_ste)
            for tag in tag_list:
                tag.text = ''
        if rule.cssText.find('scalex') != -1:
            # This rule mirrors (reverses) characters; it is not handled here
            pass
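The `::before` and `::after` branches above both come down to one idea: the visible text of an element is its own text plus whatever the CSS `content` property contributes, either a quoted literal or `attr(name)`. The sketch below isolates that step as a pure function so it can be tried without lxml or cssutils. The function name `apply_pseudo_content` and the attribute name `data-c` are assumptions for illustration, and unlike the code above a missing attribute contributes an empty string here.

```python
import re


def apply_pseudo_content(text, pseudo, css_content, attrs):
    """Return the text an element shows once a ::before/::after rule applies.

    css_content is the raw value of the CSS `content` property, for example
    '"字"' or 'attr(data-c)'; attrs is the element's attribute dict.
    """
    value = css_content.strip()
    attr_match = re.fullmatch(r"attr\(([^)]+)\)", value)
    if attr_match:
        # content: attr(name) pulls the text from an attribute of the element
        extra = attrs.get(attr_match.group(1), "")
    else:
        # A quoted literal: drop the surrounding quotes
        extra = value.strip("\"'")
    if pseudo == "before":
        return extra + text
    if pseudo == "after":
        return text + extra
    return text


print(apply_pseudo_content("迅", "before", '"择"', {}))
print(apply_pseudo_content("", "after", "attr(data-c)", {"data-c": "妻"}))
```

Keeping this logic separate from the tree walking makes it easier to check each obfuscation trick on its own.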
Next, iterate over the rules in the CSS file and collect the (tag name, order value) pairs:
order_key_value_pairs = []
for rule in css_file:
    order_matches = order_pattern.findall(rule.cssText)
    if order_matches:
        order_key_value_pairs.extend(order_matches)
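To see what `order_pattern` actually captures, it can be run on a small stylesheet fragment without cssutils. The fragment below is hypothetical: the tag names are borrowed from the sample output, but the `order` values are invented for the example.

```python
import re

order_pattern = re.compile(r"(\S+)\s*\{\s*order:\s*(\d+);?\s*\}")

# Hypothetical stylesheet fragment; the order values are made up.
sample_css = """
.p0 y4v { order: 3; }
.p0 ykz { order: 1; }
.p0 yht {
  order: 2;
}
"""

# Each match is (last selector token before the brace, order value as a string)
pairs = order_pattern.findall(sample_css)
order_by_tag = {tag: int(order) for tag, order in pairs}
print(order_by_tag)
```

The first group only grabs the token immediately before the opening brace, so the class prefix is dropped and the tag name alone becomes the key.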
Final processing:
order_key_value_dict = {key: int(value) for key, value in order_key_value_pairs}
# modified_html_string is not defined in the post; presumably it is the
# tree from the previous step serialized back to a string.
selector = etree.HTML(text=modified_html_string)
p_list = selector.xpath('/html/body/p')
html_text_list = []
for p in p_list:
    if p.attrib.get('class'):
        name_list = p.xpath('./*')
        # Sort the child elements by their value in order_key_value_dict
        name_list_sorted = sorted(name_list, key=lambda x: order_key_value_dict.get(x.tag, float('inf')))
        # text = "".join(...)  # argument lost in the posted code; the value is not used below
        # Create a new <p> element that keeps the original attributes
        new_p = etree.Element(p.tag, attrib=dict(p.attrib))
        for element in name_list_sorted:
            new_p.append(element)
        # Serialize the new <p> element to a string
        result = etree.tostring(new_p, pretty_print=True).decode('utf-8')
    else:
        result = etree.tostring(p, pretty_print=True).decode('utf-8')
    # print(result)
    html_text_list.append(result)
html_text = "".join(html_text_list)
# print(html_text)

# UniversalFontRecognition is the author's own helper and is not shown in the post;
# crack() is used here as if it returned a {code point: character} dict.
old_woff = UniversalFontRecognition('fixed.m4mlmb6d.woff2')
random_woff = UniversalFontRecognition('font.ttf')
old_recognition_result = old_woff.crack()
random_recognition_result = random_woff.crack()

def replace_with_upper(match):
    # Try the fixed font first, then fall back to the per-page random font
    r = old_recognition_result.get(int(match.group(1)))
    if r:
        return r
    else:
        return random_recognition_result.get(int(match.group(1)))

# etree.tostring emits non-ASCII characters as &#NNNN; references by default
new_text = re.sub(r"&#(\d+?);", replace_with_upper, html_text)
print(new_text)
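The reordering step in the code above can be tried in isolation. The following is a minimal sketch using the standard library's `xml.etree.ElementTree` instead of lxml, with a shortened and closed copy of the sample fragment and invented `order` values, so the mapping is an assumption and not data from the real stylesheet.

```python
import xml.etree.ElementTree as ET

# Hypothetical order values; on the real page they come from style.css.
order_by_tag = {"y4v": 3, "ykz": 1, "yht": 2}
fragment = '<p class="p0"><y4v>㺹</y4v><ykz>㐃</ykz><yht>妻</yht></p>'

paragraph = ET.fromstring(fragment)
# Children without an order entry sort last, mirroring the code above
children = sorted(paragraph, key=lambda el: order_by_tag.get(el.tag, float("inf")))
visual_text = "".join(el.text or "" for el in children)
print(visual_text)  # 㐃妻㺹
```

One design point worth knowing: in CSS the initial value of `order` is 0, so an item without an explicit rule would be laid out first, whereas the `float('inf')` fallback puts it last. Whether that matters depends on whether every tag in the page has an `order` rule.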
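This produces the final content. The page renders it with custom fonts, and because one of the fonts is randomized, a precomputed lookup table cannot be used to decode it.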
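For reference, the substitution step itself is straightforward once a code point table exists; the hard part is building the table for the random font. A minimal sketch with made-up tables is shown below. The code points and characters are hypothetical, and unlike the code above this version leaves unknown references untouched instead of dropping them.

```python
import re

CHARREF = re.compile(r"&#(\d+);")


def replace_charrefs(html_text, *tables):
    """Replace &#NNNN; references using the first table that knows the code point."""

    def substitute(match):
        code_point = int(match.group(1))
        for table in tables:
            if code_point in table:
                return table[code_point]
        # Unknown code point: keep the reference so it stays visible for debugging
        return match.group(0)

    return CHARREF.sub(substitute, html_text)


# Hypothetical lookup tables, standing in for the two recognition results.
fixed_table = {58344: "的"}
random_table = {58345: "了"}
encoded = "<y4v>&#58344;</y4v><ykz>&#58345;</ykz><yht>&#58400;</yht>"
print(replace_charrefs(encoded, fixed_table, random_table))
```

Leaving unmatched references in place makes it easy to see how much of a page a given table fails to cover.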
Even so, the JS + CSS decryption flow above can serve as a reference. It is shared for exchange and learning purposes only.