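When writing a crawler, you may run into encrypted data or obfuscated JS that keeps you from reading the plaintext. This post works through a real example of reversing the JS to recover the plaintext.
Start by searching the front-end code for the keyword "encrypt_data", which appears in the response body, to locate the key function.
Stepping through with the debugger leads to the function Z: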
function Z(f) {
return JSON.parse(W("sjdqmp20161205#_316@gfmt", J.decode(f), 0, 0, "012345677890123", 1))
}
Z calls the function W and the method J.decode, so both need further analysis and debugging.
The W function is as follows:
function W(f, c, s, v, C, h) {
var m = new Array(16843776,0,65536,16843780,16842756,66564,4,65536,1024,16843776,16843780,1024,16778244,16842756,16777216,4,1028,16778240,16778240
// too long, omitted…
y = y.replace(/\0*$/g, ""),
!s) {
if (h === 1) {
var E = y.length
, O = 0;
E && (O = y.charCodeAt(E - 1)),
O <= 8 && (y = y.substring(0, E - O))
}
y = decodeURIComponent(escape(y))
}
return y
}
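The visible tail of W does three things after decryption: it strips trailing NUL characters, removes padding when h === 1 (the last character's code gives the pad length, accepted only if it is at most 8), and converts the binary string to UTF-8 text via decodeURIComponent(escape(y)). A minimal Python sketch of just that tail is shown here. The function name `postprocess_plaintext` is hypothetical, and the cipher itself (the elided lookup tables, which look like a DES-family routine although the post does not say so) is not reproduced.

```python
def postprocess_plaintext(raw: bytes, padding: int = 1) -> str:
    """Mirror the visible tail of W for the decrypt case (s == 0).

    `raw` stands in for the decrypted byte string `y`; producing it
    (the actual cipher) is outside the scope of this sketch.
    """
    # y = y.replace(/\0*$/g, "")
    raw = raw.rstrip(b"\x00")
    if padding == 1 and raw:
        pad_len = raw[-1]            # O = y.charCodeAt(E - 1)
        if pad_len <= 8:             # O <= 8 && (y = y.substring(0, E - O))
            raw = raw[: len(raw) - pad_len]
    # decodeURIComponent(escape(y)) reinterprets the byte string as UTF-8
    return raw.decode("utf-8")
```

For example, `postprocess_plaintext(b'{"a":1}\x01')` drops the single pad byte and returns the JSON text that Z then hands to JSON.parse.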
The J.decode method:
function(t) {
t = String(t).replace(R, "");
var n = t.length;
n % 4 == 0 && (t = t.replace(/==?$/, ""),
n = t.length),
(n % 4 == 1 || /[^+a-zA-Z0-9/]/.test(t)) && F("Invalid character: the string to be decoded is not correctly encoded.");
for (var i = 0, A, r, p = "", d = -1; ++d < n; )
r = w.indexOf(t.charAt(d)),
A = i % 4 ? A * 64 + r : r,
i++ % 4 && (p += String.fromCharCode(255 & A >> (-2 * i & 6)));
return p
}
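Read closely, J.decode is a Base64 decoder: it strips characters matched by R, drops trailing "=" padding, rejects invalid input, and then rebuilds the bytes six bits at a time. The sketch below is a rough Python equivalent built on the standard library. It assumes R matches whitespace and w is the standard Base64 alphabet, neither of which is shown in the post, and the name `js_style_b64decode` is hypothetical.

```python
import base64
import re


def js_style_b64decode(text: str) -> bytes:
    """Approximate J.decode using Python's base64 module."""
    # t = String(t).replace(R, "")  (assumption: R matches whitespace)
    cleaned = re.sub(r"[\t\n\f\r ]", "", str(text))
    # Strip trailing "=" padding only when the length is a multiple of 4
    if len(cleaned) % 4 == 0:
        cleaned = re.sub(r"==?$", "", cleaned)
    # Same validity check as the JS version
    if len(cleaned) % 4 == 1 or re.search(r"[^+a-zA-Z0-9/]", cleaned):
        raise ValueError(
            "Invalid character: the string to be decoded is not correctly encoded."
        )
    # base64.b64decode needs the padding restored
    padded = cleaned + "=" * (-len(cleaned) % 4)
    return base64.b64decode(padded)
```

The JS version returns a binary string (one character per byte), while this sketch returns `bytes`, which is the more natural form to pass to a Python cipher routine.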
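Those are all the functions and methods involved in the front-end obfuscation. Write them into a JS file, together with the values they reference (such as R, w, and F used by J.decode).
Running the JS file with Node.js then decrypts the ciphertext into plaintext successfully.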
Automated batch queries
Use Python to fetch the ciphertext, then run the JS file from Python, passing in the ciphertext, to get the result:
#!/usr/bin/env python3
# coding: utf-8
import requests
import execjs
headers = {
'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:128.0) Gecko/20100101 Firefox/128.0',
'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/png,image/svg+xml,*/*;q=0.8',
'Accept-Language': 'zh-CN,zh;q=0.8,zh-TW;q=0.7,zh-HK;q=0.5,en-US;q=0.3,en;q=0.2',
'Connection': 'keep-alive',
'Upgrade-Insecure-Requests': '1',
'Sec-Fetch-Dest': 'document',
'Sec-Fetch-Mode': 'navigate',
'Sec-Fetch-Site': 'none',
'Sec-Fetch-User': '?1',
'Priority': 'u=0, i',
'Pragma': 'no-cache',
'Cache-Control': 'no-cache',
}
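# The original post does not give the target URL; set `url` to the endpoint being queried.
url = ""  # placeholder (assumption), replace before running
html = requests.get(url, headers=headers).json()
# Compile the JS file that contains Z, W and J, then call Z with the ciphertext.
with open("./1.js", "r", encoding="utf-8") as f:
    ctx = execjs.compile(f.read())
decodeData = ctx.call('Z', html['encrypt_data'])
print(decodeData)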
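The script above handles a single response. For batch queries, one option is to compile the JS once and reuse it across many responses. The sketch below takes that approach but keeps the decryptor as an injected callable, so the loop can be exercised without a JS runtime. The function name `batch_decrypt` and its parameters are hypothetical; only the "encrypt_data" field name comes from the post.

```python
from typing import Any, Callable, Dict, Iterable, List, Optional


def batch_decrypt(
    responses: Iterable[Dict[str, Any]],
    decrypt: Callable[[str], Any],
    field: str = "encrypt_data",
) -> List[Optional[Any]]:
    """Decrypt the ciphertext field of each response dict.

    Responses without the field (or with an empty value) yield None,
    so the output stays aligned with the input order.
    """
    results: List[Optional[Any]] = []
    for resp in responses:
        ciphertext = resp.get(field)
        if not ciphertext:
            results.append(None)
            continue
        results.append(decrypt(ciphertext))
    return results
```

With PyExecJS, the callable could be `lambda c: ctx.call("Z", c)`, where `ctx` is the object returned by `execjs.compile(...)` on the JS file, so the file is compiled once rather than per request.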