PhantomJS page.evaluate returns a "random" value on a page with a JS delay


As in the title

The page is controlled by JS and is "dynamic". For example, the target site shows a full-screen ad a few seconds after the page has loaded. (If you open this page, please turn off your proxy and disable any ad blocker.)


I'm using PhantomJS to scrape pages like this, calling page.evaluate to run JS inside the page. In this case I collect all the a links (only to illustrate the problem more clearly). I accidentally found that the number of a links is "random": sometimes 1850, sometimes 1871, sometimes 1872 (1872 being all the links).


I tried a page with no JS delay, and there the number of a links is fixed. On pages like the one above, the value is "random".

Suspicion and question

Judging from the results, I expected evaluate to run after the page's JS had finished, so that it would return the correct count of 1872, but that is not what happens. I'm not very familiar with JS, so please advise.


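console.log('Loading a web page');
var page = require('webpage').create();
var url = ''; // target URL (left empty in the original post)
page.open(url, function (status) {
  // Return a plain number: DOM nodes cannot be passed back out of evaluate.
  var count = page.evaluate(function () {
    return document.querySelectorAll('a').length;
  });
  console.log('Number of a links: ' + count);
  phantom.exit();
});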
2 Answers


I ran the code above 10 times: twice I got 1872, and the other times 1873.


I found that http://jsqmt.qq.com/cdn_djl.js is loaded asynchronously on the page. This JS file contains a 5-second delay; you can set a breakpoint inside it and then count the a tags on the page to confirm this. The file loads the ad, which generates an extra a tag. Because of this setTimeout, there is no guarantee that the ad has been loaded by the time PhantomJS's callback function is executed. If the page takes longer than 5 seconds to load, the ad code will already have run and you get the final count; if the page loads in less than 5 seconds, the ad code has not run yet and the count is lower. I see the same behaviour when loading the page in a browser, and the exact value depends on network speed.
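One way to work around this is to avoid reading the count once in the page.open callback, and instead poll until the value stops changing. The following is a minimal sketch, not a definitive fix: pollUntilStable and its options (interval, stableReads, maxReads, schedule) are hypothetical names made up for illustration, and the defaults assume the quiet window ((stableReads - 1) * interval, here 6 seconds) must be longer than the 5-second timer described above.

```javascript
// Hypothetical helper: poll a counter until it stops changing.
// readCount: function returning the current number (for example, one that
//            wraps page.evaluate and returns the number of a tags).
// done:      called with (count, settled); settled is false if we gave up.
// opts:      illustrative options, all optional.
function pollUntilStable(readCount, done, opts) {
  opts = opts || {};
  var interval = opts.interval || 3000;     // ms between reads
  var stableReads = opts.stableReads || 3;  // equal reads in a row required
  var maxReads = opts.maxReads || 20;       // give up after this many reads
  // The scheduler is injectable so the helper can be tested without waiting.
  var schedule = opts.schedule || function (fn, ms) { setTimeout(fn, ms); };

  var last = null;  // previous value read
  var same = 0;     // how many consecutive reads returned that value
  var reads = 0;    // total reads so far

  function tick() {
    var current = readCount();
    reads += 1;
    same = (current === last) ? same + 1 : 1;
    last = current;
    if (same >= stableReads) { done(current, true); return; }
    if (reads >= maxReads) { done(current, false); return; }
    schedule(tick, interval);
  }

  tick();
}

// Assumed usage inside PhantomJS (kept as a comment so the helper itself
// has no PhantomJS dependency):
//
// page.open(url, function () {
//   pollUntilStable(function () {
//     return page.evaluate(function () {
//       return document.querySelectorAll('a').length;
//     });
//   }, function (count, settled) {
//     console.log('Number of a links: ' + count +
//                 (settled ? '' : ' (still changing)'));
//     phantom.exit();
//   });
// });
```

This is only a heuristic: a script that adds links after a pause longer than the quiet window would still be missed, so the interval should be chosen with the longest known delay on the page in mind. If the delay is known exactly, a single setTimeout of a little over 5 seconds before calling page.evaluate would be simpler.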