When I try to fetch markafoni.com's HTML with cURL, it returns:
<script> document.cookie = 'nsid=2;expires=sun, 17-jan-2038 01:00:00 gmt'; location.reload(true); </script><noscript>%90'a varan indirim markafoni'de</noscript>
$ch = curl_init();
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_COOKIEFILE, 'cookie.txt');
curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookie.txt');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_COOKIESESSION, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6");
curl_setopt($ch, CURLOPT_REFERER, 'http://www.markafoni.com/');
curl_setopt($ch, CURLOPT_URL, 'https://www.markafoni.com/');
How can I solve this problem?
The problem is the server's technique for setting cookies, which is rather idiosyncratic. It may in fact be intended to prevent the screen-scraping you're doing, but there may be other reasons.
The server has two different responses:
- If there is no nsid cookie set, it sends JavaScript to set one, and sends nothing else.
- If there is an nsid cookie set, it sends the page content.
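The two cases can be told apart by looking for a document.cookie assignment in the response body. The following is a minimal sketch, not a general JavaScript parser: the helper name extractJsCookie is hypothetical, and it assumes the script has the same shape as the response quoted in the question.

```php
<?php
// Hypothetical helper: pulls the "name=value" pair out of a
// document.cookie assignment such as the one quoted in the question.
// Returns null when the body contains no such assignment, which
// suggests the server sent the real page content.
function extractJsCookie(string $body): ?string
{
    // Match document.cookie = '...' (or "...") and capture everything
    // up to the first ';', i.e. the name=value part without attributes.
    if (preg_match('/document\.cookie\s*=\s*([\'"])([^;\'"]+)/', $body, $m)) {
        return trim($m[2]);
    }
    return null;
}

// The stub below is the response quoted in the question.
$stub = "<script> document.cookie = 'nsid=2;expires=sun, 17-jan-2038 01:00:00 gmt'; location.reload(true); </script>";
var_dump(extractJsCookie($stub)); // string(6) "nsid=2"
```

A null result would mean the body is ordinary page content; a non-null result is the cookie string the JavaScript would have set in a browser.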
cURL can receive and store cookies sent by the server, using the following code:
curl_setopt($ch, CURLOPT_COOKIEFILE, 'cookie.txt');
curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookie.txt');
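This, however, presumes the server is setting cookies in the normal way, i.e. using the Set-Cookie HTTP header. Since it's doing it with JavaScript (highly idiosyncratic!), cURL doesn't understand it: cURL does not execute scripts, so the cookie never reaches the cookie jar.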
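One way to confirm this is to check the raw response for Set-Cookie headers, which is possible here because the question's code enables CURLOPT_HEADER. This is a rough sketch: the helper name setCookieHeaders is hypothetical, the sample response is made up for illustration, and only the first header block is inspected (with redirects followed, cURL may return several).

```php
<?php
// Hypothetical helper: returns the values of all Set-Cookie headers found
// in the first header block of a raw response (headers + blank line + body).
function setCookieHeaders(string $rawResponse): array
{
    // The header block ends at the first blank line.
    $headerBlock = preg_split('/\r?\n\r?\n/', $rawResponse, 2)[0];

    $cookies = [];
    foreach (preg_split('/\r?\n/', $headerBlock) as $line) {
        // Header names are case-insensitive, hence stripos().
        if (stripos($line, 'Set-Cookie:') === 0) {
            $cookies[] = trim(substr($line, strlen('Set-Cookie:')));
        }
    }
    return $cookies;
}

// Made-up sample: the cookie is set only inside the script, not in a header.
$sample = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n"
        . "<script> document.cookie = 'nsid=2'; location.reload(true); </script>";
var_dump(setCookieHeaders($sample)); // empty array: nothing for the cookie jar to store
```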
You'll have to set the cookie yourself using the CURLOPT_COOKIE option:
curl_setopt($ch, CURLOPT_COOKIE, 'nsid=2');
The CURLOPT_COOKIE option sets the cookie string that cURL sends in the Cookie request header.
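Putting it together, a request that sends the cookie by hand might look like the sketch below. The function names nsidRequestOptions and fetchWithCookie are hypothetical, and the code assumes the server still accepts nsid=2; it is an illustration rather than a tested scraper.

```php
<?php
// Hypothetical helper: builds the option set for a request that sends
// the cookie manually instead of relying on the cookie jar.
function nsidRequestOptions(string $url, string $cookie): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,    // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,    // follow HTTP redirects
        CURLOPT_COOKIE         => $cookie, // sent as "Cookie: <cookie>"
    ];
}

// Hypothetical helper: performs the request and returns the body,
// or null if the transfer failed.
function fetchWithCookie(string $url, string $cookie): ?string
{
    $ch = curl_init();
    curl_setopt_array($ch, nsidRequestOptions($url, $cookie));
    $body = curl_exec($ch);
    // The handle is released when $ch goes out of scope.
    return $body === false ? null : $body;
}
```

It could then be called as fetchWithCookie('https://www.markafoni.com/', 'nsid=2'). If the server ever changes the value it expects, the first response's script would need to be read again to find the new one.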