Reading a website in PHP
Step 1:
First we open a socket to www.website.com on port 80, the default port for HTTP.
$website = fsockopen('www.website.com', 80);
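fsockopen() also accepts optional error arguments (passed by reference) and a connection timeout, which help explain why a connection failed. A minimal sketch, assuming a hypothetical helper named open_http_socket() that is not part of the original tutorial, and a 5 second timeout chosen only for illustration:

```php
<?php
// Hypothetical helper: opens a TCP socket and reports why it failed, if it did.
// Returns a two-element array: [socket or null, error message or null].
function open_http_socket(string $host, int $port = 80, float $timeout = 5.0): array
{
    $errno = 0;
    $errstr = '';
    // @ suppresses the PHP warning; the failure is reported via $errno and $errstr
    $socket = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if ($socket === false) {
        return [null, "Could not connect to $host:$port ($errno): $errstr"];
    }
    return [$socket, null];
}
```

It could be called as [$website, $error] = open_http_socket('www.website.com'); and $error echoed when $website is null.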
Step 2:
Check whether the socket is connected. If it is, we need to speak HTTP to let the server know what we want to read. The request looks like this:
GET / HTTP/1.1
Host: www.website.com
Connection: Close
The first line, GET /, names the file we want to read. Here we want the front page, so we leave the / as it is. To request another page, put its path there instead:
GET /anotherfolder/anotherfile.html HTTP/1.1
The second line is simply the host, which in this tutorial is www.website.com. Give only the host name here; the folder and filename belong in the GET line, not in the Host line.
The last line asks the server to close the connection once it has sent its response. When sending these lines to the HTTP server, end each one with a carriage return and a newline (\r\n). After the final Connection: Close line, send \r\n twice: the empty line tells the server that the request headers are complete.
$website = fsockopen('www.website.com', 80);
// check if the website is found
if (!$website) {
    echo "Couldn't open!";
} else {
    // write to the http server
    fwrite($website, "GET / HTTP/1.1\r\n");
    fwrite($website, "Host: www.website.com\r\n");
    fwrite($website, "Connection: Close\r\n\r\n");
}
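The request lines can also be assembled into one string and sent with a single fwrite() call. A minimal sketch, assuming a hypothetical helper named build_get_request() that is not part of the original tutorial:

```php
<?php
// Hypothetical helper: builds a minimal HTTP/1.1 GET request as one string.
function build_get_request(string $host, string $path = '/'): string
{
    $lines = [
        "GET $path HTTP/1.1",
        "Host: $host",
        "Connection: Close",
    ];
    // Every line ends with \r\n; one extra empty line ends the header block
    return implode("\r\n", $lines) . "\r\n\r\n";
}
```

With an open socket, fwrite($website, build_get_request('www.website.com')); sends the same bytes as the three separate fwrite() calls.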
Step 3:
The final step is to read the website.
$website = fsockopen('www.website.com', 80);
// check if the website is found
if (!$website) {
    echo "Couldn't open website!";
} else {
    // write to the http server
    fwrite($website, "GET / HTTP/1.1\r\n");
    fwrite($website, "Host: www.website.com\r\n");
    fwrite($website, "Connection: Close\r\n\r\n");
    // a variable for storing the html code
    $html = '';
    // read the website
    while (!feof($website)) {
        // store the html into a variable
        $html .= fgets($website, 128);
    }
    // when we're done we'll close the socket
    fclose($website);
}
The reading is done by fgets() inside the while loop. feof() reports whether the end of the response has been reached. As long as it has not, the loop keeps going: each fgets() call reads one line, or at most 127 bytes of it, and appends it to $html. Once feof() returns true the loop stops, and fclose() closes the socket.
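The string collected in $html holds the whole HTTP response: the status line and headers first, then an empty line, then the page itself. A minimal sketch of separating the two, assuming a hypothetical helper named split_http_response() that is not part of the original tutorial. It does not decode a body sent with Transfer-Encoding: chunked, which some HTTP/1.1 servers use.

```php
<?php
// Hypothetical helper: separates a raw HTTP response into headers and body.
// Returns a two-element array: [header block, body].
function split_http_response(string $raw): array
{
    // The header block ends at the first empty line (\r\n\r\n)
    $parts = explode("\r\n\r\n", $raw, 2);
    $headers = $parts[0];
    // If no empty line was found, treat the body as empty
    $body = $parts[1] ?? '';
    return [$headers, $body];
}
```

After the read loop, [$headers, $body] = split_http_response($html); would leave the page markup in $body.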
