Getting a web page with sockets

I am currently learning socket programming and have run into a problem I need help with. I'm trying to write a small Java class that connects to a web host, loads the default page, and then disconnects. I know it's easier to use URLConnection for this, but I'm trying to learn the Socket classes. I am able to connect to the web server, but I'm having difficulty retrieving the page. This is what I have so far (it doesn't work):

import java.io.*;
import java.net.*;
import java.lang.IllegalArgumentException;

public class SocketsFun {
    public static void main(String[] myArgs) {
        // Set some variables
        String theServer = null;
        String theLine = null;
        int thePort = 0;
        Socket theSocket = null;
        boolean exit = false;
        boolean socketCheck = false;
        BufferedReader theInput = null;

        // Grab the server and port number
        try {
            theServer = myArgs[0];
            thePort = Integer.parseInt(myArgs[1]);
            System.out.println("Opening a connection to " + theServer + " on port " + thePort);
        } catch (ArrayIndexOutOfBoundsException aioobe) {
            System.out.println("usage: SocketsFun host port");
            exit = true;
        } catch (NumberFormatException nfe) {
            System.out.println("usage: SocketsFun host port");
            exit = true;
        }

        if (!exit) {
            // Open the socket
            try {
                theSocket = new Socket(theServer, thePort);
            } catch (UnknownHostException uhe) {
                System.out.println("* " + theServer + " does not exist");
            } catch (IOException ioe) {
                System.out.println("* " + "Connection Refused");
            } catch (IllegalArgumentException iae) {
                System.out.println("* " + thePort + " Not A Valid TCP/UDP Port.");
            }

            // Print out some stuff
            try {
                System.out.println("Connected Socket: " + theSocket.toString());
            } catch (Exception e) {
                System.out.println("* " + "No Open Socket");
            }

            try {
                theInput = new BufferedReader(new InputStreamReader(theSocket.getInputStream()));
                while ((theLine = theInput.readLine()) != null) {
                    System.out.println(theLine);
                }
                theInput.close();
            } catch (IOException ioe) {
                System.out.println("* " + "No Data To Read");
            } catch (NullPointerException npe) {
                System.out.println("* " + "No Data To Read");
            }

            // Close the socket
            try {
                socketCheck = theSocket.isConnected();
            } catch (NullPointerException npe) {
                System.out.println("* " + "No Socket To Close");
            }
        }
    }
}

All I want is for this class to print the same output you would get from "curl", "lynx -dump", "wget", etc. Any help is appreciated.

2 answers

You have the right idea, but you are not sending an HTTP request. Send:

GET / HTTP/1.1\r\nHost: <hostname>\r\n\r\n

It follows the format

  [METHOD] [PATH] HTTP/1.1 [CRLF]
  Host: [HOSTNAME] [CRLF]
  OTHER: HEADERS [CRLF]
  [CRLF]
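
The format above can be assembled in Java roughly as follows. This is only a sketch: the class name HttpRequestBuilder, the method build, and the "Connection: close" header used in main are illustrative choices, not something from the question.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class HttpRequestBuilder {

    // Builds "[METHOD] [PATH] HTTP/1.1", the Host header, any extra headers,
    // and the terminating blank line. Every line ends with CRLF.
    static String build(String method, String path, String host,
                        Map<String, String> extraHeaders) {
        StringBuilder sb = new StringBuilder();
        sb.append(method).append(' ').append(path).append(" HTTP/1.1\r\n");
        sb.append("Host: ").append(host).append("\r\n");
        for (Map.Entry<String, String> header : extraHeaders.entrySet()) {
            sb.append(header.getKey()).append(": ")
              .append(header.getValue()).append("\r\n");
        }
        sb.append("\r\n"); // the empty line marks the end of the headers
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical extra header, used here only to show the OTHER: HEADERS slot.
        Map<String, String> extra = new LinkedHashMap<>();
        extra.put("Connection", "close");
        System.out.print(build("GET", "/", "example.com", extra));
    }
}
```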

You should get a response that follows a similar format: headers, a blank line, and then the data. Read more about the HTTP protocol.
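
If you want to separate those three parts in code, something like the following might do. It is a rough sketch that assumes the whole response is already in one String and that the server uses CRLF line endings. The class HttpResponseSplit and its field names are made up for illustration, and chunked or compressed bodies are not handled.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class HttpResponseSplit {
    final String statusLine;                       // e.g. "HTTP/1.1 200 OK"
    final Map<String, String> headers = new LinkedHashMap<>();
    final String body;                             // everything after the blank line

    HttpResponseSplit(String raw) {
        // The first empty line (CRLF CRLF) separates the headers from the data.
        int split = raw.indexOf("\r\n\r\n");
        String head = split >= 0 ? raw.substring(0, split) : raw;
        body = split >= 0 ? raw.substring(split + 4) : "";

        String[] lines = head.split("\r\n");
        statusLine = lines[0];
        for (int i = 1; i < lines.length; i++) {
            int colon = lines[i].indexOf(':');
            if (colon > 0) {
                headers.put(lines[i].substring(0, colon).trim(),
                            lines[i].substring(colon + 1).trim());
            }
        }
    }

    public static void main(String[] args) {
        // Sample input written by hand for this sketch, not captured from a server.
        String raw = "HTTP/1.1 200 OK\r\n"
                   + "Content-Type: text/html\r\n"
                   + "\r\n"
                   + "<html></html>";
        HttpResponseSplit response = new HttpResponseSplit(raw);
        System.out.println(response.statusLine);
        System.out.println(response.headers);
        System.out.println(response.body);
    }
}
```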

EDIT: Perhaps this will help you understand the syntax of an HTTP request. It's pretty simple, and good to know in general. Open a terminal and use netcat (preferably) or telnet: netcat google.com 80 or telnet google.com 80. Then type:

  GET / HTTP/1.1 [ENTER]
  Host: google.com [ENTER]
  [ENTER]

I get a response (after the second return):

  HTTP/1.1 301 Moved Permanently
  Location: http://www.google.com/
  Content-Type: text/html; charset=utf-8
  Date: Thu, 09 Dec 2010 00:03:39 GMT
  Expires: Sat, 08 Jan 2011 00:03:39 GMT
  Cache-Control: public, max-age=2592000
  Server: gws
  Content-Length: 219
  X-XSS-Protection: 1; mode=block

  <HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
  <TITLE>301 Moved</TITLE></HEAD><BODY>
  <H1>301 Moved</H1>
  The document has moved
  <A HREF="http://www.google.com/">here</A>.
  </BODY></HTML>

Once you are comfortable with the request syntax, just write the request to the socket, then read lines until the server closes the connection, as you already do.
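
Put together, a minimal sketch of that write-then-read exchange could look like this. The class name SocketFetch and the helper exchange are illustrative. The "Connection: close" header is my addition, on the assumption that you want the read loop to end promptly: without it, an HTTP/1.1 server may keep the connection open after responding, and readLine() would block until the server times out.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.stream.Collectors;

public class SocketFetch {

    // Writes a GET request for "/" to `out`, then reads lines from `in`
    // until the other side closes the stream.
    static String exchange(InputStream in, OutputStream out, String host) {
        PrintWriter writer = new PrintWriter(
                new OutputStreamWriter(out, StandardCharsets.US_ASCII));
        // Use explicit \r\n; println() would emit the platform line separator.
        writer.print("GET / HTTP/1.1\r\n");
        writer.print("Host: " + host + "\r\n");
        writer.print("Connection: close\r\n"); // assumption: ask the server to close when done
        writer.print("\r\n");
        writer.flush(); // nothing reaches the server until the buffer is flushed

        BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.ISO_8859_1));
        return reader.lines().collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.out.println("usage: SocketFetch host port");
            return;
        }
        String host = args[0];
        int port = Integer.parseInt(args[1]);
        // try-with-resources closes the socket even if the exchange fails.
        try (Socket socket = new Socket(host, port)) {
            System.out.println(exchange(socket.getInputStream(),
                                        socket.getOutputStream(), host));
        }
    }
}
```

Run it the same way as the class in the question, for example: java SocketFetch google.com 80. Keeping the stream handling in exchange, separate from the Socket setup, also makes it easy to try against in-memory streams.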


You need to write something to the socket's output stream. Web servers wait for a request from the client before sending anything: a "GET" request asks the server to return the default page.

Your code never writes anything, so the server just keeps waiting.
