Berkeley sockets notes


Last Updated: Tuesday September 4 2001

Your Moment of Zen

According to the socket(2) manpage, the socket(2) call was first found in 4.2BSD.

Important Note

If you don't know C, you shouldn't be reading this because you'll go blind and swear off programming for good ("Holy sh*t, I have to learn all this crap to start a dot-com?!"), possibly ruining a perfectly good CS undergrad. Didn't your mommy tell you what happens to computer science dropouts? Hint: they end up working for Fry's...

In case you were wondering, yes, your web browser really does do all this stuff. A web browser is just a big fat socket program that pays a lot of attention to the data it gets. Yes, even on Windows. They call it Windows Sockets, but it's still good ol' bind(2), select(2), and accept(2).

Now that that's over with, here we go...

H00k3d on S0ck3ts wurk3ed for m3!

The point of sockets is to provide a way to get information from one place to another via a computer network (which could use any of a hundred protocols) in a way that doesn't require that you rewrite your program every time it gets ported to a different platform. Here's how to do it, step by step.

  1. Create a socket. A socket is a lot like a filehandle, but instead of calling fopen(3) or open(2) to get something like:

    FILE * fp;

    or

    int fd;

    you use the socket(2) call to create it. The socket(2) prototype on most true-blue UNIXes is

    #include <sys/types.h>
    #include <sys/socket.h> 
    int socket(int domain, int type, int protocol);  

    The domain is a namespace, or address family. Probably the most popular domain argument value is the Internet domain, PF_INET (the PF stands for protocol family). Other less commonly used arguments are PF_UNIX or PF_LOCAL for UNIX domain sockets in the local filesystem, PF_IPX (Novell IPX), PF_APPLETALK (Appletalk DDP), and PF_PACKET (for raw socket access on Linux).

    The second argument (int type) is the type of connection desired. Together with the domain argument, it maps to the actual layer 4 protocol (such as UDP or SPX). If you're running some kind of UNIX variant, you must have the protocols in question compiled into your kernel. Here are the protocols I get with different arguments on my Linux 2.2 system (with IPX and Appletalk DDP compiled into the kernel) at home:

                  SOCK_STREAM  SOCK_DGRAM  SOCK_SEQPACKET  SOCK_RAW  SOCK_RDM  SOCK_PACKET
    PF_INET       TCP/IP       UDP/IP      Nope            Nope      Nope      Nope
    PF_IPX        Nope         Yes, ?      Nope            Nope      Nope      Nope
    PF_APPLETALK  Nope         Yes, DDP?   Nope            Nope      Nope      Nope

    /* TCP socket */
    s = socket(PF_INET, SOCK_STREAM, 0); 
    /* UDP socket */
    s = socket(PF_INET, SOCK_DGRAM, 0);
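If you want to build a table like the one above for your own box, a small probe will do it. This is only a rough sketch, not the program I originally used: the helper names socket_supported() and print_row() are made up for illustration, and the probe only tells you whether socket(2) accepts a domain/type pair with protocol 0, not which layer-4 protocol you actually got.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: returns 1 if the kernel will create a socket for
 * this domain/type pair (with protocol 0), 0 if socket(2) refuses. */
int socket_supported(int domain, int type) {
    int s = socket(domain, type, 0);
    if (s == -1) {
        return 0;
    }
    if (close(s) == -1) {
        perror("close");
    }
    return 1;
}

/* Hypothetical helper: print one row of the table for a given domain. */
void print_row(const char *name, int domain) {
    static const int types[] = {
        SOCK_STREAM, SOCK_DGRAM, SOCK_SEQPACKET, SOCK_RAW, SOCK_RDM
    };
    static const char *labels[] = {
        "SOCK_STREAM", "SOCK_DGRAM", "SOCK_SEQPACKET", "SOCK_RAW", "SOCK_RDM"
    };
    size_t i;

    assert(name != NULL);

    printf("%-14s", name);
    for (i = 0; i < sizeof(types) / sizeof(types[0]); ++i) {
        printf(" %s=%s", labels[i],
               socket_supported(domain, types[i]) ? "yes" : "nope");
    }
    printf("\n");
}
```

Calling print_row("PF_INET", PF_INET) from main() prints one row. Keep in mind that a "nope" can also mean "not allowed": SOCK_RAW, for instance, usually needs root.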
    

    socket(2) returns -1 (and sets errno) if an error occurs, or a socket handle (a small non-negative integer) if it succeeds.

    The third argument, protocol, selects a specific layer-4 protocol when the combination of the domain and type arguments leaves some ambiguity as to which one should actually be used. For TCP and UDP there is no ambiguity, so pass 0 to let socket(2) pick the default. The Linux socket(2) man page has this to say about it:

    The protocol specifies a particular protocol to be used with the
    socket.  Normally only a single protocol exists to support a
    particular socket type within a given protocol family.  However, it
    is possible that many protocols may exist, in which case a particular
    protocol must be specified in this manner.  The protocol number to use
    is specific to the "communication domain" in which communication is
    to take place; see protocols(5).  See getprotoent(3) on how to map
    protocol name strings to protocol numbers.
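To see the name-to-number mapping the man page is talking about, here is a minimal sketch using getprotobyname(3), a sibling of getprotoent(3). The helper name protocol_number() is my own invention, and the lookup depends on your system having a protocols database (usually /etc/protocols).

```c
#include <assert.h>
#include <stddef.h>
#include <netdb.h>

/* Hypothetical helper: look up a protocol number by name in the
 * protocols database. Returns the number, or -1 if the name is unknown
 * (or the database is missing). */
int protocol_number(const char *name) {
    struct protoent *p;

    assert(name != NULL);

    p = getprotobyname(name);
    if (p == NULL) {
        return -1;
    }
    return p->p_proto;
}
```

If the lookup succeeds, you can hand the result straight to socket(2) as the third argument, e.g. socket(PF_INET, SOCK_STREAM, protocol_number("tcp")). For TCP that is the long way of writing 0.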
    

    Sample code for creating a TCP socket:

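    #include <stdio.h>
    #include <unistd.h>           /* close(2) */
    #include <sys/types.h>
    #include <sys/socket.h>

    int main(void) {
    	int s;
    	int ret;

    	s = socket(PF_INET, SOCK_STREAM, 0);
    	if(s == -1) {
    		perror("socket");
    		return 1;         /* no socket, so nothing to close */
    	}

    	/* Do stuff here. */

    	ret = close(s);
    	if(ret == -1) {
    		perror("closing socket");
    		return 1;
    	}

    	return 0;
    }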
    
    Always check the return value of system calls. Really. You'll thank me for it in the long run, even though it means opening up those horrible man pages and grepping for the ERRORS section, horrors. For fun, go punch "always check the return value" into a search engine and you can find hours of entertainment with the documents that say things like "it took me 5 hours to realize that I wasn't checking the return value of open("/dev/null");" and "gosh, who would have thought that calling setsockopt(2) would fail?"

    Don't forget to call close(2) on the socket when you're finished with it! And don't forget to check the return value of close(2).

  2. (optional) Create the source address data structure, if you care about the "return address" of your packets. If you skip this step, the operating system will usually fill in reasonable defaults. For example, a TCP connection to a web server works fine if you use the defaults.

    However, NFS and other old-school services (like the r* family of services) usually refuse connections unless they come from a source port below 1024. If your connection does come from such a port, the server assumes that you have root on one of the boxes on the network (or root is ok with you pretending you do), because only root can bind to a port below 1024. Also, certain DNS implementations require that incoming DNS requests have a source port of 53. In these cases (and probably many others), you must fill in the source address structure in order to specify the correct return address for the connection, or the server on the other end will refuse it.

    The source address data structure is (for PF_INET) sockaddr_in. Other protocols use different structures. Here are some examples:

                        IPX           Appletalk DDP    ATM                               NetBEUI
    Windows sockets 2   sockaddr_ipx  sockaddr_at (?)  sockaddr_atm                      ?
    Linux 2.4           sockaddr_ipx  sockaddr_atalk   sockaddr_atmpvc, sockaddr_atmsvc  sockaddr_netbeui
    FreeBSD 4.1-STABLE  sockaddr_ipx  Nope             sockaddr_atm                      Nope



    Typical code:

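    #include <string.h>           /* memset(3) */
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netinet/in.h>       /* struct sockaddr_in, INADDR_ANY */
    
    struct sockaddr_in src;
    memset(&src, 0, sizeof(src));
    src.sin_family = AF_INET;
    src.sin_port = htons(0);                  /* 0 = let the OS pick a port */
    src.sin_addr.s_addr = htonl(INADDR_ANY);  /* 32-bit value: htonl, not htons */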
    

    What I've seen is that the C library headers define a structure called struct sockaddr, which is defined (on Linux, at least) as:

    typedef unsigned short sa_family_t;
    
    struct sockaddr {
            sa_family_t sa_family;
            char        sa_data[14];
    };
    

    You can think of struct sockaddr as a kind of container which the other sockaddr_* structs fit inside of. Many of the library routines take or fill in a pointer to a struct sockaddr, and it's much easier to just cast whatever structure you have to struct sockaddr * instead of having a zillion different library routines for the many protocols the library can handle. Here is how some of the sockaddr structs are defined:

    
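    struct in_addr {
            uint32_t       s_addr;        /* IPv4 address, network byte order */
    };

    struct sockaddr_in {
            sa_family_t    sin_family;    /* 1: address family (AF_INET) */
            unsigned short sin_port;      /* 2: source port, network byte order */
            struct in_addr sin_addr;      /* 3: source address */
            unsigned char  sin_zero[8];   /* padding up to sizeof(struct sockaddr) */
    };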
      
    #define IPX_NODE_LEN    6
     
    struct sockaddr_ipx
    {
            sa_family_t     sipx_family;
            __u16           sipx_port;
            __u32           sipx_network;
            unsigned char   sipx_node[IPX_NODE_LEN];
            __u8            sipx_type;
            unsigned char   sipx_zero;      /* 16 byte fill */
    };             
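To make the "container" idea concrete, here is a small sketch of a routine that accepts any struct sockaddr * and looks at sa_family to decide what is really inside. The name describe_sockaddr() is hypothetical, and I only handle AF_INET here; other families would get their own case.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: format a sockaddr as "address:port" in buf.
 * Returns 0 on success, -1 if the family isn't handled or buf is too small. */
int describe_sockaddr(const struct sockaddr *sa, char *buf, size_t len) {
    char ip[INET_ADDRSTRLEN];
    const struct sockaddr_in *in;
    int n;

    assert(sa != NULL && buf != NULL);

    switch (sa->sa_family) {
    case AF_INET:
        /* the family says it's really a sockaddr_in, so cast back */
        in = (const struct sockaddr_in *) sa;
        if (inet_ntop(AF_INET, &in->sin_addr, ip, sizeof(ip)) == NULL) {
            return -1;
        }
        n = snprintf(buf, len, "%s:%u", ip, (unsigned) ntohs(in->sin_port));
        return (n < 0 || (size_t) n >= len) ? -1 : 0;
    default:
        return -1;   /* a family this sketch doesn't know about */
    }
}
```

The caller does the cast in the other direction: fill in a struct sockaddr_in, then pass (struct sockaddr *) &addr. That is exactly the cast you'll see in calls to bind(2) and connect(2).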
    
  3. Now that a socket has been created, you probably need to look up the address of the destination machine. The functions in question are:
    #include <netdb.h>
    
    struct hostent *gethostbyname(const char *name);
    struct hostent *gethostbyname2(const char *name, int af);
     
    #include <sys/socket.h>        /* for AF_INET */
    struct hostent *gethostbyaddr(const char *addr, int len, int type);
     
    void sethostent(int stayopen);
     
    void endhostent(void);
     
    void herror(const char *s);
     
    const char * hstrerror(int err);
    
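    struct hostent {
            char  *h_name;        /* official name of the host */
            char **h_aliases;     /* NULL-terminated list of aliases */
            int    h_addrtype;    /* address type, e.g. AF_INET */
            int    h_length;      /* length of each address in bytes */
            char **h_addr_list;   /* NULL-terminated list of addresses */
    };
    #define h_addr h_addr_list[0] /* first address, for backward compatibility */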
    
    /* 
     * resolve-yahoo.c
     */
    #include <stdio.h>
    
    #include <netdb.h>
    
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>
    
    int main(void) {
      int i;
      struct hostent *h;
    
      h = gethostbyname("www.yahoo.com");
      if(h == NULL) {
        /* gethostbyname(3) reports errors in h_errno, not errno */
        herror("gethostbyname");
        return 1;
      }
    
      if(h->h_length != 4) {
        fprintf(stderr, "Error: non-IPv4 address returned by resolver.\n");
        return 1;
      }
      
      printf("official name: %s\n\n", h->h_name);
      for(i = 0; h->h_aliases[i] != NULL; ++i) {
        printf("alias %2i: %2s\n", i, h->h_aliases[i]);
      }
    
      printf("\n");
    
      for(i = 0; h->h_addr_list[i] != NULL; ++i) {
        printf("addr  %2i: %2s\n", i, inet_ntoa(*((struct in_addr *)h->h_addr_list[i])));
      }
    
      return 0;
    }
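For comparison, here is the same kind of lookup sketched with getaddrinfo(3), the newer POSIX interface that is meant to replace gethostbyname(3). This is a minimal sketch under a couple of assumptions: the helper name first_ipv4() is made up, and it deliberately asks for IPv4 only so that it stays comparable with the example above.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for getaddrinfo(3) under strict -std= modes */
#endif
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: resolve host to its first IPv4 address and write
 * it as dotted-quad text into buf. Returns 0 on success, or a
 * getaddrinfo(3) error code (printable with gai_strerror(3)) on failure. */
int first_ipv4(const char *host, char *buf, size_t len) {
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    const struct sockaddr_in *in;
    int err;

    assert(host != NULL && buf != NULL);

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;         /* IPv4 only, like the example above */
    hints.ai_socktype = SOCK_STREAM;   /* avoid one result per socket type */

    err = getaddrinfo(host, NULL, &hints, &res);
    if (err != 0) {
        return err;
    }

    /* ai_addr is a struct sockaddr *; the family tells us what to cast to */
    in = (const struct sockaddr_in *) res->ai_addr;
    if (inet_ntop(AF_INET, &in->sin_addr, buf, (socklen_t) len) == NULL) {
        err = EAI_SYSTEM;              /* errno has the details */
    }

    freeaddrinfo(res);
    return err;
}
```

A caller would check the return value (of course) and print gai_strerror(err) on failure. Names and numeric strings both work as the host argument; a numeric string such as "127.0.0.1" is converted without asking a name server.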
    
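    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <netdb.h>
    
    /* for inet_network(3) */
    
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>  
    
    int main(void) {
      struct sockaddr_in src;
      struct servent * service;
      in_addr_t addr;
      int port = 0;
      char int_interface[] = "192.168.0.1";
    
      service = getservbyname("domain", "tcp");
      if(service == NULL) {
        /* getservbyname(3) isn't documented to set errno, so skip perror(3) */
        fprintf(stderr, "getservbyname: no such service\n");
        exit(1);
      }
    
      /* need to convert from network to host byte order using ntohs(3) */
      port = ntohs(service->s_port);
    
      /* inet_network(3) returns the address in host byte order */
      addr = inet_network(int_interface);
      if(addr == (in_addr_t) -1) {
        fprintf(stderr, "inet_network: bad address\n");
        exit(1);
      }
    
      memset(&src, 0, sizeof(src));
      src.sin_family = AF_INET;
      src.sin_port = htons(port);
      src.sin_addr.s_addr = htonl(addr);  /* to network order, exactly once */
    
      return 0;
    }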
    


  4. This is required only if you executed step 2. Use bind(2) to bind the source address data structure to the socket handle (the return value of the socket(2) call). The usual prototype is:

    #include <sys/types.h>
    #include <sys/socket.h>

    int bind(int sockfd, const struct sockaddr *my_addr, socklen_t addrlen);

    Typical usage is:

    if(bind(s, (struct sockaddr *) &src, sizeof(src)) == -1) {
    	perror("bind");
    }
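Putting steps 2 and 4 together, a source-address bind might look something like the sketch below. The helper names bind_source() and bound_port() are hypothetical. The second one uses getsockname(2) to ask which port you actually ended up with, which is handy when you passed 0 and let the OS choose.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: bind socket s to a dotted-quad source address and
 * a port given in host byte order (0 = let the OS choose).
 * Returns 0 on success, -1 on error. */
int bind_source(int s, const char *ip, unsigned short port) {
    struct sockaddr_in src;

    assert(ip != NULL);

    memset(&src, 0, sizeof(src));
    src.sin_family = AF_INET;
    src.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &src.sin_addr) != 1) {
        fprintf(stderr, "bind_source: bad address %s\n", ip);
        return -1;
    }

    /* note the cast: bind(2) wants the generic struct sockaddr * */
    if (bind(s, (struct sockaddr *) &src, sizeof(src)) == -1) {
        perror("bind");
        return -1;
    }
    return 0;
}

/* Hypothetical helper: return the local port of s in host byte order,
 * or -1 on error. */
int bound_port(int s) {
    struct sockaddr_in got;
    socklen_t len = sizeof(got);

    if (getsockname(s, (struct sockaddr *) &got, &len) == -1) {
        perror("getsockname");
        return -1;
    }
    return ntohs(got.sin_port);
}
```

Asking for a port below 1024 this way will fail with a permission error unless you're root, which is the whole point of the old-school trust scheme described in step 2.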

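  5. (optional) Set SO_LINGER with setsockopt(2) if you don't want your socket to hang unreasonably when you close it. SO_LINGER controls how long close(2) may block while unsent data is still queued. Note that it does not shorten connection timeouts: if you're connecting to a computer which may not be present, and the default TCP timeouts are longer than you really need, use a non-blocking connect(2) together with select(2) instead.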
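The mechanics of setting SO_LINGER can be sketched as follows. The helper name set_linger() is made up for illustration; the struct linger fields and the setsockopt(2) call are the standard ones. What the option governs is how long close(2) is allowed to block while unsent data is still queued.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: let close(2) wait at most `seconds` for queued
 * data to go out. Returns 0 on success, -1 on error. */
int set_linger(int s, int seconds) {
    struct linger lin;

    assert(seconds >= 0);

    lin.l_onoff = 1;          /* turn lingering on */
    lin.l_linger = seconds;   /* upper bound on how long close(2) blocks */

    /* SOL_SOCKET: the option lives at the socket layer, not in TCP itself */
    if (setsockopt(s, SOL_SOCKET, SO_LINGER, &lin, sizeof(lin)) == -1) {
        perror("setsockopt(SO_LINGER)");
        return -1;
    }
    return 0;
}
```

Call it any time after socket(2) and before close(2). And yes, setsockopt(2) can fail, so check what it returns.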

  6. Create the destination address data structure.

  7. Connect with connect(2).

  8. Send data using send(2).

  9. Shut down the connection using shutdown(2), then release the socket with close(2).
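Steps 6 through 9 can be sketched roughly like this. It is a sketch under assumptions, not a finished client: the helper names tcp_connect(), send_all() and finish() are hypothetical, the destination is given as a dotted-quad string rather than looked up, and interrupted system calls (EINTR) are not retried.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Steps 1, 6 and 7: create a TCP socket and connect it to ip:port.
 * Returns the connected socket, or -1 on error. */
int tcp_connect(const char *ip, unsigned short port) {
    struct sockaddr_in dst;
    int s;

    assert(ip != NULL);

    /* step 6: destination address, everything in network byte order */
    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &dst.sin_addr) != 1) {
        fprintf(stderr, "tcp_connect: bad address %s\n", ip);
        return -1;
    }

    s = socket(PF_INET, SOCK_STREAM, 0);                    /* step 1 */
    if (s == -1) {
        perror("socket");
        return -1;
    }

    if (connect(s, (struct sockaddr *) &dst, sizeof(dst)) == -1) {  /* step 7 */
        perror("connect");
        close(s);
        return -1;
    }
    return s;
}

/* Step 8: send(2) may accept fewer bytes than you asked for, so loop
 * until everything has been handed to the kernel. Returns 0 or -1. */
int send_all(int s, const char *buf, size_t len) {
    size_t off = 0;

    assert(buf != NULL);

    while (off < len) {
        ssize_t n = send(s, buf + off, len - off, 0);
        if (n == -1) {
            perror("send");
            return -1;
        }
        off += (size_t) n;
    }
    return 0;
}

/* Step 9: shutdown(2) ends the conversation, close(2) frees the handle.
 * Returns 0 if both succeeded, -1 otherwise. */
int finish(int s) {
    int ok = 0;

    if (shutdown(s, SHUT_RDWR) == -1) {
        perror("shutdown");
        ok = -1;
    }
    if (close(s) == -1) {
        perror("closing socket");
        ok = -1;
    }
    return ok;
}
```

A caller would chain the three: s = tcp_connect(...), then send_all(s, msg, strlen(msg)), then finish(s), checking each return value along the way.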

A Name Like Any Other

Ok, so now you know how to connect to a system, great. But just where do you get those network addresses? When you use your web browser, you type in things like "www.foo.com", not "123.456.789.123". Well, it turns out that this task of having easy to use dot-com names for addresses of servers on the Internet is a really, really hard problem. The solution is the DNS, or Domain Name System.

The Domain Name System is physically a set of servers on the Internet, the most important of which are managed by big organizations like NASA, the University of Maryland, and the ISC. They maintain what are called the "root" name servers. Those servers simply refer you onward: the root servers point you at the servers for the "com" top-level domain, which in turn refer you to another organization's name server when you look up "lightconsulting.com", for example.

FIXME graphic

Sockets in Perl

Servers

To make a TCP server in Perl, you need to do the following steps:

  1. Enable taint mode. See the Perl docs for details on this.
  2. (optional if you know the numeric value of the port) Get the port to listen on using getservbyname().
  3. (optional if you know that your combination of protocol family and socket type results in TCP on your platform) Get the protocol number for TCP using getprotobyname().
  4. Create the socket using socket().
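  5. Optionally set options on the socket using setsockopt(). The "level" argument says which protocol layer the option applies to: SOL_SOCKET for options on the socket itself (such as SO_REUSEADDR), or a protocol number for protocol-specific options.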
  6. Bind the socket to the address you want. You have to pass the bind() function a sockaddr_in structure, which you can get by handing a port and address to the sockaddr_in() helper sub in the Socket module.
  7. Listen for connections using the listen() perl builtin. If you're using UDP or a socket type that's not SOCK_STREAM or SOCK_SEQPACKET you don't have to listen(), you just read() from the socket. The second argument (the backlog) limits the number of pending connections that can be queued waiting for accept(); exactly what counts as "pending" varies by system. Once the queue is full, the system starts ignoring or refusing new SYN connections.
  8. Set up a signal handler for your system's CHLD signal. Alternatively, you could set the CHLD signal to 'IGNORE'. On most systems, this will automatically deal with the reaping of child processes. If you want/need to write the signal handler yourself, a typical one looks like this:
    sub reaper {
      $waitedpid = wait;     # wait until the child dies then clean it up
      $SIG{CHLD} = \&reaper; # signal handlers are one-shot; this resets it 
    }
    
    $SIG{CHLD} = \&reaper; # set up the signal handler
    
  9. Accept an incoming connection using the accept() perl builtin. Save the sockaddr_in structure returned; you'll need it for the next step.
  10. Use the sockaddr_in sub from the Socket module to take apart the structure returned from accept(). Log the port and address. Use gethostbyaddr() to do a reverse DNS lookup on the address. Then do another forward lookup on the results of the reverse lookup and search the list of addresses to see if one of them matches the address on the other side of the connection. So:

    accept(Client, Server) -> 1035, 1.2.3.4
    gethostbyaddr(1.2.3.4) -> www.whitehouse.gov
    gethostbyname(www.whitehouse.gov) -> 5.6.7.8, 5.6.7.9, 5.6.7.10, 5.6.7.11

    If you can't find a match, someone is spoofing their reverse DNS. Log a warning.
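    The matching step is the same in any language. Here is a rough sketch of it in C (to match the examples earlier in these notes) rather than Perl. Both helper names are hypothetical: address_in_list() does the actual comparison, and reverse_dns_matches() strings the two lookups together using gethostbyaddr(3) and gethostbyname(3). It assumes IPv4 throughout.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>

/* Hypothetical helper: return 1 if peer appears in a NULL-terminated
 * list of IPv4 addresses laid out like hostent's h_addr_list, else 0. */
int address_in_list(const struct in_addr *peer, char *const *addr_list) {
    size_t i;

    assert(peer != NULL && addr_list != NULL);

    for (i = 0; addr_list[i] != NULL; ++i) {
        if (memcmp(addr_list[i], peer, sizeof(*peer)) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Hypothetical helper: reverse lookup, then forward lookup, then compare.
 * Returns 1 if the peer's address shows up in the forward lookup of its
 * own reverse name, 0 if not (or if either lookup fails). */
int reverse_dns_matches(const struct in_addr *peer) {
    char name[256];
    struct hostent *h;

    assert(peer != NULL);

    h = gethostbyaddr(peer, sizeof(*peer), AF_INET);
    if (h == NULL) {
        return 0;
    }

    /* copy the name out: the next resolver call may overwrite *h */
    strncpy(name, h->h_name, sizeof(name) - 1);
    name[sizeof(name) - 1] = '\0';

    h = gethostbyname(name);
    if (h == NULL || h->h_addrtype != AF_INET) {
        return 0;
    }
    return address_in_list(peer, h->h_addr_list);
}
```

    With the addresses from the example above, 1.2.3.4 is not among 5.6.7.8 through 5.6.7.11, so the check returns 0 and you log the warning.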

  11. Fork a subprocess to deal with the connection.
  12. Return to listening for an incoming connection.
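
The Perl builtins used in the steps above are thin wrappers around C calls of the same names, so the overall shape of the server can also be sketched in C, the language of the earlier examples. This is only an outline under assumptions: make_listener() and serve() are hypothetical names, the server binds to all interfaces, child reaping is left to the 'IGNORE' approach from step 8, and the logging and reverse DNS check from step 10 are left out.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Steps 4-7: create a TCP socket, set an option, bind, and listen.
 * port is in host byte order (0 = any free port).
 * Returns the listening socket, or -1 on error. */
int make_listener(unsigned short port, int backlog) {
    struct sockaddr_in addr;
    int on = 1;
    int s;

    s = socket(PF_INET, SOCK_STREAM, 0);
    if (s == -1) {
        perror("socket");
        return -1;
    }

    /* step 5: SOL_SOCKET is the level, SO_REUSEADDR the option */
    if (setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) == -1) {
        perror("setsockopt");
        close(s);
        return -1;
    }

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_ANY);

    if (bind(s, (struct sockaddr *) &addr, sizeof(addr)) == -1) {
        perror("bind");
        close(s);
        return -1;
    }
    if (listen(s, backlog) == -1) {
        perror("listen");
        close(s);
        return -1;
    }
    return s;
}

/* Steps 8, 9, 11 and 12: accept connections forever, forking one child
 * per connection. handle() is whatever you want done with the client. */
void serve(int listener, void (*handle)(int client)) {
    assert(handle != NULL);

    /* step 8, the easy way: ask the system to reap children for us */
    if (signal(SIGCHLD, SIG_IGN) == SIG_ERR) {
        perror("signal");
    }

    for (;;) {
        struct sockaddr_in peer;
        socklen_t len = sizeof(peer);
        pid_t pid;
        int c = accept(listener, (struct sockaddr *) &peer, &len);

        if (c == -1) {
            if (errno == EINTR) {
                continue;          /* interrupted by a signal, try again */
            }
            perror("accept");
            return;
        }

        pid = fork();              /* step 11 */
        if (pid == -1) {
            perror("fork");
        } else if (pid == 0) {
            close(listener);       /* the child only needs the client */
            handle(c);
            close(c);
            _exit(0);
        }
        close(c);                  /* parent: back to accept(), step 12 */
    }
}
```

A main() would call make_listener(port, 5) and pass the result to serve() along with a handler function. The peer structure filled in by accept() is what you would take apart and log for step 10.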
