Sockets and winsock

Winsock ('Windows Sockets') is the Windows API that deals with networking. Many functions are implemented in the same way as the Berkeley socket functions used in BSD Unix.

1. Sockets

So what's a socket?

Socket: As explained in the previous chapter, you will work with two-way connections. The endpoints of this connection are the sockets. Both the client and the server have a socket. A socket is associated with a certain IP and port number.

Almost all winsock functions operate on a socket, as it's your handle to the connection. Both sides of the connection use a socket, and sockets are not platform-specific (i.e. a Windows and a Unix machine can talk to each other using sockets). Sockets are also two-way: data can be both sent and received on a socket.

There are two common types of socket: the streaming socket (SOCK_STREAM) and the datagram socket (SOCK_DGRAM). The streaming variant is designed for applications that need a reliable connection, often using continuous streams of data. The protocol used for this type of socket is TCP. I will only use this type in my tutorial, as it's the one most commonly used for well-known protocols like HTTP, FTP, SMTP, POP3 etc.

Datagram sockets use UDP as the underlying protocol, are connectionless, and limit the size of a single message. They are intended for applications that send data in small packets and that do not require perfect reliability. Unlike streaming sockets, datagram sockets guarantee neither that data will reach its destination nor that it arrives in the right order. Datagram sockets can be slightly faster and are useful for applications like streaming audio or video, where reliability is not as high on the priority list as speed and latency. Where reliability is required, streaming sockets are used.
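To make the two types concrete, here is a minimal sketch. It is written in Python rather than against winsock directly, on the assumption that Python's socket module is close enough to the Berkeley-style calls for illustration. The loopback address, the port chosen by the system and the 2 second timeout are just choices for this example.

```python
import socket

# Streaming socket: TCP, connection-oriented, reliable byte stream.
stream_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
stream_kind = stream_sock.type
stream_sock.close()

# Datagram sockets: UDP, connectionless, every send is one separate packet.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))   # give the receiver an address to send to
receiver.settimeout(2.0)          # datagrams may get lost, so never wait forever
receiver_addr = receiver.getsockname()

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"ping", receiver_addr)   # no connection is set up first

data, sender_addr = receiver.recvfrom(1024)
print(data, "received from", sender_addr)

sender.close()
receiver.close()
```

Note that the datagram sender never connects: every sendto call names its destination, and nothing tells the sender whether the packet arrived.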

2. Binding sockets

Binding a socket means associating a specific address (IP & port number) with a given socket. This can be done manually using the bind function, but in some cases winsock will automatically bind the socket. This will become clear in the next paragraphs.

3. Connecting

The way you use a socket depends on whether you are on the client side or the server side. The client side initiates a connection by creating a socket and calling the connect function with the address of the server. Before the socket is connected, it is not yet bound to an IP or port number. The client side can use any of its IPs and any free port number for the connection with the server (provided that the network the chosen IP is part of can reach the network of the destination IP), so often many usable combinations are possible.

When connect is called, winsock will choose the IP and port number to use for the connection and bind the socket to them before actually connecting it. The port number can be anything that is free at the moment, but the IP number needs a bit more care. PCs may have more than one IP. For example, a PC connected to both the internet and a local network has at least three IPs: the external IP for use with the internet, the local network IP (192.168.x.x, 10.0.x.x etc.) and the loopback address (127.0.0.1). Here, it does matter which IP the socket is bound to, as it also determines the network you are using for the connection. If you want to connect to the local PC 192.168.0.4, you cannot do that using the network of your internet provider, as that IP is never used on the internet and will not be found. So you would have to bind the socket to your IP in the same network (192.168.0.1 for example). Similarly, when you bind the socket to the loopback address (127.0.0.1), you can only connect to your own machine, as no other machine exists in that 'network'.

Fortunately, winsock will automatically choose a local IP that can reach the IP you want to connect to. Nothing stops you from binding the socket yourself, but remember that you need to take the situations above into consideration.
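The automatic binding can be observed with a small experiment. This is a sketch in Python (whose socket module I assume mirrors the Berkeley-style calls closely enough); the listener on the loopback address is a hypothetical stand-in for a real server, there only so the client has something to connect to.

```python
import socket

# Hypothetical stand-in for a server, listening on a system-chosen port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
server_addr = listener.getsockname()

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# No bind call here: the socket gets its local IP and port during connect.
client.connect(server_addr)
local_ip, local_port = client.getsockname()
print("client socket was bound to", local_ip, "port", local_port)

client.close()
listener.close()
```

Because the destination is the loopback address, the local IP picked for the client is the loopback address as well. The port is whatever happened to be free.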

Note that the bind function gives the user the option to set the IP or port number to zero. In this case, zero means 'let winsock choose something for me'. This is useful when you do want to connect using a specific IP on the client side, but do not care about the port number used.
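A minimal sketch of that last case, again in Python as an approximation of the Berkeley-style bind call: the IP is fixed (the loopback address is only an example value) and the port is left at zero.

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Specific IP, port zero: 'choose a free port number for me'.
sock.bind(("127.0.0.1", 0))

# Ask which address the socket actually ended up with.
bound_ip, bound_port = sock.getsockname()
print("bound to", bound_ip, "port", bound_port)

sock.close()
```

After the bind, the port reported by getsockname is a real, non-zero port number. The zero was only a request, not the value that gets used.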

4. Listening

Things are different on the server side. A server has to wait for incoming connections and clients will need to know both the IP and port number of the server to be able to connect to it. To make things easy, servers almost always use a fixed port number (often the default port number for the used protocol).

Waiting for incoming connections on a specified address is called listening:

Listening: A socket is listening when it is in a state where it will 'listen' for incoming connections. Usually, this is done on a socket bound to a specific address known to the client.

As you can see from the definition above, a socket is bound to an address before it is put into the listening state. When the port number of this address is set to a fixed number, the server will listen for incoming connections on that port number specifically. For example, most web servers listen on port 80 (the default for HTTP). The IP of the address can be set to zero, in which case the socket will listen on all addresses available, effectively allowing connections from all networks. It can also be set to a fixed IP, for example the IP of the local network interface, so that computers on the local network can connect to the server but those connecting via the internet cannot.

When a client requests a connection to a listening server, the server will accept it (or not) and spawn another socket which will be the endpoint of the connection. This way the listening socket is not used for any data transfer on the connection and can continue listening for more incoming connections.
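Here is a rough sketch of that behaviour in Python, which I assume follows the same bind, listen and accept sequence. Both ends run in one program over the loopback address, purely to keep the example self-contained. A real server and client would be separate programs.

```python
import socket

# Server side: bind to a known address and start listening.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))   # port 0 only because this is a demo
listener.listen(5)                # allow up to 5 pending connection requests
server_addr = listener.getsockname()

# Client side: request a connection to the listening address.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(server_addr)

# accept hands back a NEW socket that is the endpoint of this connection.
conn, peer_addr = listener.accept()
is_new_socket = conn is not listener
conn_local_addr = conn.getsockname()
client_local_addr = client.getsockname()

# Data travels between client and conn; the listener is not involved.
client.sendall(b"hello")
received = conn.recv(1024)
print(received, "from", peer_addr)

conn.close()
client.close()
listener.close()
```

The connection socket reports the same local IP and port as the listening socket, while the listening socket itself stays free to accept the next client.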

5. Connections: an example

Here's a graphical example of a webserver that can handle multiple connections.

1. The server socket is created

[Figure: unbound socket]

The server creates a new socket. When it's just created it is not yet bound to an IP or port number.

2. The server socket is bound

[Figure: bound socket]

Because the server is a webserver, it will be bound to port number 80, the default for HTTP. However, the IP number is set to zero, indicating the server is willing to receive incoming connections on all IPs available to the machine it runs on. In this example, we assume the server has three IPs: one external (216.239.39.101), one internal (192.168.0.8) and of course the loopback address (127.0.0.1).

3. The server is listening

[Figure: listening socket]

After the socket is bound, it is put into the listening state, waiting for incoming connections on port 80.

4. A client creates a socket

[Figure: listening socket and client socket]

Assume a client in the same local network as the server (192.168.x.x) wants to request a webpage from the server. To do the data transfer it needs a socket so it creates one.

5. The client socket tries to connect

[Figure: connection request]

The client socket is left unbound and tries to connect to the webserver.

6. The server accepts the request

[Figure: connection established]

The listening socket sees that a client wants to make a connection. It accepts the request by creating a new socket (on the bottom right), bound to the server port (80) and to the one of the server's IPs that can be reached by the client (i.e. the one in the same network, 192.168.x.x). From this point on, the client socket and the newly created server connection socket will do the data transfers, while the listening socket keeps listening for other connections. Note that the client socket is now bound to an IP and port, since it's connected. The dotted gray line shows the separation of the client and server side.

7. Another client connects

[Figure: another connection established]

If another client (from the external network) connects, the server will again create a new socket to deal with the second connection. Note that the IP the server-side socket is bound to is different from the one used for the first connection. This is possible because the listening server socket was not bound to a specific IP. If it had been bound to 192.168.0.8, the second connection would not be possible.
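The walkthrough above can be approximated in a short Python sketch. It rests on a few assumptions: Python's socket module stands in for the Berkeley-style calls, a system-chosen port replaces port 80 (which usually needs special privileges), and both clients connect over the loopback address because the example IPs from the walkthrough do not exist on your machine.

```python
import socket

# Steps 1-3: create the server socket, bind it with the IP set to zero
# (all local addresses) and put it into the listening state.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("0.0.0.0", 0))     # port 0 instead of 80, for the demo only
listener.listen(5)
port = listener.getsockname()[1]

connections = []
for _ in range(2):
    # Steps 4-5: a client creates a socket, leaves it unbound and connects.
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(("127.0.0.1", port))
    # Steps 6-7: the server accepts, creating one new socket per connection.
    conn, peer = listener.accept()
    connections.append((client, conn, peer))

listening_ip = listener.getsockname()[0]
accepted_ips = [conn.getsockname()[0] for _, conn, _ in connections]
client_addrs = [client.getsockname() for client, _, _ in connections]
peer_addrs = [peer for _, _, peer in connections]

print("listening socket IP:", listening_ip)
print("connection socket IPs:", accepted_ips)
print("client addresses:", client_addrs)

for client, conn, _ in connections:
    client.close()
    conn.close()
listener.close()
```

The listening socket keeps the zero IP, while each connection socket is bound to the concrete IP the client actually reached. With clients arriving over different networks, those IPs would differ, as in step 7.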

6. Blocking

The original functions in the Berkeley Unix implementation of sockets were blocking functions. This means that they just wait when the requested operation cannot be completed immediately. For example, when connecting to a server, the connect function does not return until the connection has been made (or has failed), making the program hang for a while. This is not really a problem when dealing with a single connection in a console mode application, but in the Windows environment this behavior is rarely acceptable. Any program with a window has a window procedure that has to be kept running. Stalling it would delay user input, window painting, notifications and any other messages, resulting in an application that seems to hang while it's using socket functions.

To deal with this problem, winsock lets you put a socket into blocking or non-blocking mode. The former (blocking mode) is the original way of using sockets, i.e. not returning from the API before the operation has finished (it will literally block the application). The latter (non-blocking mode) is the mode you usually use when dealing with a real Windows application (i.e. not a console application). When calling a function on a socket that is in non-blocking mode, the function will always return as soon as possible, even when the operation to be performed could not be completed immediately. Instead, a notification of some sort will be sent to the program when the operation is finished, allowing the program to execute in the normal manner while the operation is unfinished.
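The difference can be seen in a small Python sketch. I am assuming that Python's setblocking call corresponds to switching the socket mode. The select call is only a simple stand-in for 'being told when the operation can proceed'; it is not one of the winsock notification methods. The loopback connection is hypothetical setup so there is a socket to experiment on.

```python
import select
import socket

# Hypothetical loopback connection, only so there is a socket to try this on.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(listener.getsockname())
conn, _ = listener.accept()

# Switch the server-side connection socket to non-blocking mode.
conn.setblocking(False)

# Nothing has been sent yet. A blocking recv would wait here;
# a non-blocking recv returns at once, here by raising BlockingIOError.
try:
    conn.recv(1024)
    would_block = False
except BlockingIOError:
    would_block = True

# The program can do other work, and later find out that data is ready.
client.sendall(b"done")
readable, _, _ = select.select([conn], [], [], 2.0)   # wait at most 2 seconds
data = conn.recv(1024) if readable else b""
print(would_block, data)

conn.close()
client.close()
listener.close()
```

The first recv fails immediately instead of stalling the program, which is exactly what a window procedure needs. The price is that the program has to come back to the socket once it learns the data has arrived.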

Winsock provides several methods of notification for non-blocking sockets, including window messages and event objects. These methods will be discussed in detail later; for now, just remember the difference between blocking and non-blocking.

7. Winsock versions

The most commonly used winsock version is version 2.x, usually just called winsock 2 as the 2.x releases differ only in minor ways. The latest version before version 2 was version 1.1. Some people say you should use this version for compatibility reasons, as Windows 95 and NT 3 only ship version 1.1. However, all later Windows versions (98, ME, NT4, 2000 and XP) have version 2 by default, and for Windows 95 an update is available. So I recommend you just start with winsock 2: it adds a lot of nice features, and Windows machines without winsock 2 are getting rare.

The two major versions of winsock reside in two different DLLs: wsock32.dll (version 1.1) and ws2_32.dll (version 2.x). The libraries to link with are wsock32.lib and ws2_32.lib respectively. The MASM32 package has most winsock constants in its windows.inc. For C++ programs, including windows.h suffices: it will include the winsock 2 definitions if the _WIN32_WINNT constant is at least 0x400 (NT version 4). The winsock 2 API includes the full 1.1 API (with some minor changes); wsock32.dll is in fact just a wrapper around the actual winsock DLL, ws2_32.dll.

This tutorial will assume you are using winsock 2.

8. Winsock architecture

Winsock provides two interfaces: the Application Programming Interface (API) and the Service Provider Interface (SPI). This tutorial is about the API, which contains all the functions you need to communicate using the well-known protocols. The SPI is an interface for adding Data Transport Providers (like TCP/IP or IPX/SPX) or Name Space Service Providers (like DNS). These extensions are transparent to the user of the API.