A Quick Introduction To Network Programming In Python

Patrick T Coakley 9 min read July 19, 2023 Networking
[ #python #networking ]

Creating software that connects systems over a network can seem like a daunting task, but once you understand the fundamentals, creating your own clients and servers is well within reach. While networking is a large topic, this guide will focus on the basics, including sockets, TCP, and HTTP. Sockets are the foundation of network programming and are available in almost every modern programming language as a standard library feature; once you understand how to use them in one language, you can apply that knowledge to pretty much all of the others.

This tutorial assumes basic programming knowledge about Python (or any other modern programming language), but I will try to cover all code so that it all makes sense. This code will focus more on getting something working and will avoid covering things you should probably think about when creating something important, such as error handling, concurrency, and deployment. Resources for learning more are included at the bottom of the page.

Sockets

So what is a socket? If we take a look at the manpage for socket, the socket API "creates an endpoint for communication and returns a descriptor." So what's a descriptor? File descriptors (generally referred to as handles in Windows) are essentially non-negative integers that index into a table of open files the operating system manages; the integer is not the file itself but rather a way to keep track of it. This allows the kernel to act on the files on our behalf without giving us direct control over them.
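To make the idea of a descriptor concrete, here is a minimal sketch that creates a socket and asks Python for the integer behind it. The variable names are only for illustration, and the exact number you see will vary by system and process.

```python
import socket

# Create a TCP socket and look at the descriptor the OS handed back.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
    fd = sock.fileno()
    print(f'descriptor while open: {fd}')  # a non-negative integer; value varies

# After the `with` block the socket is closed, so no valid descriptor remains.
closed_fd = sock.fileno()
print(f'descriptor after close: {closed_fd}')
```

The socket object in Python is a wrapper; the integer is what the operating system actually uses to find the endpoint.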

Sockets were initially created for the Berkeley Software Distribution family of Unix operating systems (hence the name Berkeley sockets) but have since been implemented on every major operating system in some form or another, with varying degrees of difference (WinSock on Windows, for instance, differs from Berkeley sockets in a number of ways).

So, if we wanted to concisely explain what a socket is (at least on Unix-like operating systems, such as macOS and Linux), we can say that it's a descriptor that is treated much like any other file on the operating system, except that it points to a communication endpoint in memory rather than a file on disk.

TCP

Sockets can be created to work with different protocols, but the one we're going to use in this tutorial is TCP. TCP (Transmission Control Protocol) is the protocol most web developers will come into contact with, as HTTP was built on top of it. As a connection-oriented protocol, TCP requires both sides to follow certain rules in order to establish a connection and maintain it. The three-way handshake is part of that process and allows the two endpoints to synchronize with one another before transmitting data back and forth.
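The handshake itself is carried out by the operating system, so it never shows up directly in our code. As a rough sketch of where it happens, the snippet below connects a client to a listener on the loopback interface inside one script; port 0 (asking the OS for any free port) and the variable names are choices made for this illustration.

```python
import socket

# A listening socket on loopback; port 0 asks the OS for any free port.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
    listener.bind(('127.0.0.1', 0))
    listener.listen()
    host, port = listener.getsockname()

    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
        # connect() returns once the kernel has finished the handshake.
        client.connect((host, port))
        # accept() hands us the connection that was just established.
        conn, addr = listener.accept()
        with conn:
            # Both ends agree on who is talking to whom.
            same_pair = conn.getpeername() == client.getsockname()
            print(same_pair)
```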

Echo Client-Server

Before we get started with HTTP, let's take a look at using just TCP so we can see how the former builds on the latter. So, what are we making? To start with, let's create a simple echo server and client. An echo server simply sends back what it receives and serves as a good introduction.

Server

First, let's write the code for the server and break it down:

# 1. Import the socket module
import socket

# 2. Use a `with` statement to create a new socket
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
    # 3. Bind the socket to the address and port of our choosing
    server.bind(('127.0.0.1', 9999))
    # 4. Listen for a connection to come in
    server.listen()
    # 5. Accept the connection
    conn, addr = server.accept()
    # 6. Use a `with` statement to auto-close the connection
    with conn:
        while True:
            # 7. Receive the data payload sent by the client
            payload = conn.recv(1024)
            if not payload:
                # 8. Leave the loop if no data is received
                break
            # 9. Send the data back to the client
            conn.sendall(payload)

Using the socket module, we first create a socket by passing in two values, AF_INET and SOCK_STREAM. So, what exactly are these two things?

The AF in AF_INET refers to the address family that the socket is using, of which there are many. For our purposes, AF_INET simply refers to IPv4 addresses and ports, but there are many others depending on the operating system. AF_INET, AF_INET6, and AF_UNIX are the ones you will routinely see, with AF_INET6 referring to IPv6 addresses and ports, and AF_UNIX referring to Unix domain sockets, a way for processes on the same Unix-like machine to communicate through the kernel using a filesystem path instead of going through the network stack as TCP and UDP do.
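One way to see address families in action is to ask the standard library which family a given address belongs to. This is only a sketch: family_of is a helper name made up for this example, and the AI_NUMERICHOST flag is used so that no name lookup takes place.

```python
import socket

def family_of(address):
    """Return the address family for a numeric IP address string."""
    # AI_NUMERICHOST tells getaddrinfo to parse the address without any DNS lookup.
    results = socket.getaddrinfo(
        address, 9999, type=socket.SOCK_STREAM, flags=socket.AI_NUMERICHOST
    )
    # Each result is a tuple whose first element is the address family.
    return results[0][0]

print(family_of('127.0.0.1').name)
print(family_of('::1').name)
```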

The second parameter, SOCK_STREAM, refers to the type of socket being created. SOCK_STREAM is for a streaming socket, meaning it will be used with connection-oriented protocols, such as TCP. This differs from SOCK_DGRAM, which is for connectionless protocols like UDP. Finally, SOCK_RAW creates a raw socket, which bypasses the transport layer and lets a program send and receive raw packets, for example to implement a custom protocol.
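For contrast with the streaming socket used in this tutorial, here is a small sketch of SOCK_DGRAM over loopback. There is no listen, accept, or connect; each datagram simply carries its destination. The port 0 binding, the timeout, and the variable names are assumptions made for this example.

```python
import socket

with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
        socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
    # Bind the receiver to any free port on loopback.
    receiver.bind(('127.0.0.1', 0))
    # UDP gives no delivery guarantee, so avoid waiting forever.
    receiver.settimeout(2)
    # No connection is set up; the address travels with the datagram.
    sender.sendto(b'hello', receiver.getsockname())
    data, sender_addr = receiver.recvfrom(1024)
    print(data, sender_addr)
```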

The socket is being created inside of a with statement, which is a way for Python to make sure resources are closed when they're no longer being used. The next step is to bind the socket to a particular address and port, which is the act of reserving that address and port for our socket to use. Then we listen for incoming connections, which marks the socket as ready to receive them, and call accept, which causes our server to sit and wait for something to connect. Once a connection comes in, accept gives us another socket, this one representing the connection to the client, along with the client's address.

With the client socket, we once again use a with statement to make sure this socket is also closed and then create an infinite loop. Inside the loop we try to receive any data being sent by the client, reading at most 1024 bytes at a time; the right buffer size will differ depending on the type and size of the data you're dealing with as well as the protocol, but for our purposes 1024 is good enough. If the payload is empty, the client has closed the connection and we exit the loop; otherwise we send the payload back to the client and start the process all over again.
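Because recv returns at most the number of bytes asked for, a larger message arrives over several calls. The sketch below shows that with a connected pair of sockets from socket.socketpair, so it runs without a separate server; recv_until_closed is a hypothetical helper written for this example.

```python
import socket

def recv_until_closed(sock, bufsize=1024):
    """Collect data in bufsize-sized reads until the peer stops sending."""
    chunks = []
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:
            # An empty result means the other side has finished sending.
            break
        chunks.append(chunk)
    return b''.join(chunks)

# socketpair() returns two sockets that are already connected to each other.
left, right = socket.socketpair()
with left, right:
    left.sendall(b'x' * 4096)
    # Signal that nothing more will be sent so the reader sees an empty recv.
    left.shutdown(socket.SHUT_WR)
    received = recv_until_closed(right)

print(len(received))
```

The same empty-payload check is what lets the echo server notice that its client has gone away.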

We can verify the server is working correctly by using Netcat on macOS and other Unix-like operating systems, or by using ncat on Windows (replace nc with ncat in the following command). With the server running, open up a new terminal, type in nc 127.0.0.1 9999, and you should be able to start writing text into the prompt. Once you hit enter, the server should receive that data and send it back.

Client

The client code should look very similar to the server code, with only a few minor changes:

# 1. Import the socket module
import socket

# 2. Use a `with` statement to create a new socket
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
    # 3. Note that instead of binding 
    # we are connecting since we are a client this time
    client.connect(('127.0.0.1', 9999)) 
    # 4. Create a loop that saves our text and breaks when it's empty
    while text := input():
        # 5. Send our text encoded in utf-8 bytes 
        client.sendall(bytes(text, 'utf-8'))
        # 6. Wait to receive the payload back from the server
        payload = client.recv(1024) 
        # 7. Print out the decoded payload
        print(payload.decode())

Enter an empty line to exit your echo client, or hit CTRL+C to kill it; otherwise it will keep running.

The only major difference here is we are simply connecting to the server and don't have to bind or listen on our socket. Otherwise the process should look mostly the same.

With these pieces of code, you've already created your first client-server application!
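If you would rather see both halves talk to each other from a single script, the following sketch runs the echo server in a background thread and the client in the main thread. The function names, the use of port 0 to pick a free port, and socket.create_connection as a shortcut for creating and connecting the client socket are choices made for this example rather than part of the code above.

```python
import socket
import threading

def serve_one_client(server):
    """Accept a single connection and echo everything it sends."""
    conn, _addr = server.accept()
    with conn:
        while payload := conn.recv(1024):
            conn.sendall(payload)

def echo_once(message):
    """Start a throwaway echo server, send one message, return the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        # Port 0 asks the OS for any free port so the sketch never collides.
        server.bind(('127.0.0.1', 0))
        server.listen()
        worker = threading.Thread(target=serve_one_client, args=(server,))
        worker.start()
        # create_connection builds the socket and connects in one call.
        with socket.create_connection(server.getsockname()) as client:
            client.sendall(message.encode('utf-8'))
            reply = client.recv(1024)
        # Closing the client ends the server loop, so the thread can finish.
        worker.join()
        return reply.decode('utf-8')

print(echo_once('hello'))
```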

HTTP Server

Taking what we've learned so far, we can create a very basic HTTP (Hypertext Transfer Protocol) server to understand the basics of the protocol and how it works. If you're not super familiar with HTTP then Mozilla's overview of HTTP and the other links in their documentation are a fantastic way to learn more. The server we're going to create will simply take a GET request and return a 200 response along with some basic HTML. A lot of the code we used to create the echo server will be reusable (with some minor changes), the main difference being that we will be inspecting our data and sending back something we create on the server:

# 1. Import the socket module
import socket
# 2. Import the time module
import time

# 3. A multi-line string to represent our response template
# including our response code, content type header
# and some HTML to show the current time to be added below
RESPONSE_TEMPLATE = '''HTTP/1.1 200 OK\r
Content-Type: text/html\r
\r
<html>
    <h1>Hello, World! The current time is {} {}</h1>
</html>
'''

# 4. Use a `with` statement to create a new socket
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
    # 5. Bind the socket to the address and port of our choosing
    server.bind(('127.0.0.1', 9999)) 
    # 6. Listen for a connection to come in
    server.listen() 
    while True:
        # 7. Accept the connection
        conn, addr = server.accept()
        # 8. Use a `with` statement to auto-close the connection
        with conn: 
            # 9. Receive the data payload sent by the client
            payload = conn.recv(1024)
            # 10. Log our payload and client address
            print(f'Received {payload.decode()} from {addr[0]}:{addr[1]}')
            # 11. Grab the current time and format it as Hour:Minute:Second AM/PM
            current_time = time.strftime("%I:%M:%S%p", time.localtime())
            # 12. Grab the timezone name
            time_zone = time.tzname[0]
            # 13. Format our response
            resp = RESPONSE_TEMPLATE.format(current_time, time_zone) 
            # 14. Send the data back to the client
            conn.sendall(resp.encode())

Just like the echo example, you'll need to hit CTRL+C to close your server.

Notice that here we moved the call to accept inside the loop; this is so we can keep the server running and accept new clients even after one disconnects. The other major difference is that we are creating new data to send back to the client for each request, which allows for dynamic interactions between the two. HTTP requests and responses use CRLF, which stands for carriage return and line feed and is represented by \r\n, to terminate each header line. Our template string is a multi-line string, so each line already ends with a \n, but we must explicitly write the \r or we wouldn't be conforming to the HTTP standard.
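The same CRLF convention makes it easy to pick apart what the client sent. As a minimal sketch, assuming a well-formed request, the hypothetical helper below pulls the method, target, and version out of the first line of a payload; the sample request is made up for illustration.

```python
def parse_request_line(payload):
    """Split the first line of an HTTP request into its three parts."""
    # Header bytes are decoded leniently; iso-8859-1 never fails on any byte.
    text = payload.decode('iso-8859-1')
    # Everything up to the first CRLF is the request line.
    request_line = text.split('\r\n', 1)[0]
    method, target, version = request_line.split(' ', 2)
    return method, target, version

sample = b'GET /index.html HTTP/1.1\r\nHost: 127.0.0.1:9999\r\n\r\n'
print(parse_request_line(sample))  # ('GET', '/index.html', 'HTTP/1.1')
```

A server could use the method and target to decide what to send back instead of returning the same page for every request.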

With the server running, you can test it either using curl, which is included in many Unix-like operating systems by default and is also available on Windows, or just by going to http://127.0.0.1:9999 in your browser. To use curl, just enter curl 127.0.0.1:9999 in the command line and you should receive a chunk of HTML back. If you open the URL in the browser, you should simply see a heading greeting you and showing the time the response was sent.

Conclusion

Hopefully this helped you better understand the basics of network programming and showed how simple it can be to get started. There is a lot more to making a robust web server that is secure and scalable, but knowing how they fundamentally work can help demystify the core concepts. A great place to see how you could do more is the standard library's http.server module, but do note that it is not recommended for production environments and should really only be used as a learning tool. A great production-ready piece of open-source software is gunicorn, which is commonly used in the Python world with popular web frameworks, such as Flask and Django.
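As a rough idea of what the standard library's http.server module looks like, here is a sketch of the same "Hello, World!" page written with it. The TimeHandler and fetch_once names are invented for this example, and the server is started on a free port and queried once from the same script so the whole thing runs on its own.

```python
import http.client
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

class TimeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        now = time.strftime('%I:%M:%S%p')
        body = f'<html><h1>Hello, World! The current time is {now}</h1></html>'.encode()
        # The module writes the status line and CRLF-terminated headers for us.
        self.send_response(200)
        self.send_header('Content-Type', 'text/html')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet; the default logs each request to stderr

def fetch_once():
    """Serve on a free loopback port, make one GET request, then shut down."""
    server = HTTPServer(('127.0.0.1', 0), TimeHandler)
    thread = threading.Thread(target=server.serve_forever)
    thread.start()
    try:
        host, port = server.server_address
        conn = http.client.HTTPConnection(host, port, timeout=5)
        conn.request('GET', '/')
        resp = conn.getresponse()
        result = resp.status, resp.read().decode()
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()
        thread.join()

status, body = fetch_once()
print(status)
```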

Resources

  1. The venerable Beej's Guide to Network Programming is a great way to learn more about sockets and how they work under the hood in C
  2. A classic text, Unix Network Programming by W. Richard Stevens is a much deeper dive into the sockets API, and while it is a little old the content is still very applicable even today