Building an HTTP server with Ruby

What is a web server?

A web server is a program that takes a request to your website from a user and does some processing on it. Then, it might hand the request to the application layer. Two of the most popular web servers are Nginx and Apache. (They have more features as well, such as reverse proxying and load balancing, but primarily they act as web servers.)

Now, let me ask a question. Is the server that runs on your localhost during development a web server? After all, it processes whatever request you send and then loads the appropriate page. So it might seem like a web server, but more technically it is called an app server. The app server loads the code and keeps the app in memory. When your app server gets a request from your web server, it tells your app about it. After your app is done handling the request, the app server sends the response back to the web server (and eventually to the user). For Rails in particular there are many app servers, like Unicorn, Puma, Thin, and Rainbows.

But if there are so many servers that are tested by the community and used by thousands, why should we bother building another? Well, by building one from scratch we will get a better understanding of how these work.

What actions does an HTTP server actually perform?

So, let’s break down what an HTTP server does.

Steps involved

When we visit a URL, the browser sends an HTTP request to the server. Now, what is HTTP? It is an application-level protocol that the client and the server both have to agree on. There are many other protocols, such as FTP (File Transfer Protocol) and SMTP (Simple Mail Transfer Protocol) at the application level, and TCP (Transmission Control Protocol) at the transport level underneath them. HTTP, or HyperText Transfer Protocol, is just very popular among these and is what web applications and web servers use to communicate with each other.

So, when we type a URL in the browser, it makes an HTTP “request” to the web server. The web server processes that request and sends back an HTTP “response”, which the browser renders for the user.
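To make the request/response exchange concrete, here is a minimal sketch of what the two messages look like as plain text. The path, host, and body below are made-up values for illustration, not output captured from a real server.

```ruby
# A hypothetical HTTP/1.1 exchange written out as plain Ruby strings.
# Each line ends with CRLF ("\r\n"); an empty line separates headers from body.
request = "GET /about.html HTTP/1.1\r\n" \
          "Host: localhost:8000\r\n" \
          "\r\n"

body = "<h1>About</h1>"
response = "HTTP/1.1 200 OK\r\n" \
           "Content-Type: text/html\r\n" \
           "Content-Length: #{body.bytesize}\r\n" \
           "\r\n" \
           "#{body}"

puts request
puts response
```

Everything the server in this post does comes down to reading a string shaped like the first one and writing back a string shaped like the second.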

History

The first HTTP standard, HTTP/1.0, was released in 1996, with Tim Berners-Lee among its authors. Now we have HTTP/2, which is a more efficient expression of HTTP’s semantics “on the wire” and was published in 2015. Also, did you know that there is another successor, HTTP/3, which is already in use by over 4% of websites? (It runs over QUIC, which uses UDP instead of TCP as the transport protocol.)

How should we start?

So we need a tool that handles bi-directional communication between client and server: basically, a socket. A socket is nothing but an endpoint of a two-way communication channel between two programs running on a network. It has to be bound to a port so that the TCP layer can find the application the data is sent to. The server creates the listening socket and the client reaches out to it. We will not be implementing sockets ourselves; Ruby already ships a socket implementation in its standard library.

require "socket"

The socket library provides specific classes for handling the common transports, as well as a generic interface for handling the rest. Basically, it talks to the OS and performs the necessary actions for us.
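As a quick illustration of the two endpoints, the sketch below opens a listening socket and a client socket in the same process and passes one line each way. Binding to port 0 (so the OS picks a free port) and the "echo:" prefix are assumptions made for this demo only.

```ruby
require "socket"

# Port 0 asks the OS for any free port; addr[1] tells us which one we got.
server = TCPServer.new("127.0.0.1", 0)
port = server.addr[1]

# Accept a single connection in a background thread and echo one line back.
echo_thread = Thread.new do
  client = server.accept
  line = client.gets
  client.puts "echo: #{line}"
  client.close
end

# The client side: connect, send a line, read the reply.
socket = TCPSocket.new("127.0.0.1", port)
socket.puts "Hello There"
reply = socket.gets
socket.close

echo_thread.join
server.close
puts reply
```

The server end is the listener and the client end reaches out to it, which is exactly the split the rest of this post builds on.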

What should be the basic processes of the web server?

  1. Listen for connections
  2. Parse the request
  3. Process and send the response

1. Listen for connections

First, let’s open a port and listen for all messages sent to that particular port. We can do that using the TCPServer.new or TCPServer.open method (according to the docs they are synonymous).

require "socket"

server = TCPServer.new("localhost", 8000)

Feel free to choose any port, but make sure it is available. Use the command “netstat -lntu” to list the ports that are currently in use by a process, and don’t pick one of those.
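If you would rather check from Ruby than from netstat, one possible approach is to try binding to the port and see whether it fails. This is a rough sketch: port_available? is a hypothetical helper name, and the check is only a snapshot, since another process could grab the port right after it returns.

```ruby
require "socket"

# Hypothetical helper: returns true if we can bind to the port right now.
def port_available?(port, host = "127.0.0.1")
  TCPServer.new(host, port).close
  true
rescue Errno::EADDRINUSE, Errno::EACCES
  # EADDRINUSE: something is already listening; EACCES: not allowed to bind.
  false
end

# Hold a port open, then ask the helper about it while it is still held.
holder = TCPServer.new("127.0.0.1", 0)
taken_port = holder.addr[1]
taken = port_available?(taken_port)
holder.close

puts "port #{taken_port} available while held? #{taken}"
```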

Now we would like to loop forever to process incoming connections. When a client connects to our server, server.accept returns a TCPSocket, which can be used like any other Ruby I/O object. Since the connection was made in order to send a request, we also want to read that request, which we can do using the gets method. It returns the first line of the request.

So now we have:

require "socket"

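port = (ARGV[0] || 8000).to_i # take the port from the first argument, defaulting to 8000

server = TCPServer.new("localhost", port)

while (session = server.accept)
  puts "Client connected..."
  puts "Request: #{session.gets}"
  session.close # close the connection so the client is not left waiting
end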

How to test this?

Open up two terminals. In one, run the Ruby script; in the other, open up irb and type the following commands:

> require "socket"
> soc = TCPSocket.open("localhost", 8000)
> soc.puts "Hello There"

A much easier way to test is to run the script and visit that port in the browser. If your port is 8000, just visit http://localhost:8000. You will see something like this in the terminal running the script:

Client connected...
Request: GET / HTTP/1.1

You can also use the curl command for the same.

Why just GET / HTTP/1.1 ?

Because an HTTP request is a multi-line string, and so far we have only read its first line. Try running the command curl -v localhost:8000 and you will notice something like this:

*   Trying ::1:8000...
* Connected to localhost (::1) port 8000 (#0)
> GET / HTTP/1.1
> Host: localhost:8000
> User-Agent: curl/7.74.0
> Accept: */*
>

In our script we used session.gets, which reads only one line from the IO stream. So, let’s replace that with readpartial(2048). Here 2048 is the maximum number of bytes we are willing to read in one call. We can increase that, but for our case it is enough.

So far we have:

require "socket"

port = (ARGV[0] || 8000).to_i

server = TCPServer.new("localhost", port)

while (session = server.accept)
  puts "Request: #{session.readpartial(2048)}"
  session.close # close the connection so the client is not left waiting
end

Now run the script and the curl command again. It will print all of the HTTP request data.
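readpartial returns whatever bytes have arrived, up to the limit, so a large or slow request could come back incomplete. An alternative, sketched below, is to keep calling gets until the blank line that ends the headers. read_request_head is a hypothetical helper name, and StringIO stands in for the socket so the snippet runs without a network connection.

```ruby
require "stringio"

# Hypothetical helper: collect request lines until the blank line that ends
# the headers. Works on any IO-like object that responds to #gets.
def read_request_head(io)
  lines = []
  while (line = io.gets)
    lines << line
    break if line == "\r\n"
  end
  lines.join
end

# StringIO plays the role of the socket here.
fake_socket = StringIO.new("GET / HTTP/1.1\r\nHost: localhost:8000\r\n\r\nignored body")
head = read_request_head(fake_socket)
puts head
```

For the small GET requests in this post, readpartial(2048) is enough, which is why the rest of the code keeps using it.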

2. Parsing the HTTP request

Right now we are just receiving the request as a string. We need to parse it so that our server can understand it and process it further.

Let’s look into the request once again:

 GET / HTTP/1.1  # GET is the method, the / is the path, the HTTP part is the protocol
 Host: localhost:8000 # Headers
 User-Agent: curl/7.74.0
 Accept: */*

The first line gives us

  • method
  • path
  • protocol

All the lines after that are headers. So we write this function to parse the raw request string:

def parse(request_string)
  method, path, version = request_string.lines[0].split
  {
    method: method,
    version: version,
    path: path,
    headers: parse_headers(request_string),
  }
end

It calls another function, parse_headers, to parse the headers:

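def normalize(header)
  header.strip.to_sym # "User-Agent" becomes :"User-Agent"
end

def parse_headers(request)
  headers = {}
  request.lines[1..-1].each do |line|
    break if line == "\r\n" # a blank line marks the end of the headers
    # split on the first colon only, so values such as "localhost:8000" stay intact
    header, value = line.split(":", 2)
    headers[normalize(header)] = value.to_s.strip
  end
  headers # return the hash even if no blank line was seen
end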
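To try the parser without a running server, you can feed it a hand-written request string. The sketch below repeats the two functions in compact form so that it runs on its own; splitting each header on the first colon is my assumption about the intended behavior, and the sample request is made up.

```ruby
# Compact copies of the parsing functions so this snippet is self-contained.
def parse_headers(request)
  headers = {}
  request.lines[1..-1].each do |line|
    break if line.strip.empty? # blank line: end of headers
    name, value = line.split(":", 2)
    headers[name.strip.to_sym] = value.to_s.strip
  end
  headers
end

def parse(request_string)
  method, path, version = request_string.lines[0].split
  { method: method, path: path, version: version, headers: parse_headers(request_string) }
end

sample = "GET /about.html HTTP/1.1\r\n" \
         "Host: localhost:8000\r\n" \
         "User-Agent: curl/7.74.0\r\n" \
         "\r\n"

request = parse(sample)
p request
```

Header names become symbols, so the Host header is read back as request[:headers][:Host].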

Now, instead of just printing the request, parse it first:

server = TCPServer.new("localhost", 8000)

while (session = server.accept)
  ap parse(session.readpartial(2048))
end

I am using awesome_print (require "awesome_print") to display the data in a formatted manner; you can replace ap with puts or p. You will now get a hash containing the method, path, version, and headers.

3. Process and send the HTTP response

Now that we have all the data, we have to prepare and send the response. If the path of the request is “/”, which refers to the home page, we will respond with index.html. If it is something else, like localhost:8000/about.html, we will respond with the file at that path, about.html.

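def prepare(parsed_req)
  path = parsed_req[:path]
  if path == "/"
    respond_with("index.html")
  else
    # drop the leading "/" so the file is looked up relative to the current directory
    respond_with(path.delete_prefix("/"))
  end
end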

What respond_with is supposed to do is check whether the file exists. If it does, it responds with the file; otherwise it returns a 404.

def respond_with(path)
  if File.exist?(path) # File.exists? was removed in Ruby 3.2
    ok_response(File.binread(path))
  else
    error_response
  end
end
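One thing the existence check alone does not cover is a request path such as /../secret.txt, which points outside the folder being served. A possible guard, sketched here under the assumption that all files live under a single root directory, is to expand the path and confirm it still sits inside that root. resolve_path is a hypothetical helper, not part of the original code.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper: map a request path to a file under `root`.
# Returns nil if the path escapes the root or no such file exists.
def resolve_path(root, request_path)
  root = File.expand_path(root)
  relative = request_path == "/" ? "index.html" : request_path.sub(%r{\A/+}, "")
  full = File.expand_path(relative, root)
  return nil unless full.start_with?(root + File::SEPARATOR)
  File.file?(full) ? full : nil
end

# Demo against a temporary directory containing only index.html.
dir = Dir.mktmpdir
File.write(File.join(dir, "index.html"), "<h1>Home</h1>")

home    = resolve_path(dir, "/")
missing = resolve_path(dir, "/missing.html")
escape  = resolve_path(dir, "/../etc/passwd")

FileUtils.remove_entry(dir)
p home, missing, escape
```

A nil result would map to the 404 branch of respond_with, and a non-nil result is a path that is safe to pass to File.binread.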

For the responses, we will be sending a string of this format. This is according to the HTTP spec. You can read more about the HTTP spec here.

def response(code, body="")
  "HTTP/1.1 #{code}\r\n" +
  "Content-Length: #{body.bytesize}\r\n" + # length in bytes, not characters
  "\r\n" +
  body # no trailing "\r\n", which would not be counted in Content-Length
end
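To see what such a string looks like once it is filled in, here is a small standalone sketch. build_response and reason_phrase are hypothetical names used so as not to clash with the function above, and the reason phrases cover only the two status codes this post uses.

```ruby
# Hypothetical lookup of the reason phrase for the status codes used here.
def reason_phrase(code)
  { 200 => "OK", 404 => "Not Found" }.fetch(code, "")
end

# Builds a raw HTTP/1.1 response: status line, one header, blank line, body.
def build_response(code, body = "")
  "HTTP/1.1 #{code} #{reason_phrase(code)}\r\n" \
    "Content-Length: #{body.bytesize}\r\n" \
    "\r\n" \
    "#{body}"
end

ok = build_response(200, "héllo") # "é" is two bytes in UTF-8
not_found = build_response(404)

puts ok
puts not_found
```

Content-Length counts bytes rather than characters, which is why the sketch uses bytesize: the five-character body above is six bytes long.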

So our ok_response and error_response will look like this:

def ok_response(body)
  response("200 OK", body)
end

def error_response
  response("404 Not Found")
end

Now that we have our response, we can send it back to the client. I have refactored the code a little bit; you can find the entire code here:

Once everything is in place, we can finally run the script and visit http://localhost:8000, which will render the contents of index.html. If you have other pages in the same folder, such as about.html, visiting http://localhost:8000/about.html will render them as well.
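For reference, here is one way the three steps could fit together in a single file. This is a condensed sketch, not the refactored code mentioned above: the handle function, the temporary directory, and the use of File.basename (which flattens every request to a file name in one folder) are simplifying assumptions, and the demo serves just two requests on an OS-chosen port so that it terminates on its own.

```ruby
require "socket"
require "tmpdir"
require "fileutils"

# Step 3 helper: build the raw response string.
def response(status, body = "")
  "HTTP/1.1 #{status}\r\nContent-Length: #{body.bytesize}\r\nConnection: close\r\n\r\n#{body}"
end

# Steps 2 and 3: read one request from the socket, write one response back.
# `root` is the directory files are served from (an assumption of this sketch).
def handle(session, root)
  request_line = session.readpartial(2048).lines.first.to_s
  _method, path, _version = request_line.split
  name = path == "/" ? "index.html" : File.basename(path.to_s)
  file = File.join(root, name)
  if File.file?(file)
    session.write response("200 OK", File.binread(file))
  else
    session.write response("404 Not Found", "Not Found")
  end
ensure
  session.close
end

# Step 1: listen. Port 0 lets the OS choose a free port for the demo.
root = Dir.mktmpdir
File.write(File.join(root, "index.html"), "<h1>Home</h1>")
server = TCPServer.new("127.0.0.1", 0)
port = server.addr[1]
worker = Thread.new { 2.times { handle(server.accept, root) } }

# Act as the client: request the home page and a page that does not exist.
replies = ["/", "/nope.html"].map do |path|
  client = TCPSocket.new("127.0.0.1", port)
  client.write "GET #{path} HTTP/1.1\r\nHost: localhost\r\n\r\n"
  reply = client.read # reads until the server closes the connection
  client.close
  reply
end

worker.join
server.close
FileUtils.remove_entry(root)
puts replies
```

To turn this into a long-running server, replace the 2.times loop with while (session = server.accept) and bind to a fixed port such as 8000, as in the earlier listings.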

from Tumblr https://generouspiratequeen.tumblr.com/post/641576012019318784
