My Role
Architecture Design, Config Parsing, HTTP Parsing, debugging log implementation, code refactoring, functional verification page development
Team
Jinho Heo
Hyeonjun An
Yunseon Im
Timeline
4 months, starting August 2023
Tech Keywords
C++98, Makefile, HTML, CSS, Object-Oriented Programming, Non-blocking I/O, I/O Multiplexing (kqueue), HTTP/1.1
Overview
Webserv is an ambitious project undertaken as part of the School 42 curriculum, challenging students to develop a robust HTTP/1.1-compatible web server in C++98.
This project pushes the boundaries of low-level network programming, requiring the implementation of non-blocking I/O operations and efficient request handling.
Key features include I/O multiplexing with `poll()` or an equivalent (this implementation uses kqueue), which ensures the server never blocks or hangs indefinitely, and the ability to serve static websites while supporting multiple HTTP methods.
By implementing these advanced concepts, Webserv not only serves as a practical learning tool for understanding web server architecture but also demonstrates the power and flexibility of C++ in creating high-performance network applications.
The successful completion of this project showcases a deep understanding of network protocols, system-level programming, and efficient resource management in a challenging, real-world scenario.
HIGHLIGHTS
An HTTP/1.1 server implementation in C++98.
THE PROBLEM
Couldn't we just refer to Nginx to build it?
Nginx Structure
Initially, I modeled the code on the Nginx structure: a multi-process, multi-threaded design in which a master process creates worker processes, and each worker process registers signal handlers and sends responses from its threads.
However, when the structure was almost finished, I discovered a requirement that the server use only one process. In the end, Nginx served as a reference only for parsing the configuration file.
ARCHITECTURE
How should the web server be structured?
Single Process
Events are managed and handled through kqueue in a single process.
Based on the configuration file, the ports are validated and opened for listening. Events are managed by kqueue and handled by separate event classes (Read, Write, CGI, etc.) that share an inheritance hierarchy.
When a request comes in, it is parsed according to the HTTP format, the request is processed, and a response is sent out.
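The dispatch through inherited classes described above can be sketched roughly as follows. This is only an illustration, not the project's code: `IEvent`, `ReadEvent`, `WriteEvent`, `CgiEvent`, and `Dispatch` are hypothetical names. With kqueue, the `udata` field of each `struct kevent` can carry a pointer to such an object; the `kevent()` call itself is left out here so the sketch compiles on platforms without kqueue.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical base class: one object per registered event.
class IEvent {
public:
  virtual ~IEvent() {}
  // Called once the kernel reports the event as ready.
  virtual void Handle(std::vector<std::string> &log) = 0;
};

class ReadEvent : public IEvent {
public:
  virtual void Handle(std::vector<std::string> &log) { log.push_back("read"); }
};

class WriteEvent : public IEvent {
public:
  virtual void Handle(std::vector<std::string> &log) { log.push_back("write"); }
};

class CgiEvent : public IEvent {
public:
  virtual void Handle(std::vector<std::string> &log) { log.push_back("cgi"); }
};

// Stand-in for the loop body that would follow kevent(): each entry of
// `ready` plays the role of a kevent's udata pointer, stored as IEvent *.
void Dispatch(void *const *ready, std::size_t count,
              std::vector<std::string> &log) {
  for (std::size_t i = 0; i < count; ++i) {
    IEvent *event = static_cast<IEvent *>(ready[i]);
    assert(event != NULL);
    event->Handle(log); // virtual call selects Read, Write, or CGI handling
  }
}
```

The event loop never needs to know which kind of event it is holding; adding a new event type only means adding a new subclass.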
CONFIG
Nginx-like configuration file parsing structure.
Config Tree
The configuration file uses a block format delimited by {} and ;. If we treat each enclosing pair of braces as a parent, the child blocks inherit the settings of their parent blocks.
Therefore, if we build a tree from the {} and ; tokens, every location block can pick up the settings of its parent blocks, no matter how many location blocks there are.
class Common {
public:
  static int mKqueue;           // kqueue descriptor used by the event loop
  static bool mRunning;         // whether the server loop should keep running
  static Node *mConfigTree;     // root node of the configuration tree
  static ConfigMap *mConfigMap; // port / server_name / location lookup built from the tree
};
The tree and the map are managed globally as static members of the Common class.
Common.cpp
WebServer.cpp
Node.cpp
example.conf
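The parent-to-child inheritance of settings in the config tree can be sketched as below. This is a simplified illustration under my own assumptions: `ConfigNode` and `FindDirective` are hypothetical names, not the project's `Node` class, and each directive is reduced to a single string value.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical node: one per {} block in the configuration file.
struct ConfigNode {
  ConfigNode *parent;                            // enclosing block, NULL for the root
  std::map<std::string, std::string> directives; // "key value;" pairs of this block

  explicit ConfigNode(ConfigNode *parentNode) : parent(parentNode) {}
};

// Walks from `node` toward the root and returns the nearest value for `key`,
// or NULL when no enclosing block defines it.
const std::string *FindDirective(const ConfigNode *node,
                                 const std::string &key) {
  assert(!key.empty()); // directive names are never empty
  for (; node != NULL; node = node->parent) {
    std::map<std::string, std::string>::const_iterator it =
        node->directives.find(key);
    if (it != node->directives.end())
      return &it->second;
  }
  return NULL;
}
```

A location block that does not set `root` itself therefore resolves it from its server block, while a directive it does set overrides the parent's value.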
Config Map
Once the tree has been built from the tokens, we create a map keyed by location. If we remember where each location sits in the tree, we don't need to search the entire tree when a request for that location arrives. We can simply read the settings by walking up from the location node through its parents.
class ConfigMap {
public:
  typedef std::map<std::string, Node *> UriMap;
  typedef std::multimap<std::string, UriMap> HostnameMap; // Allows duplicate hostnames
  ConfigMap(Node *configTree);
  //...
private:
  class PortMap {
    //...
  private:
    HostnameMap mHostnameConfigs;
    UriMap *mDefaultServer; // pointer to default server config
    bool mbDefaultServerSet;
  };
  std::map<int, PortMap> mPortConfigs;
};
We create the map from port, server_name, and location. Because std::map and std::multimap are ordered containers, finding a location node takes O(log N) instead of the O(N) needed to traverse the entire tree.
std::unordered_map (a hash map) would reduce this further to O(1) on average, but it was only added to the standard library in C++11, so it is not available in C++98.
ConfigMap.hpp
ConfigMap.cpp
example.conf
Http.cpp
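A lookup through the port, server_name, and location levels might look like the following sketch. It is a simplification under stated assumptions: the typedef names `UriIndex`, `HostIndex`, `PortIndex`, the `LocationNode` struct, and `FindLocation` are hypothetical, plain `std::map` replaces the multimap, URIs are matched exactly, and the default-server fallback is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for a node of the configuration tree.
struct LocationNode {
  std::string root;
};

typedef std::map<std::string, LocationNode *> UriIndex; // location -> node
typedef std::map<std::string, UriIndex> HostIndex;      // server_name -> locations
typedef std::map<int, HostIndex> PortIndex;             // port -> servers

// Returns the node registered for (port, host, uri), or NULL if any level
// misses. Each find() is an ordered-map lookup, so every level costs O(log N).
LocationNode *FindLocation(const PortIndex &index, int port,
                           const std::string &host, const std::string &uri) {
  assert(port > 0); // ports come from validated configuration
  PortIndex::const_iterator p = index.find(port);
  if (p == index.end())
    return NULL;
  HostIndex::const_iterator h = p->second.find(host);
  if (h == p->second.end())
    return NULL;
  UriIndex::const_iterator u = h->second.find(uri);
  if (u == h->second.end())
    return NULL;
  return u->second;
}
```

Once the node is found, its settings can be resolved by walking up through its parents, so the tree itself is never searched from the root.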
REQUEST & RESPONSE
The process of receiving requests and sending responses on the server.
Receiving Requests
The server receives data from clients through the recv() function and passes this data to the Http object owned by each Connection object to begin parsing the HTTP request.
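As a rough sketch of the receiving step, the function below performs one recv() per readiness notification and appends the bytes to a per-connection buffer that a parser could then consume. `ReceiveOnce` and `eRecvResult` are hypothetical names, and the 4096-byte chunk size is an arbitrary choice for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

enum eRecvResult { RECV_DATA, RECV_CLOSED, RECV_ERROR };

// Performs a single recv() and appends whatever arrived to `buffer`.
// The return value tells the caller whether to parse, close, or drop the
// connection.
eRecvResult ReceiveOnce(int socket, std::string &buffer) {
  assert(socket >= 0);
  char chunk[4096];
  ssize_t n = recv(socket, chunk, sizeof(chunk), 0);
  if (n > 0) {
    buffer.append(chunk, static_cast<std::string::size_type>(n));
    return RECV_DATA; // new bytes are ready for the HTTP parser
  }
  if (n == 0)
    return RECV_CLOSED; // peer closed the connection
  return RECV_ERROR;    // recv() failed
}
```

Because the buffer accumulates across calls, a request that arrives in several pieces is simply parsed again after each new chunk.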
Response
If the parsed request is a valid HTTP request, the server calls the appropriate handler for the HTTP method (GET, POST, DELETE) through the Router.
Each method processes the appropriate business logic according to the rules of the HTTP/1.1 protocol, then generates an appropriate Response message and responds to the client.
class Http {
public:
  Http(int socket, int port, std::string &sendBuffer, bool &keepAlive,
       int &remainingRequest);
  //...
private:
  std::string mBuffer;
  Request mRequest;
  Response mResponse;
  RequestParser mRequestParser;
  ResponseParser mResponseParser;
  int mPort;
  int mSocket;
  bool &mKeepAlive;
  int &mRemainingRequest;
  std::string &mSendBufferRef;
  std::vector<SharedPtr<CGI> > mCGIList;
};
Http.hpp
Http.cpp
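Routing a parsed request to a per-method handler can be sketched with a map from method name to function pointer. This is not the project's Router: the class below, the `MethodHandler` signature, and the fixed status codes returned by the placeholder handlers are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical handler signature: takes the request target, returns a status.
typedef int (*MethodHandler)(const std::string &target);

// Placeholder handlers; a real handler would read, create, or remove files.
int HandleGet(const std::string &) { return 200; }
int HandlePost(const std::string &) { return 201; }
int HandleDelete(const std::string &) { return 204; }

class Router {
public:
  Router() {
    mHandlers["GET"] = &HandleGet;
    mHandlers["POST"] = &HandlePost;
    mHandlers["DELETE"] = &HandleDelete;
  }

  // Returns the handler's status code, or 501 (Not Implemented) when the
  // server has no handler for the method.
  int Route(const std::string &method, const std::string &target) const {
    std::map<std::string, MethodHandler>::const_iterator it =
        mHandlers.find(method);
    if (it == mHandlers.end())
      return 501;
    assert(it->second != NULL);
    return it->second(target);
  }

private:
  std::map<std::string, MethodHandler> mHandlers;
};
```

Method names are case-sensitive in HTTP, so the map is keyed on the exact token taken from the request line.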
Handling CGI Requests
For CGI requests, the server's main process creates a child process with the fork() system call, and the child process runs the CGI script through execve().
Running the script in a separate process is an asynchronous approach: a slow CGI request does not hold up the main process while it serves other clients.
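The fork() and execve() sequence can be sketched as follows. `RunCgi` is a hypothetical helper, and it differs from the design described above in one important way: for brevity it reads the child's output with a blocking read() and then waits for the child, whereas the server would hand the pipe to its event loop instead of waiting.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Runs `path` in a child process and collects what it writes to stdout.
// Returns true only if the child exited normally with status 0.
bool RunCgi(const char *path, char *const argv[], char *const envp[],
            std::string &output) {
  assert(path != NULL && argv != NULL && envp != NULL);
  int fds[2];
  if (pipe(fds) == -1)
    return false;
  pid_t pid = fork();
  if (pid == -1) {
    close(fds[0]);
    close(fds[1]);
    return false;
  }
  if (pid == 0) {
    // Child: redirect stdout into the pipe, then replace the process image.
    close(fds[0]);
    dup2(fds[1], STDOUT_FILENO);
    close(fds[1]);
    execve(path, argv, envp);
    _exit(127); // reached only if execve() failed
  }
  // Parent: close the write end so read() sees EOF when the child exits.
  close(fds[1]);
  char chunk[4096];
  ssize_t n;
  while ((n = read(fds[0], chunk, sizeof(chunk))) > 0)
    output.append(chunk, static_cast<std::string::size_type>(n));
  close(fds[0]);
  int status = 0;
  if (waitpid(pid, &status, 0) == -1)
    return false;
  return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

In a CGI setup, `envp` is where request metadata such as the method and query string would be passed to the script.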
HTTP PARSER
Method for parsing HTTP requests.
HTTP Request Processing
The request is validated while it is being parsed, and we only proceed to the next step if it is valid. The parser reads the request one character at a time and decides how to treat each character based on its current state.
The Response message is also generated in compliance with the HTTP/1.1 protocol and is composed of a status code, headers, and a body.
void Http::SetRequest(eStatusCode state, std::vector<char> &RecvBuffer) {
  //...
  while (true) {
    eStatusCode ParseState = mRequestParser.Parse(
        mRequest, mBuffer.c_str(), mBuffer.c_str() + mBuffer.size());
    if (ParseState == PARSING_INCOMPLETED) {
      //...
    } else if (ParseState == PARSING_COMPLETED) {
      //...
    } else {
      //...
    }
  }
}
Requests don't always arrive all at once: a single request can be split across several reads, and several requests can arrive together in one read. For this reason we parse the contents of mBuffer in a loop.
Http.hpp
Enum.hpp
RequestParser.cpp
ResponseParser.hpp
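To illustrate character-by-character, state-based parsing that survives split input, here is a small sketch that handles only the request line (method, target, version, CRLF). It is not the project's RequestParser: `RequestLineParser`, `eParseResult`, and the state names are hypothetical, and the validation is deliberately minimal.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

enum eParseResult { PARSE_INCOMPLETE, PARSE_COMPLETE, PARSE_ERROR };

class RequestLineParser {
public:
  RequestLineParser() : mState(METHOD) {}

  // Consumes `data` one character at a time. The state is kept between
  // calls, so the request line may be split at any byte.
  eParseResult Feed(const std::string &data) {
    for (std::size_t i = 0;
         i < data.size() && mState != DONE && mState != FAILED; ++i)
      Step(data[i]);
    if (mState == DONE)
      return PARSE_COMPLETE;
    if (mState == FAILED)
      return PARSE_ERROR;
    return PARSE_INCOMPLETE;
  }

  // Only meaningful once Feed() has returned PARSE_COMPLETE.
  const std::string &Method() const { assert(mState == DONE); return mMethod; }
  const std::string &Target() const { assert(mState == DONE); return mTarget; }

private:
  enum eState { METHOD, TARGET, VERSION, LINE_FEED, DONE, FAILED };

  void Step(char c) {
    switch (mState) {
    case METHOD: // uppercase letters up to the first space
      if (c == ' ') mState = mMethod.empty() ? FAILED : TARGET;
      else if (c >= 'A' && c <= 'Z') mMethod += c;
      else mState = FAILED;
      break;
    case TARGET: // anything except CR/LF up to the next space
      if (c == ' ') mState = mTarget.empty() ? FAILED : VERSION;
      else if (c == '\r' || c == '\n') mState = FAILED;
      else mTarget += c;
      break;
    case VERSION: // must be exactly HTTP/1.1 when CR arrives
      if (c == '\r') mState = (mVersion == "HTTP/1.1") ? LINE_FEED : FAILED;
      else mVersion += c;
      break;
    case LINE_FEED: // CR must be followed by LF
      mState = (c == '\n') ? DONE : FAILED;
      break;
    case DONE:
    case FAILED:
      break;
    }
  }

  eState mState;
  std::string mMethod, mTarget, mVersion;
};
```

Feeding "GET /in" and then the rest of the line yields an incomplete result first and a complete one afterwards, which is the same three-way outcome the loop in SetRequest branches on.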
TEST
Does the built web server function properly?
Testing Methods
We check, among other things, that redirects to various locations work correctly, that the server keeps running after thousands of requests, that it does not terminate when a request exceeds the maximum size, that sockets are closed properly when clients disconnect, and that it returns the proper error for invalid requests.