Web Stack Project
For this project option, you will implement a full Web stack that is capable of hosting dynamic applications with static file acceleration. This Web stack will include the nginx, Apache Httpd, and Apache Tomcat servers running inside a virtual machine. A browser connected to the virtual machine from the host computer on port 9443 will be able to reach the Web stack via an encrypted (TLS) connection. Alpine Linux will be used as the operating system inside the virtual machine.
Required Technology
To be able to complete this project successfully, you will need:
- A laptop or desktop computer that meets the Department of Computing Sciences Computing Requirements
- Virtualization software, such as QEMU or VirtualBox
If you have an Apple computer with an Arm-based M1/M2 CPU, you should use QEMU. One approach is to follow my QEMU macOS installation instructions and run it from the command line. Alternatively, you might try UTM, which puts a nice GUI in front of QEMU.
Background
Building a Web stack in a Linux environment is an extremely common task, and there is a considerable amount of documentation available online. The critical part of this project is integrating the individual servers into a single stack with nginx at the front. In this setup, nginx acts as a reverse proxy for both the Httpd and Tomcat servers.
Figure 1 depicts the design of the Web stack. The resulting system has a single virtual machine, configured to forward TCP port 9443 on the host to port 443 on the guest. Within the guest, nginx is listening on port 443 and implements Transport Layer Security (TLS) using a self-signed certificate. Static files (like HTML or image files) are served directly by nginx from a directory inside the virtual machine. Dynamic Web content written in PHP is served by Apache Httpd, but the connection to Httpd goes through nginx first (this is called reverse proxying). Similarly, dynamic Web content written in Java is served by Apache Tomcat, but the connection to Tomcat is also proxied through nginx.
This setup is extremely common, as it allows a single port on a single server to be exposed through a firewall, improving cybersecurity by reducing the number of services directly exposed to the Internet (i.e. decreasing the attack surface of the system). A firewall inside the virtual machine ensures that only TCP port 443 is made available for outside connections. All other ports should drop incoming traffic (except for an optional SSH server on TCP port 22).
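As a rough illustration of the firewall policy described above (not a required or complete solution), the rules could be written out in iptables-restore format. The file name firewall.rules is a placeholder of mine, and the rule set is only a sketch that assumes nginx talks to Httpd and Tomcat over the loopback interface.

```shell
#!/bin/sh
# Sketch: write a default-deny rule set in iptables-restore format.
# Nothing is applied here; the file name firewall.rules is a placeholder.
cat > firewall.rules <<'EOF'
*filter
# Drop anything inbound or forwarded that is not explicitly accepted below.
:INPUT DROP [0:0]
:FORWARD DROP [0:0]
:OUTPUT ACCEPT [0:0]
# Loopback must stay open so nginx can reach Httpd and Tomcat on 127.0.0.1.
-A INPUT -i lo -j ACCEPT
# Allow replies to connections the VM itself started (for example apk downloads).
-A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
# The one required public port: HTTPS to nginx.
-A INPUT -p tcp --dport 443 -j ACCEPT
# Optional: SSH. Delete this line if you do not run an SSH server.
-A INPUT -p tcp --dport 22 -j ACCEPT
COMMIT
EOF

# To apply inside the VM (as root):
#   iptables-restore  < firewall.rules
#   ip6tables-restore < firewall.rules
```

If the VM actually relies on IPv6 connectivity, the IPv6 rule set would probably also need to accept ICMPv6 neighbor discovery traffic, which this sketch does not do. You will also need to research how to make the rules persist across reboots on Alpine.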
The following are links to resources that may be helpful. However, some additional research will be required to implement this project.
- Apache (Alpine Wiki)
- Apache HTTP Server Version 2.4 Documentation
- Nginx (Alpine Wiki)
- nginx Documentation
- Tomcat (Alpine Wiki)
- iptables
Project Requirements
A successful implementation of this project:
- Has a working Alpine Linux environment running in a virtual machine.
- Has port 9443 on the host computer forwarded to port 443 on the virtual machine.
- Permits a Web browser on the host computer to connect to https://localhost:9443 and make a secure TLS connection (still sometimes called SSL) using a self-signed certificate.
- Serves static files directly from nginx inside the virtual machine.
- Serves dynamic content, generated from PHP code, from Apache Httpd running inside the virtual machine. The connection to Httpd is reverse-proxied through nginx inside the virtual machine.
- Serves dynamic Java-based content from an Apache Tomcat server running inside the virtual machine. The connection to Tomcat is reverse-proxied through nginx inside the virtual machine.
- Has a working firewall that blocks all incoming connections to the virtual machine’s operating environment except those to TCP port 443. (Optionally, the virtual machine may permit SSH on TCP port 22.)
- Minimizes the server tokens displayed by nginx, Httpd, and Tomcat whenever error message pages are displayed. In particular, the operating system and software versions should not be displayed on an error page, as these pieces of data give information to potential hackers.
Completion of this project demonstrates the ability to integrate system components to build a typical Web stack that can run in any environment (container, virtual machine, or physical hardware).
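For the server-token requirement above, the settings usually involved are nginx's server_tokens directive, Httpd's ServerTokens and ServerSignature directives, and the showReport and showServerInfo attributes of Tomcat's ErrorReportValve. One way to check your work is to scan a response for version strings. The helper below is a hypothetical sketch of mine (the name leaks_version and the pattern are assumptions); it only looks for product names followed by a version number and does not detect operating system names.

```shell
#!/bin/sh
# Hypothetical helper: reads response text on stdin and succeeds (exit 0)
# if it contains a product name followed by a version number,
# e.g. "nginx/1.24.0", "Apache/2.4.58", "Apache Tomcat/10.1.18", "PHP/8.2.13".
leaks_version() {
  grep -Eiq '(nginx|apache|tomcat|php)[ /-]*[0-9]+\.[0-9]+'
}

# Demo on two sample header lines (made-up values, not real server output).
for sample in 'Server: nginx/1.24.0' 'Server: nginx'; do
  if printf '%s\n' "$sample" | leaks_version; then
    echo "LEAK: $sample"
  else
    echo "ok:   $sample"
  fi
done

# Against the running stack (from the host), something like:
#   curl -sk -i https://localhost:9443/does-not-exist | leaks_version \
#     && echo "version information is still visible"
```

The curl flags used in the comment are -s (silent), -k (accept the self-signed certificate), and -i (include response headers in the output), so both the headers and the error page body are scanned.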
Specific Rules for This Project
- You MUST use Alpine Linux to complete this project. Alpine Linux is widely used in industry inside Docker containers, which in turn are a popular way to deploy server-side applications. It is important to recognize that a significant part of this project is figuring out how to deploy server applications using an advanced Linux distribution like Alpine. Most of the documentation you will find easily will be written for Ubuntu, which is known in system administration circles as a beginner-friendly distribution. It will be hard to be taken seriously in an interview for a Linux system administrator position if the only Linux distribution you have worked with is Ubuntu.
- nginx must be configured as the front-end server in your Web stack. All requests from your browser will first go to nginx, which will handle requests for static files itself. For dynamic content (PHP or Java), nginx will make a request to your Apache Httpd or Apache Tomcat server and then forward the response back to your browser (reverse proxy setup). You will lose points if your browser is connecting directly to the Httpd or Tomcat server.
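To make the front-end rule concrete, here is one possible shape for the nginx server block, written to a local file by a small shell script. Everything specific in it is an assumption for illustration: the file name webstack.conf, the certificate paths, the static directory, Httpd listening on 127.0.0.1:8081, Tomcat on 127.0.0.1:8080, and the /sample/ path (which assumes Tomcat's sample application is deployed). Treat it as a sketch to adapt, not as the expected answer.

```shell
#!/bin/sh
# Sketch: generate an nginx server block that serves static files itself
# and reverse-proxies PHP and Java requests. All paths and ports are assumptions.
cat > webstack.conf <<'EOF'
server {
    listen 443 ssl;
    server_name localhost;

    # Self-signed certificate and key (placeholder paths).
    ssl_certificate     /etc/nginx/tls/selfsigned.crt;
    ssl_certificate_key /etc/nginx/tls/selfsigned.key;
    ssl_protocols       TLSv1.2 TLSv1.3;

    # Hide the nginx version on error pages and in the Server header.
    server_tokens off;

    # Static files are served directly from this directory (placeholder).
    root /var/www/static;
    location / {
        try_files $uri $uri/ =404;
    }

    # Requests ending in .php go to Apache Httpd (assumed on port 8081).
    location ~ \.php$ {
        proxy_pass http://127.0.0.1:8081;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }

    # The Java application goes to Tomcat (assumed on its default port 8080).
    location /sample/ {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
EOF
echo "wrote webstack.conf"
```

After copying a file like this into the directory your nginx package includes configuration from, running nginx -t as root checks the syntax before you restart the service.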
Academic Integrity Rules
- Generally speaking, this project is to be completed INDIVIDUALLY. You may use any Internet source for research purposes, and you may share and receive information through the course forum activity (in fact, sharing information through the forum is a separate requirement of this course). However, you are to implement your own solution to this project and produce your own videos.
- You MAY work with another student (or multiple other students) if each of you completes a SEPARATE project option in this course.
- If you’re taking this course with your best friend, roommate, family member, or other close associate, it is advisable to choose separate projects to avoid any appearance of questionable activity. As a bonus, you both will learn more.
Milestones
Milestone 1
For Milestone 1, prepare a video presentation that covers the following items:
- Demonstrate that Alpine Linux is installed and is running properly in a virtual machine.
- Show that you have the APK repository configuration set up correctly. Be sure to show the content of your /etc/apk/repositories file.
- Show that you have installed the base set of packages required for nginx, Apache Httpd, and Apache Tomcat. (Note that you might find you need more packages as the project progresses.)
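Before recording, a quick sanity check of the repository configuration might look like the sketch below. The function name repos_ok and the sample file are my own inventions for illustration; the check simply looks for uncommented lines ending in /main and /community.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given file has uncommented
# repository lines for both "main" and "community".
repos_ok() {
  grep -Eq '^[^#]*/main/?$' "$1" && grep -Eq '^[^#]*/community/?$' "$1"
}

# Demo on a sample file in which community is still commented out.
cat > repositories.sample <<'EOF'
https://dl-cdn.alpinelinux.org/alpine/v3.19/main
#https://dl-cdn.alpinelinux.org/alpine/v3.19/community
EOF

if repos_ok repositories.sample; then
  echo "main and community are both enabled"
else
  echo "enable the community repository, then run: apk update"
fi

# On the VM itself (as root) you would check the real file instead:
#   repos_ok /etc/apk/repositories && apk update
```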
Before submitting, review the Grading Rubric for Milestone 1.
Milestone 2
For Milestone 2, prepare a video presentation that covers the following items:
- Show how you created a self-signed SSL certificate.
- Show that you have port 9443 on your host system forwarded to port 443 on your virtual machine.
- Demonstrate that you have nginx running on port 443 in the virtual machine, with TLS implemented using a self-signed certificate. Show that you can connect to https://localhost:9443 on your host system and get an nginx test page (after bypassing the security warning about the self-signed certificate).
- Show the part of your nginx configuration that implements SSL.
- Show that you have configured nginx to serve static content (like HTML pages or images) from a directory inside the virtual machine (into which you have put some static content). Demonstrate that the server correctly serves the static content by visiting it in the browser.
- Visit a nonexistent page on your server, and show that the error message presented by nginx doesn’t give away the operating system or nginx version.
- Give a brief explanation of why it is better to have nginx serve static content directly, instead of configuring nginx to proxy static content served by Httpd. I haven’t told you why this design is better, but you should be able to find the answer with a small amount of online research.
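For the certificate item in this milestone, a minimal OpenSSL invocation could look like the following sketch. The output directory, file names, key size, and validity period are all assumptions chosen for illustration; pick values you can explain in your video.

```shell
#!/bin/sh
# Sketch: create a self-signed certificate and private key for "localhost".
# CERT_DIR and the file names are placeholders.
CERT_DIR="${CERT_DIR:-./tls}"
mkdir -p "$CERT_DIR"

# -x509    : emit a self-signed certificate instead of a signing request
# -newkey  : generate a new RSA key alongside the certificate
# -nodes   : leave the key unencrypted so nginx can start without a passphrase
# -subj    : set the subject non-interactively
openssl req -x509 -newkey rsa:2048 -nodes \
  -keyout "$CERT_DIR/selfsigned.key" \
  -out "$CERT_DIR/selfsigned.crt" \
  -days 365 -subj "/CN=localhost"

# The private key should be readable only by its owner.
chmod 600 "$CERT_DIR/selfsigned.key"

# Print the subject and validity dates to confirm what was generated.
openssl x509 -in "$CERT_DIR/selfsigned.crt" -noout -subject -dates
```

Because the certificate is signed by its own key rather than by a trusted authority, the browser warning mentioned above is expected.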
Before submitting, review the Grading Rubric for Milestone 2.
Milestone 3
For Milestone 3, prepare a video presentation that covers the following items:
- By demonstrating commands inside your virtual machine, show that Apache Httpd is running, and state on which port you have it running.
- Show the source code for a PHP script, then show that script executing on your virtual machine. The script must be executing in Apache, but the connection to Apache must be reverse-proxied through nginx. In other words, the browser must be going to an address that starts with https://localhost:9443 (and not some other port).
- Visit a nonexistent page in the same directory as the one you proxied to Apache Httpd. Show that the resulting error page does not give away details about the operating system or server version.
- Explain how you got PHP working in Httpd. Show which package(s) you installed and which configuration change(s) you made.
- Show the portion of your nginx configuration that implements reverse proxying to Httpd.
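A PHP test script for this milestone can be very small. The sketch below writes one to the current directory; the file name hello.php is a placeholder, and the destination mentioned in the comments (/var/www/localhost/htdocs) is my assumption about the default Httpd document root on Alpine, so verify it against your own configuration.

```shell
#!/bin/sh
# Sketch: create a tiny PHP script whose output changes on every request.
cat > hello.php <<'EOF'
<?php
// Dynamic content: the timestamp differs on each page load.
echo "Server time: " . date('c') . "\n";
// A computed value, to show that PHP code is really being executed.
echo "2 + 3 = " . (2 + 3) . "\n";
EOF
echo "wrote hello.php"

# Inside the VM, copy the file into Httpd's document root
# (assumed here to be /var/www/localhost/htdocs), then request it
# through nginx from the host:
#   curl -sk https://localhost:9443/hello.php
```

If the browser shows the PHP source instead of the output, Httpd is serving the file as static text and the PHP module is not active yet.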
Before submitting, review the Grading Rubric for Milestone 3.
Milestone 4
For Milestone 4, prepare a video presentation that covers the following items:
- A demonstration that your entire Web stack works by visiting https://localhost:9443 on your host system. Show that static pages, PHP pages (proxied through to Httpd), and Java Web applications (proxied through to Tomcat) work properly.
- Show the output of the following commands inside your VM to verify the firewall is set up properly. Both commands need to be run as the root user.
iptables -S
ip6tables -S
- Show your Apache Tomcat configuration and your nginx configuration for reverse proxying to Apache Tomcat.
Before submitting, review the Grading Rubric for Milestone 4.
Tips
- See the QEMU Network options for the arguments to forward a host port to the guest if you use QEMU for virtualization. If you’re using a different virtualization tool, consult its documentation for the correct way to configure port forwarding.
- The software you need to create a self-signed certificate is OpenSSL. There are plenty of tutorials online for creating self-signed certificates.
- To be able to demonstrate that nginx is serving static files, you need some static files to serve. HTML files, pictures, and similar content would work for this purpose. I recommend configuring nginx to serve static files from a single directory on the VM, then showing the configuration (and a demo) in your presentation.
- You will need a PHP script to test that you have Apache running PHP correctly. The PHP script doesn’t need to be fancy at all, but it does need to have at least some dynamic content in it. There are several ways to make Httpd handle PHP, and you may use any of them in this project. PHP was the language you learned in CSCI 303.
- You will also need a Java web application to test Tomcat. It doesn’t matter what the application does, but I’d suggest finding a precompiled .war file to deploy instead of trying to build from source. You can find one online, or just deploy Tomcat’s Sample Application.
- It might be easier to take a divide-and-conquer approach to getting the server stack working. Start by setting up nginx and getting that part working. Then, configure Apache and PHP, connecting to Apache directly by forwarding an extra port from the host. Once you know Apache is working properly on its own, configure the reverse proxy from nginx. Do the same thing for the Tomcat server.
- There are several different ways to configure PHP to run in Apache httpd. The easiest way is to use mod_php, which is documented in a somewhat convoluted way in the Alpine wiki Setting Up Apache with PHP article. If you’ve enabled the main and community repositories at install time, you just need to add the requisite package to apache2. Note that following the Wiki article directly might or might not work on Alpine Linux 3.19. Running
apk add php82-apache2
should be sufficient. Restart the apache2 service, and PHP should work. (You could also use php81-apache2 or php83-apache2 instead.)
- As you search for documentation online, note that different distributions put configuration files in different places. The file locations in Alpine Linux will likely be different than they are in Red Hat or Ubuntu. In particular, if you see references to “sites-enabled” or “sites-available” in documentation, you’re seeing documentation for Ubuntu-based systems. You will need to adapt this documentation to the actual file locations in Alpine.
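For the port-forwarding tip at the top of this list, QEMU's user-mode networking takes a hostfwd option on the -netdev argument. The sketch below only builds and prints the networking arguments; the variable names, the netdev id net0, and the virtio network device are my assumptions, and the rest of your QEMU command line (machine type, memory, disk image) depends on your host.

```shell
#!/bin/sh
# Sketch: build QEMU user-mode networking arguments that forward
# TCP port 9443 on the host to TCP port 443 in the guest.
HOST_PORT=9443
GUEST_PORT=443

NET_ARGS="-netdev user,id=net0,hostfwd=tcp::${HOST_PORT}-:${GUEST_PORT} -device virtio-net-pci,netdev=net0"
echo "$NET_ARGS"

# These arguments would then be appended to your usual QEMU command, e.g.:
#   qemu-system-x86_64 <your other options> $NET_ARGS
#
# For the divide-and-conquer approach, additional hostfwd= entries can be
# added to the same -netdev option, separated by commas, to reach Httpd or
# Tomcat directly while testing. Remove them again before the final demo.
```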