Courses & Projects by Rob Marano

ECE 465 Spring 2026 Weekly Course Notes

<- back to syllabus


Classes will be decided week-to-week.

| Week | Week of | Topic |
| ---- | ------- | ----- |
| 01 | 1/27 | Intro; centralized vs distributed systems; development environment setup |
| 02 | 2/3 | Multi-processing & network programming — Part 1 |
| 03 | 2/10 | Multi-processing & network programming — Part 2 |
| 04 | 2/17 | Multi-processing & network programming — Part 3 |
| 05 | 2/24 | Containerization: Docker and Kubernetes |
| 06 | 3/3 | DevOps and CI/CD |
| 07 | 3/10 | Integrate application to infrastructure |
| 08 | 3/17 | Distributed Architectures |
| 09 | 3/24 | Communication and Coordination |
| 10 | 3/31 | Consistency & Replication |
| 11 | 4/7 | Fault Tolerance |
| 12 | 4/21 | Security |
| 13 | 4/28 | Deploying on k8s on cloud-based virtual bare metal nodes |
| 14 | 5/5 | Deploying on k8s on cloud-based k8s |
| 15 | 5/12 | Final individual projects due |

Follow the links below to each week’s materials.

Week 1 — 1/27 — Intro; centralized vs distributed systems; development environment setup

Chapter 1: Introduction to Distributed Systems

1.1 From Networked to Distributed Systems

A distributed system is defined as a collection of autonomous computing elements that appears to its users as a single coherent system. This definition highlights two key features:

  1. Independent Nodes: The system consists of autonomous devices (nodes) that act independently.
  2. Single System Image: To users and applications, the system behaves as a single entity, hiding the underlying network complexity.

Distributed vs. Decentralized Systems

The distinction between these systems lies in how and why computers are connected: in a decentralized system, processes and resources are necessarily spread across multiple computers (for example, to keep data inside the administrative domain that owns it), whereas in a distributed system they are sufficiently spread across multiple computers (for example, to meet performance or fault-tolerance goals).

The Centralization Myth

A common misconception is that centralized solutions are inherently unscalable or vulnerable. However, a distinction must be made between logical and physical centralization: a logically centralized service such as DNS is physically distributed across many servers and scales to the entire Internet.

1.2 Design Goals

Building distributed systems is complex and justified only when specific goals are met:

1. Resource Sharing

The primary goal is to make resources (storage, computing power, data, networks) easily accessible and shareable among users. This allows for economic efficiency and collaboration.

2. Distribution Transparency

The system should hide the fact that its resources are physically distributed. This is typically achieved through a middleware layer.

| Transparency Type | Description |
| ----------------- | ----------- |
| Access | Hides differences in data representation and how resources are accessed. |
| Location | Hides where a resource is physically located. |
| Relocation | Hides that a resource may move to another location while in use. |
| Migration | Hides that a resource may move to another location. |
| Replication | Hides that a resource is replicated (copied) across multiple nodes. |
| Concurrency | Hides that a resource may be shared by several competitive users. |
| Failure | Hides the failure and recovery of a resource. |

Note on Transparency: Full transparency is often impossible or undesirable (e.g., hiding network latency in a real-time system is physically impossible).

3. Openness

An open system offers components that can be easily used by or integrated into other systems.

4. Scalability

Scalability is measured along three dimensions:

  1. Size scalability: users and resources can be added without a noticeable loss of performance.
  2. Geographical scalability: users and resources may lie far apart without communication delays becoming prohibitive.
  3. Administrative scalability: the system remains easy to manage even when it spans multiple, independent administrative domains.

Scaling Techniques: hiding communication latencies (asynchronous communication), distribution (partitioning a service and spreading the parts across machines, as DNS does with its namespace), and replication (including caching), which places copies of a resource closer to clients and spreads the load.
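
Replication and caching, one of the standard scaling techniques, can be sketched in a few lines of Python. This is only an illustration: the `lookup` function and its 0.1 s delay are stand-ins for an expensive remote call, not part of the course code.

```python
import functools
import time

@functools.lru_cache(maxsize=1024)
def lookup(resource_id):
    """Simulate an expensive remote lookup (e.g., a database query)."""
    time.sleep(0.1)  # stand-in for a network round trip
    return {"id": resource_id, "payload": f"data-{resource_id}"}

t0 = time.perf_counter()
lookup(7)                     # first call pays the remote latency
cold = time.perf_counter() - t0

t0 = time.perf_counter()
lookup(7)                     # cached: served locally, no round trip
warm = time.perf_counter() - t0

print(f"cold={cold:.3f}s warm={warm:.6f}s")
```

The cache trades consistency for latency: a real system must also decide when cached copies become stale, which is the subject of the Consistency & Replication week.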

1.3 Types of Distributed Systems

High-Performance Distributed Computing

Encompasses cluster computing (homogeneous nodes on a high-speed local network), grid computing (federated, heterogeneous resources spanning organizations), and cloud computing (resources provisioned on demand).

Distributed Information Systems

Focuses on integrating separate applications into an enterprise-wide system.

Pervasive Systems

Systems that blend into the environment, characterized by small, battery-powered, mobile devices.

1.4 Pitfalls

Developers often commit errors by accepting the following false assumptions (the classic fallacies of distributed computing) about the underlying network:

  1. The network is reliable.
  2. The network is secure.
  3. The network is homogeneous.
  4. The topology does not change.
  5. Latency is zero.
  6. Bandwidth is infinite.
  7. Transport cost is zero.
  8. There is one administrator.
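
The first fallacy in particular calls for defensive client code: every remote call can time out or fail and must be handled explicitly. The sketch below illustrates the idea with a bounded retry loop; the function name `fetch_with_retry` and its parameters are illustrative, not part of the course code.

```python
import socket

def fetch_with_retry(host, port, message, retries=3, timeout=2.0):
    """Attempt a request, retrying on failure -- the network is NOT reliable."""
    for attempt in range(1, retries + 1):
        try:
            # create_connection applies the timeout to the connect itself
            with socket.create_connection((host, port), timeout=timeout) as s:
                s.sendall(message)
                return s.recv(1024)
        except OSError as exc:  # refused, timed out, unreachable, ...
            print(f"Attempt {attempt} failed: {exc}")
    return None  # caller must handle total failure explicitly

# Against an unreachable port, every attempt fails and None comes back.
result = fetch_with_retry("localhost", 1, b"ping", retries=2, timeout=0.5)
print(f"Result: {result}")
```

Returning `None` (or raising a dedicated exception) forces the caller to confront failure, instead of pretending the fallacy holds.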

Python Coding Examples

The textbook emphasizes that distributed systems rely on message passing and hiding complexity (transparency). Below are Python examples illustrating Access Transparency (hiding data representation using serialization) and Basic Connectivity (the foundation of distributed systems).

Example 1: Access Transparency via Serialization

In distributed systems, machines may represent data differently. To achieve access transparency, data must be marshaled (serialized) into a standard format before transmission.

import pickle

# A complex data object (list of dictionaries)
# In a real scenario, this could be a database record or object state.
local_data = [
    {"id": 1, "action": "update", "value": 42},
    {"id": 2, "action": "delete"}
]

print(f"Original Data Type: {type(local_data)}")

# Marshaling (Serialization)
# This simulates preparing data to be sent over the network.
# It hides the internal memory representation of the Python list.
network_message = pickle.dumps(local_data)

print(f"Marshaled (Network) Data: {network_message}")

# --- Network Transmission Simulation ---

# Unmarshaling (Deserialization)
# The receiving node reconstructs the object without knowing
# the sender's internal memory layout.
received_data = pickle.loads(network_message)

print(f"Reconstructed Data: {received_data}")
print(f"Is data identical? {local_data == received_data}")

Ref: Concepts based on Section 1.2.2 and Python pickle usage in Note 4.4.
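
One caveat about the example above: pickle is Python-specific and is unsafe to use on data from untrusted peers. A common alternative, sketched here with the standard json module, marshals the same object into a language-neutral wire format that any receiver can parse.

```python
import json

local_data = [
    {"id": 1, "action": "update", "value": 42},
    {"id": 2, "action": "delete"}
]

# Marshaling: UTF-8 encoded JSON, readable by any language
network_message = json.dumps(local_data).encode("utf-8")

# The receiving node -- possibly not written in Python -- reconstructs it
received_data = json.loads(network_message.decode("utf-8"))

print(f"Is data identical? {local_data == received_data}")
```

The trade-off: JSON handles only basic types (numbers, strings, lists, maps), whereas pickle can serialize almost any Python object.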

Example 2: Basic Client-Server Communication

This example demonstrates the “Networked” aspect of distributed systems using sockets. This is the low-level mechanism upon which higher-level distributed abstractions (like RPC) are built.

The Server (Run this first):

from socket import *

def start_server():
    # Create a TCP/IP socket
    server_socket = socket(AF_INET, SOCK_STREAM)
    
    # Bind the socket to the address and port
    server_socket.bind(('localhost', 8080))
    
    # Listen for incoming connections (queue up to 1 request)
    server_socket.listen(1)
    print("Server is listening on port 8080...")
    
    while True:
        # Accept a connection
        connection, client_address = server_socket.accept()
        try:
            print(f"Connection from {client_address}")
            
            # Receive data in small chunks
            data = connection.recv(1024)
            if data:
                print(f"Received: {data.decode()}")
                
                # Send data back to the client (Echo)
                response = "Acknowledged: " + data.decode()
                connection.sendall(response.encode())
        finally:
            # Clean up the connection
            connection.close()

if __name__ == "__main__":
    start_server()

The Client:

from socket import *

def start_client():
    # Create a TCP/IP socket
    client_socket = socket(AF_INET, SOCK_STREAM)
    
    # Connect the socket to the server's port
    client_socket.connect(('localhost', 8080))
    
    try:
        # Send data
        message = "Hello Distributed World"
        print(f"Sending: {message}")
        client_socket.sendall(message.encode())
        
        # Look for the response
        response = client_socket.recv(1024)
        print(f"Received: {response.decode()}")
        
    finally:
        print("Closing socket")
        client_socket.close()

if __name__ == "__main__":
    start_client()

Ref: Adapted from Note 2.1 illustrating basic connectivity principles discussed in Chapter 1.



Week 5 — 2/24 — Containerization: Docker and Kubernetes

Chapter 3: Processes (Distributed Systems by Tanenbaum & van Steen)

3.1 Introduction to Threads

In modern distributed systems, processes are often decomposed into threads to achieve higher performance and hide latency.

Why use threads in distributed systems?

  1. Hiding Latency: When one thread blocks (e.g., waiting for a network response or disk I/O), another thread can execute, keeping the CPU utilized.
  2. Performance: Multithreaded servers can handle multiple concurrent client requests much more efficiently than spawning a new process for each request.
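
Latency hiding can be demonstrated in a few lines with Python's standard threading module. This is a minimal sketch: `slow_io` and its 0.2 s delay are stand-ins for a blocking network or disk request.

```python
import threading
import time

def slow_io(name, delay, results):
    """Stand-in for a blocking network or disk request."""
    time.sleep(delay)
    results[name] = f"{name} done"

results = {}
start = time.perf_counter()

# One thread per outstanding request: while one blocks, the others run.
threads = [threading.Thread(target=slow_io, args=(f"req{i}", 0.2, results))
           for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

elapsed = time.perf_counter() - start
print(f"5 x 0.2s requests finished in {elapsed:.2f}s")  # ~0.2s, not 1.0s
```

Five sequential 0.2 s requests would take about a second; overlapped in threads they finish in roughly the time of one, because the waiting (not the computing) dominates.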

3.2 Virtualization

Virtualization plays a foundational role in cloud computing by decoupling software from the underlying hardware.

Types of Virtualization:

  1. Hardware Virtualization (Virtual Machines): A hypervisor (Virtual Machine Monitor - VMM) allows multiple operating systems to run concurrently on the same hardware. Each VM contains a full OS, applications, and virtualized hardware resources.
    • Pros: Complete isolation, run any OS.
    • Cons: High overhead (resource-intensive to boot and run a full OS).
  2. OS-Level Virtualization (Containers): The host OS kernel is shared among multiple isolated user-space instances. Instead of virtualizing the hardware, it virtualizes the OS environment.
    • Pros: Extremely lightweight, fast startup, high density of instances on a single physical machine.
    • Cons: All instances must share the same host OS kernel type (e.g., Linux containers run on a Linux kernel).

Docker: Practical OS-Level Virtualization

Docker is the industry standard for creating and managing containers. It packages code and all its dependencies into a standard unit for software development.

Key Docker Concepts

  1. Image: an immutable, layered template containing an application and all of its dependencies.
  2. Container: a running, isolated instance of an image.
  3. Dockerfile: a text file of build instructions from which an image is created.
  4. Registry: a service, such as Docker Hub, for storing and distributing images.

Building a Containerized Application

To containerize our TCPServer from previous weeks, we use a Dockerfile. Consider the following week_05/netprog/Dockerfile snippet:

# Start from a base Ubuntu image
FROM ubuntu:24.10

# Install Java and networking tools
RUN apt-get update && apt-get install -y openjdk-21-jdk-headless dnsutils dos2unix

# Copy the compiled Java classes into the container
RUN mkdir -p /classes
COPY ./bin/TCPServer.class /classes
COPY ./bin/TCPServer\$ClientHandler.class /classes

# Copy and configure the startup script
COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh && dos2unix /entrypoint.sh

# Expose the port the server listens on
EXPOSE 12345

# Define the command to run when the container starts
CMD ["/bin/bash","-c","/entrypoint.sh"]

Docker Workflow

  1. Build: Create an image from a Dockerfile.
    • docker build -t transcriptor:v1 .
  2. Run: Execute the application inside an isolated container.
    • docker run -d -p 12345:12345 transcriptor:v1
  3. Ship: Push the image to a registry (like Docker Hub) so it can run anywhere.

Kubernetes (K8s): Container Orchestration

While Docker is excellent for running single containers, managing thousands of containers across many physical machines requires an orchestrator. Kubernetes is an open-source platform designed to automate deploying, scaling, and operating containerized applications.

Why Kubernetes?

As distributed systems scale, they face challenges: containers crash and must be restarted, traffic must be load-balanced across replicas, services must discover one another as instances come and go, and new versions must roll out without downtime.

Kubernetes solves these problems by providing a framework to run distributed systems resiliently.

Core K8s Objects

  1. Pod: The smallest deployable computing unit in Kubernetes. A Pod contains one or more containers (usually one) that share storage and network resources.
  2. Deployment: A higher-level abstraction that manages a set of identical Pods, ensuring a specified number of replicas are always running (self-healing).
  3. Service: An abstract way to expose an application running on a set of Pods as a network service. Since Pod IP addresses change frequently, a Service provides a stable endpoint for communication.
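
As a sketch only, a Deployment and a Service for the containerized server might look like the following. The image name `transcriptor:v1` matches the build command above, while the metadata names and the `app: tcpserver` label are illustrative choices.

```yaml
# Deployment: keeps 3 replicas of the server Pod running (self-healing)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tcpserver
spec:
  replicas: 3
  selector:
    matchLabels:
      app: tcpserver
  template:
    metadata:
      labels:
        app: tcpserver
    spec:
      containers:
      - name: tcpserver
        image: transcriptor:v1
        ports:
        - containerPort: 12345
---
# Service: a stable endpoint in front of the (ephemeral) Pod IPs
apiVersion: v1
kind: Service
metadata:
  name: tcpserver
spec:
  selector:
    app: tcpserver
  ports:
  - port: 12345
    targetPort: 12345
```

Clients talk to the Service name; Kubernetes routes each connection to a healthy Pod, replacing any that fail.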

K8s Architecture Summary

A Kubernetes cluster consists of a set of worker machines, called Nodes, that run containerized applications. Every cluster has at least one worker node. The worker nodes host the Pods. The Control Plane manages the worker nodes and the Pods in the cluster.


Session 5 Lab

In this week’s lab, you will:

  1. Compile the multi-threaded TCPServer Java code.
  2. Use the provided Dockerfile to build a Docker image for the server.
  3. Run the server as an isolated Docker container, exposing its port to your host machine.
  4. Use the TCPClient to communicate with the containerized application.

See the README.md inside weeks/week_05/netprog/ for step-by-step instructions.

<- back to syllabus