The University of Western Australia
Computer Science and Software Engineering

CITS3002 Computer Networks

Practical project 2020

(due 5pm Fri 22nd May - end of week 11)

See also: Getting Started and Clarifications

The goal of this project is to develop a server application to manage the data and permit queries about bus and train routes, such as those in the Transperth transport network. By successfully completing the project, you will have a greater understanding of the standard TCP and UDP protocols running over IP, communication between web-browsers and application programs using HTTP and HTML, and will have developed a simple text-based protocol to make and reply to queries for distributed information.

A transport network can be considered as a connected graph - the bus and train stations are nodes in the graph, and the bus and train routes are links joining the stations. Multiple (identical) bus and train trips commence at different times throughout the day, and pairs of stations are connected by multiple bus and train routes.

In this project each bus or train station will be represented by an executing instance of a station server (software). Each server will be a distinct operating system process (not a thread within a single process). Each station server runs the same software, but manages its own data and network connections.

A standard web-browser will provide the human-facing interface to each server, so this project doesn't require the development of client software. However, a small amount of basic HTML code will need to be developed to support the interface through the browser.

A very simple webpage (rendered via a web-browser) will accept queries to find the sequence of buses and trains (a journey) to travel from one bus or train station to another. The web-browser will transmit the query to the instance of the station server (software) representing the source station (presumably one close to the user's home), asking it how to travel to a destination station.
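A minimal sketch of how a station server might pull the destination out of the browser's request and wrap its answer in an HTTP response is shown here. The query parameter name `to` and the helper names `parse_destination` and `build_response` are assumptions made for illustration; only Python's standard `urllib.parse` module is used.

```python
from urllib.parse import urlsplit, parse_qs

def parse_destination(request: bytes):
    """Return the requested destination from a raw HTTP request, or None."""
    # The request line is the first line, e.g. b"GET /?to=Greenwood-Stn HTTP/1.1"
    request_line = request.split(b"\r\n", 1)[0].decode("ascii", errors="replace")
    parts = request_line.split()
    if len(parts) != 3 or parts[0] != "GET":
        return None
    query = parse_qs(urlsplit(parts[1]).query)
    values = query.get("to")          # "to" is an assumed parameter name
    return values[0] if values else None

def build_response(body_html: str, status: str = "200 OK") -> bytes:
    """Wrap an HTML fragment in a small but complete HTTP/1.1 response."""
    body = f"<html><body>{body_html}</body></html>".encode("utf-8")
    headers = (
        f"HTTP/1.1 {status}\r\n"
        "Content-Type: text/html; charset=utf-8\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Connection: close\r\n"       # the server closes after each reply
        "\r\n"
    )
    return headers.encode("ascii") + body
```

A browser typically also requests `/favicon.ico`; in this sketch such a request simply yields `None`, which the server could answer with a `404 Not Found` status.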

Each bus or train station server only maintains the timetable information of buses and trains leaving that station, including the destination of each, and the (multiple) times throughout each day that each bus or train leaves the station and arrives at its destination.

If the source and destination stations are directly connected (via a single bus or train trip), then the source station server will be able to immediately respond to the query because it has all necessary information.
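The directly-connected case can be sketched as a lookup in the station's own timetable. The line format used below (`departs,route,arrives,destination` with zero-padded `HH:MM` times) is an assumption for illustration only, as are the names `Trip`, `parse_timetable` and `next_direct_trip`; the real file format is described on the Getting Started page.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Trip:
    departs: str       # "HH:MM", zero-padded so string order equals time order
    route: str
    arrives: str
    destination: str

def parse_timetable(text: str) -> list:
    """Parse an assumed 'departs,route,arrives,destination' line format."""
    trips = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                      # skip blank lines and comments
        departs, route, arrives, destination = line.split(",")
        trips.append(Trip(departs, route, arrives, destination))
    return trips

def next_direct_trip(trips, destination, not_before):
    """Earliest-arriving direct trip leaving at or after not_before, or None."""
    candidates = [t for t in trips
                  if t.destination == destination and t.departs >= not_before]
    return min(candidates, key=lambda t: t.arrives, default=None)
```

If `next_direct_trip` returns `None`, the station has no remaining direct trip today and must either report that or ask its neighbours.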

If, however, the source and destination stations are not directly connected, the passenger will have to travel via two or more buses or trains, transferring at intermediate station(s). Because each station server only knows about buses and trains leaving that station, it will need to ask other stations' servers for information about the next segment (or hop) of the whole journey.
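One possible shape for the station-to-station messages is a single line of space-separated text fields, which any of the permitted languages can produce and parse. The message layout below (`QUERY id destination time visited-list`) and the function names are purely hypothetical; the sketch assumes station names contain no spaces or commas, and that the originating station always lists itself in the visited list.

```python
def make_query(query_id, destination, earliest, visited):
    """Encode a query as one ASCII line suitable for a UDP datagram."""
    fields = ["QUERY", str(query_id), destination, earliest, ",".join(visited)]
    return " ".join(fields).encode("ascii")

def parse_query(datagram):
    """Decode a datagram produced by make_query()."""
    kind, query_id, destination, earliest, visited = \
        datagram.decode("ascii").split(" ")
    if kind != "QUERY":
        raise ValueError(f"unexpected message type: {kind}")
    return int(query_id), destination, earliest, visited.split(",")

def forward_query(datagram, this_station, arrival_at_neighbour):
    """Return the datagram to send on to a neighbour, or None if it has looped.

    arrival_at_neighbour is the time the passenger would reach that
    neighbour, which becomes the earliest departure time for the next hop.
    """
    query_id, destination, _, visited = parse_query(datagram)
    if this_station in visited:
        return None                       # already seen here: drop to avoid cycles
    return make_query(query_id, destination, arrival_at_neighbour,
                      visited + [this_station])
```

Carrying the list of visited stations in each query is one simple way to cope with networks that contain cycles, without any station needing a view of the whole network.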

The result (the answer) returned from the source station back to the web-browser will indicate the number and departure time of the next bus or train leaving the source station that enables the passenger to reach the required destination station, and the expected final arrival time at the destination. Ideally, the returned result will be the fastest journey - even if it leaves later or includes more segments (hops) than other journeys (but first, just report any valid journey!)
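Choosing among several candidate journeys can be sketched as below, taking "fastest" to mean the earliest final arrival time (an interpretation, not something this page defines precisely). The `Journey` record and `fastest` function are hypothetical names.

```python
from typing import NamedTuple

class Journey(NamedTuple):
    route: str           # number/name of the first bus or train to catch
    departs: str         # "HH:MM" departure from the source station
    final_arrival: str   # "HH:MM" arrival at the destination station
    hops: int            # number of segments in the whole journey

def fastest(journeys):
    """Return the journey with the earliest final arrival, or None if empty."""
    return min(journeys, key=lambda j: j.final_arrival, default=None)

candidates = [
    Journey("Bus-12", "07:15", "08:40", 1),
    Journey("Train-3", "07:30", "08:10", 3),   # leaves later, more hops
]
best = fastest(candidates)
```

Here `best` is the `Train-3` journey: it departs later and has more segments, but reaches the destination first.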

Networking details

  1. each executing instance of a station server will run as a separate operating system process (not a thread in a single process).

  2. all station servers will execute on the same computer. Thus, all network traffic and URLs will refer to localhost or IP address 127.0.0.1 to identify the computer running each server. As localhost or 127.0.0.1 will be implied for all communication, we do not need to specify it as, for example, a command-line argument. Different port numbers will distinguish the operating system processes involved in the communication. A fully distributed project, running on different computers, would require both hostname and port information (why not try it if you have access to multiple computers?)

    A typical invocation of two station server processes is:

    shell>  ./station Warwick-Stn 2401 2408 2560 2566 .... &
    shell>  ./station.py Greenwood-Stn 2402 2560 2567 2408 .... &

    which indicates that the first process (a compiled C program) will manage the data of the station named "Warwick-Stn", will receive queries from web-browsers using TCP/IP port 2401, will receive datagrams from other stations using UDP/IP port 2408, and that "Warwick-Stn" is 'physically adjacent' to 2 other stations that are receiving station-to-station datagrams on UDP/IP ports 2560 and 2566.

    The second server process (a Python script) will similarly manage the data of the station named "Greenwood-Stn", and is 'physically adjacent' to "Warwick-Stn".

    Notice, also, that both processes have been 'started in the background', because neither needs to remain connected to the invoking keyboard.

  3. each station server will accept queries about its timetable data from a standard web-browser. The query and reply will be transmitted using (the minimum necessary amount of) the HTTP protocol and the Hypertext Markup Language (HTML), carried over a bidirectional TCP/IP connection. After the exchange of each query and its response, the station server must close the connection.

  4. station servers will communicate with each other, if necessary, using UDP/IP datagrams.

  5. no station server should ever contain all knowledge about the whole network, timetabling data, or network connections. Each station's timetabling data, recorded in one textfile for each station, may change at any time (for example, if a bus breaks down, its next trip will be cancelled). Every query arriving at a station server (via a UDP/IP datagram) should be answered using the current up-to-date timetable information for that station.
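The command-line convention and socket setup described in the points above can be sketched with Python's standard `socket` module. The function names `parse_args` and `open_sockets` are hypothetical; the argument order (station name, TCP port, UDP port, then neighbours' UDP ports) follows the example invocation.

```python
import socket

def parse_args(argv):
    """Split argv into (name, tcp_port, udp_port, neighbour_udp_ports)."""
    if len(argv) < 4:
        raise SystemExit(
            "usage: station name tcp-port udp-port [neighbour-udp-port ...]")
    name = argv[1]
    tcp_port = int(argv[2])
    udp_port = int(argv[3])
    neighbour_ports = [int(p) for p in argv[4:]]
    return name, tcp_port, udp_port, neighbour_ports

def open_sockets(tcp_port, udp_port, host="127.0.0.1"):
    """Create the listening TCP socket (browsers) and UDP socket (stations)."""
    tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Allow quick restarts of the server on the same port during testing.
    tcp.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    tcp.bind((host, tcp_port))
    tcp.listen()
    udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    udp.bind((host, udp_port))
    return tcp, udp
```

A server would call `parse_args(sys.argv)` at start-up and then wait on both sockets at once, for example with `select.select([tcp, udp], [], [])`, so that browser connections and station datagrams are each handled as they arrive.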

Constraints

The constraints of the project require that:

  1. you must develop two implementations of the station server, in two different programming languages selected from Java, Python, and one of C or C++ (note: C and C++ together do not count as two languages). The two implementations must perform identically - as a client of these servers you should not be able to tell (or care) which programming language is being used.

  2. you should employ the core networking functions (classes, methods, libraries,...) of your chosen programming languages and not employ specific 3rd-party frameworks or resources to complete large parts of the project. Specifically, you must not use Python's http.server module (even though it is a standard module), or use C++'s Boost library.
    The learning in this project comes from developing an understanding of how an operating system's system calls, and programming languages' standard libraries, may be used to address these types of problems. There is far less learning (or a different type of learning) required in just combining existing libraries and modules to solve this problem. If in doubt, please ask.

  3. your project will be marked on either Apple macOS or Linux. You must develop your project on (just one of) macOS or Linux, either natively or on Microsoft's Windows-10 using Windows Subsystem for Linux (WSL). Your project does not have to work on both operating systems (though that would not be difficult).
    Clearly indicate in your submission which operating system you used.

Project inputs

See the Getting Started page for an example of a simple 4-station network. While your servers will need to read in and parse the contents of the timetable files, you can assume that all their contents are correct (time-formats are correct, departure times precede arrival times, destination station names exist, etc).

The Getting Started page provides a small number of helpful shellscripts to generate and execute your transport networks (you do not have to use these shellscripts if you wish to manage these steps yourself). One shellscript extracts necessary information from downloaded Transperth GTFS datafiles, but you should not attempt to use such a large dataset until you have tested your project on a much smaller transport network.

You can define your own simple transport networks for testing your servers; just invent some station names, and timetable information for buses connecting the stations. You do not need to use the full set of Transperth GTFS files to start the project.

Important dates and project submission

  1. The project contributes 40% of your mark in CITS3002 this semester.

  2. The project is to be completed as INDIVIDUAL WORK. You may discuss general ideas with other students, but you may not share code from your developed solution. You may use material found in books or tutorials (either physical or online) but must cite the sources of such material.

  3. The project's deadline is 5pm Friday 22nd May (end of week 11). By this deadline submit all source code files and (optionally, any new) scripts that you wish to be assessed. Do not submit any of the original Transperth datafiles or station timetable files.
    Submit your work via cssubmit.
    Ensure that each submitted file contains, as a comment, your name and student number.

  4. It is anticipated that you will undertake the project on your home or laptop computers, running macOS or Linux, either natively or on Microsoft's Windows-10 using Windows Subsystem for Linux (WSL). Alternatively, you may prefer to develop your project on a computer in CSSE laboratories, accessed over UWA's implementation of UniDesk, although this is rather slow. Please report any difficulties you have in accessing a computer for the project.

Marking rubric (/40 marks)

  1. 5 marks
    Implementation of station-server processes in two programming languages, employing (specifically) the standard Berkeley socket networking features provided by each language, with no reliance on a shared file-system or other inter-process communication mechanisms. Both implementations are expected to address the following points.
  2. 5 marks
    Ability to receive a new TCP connection from a client (such as a web-browser, or curl), receiving a request written in simple HTTP, and replying to the same client using simple HTTP and HTML.
  3. 5 marks
    Ability to establish a UDP communication endpoint, used to exchange datagrams with neighbouring stations, and to communicate to (only) the UDP ports of neighbouring stations.
  4. 5 marks
    Ability to read a station's timetable file, storing it in a suitable data-structure, which is accessed for each query; ability to detect that a timetable file has changed, to delete/dispose of the previous information, and move to using the new information.
  5. 5 marks
    Design and implementation of a simple programming language independent protocol to exchange queries, responses, and (possibly) control information between stations.
  6. 5 marks
    Ability to find a valid (but not necessarily optimal) route between origin and destination stations, for varying sized transport-networks of 2, 3, 5, 10, and 20 stations (including transport-networks involving cycles), with no station attempting to collate information about the whole transport-network; ability to support multiple, concurrent queries from different clients.
  7. 5 marks
    Ability to detect and report when a valid route does not exist (on the current day).
  8. 5 marks
    Use of sound programming practices, including consistent indentation of source-code, use of significant and descriptive comments, meaningful choice and use of identifiers and parameters, minimal use of global variables or state, use of each language's scoping facilities (such as separate files, nested functions/methods) to restrict access to data and functions/methods, detecting and reporting errors returned from system- and library functions/methods, and citations made to written and online resources directly employed in your project.
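Rubric item 4 asks that a changed timetable file be detected and its old contents discarded. One way to sketch this, assuming the file's modification time is a reliable signal of change, is to compare `os.stat()` results before answering each query. The class name `TimetableCache` and its `parse` callback are hypothetical.

```python
import os

class TimetableCache:
    """Re-read a timetable file whenever its modification time changes."""

    def __init__(self, path, parse):
        self.path = path
        self.parse = parse        # function turning file text into trip data
        self.mtime = None         # modification time of the version we hold
        self.trips = []

    def current(self):
        """Return up-to-date timetable data, reloading the file if needed."""
        mtime = os.stat(self.path).st_mtime_ns
        if mtime != self.mtime:
            with open(self.path) as f:
                # Replacing self.trips discards the previous information.
                self.trips = self.parse(f.read())
            self.mtime = mtime
        return self.trips
```

Calling `current()` at the start of every query keeps the cost to a single `stat` call when nothing has changed.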

Clarifications

Please post requests for clarification about any aspect of the project to help3002 so that all students may remain equally informed.
Clarifications will also be added to the project clarifications webpage.


Good luck,

Chris McDonald
April 2020.
