CITS2002 Systems Programming

Project 2 2020
The goal of this project is to write a command-line
utility program in C99,
named mergetars,
which merges the contents of multiple tar archive files
into a single tar archive file.
Project Description

A medium-sized business has decided to migrate its files to cloud-based storage, requiring it to first identify all files to migrate. A critical disk failure at the worst possible time now requires all files to be recovered from recent backups. However, the business' IT wizard has recently left for a lucrative position at a cloud-based storage company. Management has located the backups, but they have been poorly labeled, making it impossible to easily identify what is contained in each backup and when each was made. The decision has been made to migrate just the latest copy of each file to the cloud, which will require an 'intelligent merging' of the backups' contents.

The backups have been made using the widely available tar command, which writes a well-defined file format. Its name is a contraction of tape archive, reflecting the backup media with which the command was first used. While the tar command supports many actions to create, list, extract, and append tar archive files, it offers no support to merge archives together.

The business has located many backups, each holding thousands of files. The task of identifying all duplicate files, and of finding the most recent version of similar files, is too large to be performed manually, and your team has been contracted to develop a new command-line utility program to intelligently merge all of the backups' contents into a single (large) tar archive.

Program invocation

The purpose of your mergetars command-line utility is to merge the contents of multiple tar archive files into a single tar archive. The program receives one or more input filenames and a single output filename. If only a single input filename is provided, then mergetars will act like a simple file-copying program, although there is no requirement to check for this special case. A typical program invocation is:

    prompt> ./mergetars input_tarfile1 [input_tarfile2 ...] output_tarfile

Filenames will always end with the suffix .tar, indicating that the archive does not involve any compression, or with the suffix .tar.gz or .tgz, indicating that the archive is (or will be) compressed using the GZIP compression algorithm. The standard tar utility supports these cases using its -z command-line option. There is no requirement for mergetars to support any other compression schemes.

The merging criteria

The inputs are merged to form the output according to the following definitions and rules:
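
As an aside on the filename conventions from the Program invocation section (not the merging rules themselves), the following is a minimal C99 sketch of how a program might check its argument count and decide whether an archive name implies GZIP compression. The names archive_kind, has_suffix, classify_archive, tar_extract_options, and valid_argument_count are hypothetical and are not part of the project specification. The option strings assume the standard tar flags -x (extract), -z (gzip), and -f (archive file).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// Hypothetical classification of an archive by its filename suffix.
typedef enum { ARCHIVE_UNKNOWN, ARCHIVE_PLAIN, ARCHIVE_GZIP } archive_kind;

// Returns true if name ends with suffix (illustrative helper).
static bool has_suffix(const char *name, const char *suffix)
{
    size_t nlen = strlen(name);
    size_t slen = strlen(suffix);

    return nlen >= slen && strcmp(name + nlen - slen, suffix) == 0;
}

// Maps a filename to an archive kind using only the suffixes named in
// the text: .tar (uncompressed), .tar.gz and .tgz (GZIP-compressed).
archive_kind classify_archive(const char *filename)
{
    assert(filename != NULL);

    if (has_suffix(filename, ".tar.gz") || has_suffix(filename, ".tgz")) {
        return ARCHIVE_GZIP;
    }
    if (has_suffix(filename, ".tar")) {
        return ARCHIVE_PLAIN;
    }
    return ARCHIVE_UNKNOWN;
}

// Options the standard tar utility would need to extract such an archive.
// Assumption: the program delegates to tar; the text does not require this.
const char *tar_extract_options(archive_kind kind)
{
    switch (kind) {
    case ARCHIVE_GZIP:
        return "-xzf";
    case ARCHIVE_PLAIN:
        return "-xf";
    default:
        return NULL;
    }
}

// The invocation needs the program name, at least one input filename,
// and one output filename, so argc must be at least 3.
bool valid_argument_count(int argc)
{
    return argc >= 3;
}
```

Under these assumptions, a main function would treat argv[1] through argv[argc-2] as inputs and argv[argc-1] as the output, calling classify_archive on each name before doing any further work.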

Suggested approach

The project can be completed by following these recommended (but not required) steps:

It is anticipated (though not required) that a successful project will use (some of) the following system-calls, and standard C99 & POSIX functions:

Project requirements

Good luck!

Chris McDonald.