The University of Western Australia
Computer Science and Software Engineering
 
 

CITS4407 Open Source Tools and Scripting

Exercise sheet 2 - for the week commencing 9th March 2020

REDUCE YOUR FRUSTRATION - UNDERTAKE THESE EXERCISES WITH A FRIEND.

These exercises are performed using bash and the Terminal window application (under either Linux or macOS).

These tasks introduce you to quite a number of new commands (the command names are written in italics). Remember that the man command is your best friend (use it on your own computer, not through a web-browser). Be careful when working and communicating with another student because, in some cases, the command switches differ between Linux and macOS.
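As one illustration of the Linux/macOS differences mentioned above, here is a minimal sketch of a helper that reports a file's size in bytes. It assumes GNU stat on Linux and BSD stat on macOS; the function name file_size and the file size-demo.txt are made up for this example. Confirm the switches with man stat on your own machine.

```shell
# Report a file's size in bytes using whichever stat syntax this system provides.
# file_size is a hypothetical helper name, not a standard command.
file_size() {
    case "$(uname -s)" in
        Darwin) stat -f %z "$1" ;;   # BSD stat, as shipped with macOS
        *)      stat -c %s "$1" ;;   # GNU stat, as found on most Linux systems
    esac
}

# Create a tiny demonstration file containing exactly five characters.
printf 'hello' > size-demo.txt
file_size size-demo.txt
```

The same switch letter can mean different things on the two systems, which is why checking the local manual page matters.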


  1. The Western Australian Government requires all fuel companies to register their selling price for fuel by 4PM each day. Companies must sell their fuel, the next day, at the registered price (or below), and may not alter the price during the day. To provide some transparency, the FuelWatch Historical Data website records all daily price details, and the data may be downloaded for examination.

    1. Using your web-browser (Firefox), download the data for February 2020 to your desktop. Notice that the named file (its URL) ends in ".csv.zip". What do you think that means? What is the name of the file that your web-browser stores on your desktop?
    2. The newly downloaded file is a compressed archive (zip files are most commonly compressed with the DEFLATE algorithm, which combines LZ77 and Huffman coding). We'll just refer to this file as being 'a zip file', requiring decompression (expansion). First, without decompressing the file, use the unzip command to see what is inside it.
    3. Use the ls command to see the initial size of the zip file. Now, use unzip to decompress it, observe that its filename has changed, and check its new size.
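The three steps above can be sketched as follows. The filename is an assumption (use whatever name your browser actually saved), and the sketch also assumes that the archive holds a single file named like the archive minus its ".zip" suffix. Nothing is extracted unless the zip file is really present in the current directory.

```shell
# Hypothetical filename; replace it with the name your browser saved.
zipfile="FuelWatchRetail-02-2020.csv.zip"

# Stripping the final ".zip" suffix gives the name we expect the archive to contain.
csvfile="${zipfile%.zip}"
echo "expecting to extract: $csvfile"

# Only touch the archive if it is present in the current directory.
if [ -f "$zipfile" ]; then
    unzip -l "$zipfile"      # list the contents without extracting anything
    ls -l "$zipfile"         # size of the compressed file
    unzip -o "$zipfile"      # extract, overwriting any earlier copy
    ls -l "$csvfile"         # size of the decompressed file (assumed name)
fi
```

If the listing from unzip -l shows a different inner filename, use that name in the final ls instead.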

  2. OK, we now have our data (file) in a format we can use for the following tasks.

    1. Use the wc command to determine the number of lines in the file.
    2. Use the less command to view the contents of the data file (type the 'q' key when ready to quit less). You may like to resize your Terminal window, or reduce its font size.
    3. Observe what character is used to delimit (separate) the fields (columns) of data.
      Use the cut command to list all the service-station (garage, petrol station...) names present in the file.
    4. 🌶 (Getting harder) How many distinct service-stations are represented in the file?
    5. 🌶🌶 (Even harder) Using the grep command, find all of the fuel prices (across the month) charged by 'Caltex StarMart Nedlands'. Notice that the service-station name has spaces in it.
    6. 🌶🌶🌶 (Brain busting) On what day was PULP (premium unleaded petrol) the cheapest at 'Caltex StarMart Nedlands'?
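One possible approach to the tasks above, sketched on a tiny stand-in file so that it runs without the real download. The rows, the prices, the second station name, and the four-column layout are all made up for illustration; inspect the real file's first line (for example with head -n 1) to find the actual column positions. The sketch also assumes that no field contains an embedded comma.

```shell
# Made-up sample rows for illustration only; the column layout is an assumption.
sample="fuel-sample.csv"
cat > "$sample" <<'EOF'
PUBLISH_DATE,TRADING_NAME,PRODUCT_DESCRIPTION,PRODUCT_PRICE
01/02/2020,Caltex StarMart Nedlands,ULP,145.9
01/02/2020,Caltex StarMart Nedlands,PULP,159.9
01/02/2020,Hypothetical Fuel Stop,ULP,143.5
02/02/2020,Caltex StarMart Nedlands,ULP,139.9
02/02/2020,Caltex StarMart Nedlands,PULP,153.9
02/02/2020,Hypothetical Fuel Stop,PULP,155.0
EOF

station='Caltex StarMart Nedlands'

# Task 2.1: count the lines (this includes the header line).
lines=$(wc -l < "$sample")

# Task 2.3: list the station-name column; tail -n +2 skips the header line.
tail -n +2 "$sample" | cut -d, -f2

# Task 2.4: sort -u keeps one copy of each name, and wc -l counts them.
stations=$(tail -n +2 "$sample" | cut -d, -f2 | sort -u | wc -l)

# Task 2.5: the quotes keep the name, spaces and all, as a single argument.
grep "$station" "$sample"

# Task 2.6: keep only PULP rows (the commas stop ',ULP,' rows from matching),
# sort numerically on the price field, and report the date of the cheapest row.
cheapest_day=$(grep "$station" "$sample" | grep ',PULP,' \
    | sort -t, -k4,4n | head -n 1 | cut -d, -f1)

echo "lines=$lines stations=$stations cheapest PULP day=$cheapest_day"
```

With the made-up rows above, the cheapest PULP day comes out as 02/02/2020. On the real file, expect to change the field numbers given to cut and sort.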

  3. Another similar example, from the Australian Bureau of Meteorology.

    This time we're seeking plain text versions of some monthly weather data. Download the weather data for both February 2019 and February 2020.

    1. The command-line program curl enables us to download files from websites without using a web-browser. You need to use a specific switch to curl to name the required output file, and the URL of the required file (data) from the website. Find the URLs for the February 2019 and February 2020 datasets, and save them to different files using curl.
    2. Now, choose a few metrics such as maximum temperature, rainfall, or maximum wind gust, and determine if February 2019 or February 2020 was hotter, wetter, or windier.
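A minimal sketch of both steps, under stated assumptions. The URLs are placeholders (finding the real ones is part of the exercise), the helper names fetch and max_of are invented here, and the two small data files hold made-up temperatures in a simplified two-column layout rather than the Bureau's real format. The download calls are left commented out so that the sketch runs without network access.

```shell
# Placeholder URLs (hypothetical); substitute the real ones you find on the website.
url_2019="https://example.com/feb2019.csv"
url_2020="https://example.com/feb2020.csv"

# Download one URL ($1) into a named file ($2); curl's -o switch names the output file.
fetch() {
    curl -o "$2" "$1"
}
# fetch "$url_2019" feb2019.csv     # uncomment once the real URLs are filled in
# fetch "$url_2020" feb2020.csv

# Made-up stand-in data: date,maximum temperature. NOT real observations.
printf '%s\n' '2019-02-01,31.2' '2019-02-02,35.8' '2019-02-03,29.4' > feb2019-sample.csv
printf '%s\n' '2020-02-01,33.1' '2020-02-02,38.6' '2020-02-03,30.0' > feb2020-sample.csv

# Print the highest value found in column 2 of a comma-separated file.
max_of() {
    cut -d, -f2 "$1" | sort -n | tail -n 1
}

max_2019=$(max_of feb2019-sample.csv)
max_2020=$(max_of feb2020-sample.csv)

# Put both maxima on labelled lines and let sort pick the larger one.
hotter=$(printf '%s\n' "$max_2019 2019" "$max_2020 2020" \
    | sort -n | tail -n 1 | cut -d' ' -f2)

echo "max 2019: $max_2019, max 2020: $max_2020, hotter (sample data only): $hotter"
```

The real files are likely to have extra header lines and many more columns, so the field number passed to cut will need adjusting, and any header lines will need to be skipped first.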


Chris McDonald
March 2020.
