Department of Computer Science and Software Engineering
CITS4407 Open Source Tools and Scripting
Exercise sheet 2 - for the week commencing 9th March 2020
REDUCE YOUR FRUSTRATION - UNDERTAKE THESE EXERCISES WITH A FRIEND.
These exercises are performed using bash and the Terminal window application
(under either Linux or macOS).
These tasks introduce you to quite a number of new commands
(the command names written in italics).
Remember that
the man command is your best friend
(on your computer, not through a web-browser).
Be careful when working and communicating with another student as,
in some cases,
the command switches are different between Linux and macOS.
- The Western Australian Government requires all fuel companies to register
their selling price for fuel by 4PM each day.
Companies must sell their fuel, the next day, at the registered price (or below),
and may not alter the price during the day.
To provide some transparency,
the
FuelWatch Historical Data website
records all daily price details,
and the data may be downloaded for examination.
- Using your web-browser (Firefox),
download the data for February 2020 to your desktop.
Notice that the named file (its URL) ends in ".csv.zip".
What do you think that means?
What is the name of the file that your web-browser stores on your desktop?
- The newly downloaded file has been compressed in the zip format
(which typically uses the DEFLATE algorithm, a combination of LZ77 and Huffman coding).
We'll just refer to this file as being 'a zip file',
requiring decompression (expansion).
Firstly, without uncompressing the file,
use the unzip command to see what is inside the compressed file.
- Use the ls command to see the initial size of the zip file.
Now, use unzip to decompress it,
observe that its filename has changed,
and check its new size.
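The inspect-then-extract steps above can be sketched as a small bash function. This is only an illustration: the function name inspect_archive is hypothetical, and the archive's name is left as an argument because it depends on what your web-browser actually saved. The -l and -d switches shown are those of the unzip usually shipped with Linux and macOS; check man unzip on your own machine.

```shell
# inspect_archive is a hypothetical helper name, not a standard command.
inspect_archive() {
    local archive=$1
    if [ ! -f "$archive" ]; then
        echo "no such file: $archive" >&2
        return 1
    fi
    ls -l "$archive"                              # size of the compressed file, in bytes
    unzip -l "$archive"                           # list the contents without extracting
    unzip "$archive" -d "$(dirname "$archive")"   # extract beside the archive
    ls -l "$(dirname "$archive")"                 # compare with the extracted file's size
}

# Example (substitute the real name of your download):
#   inspect_archive ~/Desktop/your-download.csv.zip
```

Comparing the two ls -l listings shows how much the compression saved.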
- OK, we now have our data (file) in a format we can use for the following
tasks.
- Use the wc command to determine the number of lines in the file.
- Use the less command to view the contents of the data file
(type the 'q' key when ready to quit less).
You may like to resize your Terminal window, or reduce its font size.
- Observe what character is used to delimit (separate)
the fields (columns) of data.
Use the cut command to list all the service-station
(garage, petrol station...) names present in the file.
- 🌶 (Getting harder)
How many distinct service-stations are represented in the file?
- 🌶🌶 (Even harder)
Using the grep command,
find all of the fuel prices (across the month) charged
by 'Caltex StarMart Nedlands'.
Notice that the service-station name has spaces in it.
- 🌶🌶🌶 (Brain busting) On what day was PULP (premium unleaded petrol)
the cheapest at 'Caltex StarMart Nedlands'?
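As a hedged illustration of how these commands combine, the sketch below runs on a tiny invented file rather than the real download. The four-column layout, the column names, and every price are assumptions made up for this example; the real file may have more columns in a different order, so check it with less and adjust the field numbers given to cut and sort.

```shell
# Build a tiny stand-in for the real data file (all values below are invented).
workdir=$(mktemp -d)
data="$workdir/sample.csv"
cat > "$data" <<'EOF'
PUBLISH_DATE,TRADING_NAME,PRODUCT_DESCRIPTION,PRODUCT_PRICE
01/02/2020,Caltex StarMart Nedlands,ULP,149.9
01/02/2020,Caltex StarMart Nedlands,PULP,163.9
01/02/2020,BP Subiaco,ULP,147.5
02/02/2020,Caltex StarMart Nedlands,ULP,139.9
02/02/2020,Caltex StarMart Nedlands,PULP,153.9
02/02/2020,BP Subiaco,PULP,161.0
03/02/2020,Caltex StarMart Nedlands,PULP,158.9
EOF

# Number of lines in the file (this count includes the header line).
wc -l "$data"

# Field 2 holds the station name in this invented layout.
cut -d, -f2 "$data"

# Distinct stations: skip the header, keep field 2, remove duplicates, count.
stations=$(tail -n +2 "$data" | cut -d, -f2 | sort -u | wc -l)
echo "distinct stations:" $stations

# Quoting the pattern keeps the spaces in the name as part of one argument.
grep 'Caltex StarMart Nedlands' "$data"

# Cheapest PULP day: filter twice, sort numerically on the price field,
# keep the first (lowest) line, then print its date field.
cheapest_day=$(grep 'Caltex StarMart Nedlands' "$data" \
    | grep ',PULP,' \
    | LC_ALL=C sort -t, -k4,4n \
    | head -n 1 \
    | cut -d, -f1)
echo "cheapest PULP day: $cheapest_day"
```

The pattern ',PULP,' includes the surrounding commas so that it matches the whole product field. LC_ALL=C is set for sort so that the decimal point is read the same way whatever your locale.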
- Another similar example, from the
Australian
Bureau of Meteorology.
This time we're seeking plain text versions of some monthly weather data.
Download the weather data for both
February 2019 and February 2020.
- The command-line program curl
enables us to download files from websites without using a web-browser.
You need to give curl a specific switch
to name the required output file,
as well as the URL of the required file (data) from the website.
Find the URLs for the February 2019 and February 2020 datasets,
and save them to different files using curl.
- Now, choose a few metrics such as maximum temperature,
rainfall,
or maximum wind gust,
and determine whether February 2019 or February 2020
was hotter, wetter, or windier.
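The download and comparison steps above might be sketched as follows. The helper names fetch and hottest are hypothetical, and the URLs are deliberately not filled in, as finding them is part of the exercise. The two small files are invented stand-ins: the real Bureau of Meteorology files have their own columns and header lines, so inspect them with less and adjust the field number accordingly.

```shell
# fetch is a hypothetical helper: curl's -o switch names the output file.
fetch() {
    curl -o "$2" "$1"
}
# Example (URL left for you to find):
#   fetch "<URL of the February 2019 data>" feb2019.csv

# Invented sample data in an assumed layout: date, max temp, rain, max gust.
workdir=$(mktemp -d)
cat > "$workdir/feb2019.csv" <<'EOF'
date,max_temp_c,rain_mm,max_gust_kmh
2019-02-01,31.2,0.0,43
2019-02-02,35.6,0.0,50
2019-02-03,29.8,1.2,39
EOF
cat > "$workdir/feb2020.csv" <<'EOF'
date,max_temp_c,rain_mm,max_gust_kmh
2020-02-01,33.0,0.0,46
2020-02-02,38.4,0.0,41
2020-02-03,30.1,4.6,57
EOF

# hottest: skip the header, keep field 2, sort numerically, take the largest.
hottest() {
    tail -n +2 "$1" | cut -d, -f2 | LC_ALL=C sort -n | tail -n 1
}

max2019=$(hottest "$workdir/feb2019.csv")
max2020=$(hottest "$workdir/feb2020.csv")

# The shell's [ ] test cannot compare decimals, so let sort pick the larger value.
hotter=$(printf '%s 2019\n%s 2020\n' "$max2019" "$max2020" \
    | LC_ALL=C sort -n | tail -n 1 | cut -d' ' -f2)
echo "hottest day: 2019 = $max2019, 2020 = $max2020, hotter year = $hotter"
```

Changing the field number passed to cut gives the same comparison for the maximum wind gust. A monthly rainfall total needs the values summed rather than sorted.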
Chris McDonald
March 2020.