
  CITS4407/CITS2003 OPEN SOURCE TOOLS AND SCRIPTING
 
 

Lab 8: awk

Questions

  1. /lab/week9/american_dates.txt contains a list of birthdays in mm/dd/yyyy format.
    1. Use awk to print the file with dates in dd/mm/yyyy format
    2. awk -F / '{print $2"/"$1"/"$3}' american_dates.txt
    3. Use awk to print the file with dates in yyyy/mm/dd format. This one is harder!
    4. awk -F'[/ ]' '{print $3"/"$1"/"$2" "$4}' american_dates.txt
  2. What is the difference between $0 and $1 in an awk script? Try printing both and see what you get for different files.
  3. $0 is the entire input line (all fields). $1 is the first field only, where fields are split on the field separator (whitespace by default, or whatever -F sets).
  4. /lab/week9/add_this.txt contains a list of numbers
    1. Use awk to print the cumulative sum of the values on each line. It should look like this:
      1
      3
      6
      10
      15
      21
      ...
      
    2. awk '{x += $1; print x}' add_this.txt
    3. Use awk to print the sum of all values in the file. This should only print the sum at the end of the file.
    4. awk '{x += $1} END {print x}' add_this.txt
  5. Your friend has tried to fix up the university enrolment data file from last week, but they've made a mess. Write an awk script which prints all invalid lines in /lab/week9/australian-universities.csv. Valid lines must contain 4 fields and the word "University" (correctly capitalised) in the first field.
  6. awk -F"," 'NF!=4 || $1 !~ /University/ {print}' australian-universities.csv
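To see how the validity check above behaves, here is a minimal sketch that runs the same pattern on four made-up rows. The rows and their placeholder values (A, B, 1, ...) are hypothetical, since the real columns of australian-universities.csv are not shown in this lab sheet. The {print} action is omitted because printing the line is awk's default action when a pattern matches.

```shell
# Hypothetical sample rows; only the first one is valid.
invalid=$(printf '%s\n' \
    'Example University,A,B,1' \
    'example university,A,B,2' \
    'Sample University,A,B' \
    'Sample UNIVERSITY,A,B,4' |
  awk -F',' 'NF != 4 || $1 !~ /University/')

# Show every line that failed either test.
printf '%s\n' "$invalid"
```

The second and fourth rows are reported because awk regular expressions are case-sensitive, and the third because it has only 3 fields.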
    
    1. /lab/week9/original_151.csv is a list of popular portable cartoon creatures. The creatures are listed in numerical order, but the first column, which lists the number for each creature, has gone missing. Use awk to recreate this column and put the output in a file called numbered_151.csv. It should look like this:
      1,Bulbasaur,Grass,Poison,Fushigidane
      2,Ivysaur,Grass,Poison,Fushigisou
      3,Venusaur,Grass,Poison,Fushigibana
      4,Charmander,Fire,,Hitokage
      5,Charmeleon,Fire,,Lizardo
      ...
      
      Note: try to do this using > within awk instead of a shell output redirect.
    2. awk '{print FNR "," $0 > "numbered_151.csv"}' original_151.csv
    3. /lab/week9/height_weight_151.csv lists creature names followed by their height in m and weight in kg. Use printf in awk to print the heights and weights in the following format:
      Bulbasaur is 0.7m tall and weighs 6.9kg
      Ivysaur is 1.0m tall and weighs 13.0kg
      Venusaur is 2.0m tall and weighs 100.0kg
      Charmander is 0.6m tall and weighs 8.5kg
      Charmeleon is 1.1m tall and weighs 19.0kg
      ...
      
    4. awk -F',' '{printf("%s is %.1fm tall and weighs %.1fkg\n", $1, $2, $3)}' height_weight_151.csv
    5. Use awk to print the name of the tallest creature
    6. awk -F',' '{if ($2 > x) {x = $2; max = $1}} END {print max}' height_weight_151.csv
    7. Use awk to print the name of the tallest creature in both height_weight_151.csv and more_height_weight.csv
    8. awk -F',' '{if ($2 > x) {x = $2; max = $1}} END {print max}' height_weight_151.csv more_height_weight.csv
    9. Create a pipeline using paste and awk to join original_151.csv and height_weight_151.csv and then print creature information in the following format:
      Name: Bulbasaur, Height: 0.7, Weight: 6.9, Type: Grass
      Name: Ivysaur, Height: 1.0, Weight: 13.0, Type: Grass
      Name: Venusaur, Height: 2.0, Weight: 100.0, Type: Grass
      Name: Charmander, Height: 0.6, Weight: 8.5, Type: Fire
      Name: Charmeleon, Height: 1.1, Weight: 19.0, Type: Fire
      ...
      
      Note: it will help if you use comma as the paste field delimiter
    10. paste -d"," original_151.csv height_weight_151.csv | awk -F"," '{printf("Name: %s, Height: %.1f, Weight: %.1f, Type: %s\n", $1, $6, $7, $2)}'
    11. Use awk to print the number of creatures of each type. Remember that each creature in original_151.csv has two type fields. You may find it easier to write an awk script instead of trying to cram it all on one line. You can call an awk script with awk -F"," -f types.awk original_151.csv
    12. {
          types[$2]++;
          if ($3 != "") {
              types[$3]++;
          }
      }
      END{
        for(i in types)
            print i ":" types[i]
      }
      
      # to call this script, use awk -F "," -f types.awk original_151.csv
      
    13. Modify this script to print a list of creatures for each type instead of a count
    14. {
          types[$2] = types[$2] " " $1;
          if ($3 != "") {
              types[$3] = types[$3] " " $1;
          }
      }
      END{
        for(i in types)
            print i ":" types[i]
      }
      
      # to call this script, use awk -F "," -f types.awk original_151.csv
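The counting script above can be tried without the lab files. The sketch below feeds it the first five rows of original_151.csv, reconstructed from the sample output earlier in this lab with the number column removed. Because for (... in ...) visits array keys in an unspecified order, the output is piped through sort here so the result is predictable; that extra step is an addition for this example, not part of the original answer.

```shell
# First five rows, reconstructed from the sample output shown earlier in this lab.
counts=$(printf '%s\n' \
    'Bulbasaur,Grass,Poison,Fushigidane' \
    'Ivysaur,Grass,Poison,Fushigisou' \
    'Venusaur,Grass,Poison,Fushigibana' \
    'Charmander,Fire,,Hitokage' \
    'Charmeleon,Fire,,Lizardo' |
  awk -F',' '
    {
        types[$2]++                   # the first type is always present
        if ($3 != "") types[$3]++     # the second type may be empty
    }
    END {
        for (t in types) print t ":" types[t]
    }' | sort)

printf '%s\n' "$counts"
```

For these five rows the counts are Fire:2, Grass:3 and Poison:3.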
      

Bonus

  1. Use awk to calculate the average creature height and weight
  2. Write an awk command to fix the incorrect capitalisation of university names using toupper and tolower
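
One possible approach to the two bonus questions, sketched on inline data rather than the lab files. The height and weight rows are taken from the sample output earlier in this lab. The university row is hypothetical, as is the assumption that every word in the name should start with a capital letter (so a word like "of" would become "Of").

```shell
# Bonus 1: average height and weight (NR is the number of rows read).
averages=$(printf '%s\n' \
    'Bulbasaur,0.7,6.9' \
    'Ivysaur,1.0,13.0' \
    'Venusaur,2.0,100.0' \
    'Charmander,0.6,8.5' \
    'Charmeleon,1.1,19.0' |
  awk -F',' '
    { height += $2; weight += $3 }
    END {
        if (NR > 0)   # avoid dividing by zero on empty input
            printf("Average height: %.2fm, average weight: %.2fkg\n", height / NR, weight / NR)
    }')
printf '%s\n' "$averages"

# Bonus 2: capitalise each word of the first field (hypothetical input row).
fixed=$(printf '%s\n' 'eXAMPLE uNIVERSITY,A,B,1' |
  awk -F',' -v OFS=',' '
    {
        n = split($1, words, " ")
        name = ""
        for (i = 1; i <= n; i++) {
            # upper-case the first letter, lower-case the rest of the word
            w = toupper(substr(words[i], 1, 1)) tolower(substr(words[i], 2))
            name = name (i > 1 ? " " : "") w
        }
        $1 = name     # assigning a field rebuilds the line using OFS
        print
    }')
printf '%s\n' "$fixed"
```

Setting OFS to a comma matters in the second command: once $1 is assigned, awk rejoins the fields with OFS, which would otherwise be a space.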


Department of Computer Science & Software Engineering
The University of Western Australia
Last modified: 8 February 2022
Modified By: Daniel Smith
