Bash Scripting (DUE: 3/12 11:59 PM EDT)

This assignment requires you to write three shell scripts from scratch (or more if you wish to break things across multiple files). These problems are designed to give you some hands-on experience in the following areas:

  • Leveraging UNIX utilities
  • Utilizing UNIX redirection
  • Text processing
  • Regular expressions
  • Command line processing

The design and implementation of each problem (unless otherwise specified) is completely up to you. Suggested approaches may be mentioned in class, but they are just that: suggestions. You are welcome to take any approach you see fit.

Some guidelines for the problems:

  • All problems should handle errors in a reasonable way. Your solutions will be tested for invalid input or usage. Give the user a meaningful message if they misuse the script or if an error occurs.
  • Projects will be graded electronically on one of the GL Linux servers — linux[123].gl.umbc.edu. You are welcome to develop your assignment at home, but make sure your scripts run correctly on the GL Linux servers.

The assignment will be submitted through GitHub. To get started with the assignment, go to https://classroom.github.com/a/AJeuageo and sign in to your GitHub account. This will create a repository for you containing three empty files, with the names you are expected to use for this assignment. Please include a README with your name in it.

Git on GL is outdated and requires a slightly different mechanism. On most systems, running a git command will prompt for your GitHub username and password. On GL, it expects the username as part of the command. One way to achieve this is to modify the clone command so that it reads:

git clone https://YOUR_GITHUB_USERNAME@github.com/YOUR_REPO_INFORMATION

Prior to doing this, you may need to run the command unset SSH_ASKPASS. This is so GL knows not to try and prompt you for your password using a GUI, and instead asks for it on the command line.

If you are using tcsh or another c-shell, you need to type unsetenv SSH_ASKPASS to prevent the GUI error.

1. addressbook.sh (30 Points)

Task

In this problem, you are to implement a script, addressbook.sh, that manages a simple address book from the command line. The user will be able to add entries to, remove entries from, and search for entries in this address book.

The addressbook.sh script accepts the following forms of command line arguments:

  • ./addressbook.sh help
  • ./addressbook.sh add "First Name" "Last Name" <Email>
  • ./addressbook.sh remove "First Name" "Last Name"
  • ./addressbook.sh search String [--email] [--whole-name]

Any command line arguments that do not adhere to one of the aforementioned forms should be treated as an error.
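One possible way to validate these forms is a small dispatcher built on a case statement. The sketch below is only an illustration: the function names usage and check_args are hypothetical, and it checks argument counts and flag spelling, nothing more (repeated flags, for instance, are not rejected).

```shell
#!/bin/bash
# Sketch only: usage and check_args are hypothetical helper names.

usage() {
    echo "Usage: addressbook.sh help"
    echo "       addressbook.sh add FIRST LAST EMAIL"
    echo "       addressbook.sh remove FIRST LAST"
    echo "       addressbook.sh search STRING [--email] [--whole-name]"
}

# check_args COMMAND [ARGS...]: succeeds only for a recognized form.
check_args() {
    local flag
    case "$1" in
        help)   [ "$#" -eq 1 ] ;;
        add)    [ "$#" -eq 4 ] ;;
        remove) [ "$#" -eq 3 ] ;;
        search)
            [ "$#" -ge 2 ] || return 1
            shift 2                          # drop "search" and the search string
            for flag in "$@"; do
                case "$flag" in
                    --email|--whole-name) ;; # accepted flags
                    *) return 1 ;;
                esac
            done
            ;;
        *)  return 1 ;;
    esac
}

# Possible use at the top of the script:
#   check_args "$@" || { usage >&2; exit 1; }
```

Sending the usage text to stderr and exiting with a non-zero status is a common convention for misuse, though the assignment does not mandate it.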

Help

The help command should print a synopsis of the available commands along with a brief description of what each command does.

Add

The add command takes exactly three arguments. The quotes are only necessary if the first or last names have spaces in them.

The add command should add a person to your address book, in the specified data location (see below). Duplicate entries are permitted.

Remove

The remove command takes exactly two arguments, the first and last name of someone. For each entry in the address book that has that name, the user should be asked whether they wish to remove that specific person, with all the information about the person shown. The user can respond with a "Y" or "N", with the default being "N" if nothing is entered. If the person is not in the address book, an appropriate message should be printed.
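The Y/N prompt with a default of N could be written as a small helper. This is a sketch under assumptions: confirm is a hypothetical name, and only a reply of y or Y counts as a yes.

```shell
#!/bin/bash
# confirm PROMPT: hypothetical helper; succeeds only when the reply is y or Y.
# An empty reply (just Enter) or anything else is treated as "N".
confirm() {
    local reply
    read -r -p "$1 (y/N) " reply
    case "$reply" in
        [yY]) return 0 ;;
        *)    return 1 ;;
    esac
}

# Possible use:
#   if confirm "Would you like to delete Michael Scott?"; then ...; fi
```

If the surrounding loop reads the address book from standard input, the reply has to come from somewhere else, for example by reading the entries through a separate file descriptor.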

Search

The search command has one required argument and should print out all entries in the address book that match the search string. By default, the search should be done on the last name of each entry.

The optional flags change which field is searched. If --whole-name is specified, then both the first and last name fields should be used when searching. If --email is specified, only the email field should be used in the search. Both flags can be specified at once.
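Field-restricted searching can be sketched with awk. Everything specific here is an assumption rather than a requirement: the function name search_entries, the one-entry-per-line layout of first name, last name, and email separated by tabs, plain substring matching, and the reading of "both flags at once" as a search over all three fields.

```shell
#!/bin/bash
# search_entries FILE STRING [MODE]   (hypothetical helper)
# MODE: last (default), name (--whole-name), email (--email), all (both flags).
# Assumed layout: first<TAB>last<TAB>email, one entry per line.
# Note: awk -v interprets backslash escapes in the value it is given.
search_entries() {
    local file=$1 needle=$2 mode=${3:-last}
    awk -F '\t' -v s="$needle" -v mode="$mode" '
        function has(field) { return index(field, s) > 0 }
        (mode == "last"  && has($2)) ||
        (mode == "name"  && (has($1) || has($2))) ||
        (mode == "email" && has($3)) ||
        (mode == "all"   && (has($1) || has($2) || has($3)))
    ' "$file"
}
```

Matching lines are printed unchanged, so formatting them into the table shown in the example output is a separate step.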

Data File

The data for this problem must be stored under a directory called “.book” in the user's home directory. Your script should gracefully handle the case where ~/.book/ doesn't yet exist: whichever command the user runs, create the directories and files as needed. The format of any files in this directory is left completely open for you to decide.

You are not permitted to write to any location other than the ~/.book/ directory. If the user wishes to wipe the entire address book, they may simply remove the ~/.book/ directory and start over again from scratch.
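Creating the data location on demand can be as short as the following sketch. The helper name ensure_book and the file name entries are hypothetical, since the layout inside ~/.book/ is yours to choose.

```shell
#!/bin/bash
# ensure_book [DIR]: hypothetical helper that creates the data directory and an
# empty data file when they are missing. DIR defaults to ~/.book.
ensure_book() {
    local dir=${1:-"$HOME/.book"}
    mkdir -p "$dir" && touch "$dir/entries"
}

# Possible use before any command touches the data:
#   ensure_book || { echo "Cannot create ~/.book" >&2; exit 1; }
```

mkdir -p succeeds when the directory already exists, and touch leaves the contents of an existing file alone, so the helper is safe to call on every run.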

Example Output

$ ./addressbook.sh add Michael Scott boss@dundermifflin.com
$ ./addressbook.sh add Michael Scott boss@mspaperco.com
$ ./addressbook.sh add Dwight Schrute bearsbeetsbattlestar@schrutefarms.io
$ ./addressbook.sh add Pam Halpert artist@dundermifflin.com
$ ./addressbook.sh add Jim Halpert prankster@dundermifflin.com 

$ ./addressbook.sh search Halpert
The following results were found for Halpert:

First Name 		Last Name 	Email
--------------------------------------
Pam				Halpert		artist@dundermifflin.com
Jim				Halpert		prankster@dundermifflin.com

$ ./addressbook.sh search dundermifflin
No results found for dundermifflin

$ ./addressbook.sh search dundermifflin --email
The following results were found for dundermifflin:

First Name 		Last Name 	Email
--------------------------------------
Michael			Scott		boss@dundermifflin.com
Pam				Halpert		artist@dundermifflin.com
Jim				Halpert		prankster@dundermifflin.com

$ ./addressbook.sh remove Michael Scott
Would you like to delete Michael Scott, with email address boss@dundermifflin.com? (y/N)Y
Would you like to delete Michael Scott, with email address boss@mspaperco.com? (y/N)Y

$ ./addressbook.sh search dundermifflin --email
The following results were found for dundermifflin:

First Name 		Last Name 	Email
--------------------------------------
Pam				Halpert		artist@dundermifflin.com
Jim				Halpert		prankster@dundermifflin.com

$ ./addressbook.sh help
Your clever help text here


How you will be graded

The points for this problem will be assigned according to the following rubric:

Requirement                                                           Points
----------------------------------------------------------------------------
All commands are validated for proper arguments                       4
Help prints a good summary of the script                              3
~/.book is created if it doesn't exist                                2
A properly formatted add command adds the person to the address book  4
Remove prompts the user to remove each matching person                4
Remove deletes a person from the address book upon receiving a 'Y'    4
Search prints a message if no results are found                       1
Search finds matching people and prints them nicely                   4
Search flags allow the other fields to be searched correctly          4
Total                                                                 30

2. dupes.sh (30 points)

Task

For this problem you will write a shell script called dupes.sh that allows a user to find duplicate files in a given directory tree. Specifically, this utility identifies duplicate files based on content, not name.

The dupes.sh script shall take exactly 1 command line argument:

  1. A path to a directory on the file system

This script will generate a report which is displayed on stdout. The report should contain the following elements:

  • For each set of duplicate files, the report shall print a header that states:
    • The number of instances of that file
    • The size of each instance of the file (in a human readable format — e.g. B, K, M, G)
  • For each set of duplicate files, after the header, the report shall list each file (path + filename)
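  • After all of the sets of duplicate files have been output, the script should output the number of duplicates and their total size (also in a human readable form). In the sample output below, the total covers only the extra copies, that is, the space that would be freed by keeping one file from each group.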
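Content-based grouping is often done by hashing each file and collecting the paths that share a hash. The following is a minimal sketch of that idea, assuming GNU md5sum is available on the grading servers; list_duplicate_groups is a hypothetical name, and the header lines and totals required by the report are left out.

```shell
#!/bin/bash
# list_duplicate_groups DIR   (hypothetical helper)
# Prints "checksum<TAB>path" for every file whose checksum occurs more than
# once under DIR; lines that belong to the same group are adjacent.
list_duplicate_groups() {
    find "$1" -type f -exec md5sum {} + 2>/dev/null |
        sort |
        awk '{
            sum  = substr($0, 1, 32)   # md5sum prints 32 hex digits first
            path = substr($0, 35)      # then two separator characters, then the path
            count[sum]++
            group[sum] = group[sum] sum "\t" path "\n"
        }
        END {
            for (sum in count)
                if (count[sum] > 1)
                    printf "%s", group[sum]
        }'
}
```

The order in which awk visits the groups is unspecified, and file names containing newlines or backslashes are not handled by this sketch.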

Sample Output

    
./dupes.sh downloads

2 files (5B each)
  downloads/asdf
  downloads/foo/bar/asdf

3 files (116B each)
  downloads/dupes.sh
  downloads/foo/bar/dupes.sh
  downloads/foo/dupes.sh

2 files (85.56K each)
  downloads/bootstrap.zip
  downloads/foo/bootstrap.zip

2 files (92.62K each)
  downloads/jquery-1.9.1.min.js
  downloads/jquery-1.9.1.min.js-copy

2 files (2.54M each)
  downloads/Mou(2).zip
  downloads/Mou.zip

2 files (4B each)
  downloads/some-file
  downloads/some-other-file

===========================================

Total Duplicates: 7
Total Size: 2.72M

Notes

  • The number of duplicates is not the total number of files shown in the lists of duplicates. Rather, assume the user would want to keep at least one copy out of each group. So, the count is the total number of files minus the number of groups. In the sample output above, 13 files in 6 groups give 7 duplicates.
  • For the purposes of this assignment 1K = 1000B, 1M = 1000K, 1G = 1000M.
  • You're permitted to write out temporary files if you feel that would be useful. If you do so, you must write them to /tmp/<PID>*, where <PID> is the process ID of your script, and clean up when the script exits.
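The decimal units from the notes above (1K = 1000B and so on) can be applied with a short awk helper. This is a sketch, not a required design: human_size is a hypothetical name, and the two-decimal format is simply copied from the sample output.

```shell
#!/bin/bash
# human_size BYTES: hypothetical helper that prints BYTES using B, K, M or G,
# with 1K = 1000B. Values below 1000 are printed as whole bytes.
human_size() {
    awk -v b="$1" 'BEGIN {
        split("B K M G", unit, " ")
        i = 1
        while (b >= 1000 && i < 4) { b /= 1000; i++ }
        if (i == 1) printf "%dB\n", b
        else        printf "%.2f%s\n", b, unit[i]
    }'
}
```

For example, human_size 116 prints 116B and human_size 85560 prints 85.56K.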

How you will be graded

The points for this problem will be assigned according to the following rubric:

Requirement                                                                                          Points
-----------------------------------------------------------------------------------------------------------
All temporary files are stored in the correct location                                               3
All temporary files are deleted at the completion of the script                                      2
Command arguments are validated (i.e., argument is a directory, etc.)                                2
Duplicate files are found (based on content, not name)                                               8
Duplicate files are found at all levels of the directory structure, not just the immediate children  2
Files are grouped correctly (no file in more than one group, all matching files in the same group)   3
The size of each group is printed and correct                                                        3
The total size is printed and correct                                                                3
All sizes are printed in a human-readable format                                                     2
The number of duplicates is printed and correct                                                      2
Total                                                                                                30

3. clean.sh (40 points)

Task

For this problem, you will write a script called clean.sh that allows a user to find large, old files, and optionally move them to the trash.

The clean.sh script takes one required command line argument and two optional ones:

  1. The directory to scan
  2. The minimum size; only files larger than this are considered (optional)
  3. The minimum age in days, based on modification date; only files last modified more than this many days ago are considered (optional)

If no size is given, the default size should be 10 MB. If no number of days is given, the default should be 365. Possible values for the size argument can be given in Bytes, Kilobytes, Megabytes, and Gigabytes, denoted by the suffixes B, K, M, and G respectively.
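Turning a size argument such as 5M into a byte count could look like the sketch below. Several details are assumptions: the helper name size_to_bytes, the use of decimal multiples (1K = 1000B, as in the dupes.sh notes), and the treatment of a bare number as bytes.

```shell
#!/bin/bash
# size_to_bytes SIZE: hypothetical helper that prints SIZE in bytes.
# Accepts digits followed by an optional B, K, M or G suffix.
# Assumption: decimal multiples, and no suffix means bytes.
size_to_bytes() {
    local size=$1 number unit
    if [[ ! $size =~ ^([0-9]+)([BKMG]?)$ ]]; then
        echo "Invalid size: $size" >&2
        return 1
    fi
    number=$(( 10#${BASH_REMATCH[1]} ))   # 10# avoids octal parsing of leading zeros
    unit=${BASH_REMATCH[2]}
    case "$unit" in
        ""|B) echo "$number" ;;
        K)    echo $(( number * 1000 )) ;;
        M)    echo $(( number * 1000 * 1000 )) ;;
        G)    echo $(( number * 1000 * 1000 * 1000 )) ;;
    esac
}

# Possible use with the defaults from the assignment:
#   min_bytes=$(size_to_bytes "${2:-10M}") || exit 1
#   min_days=${3:-365}
```

The ${2:-10M} expansion supplies the default only when the second argument is missing or empty, which keeps the default handling in one place.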

The script should print a line indicating how many files were found matching the criteria, and ask the user if they would like to start moving them to the trash.
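The size and age criteria map naturally onto the -size and -mtime tests of find. The sketch below assumes the size has already been converted to bytes; find_candidates is a hypothetical name.

```shell
#!/bin/bash
# find_candidates DIR MIN_BYTES MIN_DAYS   (hypothetical helper)
# Lists regular files under DIR that are larger than MIN_BYTES bytes and were
# last modified more than MIN_DAYS days ago. Errors such as "Permission
# denied" are discarded rather than shown to the user.
find_candidates() {
    find "$1" -type f -size +"$2"c -mtime +"$3" 2>/dev/null
}

# Possible use:
#   find_candidates downloads 10000000 365
```

Counting the resulting lines gives the number for the "Found N files" message, with the caveat that a file name containing a newline would be counted more than once.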

For each file that is found matching the criteria, the user should be asked whether they wish to move this file to the trash. The trash is a hidden directory in the home directory, ~/.trash. If it does not exist, it should be created. The user can respond to this prompt with a 'Y' to indicate the file should be moved to the trash, an 'N' to indicate it shouldn't be, or an 'A' to move all the remaining files to the trash without prompting.

If a file is over 10 Megabytes, it should be compressed using gzip before being moved to the trash. If, after moving a file to the trash, the directory it was in is now empty, the user should be made aware of this fact and prompted to remove the directory. This should happen even if they have selected 'A' for moving all files to the trash.
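The compress-then-move step and the empty-directory check might be sketched as two helpers. The names trash_file and is_empty_dir are hypothetical, 10 Megabytes is taken to mean 10,000,000 bytes (an assumption), and name clashes inside the trash are not handled.

```shell
#!/bin/bash
# trash_file FILE TRASH_DIR   (hypothetical helper)
# Compresses FILE with gzip when it is larger than 10,000,000 bytes, then
# moves the result into TRASH_DIR, creating that directory if needed.
trash_file() {
    local file=$1 trash=$2 size
    mkdir -p "$trash" || return 1
    size=$(wc -c < "$file") || return 1
    if [ "$size" -gt 10000000 ]; then
        gzip "$file" || return 1
        file="$file.gz"              # gzip replaces FILE with FILE.gz
    fi
    mv "$file" "$trash"/
}

# is_empty_dir DIR: succeeds when DIR exists and has no entries at all,
# hidden files included (ls -A lists everything except . and ..).
is_empty_dir() {
    [ -d "$1" ] && [ -z "$(ls -A "$1" 2>/dev/null)" ]
}

# Possible use after a move:
#   is_empty_dir "$(dirname "$file")" && echo "directory is now empty"
```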

Testing Tips

To change the modification date of a file, the touch command can be used. The -m flag indicates that the modification date should be changed and the --date flag allows the new date to be set. For example:


bash-4.4$ ls -lh
total 44K
-rwxr-xr-x 1 bwilk1 grad 14K Feb 26 08:48 bash.php
-rwxr-xr-x 1 bwilk1 grad 27K Feb 14 16:07 regex.php

bash-4.4$ touch -m --date="1981-01-01" bash.php 

bash-4.4$ ls -lh
total 44K
-rwxr-xr-x 1 bwilk1 grad 14K Jan  1  1981 bash.php
-rwxr-xr-x 1 bwilk1 grad 27K Feb 14 16:07 regex.php


Sample Interaction

	
$ ls -lhR
.:
total 57M
-rwxr-x--- 1 bryan bryan  34M Jan  1 00:00 android.jar*
-rwxr-x--- 1 bryan bryan 1.5M Apr  1  2000 deer.jpg*
-rwxr-x--- 1 bryan bryan 9.3M Oct 10  1980 out.pdf*
-rwxr-x--- 1 bryan bryan  14M Oct 10  1980 punkt.zip*
drwxrwx--- 2 bryan bryan 4.0K Feb 26 10:34 something/

./something:
total 14G
-rwxr-x--- 1 bryan bryan  14G Feb 26  1990 pairs*
-rwxr-x--- 1 bryan bryan 7.0M Dec 31  1999 random.pptx*

$ ls -lh ~/.trash/
ls: cannot access '/home/bryan/.trash/': No such file or directory

$ ./clean.sh downloads/ 
Found 2 files to delete. Would you like to move them to the trash? [y/N]Y
Would you like to move downloads/something/pairs to the trash? [y/N/Always]Y
Would you like to move downloads/punkt.zip to the trash? [y/N/Always]
$ ls -lh ~/.trash/
total 3.5G
-rwxr-x--- 1 bryan bryan 3.5G Feb 26  1990 pairs.gz*

$ ls -lhR downloads
downloads/:
total 57M
-rwxr-x--- 1 bryan bryan  34M Jan  1 00:00 android.jar*
-rwxr-x--- 1 bryan bryan 1.5M Apr  1  2000 deer.jpg*
-rwxr-x--- 1 bryan bryan 9.3M Oct 10  1980 out.pdf*
-rwxr-x--- 1 bryan bryan  14M Oct 10  1980 punkt.zip*
drwxrwx--- 2 bryan bryan 4.0K Feb 26 10:56 something/

downloads/something:
total 7.0M
-rwxr-x--- 1 bryan bryan 7.0M Dec 31  1999 random.pptx*

$ ./clean.sh downloads/ 5M
Found 3 files to delete. Would you like to move them to the trash? [y/N]Y
Would you like to move downloads/something/random.pptx to the trash? [y/N/Always]A
After moving downloads/something/random.pptx to the trash, downloads/something is now empty. Would you like to remove it? [y/N]Y

$ ls -lh ~/.trash/
total 3.6G
-rwxr-x--- 1 bryan bryan 9.3M Oct 10  1980 out.pdf*
-rwxr-x--- 1 bryan bryan 3.5G Feb 26  1990 pairs.gz*
-rwxr-x--- 1 bryan bryan  14M Oct 10  1980 punkt.zip.gz*
-rwxr-x--- 1 bryan bryan 7.0M Dec 31  1999 random.pptx*

$ ls -lhR downloads/
downloads/:
total 35M
-rwxr-x--- 1 bryan bryan  34M Jan  1 00:00 android.jar*
-rwxr-x--- 1 bryan bryan 1.5M Apr  1  2000 deer.jpg*

$ ./clean.sh downloads/ 50
Found 1 files to delete. Would you like to move them to the trash? [y/N]Y
Would you like to move downloads/android.jar to the trash? [y/N/Always]Y

$ ls -lh downloads/
total 1.5M
-rwxr-x--- 1 bryan bryan 1.5M Apr  1  2000 deer.jpg*

$ ./clean.sh downloads/ 1K 5000 
Found 1 files to delete. Would you like to move them to the trash? [y/N]Y
Would you like to move downloads/deer.jpg to the trash? [y/N/Always]Y
After moving downloads/deer.jpg to the trash, downloads is now empty. Would you like to remove it? [y/N]Y

How you will be graded

The points for this problem will be assigned according to the following rubric:

Requirement                                                                                   Points
----------------------------------------------------------------------------------------------------
The arguments to the command are checked and validated                                        3
If optional arguments are not supplied, the correct defaults are used                         2
The files identified for cleaning are correct on the basis of size                            5
The files identified for cleaning are correct on the basis of modification date               5
~/.trash is created if it does not exist                                                      3
The total number of files found is printed                                                    2
The user is given the choice to start deleting files or not                                   2
User is asked to move each file found, and the outcome is appropriate to the user's response  8
Files over 10MB are compressed using gzip before moving to the trash                          3
After moving a file, if the directory is now empty, the user is prompted to delete it         3
The empty directory is deleted successfully if the user confirms                              2
No errors are printed to stdout about permission errors, etc.                                 2
Total                                                                                         40

Running your code

It should be possible to run your code without explicitly calling bash. For this to work as intended, you will need to set the permissions of each file appropriately and start each script with a shebang line.
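As a quick illustration of the two ingredients, the sketch below builds a throwaway script, marks it executable, and runs it directly. The file name hello.sh is hypothetical, and the shebang assumes bash lives at /bin/bash.

```shell
#!/bin/bash
# Sketch: create a small script in a scratch directory and run it directly.
demo_dir=$(mktemp -d)

cat > "$demo_dir/hello.sh" <<'EOF'
#!/bin/bash
echo "started without typing bash"
EOF

chmod +x "$demo_dir/hello.sh"    # add execute permission
"$demo_dir/hello.sh"             # the shebang line tells the system to use bash
```

Git records the executable bit, so running chmod +x before committing carries the permission along with the file.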

Submitting your code

Your code should be committed and pushed back to GitHub before the due date. DO NOT rename the files. You do not need to commit anything other than the .sh files. If you create helper scripts, be sure to submit those as well.