Linux Blog

Auto Clean-up Downloaded Files – Part IV

Filed under: Shell Script Sundays — TheLinuxBlog.com at 8:00 am on Sunday, February 8, 2015

In order to avoid the complex task of comparing unknown files of unknown types for what should be a simple job, I’ve made an executive decision to handle statistics instead. Hopefully I will not regret this should I decide to tackle file comparisons later. For cleaning up downloaded files, there are really only a few statistics I can think of that are meaningful to the task of deleting multiple files.

The first is counts: the number of files in the folder, the number that match the (?) find pattern, and the total number of deleted files.
The second “metric” is disk space. It is a good one, but could be tricky to calculate given the different file size units (bytes, kilobytes, megabytes, etc.).
Timing is another option. We’ll skip how long I spent on this, as it is useless. I’d rather spend my time writing something that can be reused than waste it pointing and clicking. Although it would be interesting to compare the time spent writing against the total run time, we won’t cover that. What we will cover is how long it took to discover and delete the files. A fun number, if for nothing else but giggles.

Fortunately for us, utilities exist for all of these items and can be added fairly simply. We’ll start with the first and work our way down, regardless of how I feel about the last two items.
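As a rough sketch of which utilities could cover the three items: wc -l for counts, du -k for disk space, and bash's SECONDS variable for timing. The last two are my assumption about what fits, not necessarily what this series ends up using, and the temporary directory and file names are made up purely for the demonstration.

```shell
#!/bin/bash
# Sketch only: "dir" and the sample files are invented for this demo.
dir=$(mktemp -d)
touch "$dir/report.pdf" "$dir/report(1).pdf" "$dir/photo(2).jpg"

# 1. Counts: find prints one line per file, wc -l counts the lines
total=$(find "$dir" -type f | wc -l)
matched=$(find "$dir" -type f -iname "*(?)*" | wc -l)

# 2. Disk space: du -k always reports kilobytes, so no unit juggling
space_kb=$(du -sk "$dir" | cut -f 1)

# 3. Timing: bash increments SECONDS once per second
SECONDS=0
find "$dir" -iname "*(?)*" > /dev/null
elapsed=$SECONDS

# $(( )) normalizes the numbers in case wc padded them with spaces
echo "Total files: $((total)), matching: $((matched))"
echo "Directory uses ${space_kb}K; search took ${elapsed}s"
rm -r "$dir"
```

Whole seconds are a coarse unit for a search this small, so expect the elapsed time to read 0 on most runs.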

For reference, this is the script as we left it in the last article.

#!/bin/bash
 
while getopts ":d" ARG;
do case "${ARG}" in 
d) echo "-d option set. DELETING FILES"; DELETE=true;;
esac;
done
 
# Get the Current working directory
PWD=`pwd`
 
# Get User Input
echo "Enter Directory to run in [\"$PWD\"]: " 
read inputline; 
# handle blankline (default) 
if [ -z "$inputline" ]; then
	inputline=$PWD
fi
# Check to make sure that it is a valid directory
if [ ! -d "$inputline" ]; then
   	# Doesn't exist, exit
	echo "Directory $inputline does not exist"
	exit 1;
fi
 
if [ -z "$DELETE" ]; then
	echo "Finding Files to delete in $inputline, use -d to delete"
	find "$inputline" -iname "*(?)*" | while IFS= read -r i; do echo rm "$i"; done;
else
	echo "Files to delete:"
	find "$inputline" -iname "*(?)*" | while IFS= read -r i; do echo rm "$i"; done;
	echo "Are you Sure? Y to continue"
	read -r confirm;
	if [ "$confirm" = "Y" ]; then
		find "$inputline" -iname "*(?)*" | while IFS= read -r i; do rm "$i"; done; echo "Done";
	else
		echo "Cancelled"
	fi;
fi;

So, to count the files: because of the way the script is written, we have a number of choices of how to implement it. Adding the total number of files in the directory is a good start and very simple to add. Although probably not the best way to do it, we can use the wc command to achieve this. It’s pretty self-explanatory: the -l option counts lines, so we pipe the output of the find command to wc -l.
That can give pretty ugly output, padded with leading whitespace. You could probably use sed or awk to fix the formatting, but I’m a fan of cut due to its simplicity.

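So wc -l gives a number of spaces and then the value. Using cut -d ' ' -f 8- (split on single spaces and keep everything from the 8th field to the end) will give us the number, as long as the padding is exactly seven spaces wide. Where wc prints no padding at all, cut finds no delimiter and passes the line through unchanged, so the number still comes out. Fantastic.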
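If depending on the padding width makes you nervous, here is a sketch of a count that ignores it by deleting the spaces with tr. This swaps cut for tr, so it is an alternative to the approach in this article rather than the approach itself, and count_matches plus the demo files are names I made up for illustration.

```shell
#!/bin/bash
# count_matches is a hypothetical helper, not part of the script in this series.
count_matches() {
	# tr -d ' ' removes any padding wc adds, whatever its width
	find "$1" -iname "*(?)*" | wc -l | tr -d ' '
}

# Build a throwaway directory with 12 matching files (a two-digit count)
demo=$(mktemp -d)
for n in a b c d e f g h i j k l; do
	touch "$demo/$n(1).txt"
done

found=$(count_matches "$demo")
echo "Found $found files"
rm -r "$demo"
```

A two-digit count is used on purpose, since that is where a fixed field position is most likely to slip.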

Placing the pipeline within the script at strategic locations will give us the results we want. It may affect the last item, run times, but who cares? This isn’t world-saving science, it’s for fun. Because of the article where we added some fail-safes to stop people deleting their files, there may be some code repetition that will make the object-oriented programmers cringe, but again, who cares? Feel free to refactor to your heart’s content. Who knows, maybe I’ll even post an update and give all of the fame and glory to the Almighty who took a task and overcomplicated it more than I did. Well, let’s get back to the topic.
The two sections that really matter here are the dry run (no deletion) and the section where it deletes: lines 28 and 31 of the script above, respectively (the find line in each branch). Plug the full command in at those two places, forget the timing metric (one won’t be called all the time anyway) and Bob’s your uncle. You now have counts, assuming you remember to echo the output.

Now, if you followed along by copying and pasting (or not) you should end up with something that looks like what I prepared earlier:

#!/bin/bash
 
while getopts ":d" ARG;
do case "${ARG}" in 
d) echo "-d option set. DELETING FILES"; DELETE=true;;
esac;
done
 
# Get the Current working directory
PWD=`pwd`
 
# Get User Input
echo "Enter Directory to run in [\"$PWD\"]: " 
read inputline; 
# handle blankline (default) 
if [ -z "$inputline" ]; then
	inputline=$PWD
fi
# Check to make sure that it is a valid directory
if [ ! -d "$inputline" ]; then
   	# Doesn't exist, exit
	echo "Directory $inputline does not exist"
	exit 1;
fi
 
if [ -z "$DELETE" ]; then
	echo "Finding Files to delete in $inputline, use -d to delete"
	# Count the matches: find prints one line per file, wc -l counts them.
	# cut keeps field 8 onward, which assumes wc pads with seven spaces.
	echo Found `find "$inputline" -iname "*(?)*" | wc -l | cut -d ' ' -f 8-` files:
	find "$inputline" -iname "*(?)*" | while IFS= read -r i; do echo rm "$i"; done;
else
	echo "Files to delete:"
	echo Found `find "$inputline" -iname "*(?)*" | wc -l | cut -d ' ' -f 8-` files:
	find "$inputline" -iname "*(?)*" | while IFS= read -r i; do echo rm "$i"; done;
	echo "Are you Sure? Y to continue"
	read -r confirm;
	if [ "$confirm" = "Y" ]; then
		echo Deleting `find "$inputline" -iname "*(?)*" | wc -l | cut -d ' ' -f 8-` files:
		find "$inputline" -iname "*(?)*" | while IFS= read -r i; do rm "$i"; done; echo "Done";
	else
		echo "Cancelled"
	fi;
fi;
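For anyone taking up the refactoring invitation, here is one possible sketch that wraps the repeated find pipeline in functions. The names find_matches and report are invented here, and the demo directory exists only so the snippet runs on its own. It is one way to cut the repetition, not the direction this series necessarily takes.

```shell
#!/bin/bash
# Hypothetical refactor: find_matches and report are names invented here.
find_matches() {
	# $1 = directory to search
	find "$1" -iname "*(?)*"
}

report() {
	# $1 = verb to print, $2 = directory to search
	local count
	count=$(find_matches "$2" | wc -l)
	# $(( )) normalizes the number in case wc padded it with spaces
	echo "$1 $((count)) files:"
}

# Throwaway directory: two duplicates and one original
demo=$(mktemp -d)
touch "$demo/song(1).mp3" "$demo/song(2).mp3" "$demo/song.mp3"

output=$(report "Found" "$demo")
echo "$output"
# Dry run: print the rm commands without executing them
find_matches "$demo" | while IFS= read -r i; do echo rm "$i"; done
rm -r "$demo"
```

With the pattern in one place, changing what counts as a duplicate means editing a single line instead of six.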

Boom! And by that I mean, after testing it extensively for 20 seconds, you’ll see that it works flawlessly. And with that, I’ll leave it on a cliffhanger for another exciting edition of Shell Script Sundays. Stay tuned!

