Linux Blog

Dealing with the HTML file input limitation of uploading multiple files

Filed under: General Linux — TheLinuxBlog.com at 8:59 am on Thursday, August 28, 2008

Everybody knows how annoying the <input type="file"> HTML tag is, right? Does it make you mad when you have to browse and upload each file individually? Sure, you can use JavaScript to add or remove the input boxes, but you still need to browse for each file individually, which makes no sense if you’re uploading lots of files.

Would you like a multiple file uploader like Facebook has? Perhaps a simple explorer-like interface that lets you select multiple files, possibly previewing them and perhaps processing them on the client side?

Well, I wouldn’t say it is the easiest thing in the world to implement, but there is an open source multiple file uploader that might suit your needs. Since it’s written in Java, it’s highly extensible (if you know how, or pay a development company or freelancer) and can also be partially configured with JavaScript.

What is this fantastic sounding multiple file uploader you speak of?

It’s called jupload and can be downloaded from jupload.sourceforge.net. Don’t let the website fool you, because this tool is actually pretty neat.

If anyone would like examples of how to use it, just write a blog post linking to me saying how cool it is and how much you need it, offer me cash, comment or participate in this blog, offer me goods or services, give me links from your website, or just e-mail me politely asking for help and I’ll see what I can do.

If you don’t like that, start reading the documentation like I did. Seriously, it’s not that hard.

Using BASH to sort a book collection. ISBN Data Mining – Part 1

Filed under: General Linux,Shell Script Sundays — TheLinuxBlog.com at 2:47 am on Sunday, September 16, 2007

Many problems can be solved with a little bit of shell scripting.
This week I plan to show you a script that does a little data mining from Barnes and Noble.
I have a lot of books and wanted cataloged information on them. Each book has a unique identifier called an ISBN. So I collected all of my ISBNs and wrote a simple loop that wraps around a script a friend of mine made to find basic information.
Here is his script:

#!/bin/bash
ISBN="$1"

function fetchInfo () {
### Using barnesandnoble.com to fetch info…
lynx -source "http://search.barnesandnoble.com/booksearch/isbninquiry.asp?ISBN=${ISBN}" |\
tr -d '[:cntrl:]' | sed 's/>/>\n/g' | while read -a lineArray; do

### Parsing book title.
if [ "${lineArray[0]}" == "…" ]; then
echo "4|Title: ${lineArray[*]}" | sed 's/<[^>]*>//g;s/ ([^)]*)//g'

### Parsing book author.
elif [ "$(echo ${lineArray[*]} | grep "id=\"contributor\"")" ]; then
echo "3|Author(s): ${lineArray[*]}" | sed 's/by //;s/<[^>]*>//g'

### Parsing additional data.
elif [ "${lineArray[0]}" == "…" ] &&
[ "$(echo ${lineArray[*]} | grep -ve "bullet" -ve "title")" ]; then
echo "1|${lineArray[*]}" | sed 's/<[^>]*>//g;s/:/: /;s/ / /'
fi
done | sort -ur | awk -F\| '{print $2}' | grep ":"

}

if [ "${#ISBN}" -ge "10" ]; then
fetchInfo
fi

The script should be called as follows (assuming the script name is eBook.sh):

bash eBook.sh ISBNNUMBER

The first step is to check that the ISBN is at least 10 characters long; if it is, the script calls the fetchInfo() function.
It then takes the given ISBN and searches the barnesandnoble.com site for any matches. To do this lynx is used; the -source option tells lynx to output the page source instead of starting in browsing mode. The output of lynx is piped to tr and sed. tr deletes all control characters, including line breaks, from the source, and the sed expression adds a line break at the end of each HTML tag. The while loop then iterates over each line produced by the lynx, tr and sed pipeline.
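To see what the tr and sed stage does without hitting the network, here is a minimal sketch. The split_tags function name and the sample markup are made up for illustration, and the sed expression is my reading of the description above rather than a verified copy of the original. It assumes GNU sed, which understands \n in the replacement text.

```shell
#!/bin/bash
# Hypothetical helper: strip control characters (including line
# breaks), then start a new line after every closing ">".
# Assumes GNU sed; other seds may not expand \n in the replacement.
split_tags() {
    tr -d '[:cntrl:]' | sed 's/>/>\n/g'
}

# Made-up sample markup standing in for the lynx -source output.
sample='<h1>Some Title</h1>
<p>by Some Author</p>'

# read -a splits each line into words, so lineArray[0] is the first word.
printf '%s\n' "$sample" | split_tags | while read -r -a lineArray; do
    echo "first word: ${lineArray[0]}"
done
# Prints:
# first word: <h1>
# first word: Some
# first word: <p>
# first word: by
```

Because each tag ends its own line, the loop can test the first word of a line to decide what kind of data it is looking at.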
Within the loop is where anything from the output of the search page can be pulled out. This script pulls out the book title, the author and additional data.
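The output always lists the title first, then the author, then everything else. That is because each echo in the loop prefixes its line with a number (4, 3 or 1), and the tail of the function sorts on that prefix before cutting it off. A small sketch of just that step, with made-up records and a hypothetical function name order_fields:

```shell
#!/bin/bash
# Hypothetical helper mirroring the end of fetchInfo:
# sort -ur puts the highest prefix first and drops duplicate lines,
# awk prints the part after the "|", grep keeps "Label: value" lines.
order_fields() {
    sort -ur | awk -F'|' '{print $2}' | grep ":"
}

# Made-up records, deliberately out of order and with one duplicate.
printf '%s\n' \
    '1|Pub. Date: January 2000' \
    '3|Author(s): Some Author' \
    '4|Title: Some Title' \
    '1|Pub. Date: January 2000' | order_fields
# Prints:
# Title: Some Title
# Author(s): Some Author
# Pub. Date: January 2000
```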

I put my ISBNs in a text file, one per line, and used the following loop to fetch information on my books and save each result with the ISBN as the file name.

for i in $(cat list.txt); do bash eBook.sh "$i" > "$i.txt"; done
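If the list was typed by hand it may contain hyphens, blank lines or typos. The following is only a sketch of how the list could be cleaned up before the batch run. The clean_isbns function is hypothetical, and its check is stricter than the script's own length test: it assumes every entry is either a 10-character ISBN (nine digits plus a digit or X) or a 13-digit ISBN.

```shell
#!/bin/bash
# Hypothetical filter: remove spaces, carriage returns and hyphens,
# then keep only lines shaped like an ISBN-10 or ISBN-13.
clean_isbns() {
    tr -d ' \r-' | grep -E '^([0-9]{9}[0-9Xx]|[0-9]{13})$'
}

# Only run the batch when list.txt exists, so pasting this is harmless.
if [ -f list.txt ]; then
    clean_isbns < list.txt | while read -r isbn; do
        # </dev/null keeps the script from swallowing the rest of the list.
        bash eBook.sh "$isbn" < /dev/null > "$isbn.txt"
    done
fi
```

Reading the list with while read instead of $(cat list.txt) avoids word splitting surprises, at the cost of having to redirect the inner command's standard input.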

In the next issue I plan to expand on this to format the data in an even more presentable manner.
Used applications
tr, lynx, sed, awk, sort, grep