Linux Blog

Linux Tunneling Techniques

Filed under: Linux Software — TheLinuxBlog.com at 4:59 am on Wednesday, November 10, 2010


Ever tunneled, or used tunneling for mobile Internet? Perhaps you have needed to tunnel to bypass a restrictive firewall, or to get a secure channel on an insecure wireless network. It seems that everyone knows how to tunnel using SSH's built-in SOCKS support, and how to use Firefox's about:config screen to point it at a SOCKS proxy with remote DNS. While this is great for occasional web browsing, it only takes you so far.
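
For reference, the basic setup I mean looks something like this; the port number and host name are only placeholders, and the comment names the Firefox preference behind the "remote DNS" setting:

# Open a dynamic SOCKS proxy on local port 1080 (any free local port works)
ssh -D 1080 -N user@remote-server.example.com
# In Firefox, point the SOCKS proxy at localhost:1080 and set
# network.proxy.socks_remote_dns to true in about:config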

tsocks is a great application that lets you tunnel other programs over SOCKS. It's easy to install on most distributions and allows you to use many command line applications. I've used it on a number of occasions successfully, and while it does its job, it's not the best solution. This is because it was last updated in 2002 and doesn't tunnel DNS lookups. I found myself using it to SSH to an IP address (memorized, or looked up through another SSH session) and using applications on the remote server.
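
A minimal tsocks setup, assuming an SSH SOCKS proxy is already listening on localhost port 1080 (the addresses below are only examples), looks roughly like this:

# /etc/tsocks.conf -- point tsocks at the local SOCKS server
server = 127.0.0.1
server_port = 1080
server_type = 5
local = 192.168.0.0/255.255.255.0   # LAN traffic bypasses the proxy

# Then prefix whatever you want tunneled:
tsocks irssi
tsocks ssh user@10.0.0.5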

proxychains is a somewhat better tunneling solution; it works similarly to tsocks, but it also resolves DNS through the proxy and can chain multiple proxies. I've used it on numerous occasions with great success. ssh, lynx, lftp, irssi and a whole bunch of others work without any problems. Another plus is that it has also been updated in the last 5 years (but not by much).
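
Again assuming a local SOCKS proxy on port 1080, the proxychains setup is along these lines:

# /etc/proxychains.conf (a per-user ~/.proxychains/proxychains.conf also works)
proxy_dns                # resolve host names through the proxy chain
[ProxyList]
socks5 127.0.0.1 1080

# Run programs through the chain by prefixing them:
proxychains lynx http://www.thelinuxblog.com
proxychains lftp ftp.example.com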

One application I haven't yet had the pleasure of trying on the desktop is 3proxy. I have used it on the iPhone but ended up using the SSH SOCKS method more often. From its yum description and feature list, it sounds very promising and is definitely worth looking into.

Speaking from experience, I know it's kind of difficult to browse your distribution's web repositories to find the files you need and install them by hand (I had to do this since I didn't have them installed), so I recommend you download these applications ahead of time and save yourself some trouble before you need them on the road.

My Linux Box has a new video card!

Filed under: General Linux,Linux Hardware — TheLinuxBlog.com at 12:10 am on Friday, December 7, 2007

[Screenshot: NVIDIA GeForce 4 Ti 4200 AGP 8X driver issue]

I've got a new-to-me video card to serve temporarily until I can get a cooling kit for my GeForce FX 5200. It's an older GeForce 4 Ti 4200 AGP 8X. I thought it would be as simple as plugging it into my AGP slot, turning the computer on and re-installing the NVIDIA driver module, but I was wrong.

The problem is, since this is an older card I have to use a legacy driver:

WARNING: The NVIDIA GeForce4 Ti 4200 with AGP8X GPU installed in this system
is supported through the NVIDIA 1.0-96xx legacy Linux graphics
drivers. Please visit http://www.nvidia.com/object/unix.html for
more information. The 100.14.11 NVIDIA Linux graphics driver will
ignore this GPU.

If you can't see the screen shot, click it. It's basically a pretty version of the above error message telling me that I need to use the "NVIDIA 1.0-96xx legacy Linux Driver". Here is the download page for the driver if you're running into the same problem: http://www.nvidia.com/object/linux_display_x86_96.43.01.html

You can temporarily use the "nv" driver in your X.Org configuration, but be warned that it is not accelerated, so you should just use it to download the legacy driver, quit X and then install the accelerated one. Unfortunately I could not get links or lynx to download from nVidia's site because of some strange JavaScript code. I find it ironic that the Unix drivers page isn't even compatible with the basic Unix browsers.
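
For anyone following along, the rough sequence I'm describing is something like the following; the installer filename is only an example of what the 96.43.01 legacy package is called:

# 1. In the Device section of /etc/X11/xorg.conf, switch to the unaccelerated
#    driver:  Driver "nv"
# 2. Download the legacy installer (with a graphical browser, or on another box).
# 3. Quit X (log out to a console or stop your display manager), then run:
sh NVIDIA-Linux-x86-96.43.01-pkg1.run
# 4. Change the Driver line back to "nvidia" and restart X.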

Fetching Online Data From Command Line

Filed under: Shell Script Sundays — TheLinuxBlog.com at 6:12 pm on Sunday, December 2, 2007

Shell Scripts can come in handy for processing or re-formatting data that is available from the web. There are lots of tools available to automate the fetching of pages instead of downloading each page individually.

The first two programs I’m demonstrating for fetching are links and lynx. They are both shell browsers, meaning that they need no graphical user interface to operate.

Curl is a program that is used to transfer data to or from a server. It supports many protocols, but for the purpose of this article I will only be showing the http protocol.

The last method (shown in other blog posts) is wget. wget also fetches files over many protocols. The difference between curl and wget is that curl by default dumps the data to stdout, whereas wget by default saves the data to a local file named after the remote file.

Essentially the following do the exact same thing:

owen@linux-blog-:~$ lynx http://www.thelinuxblog.com -source > lynx-source.html
owen@linux-blog-:~$ links http://www.thelinuxblog.com -source > links-source.html
owen@linux-blog-:~$ curl http://www.thelinuxblog.com > curl.html

Apart from the shell browser interface, links and lynx also have some differences that may not be visible to the end user.
Both lynx and links can re-format the HTML they receive into a rendered plain-text version of the page; the option for this is -dump. They each format it differently, so I would recommend using whichever one is easier for you to parse. Take the following:

owen@linux-blog-:~$ lynx -dump http://www.thelinuxblog.com > lynx-dump.html
owen@linux-blog-:~$ links -dump http://www.thelinuxblog.com > links-dump.html
owen@linux-blog-:~$ md5sum links-dump.html
8685d0beeb68c3b25fba20ca4209645e links-dump.html
owen@linux-blog-:~$ md5sum lynx-dump.html
beb4f9042a236c6b773a1cd8027fe252 lynx-dump.html

The differing md5 sums show that the two browsers produce different dumped output.

wget does the same thing (as curl, links -source and lynx -source) but will create a local file with the remote filename, like so:

owen@linux-blog-:~$ wget http://www.thelinuxblog.com
--17:51:21--  http://www.thelinuxblog.com/
           => `index.html'
Resolving www.thelinuxblog.com... 72.9.151.51
Connecting to www.thelinuxblog.com|72.9.151.51|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]

    [ <=> ] 41,045        162.48K/s

17:51:22 (162.33 KB/s) - `index.html' saved [41045]

owen@linux-blog-:~$ ls
index.html

Here is the result of running md5sum on all of the files in the directory:

owen@linux-blog-:~$ for i in $(ls); do md5sum $i; done;
a791a9baff48dfda6eb85e0e6200f80f curl.html
a791a9baff48dfda6eb85e0e6200f80f index.html
8685d0beeb68c3b25fba20ca4209645e links-dump.html
a791a9baff48dfda6eb85e0e6200f80f links-source.html
beb4f9042a236c6b773a1cd8027fe252 lynx-dump.html
a791a9baff48dfda6eb85e0e6200f80f lynx-source.html

Note: index.html is wget's output.
Wherever the sums match, the output is the same.
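
As an aside, wget can also be told to write to standard output like curl does; something along the lines of the following (-q just hides the progress output) gives the same data as the commands above:

owen@linux-blog-:~$ wget -q -O - http://www.thelinuxblog.com > wget-stdout.html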

What do I like to use?
Although all of the methods (excluding dump) produce the same results, I personally like to use curl because I am familiar with the syntax. It handles variables, cookies, encryption and compression extremely well. The user agent is easy to change. The last winning point for me is that it has a PHP extension, which is nice for avoiding system calls out to the other programs.
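
For example, switching the user agent is a single flag; the string below is only an illustration:

owen@linux-blog-:~$ curl -A "Mozilla/5.0 (X11; Linux)" http://www.thelinuxblog.com > curl-ua.html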

Using BASH to sort a book collection. ISBN Data Mining – Part 1

Filed under: General Linux,Shell Script Sundays — TheLinuxBlog.com at 2:47 am on Sunday, September 16, 2007

Many problems can be solved with a little bit of shell scripting.
This week I plan to show you a script that does a little data mining from Barnes and Noble.
I have a lot of books and wanted cataloged information on them. Each book has a unique identifier called an ISBN, so I collected all of my ISBN numbers and wrote a simple loop that wraps around a script a friend of mine made to find basic information.
Here is his script:

#!/bin/bash
ISBN="$1"

function fetchInfo () {
    ### Using barnesandnoble.com to fetch info...
    lynx -source "http://search.barnesandnoble.com/booksearch/isbninquiry.asp?ISBN=${ISBN}" |\
    tr -d '[:cntrl:]' | sed 's/>/>\n/g' | while read line; do
        lineArray=( ${line} )

        ### NOTE: the tag names compared below ("<h1", "<li") are assumptions;
        ### adjust them to match the actual page markup.

        ### Parsing book title.
        if [ "${lineArray[0]}" == "<h1" ]; then
            echo "4|Title: ${lineArray[*]}" | sed 's/<[^>]*>//g;s/ ([^)]*)//g'
        ### Parsing book author.
        elif [ "$(echo ${lineArray[*]} | grep "id=\"contributor\"")" ]; then
            echo "3|Author(s): ${lineArray[*]}" | sed 's/by //;s/<[^>]*>//g'
        ### Parsing additional data.
        elif [ "${lineArray[0]}" == "<li" ] &&
             [ "$(echo ${lineArray[*]} | grep -ve "bullet" -ve "title")" ]; then
            echo "1|${lineArray[*]}" | sed 's/<[^>]*>//g;s/:/: /;s/ / /'
        fi
    done | sort -ur | awk -F\| '{print $2}' | grep ":"
}

if [ "${#ISBN}" -ge "10" ]; then
    fetchInfo
fi

The script should be called as follows (assuming the script name is eBook.sh):

sh eBook.sh ISBNNUMBER

The first step is to check whether the ISBN is at least 10 characters long; if it is, the fetchInfo() function is called.
fetchInfo takes the given ISBN number and searches the barnesandnoble.com site for any matches. To do this lynx is used; the -source option tells lynx to output the page source instead of using browsing mode. The output of lynx is piped to tr and sed: tr deletes all control characters (including line breaks) from the source, and the sed expression adds a line break after each HTML tag. The while loop then loops over each line coming from the piped lynx, tr and sed.
Within the loop is where anything from the output of the search page can be pulled out. This script pulls out the book title, the author and additional data.

I formatted my ISBNs into a text list and used the following loop to fetch information on my books and save it with the ISBN as the file name.

for i in $(cat list.txt); do sh eBook.sh $i > $i.txt; done;

In the next issue I plan to expand on this to format the data in an even more presentable manner.
Used applications
tr, lynx, sed, awk, sort, grep