Wget - Download manager

Shell scripts that make wget easier to use, and a way to make the filenames it creates DOS/Windows compatible.

If you don't have a copy of wget, you can get one from a GNU mirror close to you or, if you can't find one, from GNU directly.

Download monitor

This is a simple script for checking on the progress wget has made while downloading something. For it to work, wget has to write its output to a log file: specify background = on in your run control file or add -b at the command line.

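#!/bin/bash
# Every 5 seconds, show the first line and the last three lines of each wget log.
j=0
while true
do
	clear
	echo "===	Iteration $j	==="
	for i in ~/downloads/wget-log*
	do
		# skip the unexpanded pattern when no log exists yet
		[ -e "$i" ] || continue
		/usr/bin/head -n 1 "$i"
		printf "\n"
		/usr/bin/tail -n 3 "$i"
		printf "\n\n"
	done
	let j++
	sleep 5
done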
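
The monitor only has something to show if wget was started in the background from ~/downloads, so that its wget-log files end up there. A minimal sketch of a launcher that takes care of this follows; the function name start_download and the DOWNLOAD_DIR variable are made up for this example and are not part of wget.

```shell
#!/bin/bash
# Hypothetical helper: start one background download whose log file lands
# where the monitor script looks for it (~/downloads/wget-log*).
start_download() {
	# DOWNLOAD_DIR is an assumed override, handy for trying this elsewhere
	dir="${DOWNLOAD_DIR:-$HOME/downloads}"
	mkdir -p "$dir" || return 1
	# -b sends wget to the background; without -o it logs to wget-log
	# (wget-log.1, ... if that name is taken) in the current directory.
	# The subshell keeps the cd from affecting the calling shell.
	( cd "$dir" && wget -b "$@" )
}
```

Called as start_download -c http://example.com/file.iso, it passes all arguments through to wget unchanged.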

Download files from a list

The following script reads a list of URIs (along with parameters to wget, if any), one per line, and keeps a maximum of $max_proc instances of wget running. Every line it hands to wget is appended to the files done.txt and archive.txt in the downloads subdirectory of your HOME.

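#!/bin/bash

cd "$HOME/downloads/" || exit 1
PATH=/bin:/usr/bin

line=1          # next line of the list to process
max_proc=3      # maximum number of wget instances running at once
list_file="$HOME/downloads/todo.txt"
prog="/usr/local/bin/wget"

while true
do
	while true
	do
		# count the running wget processes
		proc=`ps -f -u "$USER" | grep -c "$prog"`
		# grep is in the list too
		let proc--

		# re-count the list every time so that appended lines are picked up
		lines=`grep -c "" "$list_file"`
		echo "Proc: $proc / $max_proc	Line: $line / $lines"

		[[ $proc -ge $max_proc || $line -gt $lines ]] && break

		# fetch line number $line of the list
		params=`sed -n "${line}p" "$list_file"`
		# quoted, so that ? and * in URIs are not expanded by the shell
		echo "$params" | tee -a done.txt archive.txt

		# ignore empty lines
		if [ "$params" ]; then
			# unquoted on purpose: $params may hold options as well as the URI
			$prog -b $params
			sleep 3
		fi
		let line++
	done

	echo "Waiting..."
	sleep 10
done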
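
For a one-off list that does not grow while the downloads run, the same "at most N at a time" idea can be sketched with xargs. This is an alternative, not the script above: it writes no done.txt, it runs wget in the foreground (no -b) so that xargs can count the running jobs, and it relies on the -P option, which GNU and BSD xargs provide but POSIX does not. The function name fetch_list and the DOWNLOADER variable are made up for this example.

```shell
#!/bin/bash
# Sketch: run at most $2 downloads at a time, one per line of list file $1.
# DOWNLOADER is an assumed override so the function can be dry-run with a stub.
fetch_list() {
	list_file=$1
	max_proc=${2:-3}
	# Drop empty lines, then let xargs start one command per line (-L 1),
	# keeping up to $max_proc of them running in parallel (-P).
	# Each line is split into words, so it may carry wget options too.
	grep -v '^[[:space:]]*$' "$list_file" |
		xargs -L 1 -P "$max_proc" "${DOWNLOADER:-wget}" -nv
}
```

fetch_list ~/downloads/todo.txt 3 would correspond to max_proc=3 in the script above.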

DOS/Windows compatible filenames

If you download sites under Linux/Unix and then try to copy the files to your DOS/Windows partition, you may have run into problems with ? in filenames. Thanks to this little patch contributed by Herold Heiko you will have no more of that: wget will change each ? to @ on the fly.

This requires wget version 1.8.2. Future versions will probably differ, so don't rely on this information alone if you are using a different version of wget. You will have to edit url.c in the src directory.

#if WINDOWS || __CYGWIN__
      /* Use '_' instead of ':' here for Windows. */
      dirpref[len] = '_';
#else
      dirpref[len] = ':';
#endif

...

#if WINDOWS || __CYGWIN__
      /* Temporary fix.  Use '@' instead of '?' here for Windows. */
      *to++ = '@';
#else
      *to++ = '?';
#endif

...

/* DOS-ish file systems don't like `%' signs in them; we change it
   to `@'.  */
#ifdef WINDOWS
  {
    char *p = file;
    for (p = file; *p; p++)
      if (*p == '%')
        *p = '@';
  }
#endif /* WINDOWS */

You need to change #if WINDOWS || __CYGWIN__ or #ifdef WINDOWS to #if 1 in the three blocks above.

The second fragment is found twice in the source. The third one is optional, as neither I nor Herold could prove that FAT/NTFS doesn't allow % in filenames; it may be that DOS doesn't allow them, but who uses that nowadays?
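
If you would rather not edit url.c by hand, the change to the first two fragments can be sketched with sed. This assumes that the line #if WINDOWS || __CYGWIN__ occurs in url.c only at the fragments quoted above, which I have not verified, so look at the diff before rebuilding. The optional #ifdef WINDOWS block is deliberately left alone, and the function name enable_dos_names is made up for this example.

```shell
#!/bin/bash
# Sketch: switch on the Windows-style branches in url.c.
# Assumption: "#if WINDOWS || __CYGWIN__" appears only at the quoted fragments.
enable_dos_names() {
	src=$1
	# keep the untouched file as a backup for diffing
	cp "$src" "$src.orig" || return 1
	# in a basic regular expression | is a literal character
	sed 's/^#if WINDOWS || __CYGWIN__[[:space:]]*$/#if 1/' "$src.orig" > "$src"
}
```

Run it as enable_dos_names src/url.c, then compare the result with diff -u src/url.c.orig src/url.c.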

Comments

monitoring wget log file ...

FYI: I use "watch tail -n36 <wgetlogfile>" to constantly monitor download progress. Setting --dot-style might help for large files.
Regards

monitoring wget log file

I use tail -f <wgetlogfile> to constantly monitor wget progress. Simple and good :)
BR.

Great list script!

I love this script for downloading files from a list! I was going to write one, but found yours, and it works great! ;) Thanks!

Wget

What is the Proc command for in wget?
For example, if a URL has 20 sub-URLs or links in its HTML page, are all these links retrieved as well?
For example, would
wget proc-3 http:/linux.org
retrieve
linux.org proc-1
linux.org/linuxdocs proc-2
linux.org/programs proc-2
linux.org/linuxdocs/bins proc-3

nice script

but what if the user name for the password-protected directory is an email address? How do I get around that?

Knowledge and Society

wget c <filename>
does not continue the download, but starts a fresh download. Why?

looking for a wget gui

indigen,
you missed the dash: it is "wget -c".
Also, whether a download can be resumed depends on the server.

gui?

review guis for wget?