The web is becoming the face of technology and the central access point for data processing. Although shell scripts cannot do everything that languages like PHP can do on a web page, there are many tasks for which shell scripts are a good fit.
In this article, we will learn about the wget command in Linux, one of the most common commands for downloading content from the Internet.
Wget is a command-line utility for downloading files and content from the Internet, whether from a website or an FTP site. Wget is very flexible and has many options to cover a wide variety of uses.
1. Download a file
The general syntax of wget is:
$ wget URL
For example:
$ wget http://slynux.org
--2010-08-01 07:51:20--  http://slynux.org/
Resolving slynux.org... 188.8.131.52
Connecting to slynux.org|184.108.40.206|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 15280 (15K) [text/html]
Saving to: "index.html"
100%[======================================>] 15,280      75.3K/s   in 0.2s
2010-08-01 07:51:21 (75.3 KB/s) - "index.html" saved [15280/15280]
We can also pass multiple URLs to wget as follows:
$ wget URL1 URL2 URL3
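When there are many URLs, it can be more convenient to list them in a text file, one per line, and pass that file to wget with the -i option. A minimal sketch, assuming a hypothetical urls.txt file:
$ cat urls.txt
http://example.com/file1.iso
http://example.com/file2.iso
$ wget -i urls.txt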
2. Save the file with another name
Usually, the downloaded file is saved with the same name as the file in the URL, and information about the download is printed to the screen.
We can save the downloaded file under a different name using the -O option. If a file with the specified name already exists, the contents of the downloaded file will overwrite it.
Instead of displaying information about the download on the screen, we can write it to a log file using the -o option.
$ wget ftp://example_domain.com/somefile.img -O dloaded_file.img -o log
With the above command, nothing is printed on the screen: the download progress is written to the file log, and the downloaded file is saved as dloaded_file.img.
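Because nothing is printed on the screen, one way to follow the download from another terminal is to watch the log file as it grows, for example with tail (the file name log matches the command above):
$ tail -f log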
3. Automatically retry a failed download
If the connection is unstable, the download may be interrupted and fail before it completes. In such cases, we usually restart the download by hand. Instead, wget provides an option to retry the download automatically whenever the connection is lost.
To do this, we use the -t argument in wget as follows:
$ wget -t 5 URL
In the above command, 5 is the number of times wget will retry the download if the connection drops; replace 5 with however many attempts we want wget to make.
If we do not want to limit the number of retries and instead want wget to keep retrying until the download succeeds, we set the count to 0:
$ wget -t 0 URL
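wget can also be told how long to wait between failed attempts. The --waitretry option (a standard wget flag) sets the maximum back-off, in seconds, between retries; a sketch combining it with a retry count, using a placeholder URL:
$ wget -t 5 --waitretry=10 http://example.com/file.iso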
4. Limit the download speed and set a download quota
When we have limited Internet bandwidth shared by many applications, downloading a large file can consume the bandwidth of the other applications and make them unusable.
To limit the download speed in wget, use the --limit-rate option as follows:
$ wget --limit-rate 20k http://example.com/file.iso
- k => kilobytes (KB)
- m => megabytes (MB)
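For example, to cap a large download at one megabyte per second (the URL here is just a placeholder):
$ wget --limit-rate 1m http://example.com/big_file.iso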
We can also specify a maximum download quota. The download will stop once the quota is reached.
To specify the quota, use the --quota or -Q option as follows:
$ wget -Q 100m http://example.com/file1 http://example.com/file2
5. Resume an interrupted download
If a download is interrupted before it completes, we can resume it from where it stopped by using the -c option as follows:
$ wget -c URL
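On an unreliable connection, -c combines naturally with the retry option from section 3, so wget keeps resuming the same file until it completes; a sketch with a placeholder URL:
$ wget -c -t 0 http://example.com/large_file.iso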
6. Mirror an entire website
Wget has an option to download an entire website by recursively collecting all the URL links in its pages and downloading them all. This way, we can download every page of a website.
To mirror a website, use the --mirror option as follows:
$ wget --mirror --convert-links example.com
Or use the following command:
$ wget -r -N -k -l DEPTH URL
- -l => specifies the depth of the recursion as a number of levels; wget will only follow links down the number of levels we specify.
- DEPTH => the depth of the site to traverse.
- -r => download recursively; used together with -l.
- -N => enables time-stamping, so files are only downloaded when they are newer than the local copies.
- URL => the base address of the website where the download should start.
- -k or --convert-links => instructs wget to convert the links in the downloaded pages so that they point to the local copies of those pages.
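Putting these options together, a sketch that downloads a site two levels deep and rewrites its links for offline browsing (the domain is a placeholder):
$ wget -r -N -k -l 2 http://example.com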
In addition to downloading a web page to the machine, we can dump it as formatted plain text using the lynx command as follows:
$ lynx -dump URL > webpage_as_text.txt
$ lynx -dump http://google.com > plain_text_page.txt
7. HTTP or FTP authentication
Some sites require authentication for HTTP or FTP access. To provide credentials, use the --user and --password arguments as follows:
$ wget --user username --password pass URL
Entering the password as plain text in the command as above is not secure. In this case, we should replace --password with --ask-password as follows:
$ wget --user username --ask-password URL
Password for user 'username': <the password will not be displayed on the screen>
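To keep the credentials out of the command line and the shell history entirely, wget can also read them from a ~/.netrc file; a minimal sketch with placeholder credentials (the file should be readable only by its owner):
$ cat > ~/.netrc << EOF
machine example_domain.com login username password pass
EOF
$ chmod 600 ~/.netrc
$ wget ftp://example_domain.com/somefile.img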
That wraps up the wget command in Linux. Good luck!