Thursday, June 25, 2015

bash - downloading complete web pages (not sites)


I often use offline time (e.g. in the car, or on the train or plane) to read articles pulled from the web.
To store the content, I currently use Firefox's Save Page As.
(Note that's not recursively downloading entire websites, but just individual pages plus their styles, images etc.)


I have looked into automating this, but neither wget nor HTTrack gives me what I need (each downloads either too much or too little, sometimes both).


Any recommendations would be most welcome!


Answer



Have you tried wget --page-requisites?


       This option causes Wget to download all the files that are
       necessary to properly display a given HTML page. This includes
       such things as inlined images, sounds, and referenced stylesheets.
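

For a single article this could look something like the sketch below (the URL and the output directory are placeholders; the extra flags are the combination the wget manual suggests for saving one page together with everything needed to render it offline):

       # -p / --page-requisites   fetch images, CSS and other files the page needs
       # -k / --convert-links     rewrite links so the local copy works offline
       # -E / --adjust-extension  save pages with a proper .html extension
       # -H / --span-hosts        also fetch requisites served from other hosts (CDNs)
       # -P ...                   directory to save everything into (placeholder)
       wget -p -k -E -H -P ~/offline-reading "https://example.com/some-article.html"

The -k option in particular is what makes the saved page usable without a network connection, since references are rewritten to point at the downloaded copies.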
