I'm trying to figure out WGET for downloading some videos at http://windowsclient.net/learn/videos_wpf.aspx.
This page is an index page and contains links to a page for each video; each of those pages then has a direct link to a video. Something like:
http://windowsclient.net/learn/videos_wpf.aspx
-> http://windowsclient.net/learn/video.aspx?v=300881
-> http://download.microsoft.com/[...]/HDI-WPF-ipod-AccelerometerJoystick(2).mp4
What I'd like to do is tell WGET to spider the site by following either video.aspx?* or .mp4 links, recursively, for two levels. I can figure out
WGET -r -l2 http://windowsclient.net/learn/videos_wpf.aspx
and then I get stuck. Any suggestions greatly appreciated.
EDIT: Thanks to @mloskot, I got the answer:
WGET -r -l2 -A.mp4,video*.aspx* --domains=windowsclient.net,download.microsoft.com --span-hosts http://windowsclient.net/learn/videos_wpf.aspx
Answer
Read about the -A option in chapters 2.11 and 4.2 of the wget manual, and use it to specify a comma-separated list of file name suffixes or patterns to accept. In other words, use -A to tell wget that you want to download only files with the mp4 extension:
WGET -r -l2 -A.mp4 http://windowsclient.net/learn/videos_wpf.aspx
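For this particular site, -A.mp4 alone is probably not enough, for two reasons visible in the question: the .mp4 files live on download.microsoft.com, a different host from the index page, and the intermediate pages are named video.aspx?v=..., which an accept list of only .mp4 would likely filter out. As a rough sketch, the command from the question's edit can be spelled out with wget's long option names. The variable names and the RUN switch are only illustrative and are not part of wget; by default the script just prints the command instead of downloading anything.

```shell
#!/bin/sh
# Sketch only: rebuilds the command from the question's edit using long options.
# START_URL, ACCEPT, DOMAINS, CMD_LINE and RUN are illustrative names, not wget features.

START_URL='http://windowsclient.net/learn/videos_wpf.aspx'

# Accept the video files and the per-video pages that link to them.
# Single quotes keep the shell from expanding the * wildcards itself.
ACCEPT='.mp4,video*.aspx*'

# Hosts wget may visit once --span-hosts lets it leave the starting host.
DOMAINS='windowsclient.net,download.microsoft.com'

# --recursive is -r, --level is -l, --accept is -A,
# --span-hosts is -H, --domains is -D.
set -- wget \
    --recursive \
    --level=2 \
    --accept="$ACCEPT" \
    --span-hosts \
    --domains="$DOMAINS" \
    "$START_URL"

CMD_LINE="$*"

if [ "${RUN:-0}" = 1 ]; then
    "$@"                        # actually run wget
else
    printf '%s\n' "$CMD_LINE"   # dry run: only show the command
fi
```

Running the script with RUN=1 set in the environment performs the download; without it, the printed line can be checked first.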