Monday, December 8, 2014

windows - Automatically extracting text files via commandline and combining them into a single text file


In-Depth Explanation of Problem:


I'm trying to automate a process on my computer and having some difficulty. Each day, I'm emailed a zip file that contains a txt file. I have a script in place that automatically takes the attachment and dumps it into a local directory.


Now I'm trying to figure out how to get the contents of the txt file and append them to a "master file". Here is how it works today: an archive named [TODAY'S_DATE].zip is emailed to me each day, and a script moves it to a folder. A .bat file then extracts the contents to a folder named [TODAY'S_DATE] and moves the archive to a folder titled "Completed". I need to know how to take the text out of the file as it is extracted and append it to a "Master.txt" file that will keep growing.


This would avoid having to extract the files to their own directories and manually copy the text from them into the master file.


My setup:



  • Using Windows 7

  • Using 7-Zip Command Line for extraction

  • Using .Bat file to extract file from archive to directory


Problem:



  • Need to take txt from archive and merge it into a "master.txt" file.


Current contents of .bat file:


7za x *.zip -o*
COPY /Y *.zip " \Completed\"
@echo extraction complete

I really appreciate any help that can be offered. I know this was really long-winded, but often when I see these types of questions, not enough detail is presented. Thank you again.


Answer



I want to thank everyone who took the time to answer. I did more digging in 7-Zip's documentation, tested some things, and eventually arrived at the answer.


I'm sure many of the answers on this page can be used in various ways to achieve the same result, but I wanted to avoid as much file handling as possible.


What I needed was 7-Zip's -so switch. It writes the extracted data to STDOUT, so you can redirect the stream to any file you want.


This is what it looks like:


7za e *.zip -y -so >> masterlist.txt

This let me skip extracting the file to a directory altogether, which saves disk space and file handling. If the archive has a known directory structure and you know the filename (I didn't have that luxury), you can use:


7za e *.zip -ir!PATH\FILENAME.txt -y -so >> masterlist.txt
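
If the exact filename is not known but only text files are wanted, a wildcard include filter is a possible middle ground. This is a sketch, not something from my own setup: the *.txt pattern is an assumption about what the archives contain, and it assumes delayed expansion is not enabled in the batch file (otherwise the ! would need escaping).

```batch
rem Sketch: extract only .txt entries, at any depth inside each archive,
rem and append them to masterlist.txt. The *.txt pattern is an assumption.
7za e *.zip -ir!*.txt -y -so >> masterlist.txt
```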

Also, in case anyone wants to see the working .bat file, this is what I have.


7za e *.zip -y -so >> masterlist.txt
MOVE /Y *.zip Completed
@echo extraction complete

Line 1: 7-Zip extracts the single file to STDOUT, and that output is appended (>>) to masterlist.txt.


Line 2: The zip archive is moved to the Completed folder so it will not be processed again in the future.


Line 3: Prints a message that the extraction is complete, although you probably won't see it (at least with the way I'm using it).
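
The three lines above move every archive even if an extraction fails. A slightly more defensive variant could handle each archive on its own and move it only when 7za reports success. This is a minimal sketch under a few assumptions: 7za is on the PATH, the script runs from the folder that holds the archives, and the extra mkdir check is my addition rather than part of the working file.

```batch
@echo off
rem Sketch only: process each archive separately so that a failed
rem extraction leaves the .zip in place for a later retry.

rem Make sure the target folder exists; otherwise MOVE would rename the file.
if not exist Completed mkdir Completed

for %%F in (*.zip) do (
    rem && runs MOVE only if 7za exits with code 0.
    7za e "%%F" -y -so >> masterlist.txt && move /Y "%%F" Completed\
)

echo extraction complete
```

Note that if 7za fails partway through an archive, whatever it had already written to STDOUT will still have been appended to masterlist.txt, so a failed run may leave partial text behind.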


I hope this helps someone out. :-)

