I currently have a folder containing thousands of 10-K financial reports.
All of these files are .txt files.
The files look like this:
20170103_10-K_edgar_data_1662618_0001376474-17-000004_1.txt
20170104_10-Q_edgar_data_886137_0000886137-17-000006_1.txt
20170106_10-K-A_edgar_data_1590496_0001477932-17-000065_1.txt
20170106_10-Q_edgar_data_1275187_0001275187-17-000005_1.txt
I only need files that are "10-K" files and contain a certain CIK-number.
e.g. only the files which contain the string "10-K_edgar_data" and the CIK-number "0001376474"
Is there a way to achieve this for multiple 10-K files with different CIK-numbers?
EDIT: I use macOS High Sierra. I also have Python 2 and 3, which I use with PyCharm. For now I only know the absolute basics of Python (I am still learning the language).
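A minimal Python 3 sketch of one way to do this, assuming every file name follows the underscore-separated layout shown above. The function names select_10k_files and copy_matches are made up for illustration, and the number being matched is the 10-digit block the question calls the CIK-number (the start of the sixth underscore-separated field).

```python
import shutil
from pathlib import Path


def select_10k_files(folder, wanted_numbers):
    """Return plain 10-K files whose 10-digit number is in wanted_numbers."""
    wanted = set(wanted_numbers)
    selected = []
    for path in sorted(Path(folder).glob("*.txt")):
        # Assumed layout: date_form_edgar_data_id_number-yy-seq_n.txt
        parts = path.name.split("_")
        if len(parts) < 6 or parts[1] != "10-K":
            continue  # skips 10-Q, 10-K-A and names that do not fit the layout
        if parts[5].split("-")[0] in wanted:
            selected.append(path)
    return selected


def copy_matches(paths, target_folder):
    """Copy the selected files into target_folder, creating it if needed."""
    target = Path(target_folder)
    target.mkdir(parents=True, exist_ok=True)
    for path in paths:
        shutil.copy2(path, target / path.name)
```

Passing a list with several numbers, for example select_10k_files("reports", ["0001376474", "0001477932"]), covers the "multiple CIK-numbers" case in one pass. The folder name "reports" is only a placeholder.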
I wanted to ask for some help to extract a few things.
I have roughly 2 TB of compressed files, and I want to extract only the .txt files that contain the line "Função: finalizar vendas".
There are several .rar, .zip and .7z archives. Each one contains folders named after the employees, and inside those folders are text files, some of which describe the employee's function.
The closest I have come is a 7-Zip command that extracts only the .txt files from the archives, but that produces many unnecessary text files, and I only want the ones that contain the sentence above.
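A rough Python sketch of the idea for the .zip archives only, since the standard-library zipfile module does not read .rar or .7z (those would need a third-party tool). The function name extract_matching_txt is hypothetical, and the two encodings tried are an assumption about how the text files were saved.

```python
import zipfile
from pathlib import Path

PHRASE = "Função: finalizar vendas"


def extract_matching_txt(archive_path, out_dir, phrase=PHRASE,
                         encodings=("utf-8", "cp1252")):
    """Extract only the .txt members of a zip archive that contain phrase."""
    out_dir = Path(out_dir)
    extracted = []
    with zipfile.ZipFile(archive_path) as zf:
        for info in zf.infolist():
            if info.is_dir() or not info.filename.lower().endswith(".txt"):
                continue
            data = zf.read(info)  # member is read into memory, not to disk
            # Compare raw bytes under each assumed encoding of the phrase.
            if any(phrase.encode(enc) in data for enc in encodings):
                zf.extract(info, out_dir)  # keeps the employee folder path
                extracted.append(info.filename)
    return extracted
```

Because each member is checked in memory before anything is written, the unwanted text files never land on disk.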
I need to store data in a file in this format
word, audio, jpeg
How would I store all of that in one file? Is it even possible, or would I need to store links to other data files in place of the audio and jpeg? Would I need a custom file format?
1. Your own filetype
As mentioned by @Ken White, you would need to create your own custom file format for this sort of thing, which in turn means writing your own parser. This could be done in almost any language, but since you are planning on using word format, C# may suit you best. However, this technique can be quite complicated, and thoroughly testing your file compressor / decompressor takes a relatively large amount of time, but it may be the best option depending on your needs.
2. Command line utilities
Another way to go about this would be to use a shell script to combine all of the files into one file, and then split it apart again at the other end. For example the steps could involve:
Combine files using the Windows copy / Linux cat command on the command line
Create a metadata file of your own that says how many files are in this custom file, and how many bytes each one takes up (could be a short XML or JSON file for example...)
Use the Linux split command or install a Windows command line file splitter program (here's just one example) to split the file back into whatever components have made it up.
This way you only have to create a really small file type, and let the OS utilities handle the combining of them for you.
Example on Windows:
Copy all of the files in your current directory into one output file called 'file.custom'
copy /b * file.custom
Generate custom file format describing metadata (i.e. get the file size on disk in C# example here). This is just maybe what I would do in JSON. SO formatting was being annoying so here's a link (Copy paste it into an editor or online JSON viewer).
Use a Windows / Linux command line splitting tool to cut the combined file back into its parts, each at the exact length (and under the exact name) specified in the JSON (metadata) file. (More info on splitting files on this post).
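The combine / metadata / split steps above can also be sketched in a few lines of Python instead of OS utilities. This is only an illustration under assumptions: the names pack and unpack and the JSON layout (a list of name and size entries) are invented here, not taken from the linked example.

```python
import json
from pathlib import Path


def pack(files, blob_path, manifest_path):
    """Concatenate files into one blob and record each name and byte size."""
    manifest = []
    with open(blob_path, "wb") as blob:
        for f in map(Path, files):
            data = f.read_bytes()
            blob.write(data)
            manifest.append({"name": f.name, "size": len(data)})
    Path(manifest_path).write_text(json.dumps(manifest, indent=2))


def unpack(blob_path, manifest_path, out_dir):
    """Split the blob back into the original files using the manifest."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    manifest = json.loads(Path(manifest_path).read_text())
    with open(blob_path, "rb") as blob:
        for entry in manifest:
            # Files were written back to back, so reading sizes in order works.
            (out / entry["name"]).write_bytes(blob.read(entry["size"]))
```

The manifest must list files in the same order they were concatenated, which is why pack writes both in a single loop.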
3. ZIP files
You could always store all of the files in a compressed zip file, and then just use a zip compressor / expander as and when you like to retrieve any of the file formats stored within.
I found a couple of examples of:
Combining multiple files into one ZIP file in only C# .net,
Unzipping ZIP files in C#
Zipping & Unzipping with only windows built-in utilities
Zipping & Unzipping in Linux command line
Good Zipping/Unzipping library in Java
Zipping/Unzipping in Python
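For the Python route, a minimal sketch with the standard-library zipfile module might look like the following. The helper names bundle and read_member are made up for this example.

```python
import zipfile
from pathlib import Path


def bundle(files, archive_path):
    """Store the given files (text, audio, image, ...) in one zip archive."""
    with zipfile.ZipFile(archive_path, "w",
                         compression=zipfile.ZIP_DEFLATED) as zf:
        for f in map(Path, files):
            zf.write(f, arcname=f.name)  # store under the bare file name


def read_member(archive_path, name):
    """Return the raw bytes of one stored file without unpacking the rest."""
    with zipfile.ZipFile(archive_path) as zf:
        return zf.read(name)
```

The zip format already keeps the name and size of every member, so no separate metadata file is needed with this approach.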
I have a long list of files that are auto-produced every month. They all have the same file name, with a sequential file extension, like this: file.001, file.002, file.003
Each file has differing information, despite having the same name. What I need to do is copy them from their home directory and paste them into a new directory with names that reflect their purpose, and as text files, like this: Budget.txt, Expense.txt, Retention.txt
Is it possible to do this with a batch file? I've been unable to find a method that works. Any help would be appreciated.
EDIT: I've tried that solution, and it works as far as it goes. The frustrating thing is that the extensions are not always the same, but they are always sequentially numbered.
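The question asks for a batch file; purely as an illustration of the logic, here is a small Python sketch that does not depend on where the numbering starts. It assumes the files sort into the same purpose order every month, and the function name copy_with_names is hypothetical.

```python
import shutil
from pathlib import Path


def copy_with_names(src_dir, dst_dir, new_names, stem="file"):
    """Copy stem.NNN files, in numeric order, to <new_name>.txt in dst_dir."""
    # Keep only files whose extension is all digits, sorted by that number.
    sources = sorted(
        (p for p in Path(src_dir).glob(stem + ".*") if p.suffix[1:].isdigit()),
        key=lambda p: int(p.suffix[1:]),
    )
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    # zip pairs the first file with the first name, and so on; extras are ignored.
    for src, name in zip(sources, new_names):
        target = dst / (name + ".txt")
        shutil.copyfile(src, target)
        copied.append(target.name)
    return copied
```

For example, copy_with_names("home", "new", ["Budget", "Expense", "Retention"]) maps the lowest-numbered file to Budget.txt whether it is file.001 or file.014. The directory names are placeholders.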
I have multiple CSV files from different sites within my company that contain multiple names and email addresses from several SQL databases that many different company users enter data into. I have a SQL export program that exports the names and emails to CSV files. I have noticed that occasionally some of the email addresses have the hex 0x1F separator either before or after the email address (in Notepad++ it looks like a black "US" box).
How can I write a simple batch file that finds and removes just that separator from any CSV file that may contain it, and saves the output over the original file? Preferably using simple batch commands, not PowerShell or Java or anything else like that. This will be running on a standard install of Windows 2008 R2 without any extra programs added.
Example:
Directory:
C:\Uploads
Filenames (up to 23 files with a random 2-character prefix followed by the date):
"a1-20151101.csv", "b2-20151101.csv", "cd-20151101.csv", etc.
Inside CSV (FirstName;LastName;Email):
John;Doe;john.doe@johndoe.com
Jane;Smith;jsmith@google.com
You could spend precious time writing some 400-line behemoth of a cmd file.
Or you could simply go and get the tr program from GnuWin32 (ports of the popular UNIX tools to native Windows) which is perfectly suited to doing this sort of thing.
Then your batch file will basically consist of the line:
tr -d "\37" <inputFile >outputFile
The tr program is contained within the coreutils package.
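If installing extra tools is an option and Python happens to be one of them (an assumption, since the question asks for a stock Windows install), the same byte-level deletion can be sketched as below. The function name strip_unit_separator is made up for this example.

```python
from pathlib import Path

UNIT_SEPARATOR = b"\x1f"  # the same byte as octal \37 in the tr command


def strip_unit_separator(folder, pattern="*.csv"):
    """Remove every 0x1F byte from matching files, rewriting them in place."""
    changed = []
    for path in sorted(Path(folder).glob(pattern)):
        data = path.read_bytes()
        if UNIT_SEPARATOR in data:
            # Work on bytes so the CSV's text encoding is never reinterpreted.
            path.write_bytes(data.replace(UNIT_SEPARATOR, b""))
            changed.append(path.name)
    return changed
```

Files that do not contain the separator are left untouched, so their timestamps do not change.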
I have to write a batch file which finds and replaces values of version and build with their new values given by the user as an input.
The user initially gives 3 values: Apps, version and build values (the new values).
The search is then made across multiple .property files in a folder. The Apps value is searched across multiple files and wherever there is a match, the version and build of that file will be replaced by the values given by the user.
To find and replace across multiple files easily, use FART (Find And Replace Text)!
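To make the matching logic concrete, here is a hedged Python sketch (not FART, and not batch). The question does not show the file layout, so the key names app, version and build, the key=value line format, and the *.property glob are all assumptions, and update_properties is a hypothetical name.

```python
import re
from pathlib import Path


def update_properties(folder, app, version, build, pattern="*.property"):
    """Set version and build in every file whose app line matches app."""
    # Assumed layout: one "key=value" pair per line, keys app/version/build.
    app_line = re.compile(
        r"^app[ \t]*=[ \t]*" + re.escape(app) + r"[ \t]*$", re.MULTILINE
    )
    updated = []
    for path in sorted(Path(folder).glob(pattern)):
        text = path.read_text()
        if not app_line.search(text):
            continue  # a different app, leave the file alone
        for key, value in (("version", version), ("build", build)):
            # A function replacement avoids backslash surprises in the value.
            text = re.sub(
                r"^(" + key + r"[ \t]*=[ \t]*).*$",
                lambda m, v=value: m.group(1) + v,
                text,
                flags=re.MULTILINE,
            )
        path.write_text(text)
        updated.append(path.name)
    return updated
```

The three user inputs map directly onto the app, version and build arguments, and the returned list shows which files were changed.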