How to split a large .txt file into many smaller .txt files using a specific line as a splitting point? - python-3.10

I have a large .txt file containing location names/coordinates/other info. The format of the .txt file is as such.
LOCATION: ______
(all the data for that location
listed on multiple lines)
"______________________________________________________________________"
LOCATION: ______
(all the data for that location
listed on multiple lines)
"______________________________________________________________________"
etc.
Each set of smaller data within the larger .txt file is separated by a string of 70 underscores.
I would like to be able to split the large .txt file into many smaller .txt files, each containing just one location and all its specific data (all the lines between "LOCATION" and the 70-underscore string).
I would have used a line counter to do this, but each smaller dataset contains a different number of total lines.
Here is the code I've been working on to try and solve this.
import tempfile
import re
print("Enter the Name of File: ")
fileName = input() ### gets name of .txt file
fileHandle = open(fileName, "r")
largeData = fileHandle.read().splitlines()
NewfileName = fileName.replace('.txt', '_') ### creates a new file name without the .txt at the end
i=0 ### counter for new smaller .txt names
tempData = tempfile.TemporaryFile(mode='w+t')
for line in largeData: ### goes though each line in largeData
    tempData.writelines(line) ### adds each new line to tempData
    if "LOCATION" in line: ### if the word "LOCATION" is found -> indicates a new set of data
        Newfile = open(NewfileName + str(i) +".txt", 'w') ### creates a new file using the original file name plus the counter and replaces .txt at the end
        for sLine in tempData: ### for every line currently in the tempData temporary file
            Newfile.writelines(sLines) ### writes all lines from tempData to the Newly created file
        i+=1 ### updates counter for next file name
Right now when I run the code and give it the large .txt file name I end up with the correct number and names for all the smaller .txt files but they are all empty. How do I get the files to contain each individual smaller dataset? Thank you!
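One likely reason for the empty files: after writing to the temporary file, its read position sits at the end, so the inner loop over tempData has nothing to read. A minimal sketch of one way around this is below. It collects the lines of each block in a list and writes them out whenever the separator line appears. The function name split_locations, the SEPARATOR constant, the UTF-8 encoding, and the choice to leave the separator line out of the output files are all assumptions, not part of the original code.

```python
from pathlib import Path

# Assumption: a separator line is exactly 70 underscores, optionally wrapped in double quotes.
SEPARATOR = "_" * 70


def split_locations(source, out_dir=None):
    """Write each LOCATION block of `source` to its own numbered file.

    Returns the list of files written (as Path objects).
    """
    source = Path(source)
    out_dir = Path(out_dir) if out_dir is not None else source.parent
    written = []
    block = []  # lines of the block currently being collected

    def flush():
        # Write the collected block (if it has any non-blank line) and reset it.
        if any(line.strip() for line in block):
            target = out_dir / f"{source.stem}_{len(written)}.txt"
            target.write_text("".join(block), encoding="utf-8")
            written.append(target)
        block.clear()

    with source.open(encoding="utf-8") as handle:
        for line in handle:
            # A separator ends the current block and is not copied to the output.
            if line.strip().strip('"') == SEPARATOR:
                flush()
            else:
                block.append(line)
    flush()  # the last block may not be followed by a separator
    return written
```

Calling split_locations("data.txt") (a hypothetical file name) would produce data_0.txt, data_1.txt, and so on next to the original, following the naming scheme in the question. Splitting on the separator rather than on "LOCATION" avoids having to look ahead to find where a block ends.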

Related

How do I copy only 20170906.1 from the text file by using batch file

Hi, I want to copy only 20170906.1 from the file demo.txt, which contains \tfs-server\TFSBUILDS\OB-MAINMVC2.0\OB-MAINMVC2.0_20170906.1\_PublishedWebsites\In.OfficeBox.Api.Task\bin, and save it in another text file. The value 20170906.1 is dynamic and changes every day. How do I get the value after the _ and before the \?
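The question asks for a batch file, but the extraction rule itself (the text after an underscore and before the next backslash) can be sketched with a regular expression. The Python sketch below is an illustration only. The function names and the assumption that the value always looks like digits, a dot, and more digits are mine, not the asker's.

```python
import re
from pathlib import Path

# Assumed shape of the value: "_", digits, ".", digits, then a backslash.
BUILD_NUMBER = re.compile(r"_(\d+\.\d+)\\")


def extract_build_number(text):
    """Return the first build number found in `text`, or None if there is none."""
    match = BUILD_NUMBER.search(text)
    return match.group(1) if match else None


def copy_build_number(source, target):
    """Read `source`, extract the build number, and write it to `target`."""
    value = extract_build_number(Path(source).read_text(encoding="utf-8"))
    if value is not None:
        Path(target).write_text(value + "\n", encoding="utf-8")
    return value
```

Requiring digits right after the underscore keeps the pattern from matching the later \_PublishedWebsites segment, which also contains an underscore.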

Open file without Filename with lua skript

Good evening,
I am currently working on a program that loads information from a file into a database. For testing purposes I used to open test files in the classical way via io:
function reader (file, delimeter)
  local f = io.open(file)
  for line in f:lines() do
    lines[count] = splitty(line, delimeter)
    count = count + 1;
  end
end
(this part also contains the first part of a splitter)
But in the actual environment, the database program immediately moves the file to another directory and renames it, for example to this:
$30$15$2016$09$26$13$27$24$444Z$.Pal.INV.csv
Now I know the directory but I can't really predict the name, so I wanted to know if there might be a way to open files without knowing their name.
(and delete them after reading them)
I had ideas to use a modified link:
local inputFile = "D:\\Directory\\(*all)"
but it failed.
Other available information:
So far, the system is only planned for Windows PCs.
The directory will always contain only the one file that is to be read, no other files.
You can use the lfs.dir iterator from LuaFileSystem to iterate through the contents of the directory. A small example:
local lfs = require("lfs")
local path = "D:\\Directory\\" -- Your directory path goes here.
for filename in lfs.dir(path) do
  print(filename) -- Work with filename, i will just print it
end
If you keep a record of the files, you will be able to tell which one is new. If there is only one file, it is easier: you can just check the extension with a string function. From what I remember, the iterator also returns the . and .. entries. See the LuaFileSystem documentation for details.
-- directory name and file name should consist of ASCII-7-bit characters only
local dir = [[C:\Temp\New Folder]]
local file = io.popen('dir /b/s/a-d "'..dir..'" 2>nul:'):read"*a":match"%C+"
if not file then
  error"No files in this directory"
end
-- print the file name of first file in the directory
print(file) --> C:\Temp\New Folder\New Text Document.txt

want to rename a file only for the first loop and for next loops need to delete last written file

I have a big text file called 'filename.txt'. I want to extract data from it and write the result to a file with the same name (filename.txt), because I will need to extract data from my main file many times.
How can I rename the main file to filename1.txt the first time only, when I extract the data, so that I can write a file named filename.txt? To extract data in the next loop from filename1.txt, I will again need to create a file named filename.txt, and for that I will first need to delete the old file (filename.txt).
I am getting confused by this.
How can I do it?

How to replace text from a source file into different files

So I have been using Notepad++ to do some little clean-up tasks and now I am left with the biggest task..
I have a file called Artists.txt which looks like
Butta Mohamed
Daler Mehndi
Daljit Mattu
Darshan Khela
Davinder Deep
Davinder Deol
etc...
I have another file called Keywords.txt (located in hundreds of other folders). The folders are named like below and they all contain a text file called Keywords.txt
butta-mohamed-lyrics
daler-mehndi-lyrics
daljit-mattu-lyrics
darshan-khela-lyrics
davinder-deep-lyrics
davinder-deol-lyrics
The Keywords.txt contains the text _1 (several instances within the Keywords.txt).
What I would like to do is get each line from Artists.txt and have the _1 replaced. The folders are in the same order as Artists.txt.
So read Artists.txt get first line Butta Mohamed get first folder butta-mohamed-lyrics edit Keywords.txt find _1 replace (all) with Butta Mohamed. Save changes. Rinse and repeat so read Artists.txt get next line Daler Mehndi get next folder daler-mehndi-lyrics edit Keywords.txt find _1 replace (all) with Daler Mehndi. Save Changes.
Wondering if something like this is possible? Otherwise it would take me a week to manually do this via copy/pasting or even the replace function in Notepad++
I've tried the macro function in Notepad++, but rather than pasting what is in the clipboard, Ctrl+V in the macro inserts whatever text was pasted when the macro was recorded.
So just adding some extra information...
I don't have Notepad++ installed as my favorite text editor is UltraEdit (shareware).
Although Stack Overflow is not a free code writing service and we expect that the questioner shows us some programming efforts already made to solve a task, it was very easy for me to write the little UltraEdit script for this task and therefore here is an UltraEdit script for this task.
C:\\Temp\\Test\\ at top of the script must be replaced by path of parent folder for the *lyrics folders. UltraEdit scripts are executed with the JavaScript core engine. Strings in UltraEdit scripts are therefore JavaScript strings where backslash is the escape character. So it is necessary to escape each backslash in parent folder path by one more backslash.
To run this script in UltraEdit, open Artists.txt as first file in UltraEdit.
As second file create a new ASCII file with Ctrl+N, copy and paste the lines below into this new file, edit the parent folder path/name in script code and save this script for example with name KeywordsReplace.js into any folder.
Now run the script by clicking in menu Scripting on command Run Active Script.
After the script has finished, the output window opens automatically and shows how many replacements were made in which Keywords.txt files.
if (UltraEdit.document.length > 0) // Is any file opened?
{
   // Parent folder containing all the *lyrics folders.
   var sParentFolder = "C:\\Temp\\Test\\";
   // Define environment for this script.
   UltraEdit.insertMode();
   UltraEdit.columnModeOff();
   // Select everything in first file.
   UltraEdit.document[0].selectAll();
   // Is first file not an empty file?
   if (UltraEdit.document[0].isSel())
   {
      // Determine line terminator type for first file.
      var sLineTerm = "\r\n";
      if (UltraEdit.document[0].lineTerminator == 1) sLineTerm = "\n"
      else if (UltraEdit.document[0].lineTerminator == 2) sLineTerm = "\r"
      // Get all lines of first file into an array of strings
      var asArtists = UltraEdit.document[0].selection.split(sLineTerm);
      // Remove last string if it is empty because file ended with
      // a line termination.
      if (!asArtists[asArtists.length-1].length) asArtists.pop();
      // Define once the parameters for all the replace in files executed
      // below in the loop with changing directory and replace strings.
      UltraEdit.frInFiles.filesToSearch=0;
      UltraEdit.frInFiles.searchSubs=false;
      UltraEdit.frInFiles.ignoreHiddenSubs=false;
      UltraEdit.frInFiles.openMatchingFiles=false;
      UltraEdit.frInFiles.searchInFilesTypes="Keywords.txt";
      UltraEdit.frInFiles.regExp=false;
      UltraEdit.frInFiles.matchCase=true;
      UltraEdit.frInFiles.matchWord=false;
      UltraEdit.frInFiles.logChanges=true;
      UltraEdit.frInFiles.useEncoding=false;
      UltraEdit.frInFiles.preserveCase=false;
      // Run for each artist a replace of all occurrences of _1
      // in the artists lyrics folder by name of the artist.
      for (nArtist = 0; nArtist < asArtists.length; nArtist++)
      {
         // Build folder name by converting artists name to
         // lower case and replacing all spaces by hyphens.
         var sFolder = asArtists[nArtist].toLowerCase().replace(/ /g,"-");
         // Define directory for replace in files by appending
         // additionally the string "-lyrics" to folder name.
         UltraEdit.frInFiles.directoryStart = sParentFolder + sFolder + "-lyrics\\";
         UltraEdit.frInFiles.replace("_1",asArtists[nArtist]);
      }
      // The output window contains the summary information
      // about the replaces made and therefore open it.
      UltraEdit.outputWindow.showWindow(true);
   }
}
Script was tested with the provided data with each Keywords.txt containing exactly 3 times _1 in the 6 *lyrics folders. Result of output window was:
Running script: C:\Temp\KeywordsReplace.js
============================================================
C:\Temp\Test\butta-mohamed-lyrics\Keywords.txt, 3
3 items replaced in 1 files.
C:\Temp\Test\daler-mehndi-lyrics\Keywords.txt, 3
3 items replaced in 1 files.
C:\Temp\Test\daljit-mattu-lyrics\Keywords.txt, 3
3 items replaced in 1 files.
C:\Temp\Test\darshan-khela-lyrics\Keywords.txt, 3
3 items replaced in 1 files.
C:\Temp\Test\davinder-deep-lyrics\Keywords.txt, 3
3 items replaced in 1 files.
C:\Temp\Test\davinder-deol-lyrics\Keywords.txt, 3
3 items replaced in 1 files.
Script succeeded.
If downloading and installing UltraEdit is not acceptable for you, you will have to wait for another answer providing a batch file or Notepad++ macro solution, or write the necessary code yourself.
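For readers without UltraEdit, the same steps (read each artist, derive the folder name, replace every _1 in that folder's Keywords.txt) could also be sketched in Python. This is not the UltraEdit script above, just a rough equivalent. The function name fill_keywords, the UTF-8 encoding, and the choice to skip artists with no matching folder are assumptions.

```python
from pathlib import Path


def fill_keywords(artists_file, parent_folder, placeholder="_1"):
    """Replace `placeholder` in each artist's Keywords.txt with the artist name.

    Returns a dict mapping each edited file to the number of replacements made.
    """
    results = {}
    for artist in Path(artists_file).read_text(encoding="utf-8").splitlines():
        artist = artist.strip()
        if not artist:
            continue  # ignore blank lines in Artists.txt
        # Folder name: lower-case artist name, spaces to hyphens, plus "-lyrics".
        folder = artist.lower().replace(" ", "-") + "-lyrics"
        keywords = Path(parent_folder) / folder / "Keywords.txt"
        if not keywords.is_file():
            continue  # assumption: silently skip artists without a folder
        text = keywords.read_text(encoding="utf-8")
        results[keywords] = text.count(placeholder)
        keywords.write_text(text.replace(placeholder, artist), encoding="utf-8")
    return results
```

Like the script above, this derives the folder from the artist name instead of relying on the folders being in the same order as Artists.txt. It edits the files in place, so trying it on a copy of the folders first would be prudent.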

Iterate through list of file in a list for a directory in a batch file?

I need to create a CMD batch file that holds a predefined list of files in a particular directory (we are keeping an eye on certain files). I have to iterate through this list and produce a result, perhaps as a text file, that lists only the files larger than 1 MB.
So, for instance, if there are files called a.txt, b.txt, c.txt and d.txt with respective sizes of 900 KB, 1.1 MB, 500 KB and 1.5 MB, then my output file should look like
length of file b.txt > 1MB = 1.1 MB
length of file d.txt > 1MB = 1.5MB.
I need help with initializing and storing the file list in an array, and with iterating through the file list and writing the result to a .txt file.
You can pass your list of files as parameters to the script, and then use SHIFT to iterate.
call batch.cmd a.txt b.txt c.txt
A skeleton of batch.cmd:
:START
IF "%1"=="" GOTO END
REM -- CHECK FILE SIZE HERE
DO STUFF
SHIFT
GOTO START
:END
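If a pure batch solution is not a hard requirement, the same loop (walk a fixed list of names, check each size, write the large ones to a report) can be sketched in Python. The function name report_large_files, the report format, and the reading of 1 MB as 1,048,576 bytes are assumptions made for this illustration.

```python
from pathlib import Path

ONE_MB = 1024 * 1024  # assumption: 1 MB is taken as 1,048,576 bytes


def report_large_files(directory, names, report_path):
    """Write one line per listed file larger than 1 MB to `report_path`.

    Returns the report lines. Names that do not exist in `directory` are skipped.
    """
    lines = []
    for name in names:
        path = Path(directory) / name
        if not path.is_file():
            continue
        size = path.stat().st_size
        if size > ONE_MB:
            lines.append(f"length of file {name} > 1MB = {size / ONE_MB:.1f} MB")
    # End the report with a newline only when there is something to report.
    Path(report_path).write_text("\n".join(lines) + ("\n" if lines else ""), encoding="utf-8")
    return lines
```

The list of names plays the role of the array asked about in the question, for example report_large_files("C:\\watched", ["a.txt", "b.txt", "c.txt", "d.txt"], "report.txt"), where the directory and report names are hypothetical.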
