I have multiple segments saved in an input file. The format is:
Use case 1:
host port start_byte end_byte
127.0.0.1 12345 0 2048
127.0.0.1 12346 0 1024
127.0.0.1 12347 1024 2048
Use case 2:
host port start_byte end_byte
127.0.0.1 12345 0 2048
127.0.0.1 12346 1024 2048
127.0.0.1 12347 0 1024
Here, the first line is a header that describes the fields on each following line.
The host is always localhost, but the ports differ.
There are 3 ports. Port 12345 has the entire file (say abc.txt). Port 12347 has
the second segment, whereas port 12346 has the first segment.
Now, I want to read the file from its end towards its start (from line 3 to line 1).
The code to download each segment and write it to a new file is given below.
def downloadSegment(threadName, fileNameTemp, server_addr, server_port, segment_beginaddr, segment_endaddr, fileName, maxSegmentSize, ip_address, peer_server_port, relevant_path):
    downloadSegmentStr = "download," + fileName + "," + segment_beginaddr + "," + segment_endaddr
    socket1 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    socket1.connect((server_addr, int(server_port)))
    socket1.send(downloadSegmentStr)
    lock.acquire()
    with open(fileNameTemp, 'ab') as file_to_write:
        file_to_write.seek(int(segment_beginaddr), 0)
        while True:
            data = socket1.recv(maxSegmentSize)
            #print data
            if not data:
                break
            #print data
            file_to_write.write(data)
    file_to_write.close()
    lock.release()
    socket1.close()
When I write the segments in increasing order (Use case 1), it works perfectly. But when I write them out of order, as in Use case 2 above, it doesn't work.
Any help is appreciated. Thanks.
You missed these parts of the Python Documentation:
open(name[, mode[, buffering]])
… 'a' for appending (which on some Unix systems means that all
writes append to the end of the file regardless of the current seek
position).
file.seek(offset[, whence])
… Note that if the file is opened for appending (mode 'a' or 'a+'),
any seek() operations will be undone at the next write. If the file is
only opened for writing in append mode (mode 'a'), this method is
essentially a no-op…
Thus, mode 'a' is unsuitable for the task. Sadly, stdio's fopen() offers no mode without 'a' that both creates the file if it doesn't exist and leaves it untruncated if it does. So I'd use os.open(), as in:
with os.fdopen(os.open(fileNameTemp, os.O_CREAT|os.O_WRONLY), 'wb') as file_to_write:
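For illustration, here is a minimal sketch of how that open call could be used to write segments in any order. The helper name write_segment and its parameters are hypothetical and not part of the original code; the O_BINARY flag only exists on Windows, hence the getattr fallback.

```python
import os


def write_segment(path, offset, data):
    """Write data at a byte offset without appending or truncating.

    Hypothetical helper, shown only to illustrate the os.open() approach.
    """
    # O_CREAT creates the file if it is missing. Because O_TRUNC is not
    # passed, existing contents are kept. O_BINARY exists only on Windows.
    flags = os.O_CREAT | os.O_WRONLY | getattr(os, "O_BINARY", 0)
    fd = os.open(path, flags)
    # os.fdopen() wraps the descriptor that is already open, so the 'wb'
    # mode string does not truncate the file here.
    with os.fdopen(fd, "wb") as file_to_write:
        file_to_write.seek(offset, os.SEEK_SET)
        file_to_write.write(data)
```

With this, writing the second segment first (as in Use case 2) and the first segment afterwards should leave both in their proper places, since each write goes to its own offset.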
Is it possible to first search inside a file for a specific byte, find its position, and then read only the bytes up to that specific byte?
At the moment I can only read some bytes or the whole file and search for that specific byte afterwards,
like this:
local function read_file(path)
    local file = open(path, "r") -- r read mode and b binary mode
    if not file then return nil end
    local content = file:read(64) -- reading 64 bytes
    file:close()
    return content
end
local fileContent = read_file("../test/l_0.dat");
print(fileContent)
function parse(line)
    if line then
        len = 1
        a = line:find("V", len +1) --find V in content
        return a
    else
        return false
    end
end
a = parse(fileContent) --position of V in content
print(a)
print(string.sub(fileContent, a)) -- content until first found V
In this example I find the first V at position 21. So it would be nice to read in only 21 bytes instead of 64 bytes or the whole file. But then I need to find the position before reading anything in. Is this possible? (The 21 bytes are variable; it could be 20 or 50 and so on.)
You can set the file position using file:seek and read a given number of characters (bytes) by passing an integer to file:read:
local file = io.open(somePath)
if file then
    -- set cursor to 5 bytes before the file's end
    file:seek("end", -5)
    -- read 3 bytes
    print(file:read(3))
    file:close()
end
You cannot search in a file without reading it. If you don't want to read the entire file you can read it in chunks either by reading it linewise (if there are lines in your file) or by reading a specific number of bytes each time until you find something.
Of course you can also read it byte-wise.
You can argue whether it makes more sense to read a 64-byte file as a whole or in chunks; in most scenarios you won't notice any difference.
So you could call file:read(1) in a loop that terminates once you have found a V or reached the end of the file.
local file = io.open(somePath)
if file then
    local data = ""
    for i = 1, 64 do
        local b = file:read(1)
        if not b then
            print("no V in file")
            data = nil
            break
        end
        data = data .. b
        if b == "V" then
            print(data)
            break
        end
    end
    file:close()
end
vs
local file = io.open("d:/test.txt", "r")
if file then
    local data = file:read("a")
    local pos = data:find("V")
    if pos then
        print(data:sub(1, pos))
    end
    file:close()
end
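As a middle ground between reading byte-wise and reading everything, the same idea can be applied with fixed-size chunks. Below is a rough sketch in Python (not Lua); the function name read_until, the default delimiter and the chunk size of 16 are all assumptions made for illustration.

```python
def read_until(path, delimiter=b"V", chunk_size=16):
    """Return the bytes up to and including the first delimiter.

    Reads the file chunk by chunk and stops as soon as the delimiter
    shows up. Returns None if the delimiter never occurs.
    """
    collected = b""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                # End of file reached without finding the delimiter.
                return None
            collected += chunk
            # Searching the accumulated bytes also catches a multi-byte
            # delimiter that straddles two chunks.
            pos = collected.find(delimiter)
            if pos != -1:
                return collected[:pos + len(delimiter)]
```

This still has to read every byte before the delimiter, but it stops reading shortly after the delimiter is found instead of loading the whole file.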
(Or) Correct your code to...
local function read_file(path)
    local file = io.open(path, "r") -- "r" is read mode; use "rb" for binary mode
    if not file then return nil end
    local content = file:read(64) -- reading 64 bytes
    file:close()
    return content
end
local fileContent = read_file("test/l_0.dat") -- '../' causing error
print(fileContent)
local function parse(line)
    if line then
        local len = 1
        local a = line:find("V", len +1) --find V in content
        return a
    else
        return false
    end
end
print(fileContent:sub(1, parse(fileContent))) -- content until first found V
That puts out...
0123456789VabcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ
0123456789V
If V is meant to be a (single) delimiter, you probably don't want to output it.
Meet the strength of string.sub(text, start, stop)...
print(fileContent:sub(1, parse(fileContent) - 1)) -- before V
-- 0123456789
print(fileContent:sub(parse(fileContent) + 1, -1)) -- after V
-- abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ
A simple question. I have a file test.txt in userPath().."/log/test.txt" with 15 lines.
I want to read the first line and then remove it, so that test.txt ends up with 14 lines.
local iFile = 'the\\path\\test.txt'
local contentRead = {}
local i = 1
local file = io.open(iFile, 'r')
for lines in file:lines() do
    if i ~= 1 then
        table.insert(contentRead, lines)
    else
        i = i + 1 -- this will prevent us from collecting the first line
        print(lines) -- just in case you want to display the first line before deleting it
    end
end
io.close(file)
file = io.open(iFile, 'w')
for _,v in ipairs(contentRead) do
    file:write(v.."\n")
end
io.close(file)
There are probably simpler ways to do this, but basically what the code does is:
Open the file in read mode and store every line of text except the first in the table contentRead.
Open the file again, this time in write mode, which erases the entire contents of the file, and then write everything stored in contentRead back to the file.
Thus, the first line of the file is "deleted" and only the other 14 lines remain.
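The same read-everything-then-rewrite approach, sketched in Python for comparison. The function name pop_first_line is made up for this example, and the sketch assumes the file is small enough to hold in memory.

```python
def pop_first_line(path):
    """Remove the first line of a text file and return it.

    Returns None (and leaves the file untouched) if the file is empty.
    """
    # Read all lines, keeping their line endings.
    with open(path, "r") as f:
        lines = f.readlines()
    if not lines:
        return None
    # Reopening in write mode erases the file; write back all but line 1.
    with open(path, "w") as f:
        f.writelines(lines[1:])
    return lines[0].rstrip("\n")
```

For a 15-line file this returns the first line and leaves the remaining 14 lines in place.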
I'm unable to write 1514 bytes (including the L2 information) via write to /dev/bpf. I can write smaller packets (meaning I think the basic setup is correct), but I see "Message too long" with the full-length packets. This is on Solaris 11.2.
It's as though the write is treating this as the write of an IP packet.
Per the specs, there are 1500 bytes for the IP portion, 14 for the L2 headers (18 if tagging), and 4 bytes for the checksum.
I've set the feature that I thought would prevent the OS from adding its own layer 2 information (yes, I also find it odd that a 1 disables it; pseudo code below):
int hdr_complete = 1;
ioctl(bpf, BIOCSHDRCMPLT, &hdr_complete);
The packets are never larger than 1514 bytes (they're captured via a port span and start with the source and destination MAC addresses; I'm effectively replaying them).
I'm sure I'm missing something basic here, but I'm hitting a dead end. Any pointers would be much appreciated!
Partial Answer: This link was very helpful.
Update 3/20/2017
Code works on Mac OS X, but on Solaris results in repeated "Interrupted system call" (EINTR). I'm starting to read scary things about having to implement signal handling, which I'd rather not do...
Sample code on GitHub based on various code I've found via Google. On most systems you have to run this with root privileges unless you've granted "net_rawaccess" to the user.
Still trying to figure out the EINTR issue. Output from truss:
27158/1: 0.0122 0.0000 write(3, 0x08081DD0, 1514) Err#4 EINTR
27158/1: \0 >E1C09B92 4159E01C694\b\0 E\005DC82E1 #\0 #06F8 xC0A81C\fC0A8
27158/1: 1C eC8EF14 Q nB0BC 4 V #FBDE8010FFFF8313\0\00101\b\n ^F3 W # C E
27158/1: d SDD G14EDEB ~ t sCFADC6 qE3C3B7 ,D9D51D VB0DFB0\b96C4B8EC1C90
27158/1: 12F9D7 &E6C2A4 Z 6 t\bFCE5EBBF9C1798 r 4EF "139F +A9 cE3957F tA7
27158/1: x KCD _0E qB9 DE5C1 #CAACFF gC398D9F787FB\n & &B389\n H\t ~EF81
27158/1: C9BCE0D7 .9A1B13 [ [DE\b [ ECBF31EC3 z19CDA0 #81 ) JC9 2C8B9B491
27158/1: u94 iA3 .84B78AE09592 ;DA ] .F8 A811EE H Q o q9B 8A4 cF1 XF5 g
27158/1: EC ^\n1BE2C1A5C2 V 7FD 094 + (B5D3 :A31B8B128D ' J 18A <897FA3 u
EDIT 7 April 2017
The EINTR problem was the result of a bug in the sample code that I placed on GitHub. The code was not associating the bpf device with the actual interface and Solaris was throwing the EINTR as a result.
Now I'm back to the "message too long" problem that I still haven't resolved.
I have a problem: I would like to split one file into several files based on a condition.
INPUT: One text file
variable chrom=chr1
1000 10
1010 20
1020 10
variable chrom=chr2
1000 20
1100 30
1200 10
OUTPUT: two files for this example.
chr1.txt
variable chrom=chr1
1000 10
1010 20
1020 10
chr2.txt
variable chrom=chr2
1000 20
1100 30
1200 10
So the separator condition is: if a row is a chrom=chr$i header line (i={1..22}), start a new text file.
Thank you
Something along these lines:
awk 'BEGIN { filename="unknown.txt" } /^variable chrom=/ { close(filename); filename = substr($0, index($0, "=") + 1) ".txt"; } { print > filename }'
Where the awk code is
BEGIN { filename="unknown.txt" }    # default file name, used only if the
                                    # file doesn't start with a variable chrom=
                                    # line
/^variable chrom=/ {                # in such a line:
    close(filename)                 # close the previous file (if open)
                                    # and set the new filename
    filename = substr($0, index($0, "=") + 1) ".txt"
}
{ print > filename }                # print everything to the current file.
The basic algorithm is very straightforward: read the file linewise, change the filename when you find a line that starts a new section, and always print the current line to the current file. The devil is in the detail of isolating the file name from the marker line. The
filename = substr($0, index($0, "=") + 1) ".txt"
approach is simplistic but serviceable for the example you showed: It takes everything after the = and attaches .txt to get the file name. If your marker lines are more complicated than variable chrom=filenamestub, this will have to be amended, but in that case I could only guess your requirements and would probably guess wrong.
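In case awk is not an option, the same algorithm can be sketched in Python. The function name split_by_marker, its parameters and the out_dir argument are assumptions made for this example; like the awk version, it takes everything after the marker as the file name stub and falls back to unknown.txt for lines that come before the first marker.

```python
import os


def split_by_marker(input_path, out_dir, marker="variable chrom="):
    """Write each marker-delimited section to <stub>.txt in out_dir.

    Returns the list of files written, in order of first appearance.
    """
    written = []
    out = None
    try:
        with open(input_path, "r") as src:
            for line in src:
                is_marker = line.startswith(marker)
                if is_marker or out is None:
                    # Close the previous section's file, if any.
                    if out is not None:
                        out.close()
                    # Everything after the marker becomes the file name stub.
                    stub = line[len(marker):].strip() if is_marker else "unknown"
                    path = os.path.join(out_dir, stub + ".txt")
                    out = open(path, "w")
                    written.append(path)
                # Every line, including the marker line, goes to the current file.
                out.write(line)
    finally:
        if out is not None:
            out.close()
    return written
```

Note that the stub is used as a file name without any sanitizing, so this is only suitable for trusted input such as the example shown.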
If you know how many lines each section has, you could use
split -l 4 textfile.txt
This splits the text file after every 4 lines, producing the files xaa, xab, and so on.
Referring to Loop (read file contents), a quite strange thing happens every time I use a code like this one to run a script:
^+k::
{
    Gosub, MySub
}
Return
MySub:
{
    Send, +{Enter}
    Loop, read, C:\MyFile.txt
    {
        temp = %A_LoopReadLine%
        Send, %temp%
        Send, +{Enter}
    }
}
Return
MyFile.txt is a simple text file where sometimes the "plus" symbol (+) is used together with normal letters and numbers.
Despite this, when I run the hotkey on an empty document, either in Notepad or on a blank Microsoft Word page, every + is replaced by an underscore (_), an exclamation mark (!) or a question mark (?). I've seen one replaced by a dollar sign ($), too.
I tried to debug it printing on screen a message box with
MsgBox, %temp%
before sending text and it shows the original content of MyFile.txt perfectly.
Thus the issue should be on Send rather than on file reading.
The content of my file is something like this (repeated for about 20 rows more):
+---------------------------------
120001267381 ~ TEXT 0 10/20/18 VARIABLE word text -> numbers: 17,000 x 108.99 | 109.26 x 15,000 /// number = +5.500% some text
+---------------------------------
120001267381 ~ TEXT 0 10/20/18 VARIABLE word text -> numbers: 17,000 x 108.99 | 109.26 x 15,000 /// number = +5.500% some text
+---------------------------------
120001267381 ~ TEXT 0 10/20/18 VARIABLE word text -> numbers: 17,000 x 108.99 | 109.26 x 15,000 /// number = +5.500% some text
+---------------------------------
120001267381 ~ TEXT 0 10/20/18 VARIABLE word text -> numbers: 17,000 x 108.99 | 109.26 x 15,000 /// number = +5.500% some text
+---------------------------------
What can be the cause of this?
Found the answer: Send treats each + symbol read from my file as a press of the Shift key, so the output is altered by that key press instead of containing the original symbol from the file.
To send the original content of my file without triggering special keys, I have to use SendRaw instead of Send, as in this example:
^+k::
{
    Gosub, MySub
}
Return
MySub:
{
    Send, +{Enter}
    Loop, read, C:\MyFile.txt
    {
        temp = %A_LoopReadLine%
        SendRaw, %temp%
        Send, +{Enter}
    }
}
Return
Here's an updated version that pastes using CTRL-V instead of Send to "retype" rows of data:
^+k::
{
    Gosub, MySub
}
Return
MySub:
{
    Send, +{Enter}
    Loop, read, C:\MyFile.txt
    {
        temp = %A_LoopReadLine%
        Clipboard = %temp% ; Write to clipboard
        Send, ^v+{enter} ; Paste from clipboard
        Sleep 10
        ; Short delay so it doesn't try to paste again before the clipboard has changed
        ; This check can get a lot more complex, but just increase it if 10 doesn't work
    }
}
Return