I'm trying to implement the FTP commands GET and PUT through a UNIX socket for file transfer, using the usual functions like fread(), fwrite(), send() and recv().
It works fine for text files but fails for binary files (diff says: "binary files differ").
Any suggestions regarding the following will be appreciated:
Are there any specific commands to read and write binary data?
Can diff be used to compare binary files?
Is it possible to send the binary parts in chunks of memory?
The FTP protocol has two modes of operation: text (ASCII) and binary.
Try it in any FTP client; the client commands for switching between them are, I believe, ascii and bin, which correspond to the raw protocol commands TYPE A and TYPE I. From what I recall, text mode only affects CR/LF pairs.
If you're reading from a file and then writing the file's data to the socket, make sure you open the file in binary mode ("rb" with fopen()); on some platforms the C library translates line endings in text mode, which corrupts binary data.
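This also covers the chunking question: reading and sending in fixed-size chunks is exactly how this is usually done. A minimal sketch, assuming sockfd is an already-connected data socket (send_file is a name I made up for illustration):

#include <stdio.h>
#include <sys/socket.h>

/* Send a file over a connected socket in binary-safe, fixed-size chunks. */
int send_file(int sockfd, const char *path)
{
    char buf[4096];
    size_t n;
    FILE *f = fopen(path, "rb");       /* "rb": no line-ending translation */
    if (!f)
        return -1;
    while ((n = fread(buf, 1, sizeof buf, f)) > 0) {
        char *p = buf;
        while (n > 0) {                /* send() may write fewer bytes than asked */
            ssize_t sent = send(sockfd, p, n, 0);
            if (sent < 0) {
                fclose(f);
                return -1;
            }
            p += sent;
            n -= (size_t)sent;
        }
    }
    fclose(f);
    return 0;
}

The receiving side mirrors this with recv() and fwrite(), looping until recv() returns 0.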
Yes, diff can be used to compare binary files, typically with the -q option to suppress the actual printing of differences, which rarely makes sense for binary files. You can also use md5 or cmp if you have them.
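For example, any of the following will tell you whether two files are byte-identical (the file names are placeholders):

diff -q file1.bin file2.bin
cmp file1.bin file2.bin
md5sum file1.bin file2.bin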
I have 5 files generated by a Fortran code like this:
longP=8
OPEN(unit=20,FILE="GMt_2.dat",ACTION="write",ACCESS='Direct',RECL=longP)
count1=1
do J=K,fact
    READ(10,*)XA,XB,YA,YB,ZA,ZB,rho
    call Grv('f',Nx,Ny,dimg,Dx,Dy,XO,YO,XA,XB,YA,YB,ZA,ZB,rho,G,elev,Svec)
    do I=1,dimg
        WRITE(UNIT=20,rec=count1)Svec(I)
        count1=count1+1
    end do
    WRITE(*,*)J
end do
dim(2)=J-1
fact=fact+fact1
call flush(20)
CLOSE(20)
This returned an unreadable file format; my professor said "it's binary, machine code". My goal here is to concatenate the information in those 5 files into one array to perform some processing. How can I achieve this?
The code you show writes the data using unformatted I/O and direct access. You'll need to read it using unformatted I/O as well. You could use direct access or, and this would be my recommendation, stream access (ACCESS='STREAM' in the OPEN statement). Open each file in sequence, read the data, and then write it using the same mechanism to your single file. Your question is too ambiguous to allow a more detailed response.
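A minimal sketch of that approach, assuming each record is one 8-byte real (matching your RECL=8, counted in bytes as gfortran does) and that the five files follow a GMt_<n>.dat naming pattern; both are assumptions on my part:

program concat_files
    implicit none
    integer :: iu, ou, ios, i
    real(8) :: val
    character(len=20) :: fname

    ! one output file, also written with stream access
    open(newunit=ou, file='all.dat', access='stream', form='unformatted', status='replace')
    do i = 1, 5
        write(fname, '(a,i0,a)') 'GMt_', i, '.dat'   ! assumed naming pattern
        open(newunit=iu, file=fname, access='stream', form='unformatted', status='old')
        do
            read(iu, iostat=ios) val    ! one 8-byte value at a time
            if (ios /= 0) exit          ! end of file
            write(ou) val
        end do
        close(iu)
    end do
    close(ou)
end program concat_files

Be aware that some compilers count RECL in 4-byte words rather than bytes, so check what your RECL=8 actually produced before assuming 8-byte records.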
I have a file that holds manufacturing orders for a machine.
I would like to read the content of this file and edit it, but when I open it in a text editor, e.g. Notepad++, I get a bunch of weird characters:
xÚ¥—_HSQÀo«a)’êaAXŽâê×pD8R‰¬©s“i+ƒ´#¡$
-þl-ó/ÓíºIúPôàƒHˆP–%a&RÎÈn÷ü¹·;Ú;ç<ìòÝÃý}¿ó}‡{϶«rWg>˜›ãR‡)Çn0³Ûf³yÎW[5–šw½ÇRW{ñ’rO6¹ŽŸp¦ÙœcÏ.9yÀnýg
)Ë—e90ejÕø£rC. f¦}3ËŒ˜hü”å1g[…ø±ú ÜJøz®‹˜YfÈ,4`ŽKÉ—ù“ÔË¿d„þlG3#=˜Ž´+hF¬¦£€«šm¿áØ
ïÖµv‡ËpíÍ~™‡Aù
šëÈÚ]ÿç™DŒÉFØ ïƒæsij ¦y=-74Æ/t=ÕŠr\˜š»Âä‰Ý¨žã΢
dz·à‡'fœ½yâ½4qåPjácòÄŒeÊhñ“ý™ÙÎÕ÷5ôlñ=˜Õ{ú;ø=Û;4OêYä>Ìpxbæâ'è"oëB×1gQ9“'¹]Ô³’Ô³ø!ÌózÞyŸõžÓIŽù*&OÌXPÕ"ŽWžpíOÌè‚Þ3Òr0{Ž†R=_?…/¼žÞ0,ê=/?£ûÓËîy“2Z<ij³[ËÁì™÷–ôžÎ’Ããa÷<Maêéí…¼ž}©žYýZ-˜=”á¤}π>3°¢÷œ$ïè‰3ìž«ƒÄs¿—xnŒÀ*¯gi$ÕómDËÁìùIeоû‡À¬?3°x¾"~ª§c˜öÝÇî颌°›x¾Fßb>Ï}QXÓ{öFi-êÙßóR”œe^Ñ÷ü‘¿g[Lë ŽwJZϘë¹3”³L©gH‚,^Ïe 2ôžWGøëÙ2‚Î
øœL¾ÅqÈäõ,ýç\œË3¾þeྗ&`Ϻ<KÒf“’»ðù]í‰ãžU^wèþåÔÖy”H}ò•6ø6
It looks like the file is encoded.
Any idea how to find the encoding and make the file readable and editable?
It's binary and probably encoded, so without knowledge of the data structure you can't do much; it comes down to reverse engineering with a hex editor, trying changes and checking what they do.
It isn't impossible, though. If you can change the data in a known way (e.g. change the number of orders from 1 to 2) and export to a file, you can compare the binary values and find which byte holds that number. Of course, if it is encrypted and you don't know the key, it's easier to find another way.
For further reading, check this out: https://en.wikibooks.org/wiki/Reverse_Engineering/File_Formats
If you've got access to a Linux box, why not use
hexdump -C <filename>
You will get a much better insight into how the file is structured than by using a text editor.
There are also many hexdump equivalents on Windows.
I'm working on a project, and I made a C program that reads the date, time, and wave height from a .txt file stored on my computer, converts the date and time to GPS time for use at a scientific research institution, and outputs the GPS time and wave height to the screen. However, the text file I'm actually working with is stored at http://www.ndbc.noaa.gov/data/realtime2/SPLL1.txt . Is there any way my C program could open the text file from the web address rather than from my local hard drive?
FYI: to access the file on my computer I used fopen, and to interact with the data it contains I used a combination of fgets and fscanf.
It is much more involved to get a web resource than to read a file from disk, but you can absolutely do it, for example by using a library such as libcurl.
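A minimal sketch using libcurl, assuming you download the file to disk first so your existing fopen/fgets/fscanf code can stay unchanged (build with -lcurl; the local file name is my choice):

#include <stdio.h>
#include <curl/curl.h>

/* libcurl write callback: append each received chunk to a FILE* */
static size_t write_cb(char *data, size_t size, size_t nmemb, void *userp)
{
    return fwrite(data, size, nmemb, (FILE *)userp);
}

int main(void)
{
    FILE *out = fopen("SPLL1.txt", "wb");
    if (!out)
        return 1;
    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURL *curl = curl_easy_init();
    if (curl) {
        curl_easy_setopt(curl, CURLOPT_URL,
                         "http://www.ndbc.noaa.gov/data/realtime2/SPLL1.txt");
        curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_cb);
        curl_easy_setopt(curl, CURLOPT_WRITEDATA, out);
        CURLcode res = curl_easy_perform(curl);
        if (res != CURLE_OK)
            fprintf(stderr, "download failed: %s\n", curl_easy_strerror(res));
        curl_easy_cleanup(curl);
    }
    curl_global_cleanup();
    fclose(out);
    return 0;
}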
An alternative strategy is to make components and tie them together with bash or other scripting. Your C program could for example read from standard input, and you could make a bash script something like this:
curl http://www.ndbc.noaa.gov/data/realtime2/SPLL1.txt | ./the_program
This way, you could keep your core C program simpler.
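On the C side, switching from a named file to standard input is a small change; a sketch of that half of the pipeline (the parsing is left as a placeholder):

#include <stdio.h>

int main(void)
{
    char line[256];
    /* read line by line from stdin instead of an fopen()ed file */
    while (fgets(line, sizeof line, stdin) != NULL) {
        /* replace fscanf(file, ...) with sscanf(line, ...) here */
        fputs(line, stdout);
    }
    return 0;
}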
I want to read a 300 MB .gz file (text.gz) and search for a pattern in it. I opened the file in binary mode using fopen with "rb" and stored it in a buffer. When I search for a pattern that I know exists in the text, the result is wrong. When I debug the program, the elements of the buffer are different from what I expect. Do I have to read and store this kind of file in some other way?
You might try using zlib and gzread to read the file. The data on disk is gzip-compressed, which is why the raw bytes in your buffer don't match the text you expect; gzread decompresses transparently as it reads.
http://zlib.net/manual.html
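A minimal sketch; "pattern" and the chunk size are placeholders, and note that a match straddling two chunks would be missed by this simple version:

#include <stdio.h>
#include <string.h>
#include <zlib.h>

int main(void)
{
    char buf[4096];
    gzFile f = gzopen("text.gz", "rb");    /* handles gzip and plain files */
    if (!f) {
        fprintf(stderr, "cannot open text.gz\n");
        return 1;
    }
    int n;
    /* gzread returns decompressed bytes, so buf holds plain text */
    while ((n = gzread(f, buf, sizeof buf - 1)) > 0) {
        buf[n] = '\0';
        if (strstr(buf, "pattern") != NULL) {
            puts("found");
            break;
        }
    }
    gzclose(f);
    return 0;
}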
Try this.
gunzip -c file.gz | grep <pattern>
If the program is exiting and failing to read the file, a really common problem is that the file is still open in Notepad or whatever else is using it, and the file I/O fails because it can't access the file. Make sure you don't have anything with that file open before you test your program.
I'm intending to create a programme that can permanently follow a large dynamic set of log files to copy their entries over to a database for easier near-realtime statistics. The log files are written by diverse daemons and applications, but the format of them is known so they can be parsed. Some of the daemons write logs into one file per day, like Apache's cronolog that creates files like access.20100928. Those files appear with each new day and may disappear when they're gzipped away the next day.
The target platform is an Ubuntu Server, 64 bit.
What would be the best approach to efficiently reading those log files?
I could think of scripting languages like PHP that either open the files themselves and read new data, or use system tools like tail -f to follow the logs, or other runtimes like Mono. Bash shell scripts probably aren't well suited for parsing the log lines and inserting them into a database server (MySQL), not to mention easy configuration of my app.
If my programme reads the log files itself, I'd think it should stat() the file once a second or so to get its size, and open the file when it has grown. After reading the file (which should hopefully only return complete lines) it could call tell() to get the current position, and next time seek() directly to the saved position to continue reading. (These are C function names, but I actually wouldn't want to do this in C; Mono/.NET or PHP offer similar functions as well.)
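To make that concrete, here is a minimal C sketch of the stat()/tell()/seek() polling loop just described (the file name is only an example; a real version would also have to handle a partial last line):

#include <stdio.h>
#include <unistd.h>
#include <sys/stat.h>

/* Poll one file: if it grew since last time, read the new data. */
static long follow(const char *path, long last_pos)
{
    struct stat st;
    if (stat(path, &st) != 0 || st.st_size <= last_pos)
        return last_pos;               /* unchanged, or gone */
    FILE *f = fopen(path, "r");
    if (!f)
        return last_pos;
    fseek(f, last_pos, SEEK_SET);      /* resume at the saved position */
    char line[4096];
    while (fgets(line, sizeof line, f) != NULL)
        fputs(line, stdout);           /* parse and insert into the DB here */
    long pos = ftell(f);               /* remember for the next round */
    fclose(f);
    return pos;
}

int main(void)
{
    long pos = 0;
    for (;;) {
        pos = follow("access.20100928", pos);
        sleep(1);
    }
}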
Is that constant stat()ing of the files and subsequent opening and closing a problem? How would tail -f do that? Can I keep the files open and be notified about new data with something like select()? Or does it always return at the end of the file?
In case I'm blocked in some kind of select() or external tail, I'd need to interrupt that every 1, 2 minutes to scan for new or deleted files that shall (no longer) be followed. Resuming with tail -f then is probably not very reliable. That should work better with my own saved file positions.
Could I use some kind of inotify (file system notification) for that?
If you want to know how tail -f works, why not look at its source? In a nutshell, you don't need to periodically interrupt or constantly stat() to scan for changes to files or directories: that's exactly what inotify does for you.
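A minimal sketch of an inotify watcher in C, assuming all the logs live in one directory (the path is a placeholder):

#include <stdio.h>
#include <unistd.h>
#include <sys/inotify.h>

int main(void)
{
    /* buffer aligned for struct inotify_event, as in the inotify(7) example */
    char buf[4096] __attribute__((aligned(__alignof__(struct inotify_event))));

    int fd = inotify_init();
    if (fd < 0) {
        perror("inotify_init");
        return 1;
    }
    /* watch a log directory for created, modified and deleted files */
    if (inotify_add_watch(fd, "/var/log/myapp",
                          IN_CREATE | IN_MODIFY | IN_DELETE) < 0) {
        perror("inotify_add_watch");
        return 1;
    }
    for (;;) {
        ssize_t len = read(fd, buf, sizeof buf);   /* blocks until events arrive */
        if (len <= 0)
            break;
        for (char *p = buf; p < buf + len; ) {
            struct inotify_event *ev = (struct inotify_event *)p;
            if (ev->len > 0)
                printf("event 0x%x on %s\n", ev->mask, ev->name);
            p += sizeof(struct inotify_event) + ev->len;
        }
    }
    return 0;
}

On a modification event you would then stat() and read only the affected file, instead of polling everything every second.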