Edit String in library archive (filename.a) - c

I have a compiled C library, for example filelib.a, and I want to edit a string inside it because the C source code has been deleted from my PC. The file filelib.a contains the string "article seen".
If I run grep:
$ grep -R "/etc/resolv.conf" *
Binary file filelib.a matches
Binary file filelib.so matches
So the string "/etc/resolv.conf" appears in both filelib.a and filelib.so.
How can I edit and replace a string in the binary files filelib.a and filelib.so? For example, I want to replace the string "/etc/resolv.conf" with "/system/etc/resolv.conf".
I edited the file with the Bless hex editor, but when I link against the modified library I get this error:
could not read symbols: Malformed archive
collect2: error: ld returned 1 exit status
I'm using Ubuntu Linux.
Thanks.

If you really don't have the slightest chance to obtain or recover the source code, and the new string is as long as or shorter than the original one, you can open the archive in a hex editor, binary-patch the string, and pad with zeroes if it's shorter than before (there must always be at least one terminating zero byte).
If you want to change the string to something longer, that's not easy - your best chance would be perhaps to extract the archive, disassemble the object file in which you want to make changes, change the assembly, then reassemble it and use ar to update the modified object file in the library.

As long as the string you want to put in is shorter than or equal in length to the one in the binary file, you can just use a hex editor to substitute the string and replace any remaining characters with \0.

I believe the Bless Hex Editor should do the job for you.
Just make sure that you do not change the length of the file. It may be possible to use a shorter string than the old one if you insert a '\0' terminator, but that depends on how the program uses the string, so I'd recommend against it.

Related

Is it possible to prevent adding BOM to output UTF-8 file? (Visual Studio 2005)

I need some help.
I'm writing a program that opens two source files in UTF-8 encoding without a BOM. The first contains English text and some other information, including an ID. The second contains only the string ID and a translation. The program replaces the English text of every string from the first file with the Russian translation from the second one and writes the resulting strings to an output file. Everything seems to work, but a BOM appears in the destination file, and I want to create the file without a BOM, like the source files.
I open the files with the fopen function in text mode with ccs=UTF-8,
read each string with the fgetws function into a wchar_t buffer,
and write it with the fputws function to the output file.
Don't use text mode, don't use the MS ccs= extension to fopen, and don't use fputws. Instead use fopen in binary mode and write the correct UTF-8 yourself.

Program to compile files in a directory in openedge

Could someone help me write a program that compiles all the files in a directory and reports any errors? The program has to get the list of all files under the folder with their full paths, store it in a temp-table, and then loop through the temp-table and compile the files.
Below is a very rough start.
Look for more info around the COMPILE statement and the COMPILER system handle in the online help (F1).
Be aware that compiling requires you to have a developer license installed. Without it the COMPILE statement will fail.
DEFINE VARIABLE cDir AS CHARACTER NO-UNDO.
DEFINE VARIABLE cFile AS CHARACTER NO-UNDO FORMAT "x(30)".
ASSIGN
    cDir = "c:\temp\".
INPUT FROM OS-DIR(cDir).
REPEAT:
    IMPORT cFile.
    IF cFile MATCHES "*..p" THEN DO:
        COMPILE VALUE(cDir + cFile) SAVE NO-ERROR.
        IF COMPILER:ERROR THEN DO:
            DISPLAY
                cFile
                COMPILER:GET-MESSAGE(1) FORMAT "x(60)"
                WITH FRAME frame1 WIDTH 300 20 DOWN.
        END.
    END.
END.
INPUT CLOSE.
This was too long to paste into a comment: INPUT FROM OS-DIR returns all of the files and directories under a directory. You can use this information to keep going down the directory tree and find all subdirectories.
OS-DIR documentation:
Sometimes, rather than reading the contents of a file, you want to read a list of the files in a directory. You can use the OS-DIR option of the INPUT FROM statement for this purpose.
Each line read from OS-DIR contains three values:
* The simple (base) name of the file.
* The full pathname of the file.
* A string value containing one or more attribute characters. These characters indicate the type of the file and its status.
Every file has one of the following attribute characters:
* F - Regular file or FIFO pipe
* D - Directory
* S - Special device
* X - Unknown file type
In addition, the attribute string for each file might contain one or more of the following attribute characters:
* H - Hidden file
* L - Symbolic link
* P - Pipe file
The tokens are returned in the standard ABL format that can be read by the IMPORT or SET statements.

C program for reading doc, docx, pdf

I want to write a program in C (only C, not C++ or Java) that reads doc, docx, and pdf files, and make it available on GitHub for anyone who needs that code. I started with the .doc format. I found that if I open a .doc file in Notepad, it shows all the text, just with some extra content that is easy to trim. So I wrote a simple C program to read a .doc file in both 'r' and 'rb' mode, but both times it gives me only 5-9 characters of the file, and those are not readable either. I don't know why this is happening. Any comment or discussion would be very helpful.
Here is the link to the source code on GitHub. Please help me complete all three formats.
To answer your specific question, the reason your little application stops reading is because it mistakenly thinks there is an EOF character in your file.
Look at your code:
char ch;
int nol=0, not=0, nob=0, noc=0;
FILE *fp;
fp = fopen("file.doc","rb");
while(1)
{
    ch = fgetc(fp);
    if(ch==EOF)
    {
        break;
    }
You store the result of fgetc(fp) in a variable of type char, which holds a single byte. However, fgetc deliberately returns an int, not a char.
fgetc returns each byte as a non-negative value (0 to 255 with 8-bit bytes), except at the end of the file or on a read error, when it returns EOF, a negative value that is commonly -1.
If you read a byte of value 255 and store it in an int, everything is OK: it is stored as 255 and your loop continues. If you store it in a char and char is signed on your platform, 255 becomes -1, which compares equal to EOF, and your loop stops early.
Don't expect to get anywhere with this idea. .doc is a huge binary file format that is inhumanly complicated to parse. With that said, Cubia mentioned the offset where the text section of the document starts. I'm not familiar with the details of the format, but if the raw text is contained in one location, use fseek to get at it and stop when you reach the end. This won't be the case for the other formats because they are very different.
.docx and .pdf should be easier to parse because they are more modern formats. If you want to read anything from a docx you need to read from a zip file with a ton of xml in it and use a parser to figure out which text you want.
.pdf should be the easiest of the three because you might be able to find a library out there that can almost do what you want.
As for why you are getting strange output from your program, remember that .doc is a binary format and the vast majority of the data is garbage from your perspective. Dumping it to the terminal will yield readable text but also a bunch of control characters that can confuse your terminal.
As a last note: don't try to read docx files directly using fread. They are compressed, so you likely won't recover the text unaltered. Take a look at libarchive. Also expect to have to read the format specifications. docx is Microsoft's Office Open XML format, a ZIP package of XML files, and is not the same as the OpenOffice (OpenDocument) format. See the format specification and the PDF specification documents (there are multiple versions).
Think of a .doc file as a text file with extra non-printable characters before, in the middle of, and after your content. These non-printable characters define special formatting, metadata, and other information.
With this said, all .doc files follow a certain structure.
If you open two different .doc files in a hex editor, you will notice that the text content of both files starts at an offset of 0xA00 (2560 bytes) from the beginning of the file. This means that when you open your file initially, you can ignore the first 2560 bytes of the file (take a look at the fseek() function).
From this point on, you can read the contents of your file until you reach '\0'.
I have not looked at the layout of a .pdf or a .docx file, but you can open both in a hex editor and figure out what pattern you can use to isolate the important contents of the files.
Hope this helps.
EDIT : You can always find documentation on the different file formats that you want to manipulate. Here are the specifications of the PDF file type :
http://www.adobe.com/devnet/pdf/pdf_reference.html
http://wwwimages.adobe.com/content/dam/Adobe/en/devnet/pdf/pdfs/PDF32000_2008.pdf

Do binary files have encoding? Confused

Suppose I write the following C program and save it in a text file called Hello.c
#include<stdio.h>
int main()
{
printf("Hello there");
return 0;
}
The Hello.c file will probably get saved in a UTF8 encoded format.
Now, I compile this file to create a binary file called Hello
Now, this binary file should in some way store the text "Hello there". The question is what encoding is used to store this text?
As far as I'm aware, vanilla C doesn't have any concept of encoding, although if you correctly keep track of multi-byte characters, you can probably use an encoding. By default, ASCII is used to map characters to single-byte characters.
You are correct about the string "Hello there" being stored in the executable itself. The string literal is put into global memory and replaced with a pointer in the call to printf, so you can see the string literal in the data segment of the binary.
If you have access to a hex editor, try compiling your program and opening the binary in the editor. Here is a screenshot from when I did this. You can see that each character of the string literal is represented by a single byte, and the string is followed by a 0 byte (NUL). This is ASCII.

Replacing a word in a file, using C

How do I replace a word in a file with another word using C?
For example, I have a file which contains:
my friend name is sajid
I want to replace the word friend with grandfather, such that the file is changed to:
my grandfather name is sajid
(I am developing on Ubuntu Linux.)
Update:
I am doing file handling in C. I have created a .txt file and written some data into it, but as my program progresses I have to search for some text and replace it with other words. The problem I am facing is this. Suppose my file contains
"I bought apple from the market"
If I replace apple with pineapple, then because apple has 5 characters and pineapple has 9, the file ends up as
"I bought pineapple m the market"
It also has affected the words written after apple.
I have to do it using C, not a command line script.
How about using the existing Linux sed program?
sed -i 's/friend/grandfather/' filename
That will replace friend with grandfather in the existing file. Make a copy first if you want to keep the original!
Edit:
Alternatively, load the file into an STL string, replace 'friend' with 'grandfather' using a technique such as this, and save the new string into a file.
As you've realized, you won't be able to make the change in place in the original file.
The easy solution is to read each string from the original file and write it to standard output, making the replacement as necessary. In pseudocode:
open original file
while not at end of original file
    get next string from original file
    if string == "friend"
        write "grandfather" to standard output
    else
        write string to standard output
end while
You can then redirect the output to another file when you execute it, similar to how sed and awk work.
Alternatively, you can create a destination file in the code and write to it directly. If you need to replace the original file with the new file, you can use the remove() and rename() library functions declared in <stdio.h>.
