Extracting and Displaying data from text file in C - c

I am trying to write a C program that takes two arguments: a flag, one of [-url | -phone | -email], and a text file that the user will download from a website.
After the user inputs the flag and the name of the text file, the program is supposed to extract and display the contents based on the regular expression I have developed.
For example, for URL the regex is
/^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$/
I am having a hard time figuring out how I can implement such a task. Do I need to use fork()? How exactly can I read the data from the text file and display back results based on the regex?
Here is the example OUTPUT
$ gcc -o minor1 minor1.c
$ ./minor1
Usage:
./minor1 [-url | -email | -phone] input_file
URL CASE SCENARIO:
$ ./minor1 -url index.html
https://www.web.edu/
...
http://webpreview.web.edu/
...
http://policy.web.edu/
Based on the flag and the input file, this is what it is supposed to return.

You can use curl to download the file from the web. See these related questions:
How to download a file from a URL in C, as a browser would?
C program for downloading files with curl
Then you can iterate over the data and match it against the regex for URLs, emails, or phone numbers.
Try to come up with some code yourself. If you run into a problem, post what you tried, the snippet of code that failed, and your own thoughts on why it failed.
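As a starting point, here is a minimal sketch of the "iterate and parse" step using the POSIX <regex.h> API. No fork() is needed for this; a single process can read the file line by line and print each match. The pattern below is an assumption, not the regex from the question: POSIX extended regular expressions have no \d or \w, so the character classes are spelled out, and the ^ and $ anchors are dropped so that URLs embedded in the middle of a line of HTML can be found. The names URL_PATTERN, extract_from_line and extract_from_file are hypothetical.

```c
#include <regex.h>
#include <stdio.h>
#include <string.h>

/* Assumed, simplified URL pattern in POSIX ERE syntax (no \d or \w). */
static const char *URL_PATTERN =
    "https?://[A-Za-z0-9.-]+\\.[A-Za-z]{2,6}(/[A-Za-z0-9_./-]*)?";

/* Writes every match of the compiled regex found in `line` to `out`,
   one match per output line. Returns the number of matches written. */
int extract_from_line(const regex_t *re, const char *line, FILE *out)
{
    regmatch_t m;
    const char *p = line;
    int count = 0;

    while (*p != '\0' && regexec(re, p, 1, &m, 0) == 0) {
        if (m.rm_eo == m.rm_so) {
            /* Empty match: step forward so the loop cannot spin forever. */
            if (p[m.rm_eo] == '\0')
                break;
            p += m.rm_eo + 1;
            continue;
        }
        fprintf(out, "%.*s\n", (int)(m.rm_eo - m.rm_so), p + m.rm_so);
        p += m.rm_eo;   /* continue searching after this match */
        count++;
    }
    return count;
}

/* Opens `path`, scans it line by line and writes all matches to `out`.
   Returns the total number of matches, or -1 if the file cannot be
   opened or the pattern does not compile. Lines longer than the buffer
   are read in pieces, so a match spanning a piece boundary is missed. */
int extract_from_file(const char *path, const char *pattern, FILE *out)
{
    regex_t re;
    char line[4096];
    int total = 0;
    FILE *in = fopen(path, "r");

    if (in == NULL)
        return -1;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_ICASE) != 0) {
        fclose(in);
        return -1;
    }
    while (fgets(line, sizeof line, in) != NULL)
        total += extract_from_line(&re, line, out);

    regfree(&re);
    fclose(in);
    return total;
}
```

From main, after checking that argc is 3 and that argv[1] is -url, one could call extract_from_file(argv[2], URL_PATTERN, stdout). The -email and -phone cases would only need a different pattern string.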

Related

Open website with C and search through source code

I made a program in C that is supposed to try a bunch of numbers on a URL, like this:
example.com/cbc001 ...com/cbc002
The program runs through the numbers from 000 to 999 and is supposed to append them after the 3 letters in the URL. I managed to get the program to run through all numbers, 000-999.
Now my problem is that I have no idea how to open a website in C. I don't really want to open it in a browser; I just need the program to try all those URLs (".com/abc990") and then check the source of each webpage for a certain word.
How could I approach this?
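One possible way to structure this, sketched under assumptions: build each candidate URL with snprintf, fetch the page source into a buffer, and search that buffer with strstr. The fetch step is left out here; libcurl's easy interface (curl_easy_init, curl_easy_setopt with CURLOPT_URL and CURLOPT_WRITEFUNCTION, curl_easy_perform) is one common way to do it without a browser. The helper names build_url and page_contains are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Builds "<base><prefix>NNN" into buf, e.g. "example.com/cbc007".
   Returns 0 on success, -1 if n is outside 0..999 or buf is too small. */
int build_url(char *buf, size_t size, const char *base,
              const char *prefix, int n)
{
    if (n < 0 || n > 999)
        return -1;
    int written = snprintf(buf, size, "%s%s%03d", base, prefix, n);
    return (written < 0 || (size_t)written >= size) ? -1 : 0;
}

/* Returns 1 if the page source contains `word`, otherwise 0.
   `page_source` is assumed to be a NUL-terminated buffer that was
   filled by whatever download mechanism is used. */
int page_contains(const char *page_source, const char *word)
{
    return strstr(page_source, word) != NULL;
}
```

A driver would loop n from 0 to 999, call build_url, download that URL into a buffer, and report the URLs for which page_contains returns 1.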

How to read content of unknown file

I have a file that holds manufacturing orders for a machine.
I would like to read the content of this file and edit it, but when I open it in a text editor, e.g. Notepad++, I get a bunch of weird characters:
xÚ¥—_HSQÀo«a)’êaAXŽâê×pD8R‰¬©s“i+ƒ´#¡$
-þl-ó/ÓíºIúPôàƒHˆP–%a&RÎÈn÷ü¹·;Ú;ç<ìòÝÃý}¿ó}‡{϶«rWg>˜›ãR‡)Çn0³Ûf³yÎW[5–šw½ÇRW{ñ’rO6¹ŽŸp¦ÙœcÏ.9yÀnýg
)Ë—e90ejÕø£rC. f¦}3ËŒ˜hü”å1g[…ø±ú ÜJøz®‹˜YfÈ,4`ŽKÉ—ù“ÔË¿d„þlG3#=˜Ž´+hF¬¦£€«šm¿áØ
ïÖµv‡ËpíÍ~™‡Aù
šëÈÚ]ÿç™DŒÉFØ ïƒæsij  ¦y=-74Æ/t=ÕŠr\˜š»Âä‰Ý­¨žã΢
dz·à‡'fœ½­yâ½4qåPjácòÄŒeÊhñ“ý™ÙÎÕ÷5ôlñ=˜Õ{ú;ø=Û;4OêYä>Ìpxbæâ­'è"oëB×1gQ9“'¹]Ô³’Ô³ø!ÌózÞyŸõžÓIŽù*&OÌXPÕ"ŽWžpíOÌè‚Þ3Òr0{Ž†R=_?…/¼žÞ0,ê=/?£ûÓËîy“2Z<ij³[ËÁì™÷–ôžÎ’Ããa÷<Maêéí…¼ž}©žYýZ-˜=­”á¤}π>3°¢÷œ$ïè‰3ìž«ƒÄs¿—xnŒÀ*¯gi$ÕómDËÁìùIeоû‡À¬?3°x¾"~ª§c˜öÝÇî颌°›x¾Fßb>Ï}QXÓ{öFi-êÙßóR”œe^Ñ÷ü‘¿g[Lë ŽwJZϘë¹3”³L©gH‚,^Ïe 2ôžWGøëÙ2‚Î
øœL¾ÅqÈäõ,ýç\œË3¾þeྗ&`Ϻ<KÒf“’»ðù]í‰ãžU^wèþåÔÖy”H}ò•6ø6
It looks like the file is encoded.
Any idea how to find the encoding and make the file readable and editable?
It's binary and probably encoded, so without knowledge of the data structure you can't do much beyond reverse engineering: change something, check what changed in the file, and inspect it with a hex editor.
It isn't impossible, though. If you can change the data in a known way (e.g. change the number of orders from 1 to 2) and export to a file, you can compare the binary values and find which byte holds that number. Of course, if it is encrypted and you don't know the key, it's easier to find another way.
For further reading, see https://en.wikibooks.org/wiki/Reverse_Engineering/File_Formats
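The "change one value and compare" idea can be sketched in C as follows. This is only an illustration, assuming both exported files have already been read into memory; the function name diff_offsets is hypothetical. It reports the offsets at which two buffers differ, which is where a changed field is likely to live if the format is not compressed or encrypted (in those cases a single change tends to alter many bytes).

```c
#include <stddef.h>
#include <stdio.h>

/* Compares two buffers over their common length and stores up to
   max_out differing offsets in `offsets`. Returns the total number of
   differing positions, which may be larger than max_out. */
size_t diff_offsets(const unsigned char *a, size_t a_len,
                    const unsigned char *b, size_t b_len,
                    size_t *offsets, size_t max_out)
{
    size_t n = a_len < b_len ? a_len : b_len;
    size_t found = 0;

    for (size_t i = 0; i < n; i++) {
        if (a[i] != b[i]) {
            if (found < max_out)
                offsets[found] = i;   /* remember where the bytes differ */
            found++;
        }
    }
    return found;
}
```

Printing each reported offset next to both byte values in hex gives a small list of candidate positions to inspect in a hex editor.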
If you've got access to a Linux box why not use
hexdump -C <filename>
This gives you much better insight into how the file is structured than a text editor does.
There are also many "hexdump" equivalents on Windows.

BATCH file: Scan an input result for numbers

I'm working on a bat file that redirects streaming media content to VLC. My goal is to automate all of the following steps:
Open the program that bridges the video stream to VLC (DONE)
The program gives a list of available resolutions to use, as below:
[cli][info] Found matching plugin ustreamtv for URL
blablabla.com/channel/test
Available streams: 480p+ (best), 480p+_alt_akamai,
480p+_alt_highwinds, mobile_240p (worst)
Now what I need to do is find a way to "scan" this information for numbers in order to automate the following string
"C:\Program Files (x86)\Livestreamer\livestreamer.exe"
ublablabla.com/channel/test %quality%"p+_alt_highwinds" > nul
%quality% is there only because I'm currently typing in manually whatever resolution comes out in "Available streams:" to complete the string.
Is there any way I can filter this result, for example by detecting a 3-digit number before p+_alt_highwinds, and automatically complete the string?
I'm sorry if this question looks like a complete mess.

libwebsockets: how to construct lextable for cookie field

I am writing a websocket server with libwebsockets. I need the Cookie field to validate the user. There is an array named lextable[] that is used to parse the HTTP headers, but I don't know how to modify lextable for the Cookie field.
OK, go to the "lib" directory in your libwebsockets directory. Find the "minilex.c" file and open it with your favorite text editor. At the beginning of the file you will see the "set" string array. You only need to add the line "Cookie: " at the end.
The next step is to compile "minilex.c" with the command gcc minilex.c -o minilex.
When the compilation is finished, run the binary and you will see console output consisting of a two-dimensional array with the hex codes of the letters and their positions.
Copy and paste it into the "parsers.c" file.

Python Eclipse output to an external text file

I have long-running code that outputs results every minute. I will be running it all night and will check the results in the morning.
Is there a way to write all the Console output to an external text file? I do not want to modify the code; I am just looking to direct all Console output to an external file. I am working in Eclipse. I tried the Run -> Run Configurations... -> Your Application -> Common tab -> File idea, but the output is horrible: no line breaks.
Is there a way to get the exact same output as the Console into a text file, all formatted nicely?
Many thanks in advance.
RS
You can write to a text file. If the file is in a specific directory, or if you know its name, your program can open it. For example, if your variable is words, you can write words = open("what your file is called", "w") and then, at the end of your code, write to the file with words.write(string).
