checking that a file is not truncated


I have downloaded many gz files from the EBI FTP server:
http://ftp.ebi.ac.uk/pub/databases/spot/eQTL/sumstats/
How can I check whether the files were truncated during the download (i.e. wget did not download the entire file because of a network problem)? Thanks.

As you can see, each directory on the server contains a file named md5sum.txt.
After downloading it next to your files, you can use a command like:
md5sum -c md5sum.txt
This will calculate the hashes and compare them with the values in the file.
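If md5sum is not available, the same check can be sketched in Python. This is a minimal sketch, assuming md5sum.txt uses the usual "hash, whitespace, filename" layout and that the listed names are relative to the directory holding the list; the function names md5_of and check_md5_list are made up for illustration.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_md5_list(list_path):
    """Return (filename, status) pairs for each entry of an md5sum-style list.

    status is "OK", "FAILED" (hash mismatch, e.g. a truncated file)
    or "MISSING" (file not found next to the list).
    """
    list_path = Path(list_path)
    base = list_path.parent
    results = []
    for line in list_path.read_text().splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        expected, name = parts
        name = name.lstrip("*")  # md5sum marks binary mode with a leading '*'
        target = base / name
        if not target.is_file():
            status = "MISSING"
        elif md5_of(target) == expected.lower():
            status = "OK"
        else:
            status = "FAILED"
        results.append((name, status))
    return results
```

Calling check_md5_list("md5sum.txt") in a download directory would then list every file whose hash does not match.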

You can use wget's spider mode to fetch just the response headers, for example
wget --spider http://ftp.ebi.ac.uk/pub/databases/spot/eQTL/sumstats/Alasoo_2018/exon/Alasoo_2018_exon_macrophage_naive.permuted.tsv.gz
gives output
Spider mode enabled. Check if remote file exists.
--2022-05-30 09:38:55-- http://ftp.ebi.ac.uk/pub/databases/spot/eQTL/sumstats/Alasoo_2018/exon/Alasoo_2018_exon_macrophage_naive.permuted.tsv.gz
Resolving ftp.ebi.ac.uk (ftp.ebi.ac.uk)... 193.62.193.138
Connecting to ftp.ebi.ac.uk (ftp.ebi.ac.uk)|193.62.193.138|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 645718 (631K) [application/octet-stream]
Remote file exists.
Length is the size of the remote file in bytes, so comparing it with the size of your local file tells you whether the download is complete.
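For many files, the size comparison can be scripted. The following is a rough sketch, assuming the server answers HEAD requests with a Content-Length header (as in the wget output above); remote_length and is_complete are hypothetical helper names.

```python
import os
import urllib.request


def remote_length(url, timeout=30):
    """Return the Content-Length reported for url by a HEAD request, or None."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        value = response.headers.get("Content-Length")
    return int(value) if value is not None else None


def is_complete(local_path, expected_bytes):
    """True if local_path exists and is exactly expected_bytes bytes long."""
    return (
        os.path.isfile(local_path)
        and os.path.getsize(local_path) == expected_bytes
    )
```

For the file above, is_complete("Alasoo_2018_exon_macrophage_naive.permuted.tsv.gz", 645718) should be true only for a full download. A matching size does not rule out corrupted content, so the checksum approach remains the stronger test.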
If you want to download any missing parts rather than merely check for completeness, take a look at the -c option. From the wget man page:
-c
--continue
Continue getting a partially-downloaded file. This is useful when you want to finish up a download started by a previous instance of
Wget, or by another program.(...)

Related

Watson Visual Recognition "Cannot execute learning task. : no classifier name given"

Getting cURL error: {"code":400,"error":"Cannot execute learning task. : no classifier name given"}
Getting the same result whether I use the beta GUI tool or a cURL entry:
curl -X POST \
-F "Airplanes_positive_examples=@Airplanes.zip" \
-F "Biking_positive_examples=@Biking.zip" \
-F "GolfPuttingGreens_positive_examples=@GolfPuttingGreens.zip" \
-F "name=AllJpegClassifier" \
"https://gateway-a.watsonplatform.net/visual-recognition/api/v3/classifiers?api_key={my-api-key}&version=2016-05-20"
I have read all previous SO questions for this problem and made sure of the following:
Classifier name is alphanumeric only
Zip filenames are alphanumeric only
Image filenames are alphanumeric with _ - . only
Zip files contain between 27 and 49 images each
All image files are the same format (JPEG)
All images conform to pixel size and file size limits

Your command looks fine, and when I try it with my API key and my own zip files, it works. So I suspect there is something in your zip files that the system is having trouble with. If you could provide the "owner" guid field (also called your instance-id), I could look into our logs to try to diagnose it. This is displayed when you do a GET /classifiers/{cid} of an existing classifier. Alternatively, you could let me know one of your other existing classifier_ids.
Another way would be if you could open a Bluemix support ticket and include copies of the zip files which you're using in this example. Then we can reproduce the problem.

What is the difference between hadoop -appendToFile and hadoop -put when used for continuously updating stream data in HDFS?

The following descriptions are taken from the classes in the Hadoop source code:
appendToFile
"Appends the contents of all the given local files to the
given dst file. The dst file will be created if it does not exist."
put
"Copy files from the local file system into fs. Copying fails if the file already exists, unless the -f flag is given.
Flags:
-p : Preserves access and modification times, ownership and the mode.
-f : Overwrites the destination if it already exists.
-l : Allow DataNode to lazily persist the file to disk. Forces
replication factor of 1. This flag will result in reduced
durability. Use with care.
-d : Skip creation of temporary file(<dst>._COPYING_)."
I am trying to regularly update a file in HDFS as it is being updated dynamically from a streaming source on my local file system.
Which one should I use, appendToFile or put, and why?

appendToFile modifies the existing file in HDFS, so only the new data needs to be streamed/written to the filesystem.
put rewrites the entire file, so the entire new version of the file needs to be streamed/written to the filesystem.
You should favor appendToFile if you are just appending to the file (i.e. adding logs to the end of a file). This function will be faster if that's your use case. If the file is changing more than just simple appends to the end, you should use put (slower but you won't lose data or corrupt your file).
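The append-only case can be sketched as follows: remember how many bytes have already been shipped and pipe only the new tail to appendToFile, which reads from stdin when the source is given as "-". This is a sketch under assumptions: the hdfs command is on the PATH, the cluster permits appends, and the local file only ever grows. The helper names and paths are hypothetical.

```python
import subprocess


def read_new_bytes(local_path, offset):
    """Return (data, new_offset) for bytes written to local_path after offset."""
    with open(local_path, "rb") as handle:
        handle.seek(offset)
        data = handle.read()
    return data, offset + len(data)


def append_to_hdfs(data, hdfs_path):
    """Pipe data to `hdfs dfs -appendToFile - <dst>` ('-' means read stdin)."""
    subprocess.run(
        ["hdfs", "dfs", "-appendToFile", "-", hdfs_path],
        input=data,
        check=True,
    )


def sync_once(local_path, hdfs_path, offset, append=append_to_hdfs):
    """Append any bytes added since offset and return the new offset.

    `append` is injectable so the logic can be exercised without a cluster.
    """
    data, new_offset = read_new_bytes(local_path, offset)
    if data:
        append(data, hdfs_path)
    return new_offset
```

A caller would keep the returned offset between runs, for example offset = sync_once("stream.log", "/data/stream.log", offset) on a timer. If the local file is ever rewritten or shortened, the stored offset is no longer valid and a full put -f is the safer choice.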

How do Thunar and MC decide how to open a file?

It seems that Thunar and Midnight Commander (and maybe other tools) don't use Mailcap to decide how to open a file. What do they use instead?
Background of the question: On my system, Thunar and Midnight Commander open all ODT files with Okular instead of LibreOffice.
I tried to debug this by checking ~/.mailcap and /etc/mailcap, which do contain Okular rules for ODT, but the LibreOffice (soffice) rules clearly take precedence.
I verified this by running mailcap directly on an ODT file:
run-mailcap --norun /tmp/example.odt
The output is exactly what I expect:
soffice --nologo --writer '/tmp/example.odt'
Also, if I run that command, LibreOffice is indeed started and opens the file.
So to my understanding, MC and Thunar should open ODT files with LibreOffice. But they use Okular. Why?

To answer my own question, these applications use xdg-open instead of run-mailcap.
And indeed the following command runs Okular instead of LibreOffice:
Command:
xdg-open /tmp/example.odt
I can verify the assigned MIME type with:
Command:
xdg-mime query filetype /tmp/example.odt
Output:
application/vnd.oasis.opendocument.text
Then, I can check which application is assigned to that MIME type:
Command:
xdg-mime query default application/vnd.oasis.opendocument.text
Output:
kde4-okularApplication_ooo.desktop
This explains the issue. When I uninstall Okular, it leads to the correct response:
Output:
libreoffice-writer.desktop
So there's something wrong in either the Okular or the LibreOffice package.
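The two xdg-mime queries above can be chained in a small script, which helps when checking several file types at once. This is only a sketch: it assumes xdg-mime is installed, and the names xdg_mime and default_handler are invented for illustration.

```python
import subprocess


def xdg_mime(*args):
    """Run xdg-mime with the given arguments and return its trimmed stdout."""
    result = subprocess.run(
        ["xdg-mime", *args],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def default_handler(path, query=xdg_mime):
    """Return (mime_type, desktop_file) that xdg-open would use for path.

    `query` is injectable so the chaining can be tested without xdg-utils.
    """
    mime_type = query("query", "filetype", path)
    desktop_file = query("query", "default", mime_type)
    return mime_type, desktop_file
```

On the system described above, default_handler("/tmp/example.odt") would be expected to report the Okular desktop file. As a per-user workaround that avoids uninstalling Okular, xdg-mime default libreoffice-writer.desktop application/vnd.oasis.opendocument.text should reassign the handler.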

How to know whether a particular video file (.mp4) is busy or not?

I have a media server application implemented with the help of Wowza (on Linux, CentOS). There are some mp4 files stored in my local directory. I am streaming these files whenever any client requests them. At some point in time, all these files need to be deleted from the local directory through a bash/python script. Before deleting, I need to make sure that no client is accessing the video files. How can I know whether a particular video file is being streamed at the present time?
I have tried the following commands, without success:
1) fuser xyz.mp4
2) lsof | grep xyz.mp4
Please suggest any other alternative you know of.

This is not an OS-level solution; it relies on the built-in HTTPProviders in your Wowza software. If you query the following URL:
http://localhost:8086/connectioncounts?flat
It should return an XML output that lists the stream names currently being played. For example:
<WowzaStreamingEngine>
<Stream vhostName="_defaultVHost_" applicationName="vod" appInstanceName="_definst_" streamName="sample.mp4" sessionsFlash="0" sessionsCupertino="3" sessionsSanJose="0" sessionsSmooth="0" sessionsRTSP="0" sessionsMPEGDash="0" sessionsTotal="3"/>
</WowzaStreamingEngine>
The above output shows that sample.mp4 is currently being played by three sessions (sessionsTotal="3"). The ?flat option simplifies the output. You would then only need to parse the stream names.
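Parsing that output from a Python cleanup script might look like the sketch below. It assumes the XML has the flat shape shown above and that the provider is reachable without authentication; the function names are hypothetical, and stripping a "mp4:"-style prefix from streamName is a precaution, not something the sample output requires.

```python
import urllib.request
import xml.etree.ElementTree as ET


def fetch_connection_counts(
    url="http://localhost:8086/connectioncounts?flat", timeout=10
):
    """Download the connection-count XML from the Wowza HTTP provider."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def active_streams(xml_text):
    """Map stream name -> total sessions for streams with at least one session."""
    root = ET.fromstring(xml_text)
    counts = {}
    for stream in root.iter("Stream"):
        total = int(stream.get("sessionsTotal", "0"))
        if total > 0:
            # Assumption: drop an optional media prefix such as "mp4:".
            name = stream.get("streamName", "").split(":", 1)[-1]
            counts[name] = counts.get(name, 0) + total
    return counts


def is_busy(filename, xml_text):
    """True if filename appears among the currently played streams."""
    return filename in active_streams(xml_text)
```

The deletion script would call is_busy("xyz.mp4", fetch_connection_counts()) and skip any file for which it returns True. A client could still connect between the check and the delete, so this narrows the race rather than removing it.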

Running a batch file on Kid3 ID3 MP3 tagger

I finally gave up after 120 hours of not finding a .NET or javascript/jquery plugin that can read and write to a COMPRESSED custom user frame (TXXX) in ID3v2 MP3 audio file.
UltraID3Lib: cannot read or write COMPRESSED Frames (last updated 2009, author Mitchell S. Honnert fell off face of the earth).
ID3Lib-sharp: cannot read or write COMPRESSED Frames (last updated 2012)
JavaScript-ID3-Reader: can return bytes but it's mostly the wrong bytes. Cannot write anything.
I cannot use the multitudes of Node.js or PHP scripts for my project so they are out of the question.
The only code I found that can read and write compressed frames is Kid3.
http://sourceforge.net/projects/kid3/
However, it is written in C (which I don't know) and uses third-party frameworks since it was not built on Windows. The command-line program requires 13MB of support DLLs, QMs, and whatever else.
I have no choice at this point but to try to use its separate command-line program, kid3-cli.exe.
So here's my question:
Here is the way to read a TXXX frame using the program at the command prompt.
"71F3-15-FOO58A77" is the name of the TXXX frame and the "2" gets the text value it holds:
cd "C:\mp3folder"
select "test.mp3"
get "71F3-15-FOO58A77" "2"
export "clipboard" "CSV unquoted" "2"
QUESTION: HOW DO I use a Batch file to run these commands?
According to the Kid3 documentation, commands can be grouped using -c. However, Windows cmd (or the program) on Windows does not know what -c is.
Example: I double click the batch file and it should:
start the program
send the program (not cmd) the above 4 lines
each must be executed separately.
Sounds simple, but I can't get it to execute even one of the program's commands after starting.
Any ideas? And can someone write an ID3 tag program that can read and write COMPRESSED TXXX tags without using Node.js, PHP or a server on Windows?
I will buy them a beer, because I'm really a (cheap) designer by trade and a part-time programmer only when I have to be.
Here is a zip file of a COMPRESSED TXXX Frame in test.mp3 to test:
http://robbiestewart.ca/test.zip
Download Kid3 and use its windows GUI (kid3.exe) to view the custom user frame (TXXX).
Run the included kid3-cli.exe to do the same at the command prompt.
Try to do the same in a batch file.

According to the help file, you should be able to use the command
kid3-cli -c 'cd "C:\mp3folder"' -c 'select "test.mp3"' -c 'get "71F3-15-F0058A77" "2"' -c 'export "clipboard" "CSV unquoted" "2"'
I ran it on the file you provided and seven tabs followed by 0:00.00 were put on my clipboard, but the value of the TXXX field indicated by the GUI was output to my command prompt.
