I am working on a website that reads files from an FTP directory and handles them according to their modification date. I do not have much experience with FTP, so my question is: does uploading files (pictures, to be exact) change their modification date to the time of the upload?
Related
The project I’m working on requires a scanned file to be associated with a record in the database. The association is required so that scanned documents can be attached to an invoice when it is sent out by email. To meet this requirement, I have created a small QR code that is printed on a sticky label. The QR code contains the ID of the associated database record. I have asked the admin staff to stick a QR code label on each paper document that they scan. The scans are saved as JPG files.
I then have a cron job that runs every few minutes which looks at any JPG files in the scan folder and if it finds a QR code the file is renamed with the ID of the job. E.g. file1.jpg becomes DB-12345-file1.jpg where 12345 is the id of the database record. The file is then moved to a different folder.
The cron job runs the following command on each file it finds
zbarimg -Sdisable -Sqrcode.enable --raw
zbarimg is software that can locate QR codes embedded in JPG files.
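For illustration, the rename-and-move step described above could be sketched in Python roughly as follows. This is only a sketch, not the actual cron job: the function names and folder arguments are made up for the example, and it assumes zbarimg is on the PATH and prints the decoded QR payload on standard output when called with the options shown above.

```python
import shutil
import subprocess
from pathlib import Path
from typing import Optional


def read_qr_id(image: Path) -> Optional[str]:
    """Return the first QR payload zbarimg reports for the image, or None."""
    result = subprocess.run(
        ["zbarimg", "-Sdisable", "-Sqrcode.enable", "--raw", str(image)],
        capture_output=True,
        text=True,
    )
    lines = result.stdout.splitlines()
    # Treat a non-zero exit status or empty output as "no QR code found".
    if result.returncode != 0 or not lines:
        return None
    return lines[0].strip()


def target_name(record_id: str, original_name: str) -> str:
    """Build the new file name, e.g. file1.jpg -> DB-12345-file1.jpg."""
    return f"DB-{record_id}-{original_name}"


def process_folder(scan_dir: Path, done_dir: Path) -> None:
    """Rename and move every JPG in scan_dir that carries a readable QR code."""
    for image in sorted(scan_dir.glob("*.jpg")):
        record_id = read_qr_id(image)
        if record_id is None:
            continue  # leave unreadable scans in place for a later run
        shutil.move(str(image), str(done_dir / target_name(record_id, image.name)))
```

Files without a readable QR code are simply left in the scan folder, so a later run (or a person) can deal with them.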
This is all being tested at the moment and seems to work most of the time. However, the company now wants to scan the files in PDF format, which does not work with zbarimg. I have tried converting PDF to JPG, but the quality of the QR code is lost and zbarimg fails most of the time.
Is there another automated way of linking scanned PDF files to a database record?
I have already written some code using pdf2text and saving the data into the database but this does not provide an indexed association between the PDF files and the records. It just provides a nice way to search for text in a PDF file.
We noticed something odd with a file that went through one of our workflows. The file is uploaded from the client's computer to our server, where an application moves the completed file to a holding folder. Another application then picks it up, parses video information from it, and moves it to a new folder.
After the entire flow completed, we noticed that the last-modified date was older than the time the file was uploaded through the website, while the creation date was set to the upload time.
By the way, the two dates were almost 24 hours apart.
Any idea how this could happen?
It could have something to do with the way the files are processed: the modified time is left at the time the contents of the file last changed, but the created time is when the file was created in the new location.
This happens, for example, when you download a zip file and unzip the contents. The created time is the time you extracted the archive, but the modified time is the time when the author last updated the contents. At least, that's what happens to me.
Keep in mind that time stamps on files are just properties of the file. Usually the OS takes care of updating them for you. But they can be changed at will if you know how.
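As a rough illustration of both points, here is a small Python sketch. It sets a file's modification time by hand with `os.utime`, then copies the file with `shutil.copy2`, which carries the old modification time over to the copy. The file names and the `demo` function are invented for the example. On Windows the copy would also get a fresh creation time, which the sketch does not check.

```python
import os
import shutil
import tempfile
import time
from pathlib import Path
from typing import Tuple


def demo() -> Tuple[float, float, float]:
    """Return (time we set, mtime of the original, mtime of the copy)."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "original.txt"
        src.write_text("contents")

        # Time stamps are just properties: push atime and mtime back by 24 hours.
        day_ago = time.time() - 24 * 3600
        os.utime(src, (day_ago, day_ago))

        # copy2 copies the data and then the metadata, including the mtime,
        # so the brand-new copy looks a day old.
        dst = Path(tmp) / "copy.txt"
        shutil.copy2(src, dst)

        return day_ago, src.stat().st_mtime, dst.stat().st_mtime
```

A tool that moves or copies files this way would produce exactly the pattern described in the question: a recent creation time and an older modification time.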
This question has been asked several times. Many programs like Dropbox use some form of file system API to keep track, almost instantly, of changes that take place within a monitored folder.
As far as I understand, however, this requires a daemon to be running at all times, waiting for callbacks from the file system API. Yet I can shut Dropbox down, update files and folders, and when I launch it again it still detects the changes I made to my folder. How is this possible? Does it exhaustively search the whole tree for updates?
Short answer is YES.
Let's use Google Drive as an example, since its local database is not encrypted, and it's easy to see what's going on.
Basically it keeps a snapshot of the Google Drive folder.
You can browse the snapshot.db (typically under %USER%\AppData\Local\Google\Drive\user_default) using DB Browser for SQLite.
Here's a sample from my computer:
You see that it tracks (among other stuff):
Last write time (looks like Unix time).
Checksum.
Size - in bytes.
Whenever Google Drive starts up, it queries all the files and folders under your "Google Drive" folder (you can see that using Procmon).
Note that changes can also sync down from the server.
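The general idea can be sketched in Python like this. To be clear, this is not Google Drive's or Dropbox's actual code, just a minimal illustration of the snapshot approach: record last write time, size and a checksum for every file, then compare a fresh scan against the stored snapshot at startup. The function names and the choice of MD5 are assumptions made for the example.

```python
import hashlib
import os
from pathlib import Path
from typing import Dict, Set, Tuple

# One snapshot entry per file: (last write time in seconds, size in bytes, checksum)
Entry = Tuple[int, int, str]


def take_snapshot(root: Path) -> Dict[str, Entry]:
    """Walk the whole tree and record mtime, size and MD5 for every file."""
    snapshot: Dict[str, Entry] = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            stat = path.stat()
            digest = hashlib.md5(path.read_bytes()).hexdigest()
            snapshot[str(path.relative_to(root))] = (
                int(stat.st_mtime),
                stat.st_size,
                digest,
            )
    return snapshot


def diff_snapshots(
    old: Dict[str, Entry], new: Dict[str, Entry]
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return the sets of (added, removed, changed) relative paths."""
    added = set(new) - set(old)
    removed = set(old) - set(new)
    changed = {p for p in set(old) & set(new) if old[p] != new[p]}
    return added, removed, changed
```

A real client would probably compare size and write time first and only compute the checksum when those differ, since hashing every file on every startup is expensive.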
There are also change journals, but I don't think Dropbox or Google Drive use them:
To avoid these disadvantages, the NTFS file system maintains an update sequence number (USN) change journal. When any change is made to a file or directory in a volume, the USN change journal for that volume is updated with a description of the change and the name of the file or directory.
I'm currently looking for a batch script that would let me compare a folder's modified date with the system time to detect inactivity.
For example, if a folder's modified date hasn't been updated in more than 4 hours, then it will either send an email or generate a log file.
I'm a novice when it comes to writing batch files or PowerShell scripts.
Any help would be much appreciated.
Thanks.
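The check itself is small. Here is a minimal sketch in Python rather than batch or PowerShell, covering only the log-file option. The function names, the 4-hour default and the log format are made up for the example.

```python
import time
from pathlib import Path
from typing import Optional


def idle_hours(folder: Path, now: Optional[float] = None) -> float:
    """Hours elapsed since the folder's modified date."""
    now = time.time() if now is None else now
    return (now - folder.stat().st_mtime) / 3600


def check_folder(
    folder: Path,
    log_file: Path,
    limit_hours: float = 4.0,
    now: Optional[float] = None,
) -> bool:
    """Append a line to log_file if the folder has been idle too long.

    Returns True when the folder was reported as inactive.
    """
    hours = idle_hours(folder, now)
    if hours <= limit_hours:
        return False
    with log_file.open("a") as log:
        log.write(f"{folder} has been idle for {hours:.1f} hours\n")
    return True
```

One caveat: a folder's modified date changes when entries directly inside it are added, removed or renamed, not when the contents of an existing file change, so depending on the workload it may be safer to look at the newest file inside the folder instead. Keep the log file outside the monitored folder, otherwise writing the log resets the folder's date.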
Amazon Glacier does not have the concept of filepaths. However when I upload files to glacier via client tools like Cloudberry then my uploads do have a path structure.
If I am programmatically uploading an archive to Amazon Glacier, how can I upload it so it has a filepath and filename in Cloudberry? I think I may need add something to the 'x-amz-archive-description' field here http://docs.aws.amazon.com/amazonglacier/latest/dev/api-archive-post.html, but I do not know how to format it.
I am using the Amazon JavaScript SDK: http://docs.aws.amazon.com/AWSJavaScriptSDK/guide/examples.html. I think I've been able to upload archives fine, though I haven't been able to see them in Cloudberry yet.
UPDATE: After getting it working, I put the code I was using here in case a sample is needed: https://github.com/fschwiet/mysql-glacier-backup
Our Glacier archive description metadata is a simple JSON with the following fields:
"Path": the full path of the source file. E.g., "c:\myfolder\myfile.txt" for a file copied from local disk or "mybucket/myfolder/myfile.txt" for files copied from cloud storage like Amazon S3. The path is UTF7-encoded.
"UTCDateModified": ISO8601 UTC date without milliseconds (format: "yyyyMMddTHHmmssZ"). This is the modification date of the original file (not the archive creation date).
"Flags": integer flags value. 1 - compressed, 2 - encrypted.
Thanks,
Andy
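For anyone who wants to see the fields put together, here is a minimal sketch in Python of building such a description string. It is an illustration under assumptions rather than a tested recipe: the function and constant names are invented, it assumes Python's built-in "utf-7" codec produces the encoding Cloudberry expects, and it assumes the two flag values can be added together for an archive that is both compressed and encrypted.

```python
import json
from datetime import datetime, timezone

# Flag values as listed above; combining them is an assumption.
FLAG_COMPRESSED = 1
FLAG_ENCRYPTED = 2


def build_description(path: str, modified: datetime, flags: int = 0) -> str:
    """Build the JSON text to send as the archive description.

    `modified` should be timezone-aware; it is converted to UTC before formatting.
    """
    return json.dumps(
        {
            # UTF-7 output is plain ASCII, so it can be stored as a normal string.
            "Path": path.encode("utf-7").decode("ascii"),
            # yyyyMMddTHHmmssZ, without milliseconds.
            "UTCDateModified": modified.astimezone(timezone.utc).strftime(
                "%Y%m%dT%H%M%SZ"
            ),
            "Flags": flags,
        }
    )
```

The resulting string would then be passed as the archive description (the 'x-amz-archive-description' field mentioned in the question) when uploading.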
I've been zipping the tree up (for ease of restore) and storing all the tree info in the archive. Thus photos_2012.zip or whatever. The long list of files just wasn't working for me from an ease-of-checking-things-were-actually-backed-up perspective.
It's more costly to restore, because I'll have to pull a whole tree down, but given that my goal is never to need this archive, I'm OK with that.