Processing one file at a time in Talend - database

I have a problem with Talend. I'm currently using the open-source edition, and I have a file watcher. Even if two files arrive in the same second, I need Talend to process them one at a time: only once the first file has gone through all the mappings should the next file proceed.
I can't use tParallelize because it's not available in this version.

Related

Moving file to another directory in ABAP

I have a service that runs on a specific directory at 5-second intervals. It picks up an XML file created in that directory, sends it to another client for some necessary authorization checks, and then requests a response file.
My issue is that my Z_PROGRAM creating the XML file might take longer than 5 seconds as a result of the file's size. Therefore creating the file in that specific directory is not preferable. I thought about creating a new folder in that directory called "temporary" and creating the file inside that folder, then once I'm done with it, moving it back outside for the service to pick it up.
Is there any way to move files from one directory to another via ABAP code only?
Copying the file manually is not an option since the problem that I have during file creation still persists. I need 2 alternatives, one used for local directories and one for application server directories. Any ideas?
Generally, we create a second, empty file for each completed file once the file creation process has ended. Third parties must first check that the empty file is there before reading the data file. Example:
data file.csv
data file.ok
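The consumer side of this marker-file convention can be sketched as follows. This is a minimal illustration in Java rather than ABAP; the class name `MarkerFileCheck`, the method `readyFiles`, and the `.csv`/`.ok` suffix pairing are assumptions made for the example, not part of any existing integration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class MarkerFileCheck {
    // Returns the .csv files in dir whose companion .ok marker already exists.
    static List<Path> readyFiles(Path dir) {
        List<Path> ready = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "*.csv")) {
            for (Path data : stream) {
                String name = data.getFileName().toString();
                String base = name.substring(0, name.length() - ".csv".length());
                // The producer creates this empty file only after the data file is complete.
                if (Files.exists(dir.resolve(base + ".ok"))) {
                    ready.add(data);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return ready;
    }
}
```

A data file without its marker is simply skipped and picked up on a later polling cycle.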
If your integration is already complete and changes on the third-party side are not easy to make, I prefer using OS-level file move commands. Sample document here. You can use mv on a Linux server and move on Windows. If your file is big, you will get the same problem with the OPEN DATASET approach. There is the function module ARCHIVFILE_SERVER_TO_SERVER for moving files, but it also uses OPEN DATASET.
There is no explicit ABAP statement that moves or copies files between directories on the application server.
Two tips may be helpful in your case. First, if you are writing a big file, separate the logic that collects the data from the logic that writes the file. Don't execute TRANSFER inside your data-collection loop. Instead, collect your data into an internal table; once you're done, loop over that internal table and write the strings directly, without any delay. You should be able to write big files, up to several hundred MB, in under one second.
The second tip avoids modifying your program, and it also applies if you use function modules to construct the XML: write the file to a temporary directory; after it is finished, have another program open the file in the source directory with READ DATASET and write the data straight to the new directory, again as plain strings without interruptions.
You should be OK if you just write strings.
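You can simply use a system call to run OS commands on application server directories.
" Move the finished file with the Linux 'mv' command; the paths are examples.
DATA lv_command(255) TYPE c.
lv_command = 'mv /usr/sap/temporary/File.xml /usr/sap/final/file.xml'.
CALL 'SYSTEM' ID 'COMMAND' FIELD lv_command.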

Editing a file while it is being processed in Unity

I am making a Unity app that reads data from a text file and takes actions accordingly, and I want to update this file at regular intervals.
When I tried to update the file, I got an error saying the file is in use and can't be edited.
I have also tried using the .xml file extension, but that didn't work either.
Which file format can be edited while it is being processed?
We can't help without code. Unity can process any file format, such as .txt or .xml, but you need to grant read/write permission, and you also need to check that the reading process has finished before you write to the file. This is how reading and writing work on any platform. I hope you get the idea.

Data loss on concurrent file write in camel

I am using Apache Camel for my file operations. My system runs in a clustered environment.
Let's say I have 4 instances:
Instance A
Instance B
Instance C
Instance D
Folders Structure
Input Folder: C:/app/input
Output Folder: C:/app/output
All four instances point to the input folder. As per my business requirement, 8 files will be placed in the input folder, and the output will be a single consolidated file. Here Camel loses data when the instances write to the output file concurrently.
Route:
from("file://C:/app/input")
    .setHeader(Exchange.FILE_NAME, simple("output.txt"))
    .to("file://C:/app/output?fileExist=Append")
    .end();
Kindly help me resolve this issue. Is there anything like a write lock in Camel to avoid concurrent file writes? Thanks in advance.
You can use the doneFile option of the file component, see http://camel.apache.org/file2.html for more information.
Avoid reading files currently being written by another application
Beware: the JDK File IO API is a bit limited in detecting whether another application is currently writing or copying a file, and the implementation can differ depending on the OS platform as well. This could lead Camel to think the file is not locked by another process and start consuming it. Therefore you have to do your own investigation into what suits your environment. To help with this, Camel provides different readLock options and the doneFileName option. See also the section Consuming files from folders where others drop files directly.
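For the write-lock part of the question, one option outside Camel's own configuration is an operating-system file lock taken around each append. The following is a minimal sketch using plain java.nio, not a Camel API; the class name `LockedAppend` and the method `append` are hypothetical. It assumes all writers are separate processes on a file system that honours file locks, which may not hold on some network shares.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class LockedAppend {
    // Appends text to target while holding an exclusive file lock.
    // 'synchronized' serializes threads inside this JVM; the FileLock
    // coordinates with other processes (platform-dependent behaviour).
    static synchronized void append(Path target, String text) {
        try (FileChannel channel = FileChannel.open(target,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE);
             FileLock lock = channel.lock()) {      // blocks until the lock is free
            channel.position(channel.size());       // seek to the end after locking
            ByteBuffer buffer = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
            while (buffer.hasRemaining()) {
                channel.write(buffer);
            }
            channel.force(true);                    // flush before the lock is released
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The position is set only after the lock is held, so a writer never appends at an offset that another process has already moved past.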

Trying to find information on how to build a simple file version control system

I want to build a file system for non-techies (they don't care about old versions of a file, so no merging or SVN/Git). The thought is that a user should be able to download a file, and at the same instant the file should be locked for other users. When the first user is done editing, the file should then be uploaded to the server automatically. When they close the file, the lock should be released.
Is this even possible? I'm thinking of some sort of browser plugin, but I can't find anyone who has done the same thing (besides Microsoft, but who wants to go down that road?).
That would be: Sharepoint, Alfresco, (almost every WIKI), ...
Actually that is a basic feature of most document management systems. Even SVN has that already and IIRC you can set that up with mod_dav_svn without a line of code (considering configuration is not code).
Also the interesting question is, IMHO, not TheHappyCase where the described unit of work goes well but what about this*:
1. I check out 50 random documents you need
2. (get some popcorn and wait for your stress level to go up)
3. ?????
4. I get bored and forget about it (everything still being checked out)
*: Points (1) and (2) may change order

Sharing file locks

I am currently working on a file processing service that looks at a fileshare, where files are uploaded to via FTP.
For scalability, I've been asked to make this service load-balanceable, so it has to expect that other instances on different machines may also be trying to process these files.
OK, so I thought I should be able to achieve this by obtaining an exclusive lock for my process before processing a file, and skipping any files that are already locked by another process.
The crux of this approach is shown below (I've left out the error handling for simplicity):
using (FileStream fs = File.Open(myFile, FileMode.Open, FileAccess.ReadWrite, FileShare.Read | FileShare.Delete))
{
    // Do work
}
Q1: My process now has a lock on this file. I thought this would mean I could then access the same file (without using the stream) and still have the correct access to it, but based on testing it seems I only have the benefits of the lock through the stream. Is this correct?
(For example, before I included FileShare.Delete, File.Delete(myFile) failed)
The above lock ultimately uses the 'Write' permission to determine which service has the file, but is intended to allow other processes to still read the file. This is because the process that holds the lock attempts to verify that the file is a valid zip file, using a third-party library (Xceed.Zip). However, this fails, saying the file "is being used by another process". Using Reflector, I ultimately found that the problematic call is:
stream = this.m_info.Open(FileMode.Open, FileAccess.Read, FileShare.Read);
Now I would have expected this to work as it only wants to read the file, but it fails. The reason appears to be outlined in a similar question. However, as this is a 3rd party API I can't change their code to use ReadWrite.
Q2: Is there a way I can correctly lock the file so it will not be picked up by the other services, but it can still be verified as a zip file using the external API?
I feel like there should be a 'correct' way to do this, but at the moment the best I can come up with is to lock the file, move it away from the shared directory, and then verify it at the new location.
If you're planning to reactively handle this situation by handling UnauthorizedAccessException I think you're making a serious mistake.
This can be handled by proactively renaming files. For example you can configure your service to only read files whose name is in the format 'Filename.YYYYMMDD.txt'. Prior to processing the file, you can rename it to 'Filename.YYYYMMDD.processing'. Then after processing the file you rename it to 'Filename.YYYYMMDD.done'.
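The rename-to-claim step can be sketched like this, in Java rather than C#. It is a minimal illustration under the assumption that the shared file system makes renames atomic; the class name `RenameClaim` and the method `claim` are hypothetical, and the `.txt`/`.processing` suffixes follow the naming scheme described above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Optional;

class RenameClaim {
    // Tries to claim a file by renaming it from *.txt to *.processing.
    // Returns the new path if this worker performed the rename,
    // or empty if the file is not claimable or was already renamed away.
    static Optional<Path> claim(Path file) {
        String name = file.getFileName().toString();
        if (!name.endsWith(".txt")) {
            return Optional.empty();                // not in the agreed "ready" format
        }
        String base = name.substring(0, name.length() - ".txt".length());
        Path claimed = file.resolveSibling(base + ".processing");
        try {
            Files.move(file, claimed, StandardCopyOption.ATOMIC_MOVE);
            return Optional.of(claimed);
        } catch (NoSuchFileException e) {
            return Optional.empty();                // another worker renamed it first
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A worker that gets an empty result just moves on to the next candidate file; only the worker that performed the rename processes the file and later renames it to `.done`.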
You can even take it a step further by adding another service that enqueues the filenames. This service would use a FileSystemWatcher that listens for file-created events. Once it receives such an event, it puts the filename on a global message queue. Then each of your services just dequeues filenames and no longer has to worry about concurrent access.
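The queueing idea can be illustrated with an in-process stand-in, sketched here in Java rather than C#. A real deployment would use a message queue shared across machines; the `LinkedBlockingQueue`, the class name `FileNameQueueDemo`, and the method `processAll` are assumptions made only to show that each queued filename reaches exactly one worker.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class FileNameQueueDemo {
    private static final String STOP = "";          // sentinel telling a worker to exit

    // Enqueues every file name once and lets `workers` threads drain the queue.
    // Returns the names in the order they were handled.
    static List<String> processAll(List<String> fileNames, int workers) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(fileNames);
        for (int i = 0; i < workers; i++) {
            queue.add(STOP);                        // one stop marker per worker
        }
        List<String> handled = Collections.synchronizedList(new ArrayList<>());
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int i = 0; i < workers; i++) {
            pool.execute(() -> {
                try {
                    // take() hands each queued name to exactly one worker
                    for (String name = queue.take(); !name.equals(STOP); name = queue.take()) {
                        handled.add(name);          // placeholder for real file processing
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return handled;
    }
}
```

Because the queue hands out each entry once, the workers need no file locks or renames to avoid processing the same file twice.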
HTH
