I have an application which is connected to an external webservice. The webservice sends messages with an ID to the Laravel application. Within the controller I check if the ID of the message already exists in the database. If not, I store the message with the ID; if it exists, I skip the message.
Unfortunately, the webservice sometimes sends a message with the same ID multiple times within the same second. It's an external service, so I have no control over it.
The problem is that the messages arrive so fast that the database has not saved the first message before the next one reaches the controller. As a result, the check for an existing ID finds nothing and the application tries to save the same message once more. This leads to an exception, because I have a unique constraint on the ID column.
What is the best strategy to handle this? Using a queue is not a good solution, because the messages are time critical, the queue is even slower, and it would lead to congestion within the queue.
Any idea or help is appreciated a lot! Thanks!
You can send INSERT IGNORE statements to your database:
INSERT IGNORE INTO messages (...) VALUES (...)
or
INSERT INTO messages (...) VALUES (...) ON DUPLICATE KEY UPDATE id=id
You can try updating on duplicate. I have used that approach in the past to get around issues like this. I'm not sure it's the perfect solution, but it's definitely an option. I assume you are using MySQL.
https://dev.mysql.com/doc/refman/8.0/en/insert-on-duplicate.html
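Here is a minimal sketch of the idea in Python. It uses SQLite from the standard library as a stand-in for MySQL (an assumption made so the example runs anywhere); SQLite spells the statement INSERT OR IGNORE where MySQL uses INSERT IGNORE. The table and column names and the store_message helper are hypothetical. The point is that the existence check and the insert collapse into one atomic statement, so there is no window between them for a second copy of the message to slip through.

```python
import sqlite3


def store_message(conn: sqlite3.Connection, message_id: str, body: str) -> bool:
    """Insert the message unless its ID already exists.

    Returns True if a row was inserted, False if the ID was a duplicate.
    """
    cur = conn.execute(
        # SQLite syntax; the MySQL counterpart is INSERT IGNORE INTO ...
        "INSERT OR IGNORE INTO messages (id, body) VALUES (?, ?)",
        (message_id, body),
    )
    conn.commit()
    # rowcount is 0 when the unique key made the database skip the row.
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id TEXT PRIMARY KEY, body TEXT)")

first = store_message(conn, "abc-1", "hello")
second = store_message(conn, "abc-1", "hello")  # the same ID arrives again
```

The return value lets the controller tell a fresh message from a repeat without a separate SELECT.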
I have a MariaDB and a Windows Service accessing this DB. For maintenance, I use HeidiSQL.
I now want to update a big table (8.000.000 entries) in HeidiSQL by inserting a new foreign key column and then filling the column with values using UPDATE. I suppose it may take about 30 minutes.
During this time, if a user wants to insert/read/delete values out of this table via the service, what will happen? Will the DB block the request? Should I stop the service to avoid corruption of data?
I made an example myself. The database seems to respond with the old values or status as long as the operations changing the data are still running in HeidiSQL.
What I tried:
I added a new column. While the operation was still running, I sent a read message to my service. It responded without the new column. As soon as the operation finished, the new column was sent, too.
I filled the new FK column with values. While the update was running, I sent a read message. The service returned the initial value of the FK column (0) for all rows. After the operation finished, the service sent all the new values.
I have two different servers in two locations. I need to use asynchronous exchange of data.
Server A is our data server, we store customer info here.
Server B is our processing server, we process production.
Each production operation on server B has a production group. What I need to do is:
A to send a message to B with a question: What operations are planned for today in this group(GUID).
B has to answer with an XML list of operations scheduled for today.
A has to answer with an XML list of operations to cancel
B has to cancel operations and end conversation
My question is: What is the right way to go about this? Can I do this in just a single dialog using one contract? Should I?
With a contract like this:
CREATE CONTRACT [GetScheduledContract]
AUTHORIZATION [xxx]
(GetScheduledOutCalls SENT BY INITIATOR,
ReturnScheduledOutCalls SENT BY TARGET,
DeleteScheduledOutCalls SENT BY INITIATOR)
Or should I separate the tasks to different contracts and dialogs?
What you have seems good to me as an MVP (i.e. if things go right, it'll work). A couple of things:
Consider adding one more reply from the target saying "operation completed successfully" before closing the conversation. Upon receipt, the initiator can also close their end of it.
What happens if any of those operations is explicitly not able to be completed (e.g. in your step 4, the request is to delete something that's not present or that delete causes a foreign key violation)? I'd add in some sort of error message type (sent by any) that allows either side to tell the other "hey… something went wrong".
What happens if any of those operations is implicitly not able to be completed (e.g. the message never gets delivered)? The other side may not respond for some reason. Build in some way to at least detect and alert on that.
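To make the three suggestions above concrete, here is a rough sketch in Python of the conversation as a small state machine. It is not Service Broker code. The first three message type names come from the contract in the question; "OperationCompleted" and "Error", the state names, and the Dialog class are hypothetical names chosen for illustration.

```python
import time
from dataclasses import dataclass, field
from typing import Optional

# (current state, message type) -> next state.
# "OperationCompleted" is the extra confirmation reply suggested above.
TRANSITIONS = {
    ("started", "GetScheduledOutCalls"): "awaiting_schedule",
    ("awaiting_schedule", "ReturnScheduledOutCalls"): "awaiting_cancel_list",
    ("awaiting_cancel_list", "DeleteScheduledOutCalls"): "awaiting_confirmation",
    ("awaiting_confirmation", "OperationCompleted"): "closed",
}
TERMINAL_STATES = {"closed", "failed"}


@dataclass
class Dialog:
    state: str = "started"
    last_activity: float = field(default_factory=time.monotonic)

    def on_message(self, message_type: str) -> str:
        """Advance the dialog. An explicit "Error" message, or any message
        that arrives out of sequence, has no entry in TRANSITIONS and
        therefore moves the dialog to "failed"."""
        self.last_activity = time.monotonic()
        self.state = TRANSITIONS.get((self.state, message_type), "failed")
        return self.state

    def is_stalled(self, timeout_seconds: float, now: Optional[float] = None) -> bool:
        """True if an open dialog has seen no message for too long."""
        if self.state in TERMINAL_STATES:
            return False
        now = time.monotonic() if now is None else now
        return now - self.last_activity > timeout_seconds
```

A monitoring job could call is_stalled periodically on every open dialog and raise an alert, which covers the case where a message is never delivered.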
Imagine that we have a file, and some job that processes it and sends the data:
into the database
to an external service
Can we guarantee that the file is processed only once, or at least detect that something went wrong and notify the user so that they can resolve the problem manually?
Yes, you can.
What you can do is create a table in the database that stores each file's name and a flag/status (read: yes or no). When the process that delivers the file puts it in that location, make sure the same process also records the name (if the name is different each time) and the flag/status for that file in the database. Your file-reading process can get the file name from the database, load the file wherever you want, and update the flag to "read" (or whatever you call it) when it's done. This way, you can avoid reading the file more than once.
I would store two tables of information in your database.
The processed file lines like you were already doing.
A record of the files themselves. Include:
the filename
whether the processing was successful, failed, partially succeeded
a SHA1 hashed checksum that can be used to check for the uniqueness of the file later
When you go to process a file, you first check whether the checksum already exists. If it does, you can stop processing and log the issue. Or you can record that information in the files table.
Also be sure to have a foreign key association between your processed lines and your files. That way if something does go wrong, the person doing manual intervention can trace the affected lines.
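A minimal sketch of that layout in Python, assuming SQLite as the database and hypothetical table names (files, file_lines) and a hypothetical process_file helper. The UNIQUE constraint on the checksum does the "already seen?" check, and the foreign key ties every processed line back to its file.

```python
import hashlib
import sqlite3

SCHEMA = """
CREATE TABLE files (
    id       INTEGER PRIMARY KEY,
    filename TEXT NOT NULL,
    checksum TEXT NOT NULL UNIQUE,   -- SHA-1 of the file content
    status   TEXT NOT NULL           -- processing / succeeded / failed
);
CREATE TABLE file_lines (
    id      INTEGER PRIMARY KEY,
    file_id INTEGER NOT NULL REFERENCES files(id),
    line    TEXT NOT NULL
);
"""


def process_file(conn: sqlite3.Connection, filename: str, content: bytes) -> str:
    checksum = hashlib.sha1(content).hexdigest()
    try:
        cur = conn.execute(
            "INSERT INTO files (filename, checksum, status) VALUES (?, ?, 'processing')",
            (filename, checksum),
        )
    except sqlite3.IntegrityError:
        conn.rollback()
        return "skipped"  # identical content was registered before
    file_id = cur.lastrowid
    conn.commit()  # the file is on record even if processing fails below

    try:
        for line in content.decode("utf-8").splitlines():
            conn.execute(
                "INSERT INTO file_lines (file_id, line) VALUES (?, ?)",
                (file_id, line),
            )
        conn.execute("UPDATE files SET status = 'succeeded' WHERE id = ?", (file_id,))
        conn.commit()
        return "succeeded"
    except Exception:
        conn.rollback()  # drop the partially inserted lines
        conn.execute("UPDATE files SET status = 'failed' WHERE id = ?", (file_id,))
        conn.commit()
        raise


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
first = process_file(conn, "orders.csv", b"a,1\nb,2\n")
second = process_file(conn, "orders-copy.csv", b"a,1\nb,2\n")  # same content
```

Note that a file left in 'processing' or 'failed' is skipped on the next run too, so someone still has to look at those rows; that is the manual intervention mentioned above.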
Neither Usmana's nor Tracy's answer actually guarantees that a file is not processed more than once, or that your job doesn't send duplicate requests to the database and the external service (#1 and #2 in your question). Both solutions suggest keeping a log and updating it after all the processing is done, but if an error occurs when you try to update the log at the very end, your job will process the file again the next time it runs and will send duplicate requests to the database and the external service. The only way to deal with this using the solutions Usmana and Tracy suggested is to run everything in a transaction, but that is quite a challenging task in a distributed environment like yours.
A common solution to your problem is to gracefully handle duplicate requests to the database and external services. The actual implementation can vary but for example you can add a unique constraint to the database and when the job tries to insert a duplicate record an exception will be thrown which you can just ignore in the job because it means the required data is already in the db.
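As a sketch of what "gracefully handle duplicates" can look like on the database side, here is a small Python example. SQLite, the records table, and the insert_once helper are all assumptions for illustration; the same shape applies to any database that raises an error on a unique-constraint violation.

```python
import sqlite3


def insert_once(conn: sqlite3.Connection, record_key: str, payload: str) -> bool:
    """Insert a record; treat a duplicate key as 'already done'.

    Returns True if the row was inserted, False if it already existed.
    """
    try:
        with conn:  # commits on success, rolls back if the insert raises
            conn.execute(
                "INSERT INTO records (record_key, payload) VALUES (?, ?)",
                (record_key, payload),
            )
        return True
    except sqlite3.IntegrityError:
        # In this table the only constraint is the unique key, so this
        # error means the data is already present. With more constraints
        # you would need to check which one was violated.
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (record_key TEXT PRIMARY KEY, payload TEXT)")

results = [insert_once(conn, "file1:line7", "data") for _ in range(3)]
```

Running the job again after a crash then becomes harmless for the database part. The external service needs its own equivalent, for example a request key it can deduplicate on, if it supports one.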
My answer doesn't mean that you don't need the log table Usmana and Tracy suggested. You do need it to keep track of processing status, but it doesn't guarantee there won't be duplicate requests to your database and external service unless you use a distributed transaction.
Hope it helps!
I need some guidance from anyone who has deployed a real-world, in-production application that uses the SQL Server Service Broker external activation mechanism (via the Service Broker External Activator from the Feature Pack).
Current mindset:
My specs are rather simple (or at least I think so), so I'm thinking of the following basic flow:
order-like entity gets inserted into a Table_Orders with state "confirmed"
SP_BeginOrder gets executed and does the following:
begins a TRANSACTION
starts a DIALOG from Service_HandleOrderState to Service_PreprocessOrder
stores the conversation handle (from now on PreprocessingHandle) in a specific column of the Orders table
sends a MESSAGE of type Message_PreprocessOrder containing the order id using PreprocessingHandle
ends the TRANSACTION
Note that I'm not ending the conversation, I don't want "fire-and-forget"
event notification on Queue_PreprocessOrder activates an instance of PreprocessOrder.exe (max concurrent of 1) which does the following:
begins a SqlTransaction
receives top 1 MESSAGE from Queue_PreprocessOrder
if message type is Message_PreprocessOrder (format XML):
sets the order state to "preprocessing" in Table_Orders using the order id in the message body
loads n collections of data and computes their n-ary Cartesian product (via Linq, AFAIK this is not possible in T-SQL) to determine the order items collection
inserts the order items rows into a Table_OrderItems
sends a MESSAGE of type Message_PreprocessingDone, containing the same order id, using PreprocessingHandle
ends the conversation pertaining to PreprocessingHandle
commits the SqlTransaction
exits with Environment.Exit(0)
internal activation on Queue_HandleOrderState executes a SP (max concurrent of 1) that:
begins a TRANSACTION
receives top 1 MESSAGE from Queue_InitiatePreprocessOrder
if message type is Message_PreprocessingDone:
sets the order state to "processing" in Table_Orders using the order id in the message body
starts a DIALOG from Service_HandleOrderState to Service_ProcessOrderItem
stores the conversation handle (from now on ProcessOrderItemsHandle) in a specific column of Table_Orders
creates a cursor for rows in Table_OrderItems for current order id and for each row:
sends a MESSAGE of type Message_ProcessOrderItem, containing the order item id, using ProcessOrderItemsHandle
if message type is Message_ProcessingDone:
sets the order state to "processed" in Table_Orders using the order id in the message body
if message type is http://schemas.microsoft.com/SQL/ServiceBroker/EndDialog (END DIALOG):
ends the conversation pertaining to conversation handle of the message
ends the TRANSACTION
event notification on Queue_ProcessOrderItem activates an instance of ProcessOrderItem.exe (max concurrent of 1) which does the following:
begins a SqlTransaction
receives top 1 MESSAGE from Queue_ProcessOrderItem
if message type is Message_ProcessOrderItem (format XML):
sets the order item state to "processing" in Table_OrdersItems using the order item id in the message body, then:
loads a collection of order item parameters
makes a HttpRequest to a URL using the parameters
stores the HttpResponse as a PDF on filesystem
if any errors occurred in above substeps, sets the order item state to "error", otherwise "ok"
performs a lookup in the Table_OrdersItems to determine if all order items are processed (state is "ok" or "error")
if all order items are processed:
sends a MESSAGE of type Message_ProcessingDone, containing the order id, using ProcessOrderItemsHandle
ends the conversation pertaining to ProcessOrderItemsHandle
commits the SqlTransaction
exits with Environment.Exit(0)
Notes:
specs specify MSSQL compatibility 2005 through 2012, so:
no CONVERSATION GROUPS
no CONVERSATION PRIORITY
no POISON_MESSAGE_HANDLING ( STATUS = OFF )
I am striving to achieve overall flow integrity and continuity, not speed
given that tables and SPs reside in DB1 whilst Service Broker objects (messages, contracts, queues, services) reside in DB2, DB2 is SET TRUSTWORTHY
Questions:
Are there any major design flaws in the described architecture ?
Order completion state tracking doesn't seem right. Is there a better method ? Maybe using QUEUE RETENTION ?
My intuition tells me that in no case whatsoever should the activated external exe terminate with an exit code other than 0, so there should be try{..}catch(Exception e){..} finally{ Environment.Exit(0) } in Main. Is this assumption correct ?
How would you organize error handling in DB code ? Is an error log table enough?
How would you organize error handling in external exe C# code ? Same error logging table ?
I've seen the SQL Server Service Broker Product Samples, but the Service Broker Interface seems overkill for my seemingly simpler case. Any alternatives for a simpler Service Broker object model ?
Any cross-version "portable" admin tool for Service Broker capable of at least draining poison messages ?
Have you any decent code samples for any of the above ?
Q: Are there any major design flaws in the described architecture ?
A: A couple of minor points:
- Waiting for an HTTP request to complete while holding a transaction open is bad. You can't achieve transactional consistency between a database and HTTP anyway, so don't risk having a transaction stretch for minutes when the HTTP call is slow. The typical pattern is to {begin tran/receive/begin conversation timer/commit}, then issue the HTTP call without any DB transaction. If the HTTP call succeeds, then {begin tran/send response/end conversation/commit}. If the HTTP call fails (or the client crashes), let the conversation timer activate you again. You'll get a timer message (no body), so you need to look up the item ID associated with the handle in your table(s).
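The shape of that pattern, sketched in Python with SQLite. This is not Service Broker code: a retry_after column stands in for the conversation timer, and the table, the RETRY_SECONDS value, and the call_http callable are hypothetical. What it shows is the two short transactions with the slow call in between, outside any transaction.

```python
import sqlite3
import time
from typing import Callable, Optional

RETRY_SECONDS = 60.0  # hypothetical stand-in for the conversation timer


def claim_next(conn: sqlite3.Connection, now: float) -> Optional[int]:
    """Transaction 1: pick an eligible item and arm its retry deadline."""
    with conn:
        row = conn.execute(
            "SELECT id FROM work_items "
            "WHERE state = 'pending' AND retry_after <= ? ORDER BY id LIMIT 1",
            (now,),
        ).fetchone()
        if row is None:
            return None
        conn.execute(
            "UPDATE work_items SET retry_after = ? WHERE id = ?",
            (now + RETRY_SECONDS, row[0]),
        )
        return row[0]


def handle_one(conn: sqlite3.Connection, call_http: Callable[[int], bool],
               now: Optional[float] = None) -> Optional[str]:
    now = time.time() if now is None else now
    item_id = claim_next(conn, now)
    if item_id is None:
        return None
    # No transaction is open here, so a slow HTTP call holds no locks.
    try:
        ok = call_http(item_id)
    except Exception:
        ok = False
    if not ok:
        return "retry_later"  # the deadline armed above makes it eligible again
    with conn:  # Transaction 2: record the result
        conn.execute("UPDATE work_items SET state = 'done' WHERE id = ?", (item_id,))
    return "done"


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE work_items (id INTEGER PRIMARY KEY, "
    "state TEXT NOT NULL DEFAULT 'pending', retry_after REAL NOT NULL DEFAULT 0)"
)
conn.execute("INSERT INTO work_items (id) VALUES (1)")
conn.commit()

first = handle_one(conn, lambda item_id: False, now=100.0)  # HTTP call fails
second = handle_one(conn, lambda item_id: True, now=120.0)  # deadline not reached
third = handle_one(conn, lambda item_id: True, now=200.0)   # retried and succeeds
```

If the process crashes during the HTTP call, nothing is rolled back and nothing is lost: the item simply becomes eligible again once its deadline passes.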
Q: Order completion state tracking doesn't seem right. Is there a better method ? Maybe using QUEUE RETENTION ?
A: My one critique of your state tracking is the dependency on scanning the order items to determine that the one currently being processed is the last one (5.3.4). For example, you could record in the item state that this is the 'last' item to be processed, so that when you process it you know you need to report completion. RETENTION is only useful for debugging, or when your logic needs to run a 'logical rollback' with compensating actions on a conversation error.
Q: My intuition tells me that in no case whatsoever should the activated external exe terminate with an exit code other than 0, so there should be try{..}catch(Exception e){..} finally{ Environment.Exit(0) } in Main. Is this assumption correct ?
A: The most important thing is for the activated process to issue a RECEIVE statement on the queue. If it fails to do so, the queue monitor may stay in the notified state forever. The exit code is, if I remember correctly, irrelevant. As with any background process, it is important to catch and log exceptions; otherwise you'll never know it has a problem when it starts failing. In addition to disciplined try/catch blocks, hook up Application.ThreadException for UI apps and AppDomain.UnhandledException for both UI and non-UI apps.
Q: How would you organize error handling in DB code ? Is an error log table enough?
A: I will follow up later on this. Error log table is sufficient imho.
Q: How would you organize error handling in external exe C# code ? Same error logging table ?
A: I created bugcollect.com exactly because I had to handle such problems with my own apps. The problem is more than logging: you also want some aggregation and analysis (at least to detect duplicate reports) and a way to suppress floods of errors caused by some deployment config mishap in the field. Truth be told, nowadays there are more options, e.g. exceptron.com. And of course I think FogBugz also has logging capabilities.
Q: I've seen the SQL Server Service Broker Product Samples, but the Service Broker Interface seems overkill for my seemingly simpler case. Any alternatives for a simpler Service Broker object model ?
A: Finally, an easy question: yes, it is overkill. There is no simple model.
Q: Any cross-version "portable" admin tool for Service Broker capable of at least draining poison messages ?
A: The problem with poison messages is that the definition of poison message changes with your code: the poison message is whatever message breaks the current guards set in place to detect it.
Q: Have you any decent code samples for any of the above ?
A: No
One more point: try to avoid any reference from DB1 to DB2 (e.g. 4.3.4 is activated in DB1 and reads the items table from DB2). This creates cross-DB dependencies which break when a) one DB is offline (e.g. for maintenance) or overloaded, or b) you add database mirroring for HA/DR and one DB fails over. Try to make the code work even if DB1 and DB2 are on different machines (with no linked servers). If necessary, add more info to the message payload. If you architect it that way, DB2 can be on a different machine, and multiple DB2 machines can even exist to scale out the HTTP/PDF writing work.
And finally: this design will be very slow. I'm talking low tens of messages per second slow, with so many dialogs/messages involved and everything running with max_queue_readers 1. This may or may not be acceptable for you.
I'm working on a project in dead ASP (I know :( )
Anyway it is working with a kdb+ database which is major overkill but not my call. Therefore to do inserts etc we're having to write special functions so they can be handled.
Anyway we've hit a theoretical problem and I'm a bit unsure how it should be dealt with in this case.
So basically you register a company. When you submit, validation occurs and the page is processed, inserting new values into the appropriate tables. At this stage I want to pull IDs from the tables and use them in the session for further registration screens. The user will never enter a specific ID, of course, so it needs to be pulled from the database.
But how can this be done? I'm particularly concerned about two users registering simultaneously: how can I ensure the correct ID is passed back to the correct session?
Thank you for any help you can provide.
Instead of having the ID set at the point of insert, is it possible for you to "grab" an ID value before hand, and then use that value throughout the process?
So:
Start the registration.
System connects to the database, creates an ID (perhaps from an ID table) and Stores in ASP Session.
Company registers.
You validate and insert the data into the DB (including the ID stored in the session)
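The steps above can be sketched like this in Python. SQLite stands in for kdb+ and a plain dict stands in for the ASP Session object (both are assumptions, as are the table names and helper functions). The ID is reserved first, kept per session, and only used at insert time, so two users registering at once each carry their own ID.

```python
import sqlite3


def reserve_company_id(conn: sqlite3.Connection) -> int:
    """Create a fresh ID in a dedicated ID table and return it."""
    with conn:
        cur = conn.execute("INSERT INTO company_ids DEFAULT VALUES")
    # lastrowid belongs to this cursor, so another caller's insert
    # cannot change the value we read here.
    return cur.lastrowid


def register_company(conn: sqlite3.Connection, session: dict, name: str) -> None:
    """Insert the company using the ID that was reserved for this session."""
    with conn:
        conn.execute(
            "INSERT INTO companies (id, name) VALUES (?, ?)",
            (session["company_id"], name),
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE company_ids (id INTEGER PRIMARY KEY AUTOINCREMENT)")
conn.execute("CREATE TABLE companies (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# Two users start registering at about the same time.
session_a = {"company_id": reserve_company_id(conn)}
session_b = {"company_id": reserve_company_id(conn)}

# They finish in the opposite order; each insert still uses its own ID.
register_company(conn, session_b, "Beta Ltd")
register_company(conn, session_a, "Acme Ltd")
```

The cost of this approach is that abandoned registrations leave unused IDs behind, which is usually acceptable.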
The things you put in the Session(...) collection are only visible to that session (i.e. the session is used only by the browser windows on one computer). The session is identified by a GUID value that is stored in a cookie on the client machine. It is "safe" to store your IDs there (other users won't be able to read them easily).
Alternatively, your ID can include the date and time, for example id31032012200312. If you still think that two people can register at the same time, I would use recordset locks like the ones here: http://www.w3schools.com/ado/prop_rs_locktype.asp
To create IDs like the one above in ASP, use replace(date(),"/","") and then do the same with time(), replacing ":".
Thanks