I have a select-based server that manages multiple clients. The server automatically reads from and responds to each client, which is great. But there's an issue: if, for instance, user #1 changes directory (implemented with chdir), all of the other users are affected by the change. I'd really like to prevent that from happening.
There are two ways to solve this:
Fork off a separate process to handle each connection. This process can have its own state, including its current working directory. The disadvantages are that you'll need to refactor your code quite a lot, and if you have a lot of concurrent connections it can be a performance problem. This is harder on Windows than on *nix, but not impossible.
Keep the current directory as a per-connection setting within your program, and (re)set the directory before executing every user command.
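Here's a minimal sketch of option 2 in Python, just to illustrate the idea (the port, the commands, and the dictionary layout are made up): every connection gets its own "cwd" entry, and the server resolves paths against that entry instead of ever calling chdir, so one user's "cd" can't affect anyone else.

# Per-connection working directory in a select() loop: the process-wide cwd is
# never changed, so clients cannot affect each other.
import os
import select
import socket

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(("0.0.0.0", 9000))
server.listen()

clients = {}           # client socket -> per-connection state
BASE = os.getcwd()     # starting directory for every new client

while True:
    readable, _, _ = select.select([server] + list(clients), [], [])
    for sock in readable:
        if sock is server:
            conn, _ = server.accept()
            clients[conn] = {"cwd": BASE}
            continue
        data = sock.recv(4096)
        if not data:                      # client disconnected
            del clients[sock]
            sock.close()
            continue
        cmd, _, arg = data.decode().strip().partition(" ")
        state = clients[sock]
        if cmd == "cd":
            # Resolve against this client's own cwd -- no chdir() call.
            new_dir = os.path.abspath(os.path.join(state["cwd"], arg))
            if os.path.isdir(new_dir):
                state["cwd"] = new_dir
                sock.sendall(f"now in {new_dir}\n".encode())
            else:
                sock.sendall(b"no such directory\n")
        elif cmd == "ls":
            listing = "\n".join(os.listdir(state["cwd"]))
            sock.sendall((listing + "\n").encode())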
Is there any way to preserve the load history when re-creating pipes (using CREATE OR REPLACE)?
We do a lot of automated CI/CD on Snowflake, and sometimes pipes need to get re-created. When this happens, the load history is lost. Right now, the accepted workaround is a manual process, which doesn't work very well in an automated workflow.
This makes refreshing pipes dangerous, as duplicate data could be loaded. There is also a danger of losing some notifications/files while the pipe is being re-created -- with or without the manual process, automated or not (which is unacceptable, for obvious reasons).
I wish there were a simple parameter to enable this. Something like:
CREATE OR REPLACE PIPE my_pipe
PRESERVE_HISTORY = [ TRUE | FALSE ]
AS <copy_statement>
An alternative to this would be an option/parameter for pipes to share the load history with the table instead. This way, when the pipe is re-created (but the table isn't), the load history is preserved. If the table is dropped/truncated, then the load history for both the table and the pipe would be lost.
Another option would be the ability to modify pipes using an ALTER command instead, but currently this is very limited. This way, we wouldn't even need to re-create the pipe in the first place.
EDIT: I tried automating the manual process with a procedure, but there's still a chance of losing notifications.
Creating a pipe creates a new object with its own history; I don't see how this would be feasible to do.
Why do you need to re-create the pipes?
Your other option is to manage the source files: after the content is ingested by a pipe, remove the files that were ingested. The new pipe then won't even see the already-ingested files. This, of course, can be automated too.
Since preserving the load history doesn't seem possible currently, I explored a few alternatives:
tl;dr: Here is the solution.
Deleting/Removing/Moving the files after ingestion
Thank you #patrick_at_snowflake for the recommendation! 🙏
This turned out to be a bit tricky to do with high reliability, because there's no simple way to tie the ingestion of files in Snowflake to their deletion/removal in cloud storage (i.e. lifecycle management policies are not aware of whether or not the files were ingested successfully by Snowpipe).
It could be possible to monitor the ingestion using a stream or COPY_HISTORY as a trigger for the deletion/removal of the files, but this is not simple (would probably require the use of an external function).
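To illustrate the idea, here's a rough Python sketch using the Snowflake Python connector and the COPY_HISTORY table function (the account, table, stage, and pipe names are placeholders, and this glosses over the hard parts, e.g. late notifications and partially loaded files):

# Rough sketch (placeholder names throughout): find the files that COPY_HISTORY
# reports as successfully loaded into the pipe's target table, then REMOVE them
# from the stage so a re-created pipe can never re-ingest them.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account", user="my_user", password="...",
    warehouse="my_wh", database="my_db", schema="my_schema",
)
cur = conn.cursor()

# Files loaded into the pipe's target table over the last 24 hours.
cur.execute("""
    SELECT file_name
    FROM TABLE(information_schema.copy_history(
        table_name => 'MY_TABLE',
        start_time => DATEADD(hour, -24, CURRENT_TIMESTAMP())))
    WHERE status = 'Loaded'
""")

for (file_name,) in cur.fetchall():
    # REMOVE deletes the staged file so it cannot be ingested twice.
    cur.execute(f"REMOVE @my_stage/{file_name}")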
Refreshing a subset of the pipe
Thanks #GregPavlik for the suggestion! 🙏
The idea here would be to save the timestamp at which the initial pipe is paused/dropped. This timestamp could then be used to refresh the new pipe with a "safe" subset of the staged files (in order to avoid re-ingesting the same files and creating duplicate records).
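In case it helps anyone, here's how I imagined that could look (untested Python sketch; I'm assuming the saved timestamp maps onto the MODIFIED_AFTER parameter of ALTER PIPE ... REFRESH, and all object names are placeholders):

# Untested sketch (placeholder names): record the moment the old pipe is paused,
# re-create the pipe, then refresh only the files staged after that moment.
import snowflake.connector

conn = snowflake.connector.connect(account="my_account", user="my_user", password="...")
cur = conn.cursor()

# 1. Pause the old pipe and note the cut-over time.
cur.execute("ALTER PIPE my_pipe SET PIPE_EXECUTION_PAUSED = TRUE")
cur.execute("SELECT CURRENT_TIMESTAMP()")
cutover = cur.fetchone()[0]

# 2. Re-create the pipe (this is where the load history is lost).
cur.execute(
    "CREATE OR REPLACE PIPE my_pipe AUTO_INGEST = TRUE "
    "AS COPY INTO my_table FROM @my_stage"
)

# 3. Refresh only files modified after the cut-over, to avoid re-ingesting
#    files the old pipe already loaded.
cur.execute("ALTER PIPE my_pipe REFRESH MODIFIED_AFTER = %s", (cutover.isoformat(),))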
I think this is a great idea (my favorite so far), but I also had monitoring in mind and wanted to confirm that this would work, so I continued exploring alternatives for a while.
Replaying the missed notifications
I asked a separate question about this here.
The idea would be to simply replay the notifications that were processed by neither the initial pipe nor the new pipe.
However this doesn't seem possible either.
Monitoring the load of every staged file
Finally, I arrived at this solution.
This is the one I went with, as it not only lets me refresh missing files, but also lets me monitor the loading of all staged files as a whole (no matter the source of the failure).
I was already working on monitoring Snowpipe as part of a project, so this solution added another layer of monitoring. 👍
I have multiple databases on my local machine which I do not need. Can I run a curl script or a REST API command to delete a database, its servers, and all of its forests, so that I can use Gradle to just deploy them again?
I have tried to manually delete the server first, then the database and then the forests. This is a lengthy process.
I want a single command to do the whole job for me instead of manually having to delete the components one by one which is possible through the admin interface.
Wagner Michael has a fair point in his comment. If you already used (ml-)Gradle to create servers and databases, why not use its mlUndeploy -Pconfirm=true task to get rid of them? You could potentially even use a fake project, with stub configs to get rid of a fairly random set of databases and servers, though that still takes some manual work.
By far the quickest way to reset your entire MarkLogic instance is to stop it and wipe its data directory. This SO answer gives instructions on how to do that, as part of a solution for recovering when you've lost your admin password:
https://stackoverflow.com/a/27803923/918496
HTH!
I'm wondering whether there is a way to recognize that an OfflineCommand is being executed, or an internal flag or something to indicate that the command has gone through and mark it as executed successfully. With an unstable internet connection, I have trouble recognizing whether a command was sent or not. I keep retrieving the records from the database and comparing them each and every time to see whether the command went through, but due to the flow of my application I'm finding it very difficult to avoid duplicates. Is there any automatic process to make sure commands are executed, or something else I could use?
Second question: on the forms I can use a UITimer to check isOffline() and make sure the internet is connected. Is there something equivalent on the server page, or where the queries are written, to see whether the internet is disconnected? When control moves to the queries and the internet is disconnected, the dialog opened from the form page freezes indefinitely and never ends; I have to close and re-open the app to continue the synchronization process. At the same time, I cannot set a timeout for the dialog because I'm not sure how long the synchronization process will take to complete. Please advise.
Extending on the same topic, I have created a new question just to give more clarity on my questions:
executeOfflineCommand skips a command while executing from storage on Android
There is no way to know whether a connection will stay stable, as that would require knowledge of the future. You can work the way transaction services do, where the server side processes an offline command as a transaction using a two-phase-commit approach.
In this approach you have an algorithm similar to this:
Client sends command to server
Server returns a special unique ID for the command
Client asks the server to perform the command with that unique ID
Server acknowledges that the command was performed
If the first 2 stages didn't complete, you just do them again. The worst thing that could happen is some orphan commands on the server.
If the 3rd stage didn't complete, you just do it again. The server knows whether it has already processed the command and will just acknowledge it if it has.
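Here's a toy sketch of that handshake (in Python rather than Codename One's Java, purely to illustrate the flow; the class and method names are made up): because execution is keyed by the unique ID, the client can retry either step after a dropped connection without creating duplicates.

# Toy illustration of the handshake: the server hands out a unique id first,
# and executing by id is idempotent, so retries are always safe.
import uuid

class Server:
    def __init__(self):
        self.pending = {}    # id -> command payload (stages 1-2 complete)
        self.executed = {}   # id -> result (stages 3-4 complete)

    def register(self, command):
        """Stages 1-2: store the command and hand back its unique id."""
        command_id = str(uuid.uuid4())
        self.pending[command_id] = command
        return command_id

    def execute(self, command_id):
        """Stages 3-4: run the command once; a retry only re-acknowledges."""
        if command_id in self.executed:
            return self.executed[command_id]             # already done, just ack
        result = f"ran {self.pending.pop(command_id)}"   # the real work goes here
        self.executed[command_id] = result
        return result

# Client side: retry each call until it succeeds.
server = Server()
command_id = server.register("INSERT ...")   # retried on network failure
print(server.execute(command_id))            # retried on network failure
print(server.execute(command_id))            # a duplicate call just re-acknowledges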
This question has been asked several times. Many programs like Dropbox make use of some form of file system API interaction to instantaneously keep track of changes that take place within a monitored folder.
As far as my understanding goes, however, this requires some daemon to be online at all times to wait for callbacks from the file system API. However, I can shut Dropbox down, update files and folders, and when I launch it again it still finds out what changes I made to my folder. How is this possible? Does it exhaustively search the whole tree for updates?
Short answer is YES.
Let's use Google Drive as an example, since its local database is not encrypted, and it's easy to see what's going on.
Basically it keeps a snapshot of the Google Drive folder.
You can browse the snapshot.db (typically under %USER%\AppData\Local\Google\Drive\user_default) using DB browser for SQLite.
Here's a sample from my computer:
You see that it tracks (among other stuff):
Last write time (looks like Unix time).
Checksum.
Size, in bytes.
Whenever Google Drive starts up, it queries all the files and folders that are under your "Google Drive" folder (you can see that using Procmon).
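To make that concrete, here's a rough Python sketch of the same principle (this is not Drive's actual schema; it just persists path -> mtime/size/checksum in a plain JSON file and rescans the whole tree on startup to find what changed while the client was offline):

# Sketch of snapshot-based change detection: persist (mtime, size, checksum)
# per file, then walk the whole tree on startup and diff against the snapshot.
import hashlib
import json
import os

SNAPSHOT_FILE = "snapshot.json"

def scan(root):
    snapshot = {}
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            stat = os.stat(path)
            with open(path, "rb") as f:
                checksum = hashlib.md5(f.read()).hexdigest()
            snapshot[path] = {"mtime": stat.st_mtime, "size": stat.st_size,
                              "checksum": checksum}
    return snapshot

def diff(old, new):
    added    = [p for p in new if p not in old]
    deleted  = [p for p in old if p not in new]
    modified = [p for p in new if p in old and new[p] != old[p]]
    return added, deleted, modified

if __name__ == "__main__":
    current = scan("my_synced_folder")          # the exhaustive startup walk
    try:
        with open(SNAPSHOT_FILE) as f:
            previous = json.load(f)
    except FileNotFoundError:
        previous = {}
    print(diff(previous, current))              # what changed while offline
    with open(SNAPSHOT_FILE, "w") as f:
        json.dump(current, f)                   # persist the new snapshot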
Note that changes can also sync down from the server
There are also Change Journals, but I don't think Dropbox or GDrive use them:
To avoid these disadvantages, the NTFS file system maintains an update sequence number (USN) change journal. When any change is made to a file or directory in a volume, the USN change journal for that volume is updated with a description of the change and the name of the file or directory.
I have a TFS build process that drops outputs on a sandbox, which is another server on the same network. In other words, the build agent and the sandbox are separate machines. After the outputs are created, a batch script defined within the build template does the following:
Rename the existing deployment folder to some prefix + timestamp (IIS can then no longer find the app when users attempt to access it)
Move the newly-created outputs to the deployment location
The reason I wanted to rename and move files instead of copy/delete/overwrite is that the latter takes a lot of time because we have so many files (over 5,500). I'm trying to find a way to complete builds in the shortest amount of time possible to increase developer productivity. I hope to create a Windows service to periodically delete dump folders and drop-folder artifacts so the sandbox doesn't fill up.
The problem I'm facing is that IIS maintains a handle to the original deployment folder, so the batch script cannot rename it. I used Process Explorer to see which process is using the folder. It's w3wp.exe, which is a worker process for the application pool my app sits in. I tried killing all w3wp.exe instances before renaming the folder, but this did not work. I then decided to stop the application pool, rename the folder, and start it again. This did not work either.
In either case, Process Explorer showed that there were still open handles to my outputs, except this time the owner wasn't w3wp.exe but something along the lines of an unidentified process. At one point I saw that the owner was System, but killing System's process tree shuts down the server.
Is there any way to properly remove all handles to my deployment folder so the batch script can safely rename it?
Use the Windows Sysinternals tool called Handle v4.0: https://technet.microsoft.com/en-us/sysinternals/bb896655.aspx
Tools like Process Explorer can find and forcibly close file handles; however, the state and behaviour of the application (both yours and, in this case, IIS) after doing this is undefined. Some applications won't care, some will error, and others will crash hard.
The correct solution is to allow IIS to cleanly release its locks and clean up after itself, to preserve server stability. If this is not possible, you can either create another site on the same box or set up a new box with the new content, and then move the domain name/IP across to "promote" the new content to production.