Unable to run 'USING PERIODIC COMMIT' since last week on NEO4J

Unable to run 'USING PERIODIC COMMIT' since last week on NEO4J - database

I have downloaded the Desktop Version of NEO4J on my MAC last week. (It's version 1.2.4)
Neo4j Browser version: 4.0.3
Neo4j Server version: 3.5.14 (enterprise)
Last week I was using the USING PERIODIC COMMIT command of loading in a CSV as seen below, this set up my relationships fine. However as of a couple of days ago, I tried to do the exact same command however now I get an error which is shown as Executing queries that use periodic commit in an open transaction is not possible. Please can someone explain to me what has gone wrong please?
query:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM
"file:/Volumes/Twitter_Dataset/tweets.csv" AS csvLine
MATCH (tweet:Tweet {tweetID: csvLine.tweetID})
MATCH (user:User {username: csvLine.username})
MERGE (user)-[:POSTS]->(tweet);

The short answer:
Prefix your USING PERIODIC COMMIT queries with :auto
Changes were pushed out to provide more context here, so the error message now includes a link for more info about what's going on, as well as the :auto workaround above.
The long answer:
This is related to a recent feature improvement in the Neo4j Browser, which has a side effect with USING PERIODIC COMMIT operations, but there is a way around this, and a browser update was already pushed to provide more context with a clear workaround.
The latest round of Neo4j Browser updates include this change, which uses transactional functions instead of auto-committed transactions, giving queries through the browser automatic retry capability, and better ability to cope with member changes when hitting against a causal cluster.
The problem is that USING PERIODIC COMMIT needs to be run in an auto-committed transaction. This requires a means to switch whether we're using an auto-committed transaction or not.
You said you're using browser version 4.0.3. I believe that went out yesterday, and included with it changes providing details about what's going on and how to get this to work as normal. When encountering that error, you should now see a link with info on the :auto command, mentioning auto-committing transactions. Following the link should show an info card with:
The :auto command will send the Cypher query following it, in an auto committing transaction. In general this is not recommended because of the lack of support for auto retrying on leader switch errors in clusters.
Some query types do however need to be sent in auto-committing transactions, USING PERIODIC COMMIT is the most notable one.
An example is provided on the card for prefixing a USING PERIODIC COMMIT query with :auto to let it execute in an auto-committing transaction.

Related

Posting large directory of files to SOLR using post tool, how to commit after every file

I am using the java post tool for solr to upload and index a directory of documents. There are several thousand documents. Solr only does a commit at the very end of the process and sometimes things stop before it completes so I lose all the work.
Has anyone a technique to fetch the name of each doc and call post on that so you get the commit for each document? Rather than the large commit of all the docs at the end?

From the help page for the post tool:
Other options:
..
-params "<key>=<value>[&<key>=<value>...]" (values must be URL-encoded; these pass through to Solr update request)
This should allow you to use -params "commitWithin=1000" to make sure each document shows up within one second of being added to the index.

Committing after each document is an overkill for the performance, in any case it's quite strange that you had to resubmit anything from start if something goes wrong. I suggest to seriously to change the indexing strategy you're using instead of investigating in a different way to commit.
Given that, if you not have any other way that change the commit configuration, I suggest to configure autocommit in your Solr collection/index or use the parameter commitWithin, as suggested by #MatsLindh. Just be aware if the tool you're using has the chance to add this parameter.
autoCommit
These settings control how often pending updates will be automatically pushed to the index. An alternative to autoCommit is
to use commitWithin, which can be defined when making the update
request to Solr (i.e., when pushing documents), or in an update
RequestHandler.

Basic Approach to Diagnostic Logging in SSIS

Given help from this microsoft link, I am aware of many tools related to SSIS diagnostics:
Event Handlers (in particular, "OnError")
Error Outputs
Operations Reports
SSISDB Views
Logging
Debug Dump Files
I just want to know what is the basic, "go to" approach for (non-production) diagnostics setup with SSIS. I am a developer who WILL have access to the QA and UAT servers where I will be performing diagnostics.
In my first attempt to find the source of an error, I used SSMS to view operational reports. All I saw was this:
I followed the instructions shown above, but all it did was lead me in a circle. The overview allows me to see the details, but the details show the above message and ask me to go back to the overview. In short, there is zero error information beyond telling me which task failed within the SSIS package.
I simply want to get to a point where I can at last see SOMETHING about the error(s).
If the answer is that I first need to configure an OnError event in my package, then my next question is: what would the basic, "go to" designer-flow look like for that OnError event?
FYI, this question is similar to "best way of logging in SSIS"
I also noticed an overall strategy for success with SSIS in this answer. The author says:
Instrument your code - have it make log entries, possibly recording diagnostics such as check totals or counts. Without this, troubleshooting is next to impossible. Also, assertion checking is a good way to think of error handling for this (does row count in a equal row count in b, is A:B relationship really 1:1).
Sounds good! But I'd like to have a more concrete example...particularly for feeding me the basics of what specific errors were generated.
I'm trying to avoid learning ALL the SSIS diagnostic approaches, just for the purpose of picking one good "all around" approach.
Update
Per Nick.McDermaid suggestion, in the SSISDB DB I run this:
SELECT * FROM [SSISDB].[catalog].[executions] ORDER BY start_time DESC
This shows to me the packages that I manually executed. The timestamps correctly reflect when I ran them. If anything is unusual(?) it is that the reference_id, reference_type and environment_name columns are null. All the other columns have values.
Update #2
I discovered half the answer I'm looking for. The reason no error information is available, is because by default the SSIS package execution logging level is "none". I had to change the logging level.
Nick.McDermaid gave me the rest of the answering by explaining that I don't need to dive into OnError tooling or SSIS logging provider tooling.

I'm not sure what the issue with your reports are but in answer to the question "Which SSIS diagnostics should I learn", I suggest the vanilla ones out of the box.
In other words use built in SSIS logging (which does not require any additional code) to log failures. Then use the built in reports (once you get them working) to check those logs.
vanilla functionality requires no maintenance. Custom functionality (i.e. filling your packages up with OnError events) requires a lot more maintenance.
You may find situations where you need to learn some of the SSISDB tricks to troubleshoot but in the first instance, try to get everything you can out of the vanilla reports.
If you need to maintain an SQL 2012 or after existing system, then all of this logging is built in. Manual OnError additions are not guaranteed to be built in
The only other thing to be aware of is that script tasks never yield informative errors. I actually suggest you avoid the use of script tasks in SSIS. I feel that if you have to use a script task, you might be using the wrong tool

Adding to the excellent answer of #Nick.McDermaid.
I use SSIS Catalog error reporting. In most cases, it is sufficient and have the following functionality for error analysis. Emphasis is on the following:
Usually the first or second error message contains meaningful information on error. The latter is some error occurred in the dataflow....
If you look at the first/second error message at All Messages report at Error Messages section, you will see Error Context hyperlink. Invoking it will show you environment, connection managers and some variables at the moment of package crash.
Good error analysis is more an approach and practice than a mere tool selection. Here are my recommendations:
SSIS likes to report error code instead of meaningful explanation. So, Integration Services Error and Message Reference is your friend.
SSIS includes in error context (see above) dump those variables which have Include in ErrorDump property set to true.
I build SQL commands for SQL Task or DataFlow Source in variables. This allows to display SQL command executed at error in error context, when you set incude in Dump property on these variables.
Structure your variables well. If some variable is used only at some task - declare it on this task. Otherwise a mess of dumped variables will hurt you more than do any good.

Will Jira complain if I set the Resolution date to a date before the creation via direct DB write?

Some colleagues were using an Excel file to keep track of some issues, and they have decided to switch to a better system, asking me to setup a Jira project for them and to import all the tickets. A way or the other I have done everything, but the resolution date is now wrong, because it's the one of when I ran the script to import them into Jira. They would like to have the original one, so that they can know when an issue was really fixed. Unfortunately there's no way to change it from Jira's interface, so I have to access the DB directly. The command, for the record, is like:
update jiraissue
set RESOLUTIONDATE = "2015-02-16 14:48:40"
where pkey = "OV001-1";
Now, low-level writes to a database in general are dangerous, and I am wondering whether there can be any risks. Our test server is not available right now, so I'd have to work directly on the production one. One thing I had seen on our test server is that this seemed to work, except that JQL queries such as
resolved < 2015-03-20
are wrong because they still use the old Resolution date. Clearly, I have to reindex; but I'm wondering whether it is safe. Does Jira perform some consistency checks? Like, verifying that a ticket is solved after it is created. In my case, since I have modified the resolution date but not the creation, it is a clear inconsistency. Will Jira complain about this? Is there the risk to corrupt the DB? And if I also modify the creation date, do I have to watch out for other things?
We are using Jira 5.2.11.

I have access to the test server again, and I have tried it. I have modified all the RESOLUTIONDATE fields I had to fix, and when I reloaded the page the new date was there. Jira didn't complain about anything. I reindexed the server, so that queries yield correct results, and I saw no issues. Then I even ran the integrity checks (Administration -> System -> Integrity Checker), and no error was found.
Finally I did the same on the production server, and again everything is running fine.
I can therefore conclude that the operation is not dangerous at all, and it can be done safely.

SQL Timeouts in CRM 4.0 - import of customizations fail

When importing customizations to CRM 4.0, the import fails with a message "generic SQL error". Digging a bit deeper the error message is really that a timeout has occured. The same error occurs when trying to create a new entity.
I increased the timeout as suggested in the link below, but the timeout occured anyway - it just took longer time to happen.
Increasing the timeout:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\MSCRM\OLEDBTimeout
This value does not exist by default, and if it does not exist then the default value will be 30 seconds. To change it, add the registry value (of type DWORD), and put in the value you want (measured in seconds). Then you have to recycle the CrmAppPool application pool for the change to take effect
https://community.dynamics.com/product/crm/crmtechnical/b/crmdavidjennaway/archive/2008/09/04/sql-timeouts-in-crm-generic-sql-error.aspx
The SQL profiler displays a set of inserts and updates related to the metadata in CRM, and then a call to the stored procedure exec p_RecreateIndexes
This call is apparently the culprit and never completes in a timely fashion (30+ minutes now and not completed yet). This is an existing test instance of CRM and is quite extensively used and filled with lots of data. Creating new entities has never before taken this long. Just in case, I have run the asyncoperation cleaning scripts from MS. It did not have any visible effect.
Is there any way to find the reason for the delay in this procedure, or some other solution I can try?

Try splitting up your import into chunks. For example, import the first 20 entities, then the second 20 entities, and so on until you've imported all of them. Then publish. Then go back and try importing the entire customizations file at the same time and republish. Following this method exactly has been the only way we've found to import some customization files in particularly stubborn environments.

It sounds like the re-index operation is taking some time - which would be as expected if the data size is large, and the fragmentation is high. It also depends on exactly what that stored procedure does, and how many cores/CPUs you have - and how many SQL Server is allowed to use.
Does the app allow you to defer that operation? You'd be able to run it manually yourself through Management Studio - if that doesn't break the application.
You could be cheeky, and rename that procedure, and replace it with one of your own that does nothing.... and then rename back, and run. Again, it might break something.
Or just keep increasing the timeout until you get past this issue. Some of my re-index jobs on databases generally take hours....
Or contact the vendor if you have support?
If you ran that query through Management Studio, it would complete.... doing that would give you the approximate time required to put (temporarily) into the timeout setting.

I just experienced the exact same problem and I was only importing 2 entities.
I found your questions when I was googling for p_RecreateIndexes after seeing it in the Trace files.
I ended up running exec p_RecreateIndexes in SQL Server. After it completed - about 2 minutes - I reran the Entity import and it worked.

NHibernate will not insert a record

I have an application that is now 4+ years old that is exhibiting some odd behavior on our latest deployment. The application uses nHibernate for all inserts / updates / selects, etc. We are currently using .NET 2.0, and nHibernate 1.2 (I know, we need to upgrade)
This deployment is on Windows 2008 Server x64, IIS 7.5 - what I have seen so far is that the application runs, but is unable to insert or update records in the DB - reads seem fine so far, but writes are a problem. SOME writes actually work, inserts into some small tables, but most never even make it to the DB.
Using SQL Profiler, the insert / updates never make it to the server, and turning log4net up to DEBUG, and show_sql true - the select statements appear, but the insert / update statements never make it into the log at all, and never show up at the server.
What's even more odd is that the application seems to be oblivious to this - the commandandclose runs without exception (open session in view with an httpmodule), the domain objects come back with uuid's generated, etc. but never get persisted.
Certainly an upgrade is due, but I would hate to try it during a deployment, and without time to accurately test the app. Any ideas?

My guess is that the default ISession FlushMode has been changed from Auto to Never or Commit. Never means that the session will flush when Flush() is called by the application; Commit means that the session will flush when a transaction is committed.

Back out your current deployment and return to what you had before. Then look for the mistake someone made. If it used to insert and now does not, then something is wrong with your current code. If it isn't creating the insert/update statments, then I'd look first at where they are supposed to be created. Did the current deplyment actually insert record or update them in dev? Did anybody test that or were you relying onthe fact that it didn;t pop up an error? If it did work in dev and doesn't work in prod, I'd look at the envirnmental differnences between dev and prod.

Both good answers, the problem was in the deployment. The web.config was setup for IIS6, and the deployment to IIS7 did not properly setup the open session in view HttpModule that is used to commit the transaction. Changing the pipeline mode from Integrated to Classic solved the problem.