Is there any way to stop the build of a materialized view in Cassandra (3.7)?
Background: I created two materialized views A and B (full disclosure: I may have attempted to drop them before the build was complete), and those views seem to be perpetually stuck. Any attempt to create another view C on the same table seems to hang. Using nodetool:
nodetool viewbuildstatus <keyspace>.<view>
shows a combination of STARTED and UNKNOWN for A and B, and STARTED for C. Using cql:
select * from system.views_builds_in_progress
all views are listed, but generation_number and last_token have not changed in the last 24 hours (generation_number is in fact null for A).
It's not documented, but nodetool stop actually takes any compaction type (of which the view build is one), not just the ones listed. So you can simply:
nodetool stop VIEW_BUILD
Or you can hit JMX directly with the org.apache.cassandra.db:type=CompactionManager mbean's stopCompaction operation.
All that's really going to do is set a flag for the view builder to stop on its next loop. If it threw an uncaught exception or something, so it's no longer doing anything (worth checking the system/output logs), the stop won't do anything either. In that case it's not really hurting anything, so you can ignore it and retry. Worst case, restart the node.
Firstly, I am new to SSIS and not sure what format to use to explain the problem in a repeatable way for testing.
So I have a task that loops through a folder looking for flat files and, if they are in the correct format, loads them into a SQL Server staging table. This seems to work correctly.
The next bit is the strange bit. I have a split for success and a split for failure (for user notification etc.). On both legs I run a SQL task to count the number of rows in the staging table to process. If it is greater than zero, I want to run my SQL stored procedure that handles the load into the main database table for the data, then clean up.
In my debug email I see the number of rows is greater than 0, but the task does not proceed. I cannot understand why. I have tried a number of constraint combinations (On Completion, On Success, On Success and RowsToProcess>0).
I have also tried removing and adding my SQL Task and remapping.
(BTW all my tasks function in SSMS etc)
Nothing seems to work. The only thing is that 2 branches re-join at this point, but surely that would not affect it, would it? (See the original screenshots below.)
Here is my control flow, If I have missed anything please add a comment and I will supply the information if I know it
My results from executions
On further testing, having two executables does seem to be the problem! So, additional questions:
Is it expected behaviour that two constraints joining back to a SQL task stop the flow in SSIS (in BPA task centre they do not)?
If this is the case, does this mean you have to repeat code in parallel? (For example, if I was to run a clean-up script at the end of my flow, I would have to write it twice, once for the success leg and once for the failure leg.) This seems inefficient; am I missing something?
I think I've found the answer you are looking for here (go give it an upvote)
I was not aware of it, but basically, it says that both branches must complete successfully (logical AND) in order for the flow to proceed, which won't happen, as the right branch in this case won't even run.
The default behavior can be changed in the "constraint properties" (right-click on the arrow that links the blocks) under the "Multiple Constraints" section.
That said, I've never used this method, but that's up to the use case. I would have used "sequence containers" to avoid replicating blocks or multiple constraints, which is not the same but works as long as you don't have to pass data from one block to the other.
In your case, I'd put all the blocks inside a sequence container except the last 2 and, after the container completes, execute the last two, as they must always be run after the previous block of operations.
note about containers:
Containers do not output data but are useful to group operations logically, and you can even run just a container instead of the whole package (from VS).
Seems today I've learned something new too...
Hope this helps you solve your issue
I faced today strange case when receiving customer database for investigation.
System settings:
Firebird server v 2.5.9.26074
Firebird client v 2.6.5
Database file is accessed directly by the application, i.e., it is NOT registered via aliases.conf.
When I first looked into the database, everything seemed to be pretty consistent. However, during the first startup there are two rows added in certain table without any detected SQL execution. I have confirmed with a debugger that the application is not adding these rows. I also used the Audit and Trace interface (fbtracemgr) and saw in the log file that no such rows are added to the database.
There is one hint that something is wrong in the original database. The table that contains the problem uses an INSERT trigger to set the row's ID column value from a generator. Now the generator value seems to be one too high in the original database. This leads me to think that the "ghost data" has already been entered in the file in some sort of cache, as the generator has already been incremented by one.
The result is that after these two ghost rows are added, the next real addition to the table leads to an exception:
FirebirdSql.Data.FirebirdClient.FbException (0x80004005): violation of
PRIMARY or UNIQUE KEY constraint "INTEG_275" on table "DATALOG" --->
violation of PRIMARY or UNIQUE KEY constraint "INTEG_275" on table
"DATALOG"
as there already exist row with equal ID that the generator suggests.
Is there a persistent "unsaved data cache" that could contain row data entered during previous application runs? What could lead to this situation? A power failure during a database write or backup?
Any thoughts?
Firebird server v 2.5.9.26074
There is no such version released.
Firebird-2.5.8.27089
http://www.firebirdsql.org/en/firebird-2-5/
Basically you seem to be using some unstable internal developer build of Firebird, which can have any number of strange adverse effects.
So I would advise using a standard released version or, if using snapshot builds is required for some untold reason, asking the developers on the firebird-support mailing list: http://www.firebirdsql.org/en/support/
Though don't hold your breath for much support for exotic Firebird builds.
UPD. Thanks to Mark, here it is: https://www.firebirdsql.org/en/firebird-2-5-0/
2.5.0 - was the first release after a significant reworking of the engine. Not the most stable, obviously. For example there was an issue with indices right in the next 2.5.1 version.
If the behavior is repeated on standard Firebird 2.5.8, then I would suggest exporting the whole database (at least all the metadata, but maybe the data as well) into a long text file (an SQL script) and then searching for the said table name in it. For example, there might be on-database-connect triggers adding some data. Or stored procedures. Or views made on triggers. Or something else yet. For example, though it is malpractice, even a UDF may make its own database connection and do things, though this should be shown in FBTrace.
However, during the first startup there are two rows added in certain table
Startup of what?
Will those rows still be added if you use standard tools like iSQL/FlameRobin/IBExpert/etc. just to connect and then disconnect from the database?
as there already exist row with equal ID that the generator suggests
A generator cannot suggest things like that. It can only suggest that such a number was once reserved for possibly being added to one table or another. It does not mean the row was actually inserted, that it was inserted into that table, or that it was not deleted later.
You may try to search with indices prohibited, in case index corruption has occurred, something like
select id+0, count(*) from tableName group by 1
Also http://www.firebirdfaq.org/faq324/
when receiving customer database for investigation
BTW, how exactly did they create the copy of the database they gave you?
Did they make a backup (FBK)? If not, did they stop the Firebird server before making copies?
I'm trying out ArangoDB and having some trouble. I successfully imported ~1.3 million documents and I'm trying to rearrange the document data in the database, but the following query (run through the Arango shell) just slows Arango to a crawl until eventually the shell gives me an error: [ArangoError 2001: Error reading from: 'tcp://127.0.0.1:8529' 'timeout during read']
FOR d IN DocumentCollection
UPDATE d WITH {'uid': d.property1.property2} IN DocumentCollection
Should this query work? Am I doing something wrong? Is there some way to speed it up?
It is (still) working.
You can use the queries Module to observe the query in action.
You can make arangosh wait more patiently with the --server.request-timeout option.
The performance problem here is that the whole collection has to be loaded into memory for this operation, since it can't chunk that internally (yet).
If you are able to split that into a series of queries using FILTER and ranges, you'd probably reach your goal faster.
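A rough sketch of that splitting, in Python, assuming the documents carry a numeric attribute to range over. The attribute name `seq` and the helper `chunked_update_queries` are hypothetical and not part of the original question. The function only builds the AQL text and bind variables; running each pair through arangosh or a driver is left out.

```python
def chunked_update_queries(lower, upper, step):
    """Yield (aql, bind_vars) pairs, each covering one half-open range [lo, hi).

    Assumes a hypothetical numeric attribute `seq` on every document.
    """
    aql = (
        "FOR d IN DocumentCollection "
        "FILTER d.seq >= @lo AND d.seq < @hi "
        "UPDATE d WITH {uid: d.property1.property2} IN DocumentCollection"
    )
    for lo in range(lower, upper, step):
        # Clamp the last chunk so it never runs past the upper bound.
        yield aql, {"lo": lo, "hi": min(lo + step, upper)}


# ~1.3 million documents in chunks of 100,000
batches = list(chunked_update_queries(0, 1_300_000, 100_000))
```

Each batch is then a separate, smaller query, so no single request has to cover the whole collection.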
We have a SQL Server database table that consists of user id, some numeric value, e.g. balance, and a version column.
We have multiple threads updating this table's value column in parallel, each in its own transaction and session (we're using a session-per-thread model). Since we want all logical transactions to occur, each thread does the following:
1. load the current row (mapped to a type).
2. make the change to the value, based on old value. (e.g. add 50).
3. session.update(obj)
4. session.flush() (since we're optimistic, we want to make sure we had the correct version value prior to the update)
5. if step 4 (flush) threw StaleStateException, refresh the object (with lockmode.read) and go to step 1
we only do this a certain number of times per logical transaction, if we can't commit it after X attempts, we reject the logical transaction.
each such thread commits periodically, e.g. after 100 successful logical transactions, to keep commit-induced I/O to manageable levels. meaning - we have a single database transaction (per transaction) with multiple flushes, at least once per logical change.
what's the problem here, you ask? well, on commits we see changes to failed logical objects.
specifically, if the value was 50 when we went through step 1 (for the first time), and we tried to update it to 100 (but we failed since e.g. another thread changed it to 70), then the value of 50 is committed for this row. obviously this is incorrect.
What are we missing here?
Well, I do not have a ton of experience here, but one thing I remember reading in the documentation is that if an exception occurs, you are supposed to immediately roll back the transaction and dispose of the session. Perhaps your issue is related to the session being in an inconsistent state?
Also, calling update in your code here is not necessary. Since you loaded the object in that session, it is already being tracked by NHibernate.
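To illustrate the general idea outside NHibernate, here is a minimal sketch of a version-checked update with retry, using Python's sqlite3 module rather than SQL Server. The table `account`, the function `add_to_balance` and the `before_write` hook are all made up for the example. The point it shows is that every retry re-reads the row, so a failed attempt never leaves a value computed from stale state lying around to be written later.

```python
import sqlite3


def add_to_balance(conn, user_id, delta, max_attempts=5, before_write=None):
    """Optimistic update: retry with freshly read state on a version conflict."""
    for attempt in range(max_attempts):
        # Re-read on every attempt so the new value never derives from stale data.
        balance, version = conn.execute(
            "SELECT balance, version FROM account WHERE user_id = ?", (user_id,)
        ).fetchone()
        if before_write is not None:
            before_write(attempt)  # hypothetical hook to simulate a concurrent writer
        cur = conn.execute(
            "UPDATE account SET balance = ?, version = ? "
            "WHERE user_id = ? AND version = ?",
            (balance + delta, version + 1, user_id, version),
        )
        if cur.rowcount == 1:  # the version still matched, so the write went through
            return True
    return False  # reject the logical transaction after too many conflicts


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (user_id INTEGER PRIMARY KEY, balance INTEGER, version INTEGER)"
)
conn.execute("INSERT INTO account VALUES (1, 50, 0)")


def concurrent_writer(attempt):
    # Simulates another thread changing the balance to 70 during the first attempt.
    if attempt == 0:
        conn.execute("UPDATE account SET balance = 70, version = 1 WHERE user_id = 1")


ok = add_to_balance(conn, 1, 50, before_write=concurrent_writer)
final = conn.execute(
    "SELECT balance, version FROM account WHERE user_id = 1"
).fetchone()
```

With an ORM session the equivalent would be discarding the stale object (or the whole session) before retrying; how that maps onto NHibernate's API is not shown here.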
If you want to make your changes anyway, why do you bother with row versioning? It sounds like you should get the same result if you simply always update the data and let the last transaction win.
As to why the update becomes permanent, it depends on what the SQL statements for the version check/update look like and on your transaction control, which you left out of the code example. If you turn on the Hibernate SQL logging it will probably become obvious how this is happening.
I'm not an NHibernate guru, but the answer seems simple.
When NHibernate loads an object, it expects it not to change in the db as long as it's in the NHibernate session cache.
As you mentioned, you've got a multi-threaded app.
This is what happens:
1st thread loads an entity
2nd thread loads an entity
1st thread changes entity
2nd thread changes the entity, finds out that the loaded entity has been changed by something else and, rather than risk overwriting the changes the 1st thread made, throws an exception to make the programmer aware of that.
You are missing a locking mechanism. I can't tell much about how to apply that properly and elegantly. Maybe a transaction would help.
We had similar problems when we used nhibernate and raw ado.net concurrently (luckily - just for querying - at least for production code). All we had to do - force updating db on insert/update so we could actually query something through full-text search for some specific entities.
Had StaleStateException in integration tests when we used raw ado.net to reset db. NHibernate session was alive through bunch of tests, but every test tried to cleanup db without awareness of NHibernate.
Here is the documentation for exceptions in the session:
http://nhibernate.info/doc/nhibernate-reference/best-practices.html
So for the second day in a row, someone has wiped out an entire table of data as opposed to the one row they were trying to delete because they didn't have the qualified where clause.
I've been all up and down the mgmt studio options, but can't find a confirm option. I know other tools for other databases have it.
I'd suggest that you always write a SELECT statement with the WHERE clause first and execute it to actually see which rows your DELETE command will delete. Then just execute DELETE with the same WHERE clause. The same applies to UPDATEs.
Under Tools>Options>Query Execution>SQL Server>ANSI, you can enable the Implicit Transactions option which means that you don't need to explicitly include the Begin Transaction command.
The obvious downside of this is that you might forget to add a Commit (or Rollback) at the end, or worse still, your colleagues will add Commit at the end of every script by default.
You can lead the horse to water...
You might suggest that they always take an ad-hoc backup before they do anything (depending on the size of your DB) just in case.
Try using a BEGIN TRANSACTION before you run your DELETE statement.
Then you can choose to COMMIT or ROLLBACK it.
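The same check-before-commit habit can be sketched in code. The example below uses Python's sqlite3 module, not SQL Server or SSMS, purely to show the pattern; the table `orders` and the helper `guarded_delete` are hypothetical. The DELETE runs inside a transaction and is committed only if it affected the number of rows the caller expected.

```python
import sqlite3


def guarded_delete(conn, sql, params, expected_rows):
    """Run a DELETE inside a transaction; commit only if the row count matches."""
    cur = conn.execute(sql, params)  # sqlite3 opens a transaction implicitly here
    if cur.rowcount == expected_rows:
        conn.commit()
        return True
    conn.rollback()  # unexpected row count: undo the delete
    return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, note TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")])
conn.commit()

# Forgetting the WHERE clause touches every row, so the delete is rolled back.
forgot_where = guarded_delete(conn, "DELETE FROM orders", (), expected_rows=1)
# The qualified delete affects exactly one row and is committed.
with_where = guarded_delete(conn, "DELETE FROM orders WHERE id = ?", (2,), expected_rows=1)
remaining = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

In a query window the person plays the role of `guarded_delete`: look at the "rows affected" message, then type COMMIT or ROLLBACK.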
In SSMS 2005, you can enable this option under Tools|Options|Query Execution|SQL Server|ANSI ... check SET IMPLICIT_TRANSACTIONS. That will require a commit to affect update/delete queries for future connections.
For the current query, go to Query|Query Options|Execution|ANSI and check the same box.
This page also has instructions for SSMS 2000, if that is what you're using.
As others have pointed out, this won't address the root cause: it's almost as easy to paste a COMMIT at the end of every new query you create as it is to fire off a query in the first place.
First, this is what audit tables are for. If you know who deleted all the records, you can either restrict their database privileges or deal with them from a performance perspective. The last person who did this at my office is currently on probation. If she does it again, she will be let go. You have responsibilities if you have access to production data, and ensuring that you cause no harm is one of them. This is a performance problem as much as a technical problem. You will never find a way to prevent people from making dumb mistakes (the database has no way to know if you meant delete table a or delete table a where id = 100, and a confirm will get hit automatically by most people). You can only try to reduce them by making sure the people who run this code are responsible and by putting policies in place to help them remember what to do. Employees who have a pattern of behaving irresponsibly with your business data (particularly after they have been given a warning) should be fired.
Others have suggested the kinds of things we do to prevent this from happening. I always embed a select in a delete that I'm running from a query window to make sure it will delete only the records I intend. All our code on production that changes, inserts or deletes data must be enclosed in a transaction. If it is being run manually, you don't run the rollback or commit until you see the number of records affected.
Example of delete with embedded select
-- To preview, highlight from "select" to the end and run that first.
delete a
--select a.*
from table1 a
join table2 b on a.id = b.id
where b.somefield = 'test'
But even these techniques can't prevent all human error. A developer who doesn't understand the data may run the select and still not understand that it is deleting too many records. Running in a transaction may mean you have other problems when people forget to commit or rollback and lock up the system. Or people may put it in a transaction and still hit commit without thinking just as they would hit confirm on a message box if there was one. The best prevention is to have a way to quickly recover from errors like these. Recovery from an audit log table tends to be faster than from backups. Plus you have the advantage of being able to tell who made the error and exactly which records were affected (maybe you didn't delete the whole table but your where clause was wrong and you deleted a few wrong records.)
For the most part, production data should not be changed on the fly. You should script the change and check it on dev first. Then on prod, all you have to do is run the script with no changes rather than highlighting and running little pieces one at a time. Now, in the real world this isn't always possible, as sometimes you are fixing something broken only on prod that needs to be fixed now (for instance, when none of your customers can log in because critical data got deleted). In a case like this, you may not have the luxury of reproducing the problem first on dev and then writing the fix. When you have these types of problems, you may need to fix directly on prod, and you should have only DBAs, database analysts, configuration managers or others who are normally responsible for data on prod do the fix, not a developer. Developers in general should not have access to prod.
That is why I believe you should always:
1 Use stored procedures that are tested on a dev database before deploying to production
2 Select the data before deletion
3 Screen developers using an interview and performance evaluation process :)
4 Base performance evaluation on how many database tables they do/do not delete
5 Treat production data as if it were poisonous and be very afraid
So for the second day in a row, someone has wiped out an entire table of data as opposed to the one row they were trying to delete because they didn't have the qualified where clause
Probably the only solution will be to replace someone with someone else ;). Otherwise they will always find a workaround.
Alternatively, restrict the database access for that person, provide them with a stored procedure that takes the parameter used in the where clause, and grant them access to execute that stored procedure.
Put on your best Trogdor and Burninate until they learn to put in the WHERE clause.
The best advice is to get the muckety-mucks that are mucking around in the database to use transactions when testing. It goes a long way towards preventing "whoops" moments. The caveat is that now you have to tell them to COMMIT or ROLLBACK because for sure they're going to lock up your DB at least once.
Lock it down:
REVOKE delete rights on all your tables.
Put in an audit trigger and audit table.
Create parametrized delete SPs and only give rights to execute on an as needed basis.
Isn't there a way to give users the results they need without providing raw access to SQL? If you at least had a separate entry box for "WHERE", you could default it to "WHERE 1 = 0" or something.
I think there must be a way to back these out of the transaction journaling, too. But probably not without rolling everything back, and then selectively reapplying whatever came after the fatal mistake.
Another ugly option is to create a trigger to write all DELETEs (maybe over some minimum number of records) to a log table.
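As an illustration of the trigger idea, here is a small sketch using Python's sqlite3 module. SQLite trigger syntax differs from SQL Server's, and the table and trigger names (`customer`, `customer_delete_log`, `customer_audit_delete`) are invented for the example. Every deleted row is copied into a log table, which can then be used to put the rows back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE customer_delete_log (
        id INTEGER,
        name TEXT,
        deleted_at TEXT DEFAULT CURRENT_TIMESTAMP
    );
    -- Copy each deleted row into the log table.
    CREATE TRIGGER customer_audit_delete AFTER DELETE ON customer
    BEGIN
        INSERT INTO customer_delete_log (id, name) VALUES (OLD.id, OLD.name);
    END;
    INSERT INTO customer VALUES (1, 'Ann');
    INSERT INTO customer VALUES (2, 'Bob');
    """
)

# The accidental delete without a WHERE clause.
conn.execute("DELETE FROM customer")
logged = conn.execute(
    "SELECT id, name FROM customer_delete_log ORDER BY id"
).fetchall()

# Recovery: re-insert the logged rows.
conn.execute("INSERT INTO customer (id, name) SELECT id, name FROM customer_delete_log")
restored = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
```

A real audit table would usually also record who ran the statement; that part depends on the database and is left out here.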