In Vertica, is there any way to access admintools without admin rights?

In Vertica, is there any way to access admintools without admin rights? And how can a normal user find node and cluster configuration details?

You might have already tried it.
You must have access at the Linux shell level on one of the Vertica nodes, as a user that is allowed to write to /opt/vertica/log/adminTools.log. And that, by default, is dbadmin.
I would regard it as quite a security risk to tamper with the permissions around that, really.
A good start would be:
$ vsql -U <user> -w <pwd> -h 127.127.127.128 -d dbnm -x -c "select * from host_resources"
-[ RECORD 1 ]------------------+-----------------------------------------
host_name | 127.127.127.128
open_files_limit | 65536
threads_limit | 7873
core_file_limit_max_size_bytes | 0
processor_count | 1
processor_core_count | 2
processor_description | Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz
opened_file_count | 7
opened_socket_count | 10
opened_nonfile_nonsocket_count | 7
total_memory_bytes | 8254820352
total_memory_free_bytes | 1034915840
total_buffer_memory_bytes | 523386880
total_memory_cache_bytes | 5516861440
total_swap_memory_bytes | 2097147904
total_swap_memory_free_bytes | 2097147904
disk_space_free_mb | 425185
disk_space_used_mb | 76682
disk_space_total_mb | 501867
system_open_files | 1440
system_max_files | 798044
-[ RECORD 2 ]------------------+-----------------------------------------
host_name | 127.127.127.129
open_files_limit | 65536
threads_limit | 7873
core_file_limit_max_size_bytes | 0
processor_count | 1
processor_core_count | 2
processor_description | Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz
opened_file_count | 7
opened_socket_count | 9
opened_nonfile_nonsocket_count | 7
total_memory_bytes | 8254820352
total_memory_free_bytes | 1836150784
total_buffer_memory_bytes | 487129088
total_memory_cache_bytes | 4774060032
total_swap_memory_bytes | 2097147904
total_swap_memory_free_bytes | 2097147904
disk_space_free_mb | 447084
disk_space_used_mb | 54783
disk_space_total_mb | 501867
system_open_files | 1408
system_max_files | 798044
-[ RECORD 3 ]------------------+-----------------------------------------
host_name | 127.127.127.130
open_files_limit | 65536
threads_limit | 7873
core_file_limit_max_size_bytes | 0
processor_count | 1
processor_core_count | 2
processor_description | Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz
opened_file_count | 7
opened_socket_count | 9
opened_nonfile_nonsocket_count | 7
total_memory_bytes | 8254820352
total_memory_free_bytes | 1747091456
total_buffer_memory_bytes | 531447808
total_memory_cache_bytes | 4813959168
total_swap_memory_bytes | 2097147904
total_swap_memory_free_bytes | 2097147904
disk_space_free_mb | 451444
disk_space_used_mb | 50423
disk_space_total_mb | 501867
system_open_files | 1408
system_max_files | 798044
In general, check the Vertica documentation for the system tables and the information available there.
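If the user can connect with vsql at all, the same approach extends to other system tables. A minimal sketch (v_catalog.nodes and v_monitor.disk_storage are standard Vertica system tables; a non-dbadmin user may first need to be granted SELECT on them by the dbadmin):
$ vsql -U <user> -w <pwd> -h 127.127.127.128 -d dbnm -c "SELECT node_name, node_state, node_address FROM v_catalog.nodes"
$ vsql -U <user> -w <pwd> -h 127.127.127.128 -d dbnm -x -c "SELECT node_name, storage_path, disk_space_used_mb, disk_space_free_mb FROM v_monitor.disk_storage"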

Related

What should be the minimum ratio of dead tuples for a table to be considered for VACUUM FULL in Postgres?

I am a developer looking for advice on optimising and maintaining a Postgres database.
I am currently investigating commands that help with clean-up/defragmentation of the DB and release space back to the filesystem, as the DB's disk usage is growing quickly. I found that "VACUUM FULL" can help release the space held by dead tuples. However, I could not find information on how many dead tuples (or what percentage) should exist before we consider running this command.
Currently we have two tables in a Nextcloud Postgres database which have dead tuples. Also, the total relation size for these tables is higher than the disk usage reported by the \dt+ command. I am providing the stats below. Please advise whether they are eligible for "VACUUM FULL" based on the given stats.
###########################################
Disk space usage per table (\dt+ command)
###########################################
Schema | Name | Type | Owner | Size | Description
--------+-----------------------------+-------+----------+------------+-------------
public | oc_activity | table | XXXXXXXX | 4796 MB |
public | oc_filecache | table | XXXXXXXX | 127 MB |
#################################
oc_activity total relation size
#################################
SELECT pg_size_pretty( pg_total_relation_size('oc_activity') )
----------------
pg_size_pretty
----------------
9666 MB
########################################
Additional stats for oc_activity table
########################################
-[ RECORD 1 ]-------+------------------------------
relid               | yyyyy
schemaname          | public
relname             | oc_activity
seq_scan            | 272
seq_tup_read        | 1046966870
idx_scan            | 4737
idx_tup_fetch       | 57914604
n_tup_ins           | 1548217
n_tup_upd           | 0
n_tup_del           | 325585
n_tup_hot_upd       | 0
n_live_tup          | 11440511
n_dead_tup          | 940192
n_mod_since_analyze | 268430
last_vacuum         |
last_autovacuum     |
last_analyze        |
last_autoanalyze    | 2023-02-15 10:01:36.657028+00
vacuum_count        | 0
autovacuum_count    | 0
analyze_count       | 0
autoanalyze_count   | 3
###################################
oc_filecache total relation size
###################################
SELECT pg_size_pretty( pg_total_relation_size('oc_filecache') )
----------------
pg_size_pretty
----------------
541 MB
#########################################
Additional stats for oc_filecache table
#########################################
SELECT * FROM pg_stat_all_tables WHERE relname='oc_filecache'
-[ RECORD 1 ]-------+------------------------------
relid               | zzzzz
schemaname          | public
relname             | oc_filecache
seq_scan            | 104541
seq_tup_read        | 28525391484
idx_scan            | 1974398333
idx_tup_fetch       | 2003365293
n_tup_ins           | 43575
n_tup_upd           | 695612
n_tup_del           | 39541
n_tup_hot_upd       | 348823
n_live_tup          | 461510
n_dead_tup          | 19418
n_mod_since_analyze | 4069
last_vacuum         |
last_autovacuum     | 2023-02-16 10:46:15.165442+00
last_analyze        |
last_autoanalyze    | 2023-02-16 16:25:32.568168+00
vacuum_count        | 0
autovacuum_count    | 8
analyze_count       | 0
autoanalyze_count   | 33
There is no hard rule. I personally would consider a table uncomfortably bloated if the pgstattuple extension showed that less than a third or a quarter of the table is user data, with the rest being dead tuples and empty space.
Rather than regularly running VACUUM (FULL) (which means downtime, since it takes an exclusive lock on the table), you should strive to fix whatever causes the table bloat in the first place.
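If you want to measure it, a minimal sketch with pgstattuple (the extension must be installed; the columns selected are the ones the function actually returns):
CREATE EXTENSION IF NOT EXISTS pgstattuple;
SELECT tuple_percent, dead_tuple_percent, free_percent
FROM pgstattuple('public.oc_activity');
As a cheap first approximation from the stats you already posted: oc_activity has n_dead_tup = 940192 against n_live_tup = 11440511, i.e. about 8% dead tuples, which is nowhere near the bloat level described above. Note also that pg_total_relation_size counts indexes and TOAST data as well, which is why it reports more than the table size shown by \dt+.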

Is there any way to make vnode sync faster in a TDengine database?

I created a database with two replicas. When I try to write some data into the database, the vnode status is always "syncing". How can I make it sync faster?
taos> show vgroups;
vgId | tables | status | onlines | v1_dnode | v1_status | v2_dnode | v2_status | compacting |
=================================================================================================================
2 | 1000 | ready | 2 | 3 | master | 2 | slave | 0 |
3 | 1000 | ready | 2 | 2 | master | 1 | slave | 0 |
4 | 1000 | ready | 2 | 1 | master | 3 | slave | 0 |
5 | 1000 | ready | 2 | 3 | master | 2 | slave | 0 |
6 | 1000 | ready | 2 | 2 | master | 1 | slave | 0 |
7 | 1000 | ready | 2 | 1 | master | 3 | slave | 0 |
8 | 1000 | ready | 2 | 3 | master | 2 | slave | 0 |
9 | 1000 | ready | 2 | 2 | master | 1 | slave | 0 |
10 | 1000 | ready | 2 | 1 | master | 3 | syncing | 0 |
11 | 1000 | ready | 2 | 3 | master | 2 | slave | 0 |
Query OK, 10 row(s) in set (0.004323s)

Multi-dimensional data structure management in R

I have a question about data organisation and the best approach to simplifying some multi-layered data. Simply put, I have ~10 replicates of small wood beams (BeamID) subjected to ~10 different treatments (TreatID), and each beam is load-tested, which produces a series of Load values with corresponding Displacements (ranging from 10 to 50 rows per test; I have code that corrects for disparities in row length). Each wood beam is tested multiple times (Rep, ~10).
My plan was to lump all this data into a 5-D array:
Array[Load, Deflection, BeamID, TreatID, Rep]
This way, I should be able to plot the load~deflection curves for a given BeamID and TreatID, for all Reps, by using Array[ , ,1,1, ], right? So the hypothetical output for Array[ , ,1,1,1] would be:
+------------+--------+-----+
| Deflection | Load | Rep |
+------------+--------+-----+
| 0 | 0 | 1 |
| 6.35 | 10.5 | 1 |
| 12.7 | 20.8 | 1 |
| 19.05 | 45.3 | 1 |
| 25.4 | 75.2 | 1 |
+------------+--------+-----+
And Array[ , ,1,1,2] would be:
+------------+--------+-----+
| Deflection | Load | Rep |
+------------+--------+-----+
| 0 | 0 | 2 |
| 7.3025 | 12.075 | 2 |
| 14.605 | 23.92 | 2 |
| 21.9075 | 52.095 | 2 |
| 29.21 | 86.48 | 2 |
+------------+--------+-----+
Or I think I could keep it as a simpler, 'melted' dataframe, which would have columns for Load and Deflection, and BeamID, TreatID, and Rep would be repeated for each row of the test output.
+------------+--------+-----+--------+---------+
| Deflection | Load | Rep | BeamID | TreatID |
+------------+--------+-----+--------+---------+
| 0 | 0 | 1 | 1 | 1 |
| 6.35 | 10.5 | 1 | 1 | 1 |
| 12.7 | 20.8 | 1 | 1 | 1 |
| 19.05 | 45.3 | 1 | 1 | 1 |
| 25.4 | 75.2 | 1 | 1 | 1 |
| 0 | 0 | 2 | 1 | 1 |
| 7.3025 | 12.075 | 2 | 1 | 1 |
| 14.605 | 23.92 | 2 | 1 | 1 |
| 21.9075 | 52.095 | 2 | 1 | 1 |
| 29.21 | 86.48 | 2 | 1 | 1 |
+------------+--------+-----+--------+---------+
However, with the latter, I'm not sure how I could easily and discretely pull out all the Rep test values for a specific BeamID and TreatID, especially since I use a linear model to fit a 3rd-order polynomial to a specific test to extract the slope of the curves. Having it all in one continuous dataframe means I'd have to specify start and stop points for each linear model, correct?
Thoughts, suggestions? Am I headed in the right direction in using a 5-D array? R is a new programming language for me, so please pardon my misunderstandings.

How to make a SQL "IF-THEN-ELSE" statement

I've seen other questions about SQL If-then-else stuff, but I'm not seeing how to relate it to what I'm trying to do. I've been using SQL for about a year now but only basic stuff and never this.
If I have a SQL table that looks like this
| Name | Version | Category | Value | Number |
|:-----:|:-------:|:--------:|:-----:|:------:|
| File1 | 1.0 | Time | 123 | 1 |
| File1 | 1.0 | Size | 456 | 1 |
| File1 | 1.0 | Final | 789 | 1 |
| File2 | 1.0 | Time | 312 | 1 |
| File2 | 1.0 | Size | 645 | 1 |
| File2 | 1.0 | Final | 978 | 1 |
| File3 | 1.0 | Time | 741 | 1 |
| File3 | 1.0 | Size | 852 | 1 |
| File3 | 1.0 | Final | 963 | 1 |
| File1 | 1.1 | Time | 369 | 2 |
| File1 | 1.1 | Size | 258 | 2 |
| File1 | 1.1 | Final | 147 | 2 |
| File2 | 1.1 | Time | 741 | 2 |
| File2 | 1.1 | Size | 734 | 2 |
| File2 | 1.1 | Final | 942 | 2 |
| File3 | 1.1 | Time | 997 | 2 |
| File3 | 1.1 | Size | 997 | 2 |
| File3 | 1.1 | Final | 985 | 2 |
How can I write a SQL IF/ELSE statement that creates a new column called "Replication" that follows this rule:
A = B + 1 when x = 1
else
A = B
where A = the number we will use for the next Number
B = Max(Number)
x = Replication count (this is the number of times that a loop is executed. x=i)
The results table will look like this:
| Name | Version | Category | Value | Number | Replication |
|:-----:|:-------:|:--------:|:-----:|:------:|:-----------:|
| File1 | 1.0 | Time | 123 | 1 | 1 |
| File1 | 1.0 | Size | 456 | 1 | 1 |
| File1 | 1.0 | Final | 789 | 1 | 1 |
| File2 | 1.0 | Time | 312 | 1 | 1 |
| File2 | 1.0 | Size | 645 | 1 | 1 |
| File2 | 1.0 | Final | 978 | 1 | 1 |
| File1 | 1.0 | Time | 369 | 1 | 2 |
| File1 | 1.0 | Size | 258 | 1 | 2 |
| File1 | 1.0 | Final | 147 | 1 | 2 |
| File2 | 1.0 | Time | 741 | 1 | 2 |
| File2 | 1.0 | Size | 734 | 1 | 2 |
| File2 | 1.0 | Final | 942 | 1 | 2 |
| File1 | 1.1 | Time | 997 | 2 | 1 |
| File1 | 1.1 | Size | 997 | 2 | 1 |
| File1 | 1.1 | Final | 985 | 2 | 1 |
| File2 | 1.1 | Time | 438 | 2 | 1 |
| File2 | 1.1 | Size | 735 | 2 | 1 |
| File2 | 1.1 | Final | 768 | 2 | 1 |
| File1 | 1.1 | Time | 786 | 2 | 2 |
| File1 | 1.1 | Size | 486 | 2 | 2 |
| File1 | 1.1 | Final | 135 | 2 | 2 |
| File2 | 1.1 | Time | 379 | 2 | 2 |
| File2 | 1.1 | Size | 943 | 2 | 2 |
| File2 | 1.1 | Final | 735 | 2 | 2 |
EDIT: Based on the answer by Sean Lange, this is my 2nd attempt at a solution:
SELECT COALESCE(MAX(Number) + CASE WHEN Replication = 1 THEN 1 ELSE 0 END, 1) FROM Table
The COALESCE is in there for when there is no value yet in the Number column.
The IF/ELSE construct is used to control the flow of statements in T-SQL. What you want is a CASE expression, which is used to conditionally return values in a column.
https://msdn.microsoft.com/en-us/library/ms181765.aspx
Yours would be something like:
case when x = 1 then A else B end as A
As SeanLange pointed out, in this case it would be better to use CASE/WHEN, but to illustrate IF/ELSE, the way to do it in SQL is like this:
IF x = 1
BEGIN
    -- Do something
END
ELSE
BEGIN
    -- Do something else
END
I would say the best way to know which one to use is this: if you are writing a query and want a different value to appear in a column based on a certain condition, use CASE/WHEN. If a certain condition should cause a different series of statements to run, use IF/ELSE.
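To make the difference concrete, a small sketch (T-SQL; MyTable is a placeholder name, and the CASE line is a per-row version of the rule in the question):
-- CASE: returns a value per row, inside a query
SELECT Name, Number,
       CASE WHEN Replication = 1 THEN Number + 1 ELSE Number END AS NextNumber
FROM MyTable;
-- IF/ELSE: decides which statements run at all
IF EXISTS (SELECT 1 FROM MyTable WHERE Replication = 1)
BEGIN
    PRINT 'at least one first-replication row exists';
END
ELSE
BEGIN
    PRINT 'no first-replication rows';
END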

MySQL Import into Innodb table severely spikes at a certain point

I'm trying to migrate a 30GB database from one server to another.
The short story is that at a certain point in the process, the time it takes to import records spikes severely. The following is from using the SOURCE command to import a chunk of 500k records (out of roughly 25-30 million throughout the database) that was exported as an SQL file and tunnelled over ssh to the new server:
...
Query OK, 2871 rows affected (0.73 sec)
Records: 2871 Duplicates: 0 Warnings: 0
Query OK, 2870 rows affected (0.98 sec)
Records: 2870 Duplicates: 0 Warnings: 0
Query OK, 2865 rows affected (0.80 sec)
Records: 2865 Duplicates: 0 Warnings: 0
Query OK, 2871 rows affected (0.87 sec)
Records: 2871 Duplicates: 0 Warnings: 0
Query OK, 2864 rows affected (2.60 sec)
Records: 2864 Duplicates: 0 Warnings: 0
Query OK, 2866 rows affected (7.53 sec)
Records: 2866 Duplicates: 0 Warnings: 0
Query OK, 2879 rows affected (8.70 sec)
Records: 2879 Duplicates: 0 Warnings: 0
Query OK, 2864 rows affected (7.53 sec)
Records: 2864 Duplicates: 0 Warnings: 0
Query OK, 2873 rows affected (10.06 sec)
Records: 2873 Duplicates: 0 Warnings: 0
...
The spikes eventually average 16-18 seconds per ~2800 rows affected. Granted, I don't usually use SOURCE for a large import, but for the sake of showing legitimate output, I used it to pinpoint when the spikes happen. Using the mysql command or mysqlimport yields the same results. Even piping the results directly into the new database instead of going through an SQL file produces these spikes.
As far as I can tell, this happens after a certain number of records have been inserted into a table. The first time I boot up a server and import a chunk of that size, it goes through just fine; give or take, that is the amount it handles until these spikes occur. I can't correlate it precisely because I haven't replicated the issue consistently enough to conclude anything with evidence. There are ~20 tables with under 500,000 records each that all imported just fine when they were imported through a single command. This seems to only happen to tables that have an excessive amount of data.

Granted, the solutions I've come across so far seem to only address the natural slowdown that occurs as an import runs over time (the expected outcome in my case was that by the end of importing 500k records it would take 2-3 seconds per ~2800 rows, whereas those questions were about it not taking that long at the end). The data comes from a single SugarCRM table called 'campaign_log', which has ~9 million records. I was able to import it in chunks of 500k back onto the old server I'm migrating off of without these spikes occurring, so I assume this has to do with my new server configuration.

Another thing: whenever these spikes occur, the table being imported into reports its record count oddly. I know InnoDB gives count estimates, but the number shown is not prefixed with the ~ that indicates an estimate; it is usually accurate, and refreshing the table doesn't change the displayed amount (this is based on what PHPMyAdmin reports).
Here are the InnoDB system variables, system specs, and export command for the new server:
INNODB System Vars:
+---------------------------------+------------------------+
| Variable_name | Value |
+---------------------------------+------------------------+
| have_innodb | YES |
| ignore_builtin_innodb | OFF |
| innodb_adaptive_flushing | ON |
| innodb_adaptive_hash_index | ON |
| innodb_additional_mem_pool_size | 8388608 |
| innodb_autoextend_increment | 8 |
| innodb_autoinc_lock_mode | 1 |
| innodb_buffer_pool_instances | 1 |
| innodb_buffer_pool_size | 8589934592 |
| innodb_change_buffering | all |
| innodb_checksums | ON |
| innodb_commit_concurrency | 0 |
| innodb_concurrency_tickets | 500 |
| innodb_data_file_path | ibdata1:10M:autoextend |
| innodb_data_home_dir | |
| innodb_doublewrite | ON |
| innodb_fast_shutdown | 1 |
| innodb_file_format | Antelope |
| innodb_file_format_check | ON |
| innodb_file_format_max | Antelope |
| innodb_file_per_table | OFF |
| innodb_flush_log_at_trx_commit | 1 |
| innodb_flush_method | fsync |
| innodb_force_load_corrupted | OFF |
| innodb_force_recovery | 0 |
| innodb_io_capacity | 200 |
| innodb_large_prefix | OFF |
| innodb_lock_wait_timeout | 50 |
| innodb_locks_unsafe_for_binlog | OFF |
| innodb_log_buffer_size | 8388608 |
| innodb_log_file_size | 5242880 |
| innodb_log_files_in_group | 2 |
| innodb_log_group_home_dir | ./ |
| innodb_max_dirty_pages_pct | 75 |
| innodb_max_purge_lag | 0 |
| innodb_mirrored_log_groups | 1 |
| innodb_old_blocks_pct | 37 |
| innodb_old_blocks_time | 0 |
| innodb_open_files | 300 |
| innodb_print_all_deadlocks | OFF |
| innodb_purge_batch_size | 20 |
| innodb_purge_threads | 1 |
| innodb_random_read_ahead | OFF |
| innodb_read_ahead_threshold | 56 |
| innodb_read_io_threads | 8 |
| innodb_replication_delay | 0 |
| innodb_rollback_on_timeout | OFF |
| innodb_rollback_segments | 128 |
| innodb_spin_wait_delay | 6 |
| innodb_stats_method | nulls_equal |
| innodb_stats_on_metadata | ON |
| innodb_stats_sample_pages | 8 |
| innodb_strict_mode | OFF |
| innodb_support_xa | ON |
| innodb_sync_spin_loops | 30 |
| innodb_table_locks | ON |
| innodb_thread_concurrency | 0 |
| innodb_thread_sleep_delay | 10000 |
| innodb_use_native_aio | ON |
| innodb_use_sys_malloc | ON |
| innodb_version | 5.5.39 |
| innodb_write_io_threads | 8 |
+---------------------------------+------------------------+
System Specs:
Intel Xeon E5-2680 v2 (Ivy Bridge) 8 Processors
15GB RAM
2x80 SSDs
CMD to Export:
mysqldump -u <olduser> -p<oldpw> <olddb> <table> --verbose --disable-keys --opt | ssh -i <privatekey> <newserver> "cat > <nameoffile>"
Thank you for any assistance. Let me know if there's any other information I can provide.
I figured it out. I increased innodb_log_file_size from 5MB to 1024MB. Not only did this significantly increase the rate at which records were imported (it never went above 1 second per ~3000 rows), it also fixed the spikes. There were only 2 spikes in all the records I imported, and after each one, import times immediately went back to under 1 second.
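For reference, this is the kind of change involved (a sketch for MySQL 5.5; the file names below are the defaults and may differ on your install):
-- check the current redo log size, in bytes:
SHOW VARIABLES LIKE 'innodb_log_file_size';
-- then set innodb_log_file_size = 1024M under [mysqld] in my.cnf.
-- On 5.5 you must shut mysqld down cleanly and move aside the old
-- ib_logfile0/ib_logfile1 in the data directory before restarting,
-- otherwise InnoDB will refuse to start due to a log file size mismatch.
A redo log this small (the default here was 5MB) forces InnoDB into frequent checkpoint stalls during bulk inserts, which is consistent with the periodic spikes shown above.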
