Cassandra read query taking more time in GCP India than in GCP US

We are migrating from the GCP US region to the GCP India region. We have created a 3-node cluster in India, and there is an existing 3-node cluster in GCP US. While moving some read queries to GCP India, we are seeing high read latency (30 milliseconds) for the same query that runs in GCP US in under 15 milliseconds.
GCP US (API) --> read from --> GCP Cassandra (US) --> takes 10-15 milliseconds.
But the same flow in the GCP India region:
GCP India (API) --> read from --> GCP Cassandra (India) --> takes 30 milliseconds.
Here is the trace of the read query with LOCAL_QUORUM:
---------------------------------------------------------------------------------------------------+----------------------------+
Execute CQL3 query | 2021-07-13 11:39:26.806000 |
to_serv,eto_glusr_pref_loc FROM glusr_usr WHERE glusr_usr_id =12530; [Native-Transport-Requests-3] | 2021-07-13 11:39:26.807000 |
Preparing statement [Native-Transport-Requests-3] | 2021-07-13 11:39:26.808000 |
reading digest from /XX.XX.XX.XX [Native-Transport-Requests-3] | 2021-07-13 11:39:26.809000 |
Row cache miss [ReadStage-2] | 2021-07-13 11:39:26.809000 |
Row cache miss [ReadStage-2] | 2021-07-13 11:39:26.811000 | XX
Partition index with 0 entries found for sstable 139241 [ReadStage-2] | 2021-07-13 11:39:26.811000 |
Row cache miss [ReadStage-2] | 2021-07-13 11:39:26.811000 |
Executing single-partition query on glusr_usr [ReadStage-2] | 2021-07-13 11:39:26.811000 |
Executing single-partition query on glusr_usr [ReadStage-2] | 2021-07-13 11:39:26.811000 |
Acquiring sstable references [ReadStage-2] | 2021-07-13 11:39:26.811000 |
Acquiring sstable references [ReadStage-2] | 2021-07-13 11:39:26.811000 |
Skipped 0/13 non-slice-intersecting sstables, included 0 due to tombstones [ReadStage-2] | 2021-07-13 11:39:26.811000 |
Skipped 0/13 non-slice-intersecting sstables, included 0 due to tombstones [ReadStage-2] | 2021-07-13 11:39:26.811000 |
Partition index with 0 entries found for sstable 138332 [ReadStage-2] | 2021-07-13 11:39:26.811000 |
Partition index with 0 entries found for sstable 138620 [ReadStage-2] | 2021-07-13 11:39:26.813000 |
Partition index with 0 entries found for sstable 138331 [ReadStage-2] | 2021-07-13 11:39:26.815000 |
Partition index with 0 entries found for sstable 139240 [ReadStage-2] | 2021-07-13 11:39:26.815000 |
Partition index with 0 entries found for sstable 138783 [ReadStage-2] | 2021-07-13 11:39:26.817000 |
Partition index with 0 entries found for sstable 139719 [ReadStage-2] | 2021-07-13 11:39:26.817000 |
Partition index with 0 entries found for sstable 138621 [ReadStage-2] | 2021-07-13 11:39:26.817000 |
Partition index with 0 entries found for sstable 139855 [ReadStage-2] | 2021-07-13 11:39:26.819000 |
Partition index with 0 entries found for sstable 138950 [ReadStage-2] | 2021-07-13 11:39:26.820000 |
Partition index with 0 entries found for sstable 139096 [ReadStage-2] | 2021-07-13 11:39:26.820000 |
Bloom filter allows skipping sstable 139878 [ReadStage-2] | 2021-07-13 11:39:26.821000 |
Bloom filter allows skipping sstable 140774 [ReadStage-2] | 2021-07-13 11:39:26.821000 |
Bloom filter allows skipping sstable 139883 [ReadStage-2] | 2021-07-13 11:39:26.821000 |
Bloom filter allows skipping sstable 140789 [ReadStage-2] | 2021-07-13 11:39:26.821000 |
Bloom filter allows skipping sstable 139888 [ReadStage-2] | 2021-07-13 11:39:26.821000 |
Bloom filter allows skipping sstable 140794 [ReadStage-2] | 2021-07-13 11:39:26.821000 |
Bloom filter allows skipping sstable 139893 [ReadStage-2] | 2021-07-13 11:39:26.821000 |
Partition index with 0 entries found for sstable 140831 [ReadStage-2] | 2021-07-13 11:39:26.821000 |
Bloom filter allows skipping sstable 141000 [ReadStage-2] | 2021-07-13 11:39:26.821000 |
Bloom filter allows skipping sstable 141037 [ReadStage-2] | 2021-07-13 11:39:26.821000 |
Caching 1 rows [ReadStage-2] | 2021-07-13 11:39:26.821001 |
Partition index with 0 entries found for sstable 139987 [ReadStage-2] | 2021-07-13 11:39:26.822000 |
Merged data from memtables and 5 sstables [ReadStage-2] | 2021-07-13 11:39:26.822000 |
Bloom filter allows skipping sstable 140052 [ReadStage-2] | 2021-07-13 11:39:26.822000 |
Bloom filter allows skipping sstable 140097 [ReadStage-2] | 2021-07-13 11:39:26.822000 |
Bloom filter allows skipping sstable 140134 [ReadStage-2] | 2021-07-13 11:39:26.822001 |
Bloom filter allows skipping sstable 140135 [ReadStage-2] | 2021-07-13 11:39:26.822001 |
Caching 1 rows [ReadStage-2] | 2021-07-13 11:39:26.822001 |
Merged data from memtables and 5 sstables [ReadStage-2] | 2021-07-13 11:39:26.823000 |
Read 1 live rows and 0 tombstone cells [ReadStage-2] | 2021-07-13 11:39:26.823000 |
Partition index with 0 entries found for sstable 140137 [ReadStage-2] | 2021-07-13 11:39:26.823000 |
Read 1 live rows and 0 tombstone cells [ReadStage-2] | 2021-07-13 11:39:26.823000 |
Enqueuing response to /XX.XX.XX.XX [ReadStage-2] | 2021-07-13 11:39:26.823000 |
Sending REQUEST_RESPONSE message to /XX.XX.XX.XX [MessagingService-Outgoing-/XX.XX.XX.XX-Large] | 2021-07-13 11:39:26.824000 | X
REQUEST_RESPONSE message received from /XX.XX.XX.XX [MessagingService-Incoming-/XX.XX.XX.XX] | 2021-07-13 11:39:26.825000 | X
Bloom filter allows skipping sstable 140140 [ReadStage-2] | 2021-07-13 11:39:26.825000 |
Processing response from /XX.XX.XX.XX [RequestResponseStage-2] | 2021-07-13 11:39:26.825000 |
My nodetool info output:
Key Cache : entries 52837, size 5.21 MiB, capacity 500 MiB, 14109 hits, 49400 requests, 0.286 recent hit rate, 14400 save period in seconds
Row Cache : entries 232, size 20.91 MiB, capacity 4.09 GiB, 78 hits, 540 requests, 0.144 recent hit rate, 0 save period in seconds
So my question is: why is this happening, and how can I reduce this 30 ms latency?
Please let me know, thank you.

At first glance I would say that you have too many SSTables; you're spending almost 12 ms looking for data in them. I would suggest checking how many SSTables are read per query in both DCs using nodetool tablehistograms.
But according to the trace, it took only 19 ms to return the answer. If you are seeing a 30 ms response time, then you may need to capture execution metrics on the driver side, and make sure that you have a fast link from the driver to the Cassandra nodes.
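If it helps, a quick way to compare the coordinator-side time with what the client observes is to re-run the query with tracing from cqlsh against a node in each DC (a sketch only; the table and key are taken from your trace, the keyspace is whatever it lives in):

CONSISTENCY LOCAL_QUORUM;
TRACING ON;
SELECT * FROM glusr_usr WHERE glusr_usr_id = 12530;
-- the trace total is time spent on the coordinator and replicas; the gap between
-- that and the latency your application's driver reports is network/driver overhead

If the cqlsh trace in India also shows ~19 ms but the application sees ~30 ms, the extra time is being spent between your API and the cluster, not inside Cassandra.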

Related

ad-hoc slowly-changing dimensions materialization from external table of timestamped csvs in a data lake

Question
main question
How can I ephemerally materialize slowly changing dimension type 2 from a folder of daily extracts, where each csv is one full extract of a table from a source system?
rationale
We're designing ephemeral data warehouses as data marts for end users that can be spun up and burned down without consequence. This requires we have all data in a lake/blob/bucket.
We're ripping daily full extracts because:
we couldn't reliably extract just the changeset (for reasons out of our control), and
we'd like to maintain a data lake with the "rawest" possible data.
challenge question
Is there a solution that could give me the state as of a specific date and not just the "newest" state?
existential question
Am I thinking about this completely backwards and there's a much easier way to do this?
Possible Approaches
custom dbt materialization
There's an insert_by_period dbt materialization in the dbt.utils package that I think might be exactly what I'm looking for? But I'm confused, as what I want is essentially dbt snapshot, but:
run dbt snapshot for each file incrementally, all at once; and,
built directly off of an external table?
Delta Lake
I don't know much about Databricks's Delta Lake, but it seems like it should be possible with Delta Tables?
Fix the extraction job
Is our problem solved if we can make our extracts contain only what has changed since the previous extract?
Example
Suppose the following three files are in a folder of a data lake. (Gist with the 3 csvs and desired table outcome as csv).
I added the Extracted column in case parsing the timestamp from the filename is too tricky.
2020-09-14_CRM_extract.csv
| OppId | CustId | Stage | Won | LastModified | Extracted |
|-------|--------|-------------|-----|--------------|-----------|
| 1 | A | 2 - Qualify | | 9/1 | 9/14 |
| 2 | B | 3 - Propose | | 9/12 | 9/14 |
2020-09-15_CRM_extract.csv
| OppId | CustId | Stage | Won | LastModified | Extracted |
|-------|--------|-------------|-----|--------------|-----------|
| 1 | A | 2 - Qualify | | 9/1 | 9/15 |
| 2 | B | 4 - Closed | Y | 9/14 | 9/15 |
| 3 | C | 1 - Lead | | 9/14 | 9/15 |
2020-09-16_CRM_extract.csv
| OppId | CustId | Stage | Won | LastModified | Extracted |
|-------|--------|-------------|-----|--------------|-----------|
| 1 | A | 2 - Qualify | | 9/1 | 9/16 |
| 2 | B | 4 - Closed | Y | 9/14 | 9/16 |
| 3 | C | 2 - Qualify | | 9/15 | 9/16 |
End Result
Below is SCD-II for the three files as of 9/16. SCD-II as of 9/15 would be the same, except OppId=3 would have only one row, with valid_from=9/15 and valid_to=null.
| OppId | CustId | Stage | Won | LastModified | valid_from | valid_to |
|-------|--------|-------------|-----|--------------|------------|----------|
| 1 | A | 2 - Qualify | | 9/1 | 9/14 | null |
| 2 | B | 3 - Propose | | 9/12 | 9/14 | 9/15 |
| 2 | B | 4 - Closed | Y | 9/14 | 9/15 | null |
| 3 | C | 1 - Lead | | 9/14 | 9/15 | 9/16 |
| 3 | C | 2 - Qualify | | 9/15 | 9/16 | null |
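For concreteness, here is a rough SQL sketch of the derivation I have in mind (names are hypothetical; it assumes the daily CSVs are stacked into a single relation, e.g. an external table crm_extracts over the folder):

-- keep only rows whose tracked attributes differ from the previous extract,
-- then derive valid_from / valid_to from the extract dates
WITH changes AS (
    SELECT
        OppId, CustId, Stage, Won, LastModified, Extracted,
        CASE WHEN LAG(Stage)        OVER (PARTITION BY OppId ORDER BY Extracted) IS DISTINCT FROM Stage
               OR LAG(Won)          OVER (PARTITION BY OppId ORDER BY Extracted) IS DISTINCT FROM Won
               OR LAG(LastModified) OVER (PARTITION BY OppId ORDER BY Extracted) IS DISTINCT FROM LastModified
             THEN 1 ELSE 0 END AS is_change
    FROM crm_extracts
    -- add WHERE Extracted <= '2020-09-15' here to build the SCD-II "as of" an earlier date
)
SELECT
    OppId, CustId, Stage, Won, LastModified,
    Extracted                                                    AS valid_from,
    LEAD(Extracted) OVER (PARTITION BY OppId ORDER BY Extracted) AS valid_to
FROM changes
WHERE is_change = 1;

Whether this is best expressed as a dbt model over an external table, a snapshot, or something in Delta Lake is exactly what I'm unsure about.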
Interesting concept, and of course it would be a longer conversation than is possible in this forum to fully understand your business, stakeholders, data, etc. I can see that it might work if you had a relatively small volume of data, your source systems rarely changed, your reporting requirements (and hence, data marts) also rarely changed, and you only needed to spin up these data marts very infrequently.
My concerns would be:
If your source or target requirements change how are you going to handle this? You will need to spin up your datamart, do full regression testing on it, apply your changes and then test them. If you do this as/when the changes are known then it's a lot of effort for a Datamart that's not being used - especially if you need to do this multiple times between uses; if you do this when the datamart is needed then you're not meeting your objective of having the datamart available for "instant" use.
I'm not sure your statement "we have a DW as code that can be deleted, updated, and recreated without the complexity that goes along with traditional DW change management" is true. How are you going to test updates to your code without spinning up the datamart(s) and going through a standard test cycle with data? And then how is this different from traditional DW change management?
What happens if there is corrupt/unexpected data in your source systems? In a "normal" DW where you are loading data daily this would normally be noticed and fixed on the day. In your solution the dodgy data might have occurred days/weeks ago and, assuming it loaded into your datamart rather than erroring on load, you would need processes in place to spot it and then potentially have to unravel days of SCD records to fix the problem
(Only relevant if you have a significant volume of data) Given the low cost of storage, I'm not sure I see the benefit of spinning up a datamart when needed as opposed to just holding the data so it's ready for use. Loading large volumes of data every time you spin up a datamart is going to be time-consuming and expensive. A possible hybrid approach might be to only run incremental loads when the datamart is needed rather than running them every day, so you have the data from when the datamart was last used ready to go at all times and you just add the records created/updated since the last load.
I don't know whether this is the best or not, but I've seen it done. When you build your initial SCD-II table, add a column that is a stored HASH() value of all of the values of the record (you can exclude the primary key). Then, you can create an External Table over your incoming full data set each day, which includes the same HASH() function. Now, you can execute a MERGE or INSERT/UPDATE against your SCD-II based on primary key and whether the HASH value has changed.
Your main advantage doing things this way is you avoid loading all of the data into Snowflake each day to do the comparison, but it will be slower to execute this way. You could also load to a temp table with the HASH() function included in your COPY INTO statement and then update your SCD-II and then drop the temp table, which could actually be faster.
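A rough sketch of that compare, assuming Snowflake, a hypothetical SCD-II table dim_opportunity with a stored row_hash column, and a hypothetical external table ext_crm_today over the latest full extract:

-- 1) close the current version of any key whose attribute hash changed in today's extract
UPDATE dim_opportunity
SET    valid_to = CURRENT_DATE
FROM   ext_crm_today s
WHERE  dim_opportunity.OppId = s.OppId
AND    dim_opportunity.valid_to IS NULL
AND    dim_opportunity.row_hash <> HASH(s.CustId, s.Stage, s.Won, s.LastModified);

-- 2) open a new version for changed keys (just closed above) and for brand-new keys
INSERT INTO dim_opportunity (OppId, CustId, Stage, Won, LastModified, row_hash, valid_from, valid_to)
SELECT s.OppId, s.CustId, s.Stage, s.Won, s.LastModified,
       HASH(s.CustId, s.Stage, s.Won, s.LastModified),
       CURRENT_DATE, NULL
FROM   ext_crm_today s
LEFT JOIN dim_opportunity d
       ON d.OppId = s.OppId
      AND d.valid_to IS NULL
WHERE  d.OppId IS NULL;

The same logic can be folded into a single MERGE; the two-statement form just keeps the "close the old version, open a new version" steps explicit.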

Difference between entities and relations in databases

I want to know: is an entity just a table in the database modelling context? And what is the difference between an entity and a relation? I know that the relation is the base concept of relational databases and has a tabular presentation. Are the entity and the relation the same thing? Note: do not confuse the relation with the relationship.
An entity is something like an object; you can put a name on an entity.
A relation is a "link" between entities; you can put a verb on a relation, in the MCD (conceptual data model) of your database.
For example:
In your MCD, DOG and RACE are entities and "part of" is the relation:
+-------+ +-----+
|DOG | |RACE |
|-------| |-----|
|name |---(part of)----|name |
|age | +-----+
+-------+
In your MPD (physical data model), this MCD becomes:
+--------+ +---------+
|tDog | |tRace |
|--------| |---------|
|id_tDog | +--|id_tRace |
|nameDog | +--(tRace_to_tDog)--| |nameRace |
|ageDog | | +---------+
|FK_tRace|<-+
+--------+
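In SQL terms, the MPD above maps to something like this (a sketch; the column types are assumptions):

CREATE TABLE tRace (
    id_tRace INT PRIMARY KEY,
    nameRace VARCHAR(100)
);

CREATE TABLE tDog (
    id_tDog  INT PRIMARY KEY,
    nameDog  VARCHAR(100),
    ageDog   INT,
    FK_tRace INT REFERENCES tRace (id_tRace)  -- the "part of" relation from the MCD
);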

Google Datastore limitation

I wish to use Datastore, but I read that an entity's size is limited to 1 MB.
I have an entity "Users", which contains around 50k "User" entries. I wonder if the entity size restriction is too restrictive for my case, and whether I will be blocked if one day I have more users.
This is how I imagine my database, maybe I misunderstood how it's supposed to work:
+--------- Datastore -------------+
| |
| +---------- Users ------------+ |
| | | |
| | +---------- User ---------+ | |
| | | Name: Alpha | | |
| | +-------------------------+ | |
| | | |
| | +---------- User ---------+ | |
| | | Name: Beta | | |
| | +-------------------------+ | |
| +-----------------------------+ |
+---------------------------------+
Where "Users" is an entity which contains entities "User".
Thank you.
Your "KIND" is user, your "entities" are EACH user. So no matter how MANY users you have, as long as EACH user is under a meg, you're fine.
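In GQL (Datastore's SQL-like query language) terms, a sketch of such a query, using the property name from your example, would be:

SELECT * FROM User WHERE Name = 'Alpha' LIMIT 1

Here User is the kind and every result is one entity, so the 1 MB limit applies to each returned entity individually, not to the kind as a whole.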
The only limit to the size of the full "kind" is what you're willing to pay in storage. Reading up on this doc, or watching this introduction video, could give some high-level guidance for your situation.
To better understand keys and indexes (another VERY important concept of datastore), I would suggest this video that explains VERY well how composite indexes work and behave :)

Cassandra reads seem too slow

I was profiling an application that uses Cassandra, and it turned out that reads were the bottleneck. On closer inspection they seem to take way too long, and I would really appreciate some help in understanding why.
The application always reads the whole set of rows for a given partition key (the query is of the form SELECT * FROM table WHERE partition_key = ?). Unsurprisingly, the read time is O(number of rows for the partition key); however, the constant seems way too high. After examining the query trace, it turns out that the majority of the time is spent on "merging data from memtables and sstables".
This step takes over 200 ms for a partition key with ~5000 rows, where a row consists of 9 columns and is less than 100 bytes. Given the read throughput of an SSD, reading 0.5 MB sequentially should be nearly instantaneous.
Actually, I doubt this has to do with I/O at all. The machine used to have a spinning disk, which was replaced with the SSD it has now; the change had no impact on query performance. I think there is something very involved in Cassandra's processing, or in how it reads the data off disk, that makes this operation very expensive.
Merging from more than one SSTable or iterating over tombstoned cells does not explain this. First of all, it should take milliseconds; second, it happens consistently, regardless of whether 2 or 4 SSTables are involved and whether or not there are tombstoned cells.
To give some background:
Hardware: The machine running Cassandra is bare metal with 8 cores and SSD-backed storage. I query it from cqlsh on the machine itself; the data is stored locally. There is no other load on it, and looking at iostat there is barely any I/O.
Data model: The partition key, PK, is of text type; the primary key is a composite of the partition key and a bigint column K; the remaining 7 columns are mutable. The schema creation command is listed below.
CREATE TABLE inboxes (
PK text,
K bigint,
A boolean,
B boolean,
C boolean,
D boolean,
E bigint,
F bigint,
G int,
PRIMARY KEY (PK, K)
) WITH CLUSTERING ORDER BY (K DESC);
This is an example trace with 3 SSTables involved and a fairly large number of tombstones.
activity | timestamp | source | source_elapsed
-------------------------------------------------------------------------------------------+--------------+-------------+----------------
execute_cql3_query | 03:14:07,507 | 10.161.4.77 | 0
Parsing select * from table where PK = 'key_value' LIMIT 10000;| 03:14:07,508 | 10.161.4.77 | 123
Preparing statement | 03:14:07,508 | 10.161.4.77 | 244
Executing single-partition query on table | 03:14:07,509 | 10.161.4.77 | 1155
Acquiring sstable references | 03:14:07,509 | 10.161.4.77 | 1173
Merging memtable tombstones | 03:14:07,509 | 10.161.4.77 | 1195
Key cache hit for sstable 2906 | 03:14:07,509 | 10.161.4.77 | 1231
Seeking to partition beginning in data file | 03:14:07,509 | 10.161.4.77 | 1240
Key cache hit for sstable 1533 | 03:14:07,509 | 10.161.4.77 | 1550
Seeking to partition beginning in data file | 03:14:07,509 | 10.161.4.77 | 1561
Key cache hit for sstable 1316 | 03:14:07,509 | 10.161.4.77 | 1867
Seeking to partition beginning in data file | 03:14:07,509 | 10.161.4.77 | 1878
Merging data from memtables and 3 sstables | 03:14:07,510 | 10.161.4.77 | 2180
Read 5141 live and 1944 tombstoned cells | 03:14:07,646 | 10.161.4.77 | 138734
Request complete | 03:14:07,742 | 10.161.4.77 | 235030
You're not just "reading sequentially 0.5MB", you're asking Cassandra to turn it into rows, filter out tombstones (deleted rows), and turn it into a resultset. 0.04ms per row is pretty reasonable; my rule of thumb is 0.5ms per 10 rows for an entire query.
Remember that Cassandra optimizes for short requests suitable for online applications; 10 to 100 row resultsets are typical. There is no parallelization within a single query.
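If you really do need the whole partition back faster, one common workaround (a sketch; the split point is arbitrary) is to issue several narrower slices of the same partition concurrently from the client, since each slice is an independent query:

-- run these in parallel from the application and merge the results client-side
SELECT * FROM inboxes WHERE PK = 'key_value' AND K >= 5000;
SELECT * FROM inboxes WHERE PK = 'key_value' AND K <  5000;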

Spare parts Database (structure)

There is a database of spare parts for cars, and an online search by the name of a spare part. The user can type, for example, "safety cushion" or "airbag" into the search, and the search result should be the same.
Therefore, I need to somehow implement aliases for the names of spare parts, and the question is how to store them in the database. So far the only option that comes to mind is to create an additional table
| id | name of part | alias_id |
-------------------------------------------------- ---------------
| 1 | airbag | 10 |
| 2 | safety cushion | 10 |
And add an additional field "alias_id" to the table containing all the spare parts, and search by this field...
Are there other better options?
If I have understood correctly, it's best to have 3 tables in a many-to-many situation (if multiple parts can have multiple aliases); a SQL sketch follows the tables:
Table - Parts
| id | name of part |
-----------------------
| 1 | airbag |
| 2 | safety cushion |
Table - Aliases
| id | name of alias |
-----------------------
| 10 | AliasName |
Table - PartToAliases
| id | PartId | AliasId |
-------------------------
| 1 | 1 | 10 |
| 2 | 2 | 10 |
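A minimal SQL sketch of that layout (names and types are illustrative):

CREATE TABLE Parts         (id INT PRIMARY KEY, name VARCHAR(100));
CREATE TABLE Aliases       (id INT PRIMARY KEY, name VARCHAR(100));
CREATE TABLE PartToAliases (id      INT PRIMARY KEY,
                            PartId  INT REFERENCES Parts (id),
                            AliasId INT REFERENCES Aliases (id));

-- "airbag" and "safety cushion" share an alias, so searching either term
-- returns every part in the same alias group
SELECT DISTINCT p2.id, p2.name
FROM Parts p1
JOIN PartToAliases pa1 ON pa1.PartId  = p1.id
JOIN PartToAliases pa2 ON pa2.AliasId = pa1.AliasId
JOIN Parts p2          ON p2.id       = pa2.PartId
WHERE p1.name = 'airbag';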
Your solution looks fine for the exact problem you described.
BUT what if someone writes safetycushion? Or safety cuschion? With these kinds of variations your alias lookup table will soon become huge, and manually maintaining it will not be feasible.
At that point you'll need a completely different approach (think full text search engine).
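For example (a sketch assuming PostgreSQL with its pg_trgm extension; other engines have their own full-text or fuzzy-matching features), trigram similarity tolerates misspellings without any alias table at all:

CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- '%' is pg_trgm's similarity operator; it matches despite small typos
SELECT id, name
FROM Parts
WHERE name % 'safety cuschion'
ORDER BY similarity(name, 'safety cuschion') DESC;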
So if you are still sure you only need a couple of aliases, your approach seems fine.

Resources