How to get item count from DynamoDB? - database

I want to get the item count from a DynamoDB query.
I can query DynamoDB, but I only want to know the total number of matching items.
For example, the equivalent of 'SELECT COUNT(*) FROM ... WHERE ...' in MySQL:
$result = $aws->query(array(
'TableName' => 'game_table',
'IndexName' => 'week-point-index',
'KeyConditions' => array(
'week' => array(
'ComparisonOperator' => 'EQ',
'AttributeValueList' => array(
array(Type::STRING => $week)
)
),
'point' => array(
'ComparisonOperator' => 'GE',
'AttributeValueList' => array(
array(Type::NUMBER => $my_point)
)
)
),
));
echo Count($result['Items']);
This code fetches the data of every user whose point is greater than or equal to mine.
If the result contains 100,000 items, $result becomes far too big,
and it would exceed the query size limit.
I need help.

With the aws dynamodb cli you can get it via scan as follows:
aws dynamodb scan --table-name <TABLE_NAME> --select "COUNT"
The response will look similar to this:
{
"Count": 123,
"ScannedCount": 123,
"ConsumedCapacity": null
}
Note that this count is live, in contrast to the value returned by the describe-table API.

You can use the Select parameter and use COUNT in the request. It "returns the number of matching items, rather than the matching items themselves". Important, as brought up by Saumitra R. Bhave in a comment, "If the size of the Query result set is larger than 1 MB, then ScannedCount and Count will represent only a partial count of the total items. You will need to perform multiple Query operations in order to retrieve all of the results".
I'm not familiar with PHP, so here is how you could do it with Java. In PHP, instead of calling Count (which I am guessing is a PHP function) on 'Items', you would read the Count value from the response, $result['Count']. The Java version:
final String week = "whatever";
final Integer myPoint = 1337;
Condition weekCondition = new Condition()
.withComparisonOperator(ComparisonOperator.EQ)
.withAttributeValueList(new AttributeValue().withS(week));
Condition myPointCondition = new Condition()
.withComparisonOperator(ComparisonOperator.GE)
.withAttributeValueList(new AttributeValue().withN(myPoint.toString()));
Map<String, Condition> keyConditions = new HashMap<>();
keyConditions.put("week", weekCondition);
keyConditions.put("point", myPointCondition);
QueryRequest request = new QueryRequest("game_table");
request.setIndexName("week-point-index");
request.setSelect(Select.COUNT);
request.setKeyConditions(keyConditions);
QueryResult result = dynamoDBClient.query(request);
Integer count = result.getCount();
If you don't need to emulate the WHERE clause, you can use a DescribeTable request and use the resulting item count to get an estimate.
"The number of items in the specified table. DynamoDB updates this value approximately every six hours. Recent changes might not be reflected in this value."
Also, an important note from the documentation as noted by Saumitra R. Bhave in the comments on this answer:
If the size of the Query result set is larger than 1 MB, ScannedCount and Count represent only a partial count of the total items. You need to perform multiple Query operations to retrieve all the results (see Paginating Table Query Results).
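The pagination caveat quoted above can be sketched in Python. This is a minimal sketch, assuming a boto3-style low-level client whose query method accepts Select and ExclusiveStartKey and returns Count and LastEvaluatedKey; the helper name count_query_items is hypothetical and not part of any SDK.

```python
def count_query_items(client, **query_kwargs):
    """Sum Count over every page of a Query issued with Select=COUNT.

    `client` is assumed to behave like boto3.client("dynamodb").
    """
    kwargs = dict(query_kwargs, Select="COUNT")
    total = 0
    while True:
        page = client.query(**kwargs)
        total += page["Count"]
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            # No LastEvaluatedKey means this was the final page.
            return total
        # Resume the next request where this page stopped.
        kwargs["ExclusiveStartKey"] = last_key


# Hypothetical usage with boto3 (not executed here):
#   client = boto3.client("dynamodb")
#   count_query_items(
#       client,
#       TableName="game_table",
#       IndexName="week-point-index",
#       KeyConditionExpression="#w = :w AND #p >= :p",
#       ExpressionAttributeNames={"#w": "week", "#p": "point"},
#       ExpressionAttributeValues={":w": {"S": week}, ":p": {"N": str(my_point)}},
#   )
```

Each request still consumes read capacity for the items it evaluates; only the response payload is reduced to a count.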

Can be seen from UI as well.
Go to overview tab on table, you will see item count. Hope it helps someone.

I'm late here, but I'd like to extend Daniel's answer about using the AWS CLI to include a filter expression.
Running
aws dynamodb scan \
--table-name <tableName> \
--filter-expression "#v = :num" \
--expression-attribute-names '{"#v": "fieldName"}' \
--expression-attribute-values '{":num": {"N": "123"}}' \
--select "COUNT"
would give
{
"Count": 2945,
"ScannedCount": 7874,
"ConsumedCapacity": null
}
That is, ScannedCount is the number of items evaluated before the filter was applied, and Count is the number of items that matched the given expression (fieldName = 123).

Replace the table name and use the below query to get the data on your local environment:
aws dynamodb scan --table-name <TABLE_NAME> --select "COUNT" --endpoint-url http://localhost:8000
Replace the table name and remove the endpoint url to get the data on production environment
aws dynamodb scan --table-name <TABLE_NAME> --select "COUNT"

If you happen to reach here, and you are working with C#, here is the code:
var cancellationToken = new CancellationToken();
var request = new ScanRequest("TableName") {Select = Select.COUNT};
var result = context.Client.ScanAsync(request, cancellationToken).Result;
var totalCount = result.Count;

If anyone is looking for a straightforward NodeJS Lambda count solution:
const data = await dynamo.scan({ Select: "COUNT", TableName: "table" }).promise();
// data.Count -> number of elements in table.

I'm posting this answer for anyone using C# that wants a fully functional, well-tested answer that demonstrates using query instead of scan. In particular, this answer handles more than 1MB size of items to count.
public async Task<int> GetAvailableCount(string pool_type, string pool_key)
{
var queryRequest = new QueryRequest
{
TableName = PoolsDb.TableName,
ConsistentRead = true,
Select = Select.COUNT,
KeyConditionExpression = "pool_type_plus_pool_key = :type_plus_key",
ExpressionAttributeValues = new Dictionary<string, AttributeValue> {
{":type_plus_key", new AttributeValue { S = pool_type + pool_key }}
},
};
var t0 = DateTime.UtcNow;
var result = await Client.QueryAsync(queryRequest);
var count = result.Count;
var iter = 0;
while ( result.LastEvaluatedKey != null && result.LastEvaluatedKey.Values.Count > 0)
{
iter++;
var lastkey = result.LastEvaluatedKey.Values.ToList()[0].S;
_logger.LogDebug($"GetAvailableCount {pool_type}-{pool_key} iteration {iter} instance key {lastkey}");
queryRequest.ExclusiveStartKey = result.LastEvaluatedKey;
result = await Client.QueryAsync(queryRequest);
count += result.Count;
}
_logger.LogDebug($"GetAvailableCount {pool_type}-{pool_key} returned {count} after {iter} iterations in {(DateTime.UtcNow - t0).TotalMilliseconds} ms.");
return count;
}

DynamoDB now has a 'Get Live Item Count' button in the UI. Please note the production caveat if you have a large table that will consume read capacity.

In Scala:
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder
import com.amazonaws.services.dynamodbv2.document.DynamoDB
val client = AmazonDynamoDBClientBuilder.standard().build()
val dynamoDB = new DynamoDB(client)
val itemCount = dynamoDB.getTable("table name").describe().getItemCount()

As in the Java answer, in PHP you only need to set the Select parameter to 'COUNT':
$result = $aws->query(array(
'TableName' => 'game_table',
'IndexName' => 'week-point-index',
'KeyConditions' => array(
'week' => array(
'ComparisonOperator' => 'EQ',
'AttributeValueList' => array(
array(Type::STRING => $week)
)
),
'point' => array(
'ComparisonOperator' => 'GE',
'AttributeValueList' => array(
array(Type::NUMBER => $my_point)
)
)
),
'Select' => 'COUNT'
));
and access it like this:
echo $result['Count'];
But, as Saumitra mentioned above, be careful with result sets larger than 1 MB: in that case, keep querying with LastEvaluatedKey until it comes back null, adding up Count from each response to get the total.

Adding some additional context to this question. In some circumstances it makes sense to Scan the table to obtain the live item count. However, if this is a frequent occurrence or if you have large tables then it can be expensive from both a cost and performance point of view. Below, I highlight 3 ways to gain the item count for your tables.
1. Scan
Using a Scan requires you to read every item in the table. This works well for one-off queries, but it is not scalable and can become quite expensive. Using Select: COUNT prevents the items from being returned, but you must still pay for reading the entire table.
Pros
Gets you the most recent item count ("live")
Is a simple API call
Can be run in parallel to reduce time
Cons
Reads the entire dataset
Slow performance
High cost
CLI example
aws dynamodb scan \
--table-name test \
--select COUNT
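The "can be run in parallel" point above can be sketched in Python using the Segment and TotalSegments parameters of Scan. This is a sketch under assumptions: the client is taken to be a boto3-style low-level DynamoDB client that is safe to share across threads, and the function names and the default of 4 segments are hypothetical choices for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def count_segment(client, table_name, segment, total_segments):
    """Count the items in one segment of a parallel Scan, following pagination."""
    kwargs = {
        "TableName": table_name,
        "Select": "COUNT",
        "Segment": segment,
        "TotalSegments": total_segments,
    }
    total = 0
    while True:
        page = client.scan(**kwargs)
        total += page["Count"]
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return total
        kwargs["ExclusiveStartKey"] = last_key


def parallel_scan_count(client, table_name, total_segments=4):
    """Scan every segment in its own worker thread and add up the counts."""
    with ThreadPoolExecutor(max_workers=total_segments) as pool:
        futures = [
            pool.submit(count_segment, client, table_name, seg, total_segments)
            for seg in range(total_segments)
        ]
        return sum(f.result() for f in futures)
```

Splitting the scan shortens the wall-clock time only; the whole table is still read, so the read cost is unchanged.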
2. DescribeTable
DynamoDB DescribeTable API provides you with an estimated value for ItemCount which is updated approx. every 6 hours.
The number of items in the specified table. DynamoDB updates this value approximately every six hours. Recent changes might not be reflected in this value. Ref.
Calling this API gives you an instant response, however, the value of the ItemCount could be up to 6 hours stale. In certain situations this value may be adequate.
Pros
Instant response
No cost to retrieve ItemCount
Can be called frequently
Cons
Data could be stale by up to 6 hours.
CLI Example
aws dynamodb describe-table \
--table-name test \
--query Table.ItemCount
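The same lookup as the CLI call above, sketched in Python. It assumes a boto3-style client exposing describe_table; the helper name estimated_item_count is hypothetical.

```python
def estimated_item_count(client, table_name):
    """Return the (possibly stale) ItemCount reported by DescribeTable.

    `client` is assumed to behave like boto3.client("dynamodb").
    """
    response = client.describe_table(TableName=table_name)
    return response["Table"]["ItemCount"]


# Hypothetical usage with boto3 (not executed here):
#   estimated_item_count(boto3.client("dynamodb"), "test")
```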
3. DescribeTable and CloudWatch
As previously mentioned, DescribeTable updates your table's ItemCount approx. every 6 hours. We can obtain that value and plot it on a custom CloudWatch graph, which lets you monitor your table's ItemCount over time and gives you historical data.
Pros
Provides historical data
Infer how your ItemCount changes over time
Reasonably easy to implement
Cons
Data could be stale by up to 6 hours.
Implementation
Tracking DynamoDB Storage History with CloudWatch showcases how to automatically push the value of DescribeTable to CloudWatch periodically using EventBridge and Lambda, however, it is designed to push TableSizeBytes instead of ItemCount. Some small modifications to the Lambda will allow you to record ItemCount.
Change this line from TableSizeBytes to ItemCount
Remove line 18 to line 27

You could use a DynamoDBMapper query.
PaginatedQueryList<YourModel> list = dynamoDBMapper.query(YourModel.class, queryExpression);
int count = list.size();
size() calls loadAllResults(), which loads each subsequent page of results until all results have been loaded.
Ref: https://docs.amazonaws.cn/en_us/amazondynamodb/latest/developerguide/DynamoDBMapper.Methods.html#DynamoDBMapper.Methods.query

This is how you would do it using the DynamoDBMapper (Kotlin syntax), example with no filters at all:
dynamoDBMapper.count(MyEntity::class.java, DynamoDBScanExpression())

$aws = new Aws\DynamoDb\DynamoDbClient([
'region' => 'us-west-2',
'version' => 'latest',
]);
$result = $aws->scan(array(
'TableName' => 'game_table',
'Select' => 'COUNT'
));
echo $result['Count'];

len(response['Items'])
will give you the count of the filtered rows
where,
fe = Key('entity').eq('tesla')
response = table.scan(FilterExpression=fe)

I used scan to get the total item count of the required table. Following is a Java code snippet for the same:
long totalItemCount = 0;
ScanResult result = null;
do{
ScanRequest req = new ScanRequest();
req.setTableName(tableName);
if(result != null){
req.setExclusiveStartKey(result.getLastEvaluatedKey());
}
result = client.scan(req);
totalItemCount += result.getItems().size();
} while(result.getLastEvaluatedKey() != null);
System.out.println("Result size: " + totalItemCount);

This is a solution for AWS JavaScript SDK users; it is almost the same for other languages.
result.data.Count will give you what you are looking for:
apigClient.getitemPost({}, body, {})
.then(function(result){
var dataoutput = result.data.Items[0];
console.log(result.data.Count);
}).catch( function(result){
});

Related

Two queries mysql in one json object and retrieve each details in Codeigniter

I have two tables that I want to convert them to json like this:
$first_query = $this->db->query("SELECT * FROM `parameters` WHERE patient_id=7 ORDER BY created_on DESC LIMIT 10");
$second_query = $this->db->query("SELECT * FROM `pat_details` WHERE email='".$email."' AND phone_num='".$phone_num."'");
$first_query returns a set of 10 records, which should be stored in an array, and $second_query returns the patient objects. I need to merge the two MySQL query results into one JSON object and retrieve the details of the result.
The output should be:
[
{
"firstname":"xyz",
"id":"123456",
"mail":"xyz@gmail.com",
"parameters":[
{
"diabetic":"no",
"hypertension":"yes"
},
{
"diabetic":"no",
"hypertension":"yes"
},
{
"diabetic":"yes",
"hypertension":"no",
}
]
}
]
I am not able to combine these two queries into one and encode the results as JSON.
You can use a join query or a subquery. Since the data is not relational to each other, there's not a huge performance benefit to this.
If you just want to combine the data, return or cast the results as arrays, use array_merge() on the two arrays, and then use json_encode().
And be sure to make your queries safe from SQL injection. I'd suggest using the CodeIgniter Query Builder to make database interactions simpler and more secure.
I have found the solution:
$first_query = $this->db->query("SELECT * FROM `parameters` WHERE patient_id=7 ORDER BY created_on DESC LIMIT 10");
$json = $first_query->result();
$second_query = $this->db->query("SELECT * FROM `pat_details` WHERE email='".$email."' AND phone_num='".$phone_num."'");
$json2 = array();
foreach($second_query->result_array() as $row){
$json2[] = array(
'name' => $row['name'],
'address' => $row['address'],
'mail' => $row['mail']
);
}
$json['parameters'] = $json2;
echo json_encode($json);
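For comparison, the shape requested in the question (one patient record carrying a nested "parameters" list) can be sketched in Python. This only illustrates the nesting step, not CodeIgniter code; the function name build_payload and the sample rows are hypothetical.

```python
import json


def build_payload(patient, parameter_rows):
    """Nest the parameter rows under the patient record and wrap it in a list."""
    record = dict(patient)  # copy so the input row is not mutated
    record["parameters"] = [dict(row) for row in parameter_rows]
    return [record]


# Sample rows standing in for the two query results.
patient = {"firstname": "xyz", "id": "123456", "mail": "xyz@gmail.com"}
parameters = [
    {"diabetic": "no", "hypertension": "yes"},
    {"diabetic": "yes", "hypertension": "no"},
]

print(json.dumps(build_payload(patient, parameters), indent=2))
```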

Cakephp 2.5 subtract count from unset array operation

I am using CakePHP 2.5 and need to remove some results from the query:
$this->paginate = $paginate;
$results = $this->paginate('services');
foreach($results as $key=>$data )
{
if( empty( $data['services']['service_id'] ) )
{
unset($results[$key]);
}
}
The result count still reflects the original count.
Is there a way to subtract from the paginate query count when I unset query results?
Looking at the Paginator class, I cannot see a property with the result count information.
Don't post-process the results of an sql call.
If you post-process the results of the sql call, unless the results fit in one page the count is always going to be inaccurate - because you'd be removing results from the current page, yet other results you would remove would still be present in the other pages - affecting the count.
Make paginate return what you want
Instead, just make the database give you the results you want, and none you don't:
$this->paginate['conditions'] = ['service_id IS NOT NULL'];
$results = $this->paginate('services');
If service id can be null or 0, you can account for that using a greater-than comparison:
$this->paginate['conditions'] = ['service_id >' => 0];
$results = $this->paginate('services');
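The reasoning above can be illustrated with a small, self-contained Python sketch. It is not CakePHP code: the paginate helper and the sample rows are hypothetical stand-ins that only show why filtering after pagination leaves the count and page sizes inconsistent.

```python
def paginate(rows, page, per_page):
    """Return one page of rows (pages are 1-indexed)."""
    start = (page - 1) * per_page
    return rows[start:start + per_page]


# Ten rows; the even ids have no service_id.
rows = [{"id": i, "service_id": i if i % 2 else None} for i in range(1, 11)]

# Post-processing: paginate first, then drop rows from the current page.
page_one = paginate(rows, page=1, per_page=4)
kept = [r for r in page_one if r["service_id"] is not None]
reported_total = len(rows)  # still counts rows that would be removed on other pages

# Filtering first: the count and the page contents agree.
valid = [r for r in rows if r["service_id"] is not None]
page_one_filtered = paginate(valid, page=1, per_page=4)
filtered_total = len(valid)
```

With post-processing, the first page shrinks to two rows while the reported total stays at ten; filtering in the query yields a full page of four and a total of five.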

How to increment a variable in Gatling Loop

I am trying to write a Gatling script where I read a starting number from a CSV file and loop through, say 10 times. In each iteration, I want to increment the value of the parameter.
It looks like some Scala or Java math is needed but could not find information on how to do it or how and where to combine Gatling EL with Scala or Java.
Appreciate any help or direction.
var numloop = new java.util.concurrent.atomic.AtomicInteger(0)
val scn = scenario("Scenario Name")
.asLongAs(_=> numloop.getAndIncrement() <3, exitASAP = false){
feed(csv("ids.csv")) //read ${ID} from the file
.exec(http("request")
.get("""http://finance.yahoo.com/q?s=${ID}""")
.headers(headers_1))
.pause(284 milliseconds)
//How to increment ID for the next iteration and pass in the .get method?
}
You copy-pasted this code from Gatling's Google Group but this use case was very specific.
Did you first properly read the documentation regarding loops? What's your use case and how doesn't it fit with basic loops?
Edit: So the question is: how do I get a unique id per loop iteration and per virtual user?
You can compute one for the loop index and a virtual user id. Session already has a unique ID but it's a String UUID, so it's not very convenient for what you want to do.
// first, let's build a Feeder that sets a numeric id:
val userIdFeeder = Iterator.from(0).map(i => Map("userId" -> i))
val iterations = 1000
// set this userId to every virtual user
feed(userIdFeeder)
// loop and define the loop index
.repeat(iterations, "index") {
// set a new attribute named "id"
exec{ session =>
val userId = session("userId").as[Int]
val index = session("index").as[Int]
val id = iterations * userId + index
session.set("id", id)
}
// use id attribute, for example with EL ${id}
}
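To see why id = iterations * userId + index never collides, here is the arithmetic on its own, sketched in Python rather than Gatling. It assumes the loop index runs from 0 to iterations - 1; the function name unique_id and the choice of three users are hypothetical.

```python
def unique_id(user_id, index, iterations):
    """Each user owns a block of `iterations` consecutive ids."""
    return iterations * user_id + index


iterations = 1000

# Three virtual users, each looping `iterations` times.
ids = [
    unique_id(user_id, index, iterations)
    for user_id in range(3)
    for index in range(iterations)
]
```

User 0 gets ids 0 to 999, user 1 gets 1000 to 1999, and so on, so no two (user, iteration) pairs share an id as long as index stays below iterations.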
Here is my answer to this:
Problem Statement: I had to repeat the gatling execution for configured set of times, and my step name has to be dynamic.
object UrlVerifier {
val count = new java.util.concurrent.atomic.AtomicInteger(0)
val baseUrl = Params.applicationBaseUrl
val accessUrl = repeat(Params.noOfPagesToBeVisited,"index") {
exec(session=> {
val randomUrls: List[String] = UrlFeeder.getUrlsToBeTested()
session.set("index", count.getAndIncrement).set("pageToTest", randomUrls(session("index").as[Int]))
}
).
exec(http("Accessing Page ${pageToTest}")
.get(baseUrl+"${pageToTest}")
.check(status.is(200))).pause(Params.timeToPauseInSeconds)
}
}
So basically UrlFeeder gives me a list of Strings (the URLs to be tested). In the exec, we use count (an AtomicInteger) to populate a session variable named 'index', whose value starts at 0 and is incremented on each iteration via getAndIncrement. This 'index' variable is the one used within the repeat() loop, since we specify 'index' as the name of its counter variable.
Hope it helps others as well.

Eager load, ArrayResult & Doctrine 2

I need to provide a webservice which returns articles.
I want to include the user relationship in that result to avoid my clients to call another method to load the user object.
I use an Array Result because I want a collection of array (I think it's better to work with) so I wish I could eager load my user.
I tried:
* @ManyToOne(targetEntity="\My\Model\User\User", fetch="EAGER")
But it doesn't seem to work.
Edit, some code:
public function getPublishedArticles($page, $count, $useArrayResult = false) {
$qb = $this->createQueryBuilder('a');
$qb->where('a.status = :status')
->orderBy('a.published_date', 'DESC')
->addOrderBy('a.creation_date', 'DESC')
->setParameter('status', Article::STATUS_PUBLISHED )
->andWhere('a.published_date <= :date')
->setParameter('date', date('Y-m-d'));
$adapter = new PaginationAdapter($qb->getQuery());
$adapter->useArrayResult($useArrayResult);
$paginator = new \Zend_Paginator($adapter);
$paginator->setItemCountPerPage($count)
->setCurrentPageNumber($page);
return $paginator;
}
And I call this method with the $useArrayResult flag set to TRUE.
When you're using a DQL query, you have to add a JOIN clause to join related entities:
$qb->createQueryBuilder('a')
->addSelect('u')
->join('a.user', 'u')
...
fetch="EAGER" and fetch="LAZY" apply when you fetch entities through the EntityManager, i.e.:
$article = $em->find('Entity\Article', 123);

Custom query with Castle ActiveRecord

I'm trying to figure out how to execute a custom query with Castle ActiveRecord.
I was able to run simple query that returns my entity, but what I really need is the query like that below (with custom field set):
select count(1) as cnt, data from workstationevent where serverdatetime >= :minDate and serverdatetime < :maxDate and userId = 1 group by data having count(1) > :threshold
Thanks!
In this case what you want is HqlBasedQuery. Your query will be a projection, so what you'll get back will be an ArrayList of tuples containing the results (the content of each element of the ArrayList will depend on the query, but for more than one value will be object[]).
HqlBasedQuery query = new HqlBasedQuery(typeof(WorkStationEvent),
@"select count(1) as cnt, data from workstationevent where
serverdatetime >= :minDate and serverdatetime < :maxDate
and userId = 1 group by data having count(1) > :threshold");
var results =
(ArrayList)ActiveRecordMediator.ExecuteQuery(query);
foreach(object[] tuple in results)
{
int count = (int)tuple[0]; // = cnt
string data = (string)tuple[1]; // = data (assuming this is a string)
// do something here with these results
}
You can create an anonymous type to hold the results in a more meaningful fashion. For example:
var results = from object[] summary in
(ArrayList)ActiveRecordMediator.ExecuteQuery(query)
select new {
Count = (int)summary[0], Data = (string)summary[1]
};
Now results will contain a collection of anonymous types with properties Count and Data. Or indeed you could create your own summary type and populate it out this way too.
ActiveRecord also has the ProjectionQuery which does much the same thing but can only return actual mapped properties rather than aggregates or functions as you can with HQL.
Be aware though, if you're using ActiveRecord 1.0.3 (RC3) as I was, this will result in a runtime InvalidCastException. ActiveRecordMediator.ExecuteQuery returns an ArrayList and not a generic ICollection. So in order to make it work, just change this line:
var results = (ICollection<object[]>) ActiveRecordMediator.ExecuteQuery(query);
to
var results = (ArrayList) ActiveRecordMediator.ExecuteQuery(query);
and it should work.
Also note that using count(1) in your hql statement will make the query return an ArrayList of String instead of an ArrayList of object[] (which is what you get when using count(*).)
Just thought I'd point this out for the sake of having it all documented in one place.
