Should most of getrusage's fields be 0?

I've written two system calls in Linux, and I measure the resource usage of both with getrusage within the system call. However, most of the results I get are 0, and I'm not sure whether that makes sense. Here's the output:
[ 4103.028728] DELTA RESULTS:
[ 4103.028746] u.tv_sec: 0
[ 4103.028748] s.tv_sec: 0
[ 4103.028749] u.tv_usec: 0
[ 4103.028751] s.tv_usec: 971765
[ 4103.028753] maxrss: 0
[ 4103.028755] ixrss: 0
[ 4103.028756] idrss: 0
[ 4103.028758] isrss: 0
[ 4103.028760] minflt: 0
[ 4103.028761] majflt: 0
[ 4103.028763] nswap: 0
[ 4103.028765] inblock: 0
[ 4103.028766] oublock: 0
[ 4103.028768] msgsnd: 0
[ 4103.028769] msgrcv: 0
[ 4103.028771] nsignals: 0
[ 4103.028773] nvcsw: 199
[ 4103.028774] nivcsw: 5
[ 4103.028961] CONTROL RESULTS:
[ 4103.028966] u.tv_sec: 0
[ 4103.028968] s.tv_sec: 0
[ 4103.028970] u.tv_usec: 1383
[ 4103.028972] s.tv_usec: 971998
[ 4103.028974] maxrss: 2492
[ 4103.028975] ixrss: 0
[ 4103.028977] idrss: 0
[ 4103.028978] isrss: 0
[ 4103.028980] minflt: 75
[ 4103.028982] majflt: 0
[ 4103.028984] nswap: 0
[ 4103.028986] inblock: 24
[ 4103.028987] oublock: 0
[ 4103.028989] msgsnd: 0
[ 4103.028991] msgrcv: 0
[ 4103.028992] nsignals: 0
[ 4103.028994] nvcsw: 200
[ 4103.028996] nivcsw: 5
I just want to know if this output is passable, or if it's a sign that something's wrong, so I haven't included any of the source code. Thank you!

This looks right; I would not expect the syscall to change the vast majority of these metrics, which measure resources accounted to the process, not kernel resources. You should only see a change if you make a syscall like mmap that allocates new resources to the process, or one like read that ends up storing to previously copy-on-write memory belonging to the process.
With that said, I don't think calling getrusage like this makes much sense at all. It's normally for getting aggregate usage over a process's lifetime, not for measuring deltas. Some of the more esoteric counters may be hard to measure deltas for in other ways, but plain time (real or CPU) can be measured via clock_gettime.
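For the timing part, here is a minimal sketch of the clock_gettime approach, measured from user space around the syscall (SYS_getpid stands in here for your custom syscall number):
#define _GNU_SOURCE
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

int main(void)
{
    struct timespec start, end;

    /* CLOCK_MONOTONIC measures elapsed wall-clock time;
       use CLOCK_PROCESS_CPUTIME_ID for CPU time instead. */
    clock_gettime(CLOCK_MONOTONIC, &start);
    syscall(SYS_getpid);   /* stand-in for the syscall being measured */
    clock_gettime(CLOCK_MONOTONIC, &end);

    long long delta_ns = (end.tv_sec - start.tv_sec) * 1000000000LL
                       + (end.tv_nsec - start.tv_nsec);
    printf("elapsed: %lld ns\n", delta_ns);
    return 0;
}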

Related

Is there any way to display all the functions called by an executable in WinDbg (not just the call stack)?

I am trying to debug an executable that does not work properly (it does not receive a segmentation fault; it just doesn't do what it should do) using WinDbg. I would like to see a call stack with all the functions that are called while running the executable. Is this possible in WinDbg or any other debugger?
Yes, as I commented, use wt (watch and trace).
Read the docs; it can be configured in several ways:
only first-level calls
up to nth-level calls only
only in specific modules
only in the main module, etc.
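For example (a sketch from memory; check the wt documentation for the exact switches), the depth and module filters above map to wt's -l and -m options:
0:000> wt -l 1            // first-level calls only
0:000> wt -l 3 -m ntdll   // up to three levels deep, only within ntdll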
Below is a simple trace of a function in ntdll that crosses the user-mode/kernel-mode (um-km) boundary:
0:000> u . l1
ntdll!LdrpInitializeProcess+0x11bf:
76ff6113 e870fffdff call ntdll!NtQueryInformationProcess (76fd6088)
0:000> bp .+5 //set a bp on return address
0:000> bl
0 e 76ff6118 0001 (0001) 0:**** ntdll!LdrpInitializeProcess+0x11c4
0:000> wt
2 0 [ 0] ntdll!NtQueryInformationProcess
27 0 [ 0] aswhook
1 0 [ 1] aswhook
28 1 [ 0] aswhook
1 0 [ 1] 0x6efc0480
1 0 [ 1] 0x6efc0485
2 0 [ 1] ntdll!NtQueryInformationProcess
2 0 [ 2] ntdll!KiFastSystemCall
1 0 [ 1] ntdll!NtQueryInformationProcess
46 8 [ 0] aswhook
3 0 [ 1] aswhook
Breakpoint 0 hit
eax=00000000 ebx=7ffdf000 ecx=e8cb8789 edx=ffffffff esi=ffffffff edi=00000000
eip=76ff6118 esp=0018f59c ebp=0018f6f4 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
ntdll!LdrpInitializeProcess+0x11c4:
76ff6118 85c0 test eax,eax
0:000>

"strange" virtual memory chunks

I have an ARM 32-bit C/C++ program running on Ubuntu 14.04. The max resident memory of the program is about 90MB. However, the virtual memory size of the program is huge: around 400MB. I use pmap with the -x switch to check the details. What I find, and don't quite understand, is that there are about 20-30 virtual memory chunks, each chunk's size is 8188KB, and its mode is RWX. The actual resident size of each 8188KB chunk is very small: something like 8KB, 12KB, or 24KB. Below is a snapshot for your reference.
Address Kbytes RSS Dirty Mode Mapping
00010000 12 12 0 r-x-- theApp
00022000 4 4 4 r---- theApp
00023000 4 4 4 rw--- theApp
00024000 3556 3536 3536 rw--- [ anon ]
9941c000 4 0 0 ----- [ anon ]
9941d000 8188 8 8 rwx-- [ anon ]
99c1c000 4 0 0 ----- [ anon ]
99c1d000 8188 8 8 rwx-- [ anon ]
9a41c000 4 0 0 ----- [ anon ]
9a41d000 8188 8 8 rwx-- [ anon ]
9ac1c000 4 0 0 ----- [ anon ]
9ac1d000 8188 8 8 rwx-- [ anon ]
9b41c000 4 0 0 ----- [ anon ]
9b41d000 8188 8 8 rwx-- [ anon ]
9bc1c000 4 0 0 ----- [ anon ]
9bc1d000 8188 24 24 rwx-- [ anon ]
...
b57f6000 28 28 28 rw--- libcrypto.so.1.0.0
b57fd000 12 4 4 rw--- [ anon ]
b5800000 4 0 0 ----- [ anon ]
b5801000 8188 12 12 rwx-- [ anon ]
...
In GDB, there is no information when I run info symbol "the chunk's address". I also dumped (dump memory command) some chunks to a file; every bit is zero.
In the program, there is no direct call to mmap; it only explicitly calls malloc and new for heap memory allocation, and I don't find any memory allocation of size 8188KB. Actually, I can identify each heap allocation of my code in the output of pmap. Of course, the program also uses some third-party shared libraries whose internal implementation I don't know.
My questions are:
Where might the memory chunks come from?
Is there any other way to trace them down?
If the system's swap file is disabled, will the program still run fine? The actual physical RAM is less than 400MB.
Any input will be appreciated!
Thanks,
-Neil

Reshaping overlapping columns of an array into a vector in MATLAB

I want to reshape an array into a vector by its columns, with an offset between each column and the overlapping elements added together.
Any ideas? I've done it using a double for-loop, but I was hoping for something more efficient...
[a, b] = size(array);                  % a rows, b columns
vector = zeros(1, a + (b-1)*offset);   % preallocate the result
for i = 1:b
    for j = 1:a
        idx = j + (i-1)*offset;        % destination index for array(j,i)
        vector(idx) = vector(idx) + array(j,i);
    end
end
For example, if I have:
[ 1 4 7 ]
[ 2 5 8 ]
[ 3 6 9 ]
and an offset of 1 between columns, then I want to get the following vector:
[ 1 2 7 5 13 8 9 ]
Edit: I thought of appending zeros and then adding per column, like this:
[ 1 2 3 0 0 0 0 ]
[ 0 0 4 5 6 0 0 ]
[ 0 0 0 0 7 8 9 ]
and then using a per-column sum to get a new vector whose elements are the column sums.
Does anyone know of a quick way to create such diagonal matrices?
Basically what you need is a general formula for this matrix:
[ 1 2 3 0 0 0 0 ]
[ 0 0 4 5 6 0 0 ]
[ 0 0 0 0 7 8 9 ]
This is a little easier if we rewrite the matrix as follows:
[ 1 2 3 0 0 0 0 0 0 4 5 6 0 0 0 0 0 0 7 8 9 ]
I'll state without proof that the number of zeros between each set of non-zero numbers is equal to:
nz = (size(array,1) - overlap) * size(array,2);
where overlap is the number of overlapping elements between consecutive columns (1 in your example).
You should be able to convince yourself that this is true fairly easily. Now we can do the following:
vector = [array;zeros(nz,size(array,2))];
vector = vector(1:end-nz);
which gives
vector = [ 1 2 3 0 0 0 0 0 0 4 5 6 0 0 0 0 0 0 7 8 9 ]
then we just reshape and sum:
vector = sum(reshape(vector,[],size(array,2))');
vector =
1 2 7 5 13 8 9
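Putting the steps together (a sketch; array is the input matrix and overlap the number of overlapping elements, with a separate name per stage so the padded matrix doesn't shadow the final vector):
nz = (size(array,1) - overlap) * size(array,2);   % zeros between non-zero runs
padded = [array; zeros(nz, size(array,2))];       % append nz zero rows
flat = padded(1:end-nz);                          % linearize column-major, drop trailing zeros
vector = sum(reshape(flat, [], size(array,2))');  % stack the shifted copies and sum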

Query data from multi-key sorted sets in Redis

I have several sorted sets stored in Redis, like:
ZADD tag:1 1 1 2 2 3 3 4 4 5 5 6 6
ZADD tag:2 21 1 22 2 23 3 24 4 25 5 26 6
ZADD tag:3 31 1 32 2 33 3 34 4 35 5 36 6
Here is my question: I want to get the data sorted by score in tag:1 and tag:2, or in tag:1 and tag:3, or in tag:1, tag:2, and tag:3. That means I need to get data from different key combinations ([ 1 ] [ 2 ] [ 3 ] [ 1,2,3 ] [ 1,2 ] [ 2,3 ] [ ... ]). I have hundreds of sorted sets like this, and each one can be combined with any one, two, or more of the others.
I'd rather not choose ZUNIONSTORE, because every combination is temporary, and ZUNIONSTORE will create another new sorted set that has a very low chance of being reused. So is there any good idea to solve my problem, or any new solution to help me? Thanks in advance!
Despite your reluctance, use ZUNIONSTORE for this. Once you're done, just DEL the result. This workflow can be embedded in a Lua script that performs the actions and returns the unified result.
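A minimal sketch of that script (the tmp: key prefix is just an illustration, and note that writing to a key not declared in KEYS is not cluster-safe):
-- call as: EVAL <script> 2 tag:1 tag:2   (KEYS lists the sets to combine)
local tmp = 'tmp:union:' .. table.concat(KEYS, ':')
redis.call('ZUNIONSTORE', tmp, #KEYS, unpack(KEYS))
local res = redis.call('ZRANGE', tmp, 0, -1, 'WITHSCORES')
redis.call('DEL', tmp)
return res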

Which clustering technique should I use?

I have a data matrix, given below.
It is a user access matrix: each row represents a user, and each column represents a page category visited by that user.
0 8 1 0 0 8 0 0 0 0 0 0 0 11 2 2 0
1 0 7 0 0 0 0 0 1 1 0 0 0 0 0 0 1
1 0 1 1 0 0 0 0 0 1 0 0 0 1 0 0 0
6 1 0 0 0 2 6 0 0 0 0 1 0 0 0 0 0
5 3 2 0 2 0 0 0 0 0 1 0 0 0 1 0 0
2 3 0 1 0 1 0 0 0 0 0 1 0 3 0 0 0
9 0 1 1 0 0 5 0 0 0 1 2 0 0 0 0 0
5 1 4 0 0 0 1 0 0 2 0 0 0 9 0 0 0
5 5 0 2 0 1 0 0 0 0 1 1 0 0 0 0 0
1 2 0 0 2 3 3 0 0 1 1 0 0 0 4 0 0
0 1 0 1 0 2 0 0 1 0 0 0 0 2 0 0 0
5 4 0 0 1 0 0 0 0 0 1 0 0 2 0 0 0
0 0 0 2 0 0 2 12 1 0 0 0 2 0 0 0 0
6 1 0 0 0 0 58 15 7 0 1 0 0 0 0 0 0
1 0 2 0 0 1 1 0 0 0 2 0 0 0 0 0 0
I need to apply a biclustering technique to it.
This biclustering technique first generates user clusters and then page clusters; after that, it combines both user and page clusters to generate biclusters.
Now I am confused about which clustering technique I should use for this purpose.
The best clustering will generate coherent biclusters from this matrix.
Here is a summary of several clustering algorithms that can help answer the question
"Which clustering technique should I use?"
There is no objectively "correct" clustering algorithm (Ref).
Clustering algorithms can be categorized based on their "cluster model". An algorithm designed for a particular kind of model will generally fail on a different kind of model. For example, k-means cannot find non-convex clusters; it can only find circularly shaped clusters.
Therefore, understanding these "cluster models" becomes the key to understanding how to choose among the various clustering algorithms / methods. Typical cluster models include:
[1] Connectivity models: Builds models based on distance connectivity. Eg hierarchical clustering. Used when we need different partitioning based on tree cut height. R function: hclust in stats package.
[2] Centroid models: Builds models by representing each cluster by a single mean vector. Used when we need crisp partitioning (as opposed to fuzzy clustering described later). R function: kmeans in stats package.
[3] Distribution models: Builds models based on statistical distributions such as multivariate normal distributions used by the expectation-maximization algorithm. Used when cluster shapes can be arbitrary unlike k-means which assumes circular clusters. R function: emcluster in the emcluster package.
[4] Density models: Builds models based on clusters as connected dense regions in the data space. Eg DBSCAN and OPTICS. Used when cluster shapes can be arbitrary, unlike k-means which assumes circular clusters. R function dbscan in package dbscan.
[5] Subspace models: Builds models based on both cluster members and relevant attributes. Eg biclustering (also known as co-clustering or two-mode-clustering). Used when simultaneous row and column clustering is needed. R function biclust in biclust package.
[6] Group models: Builds models based on the grouping information. Eg collaborative filtering (recommender algorithm). R function Recommender in recommenderlab package.
[7] Graph-based models: Builds models based on clique. Community structure detection algorithms try to find dense subgraphs in directed or undirected graphs. Eg R function cluster_walktrap in igraph package.
[8] Kohonen Self-Organizing Feature Map: Builds models based on neural network. R function som in the kohonen package.
[9] Spectral Clustering: Builds models based on non-convex cluster structure, or when a measure of the center is not a suitable description of the complete cluster. R function specc in the kernlab package.
[10] Subspace clustering: For high-dimensional data, distance functions can be problematic, so these cluster models include the relevant attributes for the cluster. Eg the hddc function in the R package HDclassif.
[11] Sequence clustering: Group sequences that are related. rBlast package.
[12] Affinity propagation: Builds models based on message passing between data points. It does not require the number of clusters to be determined before running the algorithm. It is better for certain computer vision and computational biology tasks, e.g. clustering of pictures of human faces and identifying regulated transcripts, than k-means, Ref Rpackage APCluster.
[13] Stream clustering: Builds models based on data that arrive continuously such as telephone records, financial transactions etc. Eg R package BIRCH [https://cran.r-project.org/src/contrib/Archive/birch/]
[14] Document clustering (or text clustering): Builds models based on SVD. It is used in topic extraction. Eg Carrot [http://search.carrot2.org] is an open-source search results clustering engine which can cluster documents into thematic categories.
[15] Latent class model: It relates a set of observed multivariate variables to a set of latent variables. LCA may be used in collaborative filtering. R function Recommender in recommenderlab package has collaborative filtering functionality.
[16] Biclustering: Used to simultaneously cluster rows and columns of two-mode data. Eg R function biclust in package biclust (a short sketch follows this list).
[17] Soft clustering (fuzzy clustering): Each object belongs to each cluster to a certain degree. Eg R function Fclust in the fclust package.
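Since the question is specifically about biclustering, here is a minimal sketch using the biclust package (the file name access_matrix.txt and the parameter values are assumptions; BCCC() is Cheng and Church's algorithm, and methods such as BCPlaid can be swapped in via the method argument):
library(biclust)

m <- as.matrix(read.table("access_matrix.txt"))  # the user x page-category matrix
res <- biclust(m, method = BCCC(), delta = 1.0, alpha = 1.5, number = 5)
summary(res)                                     # how many biclusters were found
m[res@RowxNumber[, 1], res@NumberxCol[1, ]]      # submatrix of the first bicluster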
You cannot tell which clustering algorithm is best just by looking at the matrix. You must try different algorithms (maybe k-means, Bayes, nearest-neighbor, or whatever your library has). Do a cross-validation (clustering is just a type of categorization where you assign users to cluster centers) and evaluate the results. You could even plot them on a chart. Then make a decision. No decision will be perfect; you will always have errors. And the result depends on what you expect. Maybe a result with more errors will look better from your personal point of view.
Have you tried any algorithm yet?
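For instance, a quick way to start (a sketch, assuming the access matrix is already loaded in m): run k-means for several cluster counts and compare the within-cluster sum of squares before settling on anything.
set.seed(42)                               # reproducible starting centers
for (k in 2:6) {
  fit <- kmeans(m, centers = k, nstart = 10)
  cat("k =", k, "total withinss =", fit$tot.withinss, "\n")
}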
