I wanted to learn gprof, so I started with a simple program.
I have written a small program in C, shown below:
#include <stdio.h>
#include <unistd.h>

void hello(void);

int main()
{
    hello();
    return 0;
}

void hello()
{
    int i;
    for (i = 0; i < 60; i++)
    {
        sleep(1);
        printf("hello world\n");
    }
}
I compiled my program using the -pg option
and executed it to make sure that it works fine.
Then I ran:
gprof -f hello a.out > gout
This creates the gout file.
Inside the gout file I can see the information below.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
  -nan      0.00     0.00        3     0.00     0.00  __1cH__CimplWnew_atexit_implemented6F_b_ (1561)
  -nan      0.00     0.00        1     0.00     0.00  __1cFhello6F_v_ (1562)
  -nan      0.00     0.00        1     0.00     0.00  __1cG__CrunMdo_exit_code6F_v_ (1563)
  -nan      0.00     0.00        1     0.00     0.00  __1cG__CrunSregister_exit_code6FpG_v_v_ (1564)
  -nan      0.00     0.00        1     0.00     0.00  __1cG__CrunVdo_exit_code_in_range6Fpv1_v_ (1565)
  -nan      0.00     0.00        1     0.00     0.00  __1cH__CimplKcplus_fini6F_v_ (1566)
  -nan      0.00     0.00        1     0.00     0.00  __1cH__CimplQ__type_info_hash2t5B6M_v_ (1567)
  -nan      0.00     0.00        1     0.00     0.00  __1cU__STATIC_CONSTRUCTOR6F_v_ (1568)
  -nan      0.00     0.00        1     0.00     0.00  __SLIP.FINAL__A (1569)
  -nan      0.00     0.00        1     0.00     0.00  __SLIP.INIT_A (1570)
  -nan      0.00     0.00        1     0.00     0.00  __cplus_fini_at_exit (1571)
  -nan      0.00     0.00        1     0.00     0.00  _ex_deregister (1572)
  -nan      0.00     0.00        1     0.00     0.00  main (1)
Index by function name
(1562) __1cFhello6F_v_ (1567) __1cH__CimplQ__type(1571) __cplus_fini_at_exi
(1563) __1cG__CrunMdo_exit(1561) __1cH__CimplWnew_at(1572) _ex_deregister
(1564) __1cG__CrunSregiste(1568) __1cU__STATIC_CONST (1) main
(1565) __1cG__CrunVdo_exit(1569) __SLIP.FINAL__A
(1566) __1cH__CimplKcplus_(1570) __SLIP.INIT_A
The program sleeps for a total of 60 seconds,
but I am not seeing those 60 seconds in the gprof output.
I believe they are probably hidden somewhere in the output.
Could anybody please help me understand the output of gprof?
gprof samples the program only while it is consuming CPU time, so time spent in I/O, sleep, and other blocking system calls is not counted. That is why the 60 seconds of sleep do not show up in gprof's report.
I have data for multilabel classification, and I used a KNN model to classify it. There are 15 labels. I computed the accuracy for each label and averaged the results to get the accuracy of the model, which is 93%.
However, the confusion matrix and the classification report show bad numbers.
Could you tell me what this means? Is it overfitting? How can I solve my problem?
Accuracy and mean absolute error (MAE) code
Input:
from sklearn.metrics import mean_absolute_error

# Getting the per-label accuracy of the model
y_pred1 = level_1_knn_model.predict(X_val1)
accuracy = (sum(y_val1==y_pred1)/y_val1.shape[0])*100
print("Accuracy: "+str(accuracy)+"%\n")
# Averaging the per-label accuracies
accuracy = sum(accuracy)/len(accuracy)
print("Accuracy: "+str(accuracy)+"%\n")
# Getting the mean absolute error
mae1 = mean_absolute_error(y_val1, y_pred1)
print("Mean Absolute Error: "+str(mae1))
Output:
Accuracy: [96.55462575 97.82146336 99.23207908 95.39247451 98.69340807 74.22793801
78.67975909 97.47825108 99.80189098 77.67264969 91.69399776 99.97084683
99.42621267 99.32682688 99.74159693]%
Accuracy: 93.71426804569977%
Mean Absolute Error: 9.703818402273944
Confusion Matrix and classification report code
Input:
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import classification_report, confusion_matrix

# Calculate the confusion matrix
cMatrix1 = confusion_matrix(y_val1.argmax(axis=1), y_pred1.argmax(axis=1))
# Plot the confusion matrix
plt.figure(figsize=(11,10))
sns.heatmap(cMatrix1, annot=True, fmt='g')
# Calculate the classification report
classReport1 = classification_report(y_val1, y_pred1)
print("\nClassification Report:")
print(classReport1)
Output:
Classification Report:
              precision    recall  f1-score   support

           0       0.08      0.00      0.01      5053
           1       0.03      0.00      0.01      3017
           2       0.00      0.00      0.00      1159
           3       0.07      0.00      0.01      6644
           4       0.00      0.00      0.00      1971
           5       0.58      0.65      0.61     47222
           6       0.39      0.33      0.36     27302
           7       0.02      0.00      0.00      3767
           8       0.00      0.00      0.00       299
           9       0.58      0.61      0.60     40823
          10       0.13      0.02      0.03     11354
          11       0.00      0.00      0.00        44
          12       0.00      0.00      0.00       866
          13       0.00      0.00      0.00      1016
          14       0.00      0.00      0.00       390

   micro avg       0.54      0.43      0.48    150927
   macro avg       0.13      0.11      0.11    150927
weighted avg       0.43      0.43      0.42    150927
 samples avg       0.43      0.43      0.43    150927
I used gprof to profile a C program that is running too slowly. Here is what I get:
Flat profile:
Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  Ts/call  Ts/call  name
100.05      0.16     0.16                             etext
  0.00      0.16     0.00    90993     0.00     0.00  Nel_wind
  0.00      0.16     0.00    27344     0.00     0.00  calc_crab_dens
  0.00      0.16     0.00    17472     0.00     0.00  Nel_radio
  0.00      0.16     0.00     1786     0.00     0.00  sync
  0.00      0.16     0.00        1     0.00     0.00  _fini
  0.00      0.16     0.00        1     0.00     0.00  calc_ele
  0.00      0.16     0.00        1     0.00     0.00  ic
  0.00      0.16     0.00        1     0.00     0.00  initialize
  0.00      0.16     0.00        1     0.00     0.00  make_table
I don't know what "etext" means or why it is taking 100.05% of the running time. Thanks for your help!
I had a similar issue, and it was caused by calling gprof with an executable that did not match the gmon.out file.
It happened because I was recompiling with different options and naively ran gprof with the same executable name on two gmon.out files that had been generated by different executables.
gprof exec1 exec1.gmon.out # Good, expected output
gprof exec1 exec2.gmon.out # Weird etext function with 0 calls, but lots of time consumed
Make sure you're not doing something similar.
I am trying to trace how this open source program, mhash, computes its hashes.
I can build the example successfully with the following command:
gcc -o example example.c -lmhash
(also, mhash is currently installed, and I am running Ubuntu Linux)
Mhash can be found here: http://mhash.sourceforge.net/
and the example that I have tried is here:
#include <mhash.h>
#include <stdio.h>
int main()
{
    char password[] = "Jefe";
    int keylen = 4;
    char data[] = "what do ya want for nothing?";
    int datalen = 28;
    MHASH td;
    unsigned char *mac;
    int j;

    td = mhash_hmac_init(MHASH_MD5, password, keylen,
                         mhash_get_hash_pblock(MHASH_MD5));
    mhash(td, data, datalen);
    mac = mhash_hmac_end(td);
    /*
     * The output should be 0x750c783e6ab0b503eaa86e310a5db738
     * according to RFC 2104.
     */
    printf("0x");
    for (j = 0; j < mhash_get_block_size(MHASH_MD5); j++) {
        printf("%.2x", mac[j]);
    }
    printf("\n");
    /* return instead of exit(0), which would need <stdlib.h> */
    return 0;
}
I have read the API documentation, which is very good, but there are so many files that I do not know where the algorithms are actually implemented.
Thanks in advance for your time and help.
Your question seems a little vague to me ... I'm not sure I fully understand it. I'll venture an answer, though.
If you simply don't know what gets executed to compute that MD5 hash, the easiest way to find out is probably to run this example program of yours under a debugger. Make sure debug symbols are enabled in your mhash library (they seem to be on by default), then step into mhash and see where that gets you. You cannot miss anything this way.
In gdb it would look something like this (you'd probably want to use an IDE, Eclipse perhaps, to make it a LOT prettier):
$ gdb ./test.exe
..
Reading symbols from /home/B41655/workspace/ctest/test.exe...done.
(gdb) break main
Breakpoint 1 at 0x4011af: file test.c, line 5.
(gdb) run
Starting program: /home/B41655/workspace/ctest/test.exe
[New Thread 10200.0x205c]
[New Thread 10200.0x27b0]
Breakpoint 1, main () at test.c:5
5 char password[] = "Jefe";
(gdb) s
6 int keylen = 4;
(gdb) s
7 char data[] = "what do ya want for nothing?";
(gdb) s
8 int datalen = 28;
(gdb) s
13 td = mhash_hmac_init(MHASH_MD5, password, keylen,
(gdb) s
mhash_get_hash_pblock (type=MHASH_MD5) at mhash.c:438
438 {
(gdb) s
441 MHASH_ALG_LOOP(ret = p->hash_pblock);
and so on ...
If by any chance you want to passively get some sort of call graph of your example program's execution, you could do that with a profiler. Using gprof on this program would produce something like this (this requires your library and program to be recompiled with the -pg flag):
index % time self children called name
0.00 0.00 17/17 main [81]
[2] 0.0 0.00 0.00 17 mhash_get_block_size [2]
-----------------------------------------------
0.00 0.00 1/9 mhash [14]
0.00 0.00 2/9 mhash_hmac_deinit [17]
0.00 0.00 2/9 mhash_hmac_init [20]
0.00 0.00 2/9 MD5Update [9]
0.00 0.00 2/9 MD5Final [10]
[3] 0.0 0.00 0.00 9 mutils_memcpy [3]
-----------------------------------------------
0.00 0.00 1/6 mhash_deinit [15]
0.00 0.00 1/6 mhash_hmac_init [20]
0.00 0.00 2/6 mhash_hmac_deinit [17]
0.00 0.00 2/6 MD5Final [10]
[4] 0.0 0.00 0.00 6 mutils_bzero [4]
-----------------------------------------------
0.00 0.00 1/6 mhash_hmac_end_m [19]
0.00 0.00 1/6 mhash_hmac_init [20]
0.00 0.00 4/6 mhash_init_int [12]
[5] 0.0 0.00 0.00 6 mutils_malloc [5]
-----------------------------------------------
0.00 0.00 2/6 MD5Update [9]
0.00 0.00 4/6 MD5Final [10]
[6] 0.0 0.00 0.00 6 mutils_word32nswap [6]
-----------------------------------------------
0.00 0.00 1/5 mhash_deinit [15]
0.00 0.00 4/5 mhash_hmac_deinit [17]
[7] 0.0 0.00 0.00 5 mutils_free [7]
-----------------------------------------------
0.00 0.00 2/4 MD5Update [9]
0.00 0.00 2/4 MD5Final [10]
[8] 0.0 0.00 0.00 4 MD5Transform [8]
-----------------------------------------------
0.00 0.00 1/4 mhash [14]
0.00 0.00 1/4 mhash_hmac_init [20]
0.00 0.00 2/4 mhash_hmac_deinit [17]
[9] 0.0 0.00 0.00 4 MD5Update [9]
0.00 0.00 2/9 mutils_memcpy [3]
0.00 0.00 2/6 mutils_word32nswap [6]
0.00 0.00 2/4 MD5Transform [8]
-----------------------------------------------
0.00 0.00 1/2 mhash_deinit [15]
0.00 0.00 1/2 mhash_hmac_deinit [17]
[10] 0.0 0.00 0.00 2 MD5Final [10]
0.00 0.00 4/6 mutils_word32nswap [6]
0.00 0.00 2/4 MD5Transform [8]
0.00 0.00 2/9 mutils_memcpy [3]
0.00 0.00 2/6 mutils_bzero [4]
-----------------------------------------------
0.00 0.00 2/2 mhash_init_int [12]
[11] 0.0 0.00 0.00 2 MD5Init [11]
-----------------------------------------------
0.00 0.00 1/2 mhash_hmac_deinit [17]
0.00 0.00 1/2 mhash_hmac_init [20]
[12] 0.0 0.00 0.00 2 mhash_init_int [12]
0.00 0.00 4/6 mutils_malloc [5]
0.00 0.00 2/2 mutils_memset [13]
0.00 0.00 2/2 MD5Init [11]
-----------------------------------------------
0.00 0.00 2/2 mhash_init_int [12]
[13] 0.0 0.00 0.00 2 mutils_memset [13]
-----------------------------------------------
0.00 0.00 1/1 main [81]
[14] 0.0 0.00 0.00 1 mhash [14]
0.00 0.00 1/9 mutils_memcpy [3]
0.00 0.00 1/4 MD5Update [9]
-----------------------------------------------
0.00 0.00 1/1 mhash_hmac_deinit [17]
[15] 0.0 0.00 0.00 1 mhash_deinit [15]
0.00 0.00 1/6 mutils_bzero [4]
0.00 0.00 1/2 MD5Final [10]
0.00 0.00 1/5 mutils_free [7]
-----------------------------------------------
0.00 0.00 1/1 main [81]
[16] 0.0 0.00 0.00 1 mhash_get_hash_pblock [16]
-----------------------------------------------
0.00 0.00 1/1 mhash_hmac_end_m [19]
[17] 0.0 0.00 0.00 1 mhash_hmac_deinit [17]
0.00 0.00 4/5 mutils_free [7]
0.00 0.00 2/9 mutils_memcpy [3]
0.00 0.00 2/4 MD5Update [9]
0.00 0.00 2/6 mutils_bzero [4]
0.00 0.00 1/2 mhash_init_int [12]
0.00 0.00 1/2 MD5Final [10]
0.00 0.00 1/1 mhash_deinit [15]
-----------------------------------------------
0.00 0.00 1/1 main [81]
[18] 0.0 0.00 0.00 1 mhash_hmac_end [18]
0.00 0.00 1/1 mhash_hmac_end_m [19]
-----------------------------------------------
0.00 0.00 1/1 mhash_hmac_end [18]
[19] 0.0 0.00 0.00 1 mhash_hmac_end_m [19]
0.00 0.00 1/6 mutils_malloc [5]
0.00 0.00 1/1 mhash_hmac_deinit [17]
-----------------------------------------------
0.00 0.00 1/1 main [81]
[20] 0.0 0.00 0.00 1 mhash_hmac_init [20]
0.00 0.00 2/9 mutils_memcpy [3]
0.00 0.00 1/2 mhash_init_int [12]
0.00 0.00 1/6 mutils_malloc [5]
0.00 0.00 1/6 mutils_bzero [4]
0.00 0.00 1/4 MD5Update [9]
-----------------------------------------------
This shows you which functions were executed and how they were called.
For a fairly large AES program, I profile the code using gprof with the following commands:
cc file1.c file2.c -pg
./a.out
gprof a.out gmon.out > analysis.txt
cat analysis.txt
The output file shows a time of 0 for all function calls:
Flat profile:
Each sample counts as 0.01 seconds.
no time accumulated
  %   cumulative   self              self     total
 time   seconds   seconds    calls  Ts/call  Ts/call  name
  0.00      0.00     0.00      576     0.00     0.00  galois_multiply
  0.00      0.00     0.00       40     0.00     0.00  getSBoxValue
  0.00      0.00     0.00       33     0.00     0.00  PrintArr
  0.00      0.00     0.00       11     0.00     0.00  AddRoundKey
  0.00      0.00     0.00       10     0.00     0.00  core
  0.00      0.00     0.00       10     0.00     0.00  getRconValue
  0.00      0.00     0.00       10     0.00     0.00  myrotate
  0.00      0.00     0.00       10     0.00     0.00  shiftRow
  0.00      0.00     0.00       10     0.00     0.00  subByte
  0.00      0.00     0.00        9     0.00     0.00  mixColumn
  0.00      0.00     0.00        1     0.00     0.00  ReadInput
  0.00      0.00     0.00        1     0.00     0.00  expandKey
Am I missing something? Kindly advise.
I tried using Eclipse TPTP, but couldn't figure out a way to profile C code with it. Any ideas in that direction would also be appreciated.
Is there any online tool to which I can upload my code and get a detailed analysis report?
I have the output below from gprof for my program:
Flat profile:
Each sample counts as 0.01 seconds.
no time accumulated
  %   cumulative   self              self     total
 time   seconds   seconds    calls  Ts/call  Ts/call  name
  0.00      0.00     0.00    30002     0.00     0.00  insert
  0.00      0.00     0.00    10124     0.00     0.00  getNode
  0.00      0.00     0.00     3000     0.00     0.00  search
  0.00      0.00     0.00        1     0.00     0.00  initialize
I have done some optimizations, and the run time I now get is 0.01 s (measured on the server where I upload my code), which is the lowest I can get at the moment. I am not able to reduce it further, though I want to. Does the 0.01 s run time of my program have anything to do with the sampling interval I see above in the gprof output?
The call graph is below:
gprof -q ./a.out gmon.out
Call graph (explanation follows)
granularity: each sample hit covers 2 byte(s) no time propagated
index % time self children called name
0.00 0.00 30002/30002 main [10]
[1] 0.0 0.00 0.00 30002 insert [1]
0.00 0.00 10124/10124 getNode [2]
-----------------------------------------------
0.00 0.00 10124/10124 insert [1]
[2] 0.0 0.00 0.00 10124 getNode [2]
-----------------------------------------------
0.00 0.00 3000/3000 main [10]
[3] 0.0 0.00 0.00 3000 search [3]
-----------------------------------------------
0.00 0.00 1/1 main [10]
[4] 0.0 0.00 0.00 1 initialize [4]
-----------------------------------------------
Using `time /bin/sh -c ' ./a.out < inp.in '` on my machine, I get the output below, which varies slightly on every run:
real 0m0.024s
user 0m0.016s
sys 0m0.004s
real 0m0.017s
user 0m0.008s
sys 0m0.004s
I am a bit confused about how to correlate the time output with the gprof output.
According to your other question, you got it from 8 seconds down to 0.01 seconds.
That's pretty good.
Now if you want to go further, first do as @Peter suggested in his comment.
Run the code many times inside main() so it runs long enough to get samples.
Then you could try my favorite technique.
It will be much more informative than gprof.
P.S. Don't worry about CPU percent.
All it tells is if your machine is busy and not doing much I/O.
It does not tell you anything about your program.