Increase heap to avoid OutOfMemoryError in WEKA

I am trying to run a J48 classifier in WEKA using the following command line:
$ java -Xmx2048m -cp /home/weka-3-7-9/weka.jar weka.classifiers.trees.J48 -t input.arff -i -k -d J48-data.model &
Although my ARFF file is only 43.8 MB and I increased the heap space to 2048m, I still get the following error:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.ArrayList.<init>(ArrayList.java:132)
at weka.core.Instances.initialize(Instances.java:196)
at weka.core.Instances.<init>(Instances.java:177)
at weka.classifiers.trees.j48.ClassifierSplitModel.split(ClassifierSplitModel.java:252)
at weka.classifiers.trees.j48.ClassifierTree.buildTree(ClassifierTree.java:159)
at weka.classifiers.trees.j48.C45PruneableClassifierTree.buildClassifier(C45PruneableClassifierTree.java:126)
at weka.classifiers.trees.J48.buildClassifier(J48.java:249)
at weka.classifiers.evaluation.Evaluation.evaluateModel(Evaluation.java:1485)
at weka.classifiers.Evaluation.evaluateModel(Evaluation.java:649)
at weka.classifiers.AbstractClassifier.runClassifier(AbstractClassifier.java:297)
at weka.classifiers.trees.J48.main(J48.java:1062)
Does anyone know whether I am doing something incorrectly, or can you point me to a different way to increase the heap?
Thank you in advance.

Quick instruction for Ubuntu users: the heap can be set by changing the line MEMORY="256m" in the file /usr/bin/weka with your favorite editor.

Weka's instructions state that the "-Xmx..." option will not work from the Simple CLI. I believe you should increase the heap size by editing the RunWeka.ini file. The link I provided should point you in the right direction.

In a terminal, use this command:
sudo gedit /usr/bin/weka
and change the size in the line
MEMORY="256m"

Related

warning: Error disabling address space randomization: Operation not permitted

What have I done wrong (or not done) such that gdb is not working properly for me?
root@6be3d60ab7c6:/# cat minimal.c
int main()
{
    int i = 1337;
    return 0;
}
root@6be3d60ab7c6:/# gcc -g minimal.c -o minimal
root@6be3d60ab7c6:/# gdb minimal
GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
.
.
.
Reading symbols from minimal...done.
(gdb) break main
Breakpoint 1 at 0x4004f1: file minimal.c, line 3.
(gdb) run
Starting program: /minimal
warning: Error disabling address space randomization: Operation not permitted
During startup program exited normally.
(gdb)
(gdb) print i
No symbol "i" in current context.
If you're using Docker, you probably need the --security-opt seccomp=unconfined option (as well as enabling ptrace):
docker run --cap-add=SYS_PTRACE --security-opt seccomp=unconfined
For whatever reason, your user account doesn't have permission to disable the kernel's address space layout randomisation for this process. By default, gdb turns this off because it makes some sorts of debugging easier (in particular, it means the address of stack objects will be the same each time you run your program). Read more here.
You can work around this problem by disabling this feature of gdb with set disable-randomization off.
As for getting your user the permission needed to disable ASLR, it probably boils down to having write permission to /proc/sys/kernel/randomize_va_space. Read more here.
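For example, you can pass that setting on the gdb command line so it takes effect before you type run (a sketch; minimal is the binary from the transcript above):
gdb -ex 'set disable-randomization off' ./minimal   # then "break main" and "run" as before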
Building on wisbucky's answer (thank you!), here are the same settings for Docker Compose:
security_opt:
- seccomp:unconfined
cap_add:
- SYS_PTRACE
The security option seccomp:unconfined fixed the address space randomization warnings.
The capability SYS_PTRACE didn't seem to have a noticeable effect even though the Docker documentation states that SYS_PTRACE is a capability that is "not granted by default". Perhaps I don't know what to look for.

How to solve "ptrace operation not permitted" when trying to attach GDB to a process?

I'm trying to attach to a program with gdb, but it returns:
Attaching to process 29139
Could not attach to process. If your uid matches the uid of the target
process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
again as the root user. For more details, see /etc/sysctl.d/10-ptrace.conf
ptrace: Operation not permitted.
gdb-debugger returns "Failed to attach to process, please check privileges and try again."
strace returns "attach: ptrace(PTRACE_ATTACH, ...): Operation not permitted"
I changed "kernel.yama.ptrace_scope" 1 to 0 and /proc/sys/kernel/yama/ptrace_scope 1 to 0 and tried set environment LD_PRELOAD=./ptrace.so with this:
#include <stdio.h>
int ptrace(int i, int j, int k, int l) {
    printf(" ptrace(%i, %i, %i, %i), returning -1\n", i, j, k, l);
    return 0;
}
But it still returns the same error. How can I attach it to debuggers?
If you are using Docker, you will probably need these options:
docker run --cap-add=SYS_PTRACE --security-opt seccomp=unconfined
If you are using Podman, you will probably need its --cap-add option too:
podman run --cap-add=SYS_PTRACE
This is due to kernel hardening in Linux; you can disable this behavior with echo 0 > /proc/sys/kernel/yama/ptrace_scope or by modifying it in /etc/sysctl.d/10-ptrace.conf.
See also this article about it in Fedora 22 (with links to the documentation) and this comment thread about Ubuntu.
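To apply the change immediately and also keep it across reboots on Ubuntu, something like this should work (a sketch, assuming the stock /etc/sysctl.d/10-ptrace.conf containing "kernel.yama.ptrace_scope = 1"):
sudo sysctl -w kernel.yama.ptrace_scope=0                                                  # takes effect now, lost on reboot
sudo sed -i 's/^kernel.yama.ptrace_scope = 1$/kernel.yama.ptrace_scope = 0/' /etc/sysctl.d/10-ptrace.conf
sudo sysctl --system                                                                       # reload all sysctl configuration files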
I would like to add that I needed --security-opt apparmor=unconfined along with the options that @wisbucky mentioned. This was on Ubuntu 18.04 (both Docker client and host). Therefore, the full invocation for enabling gdb debugging within a container is:
docker run --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --security-opt apparmor=unconfined
Just want to emphasize a related answer. Let's say that you're root and you've done:
strace -p 700
and get:
strace: attach: ptrace(PTRACE_SEIZE, 700): Operation not permitted
Check:
grep TracerPid /proc/700/status
If you see something like TracerPid: 12, i.e. not 0, that is the PID of the program that is already using the ptrace system call. Both gdb and strace use it, and only one tracer can be attached to a process at a time.
Not really addressing the above use case, but I had this problem:
Problem: it happened that I started my program with sudo, so when launching gdb it gave me ptrace: Operation not permitted.
Solution: sudo gdb ...
As most of us land here for Docker issues I'll add the Kubernetes answer as it might come in handy for someone...
You must add the SYS_PTRACE capability in your pod's security context, at spec.containers.securityContext:
securityContext:
  capabilities:
    add: [ "SYS_PTRACE" ]
There are two securityContext keys in two different places. If it tells you that the key is not recognized, then you misplaced it; try the other one.
You probably need to run as the root user too by default. So in the other security context (spec.securityContext), add:
securityContext:
  runAsUser: 0
  runAsGroup: 0
  fsGroup: 101
FYI: 0 is root. The fsGroup value is unknown to me; for what I'm doing I don't care, but you might.
Now you can do:
strace -s 100000 -e write=1 -e trace=write -p 16
You won't get the permission denied error anymore!
BEWARE: this opens Pandora's box. Running with these settings in production is NOT recommended.
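Once the pod is running with that capability, you can run the trace directly through kubectl (a sketch; mypod and PID 16 are placeholders matching the example above, and strace is assumed to be installed in the image):
kubectl exec -it mypod -- strace -s 100000 -e write=1 -e trace=write -p 16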
I was running my code with higher privileges to deal with raw Ethernet sockets, by setting capabilities (setcap) on a Debian distribution. I tried the above solution, echo 0 > /proc/sys/kernel/yama/ptrace_scope,
or modifying it in /etc/sysctl.d/10-ptrace.conf, but that did not work for me.
What did work was setting the capability on gdb itself in its installed location (/usr/bin/gdb): /sbin/setcap CAP_SYS_PTRACE=+eip /usr/bin/gdb.
Be sure to run this command with root privileges.
Jesup's answer is correct; it is due to Linux kernel hardening. In my case, I am using Docker Community for Mac, and in order to change the flag I had to enter the LinuxKit shell using Justin Cormack's nsenter (ref: https://www.bretfisher.com/docker-for-mac-commands-for-getting-into-local-docker-vm/).
docker run -it --rm --privileged --pid=host justincormack/nsenter1
/ # cat /etc/issue
Welcome to LinuxKit
/ # cat /proc/sys/kernel/yama/ptrace_scope
1
/ # echo 0 > /proc/sys/kernel/yama/ptrace_scope
/ # exit
Maybe someone has already attached to this process with gdb:
ps -ef | grep gdb
gdb can't attach to the same process twice.
I was going to answer this old question because it is unaccepted and the other answers miss the point. The real answer may already be written in /etc/sysctl.d/10-ptrace.conf, as was the case for me under Ubuntu. That file says:
For applications launching crash handlers that need PTRACE, exceptions can
be registered by the debugee by declaring in the segfault handler
specifically which process will be using PTRACE on the debugee:
prctl(PR_SET_PTRACER, debugger_pid, 0, 0, 0);
So just do what the file says: keep /proc/sys/kernel/yama/ptrace_scope at 1 and add prctl(PR_SET_PTRACER, debugger_pid, 0, 0, 0); in the debugee. Then the debugee will allow that debugger to trace it. This works without sudo and without a reboot.
Usually the debugee also needs to wait (for example with waitpid) so it does not exit after the crash, giving the debugger time to find its PID and attach.
If permissions are a problem, you probably will want to use gdbserver. (I almost always use gdbserver when I gdb, docker or no, for numerous reasons.) You will need gdbserver (Deb) or gdb-gdbserver (RH) installed in the docker image. Run the program in docker with
$ sudo gdbserver :34567 myprogram arguments
(pick a port number, 1025-65535). Then, in gdb on the host, say
(gdb) target remote 172.17.0.4:34567
where 172.17.0.4 is the IP address of the docker image as reported by /sbin/ip addr list run in the docker image. This will attach at a point before main runs. You can tb main and c to stop at main, or wherever you like. Run gdb under cgdb, emacs, vim, or even in some IDE, or plain. You can run gdb in your source or build tree, so it knows where everything is. (If it can't find your sources, use the dir command.) This is usually much better than running it in the docker image.
gdbserver relies on ptrace, so you will also need to do the other things suggested above. --privileged --pid=host sufficed for me.
If you deploy to other OSes or embedded targets, you can run gdbserver or a gdb stub there, and run gdb the same way, connecting across a real network or even via a serial port (/dev/ttyS0).
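Back in the Docker case, if you would rather not look up the container's IP address, you can publish the gdbserver port instead (a sketch; myimage, myprogram and the port are placeholders):
docker run --cap-add=SYS_PTRACE --security-opt seccomp=unconfined -p 34567:34567 myimage \
    gdbserver :34567 myprogram arguments
gdb -ex 'target remote localhost:34567' myprogram   # run on the host, from your build tree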
I don't know what you are doing with LD_PRELOAD or your ptrace function.
Why don't you try attaching gdb to a very simple program? Make a program that simply repeatedly prints Hello or something and use gdb --pid [hello program PID] to attach to it.
If that does not work then you really do have a problem.
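A quick way to run that test without writing any C (a sketch; it attaches to a background shell loop, which is enough to check ptrace permissions):
while true; do echo Hello; sleep 1; done &
gdb -p "$!"     # $! is the PID of the background loop; use "detach" and "quit" when done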
Another issue is the user ID. Is the program that you are tracing setting itself to another UID? If it is, then you cannot ptrace it unless you are using the same user ID or are root.
I faced the same problem and tried a lot of solutions; finally I found one, but I still don't really know what the problem was. First I modified the ptrace_scope value and logged into Ubuntu as root, but the problem still appeared. The strangest thing was that gdb showed me a message saying:
Could not attach to process. If your uid matches the uid of the target process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try again as the root user.
For more details, see /etc/sysctl.d/10-ptrace.conf
warning: process 3767 is already traced by process 3755 ptrace: Operation not permitted.
With the ps command in a terminal, process 3755 was not listed.
I found process 3755 under /proc/$pid, but I don't understand what it was!
Finally, I deleted the target file (foo.c) that I was trying to attach to via gdb and via a tracer C program using the PTRACE_ATTACH syscall, and in another folder I created another C program and compiled it.
The problem was solved, and I was able to attach to the new process either with gdb or with the PTRACE_ATTACH syscall.
(gdb) attach 4416
Attaching to process 4416
and I sent a lot of signals to process 4416. I tested it with both gdb and ptrace; both ran correctly.
I really don't know what the problem was, but I don't think it is a bug in Ubuntu, even though a lot of sites refer to it as one, such as https://askubuntu.com/questions/143561/why-wont-strace-gdb-attach-to-a-process-even-though-im-root
Extra information
If you want to make changes to the network interfaces, such as adding an OVS bridge, you must use --privileged instead of --cap-add NET_ADMIN.
sudo docker run -itd --name=testliz --privileged --cap-add=SYS_PTRACE --security-opt seccomp=unconfined ubuntu
If you are using FreeBSD, edit /etc/sysctl.conf, change the line
security.bsd.unprivileged_proc_debug=0
to
security.bsd.unprivileged_proc_debug=1
Then reboot.
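You can also apply the setting immediately with sysctl(8), keeping the /etc/sysctl.conf edit so it persists across reboots (a sketch; run as root):
sysctl security.bsd.unprivileged_proc_debug=1
sysctl security.bsd.unprivileged_proc_debug        # verify the new value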

Forcing program to create coredump on freebsd

In my project I added a new module and now my process is being terminated by signal 11.
I want to track down and understand the problem, but no core dump file is generated by FreeBSD.
My sysctl settings look like this:
sysctl -a | grep core
kern.corefile: /usr/core
kern.nodump_coredump: 1
kern.coredump: 1
kern.sugid_coredump: 1
debug.elf64_legacy_coredump: 1
debug.elf32_legacy_coredump: 1
I also set ulimit -c unlimited
From my code I removed all signal-handling code such as "sigaction(SIGTERM, &signal, &signal_old);"
so as not to prevent the kernel from generating a core dump.
Why can't I see any core dump? What am I missing?
Also, is there any method to force a program running on FreeBSD to create a core dump, equivalent to do_coredump() in Linux?
The problem is in:
kern.corefile: /usr/core
Something like the following should help:
sysctl -w kern.corefile="%N.core"
If I recall correctly, kern.corefile is the complete name of the resulting corefile, not the directory in which it should be placed. It also needs to be writable by the user running the process. /usr/core looks like a directory and/or a location writable only by root.
kern.nodump_coredump: 1 also looks suspicious. I don't remember that sysctl existing in the last version of FreeBSD I used, but it looks like it is intended to disable core dumps. Try setting it to 0.
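After adjusting those sysctls, you can verify that core dumps actually land where you expect with a throwaway process before re-running the real program (a sketch; run as root, from a directory writable by the process owner):
sysctl kern.corefile="%N.core"
sysctl kern.nodump_coredump=0
ulimit -c unlimited
sleep 100 &
kill -SEGV "$!"      # SIGSEGV's default action is to dump core
ls -l sleep.core     # should appear in the current directory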

Sybase initializes but does not run

I am using Red Hat 5.5 and I am trying to run Sybase ASE 12.5.4.
Yesterday I tried the command "service sybase start", and the console showed Sybase repeatedly trying, and failing, to initialize the main database server.
UPDATE:
I initialized a database at /ims_systemdb/master using the following commands:
dataserver -d /ims_systemdb/master -z 2k -b 51204 -c $SYBASE/ims.cfg -e db_error.log
chmod a=rwx /ims_systemdb/master
ls -al /ims_systemdb/master
And it gives me a nice database at /ims_systemdb/master with a size of 104865792 bytes (2048 x 51204).
But when I run
service sybase start
The error log at /logs/sybase_error.log goes like this:
00:00000:00000:2013/04/26 16:11:45.18 kernel Using config area from primary master device.
00:00000:00000:2013/04/26 16:11:45.19 kernel Detected 1 physical CPU
00:00000:00000:2013/04/26 16:11:45.19 kernel os_create_region: can't allocate 11534336000 bytes
00:00000:00000:2013/04/26 16:11:45.19 kernel kbcreate: couldn't create kernel region.
00:00000:00000:2013/04/26 16:11:45.19 kernel kistartup: could not create shared memory
I read "os_create_region" is normal if you don't set shmmax in sysctl high enough, so I set it to 16000000000000, but I still get this error. And sometimes, when I'm playing around with the .cfg file, I get this error message instead:
00:00000:00000:2013/04/25 14:04:08.28 kernel Using config area from primary master device.
00:00000:00000:2013/04/25 14:04:08.29 kernel Detected 1 physical CPU
00:00000:00000:2013/04/25 14:04:08.85 server The size of each partitioned pool must have atleast 512K. With the '16' partitions we cannot configure this value f
Why do these two errors appear and what can I do about them?
UPDATE:
Currently, I'm seeing the 1st error message (os cannot allocate bytes). The contents of /etc/sysctl.conf are as follows:
kernel.shmmax = 4294967295
kernel.shmall = 1048576
kernel.shmmni = 4096
But the log statements earlier state that
os_create_region: can't allocate 11534336000 bytes
So why is the region it is trying to allocate so big, and where did that get set?
The Solution:
When you get a message like "os_create_region: can't allocate 11534336000 bytes", it means that Sybase's configuration file is asking the kernel to create a shared-memory region that exceeds the shmmax limit in /etc/sysctl.conf.
The main thing to do is to edit ims.cfg (or whatever configuration file you are using) and adjust the max memory variable in the [Physical Memory] section.
[Physical Memory]
max memory = 64000
additional network memory = 10485760
shared memory starting address = DEFAULT
allocate max shared memory = 1
For your information, my /etc/sysctl.conf file ended with these three lines:
kernel.shmmax = 16000000000
kernel.shmall = 16000000000
kernel.shmmni = 8192
And once this is done, type "showserver" to reveal what processes are running.
For more information, consult the Sybase System Administrator's Guide, volume 2 as well as Michael Gardner's link to Red Hat memory management in the comments earlier.
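To load the /etc/sysctl.conf changes without rebooting and double-check the live values, something like this should work (a sketch; the expected numbers are the ones from the file above):
sudo sysctl -p /etc/sysctl.conf
cat /proc/sys/kernel/shmmax    # should print 16000000000
cat /proc/sys/kernel/shmall    # should print 16000000000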

Line Number Info in ltrace and strace tools

Is it possible to view the line number and file name (for my program running under ltrace/strace) along with the library call/system call information?
Eg:
code section :: ptr = malloc(sizeof(int)*5); (file:code.c, line:21)
ltrace or any other tool: malloc(20) :: code.c::21
I have tried all the options of ltrace/strace but cannot figure out a way to get this info.
If it is not possible with ltrace/strace, is there any comparable tool for GNU/Linux?
You may be able to use the -i option (to output the instruction pointer at the time of the call) in strace and ltrace, combined with addr2line to resolve the calls to lines of code.
No, it's not possible. Why don't you use gdb for this purpose?
When compiling your application with gcc, use the -ggdb flag to include debugging information in your program, then run it under gdb or an equivalent frontend (ddd or similar).
Here is a quick gdb tutorial to help you out a bit:
http://www.cs.cmu.edu/~gilpin/tutorial/
You can use strace-plus, which can collect stack traces associated with each system call.
http://code.google.com/p/strace-plus/
Pretty old question, but I found a way to accomplish what the OP wanted.
First, use strace with the -k option, which will generate a stack trace for each call, like this:
openat(AT_FDCWD, NULL, O_RDONLY) = -1 EFAULT (Bad address)
> /usr/lib/libc-2.33.so(__open64+0x5b) [0xefeab]
> /usr/lib/libc-2.33.so(_IO_file_open+0x26) [0x816f6]
> /usr/lib/libc-2.33.so(_IO_file_fopen+0x10a) [0x818ca]
> /usr/lib/libc-2.33.so(__fopen_internal+0x7d) [0x7527d]
> /mnt/r/build/tests/main(main+0x90) [0x1330]
> /usr/lib/libc-2.33.so(__libc_start_main+0xd5) [0x27b25]
> /mnt/r/build/tests/main(_start+0x2e) [0x114e]
The address of each function call is displayed at the end of each line, and you can pass it to addr2line to retrieve the file and line. For example, to locate the call in main() (the fifth line of the stack trace):
addr2line -e tests/main 0x1330
It will show something like this:
/mnt/r/main.c:55
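If you want to resolve every frame that falls inside your own binary in one go, you can pull the bracketed addresses out of the strace output and feed them all to addr2line (a sketch; trace.log, tests/main and the grep pattern are assumptions based on the example above):
strace -k -o trace.log ./tests/main
grep 'tests/main(' trace.log | grep -o '\[0x[0-9a-f]*\]' | tr -d '[]' | xargs addr2line -e tests/main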
