Rebalancing rate when new node is added - database

When a new node is added, we see that it starts to receive tablets (visible on the http://<yb-master-ip>:7000/tablet-servers page) and the system rebalances. But the default rate seems low. Are there any knobs to control this rate?

Rebalancing in YugaByte DB is rate-limited.
One of the parameters that governs this behavior is the yb-tserver gflag remote_bootstrap_rate_limit_bytes_per_sec, which defaults to 256 MB/sec (268435456 bytes) and caps the maximum transmission rate (inbound + outbound) for rebalance-related traffic on any one server (yb-tserver).
To inspect the current setting on a yb-tserver, you can try this:
$ curl -s 10.150.0.20:9000/varz | grep remote_bootstrap_rate
--remote_bootstrap_rate_limit_bytes_per_sec=268435456
This particular parameter can also be changed on the fly, without a yb-tserver restart. For example, to set the rate to 512 MB/sec (536870912 bytes):
bin/yb-ts-cli --server_address=$TSERVER_IP:9100 set_flag --force remote_bootstrap_rate_limit_bytes_per_sec 536870912
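To raise the limit across the whole cluster, the same command can be run against each tablet server. A minimal sketch, assuming the tserver addresses are kept in a hypothetical TSERVERS variable (substitute your own IPs):
# Hypothetical list of tserver IPs; replace with the addresses of your yb-tservers.
TSERVERS="10.150.0.20 10.150.0.21 10.150.0.22"
for ts in $TSERVERS; do
  bin/yb-ts-cli --server_address=$ts:9100 set_flag --force remote_bootstrap_rate_limit_bytes_per_sec 536870912
done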
A second aspect is the cluster-wide setting for how many tablet rebalances may happen in the system simultaneously. This is governed by a few yb-master gflags, which can likewise be changed on the fly:
$ bin/yb-ts-cli --server_address=$MASTER_IP:7100 set_flag --force load_balancer_max_concurrent_adds 3
$ bin/yb-ts-cli --server_address=$MASTER_IP:7100 set_flag --force load_balancer_max_over_replicated_tablets 3
$ bin/yb-ts-cli --server_address=$MASTER_IP:7100 set_flag --force load_balancer_max_concurrent_tablet_remote_bootstraps 3
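As with the tserver flag, the current master-side values can be checked over the web UI. A minimal sketch, assuming the yb-master serves its /varz page on the default port 7000:
$ curl -s $MASTER_IP:7000/varz | grep load_balancer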

Related

How to ignore timeouts in ab (apache bench)?

I run benchmarks with apache bench (ab) against a web service. I know that 1-2 requests in the test will time out during measurement (it's a web framework issue). When a timeout occurs, ab quits with the message apr_pollset_poll: The timeout specified has expired (70007) and does not show results. I want to get measurement results that ignore these timed-out requests (or count them too, but use the timeout value as the response time). Is this possible with ab?
EDIT: The command I use is
ab -n 1000 -c 10 http://localhost:80/
I looked into the ab source and, from what I saw, it's impossible to ignore these errors. Maybe there is a fork which implements such a feature?
The default timeout is 30 seconds. You can change this with -s:
ab -s 9999 -n 1000 -c 10 http://localhost:80/
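If the run can still abort occasionally even with a long timeout, one workaround (not from the original answer) is simply to rerun ab until it exits cleanly. A minimal shell sketch:
# Rerun ab until it exits with status 0, i.e. without aborting on a timeout.
until ab -s 9999 -n 1000 -c 10 http://localhost:80/; do
  echo "ab aborted, retrying..."
done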

Increasing Solr5 time out from 30 seconds while starting solr

Many times while starting Solr I see the message below, and then Solr is not reachable.
debraj@boutique3:~/solr5$ sudo bin/solr start -p 8789
Waiting to see Solr listening on port 8789 [-] Still not seeing Solr listening on 8789 after 30 seconds!
I have two cores in my local set-up. I am guessing this is happening because one of the cores is a little big, so Solr is timing out while loading it. If I take that core out of Solr then everything works fine.
Can someone let me know how I can increase this timeout from the default 30 seconds?
I am using Solr 5.2.1 on Debian 7.
Usually this is related to startup problems, but if you are running Solr on a slow machine, 30 seconds may simply not be enough for it to start.
In that case you may try the following (I'm using Solr 5.5.0).
Windows (tested, working): in the bin/solr.cmd file, look for the parameter
-maxWaitSecs 30
a few lines below "REM now wait to see Solr come online ..." and replace 30 with a number that meets your needs (e.g. 300 seconds = 5 minutes).
Others (not tested): in the bin/solr file, search for the following code
if [ $loops -lt 6 ]; then
  sleep 5
  loops=$[$loops+1]
else
  echo -e "Still not seeing Solr listening on $SOLR_PORT after 30 seconds!"
  tail -30 "$SOLR_LOGS_DIR/solr.log"
  exit # subshell!
fi
Increase the number of waiting-loop cycles from 6 to whatever meets your needs (e.g. 60 cycles * 5 sleep seconds = 300 seconds = 5 minutes). You should change the number of seconds in the message as well, just to be consistent; see the sketch below.
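For example, a modified block that waits up to 5 minutes might look like this (a sketch based on the snippet above, not the exact upstream script):
if [ $loops -lt 60 ]; then    # was 6; 60 cycles * 5s sleep = 300 seconds
  sleep 5
  loops=$[$loops+1]
else
  echo -e "Still not seeing Solr listening on $SOLR_PORT after 300 seconds!"
  tail -30 "$SOLR_LOGS_DIR/solr.log"
  exit # subshell!
fi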

Invalid value "zookeeper" for flag -a: valid streams are STDIN, STDOUT and STDERR

I am trying to follow this blog post to set up SolrCloud with Docker:
https://lucidworks.com/blog/solrcloud-on-docker/
I was able to create the zookeeper image successfully. The docker images command lists the image too.
However, when I try to create and run the zookeeper container with the following command, it errors out:
docker run -name zookeeper -p 2181 -p 2888 -p 3888 myusername/zookeeper:3.4.6
Error:
Warning: '-n' is deprecated, it will be removed soon. See usage.
invalid value "zookeeper" for flag -a: valid streams are STDIN, STDOUT and STDERR
See 'docker run --help'.
flag provided but not defined: -name
See 'docker run --help'.
What am I missing here?
Please use --name instead. This Docker version no longer accepts the single-dash -name, which is being misparsed as a cluster of short options (hence the -n and -a errors).
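Assuming the same image and ports as in the question, the corrected command would be:
docker run --name zookeeper -p 2181 -p 2888 -p 3888 myusername/zookeeper:3.4.6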
Usage: docker run [OPTIONS] IMAGE [COMMAND] [ARG...]
Run a command in a new container
-a, --attach=[] Attach to STDIN, STDOUT or STDERR
...
--name="" Assign a name to the container
...

Is there a way to ping faster in Busy Box or Tiny Core Linux?

Solution at end of this post.
By default the interval is set to one second, and under the usual iputils version of ping there is a -i switch to reduce this. I need to ping faster, as I have 120 pings in a certain test that needs to be run many times.
I tried modifying ping.c from the BusyBox source, but I don't know much about compiling, and I get the error "could not be found libbb.h"; I couldn't find anyone else with a similar error on BusyBox.
Does anyone know of a way for me to ping faster than once per second? I am hoping to go down to 0.1 or 0.05 seconds if at all possible.
Thanks in advance
Solution
In case anyone comes looking for an answer, the solution I came up with was much better than recompiling. If you write a script that pings with the -c 1 flag and count the failures yourself, you can ping much faster.
Example:
fails=0
for i in `seq 1 20`
do
  # Number of packets received, parsed from ping's summary line.
  x=`ping -c 1 192.168.1.1 | grep received | cut -d' ' -f4`
  if [ "$x" -eq 0 ]
  then
    fails=$(($fails+1))
  fi
done
echo $fails fails
You are correct in that you have to modify the ping.c file. As you have determined, BusyBox ping does not support the -i switch.
What platform are you building this for? A PC, an embedded system?
Option 1:
Modify ping.c from BusyBox and recompile BusyBox. To do this, run 'make' in the root of the BusyBox project:
user@linux:~/busybox-1.19.2$ make
Option 2:
It might be easier and simpler to leave BusyBox alone and take ping.c from another archive such as iputils. This version supports the -i switch and goes as low as 0.2 seconds. To compile ping.c:
user@linux:~/iputils-s20101006$ make ping
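Once built, the iputils binary accepts sub-second intervals, for example (0.2 seconds is the minimum iputils allows for non-root users):
./ping -i 0.2 -c 120 192.168.1.1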

LSF -- Giving multiple options for acceptable resource types?

I'm running some jobs on a cluster that uses the LSF batch scheduling tool. A typical batch request looks like this:
bsub -q someQueue -R someResourceType /home/user/myProgram
In the cluster that I'm using, there are approximately ten different types of resources. (In other words, ten different types of nodes in the cluster.) There are three resource types that would be acceptable for the batch jobs that I'm running. Therefore, I'd like to make a request that says "use whichever resource type is available out of resourceType1, resourceType2, or resourceType3."
I'm guessing such a request would look something like this:
bsub -q someQueue -R {resourceType1 or resourceType2 or resourceType3} /home/user/myProgram
Is there a way to do this in LSF?
Here's the appropriate syntax:
bsub -q someQueue -R "resourceType1|resourceType2|resourceType3" /home/user/myProgram
Make sure you don't forget the quotes!
