Hadoop YARN default scheduler

Does Hadoop YARN have a default scheduler?
What happens if yarn.resourcemanager.scheduler.class is not set in conf/yarn-site.xml?

yarn-default.xml sets the property yarn.resourcemanager.scheduler.class to org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.
Hence, if you do not specify the scheduler property in yarn-site.xml, CapacityScheduler is used by default.

For the benefit of future readers of this question:
Different distributions favor different schedulers by default, and the default can be overridden.
The following information about leading distributions was accurate at the time of writing:
Hortonworks (v2.x) - Capacity Scheduler
Cloudera (v5.x) - Fair Scheduler
MapR (v5.x) - Fair Scheduler
BigInsights (v2.x) - InfoSphere BigInsights Scheduler (average response time metrics)
Pivotal HD (v3.x) - Capacity Scheduler

// Default value, declared in YarnConfiguration
public static final String DEFAULT_RM_SCHEDULER =
    "org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler";

// The scheduler class is read with that default as the fallback
String schedulerClassName = conf.get(YarnConfiguration.RM_SCHEDULER,
    YarnConfiguration.DEFAULT_RM_SCHEDULER);
LOG.info("Using Scheduler: " + schedulerClassName);
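The fallback behaviour can be sketched in a few lines of Python. This is only an illustration of the lookup, not Hadoop code: the helper name effective_scheduler is hypothetical, and the sketch assumes yarn-site.xml uses the usual configuration/property/name/value layout. It also ignores details of the real Configuration class, such as duplicate or final properties.

```python
import xml.etree.ElementTree as ET

RM_SCHEDULER = "yarn.resourcemanager.scheduler.class"
DEFAULT_RM_SCHEDULER = (
    "org.apache.hadoop.yarn.server.resourcemanager"
    ".scheduler.capacity.CapacityScheduler"
)


def effective_scheduler(site_xml: str) -> str:
    """Return the scheduler class named in a yarn-site.xml string,
    or the built-in default when the property is missing or empty.
    (Hypothetical helper; simplified compared with Hadoop's Configuration.)"""
    root = ET.fromstring(site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == RM_SCHEDULER:
            value = prop.findtext("value", "").strip()
            if value:
                return value
    return DEFAULT_RM_SCHEDULER
```

Calling effective_scheduler("<configuration/>") returns the CapacityScheduler class name, which mirrors what happens when yarn-site.xml does not set the property.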

[flink] Task manager initialization failed

I am new to Flink. I am trying to run the Flink example on my local PC (Windows).
However, after I run start-cluster.bat and log in to the dashboard, it shows 0 task managers.
I checked the log, and it seems the task manager fails to initialize:
2020-02-21 23:03:14,202 ERROR org.apache.flink.runtime.taskexecutor.TaskManagerRunner - TaskManager initialization failed.
org.apache.flink.configuration.IllegalConfigurationException: Failed to create TaskExecutorResourceSpec
at org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils.resourceSpecFromConfig(TaskExecutorResourceUtils.java:72)
at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.startTaskManager(TaskManagerRunner.java:356)
at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.<init>(TaskManagerRunner.java:152)
at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.runTaskManager(TaskManagerRunner.java:308)
at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.lambda$runTaskManagerSecurely$2(TaskManagerRunner.java:322)
at org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.runTaskManagerSecurely(TaskManagerRunner.java:321)
at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.main(TaskManagerRunner.java:287)
Caused by: org.apache.flink.configuration.IllegalConfigurationException: The required configuration option Key: 'taskmanager.cpu.cores' , default: null (fallback keys: []) is not set
at org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils.checkConfigOptionIsSet(TaskExecutorResourceUtils.java:90)
at org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils.lambda$checkTaskExecutorResourceConfigSet$0(TaskExecutorResourceUtils.java:84)
at java.util.Arrays$ArrayList.forEach(Arrays.java:3880)
at org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils.checkTaskExecutorResourceConfigSet(TaskExecutorResourceUtils.java:84)
at org.apache.flink.runtime.taskexecutor.TaskExecutorResourceUtils.resourceSpecFromConfig(TaskExecutorResourceUtils.java:70)
... 7 more
2020-02-21 23:03:14,217 INFO org.apache.flink.runtime.blob.TransientBlobCache - Shutting down BLOB cache
Basically, it looks like the required option 'taskmanager.cpu.cores' is not set. However, I can't find this property in flink-conf.yaml or in the documentation (https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/config.html).
I am using Flink 1.10.0. Any help would be highly appreciated!
That configuration option is intended for internal use only. It shouldn't be configured by users, which is why it isn't documented.
The Windows start-cluster.bat is failing because of a bug introduced in Flink 1.10. See https://jira.apache.org/jira/browse/FLINK-15925.
One workaround is to use the bash script, start-cluster.sh, instead.
See also this mailing list thread: https://lists.apache.org/thread.html/r7693d0c06ac5ced9a34597c662bcf37b34ef8e799c32cc0edee373b2%40%3Cdev.flink.apache.org%3E

Where are the avgRequestsPerSecond and avgTimePerRequest metrics in Solr 7 and 8?

I am writing a Solr exporter in Go that produces the same format as the Java solr-exporter shipped with Apache Solr (which used too much RAM). I want to add more metrics, such as "avgTimePerRequest" and "avgRequestsPerSecond".
According to the Solr documentation, "avgTimePerRequest" and "avgRequestsPerSecond" can be queried via
"http://localhost:8983/solr/admin/metrics?group=core&prefix=UPDATE./update.requestTimes"
"http://localhost:8983/solr/admin/metrics?group=core&prefix=QUERY./select.requestTimes"
But I can't see avgTimePerRequest or avgRequestsPerSecond in the response. It only includes these:
"count":0,
"meanRate":0.0,
"1minRate":0.0,
"5minRate":0.0,
"15minRate":0.0,
"min_ms":0.0,
"max_ms":0.0,
"mean_ms":0.0,
"median_ms":0.0,
"stddev_ms":0.0,
"p75_ms":0.0,
"p95_ms":0.0,
"p99_ms":0.0,
"p999_ms":0.0
With Solr 6, I could find "avgTimePerRequest" and "avgRequestsPerSecond" in the MBeans. In Solr 7 and 8 I can't find them. Do they need to be enabled?
From the Solr v7.3 CHANGES.txt:
SOLR-8785: Metrics related classes in org.apache.solr.util.stats have been removed in favor of
the dropwizard metrics library. Any custom plugins using these classes should be changed to use
the equivalent classes from the metrics library.
As part of this, the following changes were made to the output of Overseer Status API:
* The "totalTime" metric has been removed because it is no longer supported
* The metrics "75thPctlRequestTime", "95thPctlRequestTime", "99thPctlRequestTime" and "999thPctlRequestTime" in Overseer Status API have been renamed to "75thPcRequestTime", "95thPcRequestTime"
and so on for consistency with stats output in other parts of Solr.
The metrics "avgRequestsPerMinute", "5minRateRequestsPerMinute" and "15minRateRequestsPerMinute" have been replaced by corresponding per-second rates viz. "avgRequestsPerSecond", "5minRateRequestsPerSecond" and "15minRateRequestsPerSecond" for consistency with stats output in other parts of Solr.
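For an exporter that still wants to publish the Solr 6 names, one option is to derive them from the requestTimes timer shown in the question. The sketch below is in Python for brevity and rests on an assumption that the changelog does not confirm: that meanRate (a per-second rate) plays the role of the old avgRequestsPerSecond, and mean_ms the role of the old avgTimePerRequest. The function name legacy_request_stats is hypothetical.

```python
def legacy_request_stats(timer: dict) -> dict:
    """Map a requestTimes timer (as returned by /solr/admin/metrics)
    to Solr 6 style names. ASSUMPTION: meanRate ~ avgRequestsPerSecond
    and mean_ms ~ avgTimePerRequest; verify against your Solr version."""
    return {
        "avgRequestsPerSecond": float(timer.get("meanRate", 0.0)),
        "avgTimePerRequest": float(timer.get("mean_ms", 0.0)),
    }


# Made-up sample values, only to show the shape of the input
sample = {"count": 120, "meanRate": 2.5, "mean_ms": 14.0}
stats = legacy_request_stats(sample)
```

Missing keys fall back to 0.0, so the function also accepts the all-zero timer from the question.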

Using JanusGraph with Solr

While setting up JanusGraph, I noticed the following in the console:
09:04:12,175 INFO ReflectiveConfigOptionLoader:173 - Loaded and initialized config classes: 10 OK out of 12 attempts in PT0.023S
09:04:12,230 INFO Reflections:224 - Reflections took 28 ms to scan 1 urls, producing 2 keys and 2 values
09:04:12,291 WARN GraphDatabaseConfiguration:1445 - Local setting index.search.index-name=entity (Type: GLOBAL_OFFLINE) is overridden by globally managed value (janusgraph). Use the ManagementSystem interface instead of the local configuration to control this setting.
09:04:12,294 WARN GraphDatabaseConfiguration:1445 - Local setting index.search.backend=solr (Type: GLOBAL_OFFLINE) is overridden by globally managed value (elasticsearch). Use the ManagementSystem interface instead of the local configuration to control this setting.
09:04:12,300 INFO CassandraThriftStoreManager:628 - Closed Thrift connection pooler.
and then I see the following:
Exception in thread "main" java.lang.IllegalArgumentException: Could not instantiate implementation: org.janusgraph.diskstorage.es.ElasticSearchIndex
How do I stop using Elasticsearch and switch to Solr?
My properties file is as follows:
index.search.backend=solr
index.search.directory=/path/to/directory/for/solr/index/something
index.search.index-name=something
index.search.solr.mode=http
index.search.solr.http-urls=http://127.0.0.1:8983/solr
storage.backend=cassandrathrift
storage.hostname=127.0.0.1
cache.db-cache = true
cache.db-cache-clean-wait = 20
cache.db-cache-time = 180000
cache.db-cache-size = 0.25
The answer to this is basically the same as this one for Titan. JanusGraph was forked from Titan.
You are probably trying to connect to an existing graph that was previously configured to use Elasticsearch. By default, the keyspace is named janusgraph.
1) You could connect to a different keyspace by updating conf/janusgraph-cassandra.properties
gremlin.graph=org.janusgraph.core.JanusGraphFactory
storage.backend=cassandrathrift
storage.hostname=127.0.0.1
storage.cassandra.keyspace=mygraph
2) You could drop the existing keyspace. If you used bin/janusgraph.sh start from the quick start directions (which starts a single-node Cassandra and a single-node Elasticsearch), run:
bin/janusgraph.sh clean
Or if you have a standalone Cassandra installation:
$CASSANDRA_HOME/bin/cqlsh -e 'drop keyspace if exists janusgraph'
Then you would be able to connect with the default conf/janusgraph-cassandra.properties.

Set default settings to 'no-cache' on Google Cloud Storage

Is there a way to set all public links to have 'no-cache' in Google Cloud Storage?
I've seen solutions to use gsutil to set the "Cache-Control" upon file-upload, but I'm looking for a more permanent solution.
There was a conversation about providing a cache invalidation feature but I didn't quite follow the reasoning. Any explanations would be greatly appreciated!
"it would be difficult to provide a cache invalidation feature because once served with a non-0 cache TTL any cache on the Internet (not just those under Google's control) is allowed (per HTTP spec) to cache the data"
Thanks!
For a more permanent one-time-effort solution, with the current offerings on GCP, you can do this with Cloud Functions.
Create a new Function and set the Event type to "On (finalizing/creating) file in the selected bucket" - google.storage.object.finalize. Make sure to select the bucket you want this on. In the body of the function, set the cacheControl / Cache-Control attribute for the blob. The attribute name depends on the language. Here's my version in Python, using cache_control:
main.py:
# match the function name below to the Entry point
from google.cloud import storage

def set_file_uncached(event, context):
    file = event  # auto-generated
    print(f"Processing file: {file=}")  # logging, if you want it
    storage_client = storage.Client()
    # we expect just one with that name
    blob = storage_client.bucket(file["bucket"]).get_blob(file["name"])
    if not blob:
        # in case the blob is deleted before this executes
        print("blob not found")
        return None
    blob.cache_control = "public, max-age=0"  # or whatever you need
    blob.patch()
requirements.txt
google-cloud-storage
From the logs: Function execution took 1712 ms, finished with status: 'ok'. This could have been faster but I've set the minimum to 0 instances so it needs to spin-up for each upload. Depending on your usage and cost constraints, you can set it to 1 or something higher.
Other settings:
Retry on failure: No/False
Region: [wherever your bucket is]
Memory allocated: 128 MB (smallest available currently)
Timeout: 5 seconds (smallest available currently, function shouldn't take longer)
Minimum instances: 0
Maximum instances: 1

Camel ActiveMQ client blocking, temp storage usage immediately hits 100%

I'm seeing 100% utilisation of ActiveMQ's temp storage (configured to be 100 MB), and the ActiveMQ client is blocking. The usage stays at 100% permanently, and I have no idea what's going on.
I have a camel route, which consumes from a queue (QUEUE.IN) using the JmsTransactionManager.
public final class RouteUnderTest extends RouteBuilder {
    @Override
    public void configure() throws Exception {
        from("activemq-transacted:QUEUE.IN")
            .bean(myBean)
            .to("activemq:QUEUE.OUT");
    }
}
While processing the message from this queue, I invoke a Spring Integration client (myBean), which is configured as follows:
<int:gateway id="myBean" service-interface="MyBean">
    <int:method name="request" request-channel="channel"/>
</int:gateway>
<int:chain input-channel="channel">
    <int:transformer ref="transformedToJsonHere"/>
    <jms:outbound-gateway request-destination-name="QUEUE.MYBEAN"
                          receive-timeout="5000"
                          explicit-qos-enabled="true"
                          time-to-live="5000"
                          delivery-persistent="false"/>
    <int:transformer ref="transformedToAnObjectHere"/>
</int:chain>
My broker is configured to use LevelDB, and with the following usage limits:
<persistenceAdapter>
    <levelDB directory="${activemq.data}/leveldb"/>
</persistenceAdapter>
<systemUsage>
    <systemUsage>
        <memoryUsage>
            <memoryUsage percentOfJvmHeap="70"/>
        </memoryUsage>
        <storeUsage>
            <storeUsage limit="500 mb"/>
        </storeUsage>
        <tempUsage>
            <tempUsage limit="100 mb"/>
        </tempUsage>
    </systemUsage>
</systemUsage>
When my route consumes the message and then attempts to put a non-persistent message on QUEUE.OUT, the client is blocked and my broker shows 100% usage of temp storage.
I also see the following ActiveMQ log entry:
2015-07-28 15:44:59,678 | INFO | Usage(default:temp:queue://QUEUE.MYBEAN:temp) percentUsage=0%, usage=104857600, limit=104857600, percentUsageMinDelta=1%;Parent:Usage(default:temp) percentUsage=100%, usage=104857600, limit=104857600, percentUsageMinDelta=1%: Temp Store is Full (0% of 104857600). Stopping producer (ID:orbit-vm-55561-1438094698190-1:1:3:1) to prevent flooding queue://QUEUE.MYBEAN. See http://activemq.apache.org/producer-flow-control.html for more info (blocking for: 1s) | org.apache.activemq.broker.region.Queue | ActiveMQ NIO Worker 6
Looking at the queues, the QUEUE.IN message has not been dequeued because it's still being processed transactionally, and no message has gone to QUEUE.MYBEAN.
I can fix this problem with any one of the following approaches:
Use KahaDB instead of LevelDB
Increase temp storage limit (150MB seems to do it but I haven't experimented a great deal)
Configure tempDataStore in activemq.xml (see below)
When configuring the tempDataStore it looks like:
<tempDataStore>
    <bean xmlns="http://www.springframework.org/schema/beans" class="org.apache.activemq.leveldb.LevelDBStore">
        <property name="directory" value="${activemq.data}/tmp" />
    </bean>
</tempDataStore>
I should add, we were using KahaDB previously and this worked fine, but the upgrade to LevelDB has exposed this issue. Reverting to KahaDB is not an option.
I'm hoping someone can explain what we're seeing here, as the results are really difficult to understand. Why does using LevelDB necessitate a higher temp usage limit, and why does configuring the tempDataStore explicitly also fix the problem?
I don't fully understand what's going on here so I'm worried that simply increasing the temp usage limit a little will just hide the problem until a later date.
Versions:
ActiveMQ: 5.11.1
Camel: 2.14.0
Spring: 4.0.8.RELEASE
Spring Integration: 4.0.5.RELEASE
We ran into exactly the same issue with ActiveMQ 5.13.2.
The solution when using LevelDB is to explicitly configure a dedicated tempDataStore, as you did.
Otherwise, the broker uses the same store (LevelDB) for both persistent (persistent usage) and non-persistent messages (temp usage). You may therefore end up in situations where the broker no longer accepts any non-persistent message just because the store already holds persistent ones up to the configured tempUsage limit. It will, however, still accept persistent ones if your storeUsage limit is set higher.
When using KahaDB, the broker automatically uses another store for the non-persistent messages (created in the tmp directory), so you don't have the problem.
Look at the following code for more in-depth information: https://github.com/apache/activemq/blob/activemq-5.13.2/activemq-broker/src/main/java/org/apache/activemq/broker/BrokerService.java#L1739
When reading that code, remember LevelDBStore implements PListStore, but KahaDBStore doesn't...
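The selection logic described in this answer can be modelled roughly as follows. This is a simplified Python sketch of the behaviour as explained here, not a translation of BrokerService: the classes are stand-ins named after their ActiveMQ counterparts, and FallbackTempStore and choose_temp_store are hypothetical names.

```python
class PListStore:
    """Stand-in for the interface of stores that can hold temp data."""


class KahaDBStore:
    """Stand-in: does NOT implement PListStore."""


class LevelDBStore(PListStore):
    """Stand-in: implements PListStore."""


class FallbackTempStore(PListStore):
    """Hypothetical separate store created in the tmp directory."""


def choose_temp_store(persistence_adapter, configured_temp_store=None):
    # 1. An explicitly configured tempDataStore always wins.
    if configured_temp_store is not None:
        return configured_temp_store
    # 2. If the persistence adapter can act as a PListStore, it is reused,
    #    so persistent and non-persistent data share one store.
    if isinstance(persistence_adapter, PListStore):
        return persistence_adapter
    # 3. Otherwise a separate temp store is created.
    return FallbackTempStore()
```

With LevelDB and no tempDataStore, step 2 applies and the store is shared, which is the situation where persistent data can count against tempUsage. With KahaDB, step 3 applies. An explicit tempDataStore takes step 1 in both cases.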
