Flink, where can I find the ExecutionEnvironment#readSequenceFile method?

I have HDFS data files that were originally created by a MapReduce job with the output settings below:
job.setOutputKeyClass(BytesWritable.class);
job.setOutputValueClass(BytesWritable.class);
job.setOutputFormatClass(SequenceFileAsBinaryOutputFormat.class);
SequenceFileAsBinaryOutputFormat.setOutputCompressionType(job, CompressionType.BLOCK);
Now I'm trying to read these files with the Flink DataSet API (version 1.5.6). I looked into the Flink docs but couldn't figure out how to do that.
The docs mention a 'readSequenceFile' API, but I just cannot find it in the ExecutionEnvironment class; I can find 'readCsvFile' and 'readTextFile', but not this one.
There's a general 'readFile(inputFormat, path)', but I have no clue what the inputFormat should be; it seems this API doesn't accept Hadoop input formats such as 'SequenceFileAsBinaryInputFormat'.
Could anyone please shed some light here? Many thanks.

I guess what you missed is an additional dependency: "org.apache.flink" %% "flink-hadoop-compatibility" % "1.7.2"
Once you've added it you can run:
val env = ExecutionEnvironment.getExecutionEnvironment
env.createInput(HadoopInputs.readSequenceFile[Long, String](classOf[Long], classOf[String], "/data/wherever"))
You can find more detailed documentation about the what and the how here: https://ci.apache.org/projects/flink/flink-docs-stable/dev/batch/hadoop_compatibility.html
Hope that helps.
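Since your files were written with BytesWritable keys and values, the Java equivalent would look roughly like this (a minimal sketch; the HDFS path is a placeholder, and it assumes the flink-hadoop-compatibility dependency above is on the classpath):
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.hadoopcompatibility.HadoopInputs;
import org.apache.hadoop.io.BytesWritable;

public class ReadSequenceFileJob {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        // readSequenceFile wraps Hadoop's SequenceFileInputFormat for the given key/value types
        DataSet<Tuple2<BytesWritable, BytesWritable>> input = env.createInput(
                HadoopInputs.readSequenceFile(BytesWritable.class, BytesWritable.class, "/path/to/sequence/files"));
        input.first(10).print();
    }
}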

Related

Where are the avgRequestsPerSecond and avgTimePerRequest metrics in Solr 7 and 8?

I am writing a Golang Solr exporter with the same output format as Apache Solr's Java solr-exporter (which ate a lot of RAM). I want to add more metrics like "avgTimePerRequest" and "avgRequestsPerSecond".
According to the Solr documentation, "avgTimePerRequest" and "avgRequestsPerSecond" can be queried via
"http://localhost:8983/solr/admin/metrics?group=core&prefix=UPDATE./update.requestTimes"
"http://localhost:8983/solr/admin/metrics?group=core&prefix=QUERY./select.requestTimes"
But I couldn't see avgTimePerRequest or avgRequestsPerSecond; the response only includes these:
"count":0,
"meanRate":0.0,
"1minRate":0.0,
"5minRate":0.0,
"15minRate":0.0,
"min_ms":0.0,
"max_ms":0.0,
"mean_ms":0.0,
"median_ms":0.0,
"stddev_ms":0.0,
"p75_ms":0.0,
"p95_ms":0.0,
"p99_ms":0.0,
"p999_ms":0.0
With Solr 6, I could find "avgTimePerRequest" and "avgRequestsPerSecond" in the MBeans, but in Solr 7 and 8 I can't find them. Do they need to be enabled?
From the Solr 7.3 CHANGES.txt:
SOLR-8785: Metrics related classes in org.apache.solr.util.stats have been removed in favor of
the dropwizard metrics library. Any custom plugins using these classes should be changed to use
the equivalent classes from the metrics library.
As part of this, the following changes were made to the output of Overseer Status API:
* The "totalTime" metric has been removed because it is no longer supported
* The metrics "75thPctlRequestTime", "95thPctlRequestTime", "99thPctlRequestTime" and "999thPctlRequestTime" in Overseer Status API have been renamed to "75thPcRequestTime", "95thPcRequestTime"
and so on for consistency with stats output in other parts of Solr.
The metrics "avgRequestsPerMinute", "5minRateRequestsPerMinute" and "15minRateRequestsPerMinute" have been replaced by corresponding per-second rates viz. "avgRequestsPerSecond", "5minRateRequestsPerSecond" and "15minRateRequestsPerSecond" for consistency with stats output in other parts of Solr.

Flink, odd behavior when using Hadoop Compatibility

I've added Flink Hadoop Compatibility to the project, which reads sequence files from an HDFS path:
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-hadoop-compatibility_2.11</artifactId>
<version>1.5.6</version>
</dependency>
Here's the Java code snippet:
DataSource<Tuple2<NullWritable, BytesWritable>> input = env.createInput(HadoopInputs.readHadoopFile(
new org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat<NullWritable, BytesWritable>(),
NullWritable.class, BytesWritable.class, path));
This works fine when I run it inside Eclipse, but when I submit it via the command line with 'flink run ...', it complains:
The type returned by the input format could not be automatically determined. Please specify the TypeInformation of the produced type explicitly by using the 'createInput(InputFormat, TypeInformation)' method instead.
OK, so I updated my code to add the type information:
DataSource<Tuple2<NullWritable, BytesWritable>> input = env.createInput(HadoopInputs.readHadoopFile(
new org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat<NullWritable, BytesWritable>(),
NullWritable.class, BytesWritable.class, path),
TypeInformation.of(new TypeHint<Tuple2<NullWritable, BytesWritable>>() {}));
Now it complains:
Caused by: java.lang.RuntimeException: Could not load the TypeInformation for the class 'org.apache.hadoop.io.Writable'. You may be missing the 'flink-hadoop-compatibility' dependency.
Some people suggest copying flink-hadoop-compatibility_2.11-1.5.6.jar to FLINK_HOME/lib, but it doesn't help; I still get the same error.
Does anyone have any clue?
My Flink is a standalone installation, version 1.5.6.
UPDATE:
Sorry, I had copied flink-hadoop-compatibility_2.11-1.5.6.jar to the wrong place; after fixing that, it works.
Now my question is: is there any other way? Copying that jar file to FLINK_HOME/lib is definitely not a good option to me, especially for a big Flink cluster.
Fixed in version 1.9.0; see https://issues.apache.org/jira/browse/FLINK-12163 for details.
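If I read that ticket correctly, from 1.9.0 on the compatibility classes are resolved through the user-code classloader, so shipping the dependency inside the job's fat jar should work instead of copying it into FLINK_HOME/lib. A minimal sketch with the Maven Shade plugin (the plugin version is illustrative):
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>3.2.1</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
</execution>
</executions>
</plugin>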

OWL API V5 read ontology from local file

In the current documentation examples at the link:
https://github.com/owlcs/owlapi/blob/version5/contract/src/test/java/org/semanticweb/owlapi/examples/Examples.java
There is no example of how to load an ontology from a local file. There is only a way to load it from a string.
In the past, when I used OWL API version 3, the following code worked perfectly:
OWLOntologyManager manager = OWLManager.createOWLOntologyManager();
File file = new File(path);
OWLOntology ont = manager.loadOntologyFromOntologyDocument(IRI.create(file));
However, in this version, the last line of the previous code:
manager.loadOntologyFromOntologyDocument(IRI.create(file));
returns this error:
Exception in thread "main" java.lang.NoSuchMethodError: org.semanticweb.owlapi.util.SAXParsers.initParserWithOWLAPIStandards(Lorg/xml/sax/ext/DeclHandler;)Ljavax/xml/parsers/SAXParser;
at org.semanticweb.owlapi.rdf.rdfxml.parser.RDFParser.parse(RDFParser.java:148)
at org.semanticweb.owlapi.rdf.rdfxml.parser.RDFXMLParser.parse(RDFXMLParser.java:62)
at uk.ac.manchester.cs.owl.owlapi.OWLOntologyFactoryImpl.loadOWLOntology(OWLOntologyFactoryImpl.java:173)
at uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.load(OWLOntologyManagerImpl.java:954)
at uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.loadOntology(OWLOntologyManagerImpl.java:918)
at uk.ac.manchester.cs.owl.owlapi.OWLOntologyManagerImpl.loadOntologyFromOntologyDocument(OWLOntologyManagerImpl.java:859)
at glass.main.ontology_Test_main2.readOntology(ontology_Test_main2.java:49)
at glass.main.ontology_Test_main2.main(ontology_Test_main2.java:38)
Kindly note the attachment, a small test Java project: dropbox.com/s/3787a3gsk2bwc26/test.tar.gz?dl=0
What am I doing wrong? I'm sure this code used to work.
Would you please provide the correct way to do it, and add it to the tutorial examples at https://github.com/owlcs/owlapi/blob/version5/contract/src/test/java/org/semanticweb/owlapi/examples/Examples.java
Thanks very much for your time.
Sincere regards
You are very close to the solution:
final OWLOntologyManager manager = OWLManager.createOWLOntologyManager();
final OWLOntology ontology = manager.loadOntologyFromOntologyDocument(new File("/home/galigator/myLocalDir/aura.owl"));
Just use new File instead of IRI.create
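For completeness, a self-contained version (a minimal sketch; the file path is a placeholder):
import java.io.File;
import org.semanticweb.owlapi.apibinding.OWLManager;
import org.semanticweb.owlapi.model.OWLOntology;
import org.semanticweb.owlapi.model.OWLOntologyCreationException;
import org.semanticweb.owlapi.model.OWLOntologyManager;

public class LoadLocalOntology {
    public static void main(String[] args) throws OWLOntologyCreationException {
        OWLOntologyManager manager = OWLManager.createOWLOntologyManager();
        // OWL API 5 accepts a File directly as the ontology document source
        OWLOntology ontology = manager.loadOntologyFromOntologyDocument(new File("/path/to/ontology.owl"));
        System.out.println("Loaded: " + ontology.getOntologyID());
    }
}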
The reason for the problem:
I was using HermiT version 1.3.8.500 with OWL API version 5.0.5, whose API got modified, it seems.
Solution: use the newer versions, HermiT 1.3.8.510 and OWL API 5.1.0.
I posted this answer in case someone else using the previous versions is affected.
Sincere regards.

Getting StringIndexOutOfBoundsException when attempting to create a new Form in Codename One

I am using NetBeans and updated to the latest Codename One plugin. I am trying to follow the walkthrough tutorial at http://www.codenameone.com/blog/gui-builder-walkthru.html, but I keep getting a StringIndexOutOfBoundsException when attempting to generate a new Form using the NewGuiBuilderWizardIterator. The following is the stack trace I'm seeing. Any and all help would be greatly appreciated!
SEVERE [com.codename1.actions.OpenGuiBuilderAction]: Relative path com\mycompany\myapp\MyApp.java
SEVERE [com.codename1.actions.OpenGuiBuilderAction]: Gui file C:\Users\joshua\Documents\NetBeansProjects\TestGui1\res\guibuilder\com\mycompany\myapp\MyApp.gui
SEVERE [com.codename1.actions.OpenGuiBuilderAction]: Props C:\Users\joshua\Documents\NetBeansProjects\TestGui1\codenameone_settings.properties
SEVERE [com.codename1.actions.OpenGuiBuilderAction]: The GUI file doesn't exist!
WARNING [org.openide.filesystems.Ordering]: Found same position 100 for both Loaders/application/res/Actions/org-openide-actions-OpenAction.shadow and Loaders/application/res/Actions/sep-1.instance
WARNING [org.netbeans.modules.java.JavaTemplateAttributesProvider]: No classpath was found for folder: C:\Users\joshua\Documents\NetBeansProjects\TestGui1#b78894d2:1aed2d64
WARNING [org.openide.WizardDescriptor]
java.lang.StringIndexOutOfBoundsException: String index out of range: -4
at java.lang.String.substring(String.java:1919)
at com.codename1.NewGuiBuilderWizardIterator.instantiate(NewGuiBuilderWizardIterator.java:95)
at org.openide.loaders.TemplateWizard$InstantiatingIteratorBridge.instantiate(TemplateWizard.java:1046)
at org.openide.loaders.TemplateWizard.handleInstantiate(TemplateWizard.java:605)
at org.openide.loaders.TemplateWizard.instantiateNewObjects(TemplateWizard.java:439)
at org.openide.loaders.TemplateWizardIterImpl.instantiate(TemplateWizardIterImpl.java:248)
at org.openide.loaders.TemplateWizardIteratorWrapper.instantiate(TemplateWizardIteratorWrapper.java:160)
at org.openide.WizardDescriptor.callInstantiateOpen(WizardDescriptor.java:1629)
at org.openide.WizardDescriptor.callInstantiate(WizardDescriptor.java:1570)
at org.openide.WizardDescriptor.access$2300(WizardDescriptor.java:92)
[catch] at org.openide.WizardDescriptor$Listener$2$1.run(WizardDescriptor.java:2257)
at org.openide.util.RequestProcessor$Task.run(RequestProcessor.java:1423)
at org.openide.util.RequestProcessor$Processor.run(RequestProcessor.java:2033)
You need to select a package when you do this, not the top-level project, since the code won't recognize that situation and won't know where to place the GUI file.
Notice that since an XML GUI file is created in the background, refactoring after the fact won't work well, so this isn't something we should generally fix.

Tomcat 7 - Retrieve the version of a webapp (versioned WAR)

I've been unable to find any easy way of figuring out the version string for a WAR file deployed with Tomcat 7's versioned naming (i.e. app##version.war). You can read about it here and what it enables here.
It'd be nice if there were a somewhat more supported approach than the usual Swiss Army knife of reflection-powered ribcage cracking:
final ServletContextEvent event ...
final ServletContext applicationContextFacade = event.getServletContext();
final Field applicationContextField = applicationContextFacade.getClass().getDeclaredField("context");
applicationContextField.setAccessible(true);
final Object applicationContext = applicationContextField.get(applicationContextFacade);
final Field standardContextField = applicationContext.getClass().getDeclaredField("context");
standardContextField.setAccessible(true);
final Object standardContext = standardContextField.get(applicationContext);
final Method webappVersion = standardContext.getClass().getMethod("getWebappVersion");
System.err.println("WAR version: " + webappVersion.invoke(standardContext));
I think the simplest solution is to use the same version (an SVN revision plus padding, for example) in the .war name, web.xml, and the META-INF/MANIFEST.MF properties, so you can retrieve the version later in your app or with any standard tool that reads the version from a JAR/WAR; a sketch of the read-back side follows.
See MANIFEST.MF version-number
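A minimal sketch of reading it back (assuming the version was stored as Implementation-Version in the WAR's manifest):
import java.io.IOException;
import java.io.InputStream;
import java.util.jar.Manifest;
import javax.servlet.ServletContext;

public final class WarVersion {
    // Reads Implementation-Version from the deployed WAR's manifest, or null if absent.
    public static String fromManifest(ServletContext servletContext) throws IOException {
        try (InputStream in = servletContext.getResourceAsStream("/META-INF/MANIFEST.MF")) {
            if (in == null) {
                return null; // no manifest packaged in the WAR
            }
            return new Manifest(in).getMainAttributes().getValue("Implementation-Version");
        }
    }
}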
Another solution described here uses the path name of the deployed WAR on the server: you extract the version number from the string between the "##" and the "/" (StringUtils below is Apache Commons Lang):
runningVersion = StringUtils.substringBefore(
        StringUtils.substringAfter(
                servletConfig.getServletContext().getRealPath("/"), "##"),
        "/");
Starting from Tomcat versions 9.0.32, 8.5.52 and 7.0.101, the webapp version is exposed as a ServletContext attribute with the name org.apache.catalina.webappVersion.
Link to the closed enhancement request: https://bz.apache.org/bugzilla/show_bug.cgi?id=64189
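On those versions, reading it is a one-liner (a minimal sketch inside e.g. a ServletContextListener; I'm assuming the attribute value is a String):
// attribute name as documented in the enhancement request above
String version = (String) servletContext.getAttribute("org.apache.catalina.webappVersion");
System.out.println("WAR version: " + version);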
The easiest way would be for Tomcat to make the version available via a ServletContext attribute (org.apache.catalina.core.StandardContext.webappVersion) or similar. The patch to do that would be trivial. I'd suggest opening an enhancement request in Tomcat's Bugzilla. If you include a patch then it should get applied fairly quickly.
