Get the values from the <xd:doc/> - version

I want to access the version information that I store in the documentation of the stylesheet:
<xsl:stylesheet>
<xd:doc scope="stylesheet">
<xd:ul>
<xd:li>
<xd:i>Updates</xd:i>: <xd:ul>
<xd:li>20.11.2018, version: <xd:i>1.1.0</xd:i></xd:li>
<xd:li>08.03.2019, version: <xd:i>2.0.0</xd:i></xd:li>
<xd:li>11.03.2019, version: <xd:i>2.0.1</xd:i></xd:li>
</xd:ul>
</xd:li>
</xd:ul>
</xd:doc>
</xsl:stylesheet>
Normally the XPath in @select is evaluated against the XML file that is currently being transformed. But how can an XPath refer to the (main) XSL stylesheet?
Another option would be to use fn:doc(), but I want to place the version-writing functionality in an external module, so the file names will be dynamic and I don't know how to get the name of the XSL file.
I use oXygen XML editor 20.1 where I define the transformation scenarios.

@MartinHonnen Thank you. I updated the code in the question. As you can see, I store the version in the documentation of the stylesheet.
The document('') / doc('') was what I needed. So I implemented the following:
I put the document node of the main stylesheet in a variable that is declared in that stylesheet:
<xsl:variable name="currentStylesheet" select="doc('')"/>
and then refer to it in the external module:
<xsl:variable as="xs:string" name="versionXSLT" select="$currentStylesheet//xd:li[xd:i = 'Updates']/xd:ul/xd:li[last()]/xd:i/string()"/>
I also use it to get the name of the stylesheet:
<xsl:variable as="xs:string" name="currentStylesheetName" select="tokenize(document-uri($currentStylesheet), '/')[last()]"/>
Are there maybe better solutions?
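For checking the selection logic outside a transformation, the same lookup can be sketched in Python. This is only an illustration under assumptions: the xd prefix is taken to be bound to the oXygen documentation namespace http://www.oxygenxml.com/ns/doc/xsl (the question does not show the binding), the sample stylesheet is shortened, and the names latest_version and file_name are made up for this sketch.

```python
import xml.etree.ElementTree as ET

# Assumption: the xd prefix is bound to this namespace; the question does not show it.
NS = {"xd": "http://www.oxygenxml.com/ns/doc/xsl"}

# Shortened copy of the stylesheet documentation from the question.
SAMPLE = """<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xd="http://www.oxygenxml.com/ns/doc/xsl">
  <xd:doc scope="stylesheet">
    <xd:ul>
      <xd:li>
        <xd:i>Updates</xd:i>: <xd:ul>
          <xd:li>20.11.2018, version: <xd:i>1.1.0</xd:i></xd:li>
          <xd:li>11.03.2019, version: <xd:i>2.0.1</xd:i></xd:li>
        </xd:ul>
      </xd:li>
    </xd:ul>
  </xd:doc>
</xsl:stylesheet>"""


def latest_version(stylesheet_xml):
    """Return the version in the last entry of the 'Updates' list, or None."""
    root = ET.fromstring(stylesheet_xml)
    for item in root.iterfind(".//xd:li", NS):
        label = item.find("xd:i", NS)
        if label is not None and label.text == "Updates":
            entries = item.findall("xd:ul/xd:li", NS)
            if entries:
                return entries[-1].find("xd:i", NS).text
    return None


def file_name(uri):
    """Last path segment of a URI, like tokenize($uri, '/')[last()]."""
    return uri.split("/")[-1]


print(latest_version(SAMPLE))
print(file_name("file:/C:/project/xsl/main.xsl"))
```

The loop mirrors the predicate xd:li[xd:i = 'Updates'] and the [last()] step of the XPath above; the example URI is invented.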

Related

How to prevent crawling external links with apache nutch?

I want to crawl only specific domains with Nutch. For this I set db.ignore.external.links to true, as described in this FAQ link.
The problem is that Nutch now crawls only the links in the seed list. For example, if I put "nutch.apache.org" in seed.txt, it finds only that same URL (nutch.apache.org).
I got this result by running the crawl script with a depth of 200. It finished after one cycle and generated the output below.
How can I solve this problem?
I'm using Apache Nutch 1.11.
Generator: starting at 2016-04-05 22:36:16
Generator: Selecting best-scoring urls due for fetch.
Generator: filtering: false
Generator: normalizing: true
Generator: topN: 50000
Generator: 0 records selected for fetching, exiting ...
Generate returned 1 (no new segments created)
Escaping loop: no more URLs to fetch now
Best Regards
You want to fetch only pages from a specific domain.
You already tried db.ignore.external.links, but that excludes everything except the seed.txt URLs.
You should try conf/regex-urlfilter.txt, as in the example from the Nutch 1 tutorial:
+^http://([a-z0-9]*\.)*your.specific.domain.org/
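To see what such a rule lets through, the filter logic can be approximated in a few lines of Python. This is a rough sketch, not Nutch code: it assumes the semantics described in the comments of regex-urlfilter.txt (rules are tried in order, the first match decides, and a URL that matches nothing is dropped), it uses Python's re module instead of Java's regex engine, and the names RULES and accepts are invented for the example. The dots in the host name are escaped here so that they match only literal dots.

```python
import re

# Hypothetical rule list in the style of conf/regex-urlfilter.txt:
# '+' keeps a matching URL, '-' drops it.
RULES = [
    ("+", r"^http://([a-z0-9]*\.)*your\.specific\.domain\.org/"),
]


def accepts(url, rules=RULES):
    """Apply the rules in order; the first matching rule decides."""
    for sign, pattern in rules:
        if re.search(pattern, url):
            return sign == "+"
    return False  # no rule matched: the URL is dropped


for url in (
    "http://your.specific.domain.org/index.html",
    "http://docs.your.specific.domain.org/faq",
    "http://other.example.org/",
):
    print(url, accepts(url))
```

Note that the pattern as written only covers http:// URLs; pages served over https would need the scheme part widened, for example to https?.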
Are you using the "crawl" script? If so, make sure you give a level greater than 1. If you run something like "bin/crawl seedfoldername crawlDb http://solrIP:solrPort/solr 1", it will crawl only the URLs listed in seed.txt.
To crawl a specific domain you can use the regex-urlfilter.txt file.
Add the following property to nutch-site.xml:
<property>
<name>db.ignore.external.links</name>
<value>true</value>
<description>If true, outlinks leading from a page to external hosts will be ignored. This is an effective way to limit the crawl to include only initially injected hosts, without creating complex URLFilters. </description>
</property>
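As a rough illustration of what the description means by "external hosts", the comparison can be sketched in Python. This is a sketch under assumptions rather than Nutch's implementation: it assumes the check compares the full host name of the page with the host name of each outlink, and is_external is a made-up helper name.

```python
from urllib.parse import urlparse


def is_external(page_url, outlink_url):
    """True if the outlink points to a different host than the page."""
    return urlparse(page_url).hostname != urlparse(outlink_url).hostname


page = "http://nutch.apache.org/"
print(is_external(page, "http://nutch.apache.org/downloads.html"))  # same host
print(is_external(page, "http://www.apache.org/"))                  # other host
```

Under this assumption a link from nutch.apache.org to www.apache.org counts as external, so a crawl that should cover a whole domain including its subdomains still calls for a URL filter.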

XQuery 3.0 and maps in Saxon

I would like to experiment with map features in Saxon (http://www.saxonica.com/documentation/expressions/xpath30maps.xml), but I am unable to get past query compilation. Maybe I am missing some parameter or I use a wrong namespace, but I just can't find the right answer. This is my query code:
xquery version "3.0";
(: i have also tried http://www.w3.org/2005/xpath-functions/map, no difference :)
import module namespace map = "http://ns.saxonica.com/map";
map:get(map { 1 := 'aaa'}, 1)
invoked from command line:
"c:\Program Files\Saxonica\SaxonEE9.4N\bin\Query.exe" -s:play.xml -q:play2.xq -qversion:3.0
The command ends with the error Cannot locate module for namespace "http://ns.saxonica.com/map".
When I leave out the module namespace map declaration, the error is Prefix map has not been declared, so I assume the declaration is needed.
Michael Kay has just posted a new blog entry with details on the Saxon Map implementation:
http://dev.saxonica.com/blog/mike/2012/01/#000188
You should use declare namespace instead of import module namespace for access to builtin functions. As far as I understand it, module import is for user-supplied modules only.
File map.xq:
declare namespace map="http://www.w3.org/2005/xpath-functions/map";
map:get(map { 1 := 'aaa'}, 1)
Works just fine:
> "C:\Program Files\Saxonica\SaxonEE9.4N\bin\Query.exe" -qversion:3.0 map.xq
<?xml version="1.0" encoding="UTF-8"?>aaa
I tried it with Saxon-EE 9.4.0.2J (the Java version) too, with the same effect.
Dunno if this helps, but the BaseX XQuery processor also offers an implementation of Michael Kay's map proposal (still to be finalized by the W3C): http://docs.basex.org/wiki/Map_Module

Map static field between nutch and solr

I use Nutch 1.4 and I would like to map a static field to Solr.
I know there is the index-static plugin. I configured it in nutch-site.xml like this:
<property>
<name>index-static</name>
<value>field:value</value>
</property>
However, the value is not sent to Solr.
Does anyone have a solution?
It looks like the entry in nutch-default.xml is wrong.
According to the plugin source "index.static" instead of "index-static" is the right name for the property.
String fieldsString = conf.get("index.static", null);
After using that in my nutch-site.xml I was able to send multiple fields to my Solr server.
Also make sure that the plugin is added to the list of included plugins in the "plugin.includes" property.

ibatis - where to place the <cacheModel> tag?

I have a map config file like this:
<sqlMap ..............>
<alias>
<typeAlias ......../>
</alias>
<statements>
....
<sql>....</sql>
<select cacheModel="cache-select-all">....</select>
<update>...</update>
<procedure>...</procedure>
.....
</statements>
<parameterMaps>
<parameterMap>....</parameterMap>
</parameterMaps>
<cacheModel id="cache-select-all" type="LRU" readOnly="true" serialize="false">
<flushInterval hours="24"/>
<flushOnExecute statement="InsertIOs"/>
<!--<property name="CacheSize" value="1000"/>-->
</cacheModel>
</sqlMap>
I am using iBATIS (.NET, if that matters) and I have one question: where should the <cacheModel> tags go? Is there a dedicated parent element for them? Placing it as I did and referencing it in the statements does not seem to work. What am I doing wrong?
You must reference the cacheModel you defined from inside a statement tag, as shown at the following link:
http://ibatis.apache.org/docs/dotnet/datamapper/ch03s08.html
Define it before you use it in the select statement; order matters here. Otherwise the SQL map parser will not be able to validate your SQL map.

Apple iWork Mime Types

I was wondering what the mime type for iWork's Pages is? And also what the mime type is for the rest of the software in the iWork suite? I looked around online and I didn't see it anywhere.
I recently needed this for work and ended up just uploading some files and querying the mimetypes. I found the following:
keynote: application/x-iwork-keynote-sffkey
pages: application/x-iwork-pages-sffpages
numbers: application/x-iwork-numbers-sffnumbers
2021 Update
Please note that this answer is now outdated and the following content types have been approved by IANA:
application/vnd.apple.pages
application/vnd.apple.keynote
application/vnd.apple.numbers
Looks like Apple doesn't much care: installing iWork does not add any MIME type information to any of the system's MIME type info repositories (in /etc/cups and /etc/apache2), "Get Info" on an iWork file shows no MIME type, etc. The only hint I've found is in Pages' Info.plist (a copy's online here), which mentions:
<key>public.filename-extension</key>
<array>
<string>pages</string>
</array>
<key>public.mime-type</key>
<array>
<string>application/x-iwork-pages-sffpages</string>
</array>
and a similar one for filename-extension "template", with "-sfftemplate" as the suffix instead of "-sffpages".
application/vnd.apple.keynote
application/vnd.apple.pages
application/vnd.apple.numbers
Just got these approved by IANA. You will find the list at the link below.
https://www.iana.org/assignments/media-types/media-types.xhtml
You can use mime-db (https://github.com/jshttp/mime-db) to validate using JavaScript.
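A comparable check can be sketched with Python's standard mimetypes module. The mapping below is an assumption for illustration: it pairs the three IANA types listed above with the usual iWork file extensions, and the file names are invented.

```python
import mimetypes

# Assumed extension-to-type mapping, built from the IANA types listed above.
IWORK_TYPES = {
    ".pages": "application/vnd.apple.pages",
    ".key": "application/vnd.apple.keynote",
    ".numbers": "application/vnd.apple.numbers",
}

# Register the types so that guess_type() knows them regardless of the
# platform's own MIME tables.
for extension, mime_type in IWORK_TYPES.items():
    mimetypes.add_type(mime_type, extension)

for name in ("report.pages", "slides.key", "budget.numbers"):
    print(name, mimetypes.guess_type(name)[0])
```

Registering the types explicitly keeps the result independent of whatever the local system happens to map these extensions to.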
This URL shows some other types in case new readers need them:
Apache Jira Issue TIKA-588
application/vnd.apple.keynote, application/vnd.apple.pages, application/vnd.apple.numbers
Actually, those files are all zip files in disguise, so some systems might report their MIME type simply as application/zip.
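Whether a given file really is a zip container can be checked with Python's zipfile module. The sketch below builds a small archive in memory as a stand-in for an iWork document (an assumption, since no real file is at hand); the entry name is arbitrary.

```python
import io
import zipfile

# Build a tiny zip archive in memory as a stand-in for an iWork document.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("placeholder.txt", "placeholder content")

data = buffer.getvalue()
plain = b"this is plain text, not a zip archive"

print(data[:4] == b"PK\x03\x04")              # local file header signature
print(zipfile.is_zipfile(io.BytesIO(data)))   # structural check on the archive
print(zipfile.is_zipfile(io.BytesIO(plain)))  # the same check on non-zip bytes
```

A detector that only sniffs the content in this way has no reason to report anything more specific than application/zip, which is why the file extension still matters for telling the iWork formats apart.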

Resources