PutSFTP is taking a wrong path on the SFTP server in NiFi

I have a flow to fetch a file from an SFTP server, rename it, and put it back on the server in the same location.
My flow:
ListSFTP -> FetchSFTP -> UpdateAttribute -> PutSFTP
My file location is on the D drive, and I have set that location in the Remote Path property of PutSFTP, but it resolves the path as
c:/users/myname/d:/file/location
and of course that gives me an error.
Is there any solution for this?
Thanks in advance.

You can use the SFTP processors only when you are connecting to a server with a host, port, and so on.
If you want to pick up files from your local disk (C:/ for example), you can use the GetFile processor instead.
An example flow could be:
GetSFTP with the property Keep Source File set to false
UpdateAttribute
new property -> filename -> new_file_test.example
PutSFTP
You can combine GetSFTP/GetFile with PutSFTP/PutFile.
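If the new name should be derived from the original one instead of being a fixed value, a NiFi Expression Language value along these lines could be set for the filename property in UpdateAttribute (a sketch only; the "_renamed" suffix is just an example):
${filename:substringBeforeLast('.')}_renamed.${filename:substringAfterLast('.')}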

Related

Does anybody know if OrcTableSource supports the S3 file system?

I'm running into some trouble using OrcTableSource to fetch an ORC file from cloud object storage (IBM COS); the code fragment is provided below:
OrcTableSource soORCTableSource = OrcTableSource.builder()
    // path to the ORC file
    .path("s3://orders/so.orc")
    // schema of the ORC file
    .forOrcSchema(OrderHeaderORCSchema)
    .withConfiguration(orcconfig)
    .build();
It seems this path is treated incorrectly; can anyone help out? Much appreciated!
Caused by: java.io.FileNotFoundException: File /so.orc does not exist
    at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:611)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:824)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:601)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:428)
    at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:142)
    at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:346)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:768)
    at org.apache.orc.impl.ReaderImpl.extractFileTail(ReaderImpl.java:528)
    at org.apache.orc.impl.ReaderImpl.<init>(ReaderImpl.java:370)
    at org.apache.orc.OrcFile.createReader(OrcFile.java:342)
    at org.apache.flink.orc.OrcRowInputFormat.open(OrcRowInputFormat.java:225)
    at org.apache.flink.orc.OrcRowInputFormat.open(OrcRowInputFormat.java:63)
    at org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:170)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
    at java.lang.Thread.run(Thread.java:748)
By the way, I've already set up flink-s3-fs-presto-1.6.2 and have the following code running correctly. The question is limited to OrcTableSource only.
DataSet<Tuple5<String, String, String, String, String>> orderinfoSet =
    env.readCsvFile("s3://orders/so.csv")
       .types(String.class, String.class, String.class,
              String.class, String.class);
The problem is that Flink's OrcRowInputFormat uses two different file systems: one for generating the input splits and one for reading the actual input splits. For the former it uses Flink's FileSystem abstraction, and for the latter it uses Hadoop's FileSystem. Therefore, you need to add the following snippet to Hadoop's configuration file core-site.xml:
<property>
  <name>fs.s3.impl</name>
  <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
</property>
See this link for more information about setting up S3 for Hadoop.
This is a limitation of Flink's OrcRowInputFormat and should be fixed. I've created the corresponding issue.
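As a quick sanity check (a sketch, not part of the original answer): a plain Hadoop Configuration loads core-site.xml from the classpath, so once the snippet above is in place, the following should print the S3A class name.

import org.apache.hadoop.conf.Configuration;

public class CheckS3Impl {
    public static void main(String[] args) {
        // A new Configuration picks up core-site.xml if it is on the classpath
        Configuration conf = new Configuration();
        // Expected output: org.apache.hadoop.fs.s3a.S3AFileSystem
        System.out.println(conf.get("fs.s3.impl"));
    }
}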

Solr extract text from image and imagePdf files

I am working with Solr 6.5.1 and I want to extract text from image files and image-based PDF files. For this I installed Tesseract OCR and configured it with Solr in two ways:
1. The environment variable TESSDATA_PREFIX is set to C:\Program Files (x86)\Tesseract-OCR, and I used the /update/extract request handler to index an image with its content.
2. I modified the TesseractOCRConfig.properties file in the tika-parsers-1.13 jar in the Solr lib directory to tesseractPath=C:/Program Files (x86)/Tesseract-OCR, and I used the /update/extract request handler to index an image/image PDF with its content.
Either way, I'm not getting any content; the response only gives attr_x_parsed_by=org.apache.tika.parser.ocr.TesseractOCRParser.
Is there any other configuration I need to set for Solr or Tesseract OCR to extract content from image/image PDF files?
Thanks in advance.
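For reference, the /update/extract request described above is typically issued like this with SolrJ (a sketch only; the core name, file name and literal.id are assumptions, not taken from the question):

import java.io.File;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.AbstractUpdateRequest;
import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;

public class ExtractImage {
    public static void main(String[] args) throws Exception {
        HttpSolrClient solr =
            new HttpSolrClient.Builder("http://localhost:8983/solr/mycore").build();
        ContentStreamUpdateRequest req = new ContentStreamUpdateRequest("/update/extract");
        req.addFile(new File("scan.png"), "image/png");   // the image to run through Tika/Tesseract
        req.setParam("literal.id", "img-1");              // document id
        req.setParam("uprefix", "attr_");                 // prefix for fields unknown to the schema
        req.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
        solr.request(req);
        solr.close();
    }
}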

Using a different doctrine.yaml file in dev and prod with Symfony4

In a Symfony 4 project, I want to use an SQLite database in local development and a MariaDB instance in production. In the structure, I see there is the main doctrine.yaml in config/packages.
Since the SQLite file location is configured with path: and MariaDB/MySQL finds the connection string via url:, can I move the doctrine.yaml file to /dev/ and add the dbal: section to /prod/doctrine.yaml?
Yes. Because of these lines in your src/Kernel.php, Symfony will first load your global packages and then your per-environment packages:
protected function configureContainer(ContainerBuilder $container, LoaderInterface $loader)
{
    $confDir = $this->getProjectDir().'/config';
    $loader->load($confDir.'/{packages}/*'.self::CONFIG_EXTS, 'glob');
    $loader->load($confDir.'/{packages}/'.$this->environment.'/**/*'.self::CONFIG_EXTS, 'glob');
    $loader->load($confDir.'/{services}'.self::CONFIG_EXTS, 'glob');
    $loader->load($confDir.'/{services}_'.$this->environment.self::CONFIG_EXTS, 'glob');
}
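A sketch of how the split could look (the dbal options below are assumptions, not taken from the question; adjust them to your setup):

# config/packages/doctrine.yaml - loaded in every environment (dev uses SQLite)
doctrine:
    dbal:
        driver: pdo_sqlite
        path: '%kernel.project_dir%/var/data.db'

# config/packages/prod/doctrine.yaml - loaded afterwards in prod, overrides dbal
doctrine:
    dbal:
        driver: pdo_mysql
        url: '%env(DATABASE_URL)%'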

Mule File Inbound - empty files are not triggered

I have a scenario where I need to read files from a particular folder, so I set up a File inbound endpoint as below. It reads all non-empty files, but empty files are not read and sit in the same location as is.
<file:inbound-endpoint path="${file.path}" responseTimeout="10000" doc:name="File"
                       moveToDirectory="${audit.location}">
    <file:filename-regex-filter pattern="file.employee(.*).xml,file.client(.*).xml"
                                caseSensitive="true"/>
</file:inbound-endpoint>
I removed the file filter, but it still doesn't read empty files.
Is there a way to enable file inbound to read empty files too?
According to the Mule File Connector documentation:
The File connector as inbound endpoint does not process empty (0 bytes) files.
So this behavior is expected. There is no documented way to process empty files with the File inbound endpoint.
However, you can still write your own connector to do this, or use a workaround such as filling your "empty" file with a single character (such as a space) to make it non-empty, as sketched below.
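A minimal sketch of that padding workaround (the inbound directory name is an assumption; run it before the File inbound endpoint polls):

import java.io.File;
import java.io.FileWriter;
import java.io.IOException;

public class PadEmptyFiles {
    public static void main(String[] args) throws IOException {
        File inbound = new File("/data/inbound");          // assumed ${file.path} directory
        File[] files = inbound.listFiles();
        if (files == null) {
            return;                                        // directory missing or not readable
        }
        for (File f : files) {
            if (f.isFile() && f.length() == 0) {
                try (FileWriter writer = new FileWriter(f)) {
                    writer.write(" ");                     // one character makes the file non-empty
                }
            }
        }
    }
}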
If you want to read a file with a size of 0 KB, then you can't achieve this with the File connector, but you can read such a file by using the Mule Requester module in the flow. I will share a sample snippet soon. Please let me know if you need any help.
Regards,
Sreenivas B
The Mule File connector does not process empty (0 byte) files as an inbound endpoint.
To my knowledge, the File inbound connector will not process 0 KB files.
In the File connector, the class org.mule.transport.file.FileMessageReceiver.java has the following in its poll method:
if (file.length() == 0)
{
    if (logger.isDebugEnabled())
    {
        logger.debug("Found empty file '" + file.getName() + "'. Skipping file.");
    }
    continue;
}
which prevents it from processing empty files.
But you can create your own CustomFileMessageReceiver.java. Create your package:
package com.mycompany.mule.transport.file;
and a class that extends AbstractPollingMessageReceiver:
public class CustomFileMessageReceiver extends AbstractPollingMessageReceiver
Copy the original FileMessageReceiver.java methods, but comment out the lines above and change FileMessageReceiver to CustomFileMessageReceiver where needed.
The call fileConnector.move(file, workFile) uses a method that is protected in the original package, so comment it out; be aware that this means you cannot use a work directory.
In the same package, create a copy of org.mule.transport.file.ReceiverFileInputStream.java.
Configure your connector:
<file:connector name="FILE" readFromDirectory="${incoming.directory}" autoDelete="true"
                streaming="false" recursive="true" validateConnections="true" doc:name="File"
                writeToDirectory="${processed.directory}">
    <service-overrides messageReceiver="com.mycompany.mule.transport.file.CustomFileMessageReceiver" />
</file:connector>
Or you may implement your own file connector, as stated in the above answers.

Preparing multi-database support with Play Framework

I want to prepare my application to be compatible with many database types. To try this out I've used H2, MySQL and PostgreSQL. So I've added to build.sbt:
"mysql" % "mysql-connector-java" % "5.1.35",
"org.postgresql" % "postgresql" % "9.4-1201-jdbc41"
and I've added conf/prod.conf with all of the configuration except the database configuration, plus 3 files:
conf/h2.conf
include "prod.conf"
db.h2.driver=org.h2.Driver
db.h2.url="jdbc:h2:mem:dontforget"
db.h2.jndiName=DefaultDS
ebean.h2="fr.chklang.dontforget.business.*"
conf/mysql.conf
include "prod.conf"
db.mysql.driver=com.mysql.jdbc.Driver
db.mysql.jndiName=DefaultDS
ebean.mysql="fr.chklang.dontforget.business.*"
conf/postgresql.conf
include "prod.conf"
db.postgresql.driver=org.postgresql.Driver
db.postgresql.jndiName=DefaultDS
ebean.postgresql="fr.chklang.dontforget.business.*"
In addition, I have three folders in conf/evolutions:
evolutions/h2
evolutions/mysql
evolutions/postgresql
With all of this, a user can start my application with this command:
-Dconfig.file=dontforget-conf.conf -DapplyEvolutions.default=true -Dhttp.port=10180 &
and the conf file is:
include "postgresql.conf"
db.postgresql.url="jdbc:postgresql:dontforget"
db.postgresql.user=myUserName
db.postgresql.password=myPassword
But with this configuration, when my application tries to connect to the DB, I get:
The default EbeanServer has not been defined? This is normally set via the ebean.datasource.default property. Otherwise it should be registered programatically via registerServer()]]
So I've tried to add, in my configuration:
ebean.datasource.default=postgresql
but when I add it I get:
Configuration error: Configuration error[Configuration error[]]
at play.api.Configuration$.play$api$Configuration$$configError(Configuration.scala:94)
at play.api.Configuration.reportError(Configuration.scala:743)
at play.Configuration.reportError(Configuration.java:310)
at play.db.ebean.EbeanPlugin.onStart(EbeanPlugin.java:56)
at play.api.Play$$anonfun$start$1$$anonfun$apply$mcV$sp$1.apply(Play.scala:91)
at play.api.Play$$anonfun$start$1$$anonfun$apply$mcV$sp$1.apply(Play.scala:91)
at scala.collection.immutable.List.foreach(List.scala:383)
at play.api.Play$$anonfun$start$1.apply$mcV$sp(Play.scala:91)
at play.api.Play$$anonfun$start$1.apply(Play.scala:91)
at play.api.Play$$anonfun$start$1.apply(Play.scala:91)
at play.utils.Threads$.withContextClassLoader(Threads.scala:21)
at play.api.Play$.start(Play.scala:90)
at play.core.StaticApplication.<init>(ApplicationProvider.scala:55)
at play.core.server.NettyServer$.createServer(NettyServer.scala:253)
at play.core.server.NettyServer$$anonfun$main$3.apply(NettyServer.scala:289)
at play.core.server.NettyServer$$anonfun$main$3.apply(NettyServer.scala:284)
at scala.Option.map(Option.scala:145)
at play.core.server.NettyServer$.main(NettyServer.scala:284)
at play.core.server.NettyServer.main(NettyServer.scala)
Caused by: Configuration error: Configuration error[]
at play.api.Configuration$.play$api$Configuration$$configError(Configuration.scala:94)
at play.api.Configuration.reportError(Configuration.scala:743)
at play.api.db.BoneCPApi.play$api$db$BoneCPApi$$error(DB.scala:271)
at play.api.db.BoneCPApi$$anonfun$getDataSource$3.apply(DB.scala:438)
at play.api.db.BoneCPApi$$anonfun$getDataSource$3.apply(DB.scala:438)
at scala.Option.getOrElse(Option.scala:120)
at play.api.db.BoneCPApi.getDataSource(DB.scala:438)
at play.api.db.DB$$anonfun$getDataSource$1.apply(DB.scala:142)
at play.api.db.DB$$anonfun$getDataSource$1.apply(DB.scala:142)
at scala.Option.map(Option.scala:145)
at play.api.db.DB$.getDataSource(DB.scala:142)
at play.api.db.DB.getDataSource(DB.scala)
at play.db.DB.getDataSource(DB.java:25)
at play.db.ebean.EbeanPlugin.onStart(EbeanPlugin.java:54)
So I don't understand how I can do it.
YES!!! I've found it! After some debugging (etc.):
There were 2 problems.
First problem: I had to add a key to my application.conf:
ebeanconfig.datasource
For me (for example), postgresql.conf is modified to:
db.postgresql.driver=org.postgresql.Driver
db.postgresql.jndiName=DefaultDS
ebean.postgresql="fr.chklang.dontforget.business.*"
ebeanconfig.datasource.default=postgresql
Second problem: include in Play 2.3.x doesn't work because the conf folder isn't added to the classpath (ref: Load file from '/conf' directory on Cloudbees), so we must concatenate prod.conf, postgresql.conf and dontforget.conf into one single file.
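For example, the single merged file could look like this (a sketch built only from the snippets above; the file name comes from the startup command, and the contents of prod.conf are not repeated here):

# dontforget-conf.conf (everything from prod.conf goes here as well)
db.postgresql.driver=org.postgresql.Driver
db.postgresql.url="jdbc:postgresql:dontforget"
db.postgresql.user=myUserName
db.postgresql.password=myPassword
db.postgresql.jndiName=DefaultDS
ebean.postgresql="fr.chklang.dontforget.business.*"
ebeanconfig.datasource.default=postgresql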
I hope I have helped other developers...
