Can I change the output path format of Flink Streaming File Sink? - apache-flink

I'm using PyFlink and the streaming API to write data to the file system. The output file paths look like this:
-2023-01-28--01
|-part-xxx-0.json
-2023-01-28--03
|-part-xxx-0.json
It seems the output file path format is {year}-{month}-{day}--{hour}/part-xxx-{commit}.json. How can I change it to something like {year}/{month}/{day}/{hour}/part-xxx-{commit}.json?

Write a custom class that extends DateTimeBucketAssigner and overrides the path-generation logic in the getBucketId method.
Here's an example that saves to a path prefixed with the POJO class name:
public class DateTimeWithClassPrefixBucketAssigner<IN> extends DateTimeBucketAssigner<IN> {
    ....
    @Override
    public String getBucketId(IN element, Context context) {
        // Create the formatter lazily, on first use.
        if (dateTimeFormatter == null) {
            dateTimeFormatter = DateTimeFormatter.ofPattern(formatString).withZone(zoneId);
        }
        // Prefix the date-based bucket with the element's class name.
        String prefix = element.getClass().getSimpleName();
        return prefix + "/" + dateTimeFormatter.format(Instant.ofEpochMilli(context.currentProcessingTime()));
    }
}
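For the layout asked about in the question, the key point is that slashes in the date pattern become directory separators in the bucket id. Below is a minimal sketch using only java.time; the class NestedBucketPath and its method are hypothetical helpers for illustration, not part of Flink. As far as I know, DateTimeBucketAssigner also accepts a format string in its constructor, so passing a pattern such as "yyyy/MM/dd/HH" directly may be enough, but check this against your Flink version.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper (not a Flink class): builds a nested bucket path from a timestamp. */
class NestedBucketPath {
    // Slashes in the pattern end up as directory separators in the bucket id.
    private static final String PATTERN = "yyyy/MM/dd/HH";

    static String bucketId(long epochMillis, ZoneId zone) {
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern(PATTERN).withZone(zone);
        return formatter.format(Instant.ofEpochMilli(epochMillis));
    }

    public static void main(String[] args) {
        long millis = Instant.parse("2023-01-28T01:00:00Z").toEpochMilli();
        System.out.println(bucketId(millis, ZoneOffset.UTC)); // prints 2023/01/28/01
    }
}
```

The same formatting call could be used inside a getBucketId override like the one above.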
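And convert the date format:
import java.text.SimpleDateFormat;
...
String input = "2022-01-31--10";
// Parse the default bucket name, then re-format it as a nested path.
// Literal text in a SimpleDateFormat pattern must be wrapped in single quotes.
String output = new SimpleDateFormat("yyyy/MM/dd/HH/'part-xxx.json'").format(
        new SimpleDateFormat("yyyy-MM-dd--HH").parse(input));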

Related

Unable to Correctly Serialize RangeSet<Instant> with Flink Serialization System

I've implemented a RichFunction with the following type:
RichMapFunction<GeofenceEvent, OutputRangeSet>
The class OutputRangeSet has a field of type:
com.google.common.collect.RangeSet<Instant>
When this POJO is serialized using Kryo, its fields come back null.
So far, I have tried using a TypeInfoFactory<RangeSet>:
public class InstantRangeSetTypeInfo extends TypeInfoFactory<RangeSet<Instant>> {
    @Override
    public TypeInformation<RangeSet<Instant>> createTypeInfo(Type t, Map<String, TypeInformation<?>> genericParameters) {
        TypeInformation<RangeSet<Instant>> info = TypeInformation.of(new TypeHint<RangeSet<Instant>>() {});
        return info;
    }
}
I use it to annotate my field:
public class OutputRangeSet implements Serializable {
    private String key;
    @TypeInfo(InstantRangeSetTypeInfo.class)
    private RangeSet<Instant> rangeSet;
}
Another solution (which doesn't work either) is registering a third-party serializer:
env.getConfig().registerTypeWithKryoSerializer(RangeSet.class, ProtobufSerializer.class);
The GitHub project is here:
https://github.com/elarbikonta/tonl-events
When you run the test in debug mode, you can see that the rangeSet bean I get from my RichFunction has null fields; see the test method com.tonl.apps.events.IsVehicleInZoneTest#operatorChronograph:
final RangeSet<Instant> rangeSet = resultList.get(0).getRangeSet(); // rangeSet.ranges is null!
Thanks for your help

Is JSONDeserializationSchema() deprecated in Flink?

I am new to Flink and doing something very similar to the below link.
Cannot see message while sinking kafka stream and cannot see print message in flink 1.2
I am also trying to add JSONDeserializationSchema() as a deserializer for my Kafka input JSON message which is without a key.
But I found JSONDeserializationSchema() is not present.
Please let me know if I am doing anything wrong.
JSONDeserializationSchema was removed in Flink 1.8, after having been deprecated earlier.
The recommended approach is to write a deserializer that implements DeserializationSchema<T>. Here's an example, which I've copied from the Flink Operations Playground:
import org.apache.flink.api.common.serialization.DeserializationSchema;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.ObjectMapper;
import java.io.IOException;

/**
 * A Kafka {@link DeserializationSchema} to deserialize {@link ClickEvent}s from JSON.
 */
public class ClickEventDeserializationSchema implements DeserializationSchema<ClickEvent> {
    private static final long serialVersionUID = 1L;
    private static final ObjectMapper objectMapper = new ObjectMapper();

    @Override
    public ClickEvent deserialize(byte[] message) throws IOException {
        return objectMapper.readValue(message, ClickEvent.class);
    }

    @Override
    public boolean isEndOfStream(ClickEvent nextElement) {
        return false;
    }

    @Override
    public TypeInformation<ClickEvent> getProducedType() {
        return TypeInformation.of(ClickEvent.class);
    }
}
For a Kafka producer you'll want to implement KafkaSerializationSchema<T>, and you'll find examples of that in that same project.
To solve the problem of reading JSON messages without a key from Kafka, I used a case class and a JSON parser.
The following code defines a case class and parses the JSON fields using the Play JSON API.
import java.util.Properties
import scala.util.Try
import play.api.libs.json.{JsValue, Json}
import org.apache.flink.api.common.serialization.SimpleStringSchema
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer

object CustomerModel {
  // Builds a Customer from the "id" and "name" fields of a parsed JSON value.
  def readElement(jsonElement: JsValue): Customer = {
    val id = (jsonElement \ "id").get.toString().toInt
    val name = (jsonElement \ "name").get.toString()
    Customer(id, name)
  }
  case class Customer(id: Int, name: String)
}
import CustomerModel.Customer

def main(args: Array[String]): Unit = {
  val env = StreamExecutionEnvironment.getExecutionEnvironment
  val properties = new Properties()
  properties.setProperty("bootstrap.servers", "xxx.xxx.0.114:9092")
  properties.setProperty("group.id", "test-grp")
  val consumer = new FlinkKafkaConsumer[String]("customer", new SimpleStringSchema(), properties)
  val stream1 = env.addSource(consumer).rebalance
  val stream2: DataStream[Customer] = stream1.map { str =>
    // Parse once; on failure, put the Failure's text into the name field.
    val parsed = Try(CustomerModel.readElement(Json.parse(str)))
    parsed.getOrElse(Customer(0, parsed.toString))
  }
  stream2.print("stream2")
  env.execute("This is Kafka+Flink")
}
Wrapping the parse in Try catches any exception thrown while parsing the data. On failure you can return the exception in one of the fields (as done here), or simply return the case class object with given or default field values.
Sample output of the code:
stream2:1> Customer(1,"Thanh")
stream2:1> Customer(5,"Huy")
stream2:3> Customer(0,Failure(com.fasterxml.jackson.databind.JsonMappingException: No content to map due to end-of-input
at [Source: ; line: 1, column: 0]))
I am not sure if it is the best approach but it is working for me as of now.
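For readers working in Java rather than Scala, the same "parse or fall back" idea can be sketched with a plain try/catch. This is only an illustration under stated assumptions: the Customer class, the toy "id,name" parser, and the method names are all hypothetical stand-ins, and a real job would plug in its JSON parser instead.

```java
import java.util.function.Function;

/** Sketch of "parse or fall back"; every name in this class is hypothetical. */
class ParseOrDefault {
    /** Minimal stand-in for the Customer case class. */
    static final class Customer {
        final int id;
        final String name;

        Customer(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    /** Toy parser for "id,name" input; it stands in for the real JSON parsing. */
    static Customer parseCsv(String raw) {
        String[] parts = raw.split(",", 2);
        return new Customer(Integer.parseInt(parts[0].trim()), parts[1].trim());
    }

    /** Runs the parser; on a runtime failure returns id 0 with the error text as the name. */
    static Customer parseOrDefault(String raw, Function<String, Customer> parser) {
        try {
            return parser.apply(raw);
        } catch (RuntimeException e) {
            return new Customer(0, e.toString());
        }
    }

    public static void main(String[] args) {
        Customer ok = parseOrDefault("1,Thanh", ParseOrDefault::parseCsv);
        Customer bad = parseOrDefault("", ParseOrDefault::parseCsv);
        System.out.println(ok.id + " " + ok.name);
        System.out.println(bad.id + " " + bad.name);
    }
}
```

Keeping the error text in a field makes bad records visible downstream instead of failing the whole job, at the cost of mixing error data into a business field.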

Parse an XML message using SpEL

In my Spring Integration pipeline I receive an XML payload and, depending on the values of attributes in the XML, I have to generate a key and publish it to Kafka.
return IntegrationFlows.from(Kafka.messageDrivenChannelAdapter(kafkaListenerContainer))
        .wireTap(ACARS_WIRE_TAP_CHNL) // Log the raw message
        .enrichHeaders(h -> h.headerFunction(KafkaHeaders.MESSAGE_KEY, m -> {
            StringBuilder header = new StringBuilder();
            Expression expression = new SpelExpressionParser().parseExpression("payload.Body.toString()");
            //Expression expression = new SpelExpressionParser().parseExpression("m.payload.Body.ACIFlight.fltNbr.toString()");
            String flightNbr = expression.getValue(String.class);
            header.append(flightNbr);
            return header.toString();
        }))
        .get();
The XML is:
<?xml version="1.0" encoding="UTF-8"?>
<ns0:Envelope xmlns:ns0="http://www.exmaple.com/FlightLeg">
<ns0:Header>
<ns1:eventHeader xmlns:ns1="http://www.exmaple.com/header" eventID="659" eventName="FlightLegEvent" version="1.0.0">
<ns1:eventSubType>FlightLeg</ns1:eventSubType>
</ns1:eventHeader>
</ns0:Header>
<ns0:Body>
<ns1:ACIFlight xmlns:ns1="http://ual.com/cep/aero/ACIFlight">
<flightKey>1267:07042020:UA</flightKey>
<fltNbr>1267</fltNbr>
<fltLastLegDepDt>07042020</fltLastLegDepDt>
<carrCd>UA</carrCd>
</ns1:ACIFlight>
</ns0:Body>
</ns0:Envelope>
I am trying to get the fltNbr from this XML payload using SpEL. Please suggest an approach.
Update:
String flight = XPathUtils.evaluate(message.getPayload(), "/*[local-name() = 'fltNbr']",XPathUtils.STRING);
String DepDate = XPathUtils.evaluate(message.getPayload(), "/*[local-name() = 'fltLastLegDepDt']",XPathUtils.STRING);
return MessageBuilder.fromMessage(message).setHeader("key", flight+DepDate).build();
You can use the XPath Header Enricher.
XPath is also available as a SpEL function, but you'd be better off using the enricher in this case.
public class XPathHeaderEnricher extends HeaderEnricher {
Here's a test case...
@Test
public void convertedEvaluation() {
    Map<String, XPathExpressionEvaluatingHeaderValueMessageProcessor> expressionMap =
            new HashMap<String, XPathExpressionEvaluatingHeaderValueMessageProcessor>();
    XPathExpressionEvaluatingHeaderValueMessageProcessor processor = new XPathExpressionEvaluatingHeaderValueMessageProcessor(
            "/root/elementOne");
    processor.setHeaderType(TimeZone.class);
    expressionMap.put("one", processor);
    String docAsString = "<root><elementOne>America/New_York</elementOne></root>";
    XPathHeaderEnricher enricher = new XPathHeaderEnricher(expressionMap);
    Message<?> result = enricher.transform(MessageBuilder.withPayload(docAsString).build());
    MessageHeaders headers = result.getHeaders();
    assertThat(headers.get("one")).as("Wrong value for element one expression")
            .isEqualTo(TimeZone.getTimeZone("America/New_York"));
}
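To try the namespace-agnostic lookup from the question's update outside Spring, here is a small sketch using only the JDK's javax.xml.xpath API. The class FlightKeyXPath and its method are hypothetical names for illustration. Note that an expression starting with `/*` only tests the document's root element, so this sketch uses `//*` to search the whole document.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

/** Hypothetical helper: reads element text by local name with the JDK XPath API. */
class FlightKeyXPath {
    /** Returns the text of the first element with the given local name, ignoring namespaces. */
    static String textOf(String xml, String localName) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // "//*" searches all elements; "/*" would only match the root element.
        String expression = "//*[local-name() = '" + localName + "']";
        try {
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Shortened version of the payload from the question.
        String xml = "<ns0:Envelope xmlns:ns0=\"http://www.exmaple.com/FlightLeg\"><ns0:Body>"
                + "<ns1:ACIFlight xmlns:ns1=\"http://ual.com/cep/aero/ACIFlight\">"
                + "<fltNbr>1267</fltNbr><fltLastLegDepDt>07042020</fltLastLegDepDt>"
                + "</ns1:ACIFlight></ns0:Body></ns0:Envelope>";
        // Concatenate flight number and departure date, as in the question's update.
        System.out.println(textOf(xml, "fltNbr") + textOf(xml, "fltLastLegDepDt"));
    }
}
```

Each call re-parses the XML, which is fine for a sketch but wasteful in a pipeline; parsing once and evaluating several expressions against the resulting document would be the natural refinement.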

javafx CheckBoxTreeItem<File> TreeView<File>. How to show only the name of file and not the full path?

Thanks in advance for support...
I'm coding in JavaFX using the TreeView and CheckBoxTreeItem classes. In the tree view I want each CheckBoxTreeItem<File> to show only the name of the file or directory the user chose, and likewise for everything beneath it. Selecting the path and the rest of the logic work fine, but when I load the items into the tree view each one shows the file's full path instead of its name. I want to show only the name.
To do this I use a class that extends CheckBoxTreeItem:
public class FilePathTreeItem_analisi extends CheckBoxTreeItem<File>
I add it to the TreeView in this way:
TreeView<File> treview_Base;
treview_Base.setCellFactory(CheckBoxTreeCell.<File>forTreeView());
FilePathTreeItem_analisi Ckaggiunto = new FilePathTreeItem_analisi(file.toPath());
... and other calls that load the files correctly
Why does it show the full path and not only the name?
This is the CheckBoxTreeItem subclass:
public class FilePathTreeItem_analisi extends CheckBoxTreeItem<File> {
    public FilePathTreeItem_analisi(Path file) {
        super(file.toFile());
        dilavoro = file;
        this.fullPath = file.toString();
        this.setIndependent(false);
        // test if this is a directory and set the icon
        if (Files.isDirectory(file)) {
            this.isDirectory = true;
            this.setGraphic(new ImageView(folderCollapseImage));
        } else {
            this.isDirectory = false;
            this.setGraphic(new ImageView(fileImage));
        }
        this.setValue(file.toFile());
        ... and some listeners and event handlers ...
So my question is: what do I have to use in the class that extends CheckBoxTreeItem to show the file name in the TreeView rather than the entire path?
Use
StringConverter<File> converter = new StringConverter<File>() {
    @Override
    public String toString(File file) {
        return file.getName();
    }

    @Override
    public File fromString(String string) {
        // not used by CheckBoxTreeCell:
        return null;
    }
};
treview_Base.setCellFactory(tv -> {
    CheckBoxTreeCell<File> cell = new CheckBoxTreeCell<>();
    cell.setConverter(converter);
    return cell;
});
instead of
treview_Base.setCellFactory(CheckBoxTreeCell.<File>forTreeView());

How to use property file as a object repository in Selenium WebDriver Automation?

How can I use property files as an object repository in Selenium WebDriver automation?
I am looking for instructions on the setup and the steps needed to achieve this.
Create a framework.properties file and store the variables like this (below are two locators with sample values):
locator1=username
locator2=password
Create a class for loading the properties file. You can use the snippet below.
Note: the path /src/main/resources/com/framework/properties/ is a sample and may differ in your framework.
public class PropertyManager {
    private static final Properties PROPERTY = new Properties();
    private static final String FRAMEWORKPROPERTIESPATH = "/src/main/resources/com/framework/properties/";
    private static final Logger LOGGER = Logg.createLogger();

    public static Properties loadPropertyFile(String propertyToLoad) {
        // try-with-resources closes the stream once the properties are loaded.
        try (FileInputStream in = new FileInputStream(System.getProperty("user.dir")
                + FRAMEWORKPROPERTIESPATH + propertyToLoad)) {
            PROPERTY.load(in);
        } catch (IOException io) {
            LOGGER.info(
                    "IOException in the loadPropertyFile() method of the PropertyManager class",
                    io);
            Runtime.getRuntime().halt(0);
        }
        return PROPERTY;
    }
}
When you want to access the variables from the properties file, use the snippet below:
private static final Properties LOCATORPROPERTIES = PropertyManager
        .loadPropertyFile("framework.properties");

public void click() {
    // Look up the element by the id stored under "locator1", then click it.
    driver.findElement(By.id(LOCATORPROPERTIES.getProperty("locator1"))).click();
}
Create a file and save it with the .properties extension.
For example, in Eclipse add a new file by right-clicking the project > New > File.
Add the data below to the config.properties file and save it:
Username = Jhon
Password = Qwerty123
Write the code below to access this file:
String filepath = "./config.properties"; // path of the .properties file
File f = new File(filepath);
FileInputStream fs = new FileInputStream(f);
Properties pro = new Properties();
pro.load(fs);
pro.getProperty("Username"); // returns the String "Jhon"
pro.getProperty("Password"); // returns the String "Qwerty123"
You can also use the values like this:
driver.findElement(By.id("user")).sendKeys(pro.getProperty("Username"));
driver.findElement(By.id("pass")).sendKeys(pro.getProperty("Password"));
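Both answers boil down to loading key/value pairs with java.util.Properties and looking locators up by key. A minimal sketch of that idea follows; the class LocatorRepository is a hypothetical name, and it reads from any Reader so it can be tried without a file on disk. Failing loudly on a missing key is a design choice made here, not something the answers above prescribe.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical wrapper around java.util.Properties used as a locator repository. */
class LocatorRepository {
    private final Properties locators = new Properties();

    /** Loads all key/value pairs from the reader and closes it afterwards. */
    LocatorRepository(Reader source) {
        try (Reader in = source) {
            locators.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the locator value for a key, failing loudly when the key is missing. */
    String get(String key) {
        String value = locators.getProperty(key);
        if (value == null) {
            throw new IllegalArgumentException("No locator defined for key: " + key);
        }
        return value;
    }

    public static void main(String[] args) {
        // In a real project the reader would wrap the .properties file instead.
        LocatorRepository repo =
                new LocatorRepository(new StringReader("locator1=username\nlocator2=password\n"));
        System.out.println(repo.get("locator1")); // prints username
    }
}
```

With Selenium, the returned value would be passed to a locator such as By.id(repo.get("locator1")).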
