How to use CSVImporter and create vertex supplier - jgrapht

I can't find any documentation on how to use the CSVImporter (1.5.0). I have a very simple csv file with integers that I'm trying to import using the following code:
Graph<String, DefaultEdge> wpCategories = new DirectedMultigraph(DefaultEdge.class);
CSVImporter<String, DefaultEdge> importer = new CSVImporter(CSVFormat.EDGE_LIST);
importer.importGraph(wpCategories, new File("hypernymGraphWithEntities_WP1-small.csv"));
I just get a "The graph contains no vertex supplier" exception. How do I create a vertex supplier?

A JGraphT graph consists of vertex and edge objects. When importing a graph from a text file, the importer must somehow create vertex objects for every vertex it encounters in the text file. These objects must be of the same type you defined in the graph. To generate these objects, JGraphT uses vertex suppliers.
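To make the idea of a vertex supplier concrete, here is a minimal sketch in plain Java with no JGraphT dependency. The class name VertexSupplierSketch and the helper countingStringSupplier are hypothetical; the helper only imitates what a string supplier is assumed to do, namely hand out a fresh, unique string on every call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class VertexSupplierSketch {

    // Hypothetical helper: yields start, start+1, ... as strings, one per call.
    static Supplier<String> countingStringSupplier(int start) {
        int[] next = {start}; // mutable counter captured by the lambda
        return () -> String.valueOf(next[0]++);
    }

    // Imitates code that needs one new vertex object per vertex it encounters.
    static List<String> createVertices(Supplier<String> vertexSupplier, int count) {
        List<String> vertices = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            vertices.add(vertexSupplier.get());
        }
        return vertices;
    }

    public static void main(String[] args) {
        // prints [1, 2, 3]
        System.out.println(createVertices(countingStringSupplier(1), 3));
    }
}
```

The point of the sketch is only that the caller never constructs vertices itself; it asks the supplier for the next one.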
Various examples of how to use the CSV importer can be found in the corresponding test class CSVImporterTest.
There are two different ways to create a graph with a vertex supplier. Either you use the GraphTypeBuilder, or you use one of the graph constructors. Here's an example for a directed graph.
//Builder
Graph<String,DefaultEdge> g1 = GraphTypeBuilder.directed().allowingMultipleEdges(false).allowingSelfLoops(false).weighted(false).edgeClass(DefaultEdge.class).vertexSupplier(SupplierUtil.createStringSupplier(1)).buildGraph();
//Constructor
Graph<String, DefaultEdge> g2 = new DefaultDirectedGraph<>(SupplierUtil.createStringSupplier(1), SupplierUtil.DEFAULT_EDGE_SUPPLIER, false);
So applied to your example this would give:
Graph<String, DefaultEdge> wpCategories = new DirectedMultigraph<>(SupplierUtil.createStringSupplier(1), SupplierUtil.DEFAULT_EDGE_SUPPLIER, false);
CSVImporter<String, DefaultEdge> importer = new CSVImporter<>(CSVFormat.EDGE_LIST);
importer.importGraph(wpCategories, new File("hypernymGraphWithEntities_WP1-small.csv"));
Note that, as an alternative to the vertex supplier, you could also use the setVertexFactory method of the CSVImporter class, which builds each vertex from the identifier read from the file. Again, using your code:
Graph<String, DefaultEdge> wpCategories = new DirectedMultigraph<>(DefaultEdge.class);
CSVImporter<String, DefaultEdge> importer = new CSVImporter<>(CSVFormat.EDGE_LIST);
Function<String, String> vertexFactory = x -> x;
importer.setVertexFactory(vertexFactory);
importer.importGraph(wpCategories, new File("hypernymGraphWithEntities_WP1-small.csv"));
Disclaimer: since I don't have your data, the code above is untested.

Related

Serializing objects Codename One

How do I serialize an object in order to make a customizable Parse initialization? Like:
//Simple text fields to get info
TextField url = new TextField();
TextField appid = new TextField();
TextField clientkey = new TextField();
Then I put everything in an object, e.g.:
Myclass object = new Myclass();
object.url = url.getText();
object.appid = appid.getText();
object.clientkey = clientkey.getText();
Then I pass it here, but first it needs to be serialized so that it keeps its values after my app is restarted.
//After serialization
Parse.initialize(object.url, object.appid, object.clientkey);
This way I can configure my Parse initialization from within my application instead.
I'd appreciate seeing an example of serialization for this case.
When you store an object in Parse it's saved locally, so you don't need to serialize it.
FYI Codename One supports the Externalizable interface to serialize objects in binary form. It also supports seamless externalization for object properties. The latter doesn't work with Parse AFAIK.
There's no support for serialization. You're on your own.

JPA map entity with array datatype

I have a table which contains a column of type: integer[]
I'm trying to map my entity to this table and I've tried the following suggestion of:
@ElementCollection
private ArrayList<Integer> col;
public MyEntity() {
col = new ArrayList<>();
}
However, I get the following error: Illegal attempt to map a non collection as a @OneToMany, @ManyToMany or @CollectionOfElements
Not sure how to get around this. I'm open to changing the entity's datatype, but I would prefer not to move this property into its own table/entity. Is there another solution? Thanks.
The field must be of type List<Integer>, not ArrayList<Integer>.
The JPA engine must be able to use its own List implementation, used for lazy-loading, dirty checking, etc.
It's a good idea in general to program on interfaces rather than implementations, and it's a requirement to do it in JPA entities.
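The reason can be illustrated without any JPA dependency. In this sketch the class name is hypothetical, and Collections.unmodifiableList merely stands in for whatever List implementation a persistence provider might substitute. A field declared as List accepts such an object, while a field declared as ArrayList could not.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class InterfaceFieldSketch {

    // Declared against the interface: any List implementation can be assigned.
    private List<Integer> col = new ArrayList<>();

    // Stand-in for a provider swapping in its own List implementation.
    void replaceCollection(List<Integer> providerList) {
        this.col = providerList;
    }

    List<Integer> getCol() {
        return col;
    }

    public static void main(String[] args) {
        InterfaceFieldSketch entity = new InterfaceFieldSketch();
        List<Integer> wrapper =
                Collections.unmodifiableList(new ArrayList<>(List.of(1, 2, 3)));
        entity.replaceCollection(wrapper);

        System.out.println(entity.getCol()); // prints [1, 2, 3]
        // The wrapper is a List but not an ArrayList, so it could never be
        // stored in a field declared as ArrayList<Integer>.
        System.out.println(entity.getCol() instanceof ArrayList); // prints false
    }
}
```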

Training own model in opennlp

I am finding it difficult to create my own model in OpenNLP.
Can anyone tell me how to train my own model?
How should the training be done?
What should the input be, and where will the output model file be stored?
https://opennlp.apache.org/docs/1.5.3/manual/opennlp.html
This manual is very useful: it shows how to train models of all the different types (entity extraction, part-of-speech tagging, etc.), both in code and with the OpenNLP command-line application.
I could give you some code examples here, but the page is very clear.
Theory-wise:
Essentially, you create a file which lists the examples you want to train on, e.g.:
Sport [whitespace] this is a page about football, rugby and stuff
Politics [whitespace] this is a page about tony blair being prime minister.
The format is described on the page above (each model expects a different format). Once you have created this file, you run it through either the API or the opennlp command-line application, and it generates a .bin file. Once you have this .bin file, you can load it into a model and start using it (as per the API described in the manual above).
First you need to train the data with the required Entity.
Sentences should be separated with a newline character (\n), and tokens and tags should be separated from each other with a space character.
Let's say you want to create a medicine entity model; the data should then look something like this:
<START:medicine> Augmentin-Duo <END> is a penicillin antibiotic that contains two medicines - <START:medicine> amoxicillin trihydrate <END> and <START:medicine> potassium clavulanate <END> .
They work together to kill certain types of bacteria and are used to treat certain types of bacterial infections .
You can refer to a sample dataset as an example. The training data should have at least 15000 sentences to get good results.
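Annotated lines like the ones above are often generated rather than typed by hand. The following is a minimal sketch in plain Java with no OpenNLP dependency. The class and method names are hypothetical, and it assumes you already have whitespace-tokenized sentences plus the token range of each entity.

```java
import java.util.Arrays;
import java.util.List;

public class NameSampleFormatSketch {

    // Wraps tokens [begin, end) in <START:type> ... <END> markers, space separated.
    static String annotate(List<String> tokens, int begin, int end, String type) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < tokens.size(); i++) {
            if (i == begin) {
                line.append("<START:").append(type).append("> ");
            }
            line.append(tokens.get(i)).append(' ');
            if (i == end - 1) {
                line.append("<END> ");
            }
        }
        return line.toString().trim();
    }

    public static void main(String[] args) {
        List<String> tokens =
                Arrays.asList("Augmentin-Duo is a penicillin antibiotic .".split(" "));
        // prints: <START:medicine> Augmentin-Duo <END> is a penicillin antibiotic .
        System.out.println(annotate(tokens, 0, 1, "medicine"));
    }
}
```

Each call produces one line of training data; writing one such line per sentence gives a file in the layout described above.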
Next, you can use OpenNLP's TokenNameFinderTrainer.
The output file will be in the .bin format.
Here is an example: Writing a custom NameFinder model in OpenNLP
For more details, refer to the OpenNLP documentation.
Perhaps this article will help you out. It describes how to do TokenNameFinder training from data extracted from Wikipedia...
nuxeo - blog - Mining Wikipedia with Hadoop and Pig for Natural Language Processing
Copy the training data into data.txt and run the code below to produce your own mymodel.bin.
For sample data, see https://github.com/mccraigmccraig/opennlp/blob/master/src/test/resources/opennlp/tools/namefind/AnnotatedSentencesWithTypes.txt
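import java.io.BufferedOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.util.Collections;

import opennlp.tools.namefind.NameFinderME;
import opennlp.tools.namefind.NameSample;
import opennlp.tools.namefind.NameSampleDataStream;
import opennlp.tools.namefind.TokenNameFinderModel;
import opennlp.tools.util.ObjectStream;
import opennlp.tools.util.PlainTextByLineStream;

// Written against the OpenNLP 1.5.x API.
public class Training {
    // where the trained model is written
    static String onlpModelPath = "mymodel.bin";
    // training data set
    static String trainingDataFilePath = "data.txt";

    public static void main(String[] args) throws IOException {
        Charset charset = Charset.forName("UTF-8");
        // read the annotated training data line by line
        ObjectStream<String> lineStream = new PlainTextByLineStream(
                new FileInputStream(trainingDataFilePath), charset);
        ObjectStream<NameSample> sampleStream = new NameSampleDataStream(lineStream);
        TokenNameFinderModel model = null;
        try {
            // model = NameFinderME.train("en", "drugs", sampleStream, Collections.<String, Object>emptyMap(), 100, 4);
            model = NameFinderME.train("en", "drugs", sampleStream, Collections.<String, Object>emptyMap());
        } finally {
            sampleStream.close();
        }
        // serialize the trained model to the .bin file
        BufferedOutputStream modelOut = null;
        try {
            modelOut = new BufferedOutputStream(new FileOutputStream(onlpModelPath));
            model.serialize(modelOut);
        } finally {
            if (modelOut != null)
                modelOut.close();
        }
    }
}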

Dapper Correct Object / Aggregate Mapping

I have recently started evaluating Dapper as a potential replacement for EF, since I was not too pleased with the SQL that was being generated and wanted more control over it. I have a question regarding mapping a complex object in my domain model. Let's say I have an object called Provider. Provider can contain several properties of type IEnumerable that should only be accessed by going through the parent provider object (i.e. the aggregate root). I have seen similar posts that explain using QueryMultiple and a Map extension method, but I was wondering: if I wanted to write a method that brings back the entire object graph eagerly loaded, would Dapper be able to do this in one fell swoop, or would it need to be done piecemeal? As an example, let's say that my object looked something like the following:
public class AggregateRoot
{
    public int Id { get; set; }
    ...//simple properties
    public IEnumerable<Foo> Foos { get; set; }
    public IEnumerable<Bar> Bars { get; set; }
    public IEnumerable<FooBar> FooBars { get; set; }
    public SomeOtherEntity Entity { get; set; }
    ...
}
Is there a straightforward way of populating the entire object graph using Dapper?
I have a similar situation. I made my SQL return a flat result set, so that all the sub-objects come back in one query. Then I use Query<> to map the full set. I'm not sure how big your sets are, though.
So something like this:
var cnn = sqlconnection();
var results = cnn.Query<AggregateRoot, Foo, Bar, FooBar, SomeOtherEntity, AggregateRoot>("sqlsomething",
    (ar, f, b, fb, soe) => {
        ar.Foo = f;
        ar.Bars = b;
        ar.FooBar = fb;
        ar.someotherentity = soe;
        return ar;
    },.....,splitOn:"").FirstOrDefault();
So the last type in the Query type list is the return object. For splitOn, you have to think of the returned row as a flat array that the mapping will run through. You pick the first column of each new object, so that the next mapping starts there.
example:
select ID, fooid, foo1, foo2, BarName, barsomething, foobarid, foobaritem1, foobaritem2 from blah
The splitOn would be "ID,fooid,BarName,foobarid". As it runs over the return set, it maps the properties that it can find in each object.
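The splitting idea can be sketched independently of Dapper. The following plain Java sketch is not Dapper's implementation, and the class and method names are hypothetical. It assumes foobarid and foobaritem1 are separate columns, and it only shows how a flat list of column names is cut into one segment per object, with a new segment starting at each split column.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SplitOnSketch {

    // Cuts the flat column list into segments; a new segment starts at each split column.
    static List<List<String>> split(List<String> columns, List<String> splitOn) {
        List<List<String>> segments = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String column : columns) {
            if (splitOn.contains(column) && !current.isEmpty()) {
                segments.add(current);
                current = new ArrayList<>();
            }
            current.add(column);
        }
        if (!current.isEmpty()) {
            segments.add(current);
        }
        return segments;
    }

    public static void main(String[] args) {
        List<String> columns = Arrays.asList("ID", "fooid", "foo1", "foo2",
                "BarName", "barsomething", "foobarid", "foobaritem1", "foobaritem2");
        List<String> splitOn = Arrays.asList("ID", "fooid", "BarName", "foobarid");
        // prints [[ID], [fooid, foo1, foo2], [BarName, barsomething], [foobarid, foobaritem1, foobaritem2]]
        System.out.println(split(columns, splitOn));
    }
}
```

Each inner list corresponds to the columns that would feed one mapped object.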
I hope that this helps, and that your return set is not too big to return flat.

JGrapht: Generate subgraphs with DirectedSubgraph.java class

I use JGraphT and want to generate subgraphs.
I think jgrapht-0.8.2/jgrapht-0.8.2/src/org/jgrapht/graph/DirectedSubgraph.java is useful for this purpose, but I could not find out how to use this class. Can you help me?
For example: jgrapht-0.8.2/jgrapht-0.8.2/src/org/jgrapht/demo/HelloJGraphT.java
A directed graph constructor is used like this in the HelloJGraphT.java class:
DirectedGraph<String, DefaultEdge> g =
new DefaultDirectedGraph<String, DefaultEdge>(DefaultEdge.class);
If you want to create a new subgraph, write:
DirectedSubgraph<String, DefaultEdge> yourSubgraph = new DirectedSubgraph<String, DefaultEdge>(arg0, arg1, arg2);
where arg0 is your main graph, arg1 is the set of vertices in your subgraph, and arg2 is the set of edges in your subgraph.
You can obtain the edge set using:
Set<DefaultEdge> yourEdges = yourSubgraph.edgeSet();
You can obtain the vertex set in the same way, with vertexSet().
Sorry for my English. I hope this helps.

Resources