Using a weka decision tree classifier without the whole weka library? - export

I have trained a classifier for my instances, and now want to export it to an Android application, where the Weka library will be unavailable.
Simply adding the Weka library to the Android application is not an option because of its size (6.5 MB).
Is there any other way to use my classifier to evaluate and label other unlabeled instances? Is there a smaller, independent library specifically designed for this?
Of course I could eventually write my own library to interpret Weka's output model, but it seems logical to me that such a solution already exists (although it escapes me somehow).

There are no independent libraries that would do what you want. You could remove all the parts of Weka you don't need and package that into a library.
In your particular case, the easiest thing to do might be to take the decision tree that Weka learns and put it directly into the code in a series of if...else statements. You could even write a script that takes the (graphical) output of the decision tree and writes that code for you.
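As a rough illustration of the if...else approach suggested above, a small tree can be written by hand as a plain static method. This is only a sketch: the class name HandCodedTree, the attributes (safety, persons), the thresholds and the label "acc" are hypothetical and not taken from any real Weka model.

```java
public class HandCodedTree {
    // Hypothetical two-attribute tree. In practice the branches would be
    // copied from the tree that Weka prints after training.
    static String classify(String safety, int persons) {
        if (safety.equals("low")) {
            return "unacc";
        } else if (persons <= 2) {
            return "unacc";
        } else {
            return "acc";
        }
    }

    public static void main(String[] args) {
        System.out.println(classify("high", 4)); // acc
        System.out.println(classify("low", 4));  // unacc
    }
}
```

A method like this has no dependency on Weka at all, which is the point of the suggestion.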

After paying more attention to the output model of Weka, I noticed that by using the option that generates the tree as a Java class, I can use it separately from the Weka library.
You can remove the generated WekaWrapper and keep only the inner WekaClassifier class, which is a basic implementation of the tree.
The generated source looks something like this:
public class WekaWrapper
extends Classifier {
/**
* Returns only the toString() method.
*
* @return a string describing the classifier
*/
public String globalInfo() {
return toString();
}
/**
* Returns the capabilities of this classifier.
*
* @return the capabilities
*/
public Capabilities getCapabilities() {
weka.core.Capabilities result = new weka.core.Capabilities(this);
result.enable(weka.core.Capabilities.Capability.NOMINAL_ATTRIBUTES);
result.enable(weka.core.Capabilities.Capability.NOMINAL_CLASS);
result.enable(weka.core.Capabilities.Capability.MISSING_CLASS_VALUES);
result.setMinimumNumberInstances(0);
return result;
}
/**
* only checks the data against its capabilities.
*
* @param i the training data
*/
public void buildClassifier(Instances i) throws Exception {
// can classifier handle the data?
getCapabilities().testWithFail(i);
}
/**
* Classifies the given instance.
*
* @param i the instance to classify
* @return the classification result
*/
public double classifyInstance(Instance i) throws Exception {
Object[] s = new Object[i.numAttributes()];
for (int j = 0; j < s.length; j++) {
if (!i.isMissing(j)) {
if (i.attribute(j).isNominal())
s[j] = new String(i.stringValue(j));
else if (i.attribute(j).isNumeric())
s[j] = new Double(i.value(j));
}
}
// set class value to missing
s[i.classIndex()] = null;
return WekaClassifier.classify(s);
}
/**
* Returns the revision string.
*
* @return the revision
*/
public String getRevision() {
return RevisionUtils.extract("1.0");
}
/**
* Returns only the classnames and what classifier it is based on.
*
* @return a short description
*/
public String toString() {
return "Auto-generated classifier wrapper, based on weka.classifiers.trees.Id3 (generated with Weka 3.6.9).\n" + this.getClass().getName() + "/WekaClassifier";
}
/**
* Runs the classifier from the command line.
*
* @param args the command-line arguments
*/
public static void main(String args[]) {
runClassifier(new WekaWrapper(), args);
}
}
class WekaClassifier {
private static void checkMissing(Object[] i, int index) {
if (i[index] == null)
throw new IllegalArgumentException("Null values are not allowed!");
}
public static double classify(Object[] i) {
return node0(i);
}
protected static double node0(Object[] i) {
return 0.0; // unacc
}
}
So yes, it is in fact really easy. Things to remember:
to classify an instance, call the classify(Object[]) method;
the return value is a double;
the meaning of each return value is explained in the comment right next to the return statement;
the parameters are not validated, so be careful about the order in which you pass them (validation was done by the Weka-dependent part);
the order is the one defined in the ARFF file.
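The points above can be sketched as follows. The nested WekaClassifier here is only a stand-in for the class Weka generates, and the attribute layout and the LABELS array are hypothetical. The sketch assumes that the returned double is the index of the class label in the order declared in the ARFF file, as the generated comments suggest.

```java
public class ClassifyDemo {
    // Class labels in ARFF declaration order (assumed for illustration).
    static final String[] LABELS = {"unacc", "acc"};

    // Stand-in for the generated class; the real one comes from Weka's output.
    static class WekaClassifier {
        static double classify(Object[] i) {
            if (i[0] == null) {
                throw new IllegalArgumentException("Null values are not allowed!");
            }
            // Hypothetical tree: attribute 0 is a nominal "safety" attribute.
            return i[0].equals("low") ? 0.0 : 1.0;
        }
    }

    // Maps the double returned by the tree back to a readable label.
    static String label(Object[] instance) {
        double index = WekaClassifier.classify(instance);
        return LABELS[(int) index];
    }

    public static void main(String[] args) {
        // Values in ARFF attribute order: nominal values as String, numeric
        // values as Double, and the class slot left null, mirroring what the
        // generated wrapper builds before calling classify.
        Object[] instance = {"high", 4.0, null};
        System.out.println(label(instance)); // acc
    }
}
```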

If you want to run RandomForests, you can use a little script I wrote that turns the output of WEKA's -printTrees option of the RandomForest classifier into Java source code.
http://pielot.org/2015/06/exporting-randomforest-models-to-java-source-code/
The code you need to include into your Android app will consist of three classes only: the class with the generated model + two classes to make the classification work.


Codename One API to append / merge files

To merge Storage files in Codename One I elaborated this solution:
/**
* Merges the given list of Storage files in the output Storage file.
* @param toBeMerged
* @param output
* @throws IOException
*/
public static synchronized void mergeStorageFiles(List<String> toBeMerged, String output) throws IOException {
if (toBeMerged.contains(output)) {
throw new IllegalArgumentException("The output file cannot be contained in the toBeMerged list of input files.");
}
// Note: the temporary file used for merging is placed in FileSystemStorage because it offers the method
// openOutputStream(String file, int offset), which allows appending to a stream. Storage has no such method.
long writtenBytes = 0;
String tempFile = FileSystemStorage.getInstance().getAppHomePath() + "/tempFileUsedInMerge";
for (String partialFile : toBeMerged) {
InputStream in = Storage.getInstance().createInputStream(partialFile);
OutputStream out = FileSystemStorage.getInstance().openOutputStream(tempFile, (int) writtenBytes);
Util.copy(in, out);
writtenBytes = FileSystemStorage.getInstance().getLength(tempFile);
}
Util.copy(FileSystemStorage.getInstance().openInputStream(tempFile), Storage.getInstance().createOutputStream(output));
FileSystemStorage.getInstance().delete(tempFile);
}
This solution is based on FileSystemStorage.openOutputStream(String file, int offset), the only API I found that allows appending the content of one file to another.
Are there other APIs that can be used to append or merge files?
Thank you
Since you end up copying everything to a Storage entry, I don't see the value of using FileSystemStorage as an intermediate merging tool.
The only reason I can think of is integrity of the output file (e.g. if a failure happens while writing), but that can happen here too. You can guarantee integrity by setting a flag, e.g. creating a file called "writeLock" and deleting it when the write has finished successfully.
To be clear, I would copy like this, which is simpler and faster:
try(OutputStream out = Storage.getInstance().createOutputStream(output)) {
for (String partialFile : toBeMerged) {
try(InputStream in = Storage.getInstance().createInputStream(partialFile)) {
Util.copyNoClose(in, out, 8192);
}
}
}
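The "writeLock" flag mentioned above could look roughly like this. It is a sketch using plain java.nio.file rather than Codename One's Storage API, so the names LockedMerge, merge and isComplete are hypothetical; the same pattern would need to be translated to Storage calls in a real Codename One app.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class LockedMerge {
    // Concatenates the parts into output. A marker file exists for the whole
    // duration of the write and is removed only after a successful finish.
    static void merge(List<Path> parts, Path output) throws IOException {
        Path lock = output.resolveSibling("writeLock");
        Files.write(lock, new byte[0]); // create (or reset) the marker
        try (OutputStream out = Files.newOutputStream(output)) {
            for (Path part : parts) {
                Files.copy(part, out);
            }
        }
        Files.delete(lock); // reached only if every copy succeeded
    }

    // The output can be trusted only if it exists and no marker is left behind.
    static boolean isComplete(Path output) {
        return Files.exists(output)
                && !Files.exists(output.resolveSibling("writeLock"));
    }
}
```

If the process dies mid-write, the marker stays on disk and isComplete returns false on the next start, so the merge can be repeated.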

What is happening in this Apex code?

String color1 = moreColors.get(0);
String color2 = moreColors[0];
System.assertEquals(color1, color2);
// Iterate over a list to read elements
for(Integer i=0;i<colors.size();i++) {
// Write value to the debug log
System.debug(colors[i]);
}
I am learning Apex and have just started. What is the meaning of the line System.assertEquals(color1, color2); and what is meant by the debug log here?
System.assert, System.assertEquals, System.assertNotEquals. I would argue these are three of the most important method calls in Apex.
These are assert statements. They are used in testing to validate that the data you have matches your expectations.
System.assert tests a logical statement. If the statement evaluates to true, the code keeps running. If the statement evaluates to false, the code throws an exception.
System.assertEquals tests that two values are equal. If the two are equal, the code keeps running. If they are not equal, the code throws an exception.
System.assertNotEquals tests that two values are not equal. If the two are not equal, the code keeps running. If they are equal, the code throws an exception.
These are critical for proper testing. In Apex, you must have at least 75% test coverage of your lines of code. Many people achieve this by writing test code that simply executes 75% of their lines. However, that is an incomplete test. A good test class actually verifies that the code does what you expect, which makes debugging and regression testing far easier. For example, let's create a method called square(Integer i) that squares the integer passed in.
public static Integer square( Integer i ) {
return i * i;
}
A poor test method would simply be:
@isTest
public static void test_squar() {
square( 1 );
}
A good test method could be:
@isTest
public static void test_square() {
Integer i;
Integer ret_square;
i = 3;
ret_square = square( i );
System.assertEquals( i * i, ret_square );
}
How I would probably write it is like this:
@isTest
public static void test_square() {
for( Integer i = 0; i < MAX_TEST_RUNS; i++ ) {
System.assertEquals( i*i, square( i ) );
}
}
Good testing practices are integral to being a good developer. Look up more on Test-Driven Development: https://en.wikipedia.org/wiki/Test-driven_development
Line by Line ...
//Get color in position 0 of moreColors list using the list get method store in string color1
String color1 = moreColors.get(0);
//Get color in position 0 of moreColors list using array notation store in string color2,
//basically getting the same value in a different way
String color2 = moreColors[0];
//Assert that the values are the same, throws exception if false
System.assertEquals(color1, color2);
// Iterate over a list to read elements
for(Integer i=0;i<colors.size();i++) {
// Write value to the debug log
System.debug(colors[i]);//Writes the value of color list ith position to the debug log
}
If you are running this code anonymously via the Developer Console, you can look for lines containing DEBUG| to find the statements, e.g.
16:09:32:001 USER_DEBUG 1|DEBUG| blue
More about system methods can be found at https://developer.salesforce.com/docs/atlas.en-us.apexcode.meta/apexcode/apex_methods_system_system.htm#apex_System_System_methods

How do we know the value of messages used in Giraph?

How do we know the value of message.get() in SimpleShortestPathsComputation?
What if we have Vertex<DoubleWritable, DoubleWritable, DoubleWritable> vertex instead of
Vertex<LongWritable, DoubleWritable, FloatWritable> vertex?
How do we know that the messages hold the value of minDist and not, e.g., the vertex ID or the edge value?
@Override public void compute(
Vertex<LongWritable, DoubleWritable, FloatWritable> vertex,
Iterable<DoubleWritable> messages) throws IOException {
if (getSuperstep() == 0) {
vertex.setValue(new DoubleWritable(Double.MAX_VALUE));
}
double minDist = isSource(vertex) ? 0d : Double.MAX_VALUE;
for (DoubleWritable message : messages) {
minDist = Math.min(minDist, message.get());
}
Thank you
A message holds whatever value you put into it via the sendMessage method. Just because two things have the same type doesn't mean they can get mixed up. That's not how serialisation / deserialisation works in Giraph.
If you don't trust it, you can also have a look at the code here: https://github.com/apache/giraph
Besides that, you mixed up one thing: the message doesn't contain the min distance. The min distance is saved inside the vertex's value. The message carries a candidate distance to the source vertex: the sender's current distance plus the value of the edge it is sent along. The message itself is a DoubleWritable, as the Iterable<DoubleWritable> parameter shows, while the edge data (what you called the edge value) is of type FloatWritable in the original case. See the type parameters of Vertex here:
....
/**
* Class which holds vertex id, data and edges.
*
* @param <I> Vertex id
* @param <V> Vertex data
* @param <E> Edge data
*/
public interface Vertex<I extends WritableComparable,
V extends Writable, E extends Writable> extends
ImmutableClassesGiraphConfigurable<I, V, E> {
....
}
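To make the relationship between vertex value, edge value and message concrete, here is a small plain-Java simulation of one compute step. It does not use the Giraph API: the Vertex class, the compute signature and the returned map of outgoing messages are simplified stand-ins, and the sending logic (current minimum distance plus edge value) is written from memory of the shortest-paths example, so treat it as an assumption and check it against the Giraph source.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MessageDemo {
    /** Simplified stand-in for a Giraph vertex: a value plus weighted out-edges. */
    static class Vertex {
        double value = Double.MAX_VALUE;               // vertex data (min distance so far)
        final Map<Long, Float> edges = new HashMap<>(); // target id -> edge data
    }

    // One compute step. Returns the messages that would be sent,
    // keyed by target vertex id.
    static Map<Long, Double> compute(Vertex vertex, boolean isSource, List<Double> messages) {
        double minDist = isSource ? 0d : Double.MAX_VALUE;
        for (double message : messages) {
            minDist = Math.min(minDist, message);
        }
        Map<Long, Double> outgoing = new HashMap<>();
        if (minDist < vertex.value) {
            vertex.value = minDist;
            for (Map.Entry<Long, Float> edge : vertex.edges.entrySet()) {
                // The message value is chosen by the sender: distance so far plus edge weight.
                outgoing.put(edge.getKey(), minDist + edge.getValue());
            }
        }
        return outgoing;
    }

    public static void main(String[] args) {
        Vertex source = new Vertex();
        source.edges.put(2L, 1.5f);
        System.out.println(compute(source, true, List.of()));
    }
}
```

The receiver never has to guess what a message means: it contains exactly what the sender put into it.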

Pointers, functions and arrays in D Programming Language

I'm writing a method to output to several output streams at once. The way I have it set up right now, I have a LogController, a LogFile and a LogConsole; the latter two are implementations of the Log interface.
What I'm trying to do right now is add a method to the LogController that attaches any implementation of the Log interface.
How I want to do this is as follows: in the LogController I have an associative array in which I store pointers to Log objects. When the writeOut method of the LogController is called, I want it to run over the elements of the array and call their writeOut methods too. The latter I can do, but the former is proving to be difficult.
Mage/Utility/LogController.d
module Mage.Utility.LogController;
import std.stdio;
interface Log {
public void writeOut(string s);
}
class LogController {
private Log*[string] m_Logs;
public this() {
}
public void attach(string name, ref Log l) {
foreach (string key; m_Logs.keys) {
if (name is key) return;
}
m_Logs[name] = &l;
}
public void writeOut(string s) {
foreach (Log* log; m_Logs) {
log.writeOut(s);
}
}
}
Mage/Utility/LogFile.d
module Mage.Utility.LogFile;
import std.stdio;
import std.datetime;
import Mage.Utility.LogController;
class LogFile : Log {
private File fp;
private string path;
public this(string path) {
this.fp = File(path, "a+");
this.path = path;
}
public void writeOut(string s) {
this.fp.writefln("[%s] %s", this.timestamp(), s);
}
private string timestamp() {
return Clock.currTime().toISOExtString();
}
}
I've already tried multiple things with the attach function, and none of them work. The build fails with the following error:
Mage\Root.d(0,0): Error: function Mage.Utility.LogController.LogController.attach (string name, ref Log l) is not callable using argument types (string, LogFile)
This is the incriminating function:
public void initialise(string logfile = DEFAULT_LOG_FILENAME) {
m_Log = new LogController();
LogFile lf = new LogFile(logfile);
m_Log.attach("Log File", lf);
}
Can anyone tell me where I'm going wrong here? I'm stumped and I haven't been able to find the answer anywhere. I've tried a multitude of different solutions and none of them work.
Classes and interfaces in D are reference types, so Log* is redundant - remove the *. Similarly, there is no need to use ref in ref Log l - that's like taking a pointer by reference in C++.
This is the cause of the error message you posted - variables passed by reference must match in type exactly. Removing the ref should solve the error.

Array throwing exception

I can't seem to put my finger on this and why the array is not being initialized.
Basically I am coding a 2d top down spaceship game and the ship is going to be fully customizable. The ship has several allocated slots for certain "Modules" (ie weapons, electronic systems) and these are stored in an array as follows:
protected Array<Weapon> weaponMount;
Upon creating the ship none of the module arrays are initialized, since some ships might have 1 weapon slot, while others have 4.
So when I code new ships, like this example:
public RookieShip(World world, Vector2 position) {
this.width = 35;
this.height = 15;
// Setup ships model
bodyDef.type = BodyType.DynamicBody;
bodyDef.position.set(position);
body = world.createBody(bodyDef);
chassis.setAsBox(width / GameScreen.WORLD_TO_BOX_WIDTH, height / GameScreen.WORLD_TO_BOX_HEIGHT);
fixtureDef.shape = chassis;
fixtureDef.friction = 0.225f;
fixtureDef.density = 0.85f;
fixture = body.createFixture(fixtureDef);
sprite = new Sprite(new Texture(Gdx.files.internal("img/TestShip.png")));
body.setUserData(sprite);
chassis.dispose();
// Ship module properties
setShipName("Rookie Ship");
setCpu(50);
setPower(25);
setFuel(500);
setWeaponMounts(2, world);
setDefenseSlots(1);
addModule(new BasicEngine(), this);
addModule(new BasicBlaster(), this);
// Add hp
setHullHP(50);
setArmorHP(125);
setShieldHP(125);
}
@Override
public void addModule(Module module, Ship currentShip) {
// TODO Auto-generated method stub
super.addModule(module, currentShip);
}
@Override
public void setWeaponMounts(int weaponMounts, World world) {
weaponMount = new Array<Weapon>(weaponMounts);
// super.setWeaponMounts(weaponMounts, world);
}
@Override
public String displayInfo() {
String info = "Everyones first ship, sturdy, reliable and only a little bit shit";
return info;
}
When I set the number of weapon mounts the following method is called:
public void setWeaponMounts(int weaponMounts, World world) {
weaponMount = new Array<Weapon>(weaponMounts);
}
This basically initializes the array with a size (weapon mounts available) equal to whatever the argument is. To me this seems fine, but I have set up a hotkey to output the size of the Array, and it reports zero. If I try to reference any object in the array, it throws an out-of-bounds exception.
The addModule method adds to the array as follows:
public void addModule(Module module, Ship currentShip) {
currentShip.cpu -= module.getCpuUsage();
currentShip.power -= module.getPowerUsage();
if(module instanceof Engine){
engine = (Engine) module;
}else if(module instanceof Weapon){
if(maxWeaponMounts == weaponMount.size){
System.out.println("No more room for weapons!");
}else{
maxWeaponMounts += 1;
weaponMount.add((Weapon)module);
}
}
}
My coding ain't great but heh, better than what I was 2 months ago....
Any ideas?
First of all, you should avoid instanceof. It's not a really big deal performance-wise, but it usually points to problems with your general architecture. Implement two different addModule methods: one that takes a Weapon, and one that takes an Engine.
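A minimal sketch of the two-overload idea, using simplified stand-in classes (the Module, Engine, Weapon and Ship types here are hypothetical and far smaller than the ones in the question):

```java
import java.util.ArrayList;
import java.util.List;

public class ShipDemo {
    interface Module {}
    static class Engine implements Module {}
    static class Weapon implements Module {}

    static class Ship {
        Engine engine;
        final List<Weapon> weapons = new ArrayList<>();

        // The compiler picks the overload from the static type of the
        // argument, so no instanceof check or cast is needed.
        void addModule(Engine newEngine) {
            engine = newEngine;
        }

        void addModule(Weapon weapon) {
            weapons.add(weapon);
        }
    }

    public static void main(String[] args) {
        Ship ship = new Ship();
        ship.addModule(new Engine());
        ship.addModule(new Weapon());
        System.out.println(ship.weapons.size()); // 1
    }
}
```

Note that overloads are resolved at compile time, so a variable declared as Module would match neither method; the caller has to hold the module under its concrete type.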
Now back to topic:
else if(module instanceof Weapon){
if (maxWeaponMounts == weaponMount.size) {
System.out.println("No more room for weapons!");
} else{
maxWeaponMounts += 1;
weaponMount.add((Weapon)module);
}
}
It looks like you use maxWeaponMounts as a counter instead of a limit, which is why I assume it is initially 0. The same holds for Array.size: it is not the capacity you passed to the constructor but the number of elements the Array currently holds. Thus (maxWeaponMounts == weaponMount.size) is always 0 == 0, and the weapon is never added to the array. The array stays empty, and trying to access any index ends in an IndexOutOfBoundsException.
What you should actually do is use maxWeaponMounts as a fixed limit and not as the counter.
public void setWeaponMounts(int weaponMounts, World world) {
weaponMount = new Array<Weapon>(weaponMounts);
maxWeaponMounts = weaponMounts;
}
else if(module instanceof Weapon){
if (weaponMount.size >= maxWeaponMounts) {
System.out.println("No more room for weapons!");
} else{
weaponMount.add((Weapon)module);
}
}
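The difference between capacity and size described above can be reproduced with java.util.ArrayList, which behaves the same way in this respect. This is a sketch with a hypothetical WeaponRack class and String weapons, not libGDX's Array or the Ship class from the question.

```java
import java.util.ArrayList;
import java.util.List;

public class CapacityDemo {
    static class WeaponRack {
        private final int maxMounts;        // fixed limit, never used as a counter
        private final List<String> mounted;

        WeaponRack(int maxMounts) {
            this.maxMounts = maxMounts;
            // The constructor argument is only an initial capacity;
            // the list still starts out with size 0.
            this.mounted = new ArrayList<>(maxMounts);
        }

        // Returns false when all mounts are taken.
        boolean add(String weapon) {
            if (mounted.size() >= maxMounts) {
                return false;
            }
            mounted.add(weapon);
            return true;
        }

        int size() {
            return mounted.size();
        }
    }

    public static void main(String[] args) {
        WeaponRack rack = new WeaponRack(2);
        System.out.println(rack.size());          // 0
        System.out.println(rack.add("blaster"));  // true
        System.out.println(rack.add("laser"));    // true
        System.out.println(rack.add("railgun"));  // false
    }
}
```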
