Move a graph trained on the GPU to be tested on the CPU

So I have this CNN which I train on the GPU. During training, I regularly save checkpoints.
Later on, I want a small script that reads the .meta file and the checkpoint and runs some tests on a CPU. I use the following code:
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
with sess.as_default():
    with tf.device('/cpu:0'):
        saver = tf.train.import_meta_graph('{}.meta'.format(model))
        saver.restore(sess, model)
I keep getting an error which tells me that the saver is trying to place the operation on the GPU.
How can I change that?

Move all the ops to the CPU using the (private) _set_device API: https://github.com/tensorflow/tensorflow/blob/r1.14/tensorflow/python/framework/ops.py#L2255
with tf.Session() as sess:
    g = tf.get_default_graph()
    ops = g.get_operations()
    for op in ops:
        op._set_device('/device:CPU:*')

Hacky work-around: open your graph definition file (ending in .pbtxt) and remove all lines starting with device:.
For a programmatic approach you can see how the TensorFlow exporter does this with clear_devices, although that uses the regular Saver, not the meta graph exporter; note that tf.train.import_meta_graph also accepts a clear_devices=True argument that strips device placements on import.


Start a second core with PSCI on QEMU

Good day,
I am currently writing my first boot-loader for the Linux operating system for the Cortex-A72 processor using QEMU.
Since I am doing some complex calculations (CRC32, SHA256, signature check...) the code takes quite some time to execute on one core. So I have decided to use the power of more than one core to speed up computation and get some parallelism.
From the research I have conducted, I found that I have to start up my second core using the PSCI protocol, and taking a look at the device tree revealed the configuration of my PSCI node. Here is the relevant information from the .dtb:
psci {
    migrate = <0xc4000005>;
    cpu_on = <0xc4000003>;
    cpu_off = <0x84000002>;
    cpu_suspend = <0xc4000001>;
    method = "hvc";
    compatible = "arm,psci-1.0", "arm,psci-0.2", "arm,psci";
};
cpu@0 {
    phandle = <0x00008004>;
    reg = <0x00000000>;
    enable-method = "psci";
    compatible = "arm,cortex-a72";
    device_type = "cpu";
};
cpu@1 {
    phandle = <0x00008003>;
    reg = <0x00000001>;
    enable-method = "psci";
    compatible = "arm,cortex-a72";
    device_type = "cpu";
};
So now it seems like there is all the needed information to start the second core.
However, I am not sure how to use the hvc method to do that, having only found examples that explain the smc method.
Furthermore, how should I implement concurrency and parallelism?
The basic usage I would wish to achieve is to have my CRC32 algorithm running on one core, and my SHA256 running on the other. Would the two cores be able to access the common UART? I guess that there is some way to know that a core has finished executing a program, so race conditions should not be hard to catch.
It would be of great help if anyone could provide any guidance to implement this feature.
Thanks in advance!
The official documentation of how to use PSCI calls is in the Arm Power State Coordination Interface Platform Design Document. But the short answer is that the only difference between the HVC method and the SMC method is that for one you use the "HVC" instruction and for the other the "SMC" instruction -- everything else is identical.
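To make the "everything else is identical" point concrete, the function IDs in the .dtb above follow the SMC Calling Convention encoding, and the call itself only swaps the instruction. The following is a minimal sketch, not a definitive implementation: psci_fn_id is a hypothetical helper illustrating the bit layout, and entry_point_pa / context_id in the commented call are placeholder names for values your boot-loader must supply.

```c
#include <assert.h>
#include <stdint.h>

/* Compose a PSCI function ID per the SMC Calling Convention:
 * bit 31  = fast call
 * bit 30  = 64-bit (SMC64/HVC64) calling convention
 * bits 29:24 = service owner (0x04 = Standard Secure Service, which owns PSCI)
 * bits 15:0  = function number within the service. */
static uint32_t psci_fn_id(int is64, uint32_t fn)
{
    return 0x80000000u | (is64 ? 0x40000000u : 0u) | (0x04u << 24) | fn;
}

/* On the core itself, the HVC method differs from the SMC method only in the
 * instruction used. Illustrative GCC inline asm for AArch64 (assumptions:
 * entry_point_pa is the physical address of the secondary entry point,
 * context_id is an arbitrary value handed to the new core in x0):
 *
 *   register uint64_t x0 asm("x0") = 0xC4000003;      // CPU_ON (matches cpu_on above)
 *   register uint64_t x1 asm("x1") = 1;               // target MPIDR: reg of cpu@1
 *   register uint64_t x2 asm("x2") = entry_point_pa;  // secondary entry point
 *   register uint64_t x3 asm("x3") = context_id;
 *   asm volatile("hvc #0" : "+r"(x0) : "r"(x1), "r"(x2), "r"(x3));
 *   // x0 now holds the PSCI return code (0 == PSCI_SUCCESS)
 */
```

The encoding reproduces every ID in the device tree: CPU_ON and CPU_SUSPEND use the 64-bit convention (0xC4xxxxxx), while CPU_OFF takes no pointer arguments and is 32-bit only (0x84000002).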

get_multiple_points volttron RPC call

Any chance I could get a tip on the proper way to build an agent that reads multiple points from multiple devices on a BACnet system? I am viewing the actuator agent code, trying to learn how to make the proper RPC call.
I am going through the agent development procedure with the agent creation wizard.
In __init__ I have this hard-coded at the moment:
def __init__(self, **kwargs):
    super(Setteroccvav, self).__init__(**kwargs)
    _log.debug("vip_identity: " + self.core.identity)
    self.default_config = {}
    self.agent_id = "dr_event_setpoint_adj_agent"
    self.topic = "slipstream_internal/slipstream_hq/"
    self.jci_zonetemp_string = "/ZN-T"
So the BACnet system in the building has JCI VAV boxes, all with the same zone temperature sensor point self.jci_zonetemp_string, and self.topic is how I pulled them into the volttron config store through the BACnet discovery process.
In my actuate point function (copied from the CSV driver example), am I at all close on how to make the RPC call named reads using get_multiple_points? I am hoping to scrape the zone temperature sensor readings on BACnet device IDs 6, 7, 8, 9, and 10, which are all the same VAV box controller with the same points/BAS program running.
def actuate_point(self):
    """
    Request that the Actuator set a point on the CSV device
    """
    # Create a start and end timestep to serve as the times we reserve to communicate with the CSV Device
    _now = get_aware_utc_now()
    str_now = format_timestamp(_now)
    _end = _now + td(seconds=10)
    str_end = format_timestamp(_end)
    # Wrap the timestamps and device topic (used by the Actuator to identify the device) into an actuator request
    schedule_request = [[self.ahu_topic, str_now, str_end]]
    # Use a remote procedure call to ask the actuator to schedule us some time on the device
    result = self.vip.rpc.call(
        'platform.actuator', 'request_new_schedule', self.agent_id, 'my_test', 'HIGH',
        schedule_request).get(timeout=4)
    _log.info(f'*** [INFO] *** - SCHEDULED TIME ON ACTUATOR From "actuate_point" method success')
    reads = publish_agent.vip.rpc.call(
        'platform.actuator',
        'get_multiple_points',
        self.agent_id,
        [(('self.topic'+'6', self.jci_zonetemp_string)),
         (('self.topic'+'7', self.jci_zonetemp_string)),
         (('self.topic'+'8', self.jci_zonetemp_string)),
         (('self.topic'+'9', self.jci_zonetemp_string)),
         (('self.topic'+'10', self.jci_zonetemp_string))]).get(timeout=10)
Any tips before I break something on the live system greatly appreciated :)
The basic form of an RPC call to the actuator is as follows:
# use the agent's VIP connection to make an RPC call to the actuator agent
result = self.vip.rpc.call('platform.actuator', <RPC exported function>, <args>).get(timeout=<seconds>)
Because we're working with devices, we need to know which devices we're interested in, and what their topics are. We also need to know which points on the devices that we're interested in.
device_map = {
    'device1': '201201',
    'device2': '201202',
    'device3': '201203',
    'device4': '201204',
}
building_topic = 'campus/building'
all_device_points = ['point1', 'point2', 'point3']
Getting points with the actuator requires a list of point topics, or device/point topic pairs.
# we only need one of the following:
point_topics = []
for device in device_map.values():
    for point in all_device_points:
        point_topics.append('/'.join([building_topic, device, point]))

device_point_pairs = []
for device in device_map.values():
    for point in all_device_points:
        device_point_pairs.append(('/'.join([building_topic, device]), point,))
Now we send our RPC request to the actuator:
# device_point_pairs can be passed instead of point_topics
point_results = self.vip.rpc.call('platform.actuator', 'get_multiple_points', point_topics).get(timeout=3)
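The topic construction in the loops above can be exercised on its own, without a running platform. The sketch below is standalone and uses a trimmed copy of the example's device map (two devices, two points) just to show the shapes produced; the names mirror the answer, not a real site configuration.

```python
# Standalone sketch of the topic strings the loops above produce.
device_map = {'device1': '201201', 'device2': '201202'}
building_topic = 'campus/building'
all_device_points = ['point1', 'point2']

# full point topics: <building>/<device>/<point>
point_topics = ['/'.join([building_topic, device, point])
                for device in device_map.values()
                for point in all_device_points]

# (device topic, point name) pairs, the alternative accepted form
device_point_pairs = [('/'.join([building_topic, device]), point)
                      for device in device_map.values()
                      for point in all_device_points]

print(point_topics[0])        # campus/building/201201/point1
print(device_point_pairs[0])  # ('campus/building/201201', 'point1')
```

One further note, hedged since it depends on your platform version: the actuator's get_multiple_points RPC returns a pair of dictionaries, (results, errors), both keyed by topic, so it is worth unpacking and checking the errors dictionary rather than assuming every read succeeded.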
Maybe it's just my interpretation of your question, but it seems a little open-ended, so I shall respond in a similar vein: in general terms (and I'll try to keep it short).
First, you need the list of info for targeting each device in turn; i.e. it might consist of just an IP(v4) address (for the physical device) and the (logical) device's BOIN (BACnet Object Instance Number), or, if the request is being routed/forwarded on by/via a BACnet router/BACnet gateway, then maybe also the DNET # and the DADR too.
Then you probably want, for each device/one at a time, to retrieve the first/0-element value of the device's Object-List property, in order to get the number of objects it contains. This lets you know how many objects are available (including the logical device/device-type object) that you need to retrieve and iterate over. NOTE: in the real world, as much as it's common for the device-type object to be the first one in the list, there's no guarantee it will always be the case.
As much as the BACnet standard started allowing for the retrieval of the Property-List property from each and every object, most equipment doesn't yet support it, so you might need your own idea of what properties (at least the ones of interest to you) each different object-type supports; at the very least, know which object-types support the Present-Value property and which don't.
One ideal would be to have the following mocked facets as fakes for testing purposes, instead of testing against a live/important device (or at least consider testing against a noddy BACnet-enabled Raspberry Pi or similar hardware):
a mock for your BACnet service
a mock for the BACnet communication stack
a mock for your device as a whole (if you can't write your own, then maybe even start with the YABE 'Room Control Simulator' as a starting point)
Hope this helps (in some way).

How can I abort a Gatling simulation if the test system is not in the right state?

The target system I am load testing has a mode that indicates if it's suitable for running a load test against.
I want to check that mode once only at the beginning of my simulation (i.e. I don't want to do the check over and over for each user in the sim).
This is what I've come up with, but System.exit() seems pretty harsh.
I define an execution chain that checks if the mode is the value I want:
def getInfoCheckNotRealMode: ChainBuilder = exec(
  http("mode check").get("/modeUrl")
    .check(jsonPath("$.mode").saveAs("mode"))
).exec { sess =>
  val mode = sess("mode").as[String]
  println(s"sendingMode $mode")
  if (mode == "REAL") {
    log.error("cannot allow simulation to run against system in REAL mode")
    System.exit(1)
  }
  sess
}
Then I run the "check" scenario in parallel to the real scenario like this:
val sim = setUp(
  newUserScene.inject(loadProfile)
    .protocols(mySvcHttp),
  scenario("Check Sending mode").exec(getInfoCheckNotRealMode)
    .inject(atOnceUsers(1))
    .protocols(mySvcHttp)
)
Problems that I see with this:
Seems a bit over-complicated for simply checking that the system-under-test is suitable for testing against.
It's going to actually run the scenarios in parallel so if the check takes a while it's still going to generate load against a system that's in the wrong mode.
Need to consider and test what happens if the mode check is not behaving correctly
Is there a better way?
Is there some kind of "before simulation begins" phase where I can put this check?
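On the last question: Gatling's Simulation class does expose before { } and after { } hooks that run exactly once, outside of any injection profile. The Gatling DSL is not available inside them, so the mode check has to be made with plain HTTP. The sketch below is an outline under assumptions, not a drop-in answer: modeFromJson and its regex-based parsing are hypothetical helpers, and the /modeUrl endpoint is taken from the question.

```scala
// Hypothetical helper: pull the "mode" field out of a JSON body with a regex.
// A real implementation might prefer a proper JSON parser.
def modeFromJson(body: String): Option[String] =
  """"mode"\s*:\s*"([^"]+)"""".r.findFirstMatchIn(body).map(_.group(1))

// Inside the Simulation class, before any scenario is injected:
// before {
//   val body = scala.io.Source.fromURL("http://test-system/modeUrl").mkString
//   if (modeFromJson(body).contains("REAL"))
//     throw new IllegalStateException("system is in REAL mode, aborting simulation")
// }
```

Throwing from before aborts the run before any load is generated, which addresses both the "runs in parallel" and the System.exit concerns, since no scenario has started yet.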

Spring Batch FlatFileItemWriter does not write data to a file

I am new to Spring Batch. I am trying to use FlatFileItemWriter to write data into a file. The challenge is that the application creates the file at the given path but does not write the actual content into it.
Following are details related to code:
List<String> dataFileList : This list contains the data that I want to write to a file
FlatFileItemWriter<String> writer = new FlatFileItemWriter<>();
writer.setResource(new FileSystemResource("C:\\Desktop\\test"));
writer.open(new ExecutionContext());
writer.setLineAggregator(new PassThroughLineAggregator<>());
writer.setAppendAllowed(true);
writer.write(dataFileList);
writer.close();
This is just generating the file at proper place but contents are not getting written into the file.
Am I missing something? Help is highly appreciated.
Thanks!
This is not the proper way to use the Spring Batch writer to write data. You need to declare a bean for the writer first:
Define a Job bean
Define a Step bean
Use your writer bean in the Step
Also worth double-checking in your snippet: open() is called before the writer is fully configured. FlatFileItemWriter expects its properties (such as the line aggregator) to be set, and afterPropertiesSet() to be called, before open().
Have a look at following examples:
https://github.com/pkainulainen/spring-batch-examples/blob/master/spring-boot/src/main/java/net/petrikainulainen/springbatch/csv/in/CsvFileToDatabaseJobConfig.java
https://spring.io/guides/gs/batch-processing/
You probably need to force a sync to disk. From the docs at https://docs.spring.io/spring-batch/trunk/apidocs/org/springframework/batch/item/file/FlatFileItemWriter.html,
setForceSync
public void setForceSync(boolean forceSync)
Flag to indicate that changes should be force-synced to disk on flush. Defaults to false, which means that even with a local disk changes could be lost if the OS crashes in between a write and a cache flush. Setting to true may result in slower performance for usage patterns involving many frequent writes.
Parameters:
forceSync - the flag value to set

Hadoop Map Whole File in Java

I am trying to use Hadoop in java with multiple input files. At the moment I have two files, a big one to process and a smaller one that serves as a sort of index.
My problem is that I need to keep the whole index file unsplit while the big file is distributed to each mapper. Is there any way provided by the Hadoop API to do such a thing?
In case I have not expressed myself correctly, here is a link to a picture that represents what I am trying to achieve: picture
Update:
Following the instructions provided by Santiago, I am now able to insert a file (or the URI, at least) from Amazon's S3 into the distributed cache like this:
job.addCacheFile(new Path("s3://myBucket/input/index.txt").toUri());
However, when the mapper tries to read it a 'file not found' exception occurs, which seems odd to me. I have checked the S3 location and everything seems to be fine. I have used other S3 locations to introduce the input and output file.
Error (note the single slash after the s3:)
FileNotFoundException: s3:/myBucket/input/index.txt (No such file or directory)
The following is the code I use to read the file from the distributed cache:
URI[] cacheFile = output.getCacheFiles();
BufferedReader br = new BufferedReader(new FileReader(cacheFile[0].toString()));
while ((line = br.readLine()) != null) {
    //Do stuff
}
I am using Amazon's EMR, S3 and the version 2.4.0 of Hadoop.
As mentioned above, add your index file to the Distributed Cache and then access it in your mapper. Behind the scenes, the Hadoop framework will ensure that the index file is sent to all the task trackers before any task is executed and is available for your processing. In this case, data is transferred only once and is available for all the tasks related to your job.
However, instead of adding the index file to the Distributed Cache in your mapper code, make your driver code implement the Tool interface and override the run method. This provides the flexibility of passing the index file to the Distributed Cache through the command prompt while submitting the job.
If you are using ToolRunner, you can add files to the Distributed Cache directly from the command line when you run the job. There is no need to copy the file to HDFS first. Use the -files option to add files:
hadoop jar yourjarname.jar YourDriverClassName -files cachefile1,cachefile2,cachefile3,...
You can access the files in your Mapper or Reducer code as below:
File f1 = new File("cachefile1");
File f2 = new File("cachefile2");
File f3 = new File("cachefile3");
You could push the index file to the distributed cache, and it will be copied to the nodes before the mapper is executed.
See this SO thread.
Here's what helped me solve the problem.
Since I am using Amazon's EMR with S3, I needed to change the syntax a bit, as stated on the following site.
It was necessary to add the name the system was going to use to read the file from the cache, as follows:
job.addCacheFile(new URI("s3://myBucket/input/index.txt" + "#index.txt"));
This way, the program understands that the file introduced into the cache is named just index.txt. I also needed to change the syntax to read the file from the cache. Instead of reading the entire path stored on the distributed cache, only the filename has to be used, as follows:
URI[] cacheFile = output.getCacheFiles();
BufferedReader br = new BufferedReader(new FileReader("index.txt"));
while ((line = br.readLine()) != null) {
    //Do stuff
}
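The reason the "#index.txt" suffix works is that Hadoop treats the fragment part of the cache URI as the local alias (symlink name) under which the file appears in the task's working directory. The split itself is plain java.net.URI behavior, which can be checked standalone; CacheUriDemo and its alias helper below are illustrative names, not Hadoop API.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class CacheUriDemo {
    // Return the local name a cache entry would be read under: the URI
    // fragment if present, otherwise the last path component.
    static String alias(String cacheUri) {
        try {
            URI uri = new URI(cacheUri);
            if (uri.getFragment() != null) {
                return uri.getFragment();
            }
            return new java.io.File(uri.getPath()).getName();
        } catch (URISyntaxException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(alias("s3://myBucket/input/index.txt#index.txt")); // index.txt
        System.out.println(alias("s3://myBucket/input/index.txt"));           // index.txt
    }
}
```

This also explains the earlier FileNotFoundException: reading cacheFile[0].toString() hands the full s3:// URI to FileReader, which expects a local path, whereas the fragment alias resolves to a file in the working directory.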
