How to append or change data within a file

Hiya, I have made a program that stores the player name and strength. Here is the code:
data = {
    "PLAYER": name2,
    "STRENGTH": str(round(strength, 2)),
}
with open("data2.txt", "w", encoding="utf-8") as file:
    file.write(repr(data))
file.close()
So this stores the data. What do I do if I want to append/change the value after a certain action such as a 'BATTLE'?
Is it possible to get the value of 'STRENGTH' and then change the number?
At the moment, to read data from the external file 'data1.txt' I am using this code:
with open("data1.txt", "r", encoding="utf-8") as file:
data_string = file.readline()
data = eval(data_string)
# (data["STRENGTH"])
S1 = (float(data["STRENGTH"]))
file.close()
Now I can do something with the variable 'S1'.
Here is the external text file 'data1.txt':
{'PLAYER': 'Oreo', 'STRENGTH': '11.75'}
... But I want to change the strength value after a "battle". Many thanks.

Maybe you're not understanding Python dict semantics?
It seems to me you're doing a lot of unnecessary things, like S1 = (float(data['STRENGTH'])), to manipulate and change values when you could be doing really simple stuff.
>>> data = {'PLAYER': 'Oreo', 'STRENGTH': '11.75'}
>>> data['STRENGTH'] = float(data['STRENGTH'])
>>> data
{'PLAYER': 'Oreo', 'STRENGTH': 11.75}
>>> data['STRENGTH'] += 1
>>> data
{'PLAYER': 'Oreo', 'STRENGTH': 12.75}
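Writing the updated value back out is then just the same read/modify/write cycle you already have. A minimal sketch, assuming you keep the repr()-style file format from the question (ast.literal_eval is a safer stand-in for eval, and the +1 strength bonus is only a placeholder):
import ast

# Read the current stats (the file holds a repr() of a dict, as in the question).
with open("data1.txt", "r", encoding="utf-8") as file:
    data = ast.literal_eval(file.readline())

# Update the value after a "battle" (the +1 bonus is just an example).
data["STRENGTH"] = str(round(float(data["STRENGTH"]) + 1, 2))

# Write the whole dictionary back, overwriting the old contents.
with open("data1.txt", "w", encoding="utf-8") as file:
    file.write(repr(data))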
Maybe you should give Native Data Types -- Dive Into Python 3 a read to see if it clears things up.

Related

How can I create and shuffle a dataset for triplet mining in TensorFlow 2?

I'm working on a network using triplet mining for training. In order to make it work properly, I need my batches to contain several images of the same class. The problem I'm currently facing is that I have 751 classes, for a total of 12,937 pictures, and a batch size of 48 pictures. When shuffling the dataset using the command below, the odds to get pictures from the same class are really low, making the triplet mining inefficient.
dataset = dataset.shuffle(12937)
What I would need instead is a way of generating batches that contain a specific number of pictures for every class represented in the batch. As an example, let's say that I want 12 classes per batch; there would then be 4 pictures for each of them.
Another problem I'm facing is how I would shuffle this dataset at the end of every epoch so that I can have different batches that still follow the condition fixed above, that is, 12 classes with 4 pictures for each of them.
Is there any proper way to do it? I can't really find one. Please let me know if I'm unclear, and if you need further details.
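(Side note for readers: the building block my attempt below relies on, tf.data.experimental.choose_from_datasets, picks elements from a list of datasets according to an index dataset. A minimal standalone illustration with toy data, not my actual pipeline:)
import tensorflow as tf

# Toy per-class datasets: each one yields only labels of a single class.
per_class = [tf.data.Dataset.from_tensor_slices([c] * 10) for c in range(3)]

# The choice dataset decides which per-class dataset the next element comes from.
choice = tf.data.Dataset.from_tensor_slices(tf.constant([0, 0, 1, 1, 2, 2], dtype=tf.int64))

picked = tf.data.experimental.choose_from_datasets(per_class, choice)
print(list(picked.as_numpy_iterator()))  # [0, 0, 1, 1, 2, 2]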
================ EDIT ================
I've been trying a few things, and came up with something that would do what I want. The function would be the following:
counter = 0.

# Assuming a format such as (data, label)
def predicate(data, label):
    global counter
    allowed_labels = tf.constant([counter])
    isallowed = tf.equal(allowed_labels, tf.cast(label, tf.float32))
    reduced = tf.reduce_sum(tf.cast(isallowed, tf.float32))
    counter += 1
    return tf.greater(reduced, tf.constant(0.))

# @tf.function
def custom_shuffle(train_dataset, batch_size, samples_per_class=4, iterations_in_epoch=100, database='market'):
    assert batch_size % samples_per_class == 0, f'batch size must be a {samples_per_class} multiple.'
    if database == 'market':
        class_nbr = 751
    else:
        raise Exception('Unsupported database yet')
    all_datasets = [train_dataset.filter(predicate) for _ in range(class_nbr)]  # Every element of this array is a dataset of one class
    for i in range(iterations_in_epoch):
        choice = tf.random.uniform(
            shape=(batch_size // samples_per_class,),
            minval=0,
            maxval=class_nbr,
            dtype=tf.dtypes.int64,
        )  # Which classes will be in the batch
        choice = tf.data.Dataset.from_tensor_slices(tf.concat([choice for _ in range(4)], axis=0))  # Exactly 4 pictures from each class in the batch
        batch = tf.data.experimental.choose_from_datasets(all_datasets, choice)
        if i == 0:
            all_batches = batch
        else:
            all_batches = all_batches.concatenate(batch)
    all_batches = all_batches.batch(batch_size)
    return all_batches
It does what I want; however, the returned dataset is extremely slow to iterate, making model learning impossible. As per this thread, I understood that I needed to decorate custom_shuffle with @tf.function, as in the commented-out line above. However, when doing so, it raises the following error:
Traceback (most recent call last):
File "training.py", line 137, in <module>
main()
File "training.py", line 80, in main
train_dataset = get_dataset(TRAINING_FILENAMES, IMG_SIZE, BATCH_SIZE, database=database, func_type='train')
File "E:\Morgan\TransReID_TF\tfr_to_dataset.py", line 260, in get_dataset
dataset = custom_shuffle(dataset, batch_size)
File "D:\Programs\Anaconda3\envs\AlignedReID_TF\lib\site-packages\tensorflow\python\eager\def_function.py", line 780, in __call__
result = self._call(*args, **kwds)
File "D:\Programs\Anaconda3\envs\AlignedReID_TF\lib\site-packages\tensorflow\python\eager\def_function.py", line 846, in _call
return self._concrete_stateful_fn._filtered_call(canon_args, canon_kwds) # pylint: disable=protected-access
File "D:\Programs\Anaconda3\envs\AlignedReID_TF\lib\site-packages\tensorflow\python\eager\function.py", line 1843, in _filtered_call
return self._call_flat(
File "D:\Programs\Anaconda3\envs\AlignedReID_TF\lib\site-packages\tensorflow\python\eager\function.py", line 1923, in _call_flat
return self._build_call_outputs(self._inference_function.call(
File "D:\Programs\Anaconda3\envs\AlignedReID_TF\lib\site-packages\tensorflow\python\eager\function.py", line 545, in call
outputs = execute.execute(
File "D:\Programs\Anaconda3\envs\AlignedReID_TF\lib\site-packages\tensorflow\python\eager\execute.py", line 59, in quick_execute
tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InternalError: No unary variant device copy function found for direction: 1 and Variant type_index: class tensorflow::data::`anonymous namespace'::DatasetVariantWrapper
[[{{node BatchDatasetV2/_206}}]] [Op:__inference_custom_shuffle_11485]
Function call stack:
custom_shuffle
Which I don't understand, and don't see how to fix.
Is there something I'm doing wrong?
PS: I'm aware the lack of minimal code to reproduce this behavior makes it hard to debug; I'll try to provide some as soon as possible.

Python3 - IndexError when trying to save a text file

I'm trying to follow this tutorial with my own local data files:
CNTK tutorial
I have the following function to save my data array into a txt file feedable to CNTK:
# Save the data files into a format compatible with CNTK text reader
def savetxt(filename, ndarray):
    dir = os.path.dirname(filename)
    if not os.path.exists(dir):
        os.makedirs(dir)
    if not os.path.isfile(filename):
        print("Saving", filename)
        with open(filename, 'w') as f:
            labels = list(map(' '.join, np.eye(11, dtype=np.uint).astype(str)))
            for row in ndarray:
                row_str = row.astype(str)
                label_str = labels[row[-1]]
                feature_str = ' '.join(row_str[:-1])
                f.write('|labels {} |features {}\n'.format(label_str, feature_str))
    else:
        print("File already exists", filename)
I have 2 ndarrays of the following shape that I want to feed the model:
train.shape
(1976L, 15104L)
test.shape
(1976L, 15104L)
Then I try to implement the function like this:
# Save the train and test files (prefer our default path for the data)
data_dir = os.path.join("C:/Users", 'myself', "OneDrive", "IA Project", 'data', 'train')
if not os.path.exists(data_dir):
data_dir = os.path.join("data", "IA Project")
print ('Writing train text file...')
savetxt(os.path.join(data_dir, "Train-128x118_cntk_text.txt"), train)
print ('Writing test text file...')
savetxt(os.path.join(data_dir, "Test-128x118_cntk_text.txt"), test)
print('Done')
and then I get the following error:
Writing train text file...
Saving C:/Users\A702628\OneDrive - Atos\Microsoft Capstone IA\Capstone data\train\Train-128x118_cntk_text.txt
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-24-b53d3c69b8d2> in <module>()
6
7 print ('Writing train text file...')
----> 8 savetxt(os.path.join(data_dir, "Train-128x118_cntk_text.txt"), train)
9
10 print ('Writing test text file...')
<ipython-input-23-610c077db694> in savetxt(filename, ndarray)
12 for row in ndarray:
13 row_str = row.astype(str)
---> 14 label_str = labels[row[-1]]
15 feature_str = ' '.join(row_str[:-1])
16 f.write('|labels {} |features {}\n'.format(label_str, feature_str))
IndexError: list index out of range
Can somebody please tell me what's going wrong with this part of the code? And how could I fix it? Thank you very much in advance.
Since you're using your own input data -- are they labelled in the range 0 to 10? The labels array built with np.eye(11) only has 11 entries in it, so any label outside that range would cause an out-of-range problem.
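A quick way to check this (a hypothetical snippet on my part, assuming the last column of each row holds the integer class label) is to print the label range and size the identity matrix from it:
import numpy as np

# Assumes the last column of each row in `train` is the integer class label.
labels_col = train[:, -1].astype(int)
print("label range:", labels_col.min(), "to", labels_col.max())

# np.eye must have at least max(label) + 1 rows, otherwise
# labels[row[-1]] raises IndexError for the largest labels.
num_classes = labels_col.max() + 1
labels = list(map(' '.join, np.eye(num_classes, dtype=np.uint).astype(str)))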

Way to assign multiple files to set variable names?

Is there a way to assign file names to set variables using a GUI? Say I have 6 file sets which contain 4 colors each (blue, green, nir, red). There are 24 files in total, so I'd need 24 variables. And I want the set variables to be something like
blue1
green1
nir1
red1
blue2
green2
nir2
red2
etc...
Currently I'm trying to use GUIDE to create a custom GUI that will allow the user to select the files they wish and have them assigned to certain variables. I am thinking something along the lines of having 24 popup menus that are attached to a file directory and allow the user to select which file they want; it will then assign that file and its path to a variable (blue1, for example). I also want 24 check boxes to associate with an if statement.
Let's say popupmenu1 is associated with the variable blue1 and checkbox1
if checkbox1 == checked
do import
elseif checkbox1 == unchecked
fill with zeros
I have the basic frame of the GUI created, I am just unclear on how to apply the file select and then associate the if statements, etc...
If you know the variable names in advance, it's bad practice (look also here and here) to use string-defined variable names like this:
var1name = 'blue';
var2name = 'red';
% etc.
% load data
datablue=rand(4,1);
datared =rand(4,1);
% assign
eval([var1name '1 = datablue(1);']);
eval([var2name '1 = datared (1);']);
% etc.
eval([var1name '2 = datablue(2);']);
eval([var2name '2 = datared (2);']);
% etc
It's much easier and better to just use an ordinary array, given the variable name is not changing or application dependent, which in my example I already have as datablue and datared.
Another option if you'd like user defined variable names is to use an array of structs:
var1name = 'blue';
var2name = 'red';
sample(1).(var1name) = datablue(1);
sample(1).(var2name) = datared (1);
% ...
sample(2).(var1name) = datablue(2);
sample(2).(var2name) = datared (2);
Try some of these out, and only if you have a very good reason, resort to eval!
for k = 1:6
    blue{k}  = sprintf('blue%d', k);
    green{k} = sprintf('green%d', k);
    nir{k}   = sprintf('nir%d', k);
    red{k}   = sprintf('red%d', k);
end
This will create the variable names for you (in cell arrays of strings; plain numeric indexing like blue(k) cannot hold a string). Then you can use assignin (I believe) or eval to set the values to the variable names.

In Django how can I run a custom clean function on fixture data during import and validation?

In a ModelForm I can write a clean_<field_name> member function to automatically validate and clean up data entered by a user, but what can I do about dirty json or csv files (fixtures) during a manage.py loaddata?
Fixtures loaded with loaddata are assumed to contain clean data that doesn't need validation (usually as an inverse operation to a prior dumpdata), so the short answer is that loaddata isn't the approach you want if you need to clean your inputs.
However, you can probably use some of the underpinnings of loaddata while implementing your custom data cleaning code -- I'm sure you can easily script something using the Django serialization libs to read your existing data files in and then save the resulting objects normally after the data has been cleaned up.
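A minimal sketch of that idea, assuming a JSON fixture and a hypothetical clean_zip_code method on the model (the file path and field name here are illustrative, not from the question):
from django.core import serializers

# Deserialize the fixture, clean each object, then save it through the ORM
# so the usual model save path runs (unlike loaddata, which assumes clean data).
with open("fixtures/Entity.json", encoding="utf-8") as f:
    for deserialized in serializers.deserialize("json", f):
        obj = deserialized.object
        obj.zip_code = obj.clean_zip_code(obj.zip_code)  # hypothetical cleaner
        obj.save()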
In case others want to do something similar, I defined a model method to do the cleaning (so it can be called from ModelForms):
MAX_ZIPCODE_DIGITS = 9
MIN_ZIPCODE_DIGITS = 5

def clean_zip_code(self, s=None):
    #s = str(s or self.zip_code)
    if not s: return None
    s = re.sub(r"\D", "", s)
    if len(s) > self.MAX_ZIPCODE_DIGITS:
        s = s[:self.MAX_ZIPCODE_DIGITS]
    if len(s) in (self.MIN_ZIPCODE_DIGITS - 1, self.MAX_ZIPCODE_DIGITS - 1):
        s = '0' + s  # FIXME: deal with other intermediate lengths
    if len(s) >= self.MAX_ZIPCODE_DIGITS:
        s = s[:self.MIN_ZIPCODE_DIGITS] + '-' + s[self.MIN_ZIPCODE_DIGITS:]
    return s
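For example (illustrative calls, assuming the method sits on a model instance obj carrying those class attributes):
>>> obj.clean_zip_code('12345-6789')
'12345-6789'
>>> obj.clean_zip_code('1234')   # a zip that lost its leading zero gets it back
'01234'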
Then I wrote a standalone Python script to clean up my legacy JSON files using any clean_ methods found among the models.
import os, json

def clean_json(app='XYZapp', model='Entity', fields='zip_code', cleaner_prefix='clean_'):
    # Set the DJANGO_SETTINGS_MODULE environment variable.
    os.environ['DJANGO_SETTINGS_MODULE'] = app + ".settings"
    settings = __import__(app + '.settings').settings
    models = __import__(app + '.models').models
    fpath = os.path.join(settings.SITE_PROJECT_PATH, 'fixtures', model + '.json')
    if isinstance(fields, (str, unicode)):
        fields = [fields]
    Ns = []
    for field in fields:
        try:
            instance = getattr(models, model)()
        except AttributeError:
            print 'No model named %s could be found' % (model,)
            continue
        try:
            cleaner = getattr(instance, cleaner_prefix + field)
        except AttributeError:
            print 'No cleaner method named %s.%s could be found' % (model, cleaner_prefix + field)
            continue
        print 'Cleaning %s using %s.%s...' % (fpath, model, cleaner.__name__)
        fin = open(fpath, 'r')
        if fin:
            l = json.load(fin)
            before = len(l)
            cleans = 0
            for i in range(len(l)):
                if 'fields' in l[i] and field in l[i]['fields']:
                    l[i]['fields'][field] = cleaner(l[i]['fields'][field])  # cleaner returns None to delete records
                    cleans += 1
            fin.close()
            after = len(l)
            assert after > .5 * before
            Ns += [(before, after, cleans)]
            print 'Writing %d/%d (new/old) records after %d cleanups...' % Ns[-1]
            with open(fpath, 'w') as fout:
                fout.write(json.dumps(l, indent=2, sort_keys=True))
    return Ns

if __name__ == '__main__':
    clean_json()

writing p-values to file in R

Can someone help me with this piece of code? In a loop I'm saving p-values in f, and then I want to write the p-values to a file, but I don't know which function to use. I'm getting an error with the write function.
{
  f = fisher.test(x, y = NULL, hybrid = FALSE, alternative = "greater",
                  conf.int = TRUE, conf.level = 0.95, simulate.p.value = FALSE)
  write(f, file="fisher_pvalues.txt", sep=" ", append=TRUE)
}
Error in cat(list(...), file, sep, fill, labels, append) :
argument 1 (type 'list') cannot be handled by 'cat'
The return value from fisher.test is (if you read the docs):
Value:
A list with class ‘"htest"’ containing the following components:
p.value: the p-value of the test.
conf.int: a confidence interval for the odds ratio. Only present in
the 2 by 2 case and if argument ‘conf.int = TRUE’.
etc etc. R doesn't know how to write things like that to a file. More precisely, it doesn't know how YOU want it written to a file.
If you just want to write the p value, then get the p value and write that:
write(f$p.value,file="foo.values",append=TRUE)
f is an object of class 'htest', so writing it to a file will write much more than just the p-value.
If you do want to simply save a written representation of the results to a file, just as they appear on the screen, you can use capture.output() to do so:
Convictions <-
matrix(c(2, 10, 15, 3),
nrow = 2,
dimnames =
list(c("Dizygotic", "Monozygotic"),
c("Convicted", "Not convicted")))
f <- fisher.test(Convictions, alternative = "less")
capture.output(f, file="fisher_pvalues.txt", append=TRUE)
More likely, you want to just store the p-value. In that case you need to extract it from f before writing it to the file, using code something like this:
write(paste("p-value from Experiment 1:", f$p.value, "\n"),
file = "fisher_pvalues.txt", append=TRUE)
