Write and Read multiple arrays on a file in Python - arrays

I need to write multiple arrays (3,3) on the same file and read them one by one.
This is my first time using Python. Please help.

You could use np.savez:
import numpy as np
# save three arrays to a numpy archive file
a0 = np.array([[1,2,3],[4,5,6],[7,8,9]])
a1 = np.array([[11,12,13],[14,15,16],[17,18,19]])
a2 = np.array([[21,22,23],[24,25,26],[27,28,29]])
np.savez("my_archive.npz", soil=a0, crust=a1, bedrock=a2)
# open the archive and access each array by name
with np.load("my_archive.npz") as my_archive_file:
    out0 = my_archive_file["soil"]
    out1 = my_archive_file["crust"]
    out2 = my_archive_file["bedrock"]
https://www.pythonlikeyoumeanit.com/Module5_OddsAndEnds/WorkingWithFiles.html
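Since the question asks to read the arrays back one by one, here is a minimal sketch of looping over everything stored in the archive instead of naming each key by hand. The array contents and the temporary directory are assumptions made for illustration; only np.savez, np.load and the archive's files attribute come from NumPy itself.

```python
import os
import tempfile

import numpy as np

# Hypothetical example data: three 3x3 arrays keyed by name.
arrays = {
    "soil": np.arange(1, 10).reshape(3, 3),
    "crust": np.arange(11, 20).reshape(3, 3),
    "bedrock": np.arange(21, 30).reshape(3, 3),
}

# Write to a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "my_archive.npz")
    np.savez(path, **arrays)

    # archive.files lists the stored names; indexing loads one array at a time.
    loaded = {}
    with np.load(path) as archive:
        for name in archive.files:
            loaded[name] = archive[name]
            print(name, loaded[name].shape)
```

Each array is only read from disk when it is indexed, so this pattern should also work when the archive holds more arrays than you want in memory at once.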

If you mean an array like this:
mylist = [[1,2,3],[4,5,6],[7,8,9]]
then the solution is like this:
with open("fout.txt", "w") as fout:
    print(*mylist, sep="\n", file=fout)
To read it back:
with open("fout.txt", "r") as fout:
    print(fout.read())
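The read step above prints the file as text. If the rows are needed as lists again, one possible approach (an assumption, not part of the original answer) is to parse each line with ast.literal_eval. The temporary directory is only there to keep the sketch self-contained.

```python
import ast
import os
import tempfile

mylist = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "fout.txt")

    # Same write step as above: one row per line, e.g. "[1, 2, 3]".
    with open(path, "w") as fout:
        print(*mylist, sep="\n", file=fout)

    # Parse each non-empty line back into a Python list.
    with open(path, "r") as fin:
        restored = [ast.literal_eval(line) for line in fin if line.strip()]

print(restored)
```

ast.literal_eval only accepts Python literals, so it is a safer choice than eval for text that might come from elsewhere.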

Related

Python collection of different sized arrays (Jagged arrays), Dask?

I have multiple 1-D numpy arrays of different size representing audio data.
Since they're different sizes (e.g (8200,), (13246,), (61581,)), I cannot stack them as 1 array with numpy. The size difference is too big to engage in 0-padding.
I can keep them in a list or dictionary and then use for loops to iterate over them to do calculations, but I would prefer that I could approach it in numpy-style. Calling a numpy function on the variable, without having to write a for-loop. Something like:
np0 = np.array([.2, -.4, -.5])
np1 = np.array([-.8, .9])
np_mix = irregular_stack(np0, np1)
np.sum(np_mix)
# output: [-0.7, 0.09999999999999998]
Looking at this Dask picture, I was wondering if I can do what I want with Dask.
My attempt so far is this:
import numpy as np
import dask.array as da
np0 = np.array([.2, -.4, -.5])
arr0 = da.from_array(np0, chunks=(3,))
np1 = np.array([-.8, .9])
arr1 = da.from_array(np1, chunks=(2,))
# stack them
data = [[arr0],
        [arr1]]
x = da.block(data)
x.compute()
# output: ValueError: ('Shapes do not align: %s', [(1, 3), (1, 2)])
Questions
Am I misunderstanding how Dask can be used?
If it's possible, how do I do my np.sum() example?
If it's possible, is it actually faster than a for-loop on a high-end single PC?
I found the library awkward-array (https://github.com/scikit-hep/awkward-array), which allows for different length arrays and can do what I asked for:
import numpy as np
import awkward
np0 = np.array([.2, -.4, -.5])
np1 = np.array([-.8, .9])
varlen = awkward.fromiter([np0, np1])
# <JaggedArray [[0.2 -0.4 -0.5] [-0.8 0.9]] at 0x7f01a743e790>
varlen.sum()
# output: array([-0.7, 0.1])
The library describes itself as: "Manipulate arrays of complex data structures as easily as Numpy."
So far, it seems to satisfy everything I need.
Unfortunately, Dask arrays follow Numpy semantics, and assume that all rows are of equal length.
I don't know of a good library in Python that efficiently handles ragged arrays today, so you may be out of luck.
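For per-array reductions such as the sum in the question, a plain NumPy workaround is to store all pieces in one flat buffer and reduce over segments with np.add.reduceat. This is a minimal sketch, not a general ragged-array type; the variable names pieces, flat and starts are made up for illustration.

```python
import numpy as np

np0 = np.array([0.2, -0.4, -0.5])
np1 = np.array([-0.8, 0.9])
pieces = [np0, np1]

# One flat buffer plus the start index of each piece.
flat = np.concatenate(pieces)
lengths = np.array([len(p) for p in pieces])
starts = np.concatenate(([0], np.cumsum(lengths)[:-1]))

# np.add.reduceat sums flat[starts[i]:starts[i + 1]] for each i;
# the last segment runs to the end of the buffer.
segment_sums = np.add.reduceat(flat, starts)
print(segment_sums)
```

This assumes every piece is non-empty: for an empty segment reduceat returns the element at the start index rather than zero, so empty arrays would need separate handling.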

Reading CSV file in loop Dataframe (Julia)

I want to read multiple CSV files with changing names like "CSV_1.csv" and so on.
My idea was to simply implement a loop like the following
using CSV
for i = 1:8
    a[i] = CSV.read("0.$i.csv")
end
but obviously that won't work.
Is there a simple way of implementing this, like introducing an additional dimension in the dataframe?
Assuming a in this case is an array, this is definitely possible, but to do it this way, you'd need to pre-allocate your array, since you can't assign an index that doesn't exist yet:
julia> a = []
0-element Array{Any,1}
julia> a[1] = 1
ERROR: BoundsError: attempt to access 0-element Array{Any,1} at index [1]
Stacktrace:
[1] setindex!(::Array{Any,1}, ::Any, ::Int64) at ./essentials.jl:455
[2] top-level scope at REPL[10]:1
julia> a2 = Vector{Int}(undef, 5);
julia> for i in 1:5
           a2[i] = i
       end
julia> a2
5-element Array{Int64,1}:
1
2
3
4
5
Alternatively, you can use push!() to add things to an array as you need.
julia> a3 = [];
julia> for i in 1:5
           push!(a3, i)
       end
julia> a3
5-element Array{Any,1}:
1
2
3
4
5
So for your CSV files,
using CSV
a = []
for i = 1:8
    push!(a, CSV.read("0.$i.csv"))
end
As an alternative to what Kevin proposed, you can write:
# read in the files into a vector
a = CSV.read.(["0.$i.csv" for i in 1:8])
# add an indicator column
for i in 1:8
    a[i][!, :id] .= i
end
# create a single data frame with indicator column holding the source
b = reduce(vcat, a)
You can read an arbitrary number of CSV files with a certain pattern in the file name, create a dataframe per file and lastly, if you want, create a single dataframe.
using CSV, Glob, DataFrames
path = raw"C:\..." # directory of your files (raw is useful in Windows to add a \)
files=glob("*.csv", path) # to load all CSVs from a folder (* means arbitrary pattern)
dfs = DataFrame.( CSV.File.( files ) ) # creates a list of dataframes
# add an index column to be able to later discern the different sources
for i in 1:length(dfs)
    dfs[i][!, :sample] .= i # I called the new col sample
end
# finally, reduce your collection of dfs via vertical concatenation
df = reduce(vcat, dfs)

How to store the character for each looping in Matlab?

I have a string :
A="ILOVEYOUMATLAB"
and I create 2 empty array:
B1=[]
B2=[]
When I use a while loop, on the first iteration I want to store the first character of A in the B1 array. What command do I need to write?
In Python I would just use the append command, but what is the equivalent command in Matlab?
If you have MATLAB R2016b or newer you can use the new string class' overloaded + operator to append text in a more pythonic manner:
A = 'hi';
B = "";
B = B + A(1)
Which gives you:
B =
"h"
Here I've created A as a traditional character array ('') and B as a string array (""), mainly to avoid having to index into the string array (A{1}(1) instead of A(1)).
You can also just use traditional matrix concatenation to accomplish the task:
B = [B, A(1)];
% or
B = strcat(B, A(1));
% or
B(end+1) = A(1);
Note that all of these approaches will continually grow B in memory, which can be a significant performance bottleneck. If you know how many elements B is going to contain, you can save a lot of processing time by preallocating the array and using matrix indexing to assign values inside your loop:
A = {'apple', 'banana', 'cucumber'};
B = char(zeros(1, numel(A)));
for ii = 1:numel(A)
    B(ii) = A{ii}(1);
end
You can try strcat for concatenating strings in MATLAB:
https://www.mathworks.com/help/matlab/ref/strcat.html
Try using cell arrays instead of matrices. You can assign the first letter to the first position of the B1 cell array like this:
>> A = 'ILOVEMATLAB';
>> B1 = {};
>> B1{1} = A(1);
>> B1{1}
ans =
I
To loop through:
for i = 1:length(A)
    B1{i} = A(i); % A is a char array, so index it with parentheses
end

Create a multidimensional symbolic array in Matlab 2013b

According to the Matlab R2016a documentation, a symbolic multidimensional array can be comfortably created by using the sym command as follows:
A = sym('a',[2 2 2])
and the output is
A(:,:,1) =
[ a1_1_1, a1_2_1;
a2_1_1, a2_2_1]
A(:,:,2) =
[ a1_1_2, a1_2_2;
a2_1_2, a2_2_2]
However, I'm using Matlab 2013b and this command doesn't work for multiple dimensions. Is there any other way to create such variables for the 2013b version?
I'm not yet using R2016a, but looking around the code for the sym class (type edit sym in your Command Window), it's not too hard to write one's own function to do this:
function s = ndSym(x,a)
a = a(:).';
format = repmat('%d_',[1 numel(a)]);
x = [x format(1:end-1)];
s = cellfun(@createCharArrayElement,num2cell(1:prod(a)),'UniformOutput',false);
s = sym(reshape(s,a));
    function s = createCharArrayElement(k)
        [v{1:numel(a)}] = ind2sub(a,k);
        s = sprintf(x,v{:});
    end
end
You can then test it via A = ndSym('A',[2 2 2]), which returns:
A(:,:,1) =
[ A1_1_1, A1_2_1]
[ A2_1_1, A2_2_1]
A(:,:,2) =
[ A1_1_2, A1_2_2]
[ A2_1_2, A2_2_2]
This function should work for arrays with an arbitrary number of dimensions. I tested it in R2013b and R2015b. However, note that the function above doesn't incorporate any input validation and many of the options/niceties supported by sym. These could be added. Also, be aware that many pre-R2016a symbolic math functions may not support such multi-dimensional arrays.
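To make the naming logic of ndSym easier to follow outside MATLAB, here is a rough Python sketch of the same idea: walk the linear indices, convert each to subscripts in column-major order (what ind2sub does), and build a name from the 1-based subscripts. The helper nd_names is hypothetical and only produces strings, not symbolic variables.

```python
import numpy as np


def nd_names(prefix, shape):
    """Return an object array of names like 'A1_2_1' (hypothetical helper)."""
    names = np.empty(shape, dtype=object)
    for k in range(int(np.prod(shape))):
        # Column-major (order="F") unravelling matches MATLAB's ind2sub.
        idx = np.unravel_index(k, shape, order="F")
        # MATLAB subscripts start at 1, so shift each index by one.
        names[idx] = prefix + "_".join(str(i + 1) for i in idx)
    return names


names = nd_names("A", (2, 2, 2))
print(names[:, :, 0])
```

The first page, names[:, :, 0], corresponds to A(:,:,1) in the MATLAB output above.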

Write a string within a list to one cell in a CSV file. Python 2.7 Windows 7

Is it possible to write a string stored in a list to a single cell of a .CSV file?
I have a folder with files and I want to write the file names onto a .csv file.
Folder with files:
Data.txt
Data2.txt
Data3.txt
Here is my code:
import csv
import os
index = -1
filename = []
filelist = []
filelist = os.listdir("dirname")
f = csv.writer(open("output.csv", "ab"), delimiter=",", quotechar=" ", quoting=csv.QUOTE_MINIMAL)
for file in filelist:
    if (len(filelist) + index) <0:
        break
    filename = filelist[len(filelist)+index]
    index -= 1
    f.writerow(filename)
Output I'm getting is one letter per cell in the .csv file:
   A  B  C  D  E  F  G  H  I
1  D  a  t  a  .  t  x  t
2  D  a  t  a  2  .  t  x  t
3  D  a  t  a  3  .  t  x  t
Desired output would be to have each file name in one cell. There should be three rows in the csv file, with the strings "Data.txt" in cell A1, "Data2.txt" in cell A2, and "Data3.txt" in cell A3:
   A          B
1  Data.txt
2  Data2.txt
3  Data3.txt
Is it possible to do this? Let me know if you need more information. I am currently using Python 2.7 on Windows 7.
Solution/Corrected Code:
import csv
import os
index = -1
filename = []
filelist = []
filelist = os.listdir("dirname")
f = csv.writer(open("output.csv", "ab"), delimiter=",", quotechar=" ", quoting=csv.QUOTE_MINIMAL)
for file in filelist:
    if (len(filelist) + index) <0:
        break
    filename = filelist[len(filelist)+index]
    index -= 1
    f.writerow([filename]) #Need to send in a list of strings without the ',' as a delimiter since writerow expects a tuple/list of strings.
You can do this:
import csv
import os
filelist = os.listdir("dirname") # Use a real directory
f = csv.writer(open("output.csv", 'ab'), delimiter=",", quotechar=" ", quoting=csv.QUOTE_MINIMAL)
for file_name in filelist:
    f.writerow([file_name])
writerow expects a sequence, for example a list of strings. You're giving it a single string, which it then iterates over, so you see each letter of the string with a , between.
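The difference is easy to see in isolation. The sketch below is written for Python 3 (an assumption, since the thread uses Python 2.7) and writes to in-memory buffers instead of a real file, so nothing is created on disk.

```python
import csv
import io

# Passing a bare string: csv treats it as a sequence of characters.
buf_string = io.StringIO()
csv.writer(buf_string).writerow("Data.txt")

# Passing a one-element list: the whole name lands in a single field.
buf_list = io.StringIO()
csv.writer(buf_list).writerow(["Data.txt"])

print(repr(buf_string.getvalue()))  # 'D,a,t,a,.,t,x,t\r\n'
print(repr(buf_list.getvalue()))    # 'Data.txt\r\n'
```

The trailing \r\n is the csv module's default line terminator.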
You can also put each sentence or value that you want in one cell inside its own list, and put all of those inner lists inside one outer list.
Something like this:
import csv
import os
list_ = [["value1"], ["value2"], ["value3"]]
with open('test.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(list_)

Resources