Hourly Moving average over a duration - jfreechart

I am trying to create a moving-average line, bucketed by hour, for a selected duration. The x-axis should represent the 24 hours of the day and the y-axis the average value for each hour over the entire duration. For example, I want the hourly average of a person's effort for the month of April.
I have written the following program, but it plots the two days one after the other instead of averaging them. Any help is highly appreciated.
Edit: Cross-posted here.
public class MovingAverageDemo extends ApplicationFrame {
private static final long serialVersionUID = -1570942379483983865L;
/**
* A moving average demo.
* @param title the frame title.
*/
public MovingAverageDemo(String title) {
super(title);
// create a title...
String chartTitle = "Hourly Average";
XYDataset dataset = createDataset();
JFreeChart chart = ChartFactory.createTimeSeriesChart(
chartTitle,
"Hours",
"Actions",
dataset,
true,
true,
false
);
LegendTitle legend = (LegendTitle) chart.getLegend();
legend.setVisible(true);
XYPlot plot = chart.getXYPlot();
XYItemRenderer renderer = plot.getRenderer();
if(renderer instanceof StandardXYItemRenderer) {
StandardXYItemRenderer rr = (StandardXYItemRenderer) renderer;
rr.setPlotLines(true);
rr.setBaseShapesFilled(true);
}
NumberFormat format = NumberFormat.getNumberInstance();
format.setMaximumFractionDigits(2);
XYItemLabelGenerator generator =
new StandardXYItemLabelGenerator(
StandardXYItemLabelGenerator.DEFAULT_ITEM_LABEL_FORMAT,
format, format);
renderer.setBaseItemLabelGenerator(generator);
renderer.setBaseItemLabelsVisible(true);
DateAxis axis = (DateAxis) plot.getDomainAxis();
axis.setDateFormatOverride(new SimpleDateFormat("HH"));
ChartPanel chartPanel = new ChartPanel(chart);
chartPanel.setPreferredSize(new java.awt.Dimension(500, 270));
setContentPane(chartPanel);
}
/**
* Creates a dataset containing the moving-average series for the "New" and "Cancelled" data.
*
* @return the dataset.
*/
public XYDataset createDataset() {
TimeSeries s1 = new TimeSeries("New", Hour.class);
s1.add(new Hour(getDateByHour(1, 4, 2012, 1)), 0);
s1.add(new Hour(getDateByHour(1, 4, 2012, 2)), 0);
s1.add(new Hour(getDateByHour(1, 4, 2012, 3)), 0);
s1.add(new Hour(getDateByHour(1, 4, 2012, 4)), 0);
s1.add(new Hour(getDateByHour(1, 4, 2012, 5)), 0);
s1.add(new Hour(getDateByHour(1, 4, 2012, 6)), 0);
s1.add(new Hour(getDateByHour(1, 4, 2012, 7)), 148);
s1.add(new Hour(getDateByHour(1, 4, 2012, 8)), 153);
s1.add(new Hour(getDateByHour(1, 4, 2012, 9)), 142);
s1.add(new Hour(getDateByHour(1, 4, 2012, 10)), 123);
s1.add(new Hour(getDateByHour(1, 4, 2012, 11)), 0);
s1.add(new Hour(getDateByHour(1, 4, 2012, 12)), 139);
s1.add(new Hour(getDateByHour(1, 4, 2012, 13)), 142);
s1.add(new Hour(getDateByHour(1, 4, 2012, 14)), 138);
s1.add(new Hour(getDateByHour(1, 4, 2012, 15)), 137);
s1.add(new Hour(getDateByHour(1, 4, 2012, 16)), 0);
s1.add(new Hour(getDateByHour(1, 4, 2012, 17)), 0);
s1.add(new Hour(getDateByHour(1, 4, 2012, 18)), 0);
s1.add(new Hour(getDateByHour(1, 4, 2012, 19)), 0);
s1.add(new Hour(getDateByHour(1, 4, 2012, 20)), 0);
s1.add(new Hour(getDateByHour(1, 4, 2012, 21)), 0);
s1.add(new Hour(getDateByHour(1, 4, 2012, 22)), 0);
s1.add(new Hour(getDateByHour(1, 4, 2012, 23)), 0);
s1.add(new Hour(getDateByHour(1, 4, 2012, 24)), 0);
s1.add(new Hour(getDateByHour(2, 4, 2012, 1)), 0);
s1.add(new Hour(getDateByHour(2, 4, 2012, 2)), 0);
s1.add(new Hour(getDateByHour(2, 4, 2012, 3)), 0);
s1.add(new Hour(getDateByHour(2, 4, 2012, 4)), 0);
s1.add(new Hour(getDateByHour(2, 4, 2012, 5)), 0);
s1.add(new Hour(getDateByHour(2, 4, 2012, 6)), 0);
s1.add(new Hour(getDateByHour(2, 4, 2012, 7)), 168);
s1.add(new Hour(getDateByHour(2, 4, 2012, 8)), 173);
s1.add(new Hour(getDateByHour(2, 4, 2012, 9)), 162);
s1.add(new Hour(getDateByHour(2, 4, 2012, 10)), 143);
s1.add(new Hour(getDateByHour(2, 4, 2012, 11)), 0);
s1.add(new Hour(getDateByHour(2, 4, 2012, 12)), 119);
s1.add(new Hour(getDateByHour(2, 4, 2012, 13)), 122);
s1.add(new Hour(getDateByHour(2, 4, 2012, 14)), 118);
s1.add(new Hour(getDateByHour(2, 4, 2012, 15)), 117);
s1.add(new Hour(getDateByHour(2, 4, 2012, 16)), 0);
s1.add(new Hour(getDateByHour(2, 4, 2012, 17)), 0);
s1.add(new Hour(getDateByHour(2, 4, 2012, 18)), 0);
s1.add(new Hour(getDateByHour(2, 4, 2012, 19)), 0);
s1.add(new Hour(getDateByHour(2, 4, 2012, 20)), 0);
s1.add(new Hour(getDateByHour(2, 4, 2012, 21)), 0);
s1.add(new Hour(getDateByHour(2, 4, 2012, 22)), 0);
s1.add(new Hour(getDateByHour(2, 4, 2012, 23)), 0);
s1.add(new Hour(getDateByHour(2, 4, 2012, 24)), 0);
TimeSeries s2 = MovingAverage.createMovingAverage(s1, "NewC", 1, 0);
TimeSeries s3 = new TimeSeries("Cancelled", Hour.class);
s3.add(new Hour(getDateByHour(1, 4, 2012, 1)), 0);
s3.add(new Hour(getDateByHour(1, 4, 2012, 2)), 0);
s3.add(new Hour(getDateByHour(1, 4, 2012, 3)), 0);
s3.add(new Hour(getDateByHour(1, 4, 2012, 4)), 0);
s3.add(new Hour(getDateByHour(1, 4, 2012, 5)), 0);
s3.add(new Hour(getDateByHour(1, 4, 2012, 6)), 0);
s3.add(new Hour(getDateByHour(1, 4, 2012, 7)), 9);
s3.add(new Hour(getDateByHour(1, 4, 2012, 8)), 7);
s3.add(new Hour(getDateByHour(1, 4, 2012, 9)), 2);
s3.add(new Hour(getDateByHour(1, 4, 2012, 10)), 8);
s3.add(new Hour(getDateByHour(1, 4, 2012, 11)), 0);
s3.add(new Hour(getDateByHour(1, 4, 2012, 12)), 9);
s3.add(new Hour(getDateByHour(1, 4, 2012, 13)), 7);
s3.add(new Hour(getDateByHour(1, 4, 2012, 14)), 3);
s3.add(new Hour(getDateByHour(1, 4, 2012, 15)), 9);
s3.add(new Hour(getDateByHour(1, 4, 2012, 16)), 0);
s3.add(new Hour(getDateByHour(1, 4, 2012, 17)), 0);
s3.add(new Hour(getDateByHour(1, 4, 2012, 18)), 0);
s3.add(new Hour(getDateByHour(1, 4, 2012, 19)), 0);
s3.add(new Hour(getDateByHour(1, 4, 2012, 20)), 0);
s3.add(new Hour(getDateByHour(1, 4, 2012, 21)), 0);
s3.add(new Hour(getDateByHour(1, 4, 2012, 22)), 0);
s3.add(new Hour(getDateByHour(1, 4, 2012, 23)), 0);
s3.add(new Hour(getDateByHour(1, 4, 2012, 24)), 0);
s3.add(new Hour(getDateByHour(2, 4, 2012, 1)), 0);
s3.add(new Hour(getDateByHour(2, 4, 2012, 2)), 0);
s3.add(new Hour(getDateByHour(2, 4, 2012, 3)), 0);
s3.add(new Hour(getDateByHour(2, 4, 2012, 4)), 0);
s3.add(new Hour(getDateByHour(2, 4, 2012, 5)), 0);
s3.add(new Hour(getDateByHour(2, 4, 2012, 6)), 0);
s3.add(new Hour(getDateByHour(2, 4, 2012, 7)), 9);
s3.add(new Hour(getDateByHour(2, 4, 2012, 8)), 7);
s3.add(new Hour(getDateByHour(2, 4, 2012, 9)), 2);
s3.add(new Hour(getDateByHour(2, 4, 2012, 10)), 8);
s3.add(new Hour(getDateByHour(2, 4, 2012, 11)), 0);
s3.add(new Hour(getDateByHour(2, 4, 2012, 12)), 9);
s3.add(new Hour(getDateByHour(2, 4, 2012, 13)), 7);
s3.add(new Hour(getDateByHour(2, 4, 2012, 14)), 3);
s3.add(new Hour(getDateByHour(2, 4, 2012, 15)), 9);
s3.add(new Hour(getDateByHour(2, 4, 2012, 16)), 0);
s3.add(new Hour(getDateByHour(2, 4, 2012, 17)), 0);
s3.add(new Hour(getDateByHour(2, 4, 2012, 18)), 0);
s3.add(new Hour(getDateByHour(2, 4, 2012, 19)), 0);
s3.add(new Hour(getDateByHour(2, 4, 2012, 20)), 0);
s3.add(new Hour(getDateByHour(2, 4, 2012, 21)), 0);
s3.add(new Hour(getDateByHour(2, 4, 2012, 22)), 0);
s3.add(new Hour(getDateByHour(2, 4, 2012, 23)), 0);
s3.add(new Hour(getDateByHour(2, 4, 2012, 24)), 0);
TimeSeries s4 = MovingAverage.createMovingAverage(s3, "CancelledC", 1, 0);
TimeSeriesCollection dataset = new TimeSeriesCollection();
//dataset.addSeries(s1);
dataset.addSeries(s2);
//dataset.addSeries(s3);
dataset.addSeries(s4);
return dataset;
}
/**
* Starting point for the demonstration application.
*
* @param args ignored.
*/
public static void main(String[] args) {
MovingAverageDemo demo = new MovingAverageDemo("Moving Average Demo 1");
demo.pack();
RefineryUtilities.centerFrameOnScreen(demo);
demo.setVisible(true);
}
private Date getDateByHour(int day, int month, int year, int hour) {
Calendar cal = Calendar.getInstance();
cal.set(Calendar.DAY_OF_MONTH, day);
cal.set(Calendar.MONTH, month);
cal.set(Calendar.YEAR, year);
cal.set(Calendar.HOUR_OF_DAY, hour);
cal.set(Calendar.MINUTE, 0);
cal.set(Calendar.SECOND, 0);
return cal.getTime();
}
}

From what I understand from your comments, you really want to display 4 series on your plot, not 2 - i.e.
01-May-2012 New
02-May-2012 New
01-May-2012 Cancelled
02-May-2012 Cancelled
You could then change your getDateByHour() method or, better yet, replace it and the repeated new Hour(Date) calls in createDataset() with a single helper:
private Hour makeHour(int hour) {
return new Hour(hour, 1, 1, 2012);
}
The actual day/month/year you use shouldn't matter, because you have set the date format to "HH", so none of that gets rendered (provided you don't change the x-axis later).
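The averaging the question asks for does not depend on the charting library. As a minimal sketch in Python rather than Java (so none of this is JFreeChart API), one could group the readings by hour of day and average each group before plotting. `hourly_average` is a hypothetical helper name, and the sample values are a few of the "New" readings from the question.

```python
from collections import defaultdict
from datetime import datetime


def hourly_average(samples):
    """Average values by hour of day across all dates.

    samples: iterable of (datetime, value) pairs.
    Returns a dict mapping hour of day (0-23) to the mean value for that hour.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for stamp, value in samples:
        totals[stamp.hour] += value
        counts[stamp.hour] += 1
    return {hour: totals[hour] / counts[hour] for hour in sorted(totals)}


# A few "New" readings from the question, for 1 and 2 April 2012.
samples = [
    (datetime(2012, 4, 1, 7), 148), (datetime(2012, 4, 2, 7), 168),
    (datetime(2012, 4, 1, 8), 153), (datetime(2012, 4, 2, 8), 173),
    (datetime(2012, 4, 1, 12), 139), (datetime(2012, 4, 2, 12), 119),
]
averages = hourly_average(samples)
print(averages)
```

Each resulting (hour, average) pair could then be added to a single series, so the chart shows one point per hour of the day.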

Related

How to get the mean of specific values from an ndarray region?

Given an ndarray:
np.array(
(
(1, 2, 3, 3, 2),
(4, 5, 4, 3, 2),
(1, 1, 1, 1, 1),
(0, 0, 0, 0, 0),
(0, 2, 3, 4, 0),
)
)
extract the mean of the values bounded by a rectangle with coordinates: (1, 1), (3, 1), (1, 3), (3, 3).
The extracted region of the array would be:
5, 4, 3,
1, 1, 1,
0, 0, 0,
And the mean would be ~1.666666667
import numpy as np
arr = np.array(
(
(1, 2, 3, 3, 2),
(4, 5, 4, 3, 2),
(1, 1, 1, 1, 1),
(0, 0, 0, 0, 0),
(0, 2, 3, 4, 0),
)
)
mean = arr[1:4, 1:4].mean()
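If the rectangle arrives as corner coordinates rather than ready-made slice bounds, the slice can be derived from them. A small sketch, assuming the corners are inclusive (row, column) pairs; `region_mean` is a hypothetical helper name:

```python
import numpy as np


def region_mean(arr, corners):
    """Mean of the values inside the rectangle spanned by inclusive corners."""
    rows = [r for r, _ in corners]
    cols = [c for _, c in corners]
    # Slices exclude their end index, so add 1 to include the far edge.
    return arr[min(rows):max(rows) + 1, min(cols):max(cols) + 1].mean()


arr = np.array(
    (
        (1, 2, 3, 3, 2),
        (4, 5, 4, 3, 2),
        (1, 1, 1, 1, 1),
        (0, 0, 0, 0, 0),
        (0, 2, 3, 4, 0),
    )
)
result = region_mean(arr, [(1, 1), (3, 1), (1, 3), (3, 3)])
print(result)
```

The `+ 1` is the reason the one-liner above uses `1:4` for corners at indices 1 and 3.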

How to convert RGB data into an image

I'm doing an easy cybersecurity challenge where I'm given a data file containing only RGB codes like the following:
[(0, 0, 2), (0, 0, 2), (0, 0, 2), (0, 0, 2), (0, 0, 2), (0, 0, 2), (0, 0, 2), (0, 0, 2), (0, 0, 2), (1, 1, 3), (1, 1, 3), (1, 1, 3), (0, 0, 2), (0, 0, 2), (0, 0, 2), (1, 1, 3), (0, 0, 2), (0, 0, 2), (1, 1, 3), (1, 1, 3), (1, 1, 3), (1, 1, 3), (0, 0, 2), (0, 0, 2), (1, 1, 3)]
As you might guess, the actual file is much longer, and I don't know how to convert all this RGB data into an image. I tried using a Python script on Lubuntu 12.04, but it gave me errors and I didn't get the image I was looking for.
Thanks for the help in advance!
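No answer is recorded here. One possible approach that needs only the standard library is to write the pixels out as a binary PPM (Netpbm P6) file, which tools such as GIMP or ImageMagick can open. This is a sketch under assumptions: the file content is a Python-style list of (R, G, B) tuples with channel values from 0 to 255, and the image width, which the data does not state, has to be supplied or guessed (for instance by trying divisors of the pixel count). `rgb_text_to_ppm` is a hypothetical name.

```python
import ast


def rgb_text_to_ppm(text, width):
    """Turn a textual list of (R, G, B) tuples into binary PPM (P6) bytes."""
    pixels = ast.literal_eval(text)  # safe parsing of Python literals only
    if len(pixels) % width != 0:
        raise ValueError("pixel count is not a multiple of the assumed width")
    height = len(pixels) // width
    header = f"P6\n{width} {height}\n255\n".encode("ascii")
    # bytes() raises ValueError if any channel is outside 0..255.
    body = bytes(channel for pixel in pixels for channel in pixel)
    return header + body


sample = "[(0, 0, 2), (0, 0, 2), (1, 1, 3), (1, 1, 3)]"
ppm = rgb_text_to_ppm(sample, width=2)  # width=2 is an assumption

# To save it: open("out.ppm", "wb").write(ppm)
print(len(ppm))
```

If the result looks sheared or scrambled, the assumed width is probably wrong, and another divisor of the pixel count is worth trying.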

How to convert 1D numpy array (made using .genfromtxt() method) to a 2D array?

I am new to numpy and I am trying to generate an array from a CSV file. I was informed that the .genfromtxt method works well in generating an array and automatically detecting and ascribing dtypes. The formula seemingly did this without flaws until I checked the shape of the array.
import numpy as np
taxi = np.genfromtxt("nyc_taxis.csv", delimiter=",", dtype = None, names = True)
taxi.shape
[out]: (89560,)
I believe this shows that my dataset is now a 1D array. The tutorial I am working on in class ends with taxi.shape as (89560, 15), but it uses a long, tedious for loop and then converts certain columns to floats. I want to try to learn a more efficient way.
The first few lines of the array are
array([(2016, 1, 1, 5, 0, 2, 4, 21. , 2037, 52. , 0.8, 5.54, 11.65, 69.99, 1),
(2016, 1, 1, 5, 0, 2, 1, 16.29, 1520, 45. , 1.3, 0. , 8. , 54.3 , 1),
(2016, 1, 1, 5, 0, 2, 6, 12.7 , 1462, 36.5, 1.3, 0. , 0. , 37.8 , 2),
(2016, 1, 1, 5, 0, 2, 6, 8.7 , 1210, 26. , 1.3, 0. , 5.46, 32.76, 1),
(2016, 1, 1, 5, 0, 2, 6, 5.56, 759, 17.5, 1.3, 0. , 0. , 18.8 , 2),
(2016, 1, 1, 5, 0, 4, 2, 21.45, 2004, 52. , 0.8, 0. , 52.8 , 105.6 , 1),
(2016, 1, 1, 5, 0, 2, 6, 8.45, 927, 24.5, 1.3, 0. , 6.45, 32.25, 1),
(2016, 1, 1, 5, 0, 2, 6, 7.3 , 731, 21.5, 1.3, 0. , 0. , 22.8 , 2),
(2016, 1, 1, 5, 0, 2, 5, 36.3 , 2562, 109.5, 0.8, 11.08, 10. , 131.38, 1),
(2016, 1, 1, 5, 0, 6, 2, 12.46, 1351, 36. , 1.3, 0. , 0. , 37.3 , 2)],
So I can see from the results that each row has 15 comma-separated values (i.e. 15 columns), but the shape tells me that there are only 89560 rows and no columns. Am I reading this wrong? Is there a way to transform the shape of my taxi array to reflect the true number of columns (i.e. 15), as in the CSV file?
Any and all help is appreciated
You can use this function to convert your structured array to an unstructured one with the data type you want (assuming all fields can share one data type; if not, keeping the array structured is better):
import numpy.lib.recfunctions as rfn
taxi = rfn.structured_to_unstructured(taxi, dtype=np.float64)
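To see why the shape comes out as (89560,) and what the conversion changes, here is a small sketch on a two-row array. The field names are made up for illustration and are not the real CSV headers:

```python
import numpy as np
import numpy.lib.recfunctions as rfn

# With names=True, genfromtxt returns a structured array: one record per row,
# with the columns stored as named fields inside the dtype.
records = np.array(
    [(2016, 1, 21.0), (2016, 1, 16.29)],
    dtype=[("pickup_year", "i8"), ("pickup_month", "i8"), ("trip_distance", "f8")],
)
print(records.shape)         # 1D: the fields are not counted as an axis
print(records.dtype.names)   # the "columns" live here

# Converting to a plain 2D array casts every field to one common dtype.
table = rfn.structured_to_unstructured(records, dtype=np.float64)
print(table.shape)
```

The 1D shape is therefore expected rather than a parsing error: a column is still reachable by name, e.g. `records["trip_distance"]`.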

Populate Defined Named Range with multi-element array of multi-element arrays

I have defined 5 arrays.
One with undefined dimensions to store the other 4:
Dim outputArr() As Variant
and the rest as follows:
Dim Arr1(5, 0), Arr2(12, 0), Arr3(5, 0), Arr4(12, 0) As Variant
I assign the elements of the latter as follows:
Arr1(0, 0) = [{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}]
Arr1(1, 0) = [{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}]
Arr1(2, 0) = [{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}]
Arr1(3, 0) = [{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}]
Arr1(4, 0) = [{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}]
Arr1(5, 0) = [{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}]
The above is applied to each array.
When I use
ReDim outputArray(3, 0)
outputArr = [{Arr1, Arr2, Arr3, Arr4}]
I get a 'Type Mismatch' error.
When I do not use Evaluate and assign without ReDim
outputArr = Array(Arr1, Arr2, Arr3, Arr4)
I can see the elements and their values in the Watch window, but when I try to populate Defined Named Ranges with the elements of outputArr I get an empty output
Range("nRange1name").Value = outputArr(0)
Range("nRange2name").Value = outputArr(1)
Range("nRange3name").Value = outputArr(2)
Range("nRange4name").Value = outputArr(3)
How can I work around this?
The use of variants in the OP code introduces unnecessary dimensions. I don't understand why two Transpose calls are needed, but the following code pastes 2D arrays satisfactorily.
Option Explicit
Sub TestArrays()
Dim outputArr As Variant
Dim Arr1 As Variant
Dim Arr2 As Variant
Dim Arr3 As Variant
Dim Arr4 As Variant
Arr1 = Array(Array(1, 2, 3, 4, 5, 6, 7, 8, 9), Array(1, 2, 3, 4, 5, 6, 7, 8, 9))
Arr2 = Array(Array(1, 2, 3, 4, 5, 6, 7, 8, 9), Array(1, 2, 3, 4, 5, 6, 7, 8, 9))
Arr3 = Array(Array(1, 2, 3, 4, 5, 6, 7, 8, 9), Array(1, 2, 3, 4, 5, 6, 7, 8, 9))
Arr4 = Array(Array(1, 2, 3, 4, 5, 6, 7, 8, 9), Array(1, 2, 3, 4, 5, 6, 7, 8, 9))
outputArr = Array(Arr1, Arr2, Arr3, Arr4)
' For horizontal ranges (2 rows x 9 columns)
Range("A1:I2") = Application.WorksheetFunction.Transpose(Application.WorksheetFunction.Transpose(outputArr(2)))
' For vertical ranges (9 rows x 2 columns)
Range("A4:B12") = Application.WorksheetFunction.Transpose(outputArr(3))
End Sub
You need to construct an actual 2D array to do something like that.
Dim arr(1 To 6, 1 To 12)
Dim r As Long, c As Long
For r = LBound(arr, 1) To UBound(arr, 1)
    For c = LBound(arr, 2) To UBound(arr, 2)
        arr(r, c) = 0
    Next c
Next r
Range("A1").Resize(UBound(arr, 1), UBound(arr, 2)).Value = arr

Filling empty list of lists with zeros to get a fixed size list of 5 tuples

I have a dataset of 1000 samples. Each sample contains 18 lists of variable length, some of which are empty.
Here is a sample:
len(My_list)
18
print(My_list)
array([list([(17, 163, 0.11258018, 15),(78, 193, 0.99713018, 17),(478, 94, 0.7299528, 2), (63, 268, 0.77531445, 3), (169, 279, 0.7947326, 4),(456, 140, 0.65013665, 7), (61, 301, 0.7433308, 8)]),
list([]),
list([]),
list([]),
list([]),
list([]),
list([]),
list([]),
list([(63, 176, 0.18713018, 0),(199, 185, 0.88743243, 79), (282, 75, 0.752135, 84)]),
list([(62, 185, 0.13743243, 1)]),
list([]),
list([(67, 156, 0.14346971, 2)]),
list([(2, 15, 0.00639179, 3)]),
list([]),
list([]),
list([]),
list([]),
list([])],
dtype=object)
What would I like to do?
For each list:
1- Keep the first 5 tuples.
2- If a list is empty, then create a list of five tuples as follows:
list([(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0)]).
3- If a list is not empty but contains fewer than 5 elements, then pad it to five elements. For example, My_list[11] contains only one element, list([(67, 156, 0.14346971, 2)]), hence:
My_list[11]=list([(67, 156, 0.14346971, 2),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0)])
The expected output:
array([list([(17, 163, 0.11258018, 15),(78, 193, 0.99713018, 17),(478, 94, 0.7299528, 2), (63, 268, 0.77531445, 3), (169, 279, 0.7947326, 4)]),
list([(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0)]),
list([(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0)]),
list([(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0)]),
list([(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0)]),
list([(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0)]),
list([(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0)]),
list([(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0)]),
list([(63, 176, 0.18713018, 0),(199, 185, 0.88743243, 79), (282, 75, 0.752135, 84),(0,0,0,0),(0,0,0,0)]),
list([(62, 185, 0.13743243, 1),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0)]),
list([(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0)]),
list([(67, 156, 0.14346971, 2),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0)]),
list([(2, 15, 0.00639179, 3),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0)]),
list([(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0)]),
list([(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0)]),
list([(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0)]),
list([(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0)]),
list([(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0),(0,0,0,0)])],
dtype=object)
What have I tried?
My_list=np.asarray(My_list)
My_list = [joint if len(joint) != 0 else [(0, 0, 0,0)] for joint in My_list]
However, it doesn't do the job. It fills empty lists with only a single (0,0,0,0) and leaves lists with one or more elements untouched. The goal is to pad every empty list, and every list with fewer than five elements, with (0,0,0,0) so that each list has five elements.
Any clue?
Here is one way: Glue 5 tuples to everything and trim later:
>>> ml
array([list([(17, 163, 0.11258018, 15), (78, 193, 0.99713018, 17), (478, 94, 0.7299528, 2), (63, 268, 0.77531445, 3), (169, 279, 0.7947326, 4), (456, 140, 0.65013665, 7), (61, 301, 0.7433308, 8)]),
list([]), list([]), list([]), list([]), list([]), list([]),
list([]),
list([(63, 176, 0.18713018, 0), (199, 185, 0.88743243, 79), (282, 75, 0.752135, 84)]),
list([(62, 185, 0.13743243, 1)]), list([]),
list([(67, 156, 0.14346971, 2)]), list([(2, 15, 0.00639179, 3)]),
list([]), list([]), list([]), list([]), list([])], dtype=object)
>>>
>>> z = np.array([None, 5*[4*(0,)]], dtype=object)[[1]]
>>> z
array([list([(0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)])],
dtype=object)
>>>
>>> res = np.frompyfunc(list.__getitem__, 2, 1)(ml + z, slice(5))
>>> res
array([list([(17, 163, 0.11258018, 15), (78, 193, 0.99713018, 17), (478, 94, 0.7299528, 2), (63, 268, 0.77531445, 3), (169, 279, 0.7947326, 4)]),
list([(0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)]),
list([(0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)]),
list([(0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)]),
list([(0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)]),
list([(0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)]),
list([(0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)]),
list([(0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)]),
list([(63, 176, 0.18713018, 0), (199, 185, 0.88743243, 79), (282, 75, 0.752135, 84), (0, 0, 0, 0), (0, 0, 0, 0)]),
list([(62, 185, 0.13743243, 1), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)]),
list([(0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)]),
list([(67, 156, 0.14346971, 2), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)]),
list([(2, 15, 0.00639179, 3), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)]),
list([(0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)]),
list([(0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)]),
list([(0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)]),
list([(0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)]),
list([(0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)])],
dtype=object)
Explanation: arrays of object dtype delegate operations like addition to their elements. Therefore ml + z concatenates each original list with a copy of the 5x4 zeros.
Next we only need to cut every list back to 5 elements. The operation somelist[:5] can be written as somelist.__getitem__(slice(5)) or even as list.__getitem__(somelist, slice(5)). This last form is what we "vectorize" using np.frompyfunc.
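The equivalence of the three spellings can be checked in plain Python on one short list (values taken from the question):

```python
pad = [(0, 0, 0, 0)] * 5
row = [(62, 185, 0.13743243, 1)]

padded = row + pad  # list addition concatenates: 1 real tuple + 5 pad tuples

a = padded[:5]                            # ordinary slicing
b = padded.__getitem__(slice(5))          # the method call the slice syntax makes
c = list.__getitem__(padded, slice(5))    # same method, called via the class

assert a == b == c
print(a)
```

The last form takes the list as an ordinary first argument, which is what lets np.frompyfunc apply it to every element of the object array.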
This is a variant of @PaulP's answer (and @Eir's comment). It's close enough that I wouldn't post it, except that it is faster (and possibly clearer).
Define a function that operates on one list at a time - using that idea of adding the pad, and stripping off unneeded elements:
In [209]: z = [4*(0,) for _ in range(5)]
In [210]: def foo(alist):
     ...:     return (alist + z)[:5]
This can be applied to each list via list comprehension:
In [211]: [foo(row) for row in arr]
Out[211]:
[[(17, 163, 0.11258018, 15),
(78, 193, 0.99713018, 17),
(478, 94, 0.7299528, 2),
(63, 268, 0.77531445, 3),
(169, 279, 0.7947326, 4)],
[(0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)],
....
[(0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)]]
But if you want an object array, @Paul's approach using frompyfunc works nicely:
In [212]: np.frompyfunc(foo,1,1)(arr)
Out[212]:
array([list([(17, 163, 0.11258018, 15), (78, 193, 0.99713018, 17), (478, 94, 0.7299528, 2), (63, 268, 0.77531445, 3), (169, 279, 0.7947326, 4)]),
list([(0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0), (0, 0, 0, 0)]),
.... dtype=object)
Timings:
In [176]: timeit np.frompyfunc(list.__getitem__, 2, 1)(arr + z, slice(5))
14.8 µs ± 18.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [184]: timeit [foo(row) for row in arr]
7.6 µs ± 26.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [213]: timeit np.frompyfunc(foo,1,1)(arr)
8.49 µs ± 27.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
