Design of C++ dynamic arrays


I am designing some classes for dynamic arrays (something like a std::vector). The reason I don't want to use std::vector is that my C++ programs are often used as a library called from C/C++/Fortran/Delphi and therefore take input arrays as pointers. For security reasons, a std::vector can't steal a pointer at construction time. My Array1D can work like a std::vector but can also be constructed from a pointer. Unfortunately, Visual Studio 2013 seems to be worried about my design. Before presenting the problem, I need to explain this design.
Here is the layout of my class:
template <typename T>
class Array1D {
private:
T* data_;
T* size_; // Not stored as an int for optimisation purposes
bool owner_;
public:
Array1D(int n) {
data_ = new T[n];
size_ = data_ + n;
owner_ = true;
}
Array1D(T* data, int n) {
data_ = data;
size_ = data + n;
owner_ = false;
}
...
};
Most of the time, it works as a std::vector and owner_ is set to true. One can also construct an Array1D from a pointer, and this time owner_ is set to false. In this case, some operations such as resizing are not allowed (enforced through an assert). The copy constructor and assignment operators for the array A are designed as follows:
Array1D(const Array1D& B) : Deep copy of B into A. After construction, A owns its memory.
Array1D(Array1D&& B) : Move operation in all cases. After construction, the ownership status of A is the same as that of B.
operator=(const Array1D& B) : Deep copy of B into A. If A does not own its memory, an assert checks that A and B have the same size. The assignment does not change the ownership status of A.
operator=(Array1D&& B) : Move operation if A and B both own their memory. Otherwise, we do a deep copy, and the sizes are checked with an assert if A does not own its memory. The assignment does not change the ownership status of A.
I have applied the same idea to my 2-dimensional array, whose elements are stored in row-major order:
template <typename T>
class Array2D {
private:
T* data_;
T* size_[2];
bool owner_;
public:
Array2D(int n, int p) {
data_ = new T[n * p]; // n rows of p columns
size_[0] = data_ + n;
size_[1] = data_ + p;
owner_ = true;
}
Array2D(T* data, int n, int p) {
data_ = data;
size_[0] = data + n;
size_[1] = data + p;
owner_ = false;
}
...
Array1D<T> operator()(int i) {
Array1D<T> row(data_ + i * nb_columns(), nb_columns());
return row;
}
...
int nb_columns() const {
return static_cast<int>(size_[1] - data_);
}
};
The Array1D returned by operator()(int i) does not own its memory and contains a pointer to the i-th row owned by the Array2D object. It is useful in the following kind of code:
sort(Array1D<T>& A); // A function that sorts array in place
Array2D<double> matrix(5, 100); // Construct an array of 5 rows and 100 columns
... // Fill the array
sort(matrix(3)); // Sort the 4th row
Those "temporary views" of the rows of a 2-dimensional array are quite useful, but I prefer to restrict them to temporary objects to limit aliasing.
Unfortunately, using Visual Studio 2013, I get the following warning from the IDE for the line sort(matrix(3)): "Options for binding r-value to l-value reference is non-standard Microsoft C++ extension". I understand that matrix(3) is an object that lives only temporarily and that modifying it through a sort looks strange. But, as it is a "view", modifying it modifies the memory owned by matrix, which is useful.
So my questions are the following:
Is what I am doing valid C++? (modifying a temporary value)
Is there a flaw in this design?
PS: The full code is available on Github.

Is what I am doing valid C++? (modifying a temporary value)
It's not. Non-const lvalue references cannot bind to temporary objects:
http://herbsutter.com/2008/01/01/gotw-88-a-candidate-for-the-most-important-const/
Is there a flaw in this design?
You are modifying the contents wrapped by the view, not the view object itself, but sort takes a non-const lvalue reference and therefore needs an lvalue. The simplest fix is to name the view in an intermediate variable:
auto m_temp_lvalue = matrix(3); // get the 4th row
sort(m_temp_lvalue); // sort the 4th row
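Another option is to change what the function accepts. The following is only a minimal sketch, using a hypothetical RowView type and hypothetical sort_row and row_of helpers (none of them are the asker's Array1D or Array2D). It assumes the view is cheap to copy, so the function takes it by value and accepts temporaries and named views alike:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical non-owning view over a contiguous run of doubles.
struct RowView {
    double* first;
    std::size_t count;
};

// Takes the view by value: copying it copies only a pointer and a length,
// and the elements it refers to are still sorted in place.
inline void sort_row(RowView row) {
    assert(row.first != nullptr || row.count == 0);
    std::sort(row.first, row.first + row.count);
}

// Hypothetical stand-in for matrix(i): returns a temporary view of row i
// of a row-major buffer with `columns` elements per row.
inline RowView row_of(double* data, std::size_t columns, std::size_t i) {
    return RowView{data + i * columns, columns};
}
```

With this signature, sort_row(row_of(m, 3, 1)) is standard C++ and needs no intermediate variable. A parameter of type RowView&& would also accept the temporary, but would then reject named views unless they are passed through std::move.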
I hope it helps.

Related

Specify layout of global variables in C

Consider this piece of code, where two global variables are defined:
int a;
int b;
As far as I know, the compiler may or may not place a and b in adjacent memory locations (please let me know if this is incorrect). For example, with GCC one may compile with -fdata-sections and reorder the two sections or whatever.
Is it possible to specify that a and b must be adjacent (in the sense that &a + 1 == &b), in either standard or GNU extended C?
Background: I am making an OpenGL loader, which is literally (omitting casts):
void (*glActiveShaderProgram)(GLuint, GLuint);
void (*glActiveTexture)(GLenum);
...
void load_gl(void (*(*load)(char *))()) {
glActiveShaderProgram = load("glActiveShaderProgram");
glActiveTexture = load("glActiveTexture");
...
}
Simple enough, but every one of these lines compiles into its own call to load. Since there is a relatively large number of functions to load, that can take up a lot of code space. (That is the reason I dropped glad.)
So I had something like this, which reduces binary size by ~30kB, which is extremely important for me:
char names[] = "glActiveShaderProgram glActiveTexture ...";
char *p = names, *pp;
for (int i = 0; i < COUNT; ++i) {
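pp = strchr(p, ' '); // search from the current name, not from the start
if (pp) *pp = '\0'; // the last name has no trailing space
(&glActiveShaderProgram)[i] = load(p);
p = pp ? pp + 1 : NULL;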
}
But this does assume the specific layout of these function pointers. Currently I wrap the function pointers in a struct which is type-punned into an array of pointers, like this:
union { struct {
void (*glActiveShaderProgram)(GLuint, GLuint);
void (*glActiveTexture)(GLenum);
...
}; void (*table[COUNT])(); } gl;
But then one #define for every function is required to make the user happy. So I wonder if there exists some more elegant way to specify the layout of global variables.

As Ted suggested in the comments, you could put the variables next to each other inside an array:
int ab[2] = {a, b};
Another way to ensure adjacent memory placement is with a packed struct.
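As a rough illustration of the struct approach, here is a small C++ sketch. The Globals type and the g variable are hypothetical names introduced only for this example. Struct members are laid out in declaration order, and the static_assert makes the build fail if the implementation ever inserts padding between the two members:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical grouping of the two globals from the question.
struct Globals {
    int a;
    int b;
};

// Compile-time check that b starts exactly where a ends.
static_assert(offsetof(Globals, b) == offsetof(Globals, a) + sizeof(int),
              "a and b must be adjacent");

Globals g = {1, 2};

// Runtime view of the same fact: byte distance between the two members.
inline std::ptrdiff_t distance_a_to_b(const Globals& s) {
    const char* pa = reinterpret_cast<const char*>(&s.a);
    const char* pb = reinterpret_cast<const char*>(&s.b);
    assert(pb > pa);
    return pb - pa;
}
```

The cost is the one the question already mentions: users must write g.a and g.b (or go through macros) instead of plain a and b.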

Using smart pointers for array

How can I create a smart pointer to an array of double? I want to convert this expression:
double* darr = new double[N]; // Notice the square brackets
to use the smart pointer auto_ptr.
The following instruction doesn't work:
auto_ptr<double[]> darrp(new double[N]);
Also, how do I get the values of the array using the smart pointer?
Thanks
Younès

You can't do this with std::auto_ptr, as auto_ptr does not have a specialization for arrays.*
Although auto_ptr doesn't allow this, you can use std::tr1::shared_ptr as a smart pointer to an array:
#include <tr1/memory>
std::tr1::shared_ptr<double[]> d(new double[10]);
This will compile, but shared_ptr will incorrectly call delete (instead of delete[]) on your array which is undesirable, so you will need to provide a custom deleter.
The answer here provides the code that you will need (copied verbatim), although the answer is for C++11:
template< typename T >
struct array_deleter
{
void operator ()( T const * p)
{
delete[] p;
}
};
std::shared_ptr<int> sp( new int[10], array_deleter<int>() );
Which for you, means you will need:
std::tr1::shared_ptr<double> d( new double[10], array_deleter<double>() );
To access the elements of your smart pointer array, you first need to call get() to obtain the raw pointer, and then index that pointer:
std::tr1::shared_ptr<double> d( new double[10], array_deleter<double>() );
for (size_t n = 0; n < 10; ++n)
{
d.get()[n] = 0.2 * n;
std::cout << d.get()[n] << std::endl;
}
* Although your question is about C++03, it's worth noting that std::unique_ptr (C++11) does have a partial specialization for arrays, allowing this:
std::unique_ptr<double[]> d(new double[10]); // this will correctly call delete[]
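For completeness, a short sketch of how element access looks with the array specialization, assuming a C++11 compiler. The helper names make_ramp and element_at, and the 0.5 * i fill pattern, are invented for this illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Allocates n doubles and stores 0.5 * i in element i.
inline std::unique_ptr<double[]> make_ramp(std::size_t n) {
    std::unique_ptr<double[]> d(new double[n]);  // released with delete[]
    for (std::size_t i = 0; i < n; ++i) {
        d[i] = 0.5 * static_cast<double>(i);     // operator[] works directly, no get()
    }
    return d;
}

// unique_ptr<T[]> does not remember the length, so the caller passes it in.
inline double element_at(const std::unique_ptr<double[]>& d,
                         std::size_t n, std::size_t i) {
    assert(i < n);
    return d[i];
}
```

Unlike the shared_ptr version above, no custom deleter and no get() call are needed, but the array length still has to be tracked separately.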

Smart Pointers in a language that compiles to C

I'm writing a simple language that compiles to C, and I want to implement smart pointers. I need a bit of help with that, though, as I can't think of how I would go about it, or whether it's even possible. My current idea is to free the pointer when it goes out of scope, with the compiler inserting the frees. This leads to my questions:
How would I tell when a pointer has gone out of scope?
Is this even possible?
The compiler is written in C, and compiles to C. I thought that I could check when the pointer goes out of scope at compile-time, and insert a free into the generated code for the pointer, i.e:
// generated C code.
int main() {
int *x = malloc(sizeof(*x));
*x = 5;
free(x); // inserted by the compiler
}
The scoping rules (in my language) are exactly the same as C.
My current setup is a standard compiler pipeline: it lexes the file contents, parses the token stream, semantically analyzes it, and then generates C code. The parser is a recursive descent parser. I would like to avoid anything that happens at run time, i.e. I want a compile-time check that has little to no overhead and isn't full-blown garbage collection.

For functions, each { starts a new scope, and each } closes the corresponding scope. When a } is reached, the variables declared inside that block go out of scope. Members of structs go out of scope when the struct instance goes out of scope. There are a couple of exceptions: temporary objects go out of scope at the next ;, and compilers silently put for loops inside their own block scope.
struct thing {
int member;
};
int foo;
int main() {
thing a;
{
int b = 3;
for(int c=0; c<b; ++c) {
int d = rand(); //the return value of rand goes out of scope after assignment
} //d and c go out of scope here
} //b goes out of scope here
}//a and its members go out of scope here
//globals like foo go out-of-scope after main ends
C++ tries really hard to destroy objects in the opposite order they're constructed, you should probably do that in your language too.
(This is all from my knowledge of C++, so it might be slightly different from C, but I don't think it is)
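The reverse-order rule can be observed directly in C++. The sketch below uses a hypothetical Tracer type that appends its name to a log when it is destroyed; a compiler for the new language would emit its inserted free calls in the same order:

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Illustrative helper: records its name in a shared log when destroyed.
struct Tracer {
    std::string name;
    std::vector<std::string>* log;
    Tracer(std::string n, std::vector<std::string>* l)
        : name(std::move(n)), log(l) {
        assert(log != nullptr);
    }
    ~Tracer() { log->push_back(name); }
};

// Returns the names of a, b and c in the order they were destroyed.
inline std::vector<std::string> destruction_order() {
    std::vector<std::string> log;
    {
        Tracer a("a", &log);
        {
            Tracer b("b", &log);
            Tracer c("c", &log);
        }  // c is destroyed first, then b
    }      // a is destroyed last
    return log;
}
```

The returned log is c, b, a: the inner block is torn down before the outer one, and within a block the most recently constructed object goes first.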
As for memory, you'll probably want to do a little magic behind the scenes. Whenever the user mallocs memory, you replace it with something that allocates more memory, and "hide" a reference count in the extra space. It's easiest to do that at the beginning of the allocation, and to keep alignment guarantees, you use something akin to this:
typedef union {
long double f;
void* v;
char* c;
unsigned long long l;
} bad_alignment;
void* ref_count_malloc(size_t bytes)
{
char* p = malloc(bytes + sizeof(bad_alignment)); //extra room for the hidden count
if (p == NULL)
return NULL;
*(int*)p = 1; //now is 1 pointer pointing at this block
return p + sizeof(bad_alignment); //hand out the memory just past the count
}
When they copy a pointer, you silently add something akin to this before the copy
void copy_pointer(void* from, void* to) {
if (to != NULL) {
int* ref_count = (int*)((char*)to - sizeof(bad_alignment));
++*ref_count; //one additional pointing at this block
}
if (from != NULL)
ref_count_free(from); //no longer points at previous block
}
And when they free or a pointer goes out of scope, you add/replace the call with something like this:
void ref_count_free(void* ptr) {
if(ptr) {
int* ref_count = (int*)((char*)ptr - sizeof(bad_alignment));
if (--*ref_count == 0) //if no more pointing at this block
free(ref_count); //free from the start of the allocation, count included
}
}
If you have threads, you'll have to add locks to all that. My C is rusty and the code is untested, so do a lot of research on these concepts.
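The same hidden-header idea, written as a compact single-threaded C++ sketch. All names here (RcHeader, rc_alloc, rc_retain, rc_release, rc_count) are hypothetical, and alignas(std::max_align_t) is used in place of the alignment union above, which assumes a C++11 compiler:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Header stored in front of every block. The alignas pads it so that the
// payload right after it stays aligned for any ordinary type.
struct alignas(std::max_align_t) RcHeader {
    std::size_t count;
};

inline RcHeader* rc_header(void* payload) {
    return reinterpret_cast<RcHeader*>(static_cast<char*>(payload) - sizeof(RcHeader));
}

// Allocates `bytes` of payload with a reference count of 1.
inline void* rc_alloc(std::size_t bytes) {
    void* raw = std::malloc(sizeof(RcHeader) + bytes);
    if (raw == nullptr) return nullptr;
    RcHeader* h = new (raw) RcHeader{1};
    return reinterpret_cast<char*>(h) + sizeof(RcHeader);
}

// Called when one more pointer starts referring to the block.
inline void rc_retain(void* payload) {
    if (payload != nullptr) ++rc_header(payload)->count;
}

// Called when a pointer stops referring to the block.
// Returns true if this was the last reference and the block was freed.
inline bool rc_release(void* payload) {
    if (payload == nullptr) return false;
    RcHeader* h = rc_header(payload);
    assert(h->count > 0);
    if (--h->count == 0) {
        std::free(h);  // h is the address malloc returned
        return true;
    }
    return false;
}

inline std::size_t rc_count(void* payload) { return rc_header(payload)->count; }
```

A compiler for the language would emit rc_retain on every pointer copy and rc_release at every scope exit or reassignment. As noted above, a threaded runtime would additionally need locking or atomic counters.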

Your example is straightforward, but the general problem is harder: what if another pointer is made to point to the same place as x?
// generated C code.
int main() {
int *x = malloc(sizeof(*x));
int *y = x;
*x = 5;
free(x); // inserted by the compiler, now wrong
}
You will doubtless have a heap structure in which each block has a header that records a) whether the block is in use and b) the size of the block. This can be done with a small structure, or by using the highest bit of the integer that stores b) to hold a) [is this a 64-bit or a 32-bit compiler?]. For simplicity, let's consider:
typedef struct {
bool allocated: 1;
size_t size;
} BlockHeader;
You would have to add another field to that small structure: a reference count. Each time a pointer is made to point to that block in the heap, you increment the reference count. When a pointer stops pointing to the block, its reference count is decremented. If it reaches 0, the block can be freed, compacted or whatever. The allocated field is then no longer needed, since a count of 0 already marks the block as unused.
typedef struct {
size_t size;
size_t referenceCount;
} BlockHeader;
Reference counting is quite simple to implement, but comes with a downside: there is overhead each time the value of a pointer changes. Still, it is the simplest scheme that works, and that's why some programming languages, such as Python, still use it.

Dynamically allocate array of file pointers

Is it possible to 'dynamically' allocate file pointers in C?
What I mean is this :
FILE **fptr;
fptr = (FILE **)calloc(n, sizeof(FILE*));
where n is an integer value.
I need an array of pointer values, but I don't know how many before I get a user-input, so I can't hard-code it in.
Any help would be wonderful!

You're trying to implement what's sometimes called a flexible array (or flex array), that is, an array that changes size dynamically over the life of the program. Such an entity doesn't exist in C's native type system, so you have to implement it yourself. In the following, I'll assume that T is the type of element in the array, since the idea doesn't have anything to do with any specific type of content. (In your case, T is FILE *.)
More or less, you want a struct that looks like this:
struct flexarray {
T *array;
int size;
};
and a family of functions to initialize and manipulate this structure. First, let's look at the basic accessors:
T fa_get(struct flexarray *fa, int i) { return fa->array[i]; }
void fa_set(struct flexarray *fa, int i, T p) { fa->array[i] = p; }
int fa_size(struct flexarray *fa) { return fa->size; }
Note that in the interests of brevity these functions don't do any error checking. In real life, you should add bounds-checking to fa_get and fa_set. These functions assume that the flexarray is already initialized, but don't show how to do that:
void fa_init(struct flexarray *fa) {
fa->array = NULL;
fa->size = 0;
}
Note that this starts out the flexarray as empty. It's common to make such an initializer create an array of a fixed minimum size, but starting at size zero makes sure you exercise your array growth code (shown below) and costs almost nothing in most practical circumstances.
And finally, how do you make a flexarray bigger? It's actually very simple:
void fa_grow(struct flexarray *fa) {
int newsize = (fa->size + 1) * 2;
T *newarray = malloc(newsize * sizeof(T));
if (!newarray) {
// handle error
return;
}
if (fa->size > 0) // nothing to copy while the array is still empty
memcpy(newarray, fa->array, fa->size * sizeof(T));
free(fa->array);
fa->array = newarray;
fa->size = newsize;
}
Note that the new elements in the flexarray are uninitialized, so you should arrange to store something to each new index i before fetching from it.
Growing flexarrays by some constant multiplier each time is generally a good idea. If you instead increase its size by a constant increment, you spend quadratic time copying elements of the array around.
I haven't shown the code to shrink an array, but it's very similar to the growth code.
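Here is the same structure as a self-contained C++ sketch. It is one possible arrangement rather than the code above: the FlexArray template, its push and get members, and the separate capacity field are assumptions added for the example, and it is restricted to trivially copyable element types (such as FILE *) so that memcpy remains valid:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <type_traits>

template <typename T>
struct FlexArray {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy-based growth needs a trivially copyable T");

    T* array = nullptr;
    std::size_t size = 0;      // elements in use
    std::size_t capacity = 0;  // elements allocated

    FlexArray() = default;
    FlexArray(const FlexArray&) = delete;             // no accidental double free
    FlexArray& operator=(const FlexArray&) = delete;
    ~FlexArray() { std::free(array); }

    // Appends one element, doubling the storage when it is full.
    // Returns false if the allocation fails; the array is left unchanged.
    bool push(const T& value) {
        if (size == capacity) {
            std::size_t newcap = (capacity + 1) * 2;
            T* grown = static_cast<T*>(std::malloc(newcap * sizeof(T)));
            if (grown == nullptr) return false;
            if (size > 0) std::memcpy(grown, array, size * sizeof(T));
            std::free(array);
            array = grown;
            capacity = newcap;
        }
        array[size++] = value;
        return true;
    }

    // Bounds-checked read, as recommended for fa_get.
    T get(std::size_t i) const {
        assert(i < size);
        return array[i];
    }
};
```

For the original question, FlexArray<FILE *> would hold the file pointers, with each one passed to fclose before the array is destroyed. In real C++ code std::vector<FILE *> already provides this behaviour.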

Anyway, they're just pointers, so you can allocate memory for them,
but don't forget to fclose() each file pointer and then free() the memory.

D dynamic array initialization, stride and the index operation

Sorry, this became a 3-fold question regarding arrays
I think (dynamic) arrays are truly powerful in D, but the following has been bothering me for a while:
In C++ I could easily allocate an array with designated values, but in D I haven't found a way to do so. Surely the following is no problem:
int[] a = new int[N];
a[] = a0;
But it looks inefficient, since line 1 will initialize with 0, and line 2 with a0. Could something similar to the following be done in D?
int[] a = new int(a0)[N]; // illegal
Another efficiency matter I have when using stride in std.range:
import std.stdio;
import std.range;
struct S
{
int x;
this(this)
{
writeln("copy ", x);
}
}
void f(S[] s)
{
}
int main()
{
S[] s = new S[10];
foreach (i, ref v; s)
{
v.x = i;
}
f(stride(s, 3)); // error
return 0;
}
Surely I was naive thinking I could simply use stride to create a new array without copying its elements? There is no way to do so in D, right?
So I went and simulated as if the array was as stride would return, and implemented f as:
f(s, 3);
void f(S[] s, uint stride)
{
ref S get(uint i)
{
assert (i * stride < s.length);
return s[i * stride];
}
for (uint x ... )
{
get(x) = ...;
}
}
Would there be a way to instead write get(x) using the index operator get[x]? This way I could statically mixin / include the striding get function and keep the rest of the function similar. I'd be interested in the approach taken, since a local struct is not allowed to access function scope variables (why not?).

But it looks inefficient, since line 1 will initialize with 0, and line 2 with a0. Could something similar to the following be done in D?
Use std.array.uninitializedArray
S[] s = uninitializedArray!(S[])(N);
s[] = a0;
Surely I was naive thinking I could simply use stride to create a new array without copying its elements? There is no way to do so in D, right?
Your function f has an S[] as an argument, which is different from what stride returns. The D way to solve this is to make your f function accept any range by making it a template:
void f(Range)(Range s)
{
foreach (item; s)
{
// use item
}
}
S[] s = new S[10];
f(s); // works
f(stride(s, 3)); // works too
Alternatively you can copy the array:
f(array(stride(s, 3)));
But you probably want to avoid copying the entire array if it is large.
Would there be a way to instead write get(x) using the index operator get[x]? This way I could statically mixin / include the striding get function and keep the rest of the function similar. I'd be interested in the approach taken, since a local struct is not allowed to access function scope variables (why not?).
You can overload the indexing operator in your own struct.
struct StrideArray
{
this(S[] s, uint stride) { m_array = s; m_stride = stride; }
S opIndex(size_t i) { return m_array[i * m_stride]; }
void opIndexAssign(S value, size_t i) { m_array[i * m_stride] = value; }
private S[] m_array;
private uint m_stride;
}
This is (kind of) the way the actual stride function works. I'd recommend reading up on Ranges.

You can duplicate (create a copy of) an array with .dup (this also works with slices), or you can set the elements with an array initializer:
int[] a=a0.dup;
int[] b=[e1,e2,e3];
You can make f generic (stride() returns a struct that you can iterate over, not an array):
void f(Z)(Z s)if(isInputRange!Z){
foreach(elem;s){
//...
}
}
Remember that arrays are essentially structs with a pointer field to some memory block and a size field.