Modern arrays#
Arrays are useful to store large sets of data that share the same data type. Arrays are very fast to access and modify data, and are useful in several cases, like
Reading and storing data.
Applying the same operation to a large collection of data.
Store data in a systematic way instead of having many variables.
Model vectors, matrices, tensors.
Model vectorial and matrix operations.
Solve large linear problems, linear algebra problems, compute eigen values and eigen vectors.
Solve ODE systems.
Apply finite differences and finite elements.
Vectorize operations and use HPC.
Model data structs, from physical systems to graphical structs and so on.
Manipulate memory contents and program close to the metal.
…
There are many more data structs or containers: lists, dictionaries(maps), queue, binary trees, etc (see https://www.cs.usfca.edu/~galles/visualization/Algorithms.html). Arrays, in particular, are very fast to access and retrieve data from memory. But they are not fast if you need to insert and delete data in the middle. In general, for scientific computing, arrays are the default data struct, but you need always to consider which one would be the best for your case.
Since c++11 it is possible to use modern data structs that are equivalent to the
primitive data arrays but give a better user interface, also handling
automatically the memory. Although there are two analog structures, std::array
and std::vector (and in the future we will use std::valarray), we will focus
mainly on the std::vector
since it handles dynamic memory (std::array
is for
static arrays). The key here is that these data structures create a continuous
block memory block that can be accessed very fast: https://hackingcpp.com/cpp/std/sequence_containers.html

The following example shows the basic features for both of them and also a new
kind of for
. It is recommended to run this on a graphical debugger , either
locally in emacs or visual code, or online on the python tutor or onlinegdb, at
the link https://onlinegdb.com/CKeYkLEiz .
#include <iostream>
#include <array>
#include <vector>
int main(void)
{
std::array<double, 10> data1;
data1[5] = 9.87;
data1[0] = -7.8765;
data1[9] = 1000.343;
std::cout << data1[5] << "\n";
for (auto val : data1) {
std::cout << val << "\n";
}
std::vector<double> data2;
data2.resize(4); // size = 4, capacity 4
data2[3] = 9.87;
data2[0] = -7.8765;
data2[1] = 1000.343;
std::cout << data2[2] << std::endl;
for (auto & val : data2) { // reference
val *= 2; // modify the value
std::cout << val << "\n";
}
data2.resize(5); // size = 5, capacity = 8 (double the initial, to avoid frequent mallocs)
data2.push_back(0.987); // size = 6, capacity = 8
data2.push_back(12343.987); // size = 7, capacity = 8
data2.push_back(-0.000987); // size = 8, capacity = 8
data2.push_back(3.986544); // size = 9, capacity = 16
return 0;
}
As you can see, a vector is a dynamc array of homogeneous contigous elements.
Initial exercises#
Find the maximum size you can use for a
std::array
and astd::vector
. Are they the same?Print the address of the first and second elements in a
std::array
andstd::vector
. Are they contigous? To print the memory address of the first element use the&
operator as&data[0]
Use the algorithm foreach with a lambda function to divide by two each element in an array. Use also for each to print the elements and their memory address.
Using std::array
: Fixed size, goes to the stack, automatic for#
// g++ -g -fsanitize=address
#include <iostream>
#include <array>
int main(void)
{
const int N = 10;
//double data[N] {0};
std::array<double, N> data;
std::cout << "size: " << data.size() << std::endl;
std::cout << "max_size: " << data.max_size() << std::endl;
std::cout << "&data[0]: " << &data[0] << std::endl;
std::cout << "&data[1]: " << &data[1] << std::endl;
//std::cout << data[-1] << std::endl; // detected by sanitizers address
// initialize the array with even numbers
for(int ii = 0; ii < N; ++ii) {
data[ii] = 2*ii;
}
// print the array
for(const auto & val : data) {
std::cout << val << " ";
}
std::cout << "\n";
// TODO: compute the mean
std::cout << "Average = " << sum/data.size() << std::endl;
return 0;
}
You might visualize this in the python/c++ tutor.
And we can simplified the code even more if we use the standard library:
// g++ -g -fsanitize=address
#include <numeric>
#include <algorithm>
#include <iostream>
#include <array>
int main(void)
{
const int N = 10;
//double data[N] {0};
std::array<double, N> data;
std::cout << "size: " << data.size() << std::endl;
std::cout << "max_size: " << data.max_size() << std::endl;
std::cout << "&data[0]: " << &data[0] << std::endl;
std::cout << "&data[1]: " << &data[1] << std::endl;
//std::cout << data[-1] << std::endl; // detected by sanitizers address
// initialize the array with even numbers
int ii = 0 ;
auto init = [&ii](double & x){ x = 2*ii; ii++; };
std::for_each(data.begin(), data.end(), init);
// print the array
auto print = [](const int & x) { std::cout << x << " " ; };
std::for_each(data.begin(), data.end(), print);
std::cout << "\n";
// compute mean
double avg = std::accumulate(data.begin(), data.end(), 0.0)/data.size();
std::cout << "Average = " << avg << std::endl;
}
Exercises#
Find the largest
std::array
size you can get in your system. Is it related to the ram size? check the commandulimit -s
Write a program that computes the dot product between two
std::arrays
. You can use either indexes or theinner_product
ortransform_reduce
algorithms.Can you read the
std::array
size from the command line usingargv
? try it.
Using std::vector
: Dynamic size, uses the heap#
Vectors allow to use dynamic size arrays very easily and are the recommend data structure for our current math vector and matrix applications. You can see more info here:
https://hackingcpp.com/cpp/std/sequence_containers.html#vector
https://www.learncpp.com/cpp-tutorial/an-introduction-to-stdvector/
https://www.learncpp.com/cpp-tutorial/stdvector-capacity-and-stack-behavior/
To declare a vector of double
, you just use
...
std::vector<double> data1(1000); // 1000 doubles
std::vector<double> data2; // no size yet
data2.resize(1200); // reserve space to store 1200 double
std::vector<double> data3 {1, 2, 3}; // 3 ints, type deduced
Exploration : Declaration and operation on a vector#
Declare a vector of double whose size is read from the command line as an argument. Fill it as
Compute the average.
Exploration: Vectors and functions#
Same as before, but write a function to fill the vector and another to compute the vector average. Print the average in scientific notation. Test it with several sizes.
Exploration: Vectors and lambda functions#
Same as before, but use a lambda function to fill the vector.
Exploration: Vectors and std
algorithms#
Same as before, but use lambda functions and standar algorithms to do the same
// g++ -g -fsanitize=address
#include <algorithm>
#include <numeric>
#include <iostream>
#include <vector>
int main(int argc, char **argv)
{
// TODO
return 0;
}
A short discussion about primitive arrays#
Primitive arrays in C++ are fixed-size collections of elements of the same type stored in contiguous memory locations. They can be declared as
// Declaration and initialization
int numbers[5] = {1, 2, 3, 4, 5};
char name[6] = {'C', 'l', 'a', 'u', 'd', 'e'};
// Access elements using index
int third_number = numbers[2]; // Gets 3 (zero-indexed)
Characteristics of Primitive Arrays:
Fixed size: Size must be known at compile time
Stack allocation: Usually stored on the stack (unless global)
No bounds checking: C++ doesn’t prevent accessing invalid indices
No built-in size information: Size must be tracked separately
Cannot be returned directly from functions (decays to pointer)
Exercise Declare a primitive array of doubles of large size. What is the maximum size that you can get?
Primitive arrays with dynamic memory#
Dynamic arrays allow you to determine array size at runtime using pointers and heap memory allocation. See https://hackingcpp.com/cpp/lang/memory_basics.html. In this case, you must use a pointer
, which is a variable that stores a memory address, and then ask the OS for memory (and release it), all manually.
// Allocate memory for 10 integers
int * dynamic_array = new int[10];
// todo: check if memory was actually allocated
// Use the array
dynamic_array[0] = 42;
dynamic_array[1] = 73;
// IMPORTANT: Must deallocate manually
delete[] dynamic_array;
Characteristics of Dynamic Arrays:
Runtime size determination: Size can be decided during execution
Heap allocation: Memory is allocated on the heap
Manual memory management: You must remember to
delete[]
Raw pointers: No built-in bounds checking or size information
Potential memory leaks: If not properly deallocated
Comparison with std::vector
#
std::vector
from the C++ Standard Library provides a more robust, safe, and flexible alternative.
#include <vector>
// Create a vector
std::vector<int> numbers = {1, 2, 3, 4, 5};
// Add elements
numbers.push_back(6);
// Get size
size_t size = numbers.size(); // Returns 6
// Access with bounds checking
int safe_access = numbers.at(2); // Throws exception for invalid index
Advantages of std::vector
:
Dynamic sizing: Automatically resizes as elements are added
Automatic memory management: No manual deallocation required
Bounds checking: Option to use .at() method for safer access
Rich functionality: Includes methods for insertion, deletion, searching:
push_back
,pop_back
,insert
,erase
,clear
,size
,empty
, and more.Compatible with algorithms: Works with the entire C++ Standard Library
Contiguous storage: Same performance characteristics as arrays
Size information: Built-in tracking of current element count
Iterator support: Can be used with standard iterators
Almost always prefer to use std::vector
.
EXERCISE: Comparing primitive arrays, dynamics memory and
std::vector
Please write two programs. The first one, using a primitive array and dynamic memory, that starts with 5 elemments like{2, 3, 4, 5, 6}
; then, print the memory address of the first element (&data[0]
), and then append the number 10 at the end. Print the final array. Now write the second program doing the same but using astd::vector
.
Exercises#
Always try to use the STL, and also use functions to implement the following:
Compute the norm of a vector.
Implement the Gauss 7 point method, storing the weights and the coordinates points in a vector.
Compute the dot product between two vectors. Use
std::inner_product
.Assume that a vector represents the coefficients of a n-degree polynomial. Compute the derivative of the polynomial and save the coefficients in another vector.
data = {1 , 3, 4.5}; // 1*x^0 + 3*x^1 + 4.5*x^2 newdata = {3, 9.0}; // 0 + 3*x^0 + 2*4.5*x^1
Compute the min of a vector.
Compute the max of a vector.
Compute the argmin (index of the min) of a vector
Compute the argmax (index of the max) of a vector
Compute the pnorm
(27)#\[\begin{equation} \left(\sum_{i} x_{i}^{p}\right)^{1/p} \ . \end{equation}\]Read the contents of a large file into a vector. Compute and print the mean, the median, the 25, 50, 75 percentiles, the min value, and the max value.
Use the standard algorithm sort to sort a data array.
Fill a vector with random numbers
// llenar un vector con numeros aleatorios #include <iostream> #include <string> #include <vector> #include "random_vector.h" int main(int argc, char **argv) { // read variables if (argc != 2) { std::cerr << "Usage: " << argv[0] << " N" << std::endl; std::cerr << "N: size of the vector" << std::endl; return 1; } const int N = std::stoi(argv[1]); // set the vector std::vector<double> data; data.resize(N); // fill the vector fill_randomly(data); // print the vector std::string fname = "data.txt"; print(data, fname); return 0; }
Create a vector of
ints
, and then usestd::shuffle
to randomly shuffle it.Use a vector to compute the counter histogram for some data read from a file.
Histogram from random numbers, saving to file. Write a program that generates N random numbers, with the Weibull distribution and a given seed. Both N, the seed, and the Weibull parameters must read from the command line. Also, you must compute the histogram and print it to a file.