performance – C++ multi-threaded determination of curling numbers in vectors

For a thesis I am creating a program that constructs curling sequences.
The curling number of a sequence is defined as ‘the largest frequency of any period at the end of the sequence’. (For example, 11232323 has curling number 3, because 23 is repeated three times.)
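
As a reference for the definition above, here is a minimal brute-force sketch in Python. It is not the C++ code from this question, and the name `curling_number` is only illustrative. It tries every period length at the end of the sequence and counts how many consecutive copies of that period the sequence ends in.

```python
def curling_number(seq):
    """Largest k such that seq ends in k consecutive copies of some block."""
    n = len(seq)
    best = 1 if n else 0
    for period in range(1, n // 2 + 1):
        block = seq[n - period:]
        freq = 1
        # count further copies of the block, walking towards the front
        while ((freq + 1) * period <= n
               and seq[n - (freq + 1) * period: n - freq * period] == block):
            freq += 1
        best = max(best, freq)
    return best
```

A slow but obviously correct version like this can be used to cross-check the output of an optimized implementation on short sequences.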

Currently, I am mostly interested in the appearance of a 1 as curling number when the sequence starts from any possible combination of 2’s and 3’s (for example, we start from 2322322, keep appending the curling number, and want to find the first occurrence of ‘1’ as curling number).
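
The search described above can be sketched in Python as follows. This is an illustrative sketch, not the program from this question: `tail_length` and its `limit` parameter are hypothetical names, and the limit of 200 only mirrors the `range` constant used in the C++ code below. The function keeps appending the current curling number and reports how many terms were appended before the curling number first equals 1.

```python
def curling_number(seq):
    """Largest k such that seq ends in k consecutive copies of some block."""
    n = len(seq)
    best = 1 if n else 0
    for period in range(1, n // 2 + 1):
        block = seq[n - period:]
        freq = 1
        while ((freq + 1) * period <= n
               and seq[n - (freq + 1) * period: n - freq * period] == block):
            freq += 1
        best = max(best, freq)
    return best


def tail_length(generator, limit=200):
    """Number of terms appended before the curling number first equals 1.

    Returns None if no 1 shows up within `limit` appended terms.
    """
    seq = list(generator)
    for appended in range(limit):
        curl = curling_number(seq)
        if curl == 1:
            return appended
        seq.append(curl)
    return None
```

For instance, `tail_length([2, 3, 2, 3])` walks through 2, 2, 2, 3 before the curling number drops to 1.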

I switched to multi-threading recently, but the performance still falls short of what I need.

Does anyone have any suggestions on how to improve the speed of this multi-threaded program? I hope I provided enough comments to make it clear, but any questions will be answered.

#include <vector>
#include <iostream>
#include <chrono>
#include <thread>
#include <mutex>
#include <cmath>
#include <algorithm>                            // std::equal
#include <utility>                              // std::move
#pragma warning (disable : 26451)
using namespace std::chrono;

const int range = 200;                          // used to reserve size for each sequence
std::mutex m_tail;                              // thread-lock for the tail
int tail = 0;                                   // value to keep track of the longest tail encountered
int length = 0;                                 // user input to define length of generator 

thread_local std::vector<short int> sequence(range);    // sequence for each individual thread
thread_local long double duur = 0;              // value for each individual thread to time functions

// this function finds the curling number (curl) of a sequence
// the curl of a sequence is defined as: the highest frequency of a period at the end of a sequence
// some examples:
// the curl of 11232323 is 3, because 23 is the most-repeated element at the end of the sequence
// the curl of 2222 is 4, because 2 is the most-repeated element at the end of the sequence
// the curl of 1234 is 1, because every period at the end of the sequence is repeated only once
int krul(short int j, short int last_curl) {
    short int curl = 1;                                     // the minimum curl of a sequence is 1
    for (short int length = 1; length <= int(j / (curl + 1) + 1); ++length) {
        short int freq = 1;                                 // the minimum frequency of any period is 1
        while ((freq + 1) * length <= j and std::equal(sequence.begin() + j - (freq + 1) * length, sequence.begin() + j - freq * length, sequence.begin() + j - length)) {
                                                            // while within length of the sequence, and while periods match, continue
            ++freq;                                         // if they match, the frequency is increased by 1
            if (freq > curl) {                              // if the frequency of this period is higher than the curl so far
                curl = freq;                                // update the value for curl
                if (curl > last_curl) {                     // mathematical break: the curl can't be 'last_curl + 2', so when we encounter 'last_curl + 1' we can stop
                    return curl;
                }
            }
        }
    }
    return curl;
}

// this function constructs a whole sequence by adding the curling number and then recalculating the curling number since the sequence got a new element
// and then adds the new curling number and recalculates and so on and so on
int constructor(std::vector<short int> sequence_generator) {
    sequence = std::move(sequence_generator);               // we move the sequence from the generator to the main sequence
    short int j = length;
    short int last_curl = length;
    for (; j < range; ++j) {                                // the sequence is built from starting length up to range; j stays below range so sequence[j] is a valid index
        short int curl = krul(j, last_curl);                // the resulting curling number is the highest of all the candidate curling numbers.
        if (curl == 1) { return j; }                        // for this program, we want to stop if the curling number == 1 and return the length of the sequence (j)
        sequence[j] = curl;
        if (curl >= 4) { return j + 1; }                    // if the curling number >= 4, the next curling number is 1 so we can already break and return the length of the sequence (j + 1)
        last_curl = curl;                                   // we update the value of last_curl for the next calculation
    }
    return 0;                                               // we don't encounter this because each sequence breaks, but just in case we return 0
}

// this function takes a number and generates its binary twin, but then existing of 2's and 3's
// because we want every possible generator existing of any combination of 2's and 3's
// and after construction of the generator we call the constructor of the sequence
int decToBinary(unsigned long long n) {
    std::vector<short int> sequence_generator(range);
    for (long int i = length - 1; i >= 0; i--) {
        long int k = n >> i;
        if (k & 1) {
            sequence_generator[length - 1 - i] = 3;
        }
        else {
            sequence_generator[length - 1 - i] = 2;
        }
    }
    return (constructor(sequence_generator) - length);              // the 'tail' of the sequence is the total length ('j' of constructor) minus the length of the generator
}

// this function starts a thread and calls the decToBinary function for a range of sequences
void multi_threader(int thread_number, int thread_count) {
    std::cout << "thread " << thread_number << " initiated!" << std::endl;

    auto start_time = high_resolution_clock::now();                 // timer to keep track of elapsed time of thread
    int thread_tail = 0;                                            // value to keep track of maximum tail length of this thread
    unsigned long long start = (thread_number - 1) * pow(2, length) / (thread_count);   // start value for this thread
    unsigned long long stop = pow(2, length) / (thread_count)*thread_number;            // end value for this thread
    for (unsigned long long int i = start; i < stop; ++i) {
        auto start2 = high_resolution_clock::now();                                     // partial timer start

        int local_tail = decToBinary(i);                                                // tail of this sequence
        if (local_tail > thread_tail) {                                                 // if larger than any earlier tail in this thread...
            thread_tail = local_tail;                                                   // update its value
        }
        auto stop2 = high_resolution_clock::now();                                      // partial timer end
        duur = duur + duration_cast<microseconds>(stop2 - start2).count();              // accumulate elapsed partial time
    }
    std::lock_guard<std::mutex> l(m_tail);                      // lock 'tail' while editing
    if (thread_tail > tail) {                                   // if largest tail in this thread is larger than any thread before...
        tail = thread_tail;                                     // update the value
    }
    std::cout << duur / 1000 << std::endl;
    auto stop_time = high_resolution_clock::now();
    auto duration = duration_cast<microseconds>(stop_time - start_time).count() / 1000;
    std::cout << "thread " << thread_number << " exited, duration: " << duration << " ms, result " << thread_tail << std::endl;
}

// this function calls for as many threads as the user requests and passes the length the user requested on to the thread
void iterate_generator() {
    std::vector<std::thread> thread_vector;
    int thread_count, thread_number;
    thread_number = 0;
    std::cout << "enter the length for the sequence generator (e.g. 10, maximum 40)" << std::endl;
    std::cin >> length;
    length = 25;            // example value
    std::cout << "enter the desired number of threads (e.g. 2, maximum 4)" << std::endl;
    std::cin >> thread_count;
    thread_count = 4;       //example value

    // start the threads and join them
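    for (int i = 0; i < thread_count; ++i) {
        ++thread_number;                                    // thread numbers are 1-based: multi_threader computes its start value from (thread_number - 1)
        std::thread th = std::thread([thread_number, thread_count]() {multi_threader(thread_number, thread_count); });
        thread_vector.push_back(std::move(th));             // keep the thread so it can be joined below
    }

    for (auto& th : thread_vector) { th.join(); }
}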

int main() {
    while (true) {
        iterate_generator();
        std::cout << tail << std::endl << std::endl;
    }
}

Thanks in advance!!

linear algebra – Using Steinitz Exchange Lemma to prove that a set of four vectors in $\mathbb{R}^2$ is linearly dependent

Cheers, so I am asked to prove that in the vector space $V = \mathbb{R}^2$, every set of 4 vectors is linearly dependent. I tried solving it using the Steinitz Exchange Lemma, so:

Let $V = \mathbb{R}^2$ be the vector space of interest, and let the set $\{ v_1, v_2 \}$ be a basis for our vector space (e.g. $\{ (1,0), (0,1) \}$). Now let $A \subset V$, $A = \{ a_1 , a_2 , a_3 , a_4 \}$. As $A \subset V$ we can say that:
$$a_i=\sum_{j=1}^2 \gamma_{ij}v_j \qquad(i=1,\dots,4)$$
I then got a bit stuck, and saw a solution that proceeded by saying that:

If $\alpha_1a_1+\dots+\alpha_4a_4=0$, then

$$\sum_{i=1}^4\alpha_i\gamma_{ij}=0 \qquad(j=1,2)$$
and since $4 > 2$ there are infinitely many solutions, so $A$ is not linearly independent.
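
The step between the two displayed equations can be written out as follows (a sketch using the same symbols $a_i$, $\gamma_{ij}$, $v_j$ and $\alpha_i$ as above). Substituting the expansion of each $a_i$ into the linear combination and collecting the coefficient of each basis vector gives

$$\sum_{i=1}^4 \alpha_i a_i = \sum_{i=1}^4 \alpha_i \sum_{j=1}^2 \gamma_{ij} v_j = \sum_{j=1}^2 \Big( \sum_{i=1}^4 \alpha_i \gamma_{ij} \Big) v_j = 0 .$$

Because $v_1, v_2$ are linearly independent, each bracket must vanish, which is exactly the system of 2 homogeneous equations in the 4 unknowns $\alpha_1, \dots, \alpha_4$. A homogeneous system with more unknowns than equations always has a nontrivial solution, and a nontrivial solution $(\alpha_1, \dots, \alpha_4)$ is by definition a linear dependence among $a_1, \dots, a_4$.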

Although I understand the whole logic here, and why the result answers our question, why did we suppose that $\alpha_1a_1+\dots+\alpha_4a_4=0$, and especially why did we use the elements of $A$ in tuples? Could it be done in any other way? Thanks for the help!

linear algebra – What is the dimension of span of a vector orthogonal to ‘n’ linearly independent vectors?

So let’s say I have “n” linearly independent vectors in “m” dimensions (m > n).
Is it true that the dimension of the span of these n vectors is also n?
And if I find a vector orthogonal to all n vectors, what will be the dimension of the span of the combined system of the n vectors and the orthogonal vector? n+1?
If yes, how can I prove it?
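
A sketch of the usual argument, assuming the orthogonal vector $w$ is nonzero and writing the given vectors as $v_1, \dots, v_n$ (names introduced here only for illustration): suppose

$$c_1 v_1 + \dots + c_n v_n + c\, w = 0 .$$

Taking the dot product with $w$ and using $v_i \cdot w = 0$ for every $i$ gives

$$c \,(w \cdot w) = 0 \quad\Longrightarrow\quad c = 0 ,$$

and then $c_1 v_1 + \dots + c_n v_n = 0$ forces $c_1 = \dots = c_n = 0$ by linear independence. So the $n+1$ vectors are linearly independent and their span has dimension $n+1$. By the same definition, the $n$ original vectors form a basis of their own span, which therefore has dimension $n$.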

differential geometry – Is it possible to reverse engineer the metric to find the basis vectors of the manifold?

For a given surface with metric components $g_{\mu\nu}$, is it possible to find its basis vectors and, more importantly, the exact surface that the space lies on? We know that
$$g_{\mu\nu} = e_\mu \cdot e_\nu$$

However, for an intrinsic geometry, the transformation that gives it its geometry is given by a general transformation $x'^\mu b_\mu = G^\mu(x^\mu) b_\mu$. If we take the partial derivative with respect to $x^\nu$ we get that
$$e_\nu=\partial_\nu G^\mu(x^\mu) b_\mu$$
The new space that these intrinsically curved coordinates live in is measured with orthogonal basis vectors $b_\mu$. This gives us
$$g_{\mu\nu}=\partial_\mu G^\alpha(x^\mu) b_\alpha \cdot \partial_\nu G^\beta(x^\mu) b_\beta$$
$$g_{\mu\nu}=\partial_\mu G^\alpha(x^\mu) \partial_\nu G^\beta(x^\mu) b_\alpha \cdot b_\beta$$
$$g_{\mu\nu}=\partial_\mu G^\alpha(x^\mu) \partial_\nu G^\beta(x^\mu) \delta_{\alpha\beta}$$
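
As a concrete check of the last formula, here is a small Python sketch. Polar coordinates are an example chosen here for illustration, not something taken from the question: with $G(r, \theta) = (r\cos\theta, r\sin\theta)$ the formula should reproduce the familiar metric $\mathrm{diag}(1, r^2)$. The function names are hypothetical.

```python
import math


def jacobian_polar(r, theta):
    """J[alpha][mu] = dG^alpha / dx^mu for G(r, theta) = (r cos theta, r sin theta)."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta), r * math.cos(theta)]]


def metric_from_jacobian(J):
    """g[mu][nu] = sum over alpha of J[alpha][mu] * J[alpha][nu], i.e. g = J^T J."""
    dim = len(J[0])
    return [[sum(row[mu] * row[nu] for row in J) for nu in range(dim)]
            for mu in range(dim)]


g = metric_from_jacobian(jacobian_polar(2.0, 0.7))
```

This only illustrates the forward direction, from a known $G$ to $g_{\mu\nu}$. It does not attempt the reverse direction that the question asks about.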

With this information, is it possible to find the transformation $G^\mu(x^\mu)$ in terms of the metric components?

linear algebra – Sampling distribution for approximating a function on set of vectors

I have a set of vectors $X = \{x_1, \dots, x_n\}$ and a vector $y = f(X)$. These vectors are not orthogonal to each other. For simplicity, we can also say that $f()$ is just the mean.

Now, I would like to compute values for a discrete distribution $W=\{w_1, \dots, w_n\}$ over the elements of $X$, such that by sampling $m < n$ elements from $X$ according to this distribution, I will get that $f(X_m) \approx f(X)$.
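
One standard way to make this setup concrete is importance sampling with reweighting. The sketch below rests on assumptions that are not in the question: $f$ is the mean, the $m$ draws are made with replacement, and every $w_i$ is strictly positive with the $w_i$ summing to 1. The function names are hypothetical. Each sampled $x_i$ is divided by $n w_i$, which makes the estimate unbiased for any such $W$; the choice of $W$ then only affects the variance.

```python
import random


def sample_mean_estimate(X, weights, m, rng):
    """Draw m indices with probabilities `weights` (with replacement) and
    return the importance-weighted estimate of the mean of X."""
    n, dim = len(X), len(X[0])
    idx = rng.choices(range(n), weights=weights, k=m)
    est = [0.0] * dim
    for i in idx:
        scale = 1.0 / (n * weights[i] * m)  # undoes the sampling bias
        for d in range(dim):
            est[d] += scale * X[i][d]
    return est


def expected_estimate(X, weights):
    """Exact expectation of the estimator, obtained by summing over all i."""
    n, dim = len(X), len(X[0])
    return [sum(weights[i] * X[i][d] / (n * weights[i]) for i in range(n))
            for d in range(dim)]
```

Calling `expected_estimate(X, W)` for any valid `W` returns the plain mean of `X`, which is the unbiasedness property in code form. A call such as `sample_mean_estimate(X, W, m, random.Random(0))` gives one random realization.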

Any references or pointers would be appreciated.

vectors – The Path of the center of a cutting tool that will cut out an ellipse (x/4)^2+(y/2)^2=1

I am currently stuck on this question and its implementation in Mathematica.

You are the chief engineer at the Badger Steel Plate Company in Madison, Wisconsin. In comes an order for 750 square steel plates, each measuring 12 inches wide and 12 inches long.

Go to the drawing board

plate = Graphics[{Blue, Thickness[0.01], Line[{{-6, -6}, {-6, 6}, {6, 6}, {6, -6}, {-6, -6}}]}];
plateplot = Show[plate, Axes -> True, AxesOrigin -> {0, 0}, AspectRatio -> Automatic, AxesLabel -> {"x", "y"}]

But here’s the kicker:

The plates are to have everything inside the ellipse (x/4)^2+(y/2)^2=1 cut out.

Take a look:

Clear[x, y, t];
x[t_] = 4 Cos[t];
y[t_] = 2 Sin[t];
hole = ParametricPlot[{x[t], y[t]}, {t, 0, 2 Pi}, PlotStyle -> {{Blue}}];
Show[plateplot, hole]

You have a new robotic router that takes instructions from Mathematica and whose cutting center can be programmed to follow any curve you tell it to follow. If you are going to use a bit 1 inch in diameter, then what curve should you program in as the path of the center of the router to cut out the ellipse?

After you have found the correct curve, add its plot to the plot above.

Big Tip:

If your curve is an ellipse, then you screwed up.

The normal vector D[unittan[t], t] could be very useful.
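
To make the geometry concrete, here is a Python sketch of one way to compute the path of the router center. It assumes the bit stays inside the ellipse, so the center is offset inward by the bit radius (0.5 inch) along the unit normal. Instead of differentiating the unit tangent as the tip suggests, it rotates the tangent by 90 degrees, which gives the same inward direction on this counterclockwise parametrization. The names `router_center` and `BIT_RADIUS` are hypothetical.

```python
import math

A, B = 4.0, 2.0      # semi-axes of (x/4)^2 + (y/2)^2 = 1
BIT_RADIUS = 0.5     # a bit 1 inch in diameter


def router_center(t, radius=BIT_RADIUS):
    """Point on the ellipse at parameter t, moved inward by `radius`."""
    x, y = A * math.cos(t), B * math.sin(t)
    tx, ty = -A * math.sin(t), B * math.cos(t)   # tangent vector
    speed = math.hypot(tx, ty)
    nx, ny = -ty / speed, tx / speed             # unit normal, pointing inward
    return x + radius * nx, y + radius * ny


path = [router_center(2 * math.pi * k / 360) for k in range(361)]
```

At the ends of the axes this places the center at (3.5, 0) and (0, 1.5). The points in between do not lie on the ellipse with semi-axes 3.5 and 1.5, which is consistent with the tip that the answer is not an ellipse.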

I was able to obtain the plot using the normal vector D[unittan[t], t].

Now I have the following question.

Actually, the bit size of 1 inch in diameter used above was arbitrarily chosen by reaching into the drawer and pulling out a bit. You could always get by with a smaller bit. Why?
But you cannot use bits that are too large. Why?
Try to estimate the diameter of the largest bit that you could use to do the job.
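
One possible line of attack, offered as a sketch rather than a definitive answer: the inward offset behaves well only while the bit radius stays below the radius of curvature of the ellipse at every point. A larger bit cannot fit into the tightest part of the curve without cutting outside the ellipse. Under that assumption, the largest usable diameter is about twice the minimum radius of curvature, which the Python snippet below estimates numerically. The variable names are hypothetical.

```python
import math

A, B = 4.0, 2.0      # semi-axes of (x/4)^2 + (y/2)^2 = 1


def radius_of_curvature(t):
    """Radius of curvature of (A cos t, B sin t) at parameter t."""
    return (A**2 * math.sin(t)**2 + B**2 * math.cos(t)**2) ** 1.5 / (A * B)


min_radius = min(radius_of_curvature(2 * math.pi * k / 3600) for k in range(3600))
max_bit_diameter = 2 * min_radius
```

The minimum occurs at the ends of the major axis, where the radius of curvature equals B^2/A = 1, so this estimate gives a largest bit of roughly 2 inches in diameter.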

Any help demystifying this question will be greatly appreciated.

bitcoincore development – How do I tweak the bip340_test_vectors to check that signature verification fails for other test vectors?

Disclaimer: This is an educational exercise. This Python code is for testing and should not be used in production or with mainnet Bitcoin. A private key of 3 is absolutely insecure.

There are two Python files in the BIPs repo, which you can view by cloning the BIPs repo.

git clone

These two files are in /bitcoin/bips/tree/master/bip-0340.

If you run python3 from within the bip-0340 directory you will get the output and error messages that are in the CSV file test-vectors.csv.

If you open with your text editor e.g. vim

you can edit any of the test vectors’ secret key (private key), message, auxiliary randomness.

For example vector0 has the secret key 3.

seckey = bytes_from_int(3)

You could change the secret from 3 to 4.

seckey = bytes_from_int(4)

Save and close your text editor and then run python3 again.

This will generate an AssertionError.

Traceback (most recent call last):
  File "", line 253, in <module>
  File "", line 31, in vector0
    assert(not has_square_y(pubkey_point))

There is an assert that checks that for a particular public key (X,Y) (calculated from the secret (private) key) the Y coordinate isn’t square. This is a check inherited from a previous design decision to use squaredness as the tiebreaker for two Y coordinates. The tiebreaker is now evenness rather than squaredness.

Every private key (scalar) maps to a single corresponding public key (point). In BIP 340 we restrict ourselves to public keys with an even Y coordinate. This could lead to only half of the public keys being valid and hence only half of the private keys being valid too. Instead we say that every private key has a public key and if d.G has an odd Y coordinate we effectively sign with -d (the negated private key) instead. With that change you could say that every scalar is a valid private key but there are two private keys that map to the same public key. Only one private key maps to a public key with an even Y coordinate though. (Only one BIP 340 private key maps to a BIP 340 public key.)
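
The even-Y rule described above can be sketched in plain Python. This is an educational illustration only and is not the reference code from the BIPs repo: the helper names (`point_add`, `point_mul`, `effective_seckey`) are hypothetical, the code is not constant-time, and it needs Python 3.8+ for `pow(x, -1, P)`. It uses the standard secp256k1 parameters and shows that d and n - d share the same X coordinate, with exactly one of them giving an even Y.

```python
# secp256k1 parameters: field prime, group order, generator point
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(p1, p2):
    """Add two curve points; None stands for the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if p1 == p2:
        lam = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return x3, (lam * (x1 - x3) - y1) % P


def point_mul(k, point):
    """Double-and-add scalar multiplication k * point."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def effective_seckey(d):
    """Scalar effectively used for signing: d if d*G has even Y, else N - d."""
    _, y = point_mul(d, G)
    return d if y % 2 == 0 else N - d
```

For any valid `d`, `point_mul(effective_seckey(d), G)` has an even Y coordinate and the same X coordinate as `point_mul(d, G)`.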

Remember when you are finished experimenting you can use this to reset:

git reset --hard HEAD

This will discard the changes you have made and go back to the state of the code before you started experimenting.

Thanks to Pieter Wuille on IRC for some additions.