discrete mathematics – Faster algorithm to calculate this GCD sum?

$$\sum_{i=0}^{2n-1} \gcd(i,\, 4n+1)$$

Apart from the obvious $O(n)$ algorithm, is there a faster way to calculate this sum?

A possible approach I thought of was to factorize $4n+1$ and find the numbers $\le 2n-1$ that share a factor with $4n+1$.

For example, if $n=5$ then $4n+1=21$ and $2n-1=9$, and the factorisation is $21=3\cdot 7$. Therefore the result is $\lfloor\frac{9}{3}\rfloor \cdot 3 + \lfloor\frac{9}{7}\rfloor \cdot 7 + 1\cdot(10-3-1-1) + 21 = 42$: the multiples of $3$, the multiples of $7$, one for each remaining coprime term (the $10$ values of $i$, minus the $3$ multiples of $3$, the $1$ multiple of $7$, and the term $i=0$), and finally $\gcd(0,21)=21$ for $i=0$. This would take $O(n^{1/2})$ for the factorization, plus an inclusion-exclusion over the prime factors for the sum, which costs $O(2^D)$ terms, where $D$ is the number of distinct prime factors of $4n+1$.
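To make the approach concrete, here is a minimal Python sketch of this divisor-based computation (all function names are mine): it factorizes $4n+1$, groups the terms $i \ge 1$ by $d = \gcd(i, 4n+1)$, and counts each group by inclusion-exclusion over the remaining prime factors.

```python
from functools import reduce
from itertools import combinations

def factorize(N):
    """Prime factorization of N by trial division, O(sqrt(N))."""
    f, d = {}, 2
    while d * d <= N:
        while N % d == 0:
            f[d] = f.get(d, 0) + 1
            N //= d
        d += 1
    if N > 1:
        f[N] = f.get(N, 0) + 1
    return f

def coprime_count(m, primes):
    """#{1 <= j <= m : j coprime to every p in primes}, by inclusion-exclusion."""
    total = 0
    for k in range(len(primes) + 1):
        for sub in combinations(primes, k):
            prod = reduce(lambda a, b: a * b, sub, 1)
            total += (-1) ** k * (m // prod)
    return total

def gcd_sum(n):
    """sum_{i=0}^{2n-1} gcd(i, 4n+1) without touching every i."""
    N, m = 4 * n + 1, 2 * n - 1
    f = factorize(N)
    divisors = [1]
    for p, e in f.items():
        divisors = [d * p ** k for d in divisors for k in range(e + 1)]
    s = N  # the i = 0 term: gcd(0, N) = N
    for d in divisors:  # group i >= 1 by d = gcd(i, N)
        rest = [p for p in f if (N // d) % p == 0]
        s += d * coprime_count(m // d, rest)
    return s
```

For $n=5$ this returns 42, matching the hand computation above.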

How can I make my resource mapping faster?

I have a Direct3D 11 application, and recently I've started to implement a new feature in it: the UI (user interface).

It seems to work well, but I'm having an optimization problem when it comes to moving things around my window. Currently I can create a textured 2D square at a 300×300 resolution, and I can drag it around my screen by calling a function that updates the square's position.

bool Model::UpdateModel2D(ID3D11DeviceContext* pDeviceContext, short NewX, short NewY)
{
    if ((m_PreviousX == NewX) && (m_PreviousY == NewY)) // skip the update if the UI position is unchanged (&&, not ||)
        return true;

    D3D11_MAPPED_SUBRESOURCE ms;
    ZeroMemory(&ms, sizeof(D3D11_MAPPED_SUBRESOURCE));

    ModelData pData[6]; // builds two triangles in order to form a square

    pData[0].VertexData = XMFLOAT3(NewX, NewY, 0.0f);
    pData[0].TextureCoords = XMFLOAT2(0.0f, 0.0f);

    pData[1].VertexData = XMFLOAT3(NewX + 300.0f, NewY + 300.0f, 0.0f);
    pData[1].TextureCoords = XMFLOAT2(1.0f, 1.0f);

    pData[2].VertexData = XMFLOAT3(NewX, NewY + 300.0f, 0.0f);
    pData[2].TextureCoords = XMFLOAT2(0.0f, 1.0f);

    pData[3].VertexData = XMFLOAT3(NewX, NewY, 0.0f);
    pData[3].TextureCoords = XMFLOAT2(0.0f, 0.0f);

    pData[4].VertexData = XMFLOAT3(NewX + 300.0f, NewY, 0.0f);
    pData[4].TextureCoords = XMFLOAT2(1.0f, 0.0f);

    pData[5].VertexData = XMFLOAT3(NewX + 300.0f, NewY + 300.0f, 0.0f);
    pData[5].TextureCoords = XMFLOAT2(1.0f, 1.0f);

    HRESULT hr = pDeviceContext->Map(m_pVertexBuffer, 0, D3D11_MAP_WRITE_DISCARD, 0, &ms);
    if (FAILED(hr))
        return false;

    ModelData* pNewData = (ModelData*)ms.pData;
    memcpy(pNewData, pData, sizeof(ModelData) * m_NumVertices); // updates the position of the 2D square
    pDeviceContext->Unmap(m_pVertexBuffer, 0);

    return true;
}

To drag this square I have to left-click on it and move my mouse, but this is an issue because my application gets too slow, and it looks like the longer I hold it with the mouse, the slower the application gets, until I release it. If someone knows how to improve how I update my buffers, please help me.

javascript – Sorting an array of positive integers including 0 much faster than Radix Sort

I was working on a limit order book structure in JS and came up with this algorithm. I am pretty sure this must have already been implemented, but I couldn't even find a clue on the web.

The thing is, it's very fast, especially when you have an array with many duplicate items. However, the real beauty is that after inserting k new items into an already-sorted array of n items, the pseudo-sort (explained below) takes only O(k), and a full sort takes only O(n+k). To achieve this I keep a pseudo-sorted array of m items in a sparse array, where m is the number of unique items.

Take, for instance, sorting (42, 1, 31, 17, 1453, 5, 17, 0, 5), where n is 9. We just use the values as keys and construct a sparse array (pseudo-sorted) like:

Value: 1 1 2  2  1  1    1
Index: 0 1 5 17 31 42 1453

Where Value keeps the repeat count. I think you can now see where I am going with this. JS has a fantastic ability: accessing the keys of a sparse array can be very fast, because the non-existent ones are skipped over. To achieve this you use either a for...in loop or Object.keys().

So you can keep your sparse (pseudo-sorted) array, insert new items into it, and they will always be kept in a pseudo-sorted state, with all insertions and deletions done in O(1). Whenever you need a real sorted array, you just construct it in O(n). Now this is very important in a limit order book implementation: say you have to respond to your clients with the top 100 bids and bottom 100 asks every 500 ms over a WebSocket connection; you no longer need to sort the whole order book again, even if it is continuously updated with many new bids and asks.
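For comparison, here is a rough Python analogue of the structure described above (names are mine; a dict of counts stands in for the JS sparse array). One honest caveat: Python dicts do not iterate keys in numeric order the way JS sparse-array key iteration does, so the m unique keys have to be sorted when the real array is built.

```python
class PseudoSorted:
    """A multiset kept as value -> count, mirroring the sparse-array trick:
    insert/delete are O(1); a real sorted array is built only on demand."""

    def __init__(self):
        self.counts = {}

    def insert(self, v):
        # O(1): bump the repeat count for this value
        self.counts[v] = self.counts.get(v, 0) + 1

    def delete(self, v):
        # O(1): drop one occurrence, removing the key when none remain
        if self.counts.get(v, 0) <= 1:
            self.counts.pop(v, None)
        else:
            self.counts[v] -= 1

    def to_sorted(self):
        # Unlike JS sparse arrays, dict keys are not numerically ordered,
        # so the m unique keys must be sorted here: O(m log m + n) total.
        out = []
        for v in sorted(self.counts):
            out.extend([v] * self.counts[v])
        return out
```

Inserting the example values (42, 1, 31, 17, 1453, 5, 17, 0, 5) and calling to_sorted() yields [0, 1, 5, 5, 17, 17, 31, 42, 1453].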

So here is the sparseSort code, which could probably be trimmed up by employing for loops instead of .reduce, etc. It still beats radix sort and Array.prototype.sort().

function sparseSort(arr) {
    var tmp = arr.reduce((t, r) => (t[r] ? (t[r]++, t) : (t[r] = 1, t)), []);
    return Object.keys(tmp)
                 .reduce((a, k) => {
                     for (var i = 0; i < tmp[k]; i++) a.push(+k);
                     return a;
                 }, []);
}
Here you can see a bench against Array.prototype.sort() and radixSort. I would just like to know whether this is reasonable and what handicaps might be involved.

keyboard shortcuts – A *faster* faster way to access Strikethrough on Google Docs

The keyboard shortcut for strikethrough in the Google Docs web UI is Alt+Shift+5 (yes, that is a five), and the toolbar doesn't have a strikethrough button.

It’s very awkward (I actually have to swivel my whole body slightly to press it, my hand and wrist just don’t work that way, heh; also sometimes I have to wear a wrist brace on my left hand, which makes it even worse).

I use Chrome on Windows. I found A faster way to access Strikethrough on Google Docs when searching for other shortcuts, it got my hopes up (especially because the OP’s use case is the same as mine)… but it wasn’t what I was looking for:

  • The top answer just describes the aforementioned keyboard shortcut.
  • This answer has a userscript but it doesn’t work any more.
  • This answer seems to point to the same script.
  • Everything there is over 10 years old, anyways, a lot has probably changed.

So, my question is, is there some way to make toggling (or even just setting) strike-through on highlighted text more convenient than the current tendinitis-inducing keyboard shortcut (preferably by keyboard rather than mouse, if possible)?

As far as I can tell, I can’t modify keyboard shortcuts or customize the toolbar. But maybe there’s a way to do that? Or some outside-the-box trick to accomplish something similar? Fwiw, I’m open to console- or script-based solutions, too, I do have Tampermonkey installed. I don’t know, I poked around in the UI a bit but I don’t really have any ideas.

Selling – Private Dedicated API for Faster CAPTCHA Solving | Proxies-free

We have a new special offer for you, our valued customers. The offer is titled "CAPTCHAs.IO's Private Dedicated API: a Reliable Dedicated Private API for Faster reCAPTCHA and Image CAPTCHA Solving".

Your very own private reCAPTCHA / CAPTCHA API solving server, with unlimited threads and API calls. This is a dedicated server on which we will install the latest CapMonster Pro. It runs 20 threads of CapMonster Pro (latest version) with a 1-year subscription and can handle a load of 2000 captchas/minute. Supported CAPTCHAs are reCAPTCHA v2, v3, and invisible, plus 100,000++ normal image CAPTCHAs.

reCAPTCHA v2, v3 and invisible solving time is 10 to 60 seconds on average, depending on the complexity of reCAPTCHA set in the Google Admin Console for reCAPTCHA. For image CAPTCHAs solving time is 0 to 1 second on average.

Our datacenter is a European colocation datacenter strategically located in the Netherlands and connected to globally based Points of Presence. Constructed, owned, and operated by sister company Greenhouse Datacenters, these Tier 3 designed facilities enable flexible, sustainable, secure, and highly connected IT infrastructure deployments. Servers enjoy a 10 Tbit/s global network connection at only 45% utilization, with 10G DDoS protection.

Best of all, these servers are genuine hardware, not fake shared virtual hardware, and they operate fast and stable 24/7/365.

To order please visit https://captchas.io/servers/


c# – How to make this algorithm faster. Calculates and searches through large arrays

I've got this algorithm that's "complex". The comments in the code give examples of how large the various data types can be. My CPU usage is less than 10% when running this, and RAM usage is fine; there is no leakage or anything.

I have a list of arrays, where each array holds x coordinates; essentially, we are storing a list of several x-coordinate groups. This is "xs" in the code. And I have the same thing for the y-values: "ys".

The arrays in xs and ys have different sizes. HOWEVER, the sizes in xs and ys always match up: if an array in xs contains 321654 elements, then the corresponding array in ys has exactly 321654 elements.

The corresponding or "paired" xs and ys arrays are always at the same index in their respective lists, so if xs_array[321654] is at xs[4], then ys_array[321654] is at ys[4].

The following code aims to get the mean, the standard deviation, and -1 std and +1 std from the mean, as y coordinates, from a collection of coordinates. It does this by taking the smallest of the arrays (the smallest set of x and y coordinates). For each x coordinate in it, it looks at each array in xs and finds the index at which that x coordinate (or the value closest to it) sits. It then goes into the corresponding ys array and gets the y value at that index. It collects all of these, then calculates the mean, std, etc.

List<double[]> xs;  // each array may be e.g. 40000 elements, and the list contains 50 - 100 of those
List<double[]> ys;  // a list containing arrays with exactly the same sizes as those in xs

public void main_algorithm()
{
    int TheSmallestArray = GetSmallestArray(xs); // get the size of the smallest array in xs
    for (int i = 0; i < TheSmallestArray; i++)
    {
        double The_X_at_element = The_Smallest_X_in_xs[i]; // the value at index i
        // go through each array and find the element at which The_X_at_element is;
        // if it doesn't exist, find the closest element
        List<double> elements = new List<double>();
        for (int o = 0; o < xs.Count; o++) // go through each array in xs
        {
            // find the index of the number (or closest number) to The_X_at_element in array o
            int nearestIndex = Array.IndexOf(xs[o], xs[o].OrderBy(number => Math.Abs(number - The_X_at_element)).First());
            double The_Y_at_index = ys[o][nearestIndex]; // go into ys and get the value at this index
            elements.Add(The_Y_at_index); // store the value in elements
        }
        mean_values.Add(elements.Mean()); // the mean of all the values taken from ys
        standard_diviation_values.Add(elements.PopulationStandardDeviation()); // the std of all the values taken from ys
        Std_MIN.Add(mean_values[i] - standard_diviation_values[i]); // store mean - std in min
        Std_MAX.Add(mean_values[i] + standard_diviation_values[i]); // store mean + std in max
    }
}

public int GetSmallestArray(List<double[]> arrays)
{
    int TheSmallestArray = int.MaxValue;
    foreach (double[] ds in arrays)
    {
        if (ds.Length < TheSmallestArray)
        {
            TheSmallestArray = ds.Length; // store the length as TheSmallestArray
            The_Smallest_X_in_xs = ds;    // and keep the array ds itself as The_Smallest_X_in_xs
        }
    }
    return TheSmallestArray;
}
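For what it's worth, the per-x lookup described above can be sketched in Python as follows. I assume here that each x-array is sorted ascending (my assumption, not stated in the question), which lets the nearest-index step use a binary search instead of an OrderBy scan over the whole array; all names are mine.

```python
from bisect import bisect_left
import statistics

def nearest_index(sorted_x, x):
    """Index of the element of sorted_x closest to x (binary search, O(lg n))."""
    i = bisect_left(sorted_x, x)
    if i == 0:
        return 0
    if i == len(sorted_x):
        return len(sorted_x) - 1
    # pick the closer of the two neighbours around the insertion point
    return i if sorted_x[i] - x < x - sorted_x[i - 1] else i - 1

def mean_and_band(xs, ys):
    """For each x in the smallest x-array, gather the nearest y from every
    (x, y) series and compute mean, population std, and mean -/+ 1 std."""
    smallest = min(xs, key=len)
    means, stds, std_min, std_max = [], [], [], []
    for x in smallest:
        elements = [y[nearest_index(a, x)] for a, y in zip(xs, ys)]
        m = statistics.mean(elements)
        s = statistics.pstdev(elements)
        means.append(m)
        stds.append(s)
        std_min.append(m - s)
        std_max.append(m + s)
    return means, stds, std_min, std_max
```

The binary search makes each lookup O(lg n) instead of the O(n lg n) sort-per-element in the original inner loop.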

algorithms – Is there a theorem that says when an array of numbers can be searched faster than linearly?

In general, if you know nothing about the array, then searching linearly is the best you can do (a simple adversarial argument is enough to justify this).

However, if you know more about the structure of the array, there are plenty of things you can do.

For example, imagine the array has the following property: its elements are sorted increasingly up to a certain index $i$, and sorted decreasingly after that. Can you do faster than a linear search? Yes! You can still search for elements in $O(\lg n)$.

What if I ask you to search in an array $A$ that is sorted increasingly except for $1$ element, which is out of place? (You don't know which element.) You can still search in $O(\lg n)$, how :)?

What if the array $A$ holds the following property: when you divide it into chunks of size $\sqrt{n}$, say $A[0..\sqrt{n}-1], A[\sqrt{n}..2\sqrt{n}-1], \ldots, A[n-\sqrt{n}..n-1]$, then each of the chunks is sorted. Can you do faster than a linear search? Yes! If you do a binary search in each chunk, you get complexity $O(\sqrt{n}\lg n)$.
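The chunked case can be sketched in a few lines of Python (function name is mine):

```python
from bisect import bisect_left
from math import isqrt

def chunked_search(A, target):
    """Search an array whose consecutive chunks of size ~sqrt(n) are each
    sorted, by binary-searching every chunk: O(sqrt(n) * lg n) overall."""
    n = len(A)
    c = max(1, isqrt(n))  # chunk size ~ sqrt(n)
    for start in range(0, n, c):
        end = min(start + c, n)
        # binary search restricted to the sorted chunk A[start:end]
        i = bisect_left(A, target, start, end)
        if i < end and A[i] == target:
            return i
    return -1
```

With A = [1, 4, 9, 2, 3, 5, 0, 7, 8] (three sorted chunks of size 3), chunked_search(A, 5) returns 5 and chunked_search(A, 6) returns -1.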

What if you know the array $A$ holds only two distinct values, $x$ and $y$, and they both appear the same number of times? Can you find an element in fewer than $n$ comparisons? Yes!

In general, there are many different kinds of information you can have about an array that speed up search, and different forms of knowledge have different impacts on how much you can speed it up.

python – Cython seems faster than C++?

I decided to run some benchmarks, to compare performance numbers between Cython and C++, and it seems like Cython is more than 4 times faster than C++. How could this be?

My Cython code:

%%cython -fa
import numpy as np
import time

#a = np.zeros((10000, 100000))
cdef int s = 0
cdef int i, j
cdef double beginTime = time.time()
cdef int m = 10000, n = 10000000

for i in range(m):
    print(f"\rProgress: {round(i/100)}%, time elapsed: {round(time.time()-beginTime)}s", end="")
    for j in range(n):
        #a[i, j] = 3
        s += i*j
print(f"\nPerformance: {round(n/1e7*m/(time.time() - beginTime))/100} GFLOPS @ {round(n/1e7*m)/100}B ops")

Run results: 2.61 GFLOPS @ 100B ops

My C++ code:

#include <math.h>

#include <chrono>
#include <ctime>
#include <iostream>
#include <thread>

void dummyLoop(int m, int n) {
    int s = 0;
    auto beginTime = std::time(0);

    for (int i = 0; i < m; i++) {
        std::cout << "\rProgress: " << round(100 * i / m)
                  << "%, time elapsed: " << std::time(0) - beginTime << "s";
        for (int j = 0; j < n; j++) {
            s += i * j;
        }
    }
    std::cout << "\nPerformance: "
              << round(n / 1e7 * m / (std::time(0) - beginTime)) / 100
              << " GFLOPS @ " << round(n / 1e7 * m) / 100 << "B ops\n";
}

int main() {
    dummyLoop(10000, 1000000);
}

Run results: 0.63 GFLOPS @ 10B ops

I have tested with each running 10B ops and 100B ops, and I still get the same results. I have a 10700K with a boost clock of 5.1 GHz, so 2.6 GFLOPS sounds reasonable enough, as there's a multiply and an add operation. Also, I'm pretty sure both are running on a single thread, as I had htop open. So how can Cython beat C++?

calculus and analysis – How to integrate faster

I have always wanted to know how to speed up my integral computations in Mathematica. Are there some techniques I am unaware of that make the integration faster?

Here is an example:

func = (2 (\[Mu]G2 - \[Mu]Pi2) (5 x^2 + 5 x (z - 2) - 4 (\[Eta] - \[Rho] + z - 1)))/mb^2 - 12 (x + z - 2) (-\[Eta] - \[Rho] + x + z - 1)

The integration is given as:

Integrate[func, {x, 2*Sqrt[\[Eta]], 1 + \[Eta] - \[Rho]},
  {z, -((-(2*(\[Eta] - \[Rho] + Sqrt[x^2 - 4*\[Eta]] - 1)) + x*(Sqrt[x^2 - 4*\[Eta]] - 2) + x^2)/
     (Sqrt[x^2 - 4*\[Eta]] + x - 2)), (2*(-\[Eta] + \[Rho] + Sqrt[x^2 - 4*\[Eta]] + 1) -
     x*(Sqrt[x^2 - 4*\[Eta]] + 2) + x^2)/(Sqrt[x^2 - 4*\[Eta]] - x + 2)},
  Assumptions -> {0 < \[Rho] < 1, 0 < \[Eta] < 1, \[Rho] < \[Eta], \[Eta] + 1 > 2*Sqrt[\[Eta]] + \[Rho],
    Element[mb, Reals]}, GenerateConditions -> False]

On my laptop I need 42.219 s to solve the integral. However, my integrals are getting more and more complicated, so learning new optimization methods would be much appreciated.