Cython vs. Golang for resource-intensive tasks

My question stems from an older question that I asked a while ago. There is now consensus in the team to explore Cython, Golang, or something better that has good concurrency support. We have a custom Python auth module that we would like to reuse.

So my questions are:

  1. Which language, in your opinion, is best suited to the task at hand?
  2. In which other languages are modules available that would let us reuse the existing custom Python auth module?
  3. Does the language choice really matter?

python – Cython seems faster than C++?

I decided to run some benchmarks, to compare performance numbers between Cython and C++, and it seems like Cython is more than 4 times faster than C++. How could this be?

My Cython code:

%%cython -fa
import numpy as np
import time

#a = np.zeros((10000, 100000))
cdef int s = 0
cdef int i, j
cdef double beginTime = time.time()
cdef int m = 10000, n = 10000000

for i in range(m):
    print(f"\rProgress: {round(i/100)}%, time elapsed: {round(time.time()-beginTime)}s", end="")
    for j in range(n):
        #a[i, j] = 3
        s += i*j
print(f"\nPerformance: {round(n/1e7*m/(time.time() - beginTime))/100} GFLOPS @ {round(n/1e7*m)/100}B ops")

Run results: 2.61 GFLOPS @ 100B ops

My C++ code:

#include <math.h>

#include <chrono>
#include <ctime>
#include <iostream>
#include <thread>

void dummyLoop(int m, int n) {
    int s = 0;
    auto beginTime = std::time(0);

    for (int i = 0; i < m; i++) {
        std::cout << "\rProgress: " << round(100 * i / m)
                  << "%, time elapsed: " << std::time(0) - beginTime << "s";
        for (int j = 0; j < n; j++) {
            s += i * j;
        }
    }
    std::cout << "\nPerformance: "
              << round(n / 1e7 * m / (std::time(0) - beginTime)) / 100
              << " GFLOPS @ " << round(n / 1e7 * m) / 100 << "B ops\n";
}
int main() {
    dummyLoop(10000, 1000000);
}

Run results: 0.63 GFLOPS @ 10B ops

I have tested both programs at 10B ops and at 100B ops, and I still get the same results. I have a 10700K with a boost clock of 5.1 GHz, so 2.6 GFLOPS sounds reasonable enough, as each iteration has a multiply and an add. I’m also pretty sure both are running on a single thread, as I have htop open. So how can Cython beat C++?
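Before comparing the two figures, it may help to check what each listing actually computes. The sketch below is plain Python and not part of either benchmark; the helper names `reported_billions_of_ops` and `wrap_int32` are hypothetical. It reproduces the printed "B ops" figure from the loop bounds in each listing (the Cython loop uses n = 10000000, while the C++ `main` passes n = 1000000) and checks whether the products `i*j` stay within a 32-bit `int`, which is an assumption about the platform. Signed integer overflow is undefined behaviour in C and C++, so a compiler may treat such a loop in unexpected ways.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def reported_billions_of_ops(m: int, n: int) -> float:
    """Reproduce the 'B ops' figure both programs print: round(n/1e7*m)/100."""
    return round(n / 1e7 * m) / 100


def wrap_int32(value: int) -> int:
    """Wrap an integer into the signed 32-bit range (two's complement)."""
    return (value - INT32_MIN) % 2**32 + INT32_MIN


# Loop bounds copied from the two listings.
cython_m, cython_n = 10_000, 10_000_000
cpp_m, cpp_n = 10_000, 1_000_000

print(reported_billions_of_ops(cython_m, cython_n))  # Cython listing
print(reported_billions_of_ops(cpp_m, cpp_n))        # C++ listing

# Largest single product i*j in the Cython listing versus the 32-bit limit.
largest_product = (cython_m - 1) * (cython_n - 1)
print(largest_product > INT32_MAX)
print(wrap_int32(largest_product))
```

If the product exceeds the limit, the accumulated value of `s` is not meaningful in either program, so the timing mostly reflects whatever the compiler chose to do with the loop.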

python – Cython Fibonacci Sequence

For my current job, I’ve been told that I’ll soon be porting our Python code to Cython to improve performance. In preparation for that, I’ve been learning as much about Cython as I can. I decided that tackling the Fibonacci sequence would be a good start, so I implemented an iterative Cython version. Are there any more performance gains I can squeeze out of this? Thanks!

fibonacci.pyx

cpdef fib(int n):
    return fib_c(n)

cdef int fib_c(int n):
    cdef int a = 0
    cdef int b = 1
    cdef int c = n
    while n > 1:
        c = a + b
        a = b
        b = c
        n -= 1
    return c

And how I test this code:

testing.py

import pyximport ; pyximport.install() # So I can run .pyx without needing a setup.py file #
import time

from fibonacci import fib

start = time.time()
for i in range(100_000):
    fib(i)
end = time.time()
print(f"Fibonacci Sequence: {(end - start):0.10f}s (i=100,000)") # ~2.2s for 100k iterations
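Because `fib_c` declares its variables as C `int`, results beyond a certain `n` cannot be represented, which matters when the test loop runs up to 100,000. The following pure-Python reference is a sketch for checking results, assuming a 32-bit `int`; the names `fib_py` and `largest_n_fitting_int32` are hypothetical and do not appear in the original files.

```python
INT32_MAX = 2**31 - 1


def fib_py(n: int) -> int:
    """Iterative Fibonacci using Python's arbitrary-precision integers."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def largest_n_fitting_int32() -> int:
    """Return the largest n for which fib_py(n) still fits in a signed 32-bit int."""
    n = 0
    while fib_py(n + 1) <= INT32_MAX:
        n += 1
    return n


limit = largest_n_fitting_int32()
print(limit, fib_py(limit))
```

Comparing `fib(i)` against `fib_py(i)` for small `i` is a cheap way to confirm the Cython version is correct before timing it.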

python – How to write Cython pyx file wrapper for C++ files with no real constructor?

I am very new to Cython. I have a C++ library that I want to cythonize to create a Python module. There are structures in the header files that are not covered in the docs. For example, how do I declare this header file in a pyx file?

using namespace std;
class MeshCollider;
class Mesh : public Entity {
public:
    int no_surfs;
    list<Surface*> surf_list;
    vector<Bone*> bones;
    Matrix mat_sp;
    Mesh() {
        reset_bounds = true;
        min_x = 0.0; min_y = 0.0;
    }
    Mesh* CopyEntity(Entity* ent = NULL);
}

I don’t know C++ well but in this code, there seems to be no constructor of the same name as the class.
And in another header file named 3ds.h, there is this:

namespace load3ds{
  Mesh* Load3ds(string URL, Entity* parent_ent);
}

I only want to declare the functions that will be accessible to Python from the module. Because the library is huge, can I wrap only the class and function declarations in the header files and leave out the .cpp files when writing the pyx? Can I also omit the variable declarations in the headers? Writing out the entire library by hand would be impossible because of its size, and I could not find an easy tool to autogenerate the pyx files.

python – Optimize nested for loop with array of varying size with cython

Dynamically growing arrays (also called resizable arrays, variable arrays, or variable-length arrays) are a type of array that is very useful when you don’t know the exact size of the array at design time. You first need to define an initial number of elements (wiki).
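As a rough illustration of the idea, here is a minimal sketch of a growable array built on the standard-library `array` module. The class name `GrowableArray` and the capacity-doubling policy are assumptions made for the example, not something taken from the question.

```python
from array import array


class GrowableArray:
    """Array of doubles that doubles its capacity when it runs out of room."""

    def __init__(self, initial_capacity: int = 4) -> None:
        if initial_capacity < 1:
            raise ValueError("initial_capacity must be at least 1")
        self._data = array("d", [0.0] * initial_capacity)  # preallocated storage
        self._size = 0  # number of slots actually in use

    def __len__(self) -> int:
        return self._size

    @property
    def capacity(self) -> int:
        return len(self._data)

    def append(self, value: float) -> None:
        if self._size == self.capacity:
            # Out of room: double the storage by padding with zeros.
            self._data.extend([0.0] * self.capacity)
        self._data[self._size] = value
        self._size += 1

    def __getitem__(self, index: int) -> float:
        if not 0 <= index < self._size:
            raise IndexError("index out of range")
        return self._data[index]


values = GrowableArray(initial_capacity=2)
for k in range(5):
    values.append(k * 0.5)
print(len(values), values.capacity)
```

Doubling keeps the number of reallocations small relative to the number of appends, which is the usual reason for choosing it over growing by one element at a time.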

Problem

Cython can be used to improve the speed of nested for loops in Python. It helps to have some experience in C (which I lack). I am trying to run a nested for loop similar to the one below as fast as possible in Cython.

import numpy as np

my_list = [1,2,3]
n = 10
a = 0.5

Estimate_1_list = []
Estimate_2_list = []

for l in my_list:

    # Resizable matrices
    a_mat = np.zeros((l,n+1),float)
    b_mat = np.zeros((l,n+1),float)

    for i  in range(n):
        t = i*a

        for j in range(l):

            # Fill matrices
            a_mat[j,i+1] = a_mat[j,i+1] + np.random.random()

            b_mat[j,i+1] = a_mat[j,i+1]/(2*t+3)

    # Append values of interest to use at different values of matrix size
    Estimate_1_list.append(np.mean(a_mat[:,n]))
    Estimate_2_list.append(np.std(a_mat[:,n]))
results = (Estimate_1_list,Estimate_2_list)

My solution

My best solution using Cython so far is below. It is obviously not the fastest. I am trying to run this simulation faster in Cython than I have below.

import cython
# Load cython extension
%load_ext Cython

%%cython
import numpy as np

def my_function(list my_list, int n, int a ):
    cdef list Estimate_1_list = []
    cdef list Estimate_2_list = []
    cdef int l,i,t,j
    for l in my_list:

        # Resizable matrices (could I use memory view?)
        a_mat = np.zeros((l,n+1),float)
        b_mat = np.zeros((l,n+1),float)

        for i  in range(n):
            t = i*a

            for j in range(l):

                # Fill matrices
                a_mat[j,i+1] = a_mat[j,i+1] + np.random.random()

                b_mat[j,i+1] = a_mat[j,i+1]/(2*t+3)

        # Append values of interest to use at different values of matrix size
        Estimate_1_list.append(np.mean(a_mat[:,n]))
        Estimate_2_list.append(np.std(a_mat[:,n]))

    # Return results
    results = (Estimate_1_list,Estimate_2_list)
    return results


#Test
my_list = [1,2,3]
n = 10
a = 0.5
my_function([1,2,3], 10, 0.5 )

([0.13545224609230933, 0.6603542545719762, 0.6632002117071227],
 [0.0, 0.19967544614685195, 0.22125180486616808])

I am new to Cython. I managed to find similar questions about Java and Cython on Stack Exchange, but nothing that implements this in Cython in a similar fashion to what I have above. I have listed them below.

The main problem I ran into is that Cython has different scoping rules from Python, since C and Python have different scoping rules. In other words, we cannot create a new vector in the loop and assign it to the same name (source).

Very similar questions

  1. dynamic array creation in cython
  2. Python to cython – improve performance for iterations over large arrays
  3. Improve cython array indexing speed

Conclusion

Can anyone please help me improve the Cython solution above? I am trying to get better at working with Cython and had no C/C++ experience before Python. Please forgive me if I have described anything incorrectly; this was my best attempt at explaining the problem.

I am trying to take the Python code above and optimize it so that it runs faster using Cython. My solution works but is too slow. From what I understand, one can improve Cython code by using a more C-like approach.
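Independently of Cython, the two inner loops can be expressed as whole-array NumPy operations, which is often a useful baseline before adding C types. The sketch below is one possible rewrite, not the asker's method: the function name `simulate` and the `seed` parameter are hypothetical, and it swaps `np.random.random()` for a `np.random.default_rng` generator so that a run can be reproduced.

```python
import numpy as np


def simulate(sizes, n, a, seed=None):
    """Vectorized version of the nested loop: one pass per matrix size."""
    rng = np.random.default_rng(seed)
    t = np.arange(n) * a  # t = i*a for i = 0..n-1
    means, stds = [], []
    for l in sizes:
        a_mat = np.zeros((l, n + 1))
        b_mat = np.zeros((l, n + 1))
        # Columns 1..n each receive one uniform draw per row.
        a_mat[:, 1:] = rng.random((l, n))
        # Broadcasting divides column i+1 by (2*t[i] + 3).
        b_mat[:, 1:] = a_mat[:, 1:] / (2 * t + 3)
        means.append(float(np.mean(a_mat[:, n])))
        stds.append(float(np.std(a_mat[:, n])))
    return means, stds


means, stds = simulate([1, 2, 3], n=10, a=0.5, seed=0)
print(means, stds)
```

As in the original loop, `b_mat` is filled but not used for the estimates. With a single row the standard deviation is zero, which matches the first value in the output shown earlier.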

Calling a Python function that returns a C-like struct from a C program, using the Cython and ctypes interfaces

I am new to Cython/ctypes, and I am trying to call a Python function from a C program using the Cython interface, but the data is either empty or incorrect. Here is the sample program.

python function

$ cat c_struct.py
#!/usr/bin/env python2.7

from ctypes import *

class Request(Structure):
    _pack_ = 1
    _fields_ = [
            ('type', c_ubyte),
            ('subtype', c_ubyte),
            ('action', c_ubyte),
            ('checksum', c_ushort)
            ]

    def __repr__(self):
        return "'type': {}, 'subtype': {}, 'action': {}, 'checksum': {}".format(self.type,
                self.subtype, self.action, self.checksum)


req_msg = Request()
def get_message(typ):
    if typ == 1:
        req_msg.type = 12
        req_msg.subtype = 2
        req_msg.action = 3
        req_msg.checksum = 0x1234
        return req_msg
    else:
        return "String object From Python"

cython wrapper

$ cat caller.pyx
import sys
sys.path.insert(0, '')

from c_struct import get_message

cdef public const void* get_data_frm_python(int typ):
    data = get_message(typ)
    print "Printing in Cython -",data
    return <const void*>data

and finally my ‘C’ caller

cat main.c
#include <Python.h>
#include "caller.h"
#include <stdio.h>

typedef struct {
    uint8_t type;
    uint8_t subtype;
    uint8_t action;
    uint16_t checksum;
} __attribute__ ((packed)) request_t;

int
main()
{
    PyImport_AppendInittab("caller", initcaller);
    Py_Initialize();
    PyImport_ImportModule("caller");

    const char* str = get_data_frm_python(2);
    printf("Printing in C - %s\n", str);

    const request_t* reqP = (request_t*)get_data_frm_python(1);
    printf("Printing in C - %u, %u, %u, %u\n", reqP->type, reqP->subtype, reqP->action, reqP->checksum);

    return 0;
}

and a simple makefile to build it

$ cat Makefile
target = main
cy_interface = caller

CY := cython
PYTHONINC := $(shell python-config --includes)
CFLAGS := -Wall $(PYTHONINC) -fPIC -O0 -ggdb3
LDFLAGS := $(shell python-config --ldflags)

CC=gcc

all: $(target)

%.c: %.pyx
        $(CY) $+

%.o: %.c
        $(CC) -fPIC $(CFLAGS) -c $+

$(target): $(cy_interface).o $(target).o
        $(CC) $(CFLAGS) -o $@ $^ $(LDFLAGS)

And finally the output 🙁

$ ./main
Printing in Cython - String object From Python
Printing in C -
Printing in Cython - 'type': 12, 'subtype': 2, 'action': 3, 'checksum': 4660
Printing in C - 1, 0, 0, 0

Can someone please help me understand what I am doing wrong?

Note: If I change void* to char*, at least the string data is fetched properly, but it segfaults for the struct.

$ cat caller.pyx
import sys
sys.path.insert(0, '')

from c_struct import get_message

cdef public const void* get_data_frm_python(int typ):
    data = get_message(typ)
    print "Printing in Cython -",data
    return <const char*>data


$ ./main
Printing in Cython - String object From Python
Printing in C - String object From Python
Printing in Cython - 'type': 12, 'subtype': 2, 'action': 3, 'checksum': 4660
TypeError: expected string or Unicode object, Request found
Exception TypeError: 'expected string or Unicode object, Request found' in 'caller.get_data_frm_python' ignored
Segmentation fault
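One thing that may be worth checking is which address the C side actually receives. A ctypes `Structure` instance is a Python object that owns a separate C buffer, and a pointer to the Python object is not a pointer to that buffer. The sketch below is plain Python 3 (the original script targets Python 2.7) and only shows how to find the buffer's address and raw bytes with `ctypes.addressof`, `ctypes.sizeof` and `ctypes.string_at`. It is not a fix for the Cython wrapper itself.

```python
import ctypes
import struct


class Request(ctypes.Structure):
    """Same packed layout as the Request structure in c_struct.py."""
    _pack_ = 1
    _fields_ = [
        ("type", ctypes.c_ubyte),
        ("subtype", ctypes.c_ubyte),
        ("action", ctypes.c_ubyte),
        ("checksum", ctypes.c_ushort),
    ]


req = Request(12, 2, 3, 0x1234)

# Address and size of the C data held by the ctypes object.
address = ctypes.addressof(req)
size = ctypes.sizeof(req)

# Copy the raw bytes out of that buffer, as a C caller would see them.
raw = ctypes.string_at(address, size)

# The same layout built by hand: native byte order, no padding.
expected = struct.pack("=BBBH", 12, 2, 3, 0x1234)
print(size, raw == expected)

# Round-trip: rebuild a structure from the raw bytes.
copy = Request.from_buffer_copy(raw)
print(copy.type, copy.subtype, copy.action, copy.checksum)
```

If the wrapper handed the C caller the value of `ctypes.addressof(...)`, or a copy of these bytes, the `request_t` cast in `main.c` would at least be looking at memory with the expected layout.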

Python – protection / obfuscation of C++ & Cython code (cross-platform)

I wrote a library in C++ that is wrapped with Cython so that its functions can be called from Python. I need to protect / obfuscate the C++ source code (and the Cython code too, though that is not essential) while still being able to distribute it to users on Windows / Unix / Mac. Does anyone have any clues or hints on how I could do this?

How to translate Python infinite integers in Cython?

I want to factorize very large numbers (for example, 100-bit, 200-bit numbers) with Cython.

Hello everybody, I have implemented the elliptic curve method for factoring in Python 3.6. Now I want to speed up my code with Cython (version 0.29.13). I'm a beginner in the Cython world, but I know that Cython works better when I declare types for variables. So I converted all my Python classes to Cython classes, and now I want to type the variables. I had read that Cython automatically converts a "cdef int" declaration into a classic Python integer of unlimited length, but that's not the case. When I try to factorize a number like 5192296858543544183479685583896053, I get an OverflowError because "int is too big to convert to C long". Is there any way to declare a huge integer so that I can speed up my code? The only variables without a type declaration are those that can hold very large integers.

PS: I've already tried using the cpython type uPY_LONG_LONG (long unsigned), but it was useless because I always got the same error.

Thank you for your attention and in advance for your answers.
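To see why both `long` and an unsigned long long fail here, it can help to compare the number against fixed-width ranges. The sketch below is plain Python; the helpers `fits_in_signed` and `fits_in_unsigned` are hypothetical names, and the 64-bit width is an assumption about the platform's C types.

```python
N = 5192296858543544183479685583896053  # the number from the question


def fits_in_signed(value: int, bits: int) -> bool:
    """True if value is representable as a signed two's-complement integer of `bits` bits."""
    return -(1 << (bits - 1)) <= value <= (1 << (bits - 1)) - 1


def fits_in_unsigned(value: int, bits: int) -> bool:
    """True if value is representable as an unsigned integer of `bits` bits."""
    return 0 <= value < (1 << bits)


print(N.bit_length())
print(fits_in_signed(N, 64), fits_in_unsigned(N, 64))

# Python ints keep full precision regardless of size.
print((N * N) // N == N)
```

A value that does not fit any fixed-width C integer type has to stay a Python object, which is why such variables are usually left untyped.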