python – ffill with limit in numpy

For performance reasons I’d like to use Numpy to do the same kind of forward fill I can get with Pandas like so:

from numpy import nan
import pandas as pd
s = pd.Series([1, 2, nan, nan, nan, 7, 8, nan, nan, nan, nan, nan, nan, nan, nan])
s.ffill(limit=7)

which results in:

array([ 1.,  2.,  2.,  2.,  2.,  7.,  8.,  8.,  8.,  8.,  8.,  8.,  8.,  8., nan])

I’ve got a forward fill without limit working, but that doesn’t do it for me.
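For reference, one possible NumPy-only approach (a sketch, not a drop-in answer; the helper name ffill_limit is just illustrative) is to track the index of the most recent non-NaN value and then invalidate any fill whose distance from that value exceeds the limit:

import numpy as np

def ffill_limit(a, limit):
    """Forward-fill NaNs in a 1D array, but only up to `limit` positions."""
    a = np.asarray(a, dtype=float)
    # Index of the most recent non-NaN value at each position.
    idx = np.where(~np.isnan(a), np.arange(len(a)), 0)
    np.maximum.accumulate(idx, out=idx)
    filled = a[idx]                                     # plain forward fill (copy)
    filled[np.arange(len(a)) - idx > limit] = np.nan    # undo fills beyond the limit
    return filled

On the series above this reproduces the pandas output, including the trailing NaN once the gap exceeds seven positions.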

python – Efficient NumPy sliding window function

Here is a function for creating sliding windows from a 1D NumPy array:

from math import ceil, floor
import numpy as np

def slide_window(A, win_size, stride, padding = None):
    '''Collects windows that slide over a one-dimensional array.

    If padding is None, the last (rightmost) window is dropped if it
    is incomplete, otherwise it is padded with the padding value.
    '''
    if not (0 < stride <= win_size):
        fmt = 'Stride must satisfy 0 < stride <= %d.'
        raise ValueError(fmt % win_size)

    n_elems = len(A)
    if padding is not None:
        n_windows = ceil(n_elems / stride)
        shape = n_windows, win_size
        A = np.pad(A, (0, n_windows * win_size - n_elems),
                   constant_values = padding)
    else:
        n_windows = floor(n_elems / stride)
        shape = n_windows, win_size

    elem_size = A.strides[-1]
    return np.lib.stride_tricks.as_strided(
        A, shape = shape,
        strides = (elem_size * stride, elem_size),
        writeable = False)

Meant to be used like this:

>>> slide_window(np.arange(5), 3, 2, -1)
array([[ 0,  1,  2],
       [ 2,  3,  4],
       [ 4, -1, -1]])

Is my implementation correct? Can the code be made more readable? In NumPy 1.20 there is a function called sliding_window_view, but my code needs to work with older NumPy versions.
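For comparison, on NumPy 1.20+ the unpadded case can be written with the built-in helper mentioned above plus a strided slice (a sketch for reference only; it does not reproduce the padding behaviour):

import numpy as np

A = np.arange(5)
win_size, stride = 3, 2
# Every window of length win_size, then keep every stride-th one:
np.lib.stride_tricks.sliding_window_view(A, win_size)[::stride]
# array([[0, 1, 2],
#        [2, 3, 4]])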

Transforming a txt file with Python and Numpy

Hello everyone, I'm just getting started with Python and I have the following problem: I have a log file in txt format containing the information below:
[image: sample of the raw log file]

I would like to split it so that it ends up in the following form:
[image: desired table layout]

My idea was to write the Python code by first replacing certain items with "," (comma). For example, in the first column I need to separate the IP from the rest of the information (there is some text before the IP); the second field is the date, which has a "- - (" before it, and the same goes for the GET. That way I could then generate a table automatically from this data, and later on I will insert this data into a MySQL database.
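Since the screenshots are missing, here is a sketch that assumes the usual Apache-style access-log layout the description suggests (IP, then "- -", then the bracketed date, then the quoted GET request); the file names access.log and access.csv are placeholders:

import csv
import re

# Assumed line format, e.g.
#   127.0.0.1 - - [16/May/2019:08:45:00 -0300] "GET /index.html HTTP/1.1" 200 512
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<date>[^\]]+)\] "(?P<request>[^"]*)" (?P<status>\d+) (?P<size>\S+)'
)

with open("access.log") as src, open("access.csv", "w", newline="") as dst:
    writer = csv.writer(dst)
    writer.writerow(["ip", "date", "request", "status", "size"])
    for line in src:
        match = LOG_PATTERN.match(line)
        if match:
            writer.writerow(match.groups())

The resulting CSV can then be loaded into MySQL (for example with LOAD DATA INFILE or a small script using a MySQL connector).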

python – Make numpy calculation super fast by reducing size of intermediate arrays

I am trying to perform a certain numerical computation in Python with Numpy:

import numpy as np

# Given:

I = 100
O = 1000
F = 10

o = np.random.random((O,F))
ev = np.random.choice(a=(False, True), size=(O,F))
x = np.random.random((I,F))

const1 = 1.3456
const2 = 2.3456
const3 = 3.3456
const4 = 4.3456

# My calculations:

diff = o[np.newaxis, :, :] - x.reshape((-1, 1, x.shape[x.ndim - 1]))

# Might be accessible from cache:
nsh = np.einsum('iof,iof,of,->io', diff, diff, ev, 1 / const1)
kv = const2 * np.exp(nsh + const3)
# :Might be accessible from cache

result = np.einsum('io,iof,of->if', kv, diff, ev)

I was trying to optimize for speed using broadcasting in the first line (diff = ...) and einsum in the last line. However, as one can easily see, the first line produces a very large array diff. Is it possible to avoid creating this large array while improving the speed, or at least keeping the current speed?

One idea would be to replace

nsh = np.einsum('iof,iof,of,->io', diff, diff, ev, 1 / const1)

with

X = x
Y = o
X_sqr = np.sum(X ** 2, axis=1)
Y_sqr = np.sum(Y ** 2, axis=1)
nsh = (X_sqr[:, np.newaxis] - 2.0 * X.dot(Y.T) + Y_sqr) / const1

which is much faster. However, it does not include ev yet, because I don't know how to include it and still get the right result. It also does not remove the need to create diff, because I still need it in the final line.
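For reference, here is a sketch (not benchmarked) of how ev might be folded into that expansion and how the final contraction could be split so that diff is never materialized; it uses the arrays defined above and relies only on expanding (o - x)² under the mask:

evf = ev.astype(o.dtype)   # promote the boolean mask once
oe = o * evf               # (O, F)

# sum_f (o_of - x_if)^2 * ev_of
#   = sum_f x_if^2 ev_of - 2 sum_f x_if o_of ev_of + sum_f o_of^2 ev_of
nsh = ((x ** 2) @ evf.T - 2.0 * x @ oe.T + np.sum(o * oe, axis=1)) / const1
kv = const2 * np.exp(nsh + const3)

# result_if = sum_o kv_io (o_of - x_if) ev_of
#           = (kv @ (o * ev))_if - x_if * (kv @ ev)_if
result = kv @ oe - x * (kv @ evf)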

python – Is there a Numpy or pyTorch function for this code?

Basically is there a Numpy or PyTorch function that does this:

dims = vp_sa_s.size()
for i in range(dims[0]):
    for j in range(dims[1]):
        for k in range(dims[2]):
            # to mimic the Matlab functionality: vp(mdp_data.sa_s)
            try:
                vp_sa_s[i, j, k] = vp[mdp_data['sa_s'][i, j, k]]
            except:
                print('didnt work with', mdp_data['sa_s'])

Given that vp_sa_s is size (10, 5, 5) and each value is a valid index into vp, i.e. in the range 0-9. vp is size (10, 1) with a bunch of random values.

Matlab does it elegantly and quickly with vp(mdp_data.sa_s), which forms a new (10, 5, 5) matrix. If all values in mdp_data.sa_s are 1, the result would be a (10, 5, 5) tensor with each value being the 1st value in vp.

Is there a function or method that can achieve this in less than O(N^3) time? The above code is terribly inefficient.

Thanks!
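For reference, a minimal sketch using advanced indexing, assuming mdp_data['sa_s'] holds integer indices into the first dimension of vp (shapes as described in the question); the same expression works in NumPy:

import torch

vp = torch.rand(10, 1)
sa_s = torch.randint(0, 10, (10, 5, 5))   # stands in for mdp_data['sa_s']
vp_sa_s = vp[sa_s, 0]                     # advanced indexing, shape (10, 5, 5)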

python – Solving a System of linear equations with 3 variables without numpy

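A minimal sketch of one common plain-Python approach for a 3×3 system (Cramer's rule); the helper names det3 and solve3 are illustrative, and a unique solution is assumed:

def det3(m):
    # Determinant of a 3x3 matrix given as a list of lists.
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(a, b):
    d = det3(a)
    if d == 0:
        raise ValueError("System has no unique solution")
    solution = []
    for i in range(3):
        m = [row[:] for row in a]
        for r in range(3):
            m[r][i] = b[r]        # replace column i with the right-hand side
        solution.append(det3(m) / d)
    return solution

# Example: x + y + z = 6, 2y + 5z = -4, 2x + 5y - z = 27  ->  [5.0, 3.0, -2.0]
print(solve3([[1, 1, 1], [0, 2, 5], [2, 5, -1]], [6, -4, 27]))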

performance – Optimizing python function using numpy.where and numpy.unique to find availabilities

I’m using a function to determine whether resources can be used again or not.
This is the NumPy array I’m using:

 'Resource_id', 'Start_date', 'end_date', 'start_time', 'end_time', 'overload'
   (548, '2019-05-16', '2019-05-16', '08:45:00', '17:40:00',2),
   (546, '2019-05-16', '2019-05-16', '08:45:00', '17:40:00',2),
   (546, '2019-05-16', '2019-05-16', '08:45:00', '17:40:00',2),
   (543, '2019-05-16', '2019-05-16', '08:45:00', '17:40:00',1),
  1. The first step is to find all resources available on a date, for example 2019-05-16 from 8:30 to 17:30. To achieve this I used np.where as in the example below:

     av_resource_np = resource_availability_np[np.where(
     (resource_availability_np[:, 1] <= '2019-05-16')
     & (resource_availability_np[:, 2] >= '2019-05-16')
     & (resource_availability_np[:, 3] <= '17:30:00')
     & (resource_availability_np[:, 4] >= '08:30:00'))]
    
  2. Here I try to find the unique resource ids (together with their overload factor) and how many times each appears, using np.unique():

     unique_id, count_nb = np.unique(av_resource_np[:, [0, 5]], axis=0, return_counts=True)
     availability_mat = np.column_stack((unique_id, count_nb))
    

Which yields the following result:

'Resource_id' 'overload' 'Count'
548           2           1
546           2           2
543           1           1
  3. A simple filter is then applied to select the resources that can't be used on this date: if a resource is already used on the same date a number of times greater than or equal to (>=) its overload, then we can't use it again.

      rejected_resources = availability_mat[np.where(availability_mat[:, 2] >= availability_mat[:, 1])]
    

The result here should be both resources 543 and 546, which can't be used again.

So this is the main idea behind my function. The problem is that it takes more than 60% of the whole program's runtime, and I would appreciate any advice on how to make it more efficient/faster. Thank you.

Full code:

def get_available_rooms_on_date_x(date, start_time, end_time, resource_availability_np):

    av_resource_np = resource_availability_np[np.where(
    (resource_availability_np[:, 1] <= date)
    & (resource_availability_np[:, 2] >= date)
    & (resource_availability_np[:, 3] <= end_time)
    & (resource_availability_np[:, 4] >= start_time))]

    unique_id, count_nb = np.unique(av_resource_np[:, [0, 5]], axis=0, return_counts=True)

    availability_mat = np.column_stack((unique_id, count_nb))

    rejected_resources = availability_mat[np.where(availability_mat[:, 2] >= availability_mat[:, 1])]

    return rejected_resources
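For what it's worth, one small cleanup that is sometimes suggested (a sketch, not benchmarked on this data; the name get_rejected_resources is just illustrative) is indexing with the boolean mask directly instead of wrapping it in np.where, since the mask alone already selects the same rows:

import numpy as np

def get_rejected_resources(date, start_time, end_time, resource_availability_np):
    # Boolean mask instead of np.where; same rows, one less temporary array.
    mask = ((resource_availability_np[:, 1] <= date)
            & (resource_availability_np[:, 2] >= date)
            & (resource_availability_np[:, 3] <= end_time)
            & (resource_availability_np[:, 4] >= start_time))
    av_resource_np = resource_availability_np[mask]

    unique_rows, counts = np.unique(av_resource_np[:, [0, 5]], axis=0,
                                    return_counts=True)
    availability_mat = np.column_stack((unique_rows, counts))

    # Reject resources whose usage count reaches their overload factor.
    return availability_mat[availability_mat[:, 2] >= availability_mat[:, 1]]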

python – How can I view a 1D array in numpy as a 2D (1 by n) array?

I apologize if this has been asked before, but I can’t seem to find an answer or I am not searching for the answer correctly.

I am currently writing code in Python using NumPy, and my function takes a matrix as input. I want to view a 1D array as a (1 by n) 2D array.

Here is a minimal example of my issue

import numpy as np


def add_corners(A, B):
    r = A[0, 0] + B[B.shape[0] - 1, B.shape[1] - 1]
    return r


C = np.array([[1, 2, 3], [4, 5, 6]])
D = np.array([[9, 8], [7, 6], [5, 4], [10, 11]])
E = np.array([1, 2, 3, 4, 5])

print(add_corners(C, D))
print(add_corners(C, E))

print(add_corners(C, E)) leads to an error, since E.shape[1] is not well defined. Is there a way to get around this without having to add an if statement to check whether my input is a 1D array? That is, I want to refer to the entries of E as E[1, x] as opposed to just E[x].

Any help is greatly appreciated!
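For reference, a sketch of one common way to handle this (changing add_corners to normalize its inputs is an assumption about what is acceptable here): np.atleast_2d turns a length-n 1D array into a (1, n) view and leaves 2D arrays untouched.

import numpy as np

def add_corners(A, B):
    # Normalize both inputs to 2D so 1D arrays behave as (1, n) matrices.
    A = np.atleast_2d(A)
    B = np.atleast_2d(B)
    return A[0, 0] + B[B.shape[0] - 1, B.shape[1] - 1]

E = np.array([1, 2, 3, 4, 5])
print(np.atleast_2d(E).shape)   # (1, 5)
# E.reshape(1, -1) or E[np.newaxis, :] produce the same (1, n) view.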