How can I reduce this C# method's length?

I’m 16 and have never formally studied computer science, although I try to adopt the best practices I can when it comes to programming. There’s no fixed rule for how long a method should be, but I think we can all agree that the method in question does too many things and can be shortened.

The class in question is the core entry point of my whole application. Each instance of the application acts as a "worker" for a scraper network. I would like to focus on the ProcessQueueItem method, and on the Process method too if possible.

Here is the method in question:

private ScraperQueueItemResult ProcessQueueItem(ScraperQueueItem item, ISocialEventHandler eventHandler)
{
    _logger.Trace($"Processing {item.Item}");
    
    try
    {
        eventHandler.NavigateToProfile();

        if (eventHandler.IsProfileVisitsThrottled())
        {
            eventHandler.SwitchAccount(true);
            return ProcessQueueItem(item, eventHandler);
        }

        if (!eventHandler.TryWaitForProfileToLoad())
        {
            return new ScraperQueueItemResult(
                item.Id, "Method 'TryWaitForProfileToLoad' returned false.", eventHandler.GetPageSource(), false
            );
        }

        var profile = eventHandler.CreateProfile();

        if (!profile.ShouldScrape(out var validationResult))
        {
            return new ScraperQueueItemResult(
                item.Id, validationResult.ToString(), eventHandler.GetPageSource(), false
            );
        }

        var connectionsToStore = new List<ProfileConnection>();

        if (profile.ShouldCollectConnections())
        {
            profile.Connections = eventHandler.GetConnections();

            connectionsToStore = eventHandler.GetFilteredConnections(profile.Connections);

            if (connectionsToStore.Any())
            {
                _logger.Trace(
                    $"Collected {profile.Connections.Count} / {profile.FollowerCount}, storing {connectionsToStore.Count} of them in the database.");
            }
        }

        if (profile.ShouldSave(out validationResult))
        {
            if (!profile.IsPrivate)
            {
                profile.Posts = eventHandler.GetPosts(profile.Username);
            }

            profile.Save();

            if (connectionsToStore.Any())
            {
                _scraperQueue.AddItems(eventHandler.ConvertConnectionsToQueueItems(connectionsToStore), profile.Id);
            }

            return new ScraperQueueItemResult(
                item.Id, "success", "", true
            );
        }

        if (connectionsToStore.Any())
        {
            _scraperQueue.AddItems(eventHandler.ConvertConnectionsToQueueItems(connectionsToStore), profile.Id);
        }

        return new ScraperQueueItemResult(
            item.Id, validationResult.ToString(), eventHandler.GetPageSource(), false
        );
    }
    catch (Exception e)
    {
        _bugSnagClient.Notify(e, report =>
        {
            report.Event.Metadata.Add("queue_item", item.Item);
            report.Event.Metadata.Add("current_url", eventHandler.GetUrl());
            report.Event.Metadata.Add("page_source", eventHandler.GetPageSource());
            report.Event.Metadata.Add("created_at", DateTime.Now.ToString("yyyy-MM-dd HH:mm:ss"));
        });
        
        // TODO: Try and requeue the item?

        _logger.Error(e.Message);
        
        return new ScraperQueueItemResult(
            item.Id, e.Message, eventHandler.GetPageSource(), false
        );
    }
}

ScraperWorker class (ProcessQueueItem is omitted below, since it is shown in full above):

public class ScraperWorker
{
    private readonly ScraperQueue _scraperQueue;
    private readonly Dictionary<string, ISocialEventHandler> _eventHandlers;
    private readonly ScraperWorkerDao _scraperWorkerDao;
    private readonly ScraperQueueDao _scraperQueueDao;
    private readonly Client _bugSnagClient;
    private readonly ILogger _logger;

    private bool _isProcessing;

    public ScraperWorker(
        ScraperQueue scraperQueue, 
        Dictionary<string, ISocialEventHandler> eventHandlers, 
        ScraperWorkerDao scraperWorkerDao, 
        ScraperQueueDao scraperQueueDao,
        Client bugSnagClient,
        ILogger logger)
    {
        _scraperQueue = scraperQueue;
        _eventHandlers = eventHandlers;
        _scraperWorkerDao = scraperWorkerDao;
        _scraperQueueDao = scraperQueueDao;
        _bugSnagClient = bugSnagClient;
        _logger = logger;
        
        Process();
    }

    public void Start()
    {
        _isProcessing = true;
    }

    public void Stop()
    {
        _isProcessing = false;
    }

    private void Process()
    {
        var ticksSinceStatusUpdate = 0;
        
        new Thread(() =>
        {
            while (true)
            {
                ticksSinceStatusUpdate++;

                if (ticksSinceStatusUpdate >= 10)
                {
                    RenewWorkerStatus(); // Checks if we have paused the worker via an external service.
                    ticksSinceStatusUpdate = 0;
                }

                if (!_isProcessing)
                {
                    _logger.Warning($"The worker is currently paused, sleeping for 30 seconds.");

                    Thread.Sleep(TimeSpan.FromSeconds(30));
                    continue;
                }
                
                if (!_scraperQueue.TryGetItem(out var item))
                {
                    _logger.Warning($"The queue is currently empty, sleeping for 30 seconds.");

                    Thread.Sleep(TimeSpan.FromSeconds(30));
                    continue;
                }

                _scraperWorkerDao.UpdateWorkerLastSeen(StaticState.WorkerId);

                var eventHandler = ResolveEventHandlerFromItem(item.Item);
                
                eventHandler.SetCurrentItem(item.Item);

                if (eventHandler.IsLoginNeeded())
                {
                    eventHandler.Login();
                }

                var result = ProcessQueueItem(item, eventHandler);

                if (result.IsSuccess)
                {
                    _logger.Success($"Finished processing {item.Item}");
                }
                else
                {
                    _logger.Trace($"Finished processing {item.Item}");
                }

                Console.WriteLine();

                _scraperQueueDao.MarkItemAsComplete(item.Id);
                _scraperQueueDao.StoreItemResultInDatabase(result);
            }
        }).Start();
    }

    private void RenewWorkerStatus()
    {
        _isProcessing = _scraperWorkerDao.IsWorkerRunning(StaticState.WorkerId);
    }

    // ProcessQueueItem(ScraperQueueItem item, ISocialEventHandler eventHandler) as shown in full above.

    private ISocialEventHandler ResolveEventHandlerFromItem(string item)
    {
        var host = new Uri(item).Host.Replace("www.", "");

        if (_eventHandlers.ContainsKey(host))
        {
            return _eventHandlers[host];
        }

        throw new Exception($"Failed to resolve event handler for host '{host}'");
    }
}

lens – Seeming contradiction on Wikipedia about rear nodal points and how the principal plane relates to focal length

There are two lines on the Wikipedia pages for “cardinal points” and “focal length” that seem to contradict each other, and I would be extremely grateful if someone could explain why they do not. The page for cardinal points says:

If the medium surrounding the optical system has a refractive index of 1 (e.g., air or vacuum), then the distance from the principal planes to their corresponding focal points is just the focal length of the system. In the more general case, the distance to the foci is the focal length multiplied by the index of refraction of the medium.

This makes sense to me. I also get that these principal planes can often be located outside of the lens with some clever optics, allowing for lenses that are physically shorter than their focal length. However, on the page for focal length, the page reads:

When a photographic lens is set to “infinity”, its rear nodal point is separated from the sensor or film, at the focal plane, by the lens’s focal length. Objects far away from the camera then produce sharp images on the sensor or film, which is also at the image plane.

I don’t see how these can both be true, because if the focal point (the point where, as I understand it, all the light converges) were on the film plane, an image wouldn’t be rendered; it would just be an indistinguishable point of light. Doesn’t the light have to travel some distance past the focal point to the film plane in order to form an image?
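For what it’s worth, the relation I keep coming back to is the thin-lens equation (with object and image distances measured from the corresponding principal planes), and it only deepens my confusion:

$$\frac{1}{s_o} + \frac{1}{s_i} = \frac{1}{f}, \qquad s_o \to \infty \;\Rightarrow\; s_i \to f$$

If I’m reading that right, it says the image of an object at infinity forms exactly one focal length behind the rear principal plane, i.e. right at the focal plane, which matches the second quote but clashes with my intuition above.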

I think it is possible that I am getting my front and rear nodal points confused, or that I have a larger fundamental misunderstanding about how focal length is measured. Thank you so much for your help!

graphs – Can Johnson’s algorithm for simple cycles be modified in order to find only cycles up to length L (but all of them)?

I have a question regarding Johnson’s algorithm for finding all simple cycles in a graph.

I was wondering whether it is possible to modify the algorithm so that it finds only cycles up to a given length.

Having read into the algorithm, my approach would be the following:

Assume I want only cycles up to length L.
Then, once the stack has reached height L, if the current node on top of the stack is not a neighbor of the starting node, I treat it as having no neighbors left, i.e. I remove it from the stack again.

I am not deep enough into network theory, however, to be sure that this modification still guarantees finding all simple cycles (up to length L).
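To make the idea concrete, here is a rough Python sketch of the cut-off I have in mind. It is a plain bounded DFS rather than the full algorithm (I have left out Johnson’s blocking/unblocking bookkeeping), so it is only meant to illustrate where the length check would go; the function name and structure are my own.

from typing import Dict, Iterable, List

def bounded_simple_cycles(graph: Dict[int, Iterable[int]], max_len: int) -> List[List[int]]:
    """Enumerate simple cycles with at most max_len nodes in a directed graph.

    graph maps each node to its out-neighbours; nodes must be comparable
    (e.g. ints) so that each cycle is reported once, rooted at its smallest node.
    """
    cycles = []

    for start in sorted(graph):
        stack = [start]          # current path, as in Johnson's algorithm
        on_stack = {start}

        def dfs(v):
            for w in graph.get(v, ()):
                if w < start:
                    continue                      # stay inside the subgraph of nodes >= start
                if w == start and len(stack) >= 2:
                    cycles.append(list(stack))    # closed a cycle back to the root
                elif w not in on_stack:
                    if len(stack) >= max_len:
                        # The modification: the path already holds max_len nodes,
                        # so treat v as having no neighbours left and don't go deeper.
                        continue
                    stack.append(w)
                    on_stack.add(w)
                    dfs(w)
                    on_stack.discard(stack.pop())

        dfs(start)

    return cycles

# e.g. a triangle with an extra back edge:
g = {1: [2], 2: [3, 1], 3: [1]}
print(bounded_simple_cycles(g, 3))   # [[1, 2, 3], [1, 2]]
print(bounded_simple_cycles(g, 2))   # [[1, 2]]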

astrophotography – Equivalent of the 500 rule for working out how many frames you can take with a given focal length before resetting/moving a manual mount?

I’m new to astrophotography and I don’t have a tracker; I just have a light pollution filter, an intervalometer and a camera on a tripod with a few different prime lenses. I’m using Deep Sky Stacker to make photos of deep-sky objects.

I’ve read about the 500 rule, which has been useful for estimating the maximum length of my exposures with my different lenses so as to avoid star trails. However, I’d also like to use my intervalometer to take a consistent number of exposures before re-centering the camera on the subject, so that the fraction of the frame I lose to stars drifting into or out of view stays consistent across the shoot.
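For reference, the version of the rule I’ve been using is (focal length f in millimetres, on a body with crop factor c):

$$t_{\max} \;\approx\; \frac{500}{c \times f}\ \text{seconds}$$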

Is there a similar rule for doing this? It seems like it should be possible but I can’t figure out the math.

r – “Aesthetics must be either length 1” with different x, y and colour arguments

I have a problem that should be quite simple to fix but I haven’t found any answers that are directly applicable to my situation.

I am trying to create a plot with geom_point in which the points shown are filtered by the value of a character vector. y is a continuous numeric, x is a date, and the colour is mapped to a character vector.

Here’s my sample data:

year    month   day attempt n_test
2019    6   22  1   NA
2019    7   13  2   n
2019    8   3   3   n
2019    8   20  4   n
2019    9   3   5   n
2019    9   4   6   n
2019    9   8   7   n
2019    9   11  8   p
2019    9   17  9   n
2019    10  3   10  n
2019    10  3   11  n
2019    10  11  12  c
2019    10  22  13  n
2019    10  25  14  n
2019    10  28  15  p
2019    11  6   16  c
2019    11  9   17  n
2019    11  25  18  n
2019    12  4   19  n
2019    12  8   20  n
2019    12  14  21  p
2019    12  17  22  n
2019    12  20  23  n

This is called ‘ntest.csv’.

Here’s my code:

ntest <- read.csv('ntest.csv', header = TRUE)
n_date <- ymd(paste(ntest$year, ntest$month, ntest$day, sep="-"))
ggplot(ntest, aes(n_date, y=attempt)) +
    geom_point(aes(colour = n_test), size = 3.5) +
    labs(x=NULL) +
    theme(legend.position="none",
          axis.text.x = element_text(color = "black", size = 10, angle=45),
          axis.text.y = element_text(color = "black", size = 10),
          axis.title.y = element_text(size = 13, vjust = 2)) +
    scale_x_date(date_breaks = "months" , date_labels = "%b-%y")

This gives the attached graph.


I want to only show the rows in my geom_point graph where n_test equals “p”.
So the same graph, with only the blue points.
I’ve tried using

ntest %>% 
filter(n_test=="p")

before ggplot, but this results in:

“Error: Aesthetics must be either length 1 or the same as the data (3): x”

Any help would be greatly appreciated.

python – Pandas: Combine and average data in a column based on length of month

I have a dataframe which consists of departments, year, the month of invoice, the invoice date and the value.

I have offset the invoice dates by business days, and what I am now trying to achieve is to combine all the months that have the same number of working days (the ‘count’ of each month by year) and to average the value for each day.

The data I have is as follows:

                    Department  Year      Month      Invoice Date   Value
0                Sales          2019      March       2019-03-25   1000.00
1                Sales          2019      March       2019-03-26   2000.00
2                Sales          2019      March       2019-03-27   3000.00
3                Sales          2019      March       2019-03-28   4000.00
4                Sales          2019      March       2019-03-29   5000.00
...                        ...   ...        ...              ...       ...
2435            Specialist      2020     August       2020-08-27   6000.00
2436            Specialist      2020     August       2020-08-28   7000.00
2437            Specialist      2020  September       2020-09-01   8000.00
2438            Specialist      2020  September       2020-09-02   9000.00
2439            Specialist      2020  September       2020-09-07   1000.00

The count of each month is as follows:

Year  Month
2019  April        21
      August       21
      December     20
      July         23
      June         20
      March         5
      May          21
      November     21
      October      23
      September    21
2020  April        21
      August       20
      February     20
      January      22
      July         23
      June         22
      March        22
      May          19
      September     5

My hope is that, using this count, I could aggregate the data from the original df and average, for example, April, August, May, November and September (2019) together with April (2020), as they all have 21 working days in the month.

The result would be one dataframe in which, for each working-day count, every day position in the month is the average across the combined months.
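Roughly, the operation I have in mind is something like the sketch below (column names are the ones from my example, and I am ignoring the Department column for simplicity; I am not at all sure this is the right pandas approach, which is why I am asking):

import pandas as pd

# df is the original dataframe shown above, with columns:
# Department, Year, Month, 'Invoice Date', Value
df['Invoice Date'] = pd.to_datetime(df['Invoice Date'])

# working-day count per Year/Month (this should reproduce the count table above)
counts = (df.groupby(['Year', 'Month'])['Invoice Date']
            .nunique()
            .rename('WorkingDays')
            .reset_index())

# attach the count to every row
df = df.merge(counts, on=['Year', 'Month'])

# position of each invoice day within its month (1st working day, 2nd, ...)
df['DayIndex'] = (df.groupby(['Year', 'Month'])['Invoice Date']
                    .rank(method='dense')
                    .astype(int))

# average Value per day position across all months sharing a working-day count
result = (df.groupby(['WorkingDays', 'DayIndex'])['Value']
            .mean()
            .reset_index())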

I hope that makes sense.

Note: please ignore the months showing only 5 days; that’s just incomplete data for those months.

Thank you

buffer overflow – Is there any security risk in not setting a maximum password length?

A limit is recommended simply to avoid exhausting resources on the server.

Without a limit, an attacker could call the login endpoint with an extremely large password, say a gigabyte (let’s ignore whether it’s practical to send that much at once; you could instead send 10 MB at a time, but more quickly).

Any work the server needs to do on the password will now be that much more expensive. This applies not just to password hashing but every level of processing to reassemble the packets and get them to the application. Memory usage on the server also increases considerably.

Just a few concurrent 10MB login requests will start having an impact on server performance, perhaps to the point of exhausting resources and triggering a denial of service.

These may not be security issues in the sense of password or data leakage, but crippling a service through DoS or crashing it definitely is. Note that I make no mention of buffer overflows: decent code can handle arbitrarily large passwords without overflowing.

To wrap up, I think when someone says “there’s no reason to limit the number of characters of a password”, they are talking about commonly seen small limits (eg: 10 or 20 characters). There is indeed no reason for those other than laziness or working with old systems. A limit of 256 characters which is larger than desired by most people (except those testing those limits) is reasonable and can prevent some of the issues related to arbitrarily-large payloads.