Asynchronous – Python async request using Tor and User Agent rotation

The following code scrapes data from three APIs that list properties. Every listing needs its own request, so there will be about 20,000 requests. That's why I use asyncio to make them. However, the server bans me if I send requests too frequently, so I use Tor and User-Agent rotation to get around that.

The data I scraped is stored in MongoDB.

I taught myself Python. I wonder if the code can be improved further.

import time
import math
import json
from bson import json_util #for import into mongodb
from datetime import datetime

import requests
from stem import Signal
from stem.control import Controller
from fake_useragent import UserAgent
import aiohttp
from aiohttp import ClientSession
import asyncio
from aiohttp_socks import SocksConnector, SocksVer

import pandas as pd

import pymongo
from pymongo import MongoClient

#https://boredhacking.com/tor-webscraping-proxy/
#https://www.unixmen.com/run-tor-service-arch-linux/
#https://www.sylvaindurand.org/use-tor-with-python/
#https://stem.torproject.org/faq.html (authenticate)

client = pymongo.MongoClient('localhost', 37017)
db = client.centaline
db_propertyInfo = db.propertyInfo
db_propertyDetails = db.propertyDetails
db_buildingInfo = db.buldingInfo

current_time = datetime.now()

propertyInfo = "https://hkapi.centanet.com/api/FindProperty/MapV2.json?postType={}&order=desc&page={}&pageSize={}&pixelHeight=2220&pixelWidth=1080&sort=score&wholeTerr=1&platform=android"

propertyDetails = "https://hkapi.centanet.com/api/FindProperty/DetailV2.json?id={}&platform=android"

buildingInfo = "https://hkapi.centanet.com/api/PropertyInfo/Detail.json?cblgcode={}&cestcode={}&platform=android"

# The HTTP header name is "User-Agent" (hyphen, not underscore).
headers = {"User-Agent": UserAgent().random,
           "Host": "hkapi.centanet.com", "Content-Type": "application/json; charset=UTF-8"}

def switchIP():
    with Controller.from_port(port = 9051) as controller:
        controller.authenticate()
        controller.signal(Signal.NEWNYM)

def numOfRecords(postType):
    switchIP()
    session = requests.session()
    session.proxies = {}
    session.proxies['http'] = 'socks5://localhost:9050'
    session.proxies['https'] = 'socks5://localhost:9050'

    data = session.post(propertyInfo.format(postType, 1, 1), headers = headers) #page=1 pageSize=1
    data = json.loads(data.text)
    data = data["DItems"]
    data = pd.DataFrame.from_records(data)
    totalRecords = data["Count"].sum()
    print("Total number of records for {}: {}".format(postType, totalRecords))

    return totalRecords

async def getPropertyInfo(postType, page, pageSize, session):
    while True:
        try:
            async with session.post(propertyInfo.format(postType, page, pageSize), headers = headers) as resp:
                print("started inserting property info")
                data = await resp.text()
                data = json.loads(data)
                for idx, item in enumerate(data['AItems']):
                    item["DateTime"] = current_time
                    item["Source"] = "centaline"
                    db_propertyInfo.insert_one(item)
                switchIP()
                break
        except Exception as e:
            print(str(e))
            print("Retry Property Info")

async def bound_getPropertyInfo(semaphore, postType, page, pageSize, session):
    async with semaphore:
        await getPropertyInfo(postType, page, pageSize, session)

pageSize = 1
#noOfPages_s = math.ceil(numOfRecords('s')/pageSize)
#noOfPages_r = math.ceil(numOfRecords('r')/pageSize)
noOfPages_s = 1
noOfPages_r = 0

async def run_info():
    tasks = []
    socks = 'socks5://localhost:9050'
    connector = SocksConnector.from_url(socks)
    semaphore = asyncio.Semaphore(50)
    async with ClientSession(connector = connector) as session:
        for page in range(noOfPages_s):
            task = asyncio.ensure_future(bound_getPropertyInfo(semaphore, 's', page+1, pageSize, session))
            tasks.append(task)

        for page in range(noOfPages_r):
            task = asyncio.ensure_future(bound_getPropertyInfo(semaphore, 'r', page+1, pageSize, session))
            tasks.append(task)

        response = await asyncio.gather(*tasks)

loop = asyncio.get_event_loop()
future = asyncio.ensure_future(run_info())
loop.run_until_complete(future)

id_list = db_propertyInfo.distinct("ID", {"DateTime": current_time})
cblgcode_list = db_propertyInfo.distinct("CblgCode", {"DateTime": current_time})
cestcode_list = db_propertyInfo.distinct("Cestcode", {"DateTime": current_time})

async def fetch(url, session, socks, db):
    while True:
        text = ""  # raw response body, kept so the except branch can inspect it
        try:
            async with session.post(url, headers = headers) as resp:
                text = await resp.text()
                switchIP()
                data = json.loads(text)
                data["DateTime"] = current_time
                data["Source"] = "centaline"
                db.insert_one(data)
                print("Inserting Data")
                return data
        except Exception as e:
            print(str(e))
            # Give up instead of retrying when the body carries this error message.
            if text.find("Sequence contains no elements") != -1:
                break
            else:
                print(url)
                print("Retry One Property Details")
async def bound_fetch(semaphore, url, session, socks, db):
    async with semaphore:
        return await fetch(url, session, socks, db)

async def run():
    tasks = []
    socks = 'socks5://localhost:9050'
    connector = SocksConnector.from_url(socks)
    semaphore = asyncio.Semaphore(100)
    async with ClientSession(connector = connector) as session:
        for i in id_list:
            url = propertyDetails.format(i)
            task = asyncio.ensure_future(bound_fetch(semaphore, url, session, socks, db_propertyDetails))
            tasks.append(task)

        for j,k in zip(cblgcode_list, cestcode_list):
            url = buildingInfo.format(j,k)
            task = asyncio.ensure_future(bound_fetch(semaphore, url, session, socks, db_buildingInfo))
            tasks.append(task)

        response = await asyncio.gather(*tasks)

loop = asyncio.get_event_loop()
future = asyncio.ensure_future(run())
loop.run_until_complete(future)
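
One thing that stands out in the code above is that both `while True` loops retry forever if a request keeps failing. A minimal sketch of a bounded retry with exponential backoff, using only asyncio, is shown here. The names `retry_async`, `max_attempts` and `base_delay` are hypothetical and not part of the original code, and the `flaky` coroutine merely simulates a failing request.

```python
import asyncio


async def retry_async(make_call, max_attempts=5, base_delay=0.5):
    """Await make_call() until it succeeds or max_attempts is reached.

    make_call is a zero-argument callable that returns a fresh coroutine,
    because a coroutine object cannot be awaited twice.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return await make_call()
        except Exception as exc:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller see the error
            delay = base_delay * 2 ** (attempt - 1)
            print(f"attempt {attempt} failed ({exc!r}); retrying in {delay}s")
            await asyncio.sleep(delay)


async def demo():
    calls = {"n": 0}

    async def flaky():
        # Simulated request: fails twice, then succeeds.
        calls["n"] += 1
        if calls["n"] < 3:
            raise ConnectionError("simulated failure")
        return "ok"

    result = await retry_async(flaky, max_attempts=5, base_delay=0.01)
    return result, calls["n"]


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In the scraper, `make_call` could be a small function that performs one `session.post(...)` and parses the body, so that a permanently failing URL is reported once instead of blocking a semaphore slot indefinitely.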

AC.Commutative Algebra – Artinian Tor Module (Reference Request)

I am looking for a reference for the following basic fact:

Let $R$ be a noetherian ring, let $M$ be an Artinian $R$-module, let $N$ be a finitely generated $R$-module, and let $i \in \mathbb{N}$. Then $\operatorname{Tor}_i^R(M, N)$ is Artinian.

(I know it's easy to prove, but I still suspect that this is written somewhere in the standard literature on homological algebra, and I'd like to know where.)

Note 1: The above conclusion also holds if $R$ is coherent, $M$ is Artinian and $N$ is finitely presented. A reference for this generalization would be fine, too.

Note 2: Leamer proves in his thesis (2010) a generalization to the case where $R$ is noetherian, $M$ is Artinian and $N$ is minimax. But I think there must be an earlier reference for the less general case.
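
For completeness, the easy argument alluded to above can be sketched as follows. This is only a sketch of the standard proof, not the reference being asked for; $F_\bullet$ and $n_j$ are notation introduced here.

```latex
% R noetherian and N finitely generated: N has a resolution
% F_\bullet \to N by finitely generated free modules F_j \cong R^{n_j}.
\[
  \operatorname{Tor}_i^R(M, N) \;\cong\; H_i\bigl(M \otimes_R F_\bullet\bigr),
  \qquad
  M \otimes_R F_j \;\cong\; M^{n_j}.
\]
% Each M^{n_j} is Artinian (a finite direct sum of Artinian modules), and
% \operatorname{Tor}_i^R(M, N) is a subquotient of M^{n_i}:
\[
  \operatorname{Tor}_i^R(M, N)
  \;=\;
  \frac{\ker\bigl(M^{n_i} \to M^{n_{i-1}}\bigr)}
       {\operatorname{im}\bigl(M^{n_{i+1}} \to M^{n_i}\bigr)} .
\]
% Submodules and quotients of Artinian modules are Artinian, so Tor is Artinian.
```

The same argument covers Note 1, since over a coherent ring a finitely presented module also admits a resolution by finitely generated free modules.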

Compromising Tor Browser – Information Security Stack Exchange

I'm just starting to work with Tor, and I'm wondering about everyone who launches it with ./start-tor-browser.desktop: couldn't an attacker modify the script so that it starts a backdoor in the background once the script is run? I'm about to try the same for research purposes.

The first idea has a drawback: unless you compromise the Tor Browser download website (which would not work because Tor Browser is signed, although the signature only protects users who download the signature file and verify it), you must already have access to the target computer and write access to the Tor directory.

If you have ideas or know of someone who has already done this, please leave a comment.

Second idea: this has been done on a larger scale with malicious Tor exit nodes that patched downloaded binaries with malicious versions. So my idea is that you control a local area network and have the means to identify the Tor users on it. You could then specifically target their traffic and patch the binaries they download with malware.

Links: https://www.leviathansecurity.com/blog/the-case-of-the-modified-binaries

My second idea is not about infecting Tor Browser downloads. I'm talking about tampering with the downloads of Tor users: monitor the local network traffic, single out the Tor users, and then patch the binaries in their downloads with malware.
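
Both ideas depend on the victim running a binary they never verified. As a minimal sketch of the defensive side, the snippet below compares a downloaded file against a published SHA-256 digest. The function names are hypothetical, and a digest check is weaker than verifying the project's signature: it only helps if the expected digest was obtained over a channel the attacker does not control.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """True if the file's SHA-256 equals the digest published for it."""
    actual = sha256_of_file(path)
    # compare_digest avoids leaking the position of the first mismatch.
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

A binary patched in transit, as in the exit-node case linked above, would produce a different digest and fail this check.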

How does Lightning Network work via TOR?

Running a Lightning node over Tor is no different from running it over a normal IP connection. Sending payments, settling payments, sending error messages and so on work the same in both cases. The only difference is that the messages you send to your peer now travel over the Tor network instead of in direct IP packets.

Suppose you use only Tor, without a public IP address. To forward your payment to a node that has only a public IP address, there must be a node on your path to the recipient that either (1) has both Tor and a public IP, or (2) has a public IP address and uses the socks5 proxy to connect to Tor nodes. Without such a node in between, you will not be able to send the payment.

When the Tor service is started, it creates a socks5 proxy, which by default listens on 127.0.0.1:9050. When a node with a public IP is started with the option --proxy=127.0.0.1:9050 (or with that option in its configuration file), it can connect to nodes running Tor (as yours does).

If you run Tor and also have a public IP address, you can connect both to nodes running Tor and to public IP nodes, using the socks5 proxy of the Tor service where needed.
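
Before pointing a node at the proxy, it can help to confirm that something is actually listening on the proxy address. The following is a minimal sketch using only the standard library; `socks_port_is_open` is a hypothetical helper, and it only checks that the TCP port accepts connections, not that the listener is really Tor.

```python
import socket


def socks_port_is_open(host="127.0.0.1", port=9050, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    print("Tor SOCKS proxy reachable:", socks_port_is_open())
```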

I am the sender, but I have activated TOR in my node. What does sending to a node R outside the TOR network look like?

Network routing works as described above. However, the path calculation for the payment is done on your node and does not take into account which network transport you are using. You create the onion routing packet with the path to the recipient (the channels used for the payment) and send that onion and the payment_hash to your peer in the update_add_htlc message. This message then travels through Tor relays before it reaches your peer, instead of reaching your peer directly.

I am the recipient node and do not advertise an IP address but an onion address. How do I receive payments?

You can receive payments directly from nodes running Tor. If you want to receive payments from nodes that have only a public IP address, there must be a node on the path that has the proxy option set, so that it can connect to Tor nodes through the socks5 proxy.

The sender node S is a regular LN node without Tor connections. As the sender, I want to send a payment to node R, and my LN node finds the best path to R. Are the nodes on the selected path onion nodes, or do they have access to Tor? And if so, what does the routing look like?

Suppose the path from S to R looks like this: S -> T -> U -> V -> R. A number of cases can arise:

  • Neither S nor R runs Tor: it depends on the intermediate nodes.
    • All nodes could be on a public IP, and your payment will go through.
    • T could be a node with both a public IP and Tor. It has a public IP channel with you and a Tor channel with U. U can have the proxy option set, which lets it use a Tor-based channel with T and a public IP channel with V. V is a public IP node, and U forwards the payment to V in the normal way.
  • R is running Tor: at least one node in between must run or understand Tor.
    • T / U / V has both a public IP and Tor, so it can create channels with Tor nodes and with public IP nodes.
    • T / U / V are all public IP nodes, but V has the proxy option set, which allows it to establish a Tor-based channel connection with R.
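
The cases above can be sketched as a small reachability check. This is a deliberately simplified model, not how any Lightning implementation decides connectivity: `Node`, `hop_possible` and `path_possible` are hypothetical names, "runs Tor" and "has the proxy option set" are treated as the same capability, and a hop is considered possible whenever both ends share a network.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    public_ip: bool  # has a public (clearnet) IP address
    tor: bool        # runs Tor, or has the proxy option set to reach Tor nodes

    def networks(self):
        """Networks over which this node can talk to a peer."""
        nets = set()
        if self.public_ip:
            nets.add("clearnet")
        if self.tor:
            nets.add("tor")
        return nets


def hop_possible(a, b):
    """A channel connection needs at least one network in common."""
    return bool(a.networks() & b.networks())


def path_possible(path):
    """Every consecutive pair of nodes on the path must be connectable."""
    return all(hop_possible(a, b) for a, b in zip(path, path[1:]))


if __name__ == "__main__":
    s = Node("S", public_ip=True, tor=False)
    t = Node("T", public_ip=True, tor=False)
    u = Node("U", public_ip=True, tor=False)
    v = Node("V", public_ip=True, tor=True)   # public IP node with the proxy option
    r = Node("R", public_ip=False, tor=True)  # onion-only recipient
    print(path_possible([s, t, u, v, r]))
```

With V lacking Tor access in this model, the last hop V -> R has no common network and the path is rejected, which matches the statement that at least one node in between must understand Tor.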

centos – Tor Browser can not find a .onion website

My Tor Browser cannot open any .onion link, not even popular ones like facebookcorewwwi.onion. No .onion link works at all, and the message "Server Not Found" is returned every time.

- I have a VirtualBox CentOS 7 guest with the Tor Browser binary downloaded from the Tor website and with the Tor service enabled (tor installed and started at system startup).

- The Internet connection is OK. I can access any "surface" website: DuckDuckGo and any other.

- The network settings are "Manually defined", proxy 127.0.0.1, port 9050.

- I get an OK status on the Tor network check, and the IP address of the exit node is correct.

- The clock is enabled and OK on both the host and the guest system (currently being tested). I got OK for the guest and a 0.7 delay on the host.

What am I missing?

Apache 2.4 and Tor

I host an information and forum site, and many users have asked for Tor access, but I have a problem adding it. When a connection is made via the server IP, everything is fine and the site works completely. When connected via the hostname, the page is completely broken and all links point directly to 127.0.0.1 instead of the correct domain. All assets are dynamic.

I have no .htaccess, and the VirtualHost is only changed in

DocumentRoot /var/www/html/forum

apache2.conf has
LogLevel debug
<Directory /var/www/>
AllowOverride All
</Directory>

mod_rewrite is enabled, but my attempts to write a working rule failed. Do I have to use VirtualHost, .htaccess, or a combination of both to redirect all 127.0.0.1 links to the correct domain? And how can I do this while keeping the ability to have multiple mirrors?

[Cloud Firewall] Proxy, VPN, Tor, Spam and Bot detection.

Fire Mason (https://firemason.io) is an IP reference site with proxy, VPN, spam, Tor and bot detection. With our data you can easily carry out fraud checks in your online store, detect malicious players in your online game and much more!

We are currently in beta, but the product works very well. We are looking for feedback so we can improve :)

Integrations

We are happy to help you integrate Fire Mason with your service or product. Just contact us here by private message.

Discounts

We are pleased to offer discounts to DigitalPoint members. Write me a message and we can discuss your specific business needs :)

Some questions about Lightning and Tor

If you read BOLT 07, you will find that Lightning nodes and channels can each be either private or public.

This is independent of whether they are running on Tor or not.

The node_announcement message explicitly supports announcing that a node is running on Tor, as described in BOLT 07.

The following address descriptor types are defined:

  • 1: ipv4; data = [4:ipv4_addr][2:port] (length 6)
  • 2: ipv6; data = [16:ipv6_addr][2:port] (length 18)
  • 3: Tor v2 onion service; data = [10:onion_addr][2:port] (length 12)
    • Version 2 onion service addresses; encodes an 80-bit, truncated SHA-1
      hash of a 1024-bit RSA public key for the onion service (a.k.a. Tor
      hidden service).
  • 4: Tor v3 onion service; data = [35:onion_addr][2:port] (length 37)
    • Version 3 (prop224) onion service addresses; encodes:
      [32:32_byte_ed25519_pubkey] || [2:checksum] || [1:version], where
      checksum = sha3(".onion checksum" | pubkey || version)[:2].
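
To make the layout above concrete, here is a minimal parsing sketch for a single address descriptor, assuming a one-byte type followed by the data described in the list, with the port in big-endian byte order. `parse_descriptor`, `onion_v3_checksum` and `ADDR_LEN` are hypothetical names, and SHA3-256 is assumed for the `sha3` in the checksum formula. It is an illustration, not a full node_announcement parser.

```python
import base64
import hashlib
import ipaddress
import struct

# Descriptor type -> length of the address part in bytes (the port adds 2 more).
ADDR_LEN = {1: 4, 2: 16, 3: 10, 4: 35}


def onion_v3_checksum(pubkey, version=b"\x03"):
    """First two bytes of SHA3-256(".onion checksum" | pubkey | version)."""
    return hashlib.sha3_256(b".onion checksum" + pubkey + version).digest()[:2]


def parse_descriptor(buf):
    """Parse one descriptor; return (type, host, port, remaining_bytes)."""
    if not buf:
        raise ValueError("empty buffer")
    kind = buf[0]
    if kind not in ADDR_LEN:
        raise ValueError(f"unknown address descriptor type {kind}")
    n = ADDR_LEN[kind]
    if len(buf) < 1 + n + 2:
        raise ValueError("truncated address descriptor")
    addr = buf[1:1 + n]
    (port,) = struct.unpack(">H", buf[1 + n:1 + n + 2])
    rest = buf[1 + n + 2:]
    if kind in (1, 2):
        # 4 packed bytes give an IPv4 address, 16 give an IPv6 address.
        host = str(ipaddress.ip_address(addr))
    else:
        if kind == 4 and addr[32:34] != onion_v3_checksum(addr[:32], addr[34:35]):
            raise ValueError("bad v3 onion checksum")
        # Onion hostnames are the base32 form of the address bytes.
        host = base64.b32encode(addr).decode("ascii").lower() + ".onion"
    return kind, host, port, rest
```

For example, the bytes `01 7f 00 00 01 26 07` parse as type 1, host 127.0.0.1, port 9735.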

However, I think that most users who run on Tor value their privacy and do not announce their node.

In general, nodes can only be announced if they have at least one public channel. This prevents spam and DoS attacks on the gossip protocol. Since some people only have private channels, their nodes are not announced.

In addition, most mobile nodes such as Eclair open private channels by default, since it would hardly be useful to users if their phone used up their carrier's data plan acting as a routing node.

Privacy – How to buy Bitcoin anonymously and privately via Tor?

First of all, anonymity and privacy are two different things. Bitcoin is designed to be pseudonymous, which comes close to anonymity, but instead of being completely anonymous, you are linked to your Bitcoin address, your pseudonym.
You cannot make a private transaction on the Bitcoin network, but you can try to be as anonymous as possible.

It is possible to run Bitcoin Core as a Tor hidden service and to connect to such services.
The first step is to run Bitcoin Core behind a Tor proxy. This will make all outgoing connections anonymous, but more is possible.

Further instructions on GitHub:
https://github.com/bitcoin/bitcoin/blob/master/doc/tor.md