SEO – Google cannot retrieve a large sitemap with 50,000 URLs, and browsers fail to render it

My sitemap contains 50,000 URLs (7.8 MB) and uses the following URL syntax: maquiagem, 2019-10-03T17:12:01-03:00

The problems are:

• Search Console reports that the sitemap could not be read.

• Loading the sitemap in Chrome takes 1 hour, and Chrome stops responding.

• In Firefox, the sitemap was downloaded in 1483 ms and fully loaded after 5 minutes.

Things that I have tried without success:

• Disabled gzip compression.

• Deleted my .htaccess file.

• Created a test sitemap with 1,000 URLs and the same syntax, and submitted it to Search Console. However, the sitemap with 50,000 URLs still shows the message that the sitemap could not be retrieved.

• Tried to inspect the sitemap URL directly, but an error occurred and I was asked to try again later, whereas the 1,000-URL sitemap worked.

• Validated the sitemap on five different websites (Yandex, etc.), and all of them passed without errors or warnings.

Any ideas?
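
One further check that can be scripted is whether the file stays within the sitemap protocol's per-file limits of 50,000 URLs and 50 MB uncompressed. The sketch below is only an illustration: `check_sitemap` is a hypothetical helper name, and it assumes the file uses the standard sitemaps.org namespace. A sitemap with exactly 50,000 URLs sits right at the URL limit.

```python
import xml.etree.ElementTree as ET

# Namespace and limits from the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000
MAX_BYTES = 50 * 1024 * 1024  # 52,428,800 bytes, uncompressed


def check_sitemap(xml_bytes: bytes) -> dict:
    """Count <url> entries and compare the file against the protocol limits.

    Hypothetical helper for illustration; it does not validate each entry.
    """
    root = ET.fromstring(xml_bytes)
    url_count = len(root.findall(f"{{{SITEMAP_NS}}}url"))
    size_bytes = len(xml_bytes)
    return {
        "url_count": url_count,
        "size_bytes": size_bytes,
        "within_url_limit": url_count <= MAX_URLS,
        "within_size_limit": size_bytes <= MAX_BYTES,
    }
```

Passing the raw bytes of the sitemap file (for example, read with `open(path, "rb").read()`) returns the counts and two booleans. This only rules out a limit violation; it does not explain a retrieval failure on Google's side.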

Privacy – How to combat browser fingerprinting?

The fingerprinting technique used by the EFF relies on nothing more than the "normal" JavaScript functions that websites need in order to work properly. It is possible to report false information to the outside, but then you risk either "falling behind":

  • the false information you ought to send changes and yours does not, which makes you unique, and suspicious;
  • the detection techniques change and you are unaware of it, so you become unique again;

or ending up with a really cumbersome browsing experience.

Assuming that you can use Tor, a VPN, or an open shell to tunnel your IP address, I think the safest approach is to boot a virtual machine, install a stock Windows 7 on it, and use it for every privacy-sensitive operation. Do not install anything unusual on it, and it will be a standard Windows 7 machine, one among a horde of similar machines.

You also have the advantage that the machine is isolated from your real system and that you can quickly snapshot or reinstall it. You can do this from time to time: the "you" who did all the previous browsing disappears, and a fresh "you" appears with a clean history.

This can be very useful because you can keep a "clean" snapshot and always restore it before you perform sensitive operations such as home banking. Some VMs also allow 'sandboxing', i.e., nothing done in the VM permanently changes its contents: all system changes, downloaded malware, installed viruses, and planted keyloggers disappear as soon as the virtual machine shuts down.

Any other technique would be no less intrusive and would involve a considerable amount of work on the browser or on an anonymizing proxy, which would have not only to clean up your headers and your JavaScript responses (as well as the fonts!), but to do so in a credible way.

In my opinion, not only would the total amount of work be the same (or even greater), but it would also be a much more complicated and less stable kind of work.

Install the most common operating system, stick to the bundled browser and software, and resist the temptation to customize it: what then tells you apart from hundreds of thousands of similar, just-installed, never-maintained computers on the Internet?

Update – browser behavior and side channels

Now I've installed a virtual Windows 7 machine and even upgraded it to Windows 10, as Joe Q. Average would do. I do not use Tor or a VPN. All an external site can see is that I'm connecting from Florence, Italy. There are thirty thousand connections just like mine. Even if my provider is known, there are still about nine thousand candidates left. Is that sufficiently anonymous?

It turns out that this is not the case. There could still be correlations that could be investigated, given sufficient access. For example, I play an online game and my input is sent immediately (character-buffered, not line-buffered). It becomes possible to fingerprint digram and trigram delays and, if the corpus is large enough, to determine that online user A is the same person as online user B (within the same online game, of course). The same problem could occur elsewhere.
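
The digram-delay idea can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a real attack tool: the function names `digram_profile` and `profile_distance`, the `(char, timestamp_ms)` input format, and the use of a plain mean absolute difference as the similarity measure are all hypothetical choices made for the example.

```python
from collections import defaultdict
from statistics import mean


def digram_profile(keystrokes):
    """Build a typing profile from (char, timestamp_ms) pairs in typing order.

    Returns {digram: mean delay in ms between its two key presses}.
    """
    delays = defaultdict(list)
    for (c1, t1), (c2, t2) in zip(keystrokes, keystrokes[1:]):
        delays[c1 + c2].append(t2 - t1)
    return {digram: mean(values) for digram, values in delays.items()}


def profile_distance(profile_a, profile_b):
    """Mean absolute delay difference over the digrams both profiles share.

    Returns None when the profiles have no digram in common.
    A small distance over a large corpus hints at the same typist.
    """
    shared = profile_a.keys() & profile_b.keys()
    if not shared:
        return None
    return mean([abs(profile_a[d] - profile_b[d]) for d in shared])
```

With enough captured input, two accounts whose profiles stay close across many digrams become candidates for being the same person, which is the correlation described above.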

When I browse the web, I usually visit the same websites in the same order. And of course I regularly visit my "personal pages" on several websites, e.g. Stack Overflow. A distinctive set of images is already cached in my browser and is either not downloaded at all or is bypassed with an HTTP If-Modified-Since or If-None-Match request. This combination of my habits and the browser's helpfulness is also a signature.

Given the abundance of tagging methods available to websites, it is unlikely that only cookies and passive data are being collected. For example, a site might advertise the need to install a font named Tracking-ff0a7a.otf, and the browser would dutifully download it. This file is not necessarily deleted when the cache is cleared. If it is not downloaded again on subsequent visits, that is evidence that I have already visited the site. The font need not be the same for all users: it could contain a unique combination of glyphs (for example, the character "1" might be drawn as a "d", "2" as an "e", and "4" as a "d" again; or this could be done with rarely used code points), and HTML5 can be used to draw the glyph string "12345678" onto an invisible canvas and upload the result as an image, which would then spell out the unique hex sequence 'deadbeef'. And this is a cookie in every sense.
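
The font trick amounts to a per-user substitution table. The sketch below models only that logic, under loud assumptions: `build_glyph_map` and `render` are hypothetical names, and `render` is a stand-in for the real canvas drawing step, which would produce an image rather than a string.

```python
PROBE = "12345678"  # the string the page would draw on the hidden canvas


def build_glyph_map(user_id_hex: str) -> dict:
    """Model a per-user font: the i-th probe digit is drawn as the i-th ID character."""
    if len(user_id_hex) != len(PROBE):
        raise ValueError("ID must have one character per probe digit")
    return dict(zip(PROBE, user_id_hex))


def render(text: str, glyph_map: dict) -> str:
    """Stand-in for canvas rendering: return what the glyphs would visibly show."""
    return "".join(glyph_map.get(ch, ch) for ch in text)
```

For a font built with `build_glyph_map("deadbeef")`, `render(PROBE, ...)` yields `"deadbeef"`, so the uploaded image carries the identifier even though the page only ever asked to draw "12345678".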

To fight this, I may have to:

  • After each browsing session, restore the VM from its snapshot (and reset the modem when I do). Always using the same VM would not be enough.
  • Use several different virtual machines or browsers, as well as well-known proxy services or Tor (for anonymity purposes it would not do to use a proxy that is unique to me, or of which I am the only user in Florence).
  • Routinely empty and/or purge the browser cache, and remember not to, for example, always open XKCD immediately after questionable content.
  • Adopt two or more different "personas", one for the services for which I want anonymity and one for those where I do not care, and make sure that they stay apart in separate VMs, so that no lasting link between them can be made by a savvy external agency.

This also shows that I had better have a good reason to want anonymity, because achieving it reliably will be a royal pain.

Network – ERR_SSL_PROTOCOL_ERROR from multiple websites in multiple browsers and devices

In Chrome, I get the dreaded ERR_SSL_PROTOCOL_ERROR when I visit some websites, two of which are Facebook and Instagram. This happens in every browser I have tried (Chrome, with similar errors in Firefox), both on my phone and on my laptop.

I've tried the suggestions here, except for the ones that use Chrome to open the proxy settings, since I get the same message as here, so I do not know how to clear the SSL state. Instead, I deleted all Chrome configuration files and reinstalled Chrome.

Interestingly, both the Facebook Messenger and Instagram apps were unable to reach their servers until I switched to mobile data. On mobile data I could access Facebook in the browser on my phone, and the apps worked too. They continued to work after I switched back to Wi-Fi.

Buoyed by this result, I tried to load Facebook on my laptop while using my phone as a Wi-Fi hotspot. I could then access Facebook. Unfortunately, this time the fix did not last when I went back to my own Wi-Fi.

In short, it's obviously something specific to my local network or connection. I reset the router to factory settings, and that did not help. The only unusual thing I can think of is a TV headend server running on a Raspberry Pi. Otherwise, I have my phone, my laptop and a smart TV on the same network.

I have run out of ideas and cannot get in touch with TalkTalk to see if the problem is on their side. Any suggestions would be greatly appreciated. If it is TalkTalk's fault, it would be useful to know so that I can threaten to terminate my contract. Thank you so much!

Root Access – Block a website in all web browsers

To curb addictive behavior on a particular news site, I'd like to block this site in all browsers on my OnePlus 6 smartphone (latest software).

My main browser on the smartphone is Free Adblocker Browser (version 64.0). Google Chrome also came with the smartphone; it is currently disabled, but I might use it in the future.

How can I block a website globally on Android (perhaps via /etc/hosts)?
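
As a rough illustration of the /etc/hosts idea mentioned in the question, the sketch below only generates the lines one would append to a hosts file. The helper name `hosts_block_lines` and the domain `news.example` are placeholders, not part of the original question; editing the hosts file on Android generally requires root, and whether every browser honours it is an assumption that would need testing.

```python
def hosts_block_lines(domains, sink="0.0.0.0"):
    """Return hosts-file lines that point each domain (and its www form) at a sink address.

    Hypothetical helper: it only builds text and does not touch any file.
    """
    lines = []
    for domain in domains:
        name = domain.strip().lower()
        if name.startswith("www."):
            name = name[len("www."):]
        # Block both the bare domain and the www subdomain.
        lines.append(f"{sink} {name}")
        lines.append(f"{sink} www.{name}")
    return lines
```

Note that a hosts file matches exact host names only, so other subdomains of the site would each need their own line.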

JS does not work properly in IE and Edge browsers

I have a small JS-based music player in the upper left corner of my home page (just above the logo). It works in all browsers except IE and Edge, which display NaN where the total duration of the track should appear after the page loads. I cannot attach a screenshot here, but you can follow either the link to my website or the link to an image host where the screenshot has been uploaded.
But if you try to load my website in FF or Chrome, then …

Honeypot for hunting browser zero-days, rootkits and malware

I want to create a honeypot (bot) for hunting browser zero-days (and zero-days in browser extensions). What is the best way to find 0-days automatically? (I want to create a sandboxed bot that visits websites and checks whether they try to exploit a vulnerability in the browser.) I think just sandboxing this bot and attaching a debugger would not be enough. Is there anything else I should consider?

My inspiration was the 2011 article "The Honeypot Incident – How strong is your UF (Reversing FU)".

If an exploit is well written and tested, it should not crash the browser, so wouldn't the attached debugger be pretty useless?

EDIT: I am a little confused by the "too broad" status, so let me explain my idea (I do not think there is anything new in it). The idea is to have a honeypot that can detect a zero-day. It would do this by visiting websites, clicking links, and mimicking ordinary user behavior. Our honeypot is actually a bot. The bot may not download executable binaries (we still want to be able to open PDF documents and Excel spreadsheets in the browser, which Google Chrome allows).

This is a difficult task because we have to detect something undetectable, so we have to recognize it by its traces and by deduction (hello, Sherlock! ;)). If we notice outgoing requests from our machine, e.g. to a C&C server, or discover some malware (we will use behavior-based malware detection rather than only matching hard-coded signatures, keeping in mind that this is very resource-intensive), then, since we prohibit downloading and running executable files, we know that the only way this sandbox machine could have been infected was through a zero-day.

Therefore we have to keep an eye on the memory state (dumps every X seconds, which is obviously very resource-intensive), create log files, and run our browsers in debug mode to track even more. If we then notice strange behavior on our machine, we have to check the recorded data for traces of a zero-day, because that is the only way attackers could have broken into the system.

I think this is the only sensible way to detect it without running executable files (click_me_to_see_secret_documents.exe would be too easy ;)).

EDIT 2: This question is only about detecting browser zero-days, not hardware zero-days or zero-days in any other software. Only browsers.
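
The reasoning by elimination described above can be written down as a tiny decision rule. This is only a sketch of the logic, not a detector: the `Observation` fields and the function name are hypothetical, and the rule additionally assumes a fully patched browser, since otherwise a known exploit would explain the infection just as well.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """What the monitoring around one bot session reported (hypothetical fields)."""
    executable_downloaded: bool   # the bot is supposed to never allow this
    unexpected_outbound: bool     # e.g. traffic to a suspected C&C server
    behaviour_alert: bool         # behaviour-based malware detection fired


def suspect_browser_zero_day(obs: Observation) -> bool:
    """Flag a session whose compromise cannot be explained by a downloaded executable."""
    compromised = obs.unexpected_outbound or obs.behaviour_alert
    return compromised and not obs.executable_downloaded
```

A flagged session is only a lead: the memory dumps, logs, and browser debug traces recorded for it would still have to be examined to find the actual exploit.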

Malware – Attacking multiple browsers / operating system combinations with just one link

I am relatively new to this topic, so please pardon my obvious lack of understanding.

For example, suppose an attacker has a malicious website designed to place a Trojan on the device of any user who clicks on a specific link.

Assuming that the website can be accessed from different browsers and/or operating systems and that only one malicious link is to be included in the website, is the attacker limited to targeting a single combination of browser and operating system? Or could a single link carry several Trojans, each aimed at a different browser/operating system combination?

Computer Networks – How do web browsers discover the MAC address?

While reading my textbook, I learned that when you want to access a particular webpage, your web browser discovers the IP address of the web server hosting it and tries to connect to it. A copy of the webpage is then transferred to your computer so that you can view it.

However, I have also read that the IP address only allows data to be sent as far as the device's local area network. For the data to reach the device itself, the MAC address is required for the transfer from the local network to the device.

BUT nowhere in my textbook does it say how the web browser on the computer determines the MAC address of the web server. It only says that the computer determines the IP address of the web server by using the Domain Name System (DNS).

How does the computer determine the MAC address of the server?

Answers would be greatly appreciated.
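
For context on the question above, here is a minimal sketch of the usual IPv4 behaviour: the sending computer normally does not need the MAC address of a remote web server at all. Its network stack (not the browser) checks whether the destination IP is on its own subnet and uses ARP to resolve the MAC address of either the destination itself or, for remote destinations, the default gateway. The function name `arp_target` and the example addresses are illustrative assumptions.

```python
import ipaddress


def arp_target(src_ip: str, prefix_len: int, dst_ip: str, gateway_ip: str) -> str:
    """Return the IP whose MAC address the sender resolves via ARP.

    Same subnet: the destination itself. Different subnet: the default gateway,
    which forwards the packet towards the destination's own network.
    """
    local_network = ipaddress.ip_interface(f"{src_ip}/{prefix_len}").network
    if ipaddress.ip_address(dst_ip) in local_network:
        return dst_ip
    return gateway_ip
```

For a laptop at 192.168.1.10/24 with gateway 192.168.1.1, a web server at 203.0.113.5 is outside the subnet, so the frame is addressed to the gateway's MAC address; the server's own MAC address only matters to the last router on the server's local network.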

macos – Control browsers with headphones on the Mac

I have Beats Solo 3 headphones. When they are connected to my iPhone, I can pause, play, rewind, and fast-forward videos on YouTube in Safari. When they are connected to my 2016 MacBook running High Sierra 10.13.2, pressing the pause button opens iTunes, and interaction with websites in browsers is not possible. Can I control websites like Netflix and YouTube in Firefox, Safari or Chrome on my Mac with Beats headphones?