I have a customer website.
This is related to SP 2013 On-Premises.
We have a custom search page that uses a keyword query to filter the results.
The document in question, which cannot be found through this page, has at least two metadata fields assigned (both managed metadata) and can be found through regular search as well as through the Search REST API, as follows:
https://company.com/_api/search/query?Querytext='PRDBGerKlasse:"306"'&RowLimit='10'&SelectProperties='Title,Path,PRDBGerKlasse,PRDBGerType,PRDBGerArt'
The document has "306" for PRDBGerKlasse and "TypeA 300" for PRDBGerType.
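For reference, the REST query above can be built with proper URL encoding; this is a minimal Python sketch, using the server name and property names exactly as they appear in the question:

```python
from urllib.parse import quote

# Rebuild the Search REST call from the question with proper encoding.
# Property names (PRDBGerKlasse, PRDBGerType, PRDBGerArt) are taken
# verbatim from the question.
base = "https://company.com/_api/search/query"
querytext = "'PRDBGerKlasse:\"306\"'"
select = "'Title,Path,PRDBGerKlasse,PRDBGerType,PRDBGerArt'"

url = (
    f"{base}?Querytext={quote(querytext)}"
    f"&RowLimit='10'"
    f"&SelectProperties={quote(select)}"
)
print(url)
```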
When we call the page via
https://company.com/search/pages/prdb.aspx#k=PRDBGerTyp%3aTypeA%20300
the document is not in the search results.
If we call the page without filters, the document will be found.
What could cause such a problem?
Please note that we added the term "306" to the appropriate term set two weeks ago.
Could this be a problem with the crawler, even though the document can be found by regular search?
I have used an Amazon S3 pre-signed URL to share content.
Can Google crawl this URL? I share it with just one customer. What about other services? There are approaches for sharing content by creating a seemingly random URL (or even one derived from a hash), for example: www.somedomain.com/something/15b8b348ea1d895d753d1acb57683bd9
Will this URL be crawled by Google or other search engines?
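The "seemingly random URL" approach described above can be sketched like this; the domain, path, and function name are placeholders, not real endpoints:

```python
import secrets

# Sketch of generating an unguessable share URL. The base URL is a
# placeholder matching the example in the question, not a real endpoint.
def make_share_url(base="https://www.somedomain.com/something/"):
    # 16 random bytes -> 32 hex characters, the same length as an MD5 digest
    token = secrets.token_hex(16)
    return base + token

print(make_share_url())
```

Note that this only makes the URL unguessable; it does not prevent indexing if the URL ever appears in a publicly crawlable link.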
Thank you very much
Google stopped crawling images for our site when we moved images to the cdn subdomain.
The old Search Console reports that the image can be crawled; the new one indicates that it is blocked by robots.txt (for the same URL).
Our robots.txt only has the following:
Does anyone know why that is?
I have a website where I publish articles, which I started just 2 weeks ago. I'd like to keep the pages as clean as possible and load more content (links to other articles) via AJAX requests on user action (for now, clicks). I have read up on this a bit, but most of the articles and blog posts on the topic were outdated. I understand that Google used to support crawling AJAX content but no longer does. Some articles also recommend providing the content through pagination instead. I also read about sitemaps; I know that a sitemap gives search engine crawlers an indication of which pages to crawl.
However, will crawlers treat the site as inconsistent because these links are unreachable and can only be accessed by clicking the "Load more" button? Does a sitemap ensure that crawlers visit those URLs?
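For reference, a minimal sitemap in the standard sitemaps.org format looks like this (the URLs and dates are placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- example entries; the URLs are placeholders -->
  <url>
    <loc>https://example.com/articles/my-first-article</loc>
    <lastmod>2019-01-15</lastmod>
  </url>
  <url>
    <loc>https://example.com/articles/my-second-article</loc>
    <lastmod>2019-01-20</lastmod>
  </url>
</urlset>
```

A sitemap helps with discovery of URLs that are not linked in the rendered HTML, but it is only a hint; it does not guarantee that the URLs will be crawled or indexed.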
"Crawled – currently not indexed" is growing, and the number of validly indexed pages is decreasing! What could cause that?
These pages are slowly being deindexed; I lose a few hundred pages a week. They move from "Submitted and indexed" to "Crawled – currently not indexed".
Inspecting a "Crawled – currently not indexed" URL shows everything in green, with a referring page and the verdict "URL is indexable" ("The URL will only be indexed if certain conditions are met"). There is nothing wrong with the pages. I am not sure what to do to stop the deindexing. The site is ncservo.com, if that helps.
I have been fighting for some time against what looks like a bug in Google's canonical page algorithm. One piece of advice I received was to set the URL Parameters setting in GSC to "Every URL" for the "page" parameter our page-generation script uses, instead of "Let Googlebot decide." When I do this and click "Show example URLs", GSC displays the following recently crawled URLs:
I also attached a screenshot. None of these pages exist on our web server. As far as I can tell, our GSC account has not been hacked; at least I see no evidence that anyone except me has made indexing requests. Requesting any of these parameter values causes our site to return a hard 404. Why would Google crawl with random "page" parameter values? And a second question: could this affect Google's canonical page selection?
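For context, the "hard 404" behaviour described above can be sketched roughly like this; the function name and the page limit are hypothetical, not taken from the actual site:

```python
# Sketch of validating a "page" query parameter so that made-up values
# return a hard 404, as described above. The function name and max_page
# limit are hypothetical illustrations.
def page_param_status(raw_value, max_page=5000):
    """Return an HTTP status code for a requested page number."""
    if not raw_value.isdigit():
        return 404          # non-numeric junk
    page = int(raw_value)
    if page < 1 or page > max_page:
        return 404          # out-of-range values a crawler might invent
    return 200

print(page_param_status("3"))       # a real page
print(page_param_status("999999"))  # a random value -> hard 404
```

Returning a hard 404 (rather than a soft 404 or a redirect) is generally the clearest signal to a crawler that such URLs do not exist.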
About a month ago, I submitted my Shopify site to Google Search Console and set the sitemap.xml path, which points to products, blogs, and collections. I used Google's site: operator as follows: site:nodosperu.com, but I do not see any of my product pages there. So I checked the configuration again, and for some reason the products were not listed. That's why I manually added the products sitemap alongside the site's sitemap, since it was not being recognized.
Unfortunately, that did not solve the problem. I checked the robots.txt file, and it does not exclude any of the files.
Is there any other setting I need to change? Currently I can only see 4 URLs in Google Search.
Any help would be appreciated, thanks.
Why is my site not being crawled by Google?
I have a website with about 100,000 URLs. I want to disallow crawling of all URLs that contain an ID with this pattern:
but not those without the ID:
How can I do that in a robots.txt?
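The actual ID pattern is not shown above, so purely as an illustration, assuming the ID appears as an `id` query parameter, a robots.txt could look like this:

```
User-agent: *
# Block any URL containing an "id" query parameter (hypothetical pattern)
Disallow: /*?id=
Disallow: /*&id=
# URLs without the ID remain crawlable
```

Note that the `*` wildcard is supported by Googlebot and other major crawlers but is not part of the original robots.txt standard, so some crawlers may ignore it.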