How is it possible that Google indexed more URLs than a Sitemap contains?

Google has processed my Sitemaps. Webmaster Tools claims to have indexed 44,797 links from one of the files, even though it contains only 4,582 links.

Here is a screenshot:

I'm not worried about it, but it's a strange situation and I'm sure there is something to learn from it. What's happening?

UPDATE: This is not a duplicate of the question "Why is there a difference between URLs sent to a Sitemap and URLs in the Google index?" Here is the reason, as I explained in the comment below:

I understand that Google may index many pages that are not in my sitemap. Webmaster Tools states that there are many thousands of such pages. What is strange is that the table above shows how many links are indexed from one particular sitemap file, so it seems impossible for this number to exceed the number of links in the file. Unless, of course, I am missing something.

One theory: Could it be that many versions of the same pages – possibly with different parameters – have been indexed?
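One way to test that theory against your own data is to strip the parameters from each indexed URL and see how many collapse onto a URL that is listed in the sitemap. The sketch below is only an illustration: it assumes you can get a list of indexed URLs from somewhere (an export or your server logs), the sample URLs are made up, and nothing here claims that Webmaster Tools actually counts this way.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit


def canonical(url: str) -> str:
    """Drop the query string and fragment so parameter variants collapse."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def group_variants(sitemap_urls, indexed_urls):
    """Map each sitemap URL to the indexed URLs that differ only by parameters."""
    listed = {canonical(u) for u in sitemap_urls}
    groups = defaultdict(list)
    for url in indexed_urls:
        key = canonical(url)
        if key in listed:
            groups[key].append(url)
    return dict(groups)


# Hypothetical data: one sitemap entry and three indexed variants of it.
sitemap_urls = ["http://example.com/page"]
indexed_urls = [
    "http://example.com/page",
    "http://example.com/page?sort=asc",
    "http://example.com/page?sort=desc&page=2",
    "http://example.com/other",
]
variants = group_variants(sitemap_urls, indexed_urls)
total = sum(len(urls) for urls in variants.values())
print(total)  # 3
```

If the total for a real sitemap file comes out far above the number of entries in that file, parameter variants would be at least a plausible explanation for the inflated count.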

Google – Is a URL required when creating an image sitemap?

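The reason I'm asking is that I'm writing a script that scans a folder for pictures, so I do not necessarily know where they are used or on which exact page. Here is the example from Google:

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
            xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
      <url>
        <loc>http://example.com/sample.html</loc>
        <image:image>
          <image:loc>http://example.com/image.jpg</image:loc>
        </image:image>
        <image:image>
          <image:loc>http://example.com/photo.jpg</image:loc>
        </image:image>
      </url>
    </urlset>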

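Is it possible to leave out the page URL, as in the version below, and would Google and other search engines still read it correctly?

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
            xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
      <url>
        <image:image>
          <image:loc>http://example.com/image.jpg</image:loc>
        </image:image>
        <image:image>
          <image:loc>http://example.com/photo.jpg</image:loc>
        </image:image>
      </url>
    </urlset>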
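For what it's worth, the Sitemaps protocol lists `<loc>` as a required child of every `<url>` entry, so a folder-scanning script probably needs at least one page URL to hang the images on. A minimal sketch of such a script follows. The function name, the `page_url` and `image_base_url` parameters, and the list of file extensions are all assumptions made for illustration, not anything taken from Google's documentation.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"
# Assumed set of extensions to treat as pictures.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".webp"}


def build_image_sitemap(folder, page_url, image_base_url):
    """Attach every picture found in `folder` to a single page entry."""
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("image", IMAGE_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
    # The page the images are attributed to (supplied by the caller).
    ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
            loc = ET.SubElement(image, f"{{{IMAGE_NS}}}loc")
            loc.text = f"{image_base_url.rstrip('/')}/{path.name}"
    return ET.tostring(urlset, encoding="unicode")
```

Calling `build_image_sitemap("images", "http://example.com/sample.html", "http://example.com/")` would return the XML as a string. If the pictures really are not tied to a known page, passing a gallery or home page URL as `page_url` is one possible workaround, though whether search engines treat that well is not something this sketch can tell you.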

How do I submit a Sitemap to Google Webmaster Tools? | Forum Promotion

The first step is to generate a sitemap. Then add your site to Google Search Console and paste in the link to your sitemap.

A sitemap is indeed a great way to get your pages indexed faster and more completely. If you use WordPress, there are plugins that generate sitemaps for you. Apart from that, you can also use your RSS feed as a sitemap :)

Google Search Console – Sitemap could not be retrieved

I'm trying to submit my sitemap in Google Search Console, but every time I add it, I'm told that it could not be fetched. I've uploaded my sitemap.xml to my domain root, but Google still does not see it. How can I solve this problem? I've spent the last two days on it and I feel like I'm going in circles.

Below is my full URL:
https://winnerrslounge.com

My sitemap is in the public_html folder and is named sitemap.xml.
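Before resubmitting, it can help to check what the sitemap URL actually returns to an ordinary HTTP client. The sketch below fetches a sitemap and reports a few likely problems (wrong status code, body that is not well-formed XML, unexpected root element). The function names and the User-Agent string are invented for this example, and passing these checks does not guarantee that Google will be able to fetch the file.

```python
import xml.etree.ElementTree as ET
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def diagnose(status, body):
    """Return a list of likely problems for a fetched sitemap response."""
    if status != 200:
        return [f"HTTP status {status}, expected 200"]
    try:
        root = ET.fromstring(body)
    except ET.ParseError as exc:
        return [f"body is not well-formed XML: {exc}"]
    expected = {f"{{{SITEMAP_NS}}}urlset", f"{{{SITEMAP_NS}}}sitemapindex"}
    if root.tag not in expected:
        return [f"unexpected root element {root.tag!r}"]
    return []


def fetch_and_diagnose(sitemap_url):
    """Fetch the sitemap over HTTP and run the checks above."""
    request = Request(sitemap_url, headers={"User-Agent": "sitemap-check/0.1"})
    try:
        with urlopen(request, timeout=10) as response:
            return diagnose(response.status, response.read())
    except HTTPError as exc:
        # urlopen raises for 4xx/5xx responses; report the status code.
        return diagnose(exc.code, b"")
    except URLError as exc:
        return [f"could not connect: {exc.reason}"]
```

Assuming public_html is the web root, the file should be reachable at the domain plus `/sitemap.xml`, so `fetch_and_diagnose("https://winnerrslounge.com/sitemap.xml")` would be the call to try. An empty list means none of these particular checks failed.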

sitemap – For a 404 page, should the URL redirect to /404, or can the content change to a 404 while the URL stays the same?

I'm confused about how to handle a 404 page.

Suppose I have a landing page:

https://www.example.com/testingpage

The page does not exist anymore. What should I do with it?

Case 1: Should I redirect the URL to a separate 404 page?

https://www.example.com/404

Or

Case 2: Should I simply replace the content with a 404 page, so that the URL stays the same?

URL: https://www.example.com/testingpage
Content: 404

Which case should be used?
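To make the difference concrete, here is how the two cases look at the HTTP level, written as plain functions that return a status, headers and body. This is only an illustration: `LIVE_PATHS`, the function names and the response bodies are made up, and a real site would do this in its web server or framework. In Case 2 the 404 status is attached to the URL that was requested, which is what is usually meant by a proper "not found" response. In Case 1 the original URL reports a redirect, and only the separate /404 URL can report the 404.

```python
from http import HTTPStatus

# Hypothetical set of paths that still exist on the site.
LIVE_PATHS = {"/", "/about"}

NOT_FOUND_BODY = "<h1>404</h1><p>This page no longer exists.</p>"


def respond_in_place(path):
    """Case 2: keep the requested URL and answer with a 404 status."""
    if path in LIVE_PATHS:
        return HTTPStatus.OK, {}, f"<h1>{path}</h1>"
    # No Location header, so the browser stays on the URL it asked for.
    return HTTPStatus.NOT_FOUND, {}, NOT_FOUND_BODY


def respond_with_redirect(path):
    """Case 1: send the visitor to a separate /404 URL."""
    if path in LIVE_PATHS:
        return HTTPStatus.OK, {}, f"<h1>{path}</h1>"
    # The removed URL itself now reports a redirect, not a 404.
    return HTTPStatus.FOUND, {"Location": "/404"}, ""


status, headers, body = respond_in_place("/testingpage")
print(int(status), headers)
status, headers, body = respond_with_redirect("/testingpage")
print(int(status), headers)
```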

Which Vimeo URLs can I include in a video sitemap that are not redirected or forbidden?

I'm not sure which Vimeo URLs I can use for video sitemaps, because they all come back as either "redirect" or "forbidden", and neither would be acceptable in a Google Video XML Sitemap.

How to create video sitemaps / Which URLs do you use for Vimeo hosted videos?

In other words, all of my options fail:

  • http://vimeo.com/moogaloop.swf?clip_id=XXX … redirects to the URL below
  • https://vimeo.com/moogaloop.swf?clip_id=XXX … is not found
  • https://player.vimeo.com/video/XXX … is forbidden

seo – Are pages with irrelevant content in a sitemap a bad signal to search engines?

I have many WordPress sites with content, and each one automatically generates a sitemap.xml file. By default, the sitemap includes pages such as "categories" or "author" that are either empty or contain irrelevant content.

Is it harmful to allow these pages as part of the sitemap.xml file? Is this a bad signal for Google?

Removing them from the sitemap is possible, but it requires some work, and I want to know whether it is worth the effort.

This may look like a stupid question, but I have done some research, and it seems that submitting a sitemap does not mean that all of its pages get indexed. Google decides which ones are worth indexing and ignores the rest. What I could not find out is whether being ignored is harmless or a bad signal.
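For a sense of how much work the removal involves, here is a rough sketch of filtering archive pages out of a list of sitemap URLs. The path prefixes are an assumption based on common WordPress permalink settings and may differ per site, and the function names are invented for the example. It says nothing about whether leaving those pages in is actually harmful.

```python
from urllib.parse import urlsplit

# Assumed archive prefixes; adjust to the site's permalink settings.
EXCLUDED_PREFIXES = ("/category/", "/author/", "/tag/")


def keep_in_sitemap(url):
    """Return False for URLs whose path starts with an excluded prefix."""
    path = urlsplit(url).path
    return not path.startswith(EXCLUDED_PREFIXES)


def filter_sitemap_urls(urls):
    """Keep only the URLs that should stay in the sitemap."""
    return [url for url in urls if keep_in_sitemap(url)]


# Hypothetical URLs for illustration.
urls = [
    "https://example.com/my-first-post/",
    "https://example.com/category/news/",
    "https://example.com/author/mihai/",
    "https://example.com/about/",
]
print(filter_sitemap_urls(urls))
```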

Thank you, Mihai

seo – Is a separate sitemap per language OK? How can I tell Google about this?

You can have multiple sitemaps per site. This is an excellent example of when this makes sense.

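Make sure you have a sitemap index listing each of your sitemaps. It will probably look like this:

    <?xml version="1.0" encoding="UTF-8"?>
    <sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
        <sitemap>
            <loc>http://website.net/sitemap_fr.xml</loc>
            <lastmod>2004-10-01</lastmod>
        </sitemap>
        <sitemap>
            <loc>http://website.net/sitemap_de.xml</loc>
            <lastmod>2005-01-01</lastmod>
        </sitemap>
        <sitemap>
            <loc>http://website.net/sitemap_es.xml</loc>
            <lastmod>2005-01-01</lastmod>
        </sitemap>
    </sitemapindex>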

Remember to link this index in your robots.txt file, for example:

Sitemap: http://website.net/sitemapindex.xml
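If the list of per-language sitemaps changes often, the index could be generated instead of written by hand. A minimal sketch, assuming the three sitemap files and dates used in this answer; the function name is made up:

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(entries):
    """entries: iterable of (sitemap_url, lastmod) pairs, lastmod as YYYY-MM-DD."""
    ET.register_namespace("", SITEMAP_NS)
    index = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for url, lastmod in entries:
        sitemap = ET.SubElement(index, f"{{{SITEMAP_NS}}}sitemap")
        ET.SubElement(sitemap, f"{{{SITEMAP_NS}}}loc").text = url
        ET.SubElement(sitemap, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
    return ET.tostring(index, encoding="unicode")


index_xml = build_sitemap_index([
    ("http://website.net/sitemap_fr.xml", "2004-10-01"),
    ("http://website.net/sitemap_de.xml", "2005-01-01"),
    ("http://website.net/sitemap_es.xml", "2005-01-01"),
])
print(index_xml)
```

The returned string has no XML declaration, so add one when writing it out as sitemapindex.xml.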

It is also possible to list the alternate-language versions of each page within the sitemap itself. That setup is a bit more complicated, and by itself it does not answer the original question about the user-suggested settings.
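As a rough idea of what the alternate-pages option involves: each `<url>` entry carries one `xhtml:link` element with `rel="alternate"` and an `hreflang` value for every language version of the page, including the page itself. The sketch below generates such a sitemap. The function name and the page URLs are hypothetical, and the input format (one dict per page, mapping language code to URL) is just a convenient assumption.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_multilingual_sitemap(pages):
    """pages: list of dicts mapping a language code to that version's URL."""
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("xhtml", XHTML_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for translations in pages:
        for url in translations.values():
            entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
            ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
            # Each entry lists every language version, including itself.
            for lang, alt_url in translations.items():
                ET.SubElement(
                    entry,
                    f"{{{XHTML_NS}}}link",
                    {"rel": "alternate", "hreflang": lang, "href": alt_url},
                )
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical page available in French and German.
multilingual_xml = build_multilingual_sitemap([
    {"fr": "http://website.net/fr/page.html",
     "de": "http://website.net/de/page.html"},
])
print(multilingual_xml)
```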