The clever solution is to create two sitemaps: the first for your competitors, the second for your favorite search engines. In military parlance, this first sitemap is a feint.
The 'feint' contains your basic website structure: homepage, contact, about us, main categories. It looks like the real thing, works fine for the obscure search engines you do not care about, and gives your competitors nothing useful. Let it be found and indexed under the standard name, sitemap.xml.
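A minimal 'feint' might contain nothing more than the pages any visitor can already see (the URLs here are placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>http://www.example.com/about</loc></url>
  <url><loc>http://www.example.com/contact</loc></url>
</urlset>
```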
Now generate your real sitemap with code. Give it a name like "product-information-sitemap.xml": plausible enough to be a sitemap, but no easier to guess than your password.
Add a directive to your Apache configuration so that this second sitemap can be fetched by search engines, but not indexed itself:
Header set X-Robots-Tag "noindex"
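As written, that header would apply to every response. One way to scope it to the hidden sitemap alone is a `<Files>` block (the filename matches the example above; this assumes mod_headers is enabled):

```apache
<Files "product-information-sitemap.xml">
    Header set X-Robots-Tag "noindex"
</Files>
```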
Now write the code that keeps this sitemap up to date, and consider a third sitemap for your images. Strip the real sitemap down as needed to produce the 'feint'. Pay attention to the timestamps as well: Google reads them, and this matters if your sitemap is large.
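A sketch of that generation step, assuming your URLs come from a simple list (in practice, your database). It writes the full sitemap with lastmod timestamps under the hard-to-guess name, and the stripped-down feint under sitemap.xml; all names and URLs here are illustrative:

```python
from datetime import date
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(urls, lastmod=None):
    """Build a sitemap document; lastmod maps URL -> ISO date string."""
    lastmod = lastmod or {}
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        # Google reads lastmod, so keep it accurate, especially on large sitemaps.
        ET.SubElement(entry, "lastmod").text = lastmod.get(url, date.today().isoformat())
    return ('<?xml version="1.0" encoding="UTF-8"?>\n'
            + ET.tostring(urlset, encoding="unicode"))

# Illustrative URL lists; in practice these come from your product database.
feint_urls = ["http://www.example.com/",
              "http://www.example.com/about",
              "http://www.example.com/contact"]
product_urls = ["http://www.example.com/products/widget-1",
                "http://www.example.com/products/widget-2"]

# Real sitemap: full product detail, under the hard-to-guess name.
with open("product-information-sitemap.xml", "w") as f:
    f.write(build_sitemap(feint_urls + product_urls))

# Feint: the stripped-down structure competitors are welcome to find.
with open("sitemap.xml", "w") as f:
    f.write(build_sitemap(feint_urls))
```

An image sitemap can be generated the same way, from the same data.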
Now create a 'cron' job to periodically submit your sitemap to Google. Add a crontab entry to submit your real sitemap every week:
0 0 * * 0 wget -q -O /dev/null "http://www.google.com/webmasters/tools/ping?sitemap=http%3A%2F%2Fwww.example.com%2Fsitemaps%2Fproduct-information-sitemap.xml"
Note that the sitemap URL passed in the query string is URL-encoded.
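If you need to produce that encoded parameter yourself, Python's urllib.parse.quote with `safe=""` does it (the sitemap URL is the example one used above):

```python
from urllib.parse import quote

sitemap_url = "http://www.example.com/sitemaps/product-information-sitemap.xml"
# safe="" also encodes the slashes and colon, as in the crontab line above.
encoded = quote(sitemap_url, safe="")
print(encoded)
# → http%3A%2F%2Fwww.example.com%2Fsitemaps%2Fproduct-information-sitemap.xml
```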
You can also gzip your sitemap if size is a problem, although your web server should already serve it gzip-compressed if you have compression enabled.
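If you do pre-compress, Python's gzip module is enough; a small helper along these lines (the function name is illustrative) writes a .gz copy next to the sitemap, and you then submit the .gz URL instead:

```python
import gzip
from pathlib import Path

def gzip_sitemap(path):
    """Write a gzip-compressed copy of the sitemap next to the original."""
    data = Path(path).read_bytes()
    gz_path = path + ".gz"
    with gzip.open(gz_path, "wb") as dst:
        dst.write(data)
    return gz_path
```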
Your robots.txt need not be fancy; as long as it does not reveal your real sitemaps, you should be fine. There is really no need to serve different robots.txt files based on user-agent strings or other such complications. Just put your valuable content in a separate, unadvertised sitemap and submit it to Google with a cron job (instead of waiting for the bot). Easy.
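One way this might look: advertise only the feint in robots.txt, and never mention the real sitemap there at all, since even a Disallow line would reveal its path (the URL is a placeholder):

```text
User-agent: *
Disallow:

Sitemap: http://www.example.com/sitemap.xml
```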