I can't add an image here so I'll add it afterwards...
A question for those with more experience, perhaps...
In GSC, "Crawled - currently not indexed" is showing a new entry that I don't recognize.
If you don't recognize it, could the text have been scraped from another page while crawling? Or you could have something on your website that is connected to third-party/external sources.
Hey Richard,
The "feed" you're seeing is your RSS feed (see 2nds screen print below...).
Perhaps you are blocking that access,?
There is a fix... searching Google for help is your best bet... see 1st screen print below...
Hope you find this helpful.
Thank you Trish.
Actually, the feeds are a different challenge that I will explain in a moment.
My image shows the peculiar entry that I mention in my message and highlight in the image; it has nothing to do with feeds.
On the subject of feeds... their presence, or not, makes no difference to SEO, and I've been trying to get them removed from GSC. Google themselves say that if you add "Disallow: /feed/" to the robots.txt file, they'll stop trying to follow feeds and they should then stop showing up in GSC. Now, I know it can take some time for some types of Googlebot to get around to crawling, but after 6 months or so all my feeds are still showing in GSC! They're just a nuisance, but they add to the time it takes when I check for crawled-not-indexed URLs.
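For reference, the rule Google documents is the standard two-line form:

User-agent: *
Disallow: /feed/

As I understand it, though, Disallow only stops the crawling; it doesn't by itself remove URLs Google has already discovered, which may be part of why the feed entries linger in the GSC reports.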
;-)
Richard
I want to update my robots.txt to stop Google unnecessarily crawling /feed/ and to remove mentions of /feed/ from showing up in GSC.
I can't find the robots.txt using FileZilla.
it is normally a virtual file - some seo plugins will create one if you tell them to... and others have an editor...
all you need do is create a text-only file and place it in the root directory...
in the ftp you may be - as i was - surprised that all the docs are under the http directory rather than the https - they still do a redirect on a lot of sites and this can cause issues occasionally...
but if you see index.php, wp-content, wp-admin, and wp-includes you should be in the right directory.
getting it wrong can crash your site, just as a warning...
Thanks for that info, Phil.
The thing is, I use Rank Math, which (like other similar tools) provides a method for creating a robots.txt, and I have updated it.
This is what I've not figured out yet...
My Rank Math robots.txt is not working. Research implies that I need to delete the existing robots.txt from the root for the Rank Math one to take effect. But I can't find the old one to get rid of it. I can see what is in place when I render the robots.txt in a browser, but I have no idea where it is rendered from...
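From what I've read, when there is no physical file, WordPress generates robots.txt dynamically (plugins like Rank Math hook into that output), so there may be nothing on disk to delete. A quick way to see exactly what is actually being served (example.com being a placeholder for your own domain):

# Fetch the robots.txt the site serves; with a virtual file this still
# returns 200 and the generated rules, even though nothing exists on disk.
from urllib.request import urlopen

with urlopen("https://example.com/robots.txt") as resp:
    print(resp.status)
    print(resp.read().decode("utf-8"))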
;-)
Richard
i do wonder whether it is a script that WA have inserted, as all the siterubix sites are now noindex and can't have a robots.txt added...
it may be worth installing the robots.txt editor plugin, as this can append info to the current one or has an option to overwrite...
Thanks for that thought, Phil. I'll check it out. In the meantime I've contacted Rank Math support.
;-)
Richard
Hey Richard,
All you need is your FTP access to the site, where the robots.txt file would be found, if the site has one.
Hope you find this helpful.
Thank you Trish.
I use FileZilla for that and can’t find the robots.txt. It’s time to contact Site Support.
;-)
Richard
Thank you Trish.
My reply to Phil (feigner) probably describes my dilemma best.
;-)
Richard
First, download a copy of your existing robots.txt file from your website. You can do this in a few ways:
Navigate to your robots.txt file directly (e.g., example.com/robots.txt) and copy its contents into a new text file on your computer.
Use a tool like cURL to download an actual copy of your robots.txt file. For example:
curl https://example.com/robots.txt -o robots.txt
Alternatively, you can use the robots.txt report in Google Search Console to copy the content of your robots.txt file and paste it into a local file on your computer.
Edit Your Robots.txt File:
Open the downloaded robots.txt file in a text editor.
Make the necessary edits to the rules. In your case, you want to prevent Google from crawling /feed/. You can achieve this by adding the following lines to your robots.txt file:
User-agent: *
Disallow: /feed/
Ensure that you use the correct syntax and save the file with UTF-8 encoding.
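If you want to sanity-check the rules before uploading, Python's built-in urllib.robotparser can evaluate them locally; a minimal sketch (the URLs are just examples):

# Evaluate the new rules locally with Python's standard library.
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /feed/",
])
print(rp.can_fetch("Googlebot", "https://example.com/feed/"))     # False - blocked
print(rp.can_fetch("Googlebot", "https://example.com/my-post/"))  # True - allowed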
Upload the Updated Robots.txt File:
Upload your new robots.txt file to the root directory of your website. The process for uploading files depends on your hosting platform and server.
If you don’t have permission to upload files to the root directory, contact your domain manager or website administrator. They can make the necessary changes for you.
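If you prefer to script the upload rather than use an FTP client, here is a minimal sketch with Python's ftplib; the host, credentials, and remote directory are placeholders for whatever your hosting provider gives you:

# Hypothetical FTP-over-TLS upload of the edited robots.txt to the web root.
from ftplib import FTP_TLS

ftp = FTP_TLS("ftp.example.com")   # placeholder host
ftp.login("username", "password")  # placeholder credentials
ftp.prot_p()                       # encrypt the data channel
ftp.cwd("/")                       # web root; on shared hosting this may be /public_html
with open("robots.txt", "rb") as f:
    ftp.storbinary("STOR robots.txt", f)
ftp.quit()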
Refresh Google’s Robots.txt Cache:
Google’s crawlers automatically notice changes to your robots.txt file during their crawling process. The cached version is updated approximately every 24 hours.
If you need to update the cache faster, use the "Request a recrawl" function in the robots.txt report within Google Search Console.
Thank you for your comprehensive reply Oscar.
I know most of this already... but nonetheless your reply is appreciated.
However, it seems that I will need to get Site Support to handle this, since although I can copy and paste into a new text file, I cannot upload via FileZilla because I have no idea where to upload it within the folder structure.
;-)
Richard
Hi Richard, it is a long time since I have changed a robots.txt. From memory, this applies:
The "robots.txt" file should be placed in the root directory of your website, which is the same directory where you can find the "wp-admin", "wp-content", and "wp-includes" folders.
For example, if your WordPress site is installed in the root directory of your domain (e.g., "https://www.example.com/"), the "robots.txt" file should be placed directly in that root directory:
/
├── index.php
├── wp-admin/
├── wp-content/
├── wp-includes/
├── robots.txt
└── ...
One caveat if your WordPress site is installed in a subdirectory (e.g., "https://www.example.com/wordpress/"): crawlers only ever request robots.txt from the root of the domain (i.e., "https://www.example.com/robots.txt"), so the file still belongs in the top-level web root, not in the "/wordpress/" subdirectory.
By placing the "robots.txt" file in the web root of your site, search engine crawlers and other bots will be able to easily locate and access the file to determine which pages or sections of your site should be crawled or ignored.
Thank you, Catherine. Unfortunately there is no robots.txt in the root directory.
;-)
Richard
Have you tried any AI video generation yet?
If you have, do you have any recommendations for using AI to generate videos from posts?
Thank you
;-)
Richard
Great question.
synthesia.io, invideo.io and Canva are the ones I tried.
Thanks for sharing guys.
Mel B
The one I have that does blog posts to AI videos is from
pictory.ai
videotoblog.ai (reverse videos to blogs)
videogen.io
veed.io
Canva Pro does AI videos too
That's the one I know of; I tried it a few times, not bad! I like it better than the others that Abie mentioned.
I have these bookmarked in my resources:
capcut.com
invideo.io
synthesia.io
trymaverick.com
I would also reach out to Phil @Phil1944 for more info here :)