SEO Crawling & metadata extraction with R & RCrawler

This will be a long article, so I added a table of contents 👇 Fancy, right? Table of contents: Crawl an entire website with Rcrawler · The INDEX variable · HTML Files · So how to extract metadata while crawling? · Explore Crawled Data with rpivottable · Extract more data without having to...
Crawling with R’ using rvest package

If you want to crawl a couple of URLs for SEO purposes, there are many ways to do it, but one of the most reliable and versatile packages you can use is rvest. Here is a simple demo from the package documentation using the IMDb website: # Package installation,...
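To give an idea of what this looks like in practice, here is a minimal sketch (not the IMDb demo from the package documentation). It assumes the rvest package is installed and, so that it runs offline, parses a small hypothetical HTML page stored in a string instead of a live URL.

```r
library(rvest)

# Hypothetical page, inlined as a string so the example runs without network access
html <- '<html><head><title>Demo page</title>
<meta name="description" content="A short demo description"></head>
<body><h1>Main heading</h1><a href="/a">A</a><a href="/b">B</a></body></html>'

page <- read_html(html)

# Collect a few classic SEO fields using CSS selectors
seo <- list(
  title       = html_text(html_element(page, "title"), trim = TRUE),
  description = html_attr(html_element(page, "meta[name='description']"), "content"),
  h1          = html_text(html_elements(page, "h1"), trim = TRUE),
  links       = html_attr(html_elements(page, "a"), "href")
)

print(seo)
```

For a real page, passing the URL to `read_html()` instead of the string should be the only change needed.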
Export your data from R’

R’ and RStudio are great but sometimes it’s better to just export your data to exploit it elsewhere or just show it to other people. Here is a review of possible techniques: Export your data into a CSV, assuming your data is stored in a variable named df, fairly...
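As an illustration of the CSV case, here is a minimal sketch using base R only. The contents of `df` and the output path are made up for the example; only the variable name `df` comes from the text above.

```r
# Hypothetical data frame standing in for your own df
df <- data.frame(
  url    = c("https://www.example.com/", "https://www.example.com/old-page"),
  status = c(200L, 404L),
  stringsAsFactors = FALSE
)

# Write to a temporary location; replace with any path you like
out_file <- file.path(tempdir(), "export.csv")

# row.names = FALSE avoids an extra unnamed index column in the file
write.csv(df, out_file, row.names = FALSE)

# Read it back to confirm the export round-trips
check <- read.csv(out_file, stringsAsFactors = FALSE)
print(check)
```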
Perform automatic browser tests with Selenium & R

Selenium is a classic tool for QA and it can help perform automatic checks on a website. This is an intro to how to use it: the first step is, as always, to install and load the RSelenium package. # install to run once install.packages("RSelenium")...
Hunt down keyword cannibalization using R’

What the hell is keyword cannibalization? If you put a lot of articles out there, at some point, some articles will compete with one another for the same keywords in Google result pages. It’s what SEO people call ‘keyword cannibalization’. Does it...
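One simple way to spot candidates, sketched here in base R, is to count how many distinct pages show up for each query. The data frame below is entirely made up, and its layout (one row per query and page pair, as in a search analytics export) is an assumption, not necessarily the approach used in the article.

```r
# Hypothetical data: one row per query/page pair
gsc <- data.frame(
  query  = c("r crawler", "r crawler", "export csv r", "rvest tutorial", "rvest tutorial"),
  page   = c("/rcrawler", "/rvest", "/export", "/rvest", "/rvest"),
  clicks = c(10, 4, 7, 12, 3),
  stringsAsFactors = FALSE
)

# Count the distinct pages competing for each query
pages_per_query <- aggregate(
  page ~ query,
  data = gsc,
  FUN  = function(p) length(unique(p))
)
names(pages_per_query)[2] <- "n_pages"

# Queries served by more than one page are cannibalization candidates
cannibalized <- pages_per_query[pages_per_query$n_pages > 1, ]
print(cannibalized)
```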
Use R’ to crawl XML sitemaps

An XML sitemap is a fantastic tool, but you have to do it properly, otherwise it can definitely backfire. I can’t count the number of times, while doing SEO audits, I discovered completely abandoned XML sitemaps asking Googlebot to index empty or 404 pages. This...
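As a starting point, here is a minimal sketch of pulling the URLs out of a sitemap, assuming the xml2 package is installed. The sitemap is a small hypothetical one stored in a string so the example runs offline; checking each URL's HTTP status would be a separate step and is not shown.

```r
library(xml2)

# Hypothetical sitemap, inlined so the example runs without network access
sitemap_xml <- '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc><lastmod>2020-01-01</lastmod></url>
  <url><loc>https://www.example.com/old-page</loc></url>
</urlset>'

doc <- read_xml(sitemap_xml)

# Sitemaps declare a default namespace; stripping it keeps the XPath simple
xml_ns_strip(doc)

# Extract every <loc> value as a character vector
urls <- xml_text(xml_find_all(doc, "//url/loc"))
print(urls)
```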