Hey everyone! I am delighted that my article about scraping Google with Elixir and Crawly has gathered 200 claps! As promised, we will take extra steps to follow Google’s pagination and get more meaningful data from the results. So let’s do it in practice!

Setting the task

Many online businesses rely on search engines as their primary revenue stream, especially eCommerce websites, tools, and service providers. It’s known that 80% of the clicks go to the website that holds the first position in the search results.

Usually, digital companies start with paid search advertising to…


Image by Pete Linforth from Pixabay

In one of my previous articles, we discussed why you might want to scrape data from the Internet. We showed how to extract data from multiple websites to build a price monitoring solution for a real estate agency.

Here we want to show how to create a web scraper even if you don’t know how to program!

Data from the real estate market

As in the previous example, we are interested in data from one of the Swedish real estate websites. For this example, we will use the Hemnet website, and we want to get URLs, prices, addresses, and images. …


Extracting data from newspapers? RSS is your best friend. (Image by MichaelGaida from Pixabay)

In one of my previous articles, we discussed why one might want to extract data from the Internet. In another article, we expanded the case and demonstrated how to use scraping as part of a price monitoring solution. Now, let's dig a bit into journalism. It might sound unexpected, but journalists rely on web scraping quite a bit these days. The following articles serve as examples:

From our side, we can think of the following use case: Let’s imagine you’re running a local news website, and you’re…


In one of my previous articles, we discussed why one might want to extract data from the Internet. I have also written a short article that demonstrates how to extract data for a real estate agency. Now we will take a step in a slightly different direction: our new target is Google.

Imagine we’re a marketing agency working for a company and promoting it in Google search results. In this situation, it’s important to track the results of marketing activities to understand how each of them influences the client's position in search results.

Note: According to the robots.txt…


Monitoring real estate data with Elixir. (Picture was taken from https://pixabay.com/)

In one of my previous articles, we discussed why one might want to extract data from the Internet. Now it’s time to be more specific and showcase one of the possible use cases: “The real estate market.”

A bit about the real estate market

The real estate market is highly competitive, with many companies trying to approach the same client. As the number of properties advertised online keeps growing, publicly available data is becoming increasingly important. …


A scraping spider is about to extract data from the web (Image from Pixabay)

This article will describe what web scraping is, how it differs from web crawling, and finally, who uses it and why.

Starting from definitions

For not to have one meaning is to have no meaning, and if words have no meaning our reasoning with one another, and indeed with ourselves, has been annihilated— Aristotle

Web Scraping is the process of extracting structured data from a given web page (or a group of pages).

Crawling is the process of visiting (literally crawling) web pages from which data may later be extracted during the Web Scraping process.

These two processes normally complement each other, though…


Entering a private area might be a challenge. Well, not for us.

In our previous articles, we have covered the topic of web scraping with Elixir, and have shown how Crawly can simplify this work if you need to render dynamic content.

I will demonstrate how Crawly can extract data that is behind a login (this is not a trivial problem to solve without the right tools).

This might be required in several cases: for example, when you want to extract profile data from a social network, or when you want to extract and analyze special offers that are only available to members of a specific website.

In this article…


Image by Gerd Altmann from Pixabay

Introduction

In one of our previous articles, we discussed how to perform web scraping with the help of Elixir. In this article, I will take another step into the interesting world of data scraping and investigate how to handle the interactivity of the modern web.

As the number of websites using JavaScript for content rendering grows, so does the demand for extracting data from them. Interactivity adds complexity to the data extraction process, as it’s not possible to get the full content with a regular request made from a command-line HTTP client.

One of the ways…

Oleg Tarasenko

Software developer at Erlang Solutions.
