YouTube Video Scraping Python With Beautiful Soup

In this post, we will see how to scrape a website for video links with Beautiful Soup and Python. We will scrape YouTube because it is easy and will help build your confidence. Note that we only process the first page of search results. Let us go ahead with our first YouTube video scraping tutorial using Python and Beautiful Soup.



This article is for Educational Purposes only. Please check the laws for web scraping for your country and the website you are scraping. We are not responsible for companies suing you or law enforcement, intelligence, or secret services knocking at your door.



You can find the source code for the Python Script here.

Let’s Dive In

In this example, we will scrape YouTube based on a search term that we provide. You need to know basic HTML tags. We will use Beautiful Soup, a Python library for extracting the data we want from HTML and XML files or sources. As this is our first YouTube web scraping example, we decided to choose an easy one.

We need to import two Python libraries in our code.

import requests
from bs4 import BeautifulSoup

If you haven’t installed these libraries you can find steps on how to do that here.

Step 1 – Open YouTube in Your Browser

We open YouTube in our browser. We prefer Firefox as it’s easier for what we do in Step 3, but you can also use Chrome.

The Python code to open the page and get its HTML is

sb_get = requests.get("")

To see the HTML, we need to use


In this case, YouTube doesn’t know what browser the request is coming from, so we may be blocked. Hence, we define a variable with the details of Firefox and pass it in the above command (the general term for these details is headers). You can choose the header for any browser you want.

mozhdr = {'User-Agent': 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv: Gecko/2008092417 Firefox/3.0.3'}

requests.get("", headers = mozhdr)
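As a minimal sketch of the header idea above, the block below builds the request without sending it, so you can check offline that the User-Agent is attached. The URL and the User-Agent string here are illustrative assumptions, not values taken from this post, and fetch_html is a hypothetical helper name.

```python
import requests

# Assumption: illustrative base URL; the URL used in the post is not shown.
ASSUMED_URL = "https://www.youtube.com"

# Assumption: an example Firefox User-Agent string; any browser's string would do.
mozhdr = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) "
                  "Gecko/20100101 Firefox/115.0"
}

# Build the request without sending it, so the headers can be inspected offline.
prepared = requests.Request("GET", ASSUMED_URL, headers=mozhdr).prepare()
print(prepared.headers["User-Agent"])


def fetch_html(url, headers, timeout=10):
    """Send the GET request and return the page HTML as text (hypothetical helper)."""
    response = requests.get(url, headers=headers, timeout=timeout)
    response.raise_for_status()  # raises requests.HTTPError on a 4xx/5xx status
    return response.text
```

Calling fetch_html(ASSUMED_URL, mozhdr) would perform the actual network request. The timeout is there so that a slow server does not hang the script.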

Step 2 – Enter Search Term

As we are Game of Thrones fans, we enter the search term Game of Thrones and then click the search icon. A new page opens with a list of results. The important thing to note here is how the URL changes.


Let us now search for Breaking Bad (another one of our favorite TV shows). The URL changes again.

So, we know that the search term is appended to the search URL and that each space is replaced by a +, as URLs cannot contain spaces.

Equipped with all this knowledge, we can define three variables in our Python code.

search_hardcode = "game+of+thrones"

We can now combine all the above terms, and we will get our search url.

sb_url = scrape_url + search_url + search_hardcode
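Instead of typing the + signs by hand, the search term can be encoded with the standard library. This is only a sketch: the values of scrape_url and search_url below are assumptions for illustration (the post's own values are not shown), and build_search_url is a hypothetical helper.

```python
from urllib.parse import quote_plus

# Assumptions: illustrative values, not taken from the post.
scrape_url = "https://www.youtube.com"
search_url = "/results?search_query="


def build_search_url(search_term):
    """Encode the search term (spaces become +) and append it to the search URL."""
    return scrape_url + search_url + quote_plus(search_term)


sb_url = build_search_url("game of thrones")
print(sb_url)
```

quote_plus also escapes characters such as & and ?, which would otherwise break the query string.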

Now, if we want the HTML for the search URL page, we use the command below.

sb_get = requests.get(sb_url, headers = mozhdr)

sb_get now holds the response (status code 200 if all is good), which contains the HTML of the page.


The HTML source has a lot of content, but what we need is the link to the video. This is where HTML basics come into the picture. All links on a page are enclosed in an <a> tag. We suggest you learn more about the <a> tag from the link below.

HTML a tag Explained

We go into Firefox while we are on the search results page, and enable Inspector.

Tools – Web Developer – Inspector


Now, when we hover the mouse cursor over the link to a video, we get the details below.

[Screenshot: Video Scraping 004]

Notice that the <a> tag has all the details we need:

  • the link to the video (in href)
  • the Title of the video (in title)

Step 4 – Filter out <a> Tags with Videos

Our page has a lot of <a> tags, but we only need the ones that hold our video content. So, we need a way to identify these <a> tags and filter them out from the rest. We look for something common to all of them. This is easy for YouTube, but for some websites you need to filter on the <div> tag within which the <a> tag is enclosed. Again, this varies from site to site.

When we move our mouse over the video links, we get the details of the <a> tag. All the video links we need share the <a> tag details marked in green below.

[Screenshot: Video Scraping 005]

The details marked with the green rectangle are actually the class of the <a> tag.

[Screenshot: Video Scraping 007]

This is what we are going to use to pull out all the <a> tags we need from the HTML source.

Step 5 – Beautiful Soup and find_all

To use Beautiful Soup functions, we first need to ensure that the HTML we have is in a format recognized by Beautiful Soup. The command below takes care of it.

soupeddata = BeautifulSoup(sb_get.content, "html.parser")

The variable soupeddata has HTML content in a format which is recognized by Beautiful Soup.

We now need to find all <a> tags with a specific class, as those are the <a> tags of interest to us.

yt_links = soupeddata.find_all("a", class_ = "yt-uix-tile-link")

We use the find_all function to get all <a> tags whose class is yt-uix-tile-link. These <a> tags are stored in a variable called yt_links, which behaves like a Python list.
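To see the class filter in isolation, here is a small sketch that runs find_all on a hand-written HTML snippet instead of a live page. The snippet is hypothetical and much simpler than a real results page. It assumes the video links carry the yt-uix-tile-link class described above.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified HTML standing in for a search results page.
sample_html = """
<div>
  <a class="yt-uix-tile-link spf-link" href="/watch?v=aaa" title="First video">First video</a>
  <a class="nav-link" href="/about">About</a>
  <a class="yt-uix-tile-link" href="/watch?v=bbb" title="Second video">Second video</a>
</div>
"""

soupeddata = BeautifulSoup(sample_html, "html.parser")

# class_ matches a tag if the given name is any one of the tag's classes.
yt_links = soupeddata.find_all("a", class_="yt-uix-tile-link")

for link in yt_links:
    print(link.get("href"), link.get("title"))
```

The first link has two classes and still matches, while the nav-link tag is skipped.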

We now have a Python list of <a> tags which has all the information we need. We still have some information to filter out as we only need the URL and title. So, we need to get href and title attributes of <a> tag. Since yt_links is a list, we use a Python for loop to process the list and grab the href and title attributes.

for x in yt_links:
    yt_href = x.get("href")
    yt_title = x.get("title")

Let’s pick one of the <a> tags we filtered out.

<a href="/watch?v=pE2wcBeyNdk" class="yt-uix-tile-link yt-ui-ellipsis yt-ui-ellipsis-2 yt-uix-sessionlink spf-link " data-sessionlink="itct=CIwBENwwGAEiEwiIhOnNt8rVAhXZxFUKHe9uDzMo9CRSD2dhbWUgb2YgdGhyb25lcw" title="Game of Thrones: The Loot Train Attack (HBO)" aria-describedby="description-id-568845" rel="spf-prefetch" dir="ltr">Game of Thrones: The Loot Train Attack (HBO)</a>

In this case

yt_href will be /watch?v=pE2wcBeyNdk


yt_title will be Game of Thrones: The Loot Train Attack (HBO)
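The same values can be checked in code by parsing a shortened copy of the tag above. This sketch keeps only the attributes we care about. The data-missing attribute is a made-up name, used only to show what get returns for an attribute that is absent.

```python
from bs4 import BeautifulSoup

# Shortened copy of the sample <a> tag, keeping only the attributes we read.
tag_html = (
    '<a href="/watch?v=pE2wcBeyNdk" class="yt-uix-tile-link spf-link" '
    'title="Game of Thrones: The Loot Train Attack (HBO)" dir="ltr">'
    "Game of Thrones: The Loot Train Attack (HBO)</a>"
)

x = BeautifulSoup(tag_html, "html.parser").find("a")

yt_href = x.get("href")
yt_title = x.get("title")
missing = x.get("data-missing")  # get returns None when the attribute is absent

print(yt_href)
print(yt_title)
print(missing)
```

Because get returns None instead of raising an error, it is worth checking for None before building a URL from the result.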

Our video URL is still not complete, but all we need to do is add the base URL before it, which we have in the variable scrape_url.

yt_final = scrape_url + yt_href

This gives us the complete link in the variable yt_final.
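Putting the steps together, the sketch below collects every title and full link into a list instead of overwriting the variables on each pass of the loop. It is one possible arrangement rather than the script from this post: the value of scrape_url is an assumption, extract_videos is a hypothetical helper, and urljoin is used in place of plain string concatenation.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Assumption: illustrative base URL; the post's own value is not shown.
scrape_url = "https://www.youtube.com"


def extract_videos(html, base_url=scrape_url):
    """Return a list of (title, full_url) tuples for the matching <a> tags."""
    soup = BeautifulSoup(html, "html.parser")
    videos = []
    for link in soup.find_all("a", class_="yt-uix-tile-link"):
        href = link.get("href")
        if not href:
            continue  # skip tags without a link target
        videos.append((link.get("title"), urljoin(base_url, href)))
    return videos


# Hypothetical one-link page, based on the sample tag shown earlier.
sample_html = (
    '<a class="yt-uix-tile-link" href="/watch?v=pE2wcBeyNdk" '
    'title="Game of Thrones: The Loot Train Attack (HBO)">link text</a>'
)

videos = extract_videos(sample_html)
for yt_title, yt_final in videos:
    print(yt_title, yt_final)
```

urljoin handles the slash between the base URL and the path, so it works whether or not the href starts with one.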

Step 7 – Done

That’s it, peeps. We now have the YouTube link and the YouTube title of the video. If you execute the code in IDLE and print the variables yt_final and yt_title, you should get output similar to the one below.

[Screenshot: Video Scraping 009]

