Project 01

Unleashing YouTube's Data: Harvesting and Warehousing with SQL, MongoDB, and Streamlit

Project Description:

Build a Streamlit application for exploring and analyzing YouTube channel data. The application harvests data from multiple channels through the YouTube Data API, stores it in MongoDB, and makes it queryable from a SQL database.

Domain:

Social Media

Technologies:

  • Python
  • SQL
  • MongoDB
  • Streamlit
  • YouTube API
  • Jupyter Notebook

Problem Statement:

Create a Streamlit application that lets users access and analyze data from multiple YouTube channels. The application should have the following features:

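1. Ability to input a YouTube channel ID and retrieve all the relevant data (channel name, subscriber count, total video count, playlist ID, and, for each video, the video ID, likes, dislikes, and comments) using the YouTube Data API. Dislike counts have been private since December 2021 and are returned only to the video's owner, so this field is usually unavailable.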

2. Option to store the data in a MongoDB database as a data lake.

3. Ability to collect data for up to 10 different YouTube channels and store it in the data lake at the click of a button.

4. Option to select a channel name and migrate its data from the data lake to a SQL database as tables.

5. Ability to search and retrieve data from the SQL database using different search options, including joining tables to get channel details.
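
Features 1 and 2 come down to reshaping an API response into a document that can be stored in the data lake. The sketch below shows one way this might look for the channel-level fields. The function name `channel_document`, the output field names, and the sample response are assumptions made for illustration; the sample is hand-written in the shape of a `channels.list` response requested with the `snippet`, `statistics`, and `contentDetails` parts, and is not real data.

```python
from typing import Any, Dict, Optional


def channel_document(response: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Flatten a channels.list response into one document for the data lake."""
    items = response.get("items") or []
    if not items:
        return None  # no channel came back, e.g. for an unknown channel ID

    item = items[0]
    stats = item.get("statistics", {})
    related = item.get("contentDetails", {}).get("relatedPlaylists", {})

    return {
        "channel_id": item["id"],
        "channel_name": item["snippet"]["title"],
        # The API reports counts as strings, so convert them to integers.
        "subscribers": int(stats.get("subscriberCount", 0)),
        "video_count": int(stats.get("videoCount", 0)),
        # The "uploads" playlist lists every video the channel has published.
        "playlist_id": related.get("uploads"),
    }


# Hand-written sample in the shape of a channels.list response (not real data).
sample_response = {
    "items": [
        {
            "id": "UC_example",
            "snippet": {"title": "Example Channel"},
            "statistics": {"subscriberCount": "1200", "videoCount": "34"},
            "contentDetails": {"relatedPlaylists": {"uploads": "UU_example"}},
        }
    ]
}

doc = channel_document(sample_response)
print(doc)
```

With the google-api-python-client package, a response of this shape would typically come from `build("youtube", "v3", developerKey=API_KEY).channels().list(part="snippet,statistics,contentDetails", id=channel_id).execute()`, where `API_KEY` and `channel_id` are placeholders. The resulting document could then be written to MongoDB with pymongo's `insert_one`. Both steps are left out so that the sketch runs without credentials or a database server.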

Results:

The result is a user-friendly Streamlit application that uses the YouTube Data API to extract information about a YouTube channel, stores it in a MongoDB database, migrates it to a SQL data warehouse, and lets users search channel details and view joined tables in the app.
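
The migration and search steps can be sketched as follows, with Python's built-in `sqlite3` module standing in for the SQL warehouse so that the example runs without a server. The two-table schema, the function names `migrate_channel` and `videos_with_channel`, and the sample document are all assumptions for illustration; the brief does not prescribe a schema or a particular SQL engine.

```python
import sqlite3

# Hypothetical document, shaped as it might be read back from the data lake.
channel_doc = {
    "channel_id": "UC_example",
    "channel_name": "Example Channel",
    "subscribers": 1200,
    "videos": [
        {"video_id": "vid_001", "title": "First video", "likes": 10, "comments": 2},
        {"video_id": "vid_002", "title": "Second video", "likes": 25, "comments": 5},
    ],
}

# Assumed schema: one row per channel, one row per video.
SCHEMA = """
CREATE TABLE IF NOT EXISTS channel (
    channel_id   TEXT PRIMARY KEY,
    channel_name TEXT NOT NULL,
    subscribers  INTEGER
);
CREATE TABLE IF NOT EXISTS video (
    video_id   TEXT PRIMARY KEY,
    channel_id TEXT NOT NULL REFERENCES channel(channel_id),
    title      TEXT,
    likes      INTEGER,
    comments   INTEGER
);
"""


def migrate_channel(conn, doc):
    """Copy one channel document into the channel and video tables."""
    conn.executescript(SCHEMA)
    with conn:  # commits on success, rolls back if an insert fails
        conn.execute(
            "INSERT OR REPLACE INTO channel VALUES (?, ?, ?)",
            (doc["channel_id"], doc["channel_name"], doc["subscribers"]),
        )
        conn.executemany(
            "INSERT OR REPLACE INTO video VALUES (?, ?, ?, ?, ?)",
            [
                (v["video_id"], doc["channel_id"], v["title"], v["likes"], v["comments"])
                for v in doc["videos"]
            ],
        )


def videos_with_channel(conn, min_likes=0):
    """Join video rows to their channel and filter by a minimum like count."""
    query = """
        SELECT c.channel_name, v.title, v.likes
        FROM video AS v
        JOIN channel AS c ON c.channel_id = v.channel_id
        WHERE v.likes >= ?
        ORDER BY v.likes DESC
    """
    return conn.execute(query, (min_likes,)).fetchall()


conn = sqlite3.connect(":memory:")
migrate_channel(conn, channel_doc)
rows = videos_with_channel(conn, min_likes=20)
print(rows)
```

Using `INSERT OR REPLACE` keeps the migration repeatable: migrating the same channel twice updates its rows instead of failing on the primary key. In the Streamlit app, the rows returned by the join could be passed to `st.dataframe` for display.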