Earthquake-Data-Engineering-Pipeline-on-Azure - Analyze Earthquake Data Easily

Overview
The Earthquake Data Engineering Pipeline on Azure provides a full solution for analyzing real-time earthquake data. This app connects to the USGS API, processes the data, and stores it using a Bronze→Silver→Gold lakehouse architecture. You can run it manually or set it up for daily automated updates.
Features
- Real-Time Data Ingestion: Get live earthquake data from the USGS API.
- Lakehouse Architecture: Efficiently manage and analyze data at three stages: Bronze, Silver, and Gold.
- Azure Services: Uses Azure Data Factory for orchestration, Databricks for transformation, ADLS Gen2 for storage, and Synapse Analytics for querying.
- Manual and Automated Workflows: Choose between manual runs and scheduled daily executions.
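As a rough illustration of the ingestion step, the sketch below builds a request URL for the public USGS FDSN event service and flattens one raw GeoJSON feature into a tabular record. The function names and the output field names are hypothetical and are not taken from this project; only the endpoint and the GeoJSON layout come from USGS.

```python
from datetime import date, timedelta
from urllib.parse import urlencode

# Public USGS FDSN event endpoint; GeoJSON output is selected via `format`.
USGS_QUERY_URL = "https://earthquake.usgs.gov/fdsnws/event/1/query"


def build_query_url(day: date) -> str:
    """Build a URL requesting all events whose origin time falls on `day` (UTC)."""
    params = {
        "format": "geojson",
        "starttime": day.isoformat(),
        "endtime": (day + timedelta(days=1)).isoformat(),
    }
    return f"{USGS_QUERY_URL}?{urlencode(params)}"


def flatten_feature(feature: dict) -> dict:
    """Turn one raw GeoJSON feature (Bronze) into a flat record (Silver).

    Output field names are illustrative assumptions.
    """
    props = feature["properties"]
    # USGS stores coordinates as [longitude, latitude, depth in km].
    lon, lat, depth_km = feature["geometry"]["coordinates"]
    return {
        "id": feature["id"],
        "time_ms": props["time"],  # epoch milliseconds, as delivered by USGS
        "magnitude": props.get("mag"),
        "place": props.get("place"),
        "longitude": lon,
        "latitude": lat,
        "depth_km": depth_km,
    }
```

The raw response would typically be stored unchanged in the Bronze layer, with a function like `flatten_feature` applied to each entry of its `features` array to produce Silver records.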
Getting Started
Follow these steps to set up your Earthquake Data Pipeline:
System Requirements
- Windows, macOS, or Linux.
- An internet connection.
- An Azure account for services setup.
- Basic knowledge of using a web browser.
Download & Install
To download the application, please visit the link below:
Download Earthquake Data Engineering Pipeline
- Click the link above to go to the Releases page.
- Find the latest version listed at the top.
- Download the ZIP file or other available packages for your operating system.
- Extract the files to a location of your choice.
Setting Up
- Open the extracted folder.
- Follow the included instructions for setting up an Azure environment.
- Make sure you configure any necessary Azure services before proceeding.
Configuration
Azure Setup
- Azure Data Factory: Set up a Data Factory instance to orchestrate the data pipeline.
- Databricks: Create a Databricks workspace for data analysis and transformation.
- ADLS Gen2: Configure Azure Data Lake Storage Gen2 to store your raw and processed data.
- Synapse Analytics: Set up a Synapse instance to facilitate data querying and visualization.
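Once the storage account exists, each lakehouse layer needs a predictable location. The sketch below shows one possible layout, assuming one ADLS Gen2 container per layer and date-partitioned folders. The container names, the `earthquakes` folder, and the helper name are assumptions for illustration; only the `abfss://` URI scheme is standard.

```python
from datetime import date

# Hypothetical container names, one per lakehouse layer.
LAYERS = ("bronze", "silver", "gold")


def layer_path(account: str, layer: str, day: date) -> str:
    """Return an abfss:// URI for one layer and one ingestion date."""
    if layer not in LAYERS:
        raise ValueError(f"unknown layer: {layer!r}")
    # Standard ADLS Gen2 form: abfss://<container>@<account>.dfs.core.windows.net/<path>
    return (
        f"abfss://{layer}@{account}.dfs.core.windows.net/"
        f"earthquakes/{day:%Y/%m/%d}/"
    )
```

For example, `layer_path("mystorage", "bronze", date(2024, 1, 5))` yields a folder under the `bronze` container for 5 January 2024. Partitioning by date keeps daily runs from overwriting each other.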
API Key
The public USGS earthquake API does not normally require an API key. If your setup needs one:
- Visit the USGS API website.
- Register for an API key if required.
- Save your API key in a secure location for later use.
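If a key is used, one common way to keep it out of source control is to read it from an environment variable at run time. This is a minimal sketch; the variable name `USGS_API_KEY` and the helper name are assumptions, not settings defined by this project.

```python
import os
from typing import Mapping, Optional


def load_api_key(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the key from the environment, or None if it is unset or blank.

    `USGS_API_KEY` is a hypothetical variable name used for illustration.
    """
    key = env.get("USGS_API_KEY", "").strip()
    return key or None
```

Returning `None` rather than raising lets the caller proceed without a key when the endpoint does not need one.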
How to Use
Running the Pipeline
- Open your command line or terminal.
- Navigate to the directory where you extracted the files.
- Use the following command to run the application:
- Monitor the output for errors. If a run fails, check your Azure configuration first.
Scheduling Jobs
To set up daily-triggered workflows:
- Use Azure Data Factory's scheduling features.
- Configure triggers based on your requirements for data ingestion.
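To make the daily schedule concrete, the sketch below builds a definition in the JSON shape that Azure Data Factory uses for schedule triggers. The trigger name, pipeline name, start time, and run hour are placeholders chosen for illustration; substitute the values from your own Data Factory.

```python
import json


def daily_trigger(name: str, pipeline_name: str, start_time_utc: str,
                  hour: int = 6, minute: int = 0) -> dict:
    """Build a once-per-day schedule trigger definition as a plain dict."""
    return {
        "name": name,
        "properties": {
            "type": "ScheduleTrigger",
            "typeProperties": {
                "recurrence": {
                    "frequency": "Day",
                    "interval": 1,  # every 1 day
                    "startTime": start_time_utc,
                    "timeZone": "UTC",
                    "schedule": {"hours": [hour], "minutes": [minute]},
                }
            },
            "pipelines": [
                {
                    "pipelineReference": {
                        "type": "PipelineReference",
                        "referenceName": pipeline_name,
                    }
                }
            ],
        },
    }


if __name__ == "__main__":
    # Both names below are hypothetical.
    trigger = daily_trigger("DailyEarthquakeTrigger", "EarthquakePipeline",
                            "2024-01-01T00:00:00Z")
    print(json.dumps(trigger, indent=2))
```

The printed JSON can be compared against a trigger created in the Data Factory UI to confirm the fields match before deploying it.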
Monitoring and Analysis
You can visualize the processed data using Azure Synapse Analytics or Power BI. Set up dashboards to analyze trends and gain insights into earthquake activities.
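Dashboards usually read from a small aggregated table rather than from individual events. The sketch below shows one way such a Gold-style summary could be computed, assuming flat event records with hypothetical `time_ms` (epoch milliseconds) and `magnitude` fields. It is written in plain Python for clarity; in the pipeline itself the equivalent aggregation would more likely run in Databricks or Synapse.

```python
from collections import defaultdict
from datetime import datetime, timezone


def daily_summary(records: list) -> list:
    """Aggregate flat event records into per-day (UTC) counts and max magnitude."""
    buckets = defaultdict(list)
    for rec in records:
        if rec.get("magnitude") is None:
            continue  # skip events without a reported magnitude
        day = datetime.fromtimestamp(
            rec["time_ms"] / 1000, tz=timezone.utc
        ).date().isoformat()
        buckets[day].append(rec["magnitude"])
    # One row per day, ordered by date.
    return [
        {"date": day, "event_count": len(mags), "max_magnitude": max(mags)}
        for day, mags in sorted(buckets.items())
    ]
```

Each output row maps directly onto a chart point, for example event count per day or strongest event per day.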
Documentation
A detailed guide on all pipeline configurations and parameters can be found in the documentation folder included in the download.
Support and Community
If you need help using the Earthquake Data Engineering Pipeline, you can:
- Open an issue on the GitHub page.
- Check the FAQ in the documentation.
- Join our community on platforms like Discord or Slack for real-time help.
Additional Resources
License
This project is licensed under the MIT License. Please see the LICENSE file for more details.
Acknowledgements
Thanks to the Azure community for providing amazing tools and support. Thanks also to the USGS for the valuable earthquake data.
Get started today by visiting the Download page and launching your analysis of earthquake data with ease!