liapsps/data-science-python-libraries-guide

Data Science Libraries in Python

This repository is written in English to reach a wider audience.

Welcome to the Data Science Libraries in Python repository! This project is a didactic resource for exploring and understanding the essential Python libraries used in Data Science. Here you'll find practical information to help you master these tools.

Objectives

  • Learn the purpose of each library: Understand what each library is used for and its importance in the data science workflow.
  • Practice with code examples: Explore clear, didactic examples using fictitious datasets.
  • Access curated resources: Find links to documentation, books, and courses for deeper learning.

Repository Structure

The repository is organized into folders by theme to make navigation intuitive:

├── README.md
└── Data Science Workflow/
    ├── Data Manipulation/
    │   ├── numpy_basics.ipynb
    │   └── pandas_data_cleaning.ipynb
    ├── Data Visualization/
    │   ├── matplotlib_basics.ipynb
    │   ├── seaborn_heatmaps.ipynb
    │   └── plotly_interactive_charts.ipynb
    └── Machine Learning/
        ├── sklearn_regression.ipynb
        ├── sklearn_classification.ipynb
        └── xgboost_basics.ipynb

Each folder contains Jupyter notebooks that:

  • Introduce the library: Highlight its main features and applications.
  • Provide code examples: Demonstrate common tasks and workflows.
  • Include comments: Explain each step of the code for better understanding.
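
To check which notebooks are available for each theme, the folder layout can be walked with a short script. This is a minimal sketch, assuming the repository has been cloned locally and the script runs from its root; the helper name `list_notebooks` is hypothetical and not part of the repository.

```python
from pathlib import Path


def list_notebooks(root):
    """Map each theme folder under `root` to its sorted notebook file names."""
    themes = {}
    for folder in sorted(Path(root).iterdir()):
        if folder.is_dir():
            # Collect only Jupyter notebooks directly inside the theme folder.
            themes[folder.name] = sorted(p.name for p in folder.glob("*.ipynb"))
    return themes


if __name__ == "__main__":
    # Assumption: the current directory is the repository root.
    root = Path("Data Science Workflow")
    if root.is_dir():
        for theme, notebooks in list_notebooks(root).items():
            print(theme)
            for name in notebooks:
                print("  " + name)
```

Returning a plain dictionary keeps the helper easy to reuse, for example to open each notebook in turn or to spot a theme folder that is still empty.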

Themes Covered

  1. Data Manipulation

    • numpy: Numerical computing with multi-dimensional arrays.
    • pandas: Data manipulation and analysis with DataFrames.
  2. Data Visualization

    • matplotlib: Creating static, animated, and interactive plots.
    • seaborn: Statistical data visualization built on Matplotlib.
    • plotly: Interactive visualizations and dashboards.
  3. Machine Learning

    • scikit-learn: Essential tools for machine learning (classification, regression, clustering, etc.).
    • xgboost: Gradient boosting for structured data.
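
As a taste of the Data Manipulation theme, the sketch below shows how numpy and pandas typically work together on a small table with a missing value. It is not taken from the repository's notebooks: the dataset is fictitious, and the column names `exam_1` and `exam_2` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Fictitious dataset: three students, two exam scores, one score missing.
scores = np.array([[7.0, 8.5], [6.0, np.nan], [9.0, 9.5]])

# numpy: per-column means that ignore missing values.
column_means = np.nanmean(scores, axis=0)

# pandas: label the columns, then fill each gap with its column mean.
df = pd.DataFrame(scores, columns=["exam_1", "exam_2"])
cleaned = df.fillna(df.mean())

print(column_means)
print(cleaned)
```

Filling gaps with the column mean is only one possible cleaning strategy; dropping incomplete rows with `df.dropna()` is a common alternative when missing values are rare.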

Resources


Roadmap

This is a dynamic project with ongoing updates. Here's the plan:

  1. Initial Setup

    • ✅ Create folders and templates for each theme.
    • ✅ Add basic examples for NumPy and Pandas.
  2. Expand Visualization Examples

    • ✅ Create interactive dashboards with Plotly.
    • 🔄 Add advanced plots in Seaborn. (In Progress ⬅️)
  3. Machine Learning Use Cases

    • ✅ Include examples with Scikit-learn (regression and classification).
    • ✅ Add examples with XGBoost.
  4. Polishing and Documentation

    • 🔄 Refine code comments and add detailed explanations. (In Progress ⬅️)
    • 🔄 Add Markdown explanations for workflows. (In Progress ⬅️)

Happy learning! 🚀
