Data Science Libraries in Python

This repository is written in English to reach a wider audience.

Welcome to the Data Science Libraries in Python repository! This project is a didactic resource for exploring and understanding the essential Python libraries used in Data Science studies. Here, you'll find useful information to help you master these tools.

Objectives

- Learn the purpose of each library: Understand what each library is used for and its importance in the data science workflow.
- Practice with code examples: Explore clear, didactic examples using fictitious datasets.
- Access curated resources: Find links to documentation, books, and courses for deeper learning.

Repository Structure

The repository is organized into folders by theme to make navigation intuitive:

├── README.md
└── Data Science Workflow/
    ├── Data Manipulation/
    │   ├── numpy_basics.ipynb
    │   └── pandas_data_cleaning.ipynb
    ├── Data Visualization/
    │   ├── matplotlib_basics.ipynb
    │   ├── seaborn_heatmaps.ipynb
    │   └── plotly_interactive_charts.ipynb
    └── Machine Learning/
        ├── sklearn_regression.ipynb
        ├── sklearn_classification.ipynb
        └── xgboost_basics.ipynb

Each folder contains Jupyter notebooks that:

- Introduce the library: Highlight its main features and applications.
- Provide code examples: Demonstrate common tasks and workflows.
- Include comments: Explain each step of the code for better understanding.
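
As an illustration of that notebook style, here is a minimal sketch of a commented data-cleaning cell. It is not taken from the notebooks themselves: the dataset, the column names (`product`, `price`, `units`, `revenue`), and the choice of median imputation are all assumptions made for this example.

```python
import numpy as np
import pandas as pd

# Fictitious sales records (assumption): one duplicated row, one missing price.
raw = pd.DataFrame({
    "product": ["apple", "banana", "banana", "cherry"],
    "price": [1.20, 0.50, 0.50, np.nan],
    "units": [10, 24, 24, 5],
})

# Step 1: drop rows that are exact duplicates of an earlier row.
deduped = raw.drop_duplicates()

# Step 2: fill the missing price with the median of the known prices.
median_price = deduped["price"].median()
cleaned = deduped.fillna({"price": median_price})

# Step 3: derive a new column with vectorized arithmetic.
cleaned = cleaned.assign(revenue=cleaned["price"] * cleaned["units"])

print(cleaned)
```

Median imputation is only one option; dropping the incomplete row with `dropna()` would be just as valid for a dataset this small.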

Themes Covered

- Data Manipulation
  - numpy: Numerical computing with multi-dimensional arrays.
  - pandas: Data manipulation and analysis with DataFrames.
- Data Visualization
  - matplotlib: Creating static, animated, and interactive plots.
  - seaborn: Statistical data visualization built on Matplotlib.
  - plotly: Interactive visualizations and dashboards.
- Machine Learning
  - scikit-learn: Essential tools for machine learning (classification, regression, clustering, etc.).
  - xgboost: Gradient boosting for structured data.
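
To show how the themes connect, the following sketch chains data manipulation and machine learning on a fictitious dataset: NumPy generates the data, pandas removes incomplete rows, and scikit-learn fits a linear regression. The dataset (study hours versus exam score), the column names, and the true coefficients are assumptions chosen for illustration. Visualization is left out so the snippet runs without a display.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Fictitious dataset (assumption): score = 50 + 4 * hours + noise.
rng = np.random.default_rng(seed=42)
hours = rng.uniform(0, 10, size=200)
score = 50 + 4 * hours + rng.normal(0, 2, size=200)

df = pd.DataFrame({"hours": hours, "score": score})

# Data manipulation: simulate three missing scores, then drop those rows.
df.loc[[3, 17, 42], "score"] = np.nan
clean = df.dropna()

# Machine learning: fit a one-feature linear model on the cleaned data.
model = LinearRegression().fit(clean[["hours"]], clean["score"])
slope = float(model.coef_[0])
intercept = float(model.intercept_)

print(f"score = {intercept:.2f} + {slope:.2f} * hours")
```

The fitted slope and intercept should land close to the values used to generate the data, which is a quick sanity check that the pipeline works end to end.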

Resources

Official Documentation:
Books:
- "Python for Data Analysis" by Wes McKinney
- "Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow" by Aurélien Géron
Courses:
- Data Science with Python on Coursera
- Interactive Python for Data Science on Kaggle

Roadmap

This is a dynamic project with ongoing updates. Here's the plan:

Initial Setup
- ✅ Create folders and templates for each theme.
- ✅ Add basic examples for NumPy and Pandas.
Expand Visualization Examples
- ✅ Create interactive dashboards with Plotly.
- 🔄 Add advanced plots in Seaborn. (In Progress ⬅️)
Machine Learning Use Cases
- ✅ Include examples with Scikit-learn (regression and classification).
- ✅ Add examples with XGBoost.
Polishing and Documentation
- 🔄 Refine code comments and add detailed explanations. (In Progress ⬅️)
- 🔄 Add Markdown explanations for workflows. (In Progress ⬅️)

Happy learning! 🚀
