This repository was archived by the owner on Apr 4, 2026. It is now read-only.

rostam/pisa-data-set-analysis


PISA Dataset Analysis

by Ali Rostami

Dataset

Here we analyze a data set from PISA (Programme for International Student Assessment), an international program that assesses education and schooling systems across countries. In particular, the data analyzed here contains assessments of student performance in math, science, and reading. The dataset is available at: https://s3.amazonaws.com/udacity-hosted-downloads/ud507/pisa2012.csv.zip Because the file is large, it is not included in this repository. The main Jupyter notebook, "Investigate_a_Dataset", has two main parts: data cleaning and analysis. For the data cleaning part, you need to download the file above and save it locally. Alternatively, you can use the already cleaned data file at the following link for the analysis part, which runs independently of the data cleaning part: https://www.dropbox.com/s/kid78cnsgkwge0l/pisa_data_clean.csv?dl=0
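As a rough sketch of how the cleaned file could be read once it has been downloaded, the helper below uses only the Python standard library. The function name `load_rows`, the local file name, and the UTF-8 encoding are assumptions made for illustration; the notebook itself may load the data differently.

```python
import csv
from pathlib import Path


def load_rows(csv_path, encoding="utf-8"):
    """Read a CSV file into a list of dicts, one dict per record.

    `load_rows` is a hypothetical helper, not part of the notebook.
    The default encoding is an assumption; adjust it if the file
    fails to decode.
    """
    with Path(csv_path).open(newline="", encoding=encoding) as handle:
        return list(csv.DictReader(handle))
```

Assuming the file was saved next to the notebook as `pisa_data_clean.csv`, a call such as `rows = load_rows("pisa_data_clean.csv")` would return one dictionary per student record, keyed by the column names found in the file's header row.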

We will consider the following research questions:

  • How is the data distributed across countries in terms of sample size?
  • Does the birth country, with its particular approach to education, strongly influence students' scores?
  • How do the birth countries of the student and of his/her parents affect the scores?
  • How strongly are the math, reading, and science scores related to each other?
  • What effect does teacher behavior have on the scores?
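The first question, how many records each country contributes, can be sketched as a simple count. The column name `country` and the sample rows below are made up for illustration; the real column names in the PISA file are not given here, so treat them as assumptions.

```python
from collections import Counter


def count_by_country(rows, country_key="country"):
    """Count how many records belong to each country.

    `country_key` is a hypothetical column name; replace it with the
    actual column used in the data file.
    """
    return Counter(row[country_key] for row in rows)


# Made-up rows standing in for student records (not real PISA data).
sample_rows = [
    {"country": "A", "math": "500"},
    {"country": "A", "math": "520"},
    {"country": "B", "math": "480"},
    {"country": "A", "math": "450"},
    {"country": "B", "math": "510"},
    {"country": "C", "math": "530"},
]

counts = count_by_country(sample_rows)
# The countries with the most records come first.
print(counts.most_common(2))  # prints [('A', 3), ('B', 2)]
```

Sorting by count in this way is also one possible way to pick out the countries with the most data when the results are presented.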

Summary of Findings

At first, we observed that there is a strong correlation between the math, reading, and science scores. Therefore, we focused mostly on the math scores. We could then infer how students perform in math based on their birth countries as well as their parents' birth countries. We measured this performance using the three scores for math, reading, and science. Also, the distribution of math scores across all countries showed that there are always some good scores, even in countries where students on average do not score well.
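The correlation between two score columns can be measured with the Pearson coefficient. The sketch below is a plain-Python version under the assumption that the scores are available as two equally long lists of numbers; the function name `pearson` and the example values are illustrative only and are not taken from the notebook or from the PISA data.

```python
import math


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Made-up scores for illustration (not real PISA values).
math_scores = [420.0, 480.0, 510.0, 560.0, 600.0]
reading_scores = [430.0, 470.0, 520.0, 550.0, 610.0]

r = pearson(math_scores, reading_scores)
print(f"math vs. reading: r = {r:.3f}")
```

A value of `r` close to 1 indicates that the two scores rise together, which is the kind of relationship that would justify concentrating on a single subject such as math.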

Key Insights for Presentation

We could infer how students perform in math based on their birth countries, their parents' birth countries, and their gender. We measured this performance using the three scores for math, reading, and science. Finally, we briefly look at the distribution of teacher behavior across countries. For the sake of a clearer presentation, we do not show results for all of the students' birth countries; we focus mostly on the countries for which we have the most data. Also,
