Extracts and saves all words a user has learned on Lingvist, including the following additional information:
- The word's translation
- When it was last practiced
- How many times it was practiced in total
- Whether the user answered correctly the last time they practiced the word
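As an illustration of the fields listed above, one learned word could be modelled and serialized roughly as follows. This is only a sketch: the class name `LearnedWord`, the field names, and the list-of-objects JSON layout are assumptions made for this example and are not taken from the script; the sample word is made up.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class LearnedWord:
    """One learned word plus the extra information listed above (hypothetical field names)."""
    word: str
    translation: str
    last_practiced: str        # assumed to be stored as a timestamp string
    times_practiced: int
    last_answer_correct: bool


def vocabulary_to_json(words):
    """Serialize a list of LearnedWord records to a JSON string."""
    return json.dumps([asdict(w) for w in words], ensure_ascii=False, indent=2)


def save_vocabulary(words, path="vocabulary.json"):
    """Write the records to a JSON file (UTF-8, so accented words stay readable)."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(vocabulary_to_json(words))


# Made-up sample record, for illustration only.
sample = LearnedWord("chien", "dog", "2020-01-01T12:00:00", 3, True)
```

Calling `save_vocabulary([sample])` would write a one-element JSON array to vocabulary.json.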
This code is mainly for learning purposes to illustrate how data can be scraped from a dynamic web application that requires a user to log in.
Please be aware that according to section (3) of the Lingvist Terms of Service, scraping data from their service is not allowed. Use this script at your own risk!
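To illustrate the general approach for a dynamic page behind a login, here is a minimal Selenium sketch. It assumes Selenium 4 and a geckodriver executable in the project folder. The function names, the `login_url` parameter, and all CSS selectors are hypothetical placeholders; they are not taken from the script or from the Lingvist site and would need to be adapted.

```python
import sys
from pathlib import Path


def geckodriver_path(project_dir="."):
    """Expected location of the geckodriver executable inside the project folder."""
    name = "geckodriver.exe" if sys.platform.startswith("win") else "geckodriver"
    return Path(project_dir) / name


def open_logged_in_session(email, password, login_url, project_dir="."):
    """Start Firefox, submit the login form and return the driver (hypothetical selectors)."""
    # Imported here so geckodriver_path() can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.firefox.service import Service
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    service = Service(executable_path=str(geckodriver_path(project_dir)))
    driver = webdriver.Firefox(service=service)
    driver.get(login_url)
    start_url = driver.current_url

    # A dynamic page renders its form via JavaScript, so wait for it explicitly.
    wait = WebDriverWait(driver, timeout=20)
    email_box = wait.until(
        EC.visibility_of_element_located((By.CSS_SELECTOR, "input[type='email']"))
    )
    email_box.send_keys(email)
    driver.find_element(By.CSS_SELECTOR, "input[type='password']").send_keys(password)
    driver.find_element(By.CSS_SELECTOR, "button[type='submit']").click()

    # Assumption: a successful login navigates away from the login URL.
    wait.until(EC.url_changes(start_url))
    return driver
```

The explicit waits are the important part: on a JavaScript-driven page, elements may not exist yet when the page load event fires, so looking them up immediately can fail.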
- Make sure Python (including pip) is installed.
- Install Selenium if it is not already installed:
$ pip install selenium
- Download the geckodriver executable (e.g. from the geckodriver releases page on GitHub) and place it in the project folder.
- Run the script from the project folder using:
$ python lingvist_word_scraper.py
Then enter your email address and the password for your account when prompted. Alternatively, you can pass just your email address, or both email and password, as command line arguments to the script. The program will then run and retrieve your words.
- When the program is finished, you can find all your learned words in the vocabulary.json file.
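The three ways of supplying credentials described in the run step (fully interactive, email as an argument, or email and password as arguments) could be handled along these lines. This is a sketch, not the script's actual code: the function name `resolve_credentials`, the prompt texts, and the usage message are assumptions.

```python
import getpass


def resolve_credentials(argv, ask=input, ask_secret=getpass.getpass):
    """Return (email, password), prompting only for what was not passed as an argument.

    argv is expected in sys.argv form: [script_name, email?, password?].
    """
    args = list(argv[1:])
    if len(args) > 2:
        raise SystemExit("usage: python lingvist_word_scraper.py [email [password]]")

    # First positional argument is the email, second the password (assumed order).
    email = args[0] if len(args) >= 1 else ask("Email: ")
    # getpass hides the typed password instead of echoing it to the terminal.
    password = args[1] if len(args) == 2 else ask_secret("Password: ")
    return email, password
```

In a script this would be called as `resolve_credentials(sys.argv)`. Note that a password passed on the command line may end up in your shell history, so passing only the email and typing the password at the prompt is the more cautious choice.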