Hello there! I am Ekaterina (Katja), a software engineer coming from organic chemistry and protein NMR.
I have a soft spot for interactive charting, so I am learning D3 on the side.
In my doctoral years, I focussed on feature engineering in protein structural biology. I adapted existing models of structure-property relationships to the specific case of highly statically disordered samples. I explored and proposed scores for the degree of local disorder. This was a largely interdisciplinary project that combined quantum physics, molecular biology, and data science.
In spring 2024, I graduated from the Data Science Bootcamp @ neue fische, where I practiced my old skills and learned plenty of new ones.
Since 2024, I have been working at AI|ffinity to finally advance academic NMR software to the next level and make automated protein assignments a reality.
If you have a technical problem to solve, drop me an email! burakova.ek@gmail.com
Bremen, DE
- sort GIT out! - AI-agentic app to chat with your Git repository. Built for the CheffTreff Hackathon 2025 as a solution to the Finanz Informatik challenge.
  - Analyses commit messages and diffs.
  - Powered by the Gemini API (as of April 2025)
  - Simple, sleek UI (made with Streamlit)
  - Developed in <24 hours!
- sustAIn - Extracting sustainability-related information from PDFs, built for the Bremen AI-Hackathon in summer 2024. Challenge provided by Encoway / Lenze group.
  - Highly heterogeneous documentation on electric devices is converted into a parseable database.
  - New PDFs can be uploaded and analyzed on the fly.
  - Powered by the OpenAI API
  - Developed in 48 hours by a team of five
- DataScience for Production Pipelines - Berlin, September 2024. Challenge provided by Bayer. (NON-PUBLIC)
  - Identified and linked error patterns at a pilot plant that packages liquid formulations into vials.
  - Presented a Markov chain model to non-technical stakeholders.
- nmr_utils - simple scripts for all things NMR (mostly for proteins): handy tools for visualization and NUS data handling.
- protein_heterogeneity_ssnmr - the tools related to the paper in J. Biomol. NMR 2022.
- Clean visualisation of protein chemical shift distributions from BMRB.
- d-drivers - Data-driven search for traffic drivers. This is the graduation project at the neue fische Data Science Bootcamp (Apr 2024). Our team of five analyzed the internal content data of EFAHRER.com. We modelled the page impressions in the news feed and built a data app for the editorial managers.
- fraud-detection - Analyzing energy consumption patterns to detect which clients have tampered with their electricity and gas meters.
- eda-kc-housing - Analysis of price-defining factors in the King County housing dataset for a mock client interested in investing in property development. EDA showcase prepared as part of the DS bootcamp.
- Built my own data processing software from the ground up to handle highly heterogeneous and poorly structured transaction histories. Used it to fill out my own tax declarations for 2022-2025. Feel free to ask me about this, as well as my experience with real-time transaction analysis. _The repositories will remain private._