arXiv:2506.22639 [pdf, ps, other]

Fingerprinting SDKs for Mobile Apps and Where to Find Them: Understanding the Market for Device Fingerprinting

Authors: Michael A. Specter, Mihai Christodorescu, Abbie Farr, Bo Ma, Robin Lassonde, Xiaoyang Xu, Xiang Pan, Fengguo Wei, Saswat Anand, Dave Kleidermacher

Abstract: This paper presents a large-scale analysis of fingerprinting-like behavior in the mobile application ecosystem. We take a market-based approach, focusing on third-party tracking as enabled by applications' common use of third-party SDKs. Our dataset consists of over 228,000 SDKs from popular Maven repositories, 178,000 Android applications collected from the Google Play store, and our static analysis pipeline detects exfiltration of over 500 individual signals. To the best of our knowledge, this represents the largest-scale analysis of SDK behavior undertaken to date. We find that Ads SDKs (the ostensible focus of industry efforts such as Apple's App Tracking Transparency and Google's Privacy Sandbox) appear to be the source of only 30.56% of the fingerprinting behaviors. A surprising 23.92% originate from SDKs whose purpose was unknown or unclear. Furthermore, Security and Authentication SDKs are linked to only 11.7% of likely fingerprinting instances. These results suggest that addressing fingerprinting solely in specific market-segment contexts like advertising may offer incomplete benefit. Enforcing anti-fingerprinting policies is also complex, as we observe a sparse distribution of signals and APIs used by likely fingerprinting SDKs. For instance, only 2% of exfiltrated APIs are used by more than 75% of SDKs, making it difficult to rely on user permissions to control fingerprinting behavior.

Submitted 27 June, 2025; originally announced June 2025.

Comments: To appear in ACM CCS 2025. Extended from the conference version; adds appendices and a more inclusive author list

arXiv:2506.10104 [pdf, ps, other]

Expert-in-the-Loop Systems with Cross-Domain and In-Domain Few-Shot Learning for Software Vulnerability Detection

Authors: David Farr, Kevin Talty, Alexandra Farr, John Stockdale, Iain Cruickshank, Jevin West

Abstract: As cyber threats become more sophisticated, rapid and accurate vulnerability detection is essential for maintaining secure systems. This study explores the use of Large Language Models (LLMs) in software vulnerability assessment by simulating the identification of Python code with known Common Weakness Enumerations (CWEs), comparing zero-shot, few-shot cross-domain, and few-shot in-domain prompting strategies. Our results indicate that while zero-shot prompting performs poorly, few-shot prompting significantly enhances classification performance, particularly when integrated with confidence-based routing strategies that improve efficiency by directing human experts to cases where model uncertainty is high, optimizing the balance between automation and expert oversight. We find that LLMs can effectively generalize across vulnerability categories with minimal examples, suggesting their potential as scalable, adaptable cybersecurity tools in simulated environments. However, challenges such as model reliability, interpretability, and adversarial robustness remain critical areas for future research. By integrating AI-driven approaches with expert-in-the-loop (EITL) decision-making, this work highlights a pathway toward more efficient and responsive cybersecurity workflows. Our findings provide a foundation for deploying AI-assisted vulnerability detection systems in both real and simulated environments that enhance operational resilience while reducing the burden on human analysts.

Submitted 11 June, 2025; originally announced June 2025.
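
The confidence-based routing described in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `route_by_confidence`, the tuple layout, and the 0.8 threshold are all assumptions chosen for illustration. It only shows the general idea of sending low-confidence model outputs to a human expert.

```python
from typing import List, Tuple

# Hypothetical record: (sample_id, predicted_label, model_confidence in [0, 1]).
Prediction = Tuple[str, str, float]


def route_by_confidence(
    predictions: List[Prediction], threshold: float = 0.8
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Split predictions into auto-accepted labels and an expert review queue.

    The threshold is an assumed tuning parameter, not a value from the paper.
    """
    auto_accepted = []   # (sample_id, label) pairs the model handles alone
    expert_queue = []    # sample_ids where model uncertainty is high
    for sample_id, label, confidence in predictions:
        if confidence >= threshold:
            auto_accepted.append((sample_id, label))
        else:
            expert_queue.append(sample_id)
    return auto_accepted, expert_queue


if __name__ == "__main__":
    demo = [("s1", "CWE-79", 0.95), ("s2", "CWE-89", 0.55), ("s3", "CWE-22", 0.80)]
    accepted, queue = route_by_confidence(demo)
    print(accepted)
    print(queue)
```

Raising the threshold sends more cases to the expert and fewer to automation; where to set it is a design choice this sketch leaves open.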

arXiv:2503.19570 [pdf]

Improved tissue sodium concentration quantification in breast cancer by reducing partial volume effects: a preliminary study

Authors: Olgica Zaric, Carmen Leser, Vladimir Juras, Alex Farr, Pavol Szomolanyi, Malina Gologan, Stanislas Rapacchi, Laura Villazan Garcia, Haider Ali, Christian Singer, Siegfried Trattnig, Christian Licht, Ramona Woitek

Abstract: Introduction: In sodium (23Na) magnetic resonance imaging (MRI), partial volume effects (PVE) are one of the most common causes of errors in the in vivo quantification of tissue sodium concentration (TSC). Advanced image reconstruction algorithms, such as compressed sensing (CS), have the potential to reduce PVE. Therefore, we investigated the feasibility of using CS-based methods to improve image quality and TSC quantification accuracy in patients with breast cancer. Subjects and methods: In this study, three healthy participants and 12 female participants with breast cancer were examined on a 7T MRI scanner. 23Na-MRI images were reconstructed using weighted total variation (wTV), directional total variation (dTV), anatomically guided total variation (AG-TV) and adaptive combine (ADC) methods. The consistency of tumor volume delineations based on sodium data was assessed using the Dice score, and TSC quantification was performed for various image reconstruction methods. Pearson's correlation coefficients were calculated to assess the relationships between wTV, dTV, AG-TV, and ADC values. Results: All methods provided breast MRI images with well-preserved sodium signal and tissue structures. The mean Dice scores for wTV, dTV, and AG-TV were 65%, 72%, and 75%, respectively. Average TSC values in breast tumors were 61.0, 72.0, 73.0, and 88.0 mmol/L for wTV, dTV, AG-TV, and ADC, respectively. A strong negative correlation was observed between wTV and dTV (r = -0.78, 95% CI [-0.94, -0.31], p = 0.0076), and a strong positive correlation was found between dTV and AG-TV (r = 0.71, 95% CI [0.16, 0.92], p = 0.0207). Conclusion: The results of this study showed that differences in tumor appearance and TSC estimations may depend on the type of image reconstruction and the parameters used. This is most likely due to differences in their ability to reduce PVE.

Submitted 16 November, 2025; v1 submitted 25 March, 2025; originally announced March 2025.
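
The Dice score used in the abstract above to compare tumor delineations has a standard definition, 2|A ∩ B| / (|A| + |B|), which the following minimal sketch computes. Representing each delineation as a set of voxel indices, the function name `dice_score`, and the convention for two empty masks are assumptions made for illustration; the paper's actual pipeline is not described in the abstract.

```python
from typing import Hashable, Iterable


def dice_score(mask_a: Iterable[Hashable], mask_b: Iterable[Hashable]) -> float:
    """Dice similarity of two segmentations given as collections of voxel indices.

    Returns a value in [0, 1]; 1 means identical masks, 0 means no overlap.
    """
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        # Assumed convention: two empty masks are treated as a perfect match.
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    # Hypothetical voxel coordinates for two delineations of the same tumor.
    reference = {(0, 0, 0), (0, 0, 1), (0, 1, 0)}
    candidate = {(0, 0, 1), (0, 1, 0), (1, 1, 1)}
    print(f"Dice = {dice_score(reference, candidate):.2%}")
```

Multiplying the result by 100 gives the percentage form in which the abstract reports its mean Dice scores.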

arXiv:2411.09844 [pdf, other]

Deep Autoencoders for Unsupervised Anomaly Detection in Wildfire Prediction

Authors: İrem Üstek, Miguel Arana-Catania, Alexander Farr, Ivan Petrunin

Abstract: Wildfires pose a significantly increasing hazard to global ecosystems due to the climate crisis. Due to their complex nature, there is an urgent need for innovative approaches to wildfire prediction, such as machine learning. This research took a unique approach, differentiating from classical supervised learning, and addressed the gap in unsupervised wildfire prediction using autoencoders and clustering techniques for anomaly detection. Historical weather and normalised difference vegetation index datasets of Australia for 2005 - 2021 were utilised. Two main unsupervised approaches were analysed. The first used a deep autoencoder to obtain latent features, which were then fed into clustering models, isolation forest, local outlier factor and one-class SVM for anomaly detection. The second approach used a deep autoencoder to reconstruct the input data and use reconstruction errors to identify anomalies. Long Short-Term Memory (LSTM) autoencoders and fully connected (FC) autoencoders were employed in this part, both in an unsupervised way learning only from nominal data. The FC autoencoder outperformed its counterparts, achieving an accuracy of 0.71, an F1-score of 0.74, and an MCC of 0.42. These findings highlight the practicality of this method, as it effectively predicts wildfires in the absence of ground truth, utilising an unsupervised learning technique.

Submitted 14 November, 2024; originally announced November 2024.

Comments: 33 pages, 18 figures, 16 tables. To appear in Earth and Space Science

arXiv:1507.04998 [pdf, other]

doi 10.1063/1.4928871

Experimental demonstration of a surface-electrode multipole ion trap

Authors: Mark Maurice, Curtis Allen, Dylan Green, Andrew Farr, Timothy Burke, Russell Hilleke, Robert Clark

Abstract: We report on the design and experimental characterization of a surface-electrode multipole ion trap. Individual microscopic sugar particles are confined in the trap. The trajectories of driven particle motion are compared with a theoretical model, both to verify qualitative predictions of the model, and to measure the charge-to-mass ratio of the confined particle. The generation of harmonics of the driving frequency is observed as a key signature of the nonlinear nature of the trap. We remark on possible applications of our traps, including to mass spectrometry.

Submitted 17 July, 2015; originally announced July 2015.

Comments: Preprint format, 15 pages, 9 figures. Accepted into Journal of Applied Physics

Showing 1–5 of 5 results for author: Farr, A