Data reduction for X-ray serial crystallography using machine learning

Published: 2023-02-01 Issue: 1 Volume: 56 Pages: 200-213
ISSN: 1600-5767
Container-title: Journal of Applied Crystallography
Short-container-title: J Appl Cryst

Author:

Rahmani Vahid, Nawaz Shah, Pennicard David, Setty Shabarish Pala Ramakantha, Graafsma Heinz

Abstract

Serial crystallography experiments produce massive amounts of experimental data. Yet in spite of these large-scale data sets, only a small percentage of the data are useful for downstream analysis. Thus, it is essential to differentiate reliably between acceptable data (hits) and unacceptable data (misses). To this end, a novel pipeline is proposed to categorize the data, which extracts features from the images, summarizes these features with the 'bag of visual words' method and then classifies the images using machine learning. In addition, a novel study of various feature extractors and machine learning classifiers is presented, with the aim of finding the best feature extractor and machine learning classifier for serial crystallography data. The study reveals that the oriented FAST and rotated BRIEF (ORB) feature extractor with a multilayer perceptron classifier gives the best results. Finally, the ORB feature extractor with multilayer perceptron is evaluated on various data sets including both synthetic and experimental data, demonstrating superior performance compared with other feature extractors and classifiers.
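The 'bag of visual words' step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that binary descriptors (such as those ORB produces) are stored as Python integers and compared by Hamming distance, and that a codebook of visual words has already been learned. The function names, the toy 8-bit descriptors and the three-word codebook are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")


def nearest_word(descriptor: int, codebook: list[int]) -> int:
    """Index of the codebook entry (visual word) closest to the descriptor."""
    return min(range(len(codebook)), key=lambda i: hamming(descriptor, codebook[i]))


def bovw_histogram(descriptors: list[int], codebook: list[int]) -> list[float]:
    """Normalized histogram of visual-word counts for one image."""
    total = len(descriptors)
    if total == 0:
        # An image with no detected features maps to an all-zero vector.
        return [0.0] * len(codebook)
    counts = Counter(nearest_word(d, codebook) for d in descriptors)
    return [counts[i] / total for i in range(len(codebook))]


# Assumption: toy 8-bit values stand in for real 256-bit ORB descriptors,
# and this hand-picked codebook stands in for one learned from training images.
codebook = [0b00000000, 0b11110000, 0b11111111]
image_descriptors = [0b00000001, 0b11110001, 0b11100000, 0b11111110]

histogram = bovw_histogram(image_descriptors, codebook)
print(histogram)  # [0.25, 0.5, 0.25]
```

Whatever the number of features found in an image, the histogram has one entry per visual word, so every image yields a fixed-length vector that a classifier such as a multilayer perceptron can take as input.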

Funder

Bundesministerium für Bildung und Forschung

Publisher

International Union of Crystallography (IUCr)

Subject

General Biochemistry, Genetics and Molecular Biology

Link

https://journals.iucr.org/j/issues/2023/01/00/te5101/te5101.pdf

References: 46 articles.

1. Cheetah: software for high-throughput reduction and analysis of serial femtosecond X-ray diffraction data

2. Becker, D. & Streit, A. (2014). 2014 IEEE Fourth International Conference on Big Data and Cloud Computing, 3-5 December 2014, Sydney, Australia, pp. 71-76. New York: IEEE.

3. Linac Coherent Light Source: The first five years

4. The New Macromolecular Femtosecond Crystallography (MFX) Instrument at LCLS

Cited by 9 articles.

1. Bragg Spot Finder (BSF): a new machine-learning-aided approach to deal with spot finding for rapidly filtering diffraction pattern images. Journal of Applied Crystallography, 2024-04-26.

2. Robust image descriptor for machine learning based data reduction in serial crystallography. Journal of Applied Crystallography, 2024-03-26.

3. Data reduction activities at European XFEL: early results. Frontiers in Physics, 2024-02-27.

4. Data reduction and processing for photon science detectors. Frontiers in Physics, 2024-02-05.

5. Introduction to the virtual collection of papers on Artificial neural networks: applications in X-ray photon science and crystallography. Journal of Applied Crystallography, 2024-02-01.