Affiliation:
1. Shiga University
2. Keio University
3. RIKEN Center for Computational Science
Abstract
Membrane permeability, measured in vitro as a compound's apparent permeability (Papp), is one of the key ADME parameters in drug development. Caco-2 cells are the cell line most commonly used to measure Papp. Other cell lines, such as the Madin-Darby Canine Kidney (MDCK), LLC-Pig Kidney 1 (LLC-PK1), and Ralph Russ Canine Kidney (RRCK) cell lines, have also been used to estimate Papp. Therefore, in silico models that estimate Papp for the MDCK, LLC-PK1, and RRCK cell lines are needed. Constructing such models requires extensive in vitro Papp data measured in these cell lines. Open databases help in collecting extensive measurements of various compounds covering a vast chemical space; however, concerns have been raised about using data published in open databases without checking their accuracy and quality. We developed a new KNIME workflow that supports the automatic curation of Papp data measured in the MDCK, LLC-PK1, and RRCK cell lines and collected from ChEMBL. The workflow consists of four main phases. Data were extracted from ChEMBL and filtered to identify the target protocols. A total of 1680 high-quality entries were retained after checking 436 articles. By automating the collection of reliable measurement data, the workflow significantly reduces the cost of building highly accurate predictive models. The workflow is freely available, can be easily updated by anyone, and is highly reusable. Our study provides an opportunity for researchers to analyze data quality and accelerate the development of helpful in silico models for effective drug discovery.
Publisher
Research Square Platform LLC