Affiliation:
1. Northwestern University, USA
Abstract
The authors analyze the lung cancer data available from the SEER program with the aim of identifying hotspots using association rule mining techniques. A subset of 13 patient attributes from the SEER data was recently linked with the survival outcome using prediction models, and this subset is used in this study for segmentation. The goal is to identify characteristics of patient segments whose average survival is significantly higher or lower than the average survival across the entire dataset. Automated association rule mining produced hundreds of rules, from which many redundant rules were manually removed based on domain knowledge. Association rule mining based hotspot analysis was also conducted on conditional survival data, i.e., for patients who had already survived for a year after diagnosis. The resulting rules conform with existing biomedical knowledge and provide interesting insights into lung cancer survival.
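The hotspot idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the attribute names (`stage`, `surgery`), the toy records, the function `find_hotspots`, and the thresholds `min_support` and `min_lift` are all hypothetical, and a simple ratio to the overall mean stands in for whatever significance criterion the study actually applied. The sketch enumerates conjunctions of attribute values and keeps the segments whose mean survival is well above or below the dataset-wide mean.

```python
from itertools import combinations
from statistics import mean


def find_hotspots(records, target, min_support=2, min_lift=1.5, max_len=2):
    """Return (overall_mean, hotspots) for a list of dict records.

    A hotspot is a conjunction of attribute values covering at least
    `min_support` records whose mean target is at least `min_lift` times,
    or at most 1/min_lift times, the overall mean. All thresholds here
    are illustrative assumptions, not values from the paper.
    """
    overall = mean(r[target] for r in records)
    attrs = sorted(k for k in records[0] if k != target)
    hotspots = []
    for size in range(1, max_len + 1):
        for combo in combinations(attrs, size):
            # Group target values by the values taken on this attribute combination.
            groups = {}
            for r in records:
                key = tuple(r[a] for a in combo)
                groups.setdefault(key, []).append(r[target])
            for key, values in groups.items():
                if len(values) < min_support:
                    continue  # segment too small to report
                seg_mean = mean(values)
                ratio = seg_mean / overall
                if ratio >= min_lift or ratio <= 1 / min_lift:
                    hotspots.append((dict(zip(combo, key)), len(values), seg_mean))
    return overall, hotspots


# Synthetic toy records (survival in months); not SEER data.
records = [
    {"stage": "localized", "surgery": "yes", "months": 60},
    {"stage": "localized", "surgery": "yes", "months": 48},
    {"stage": "localized", "surgery": "no", "months": 24},
    {"stage": "distant", "surgery": "no", "months": 6},
    {"stage": "distant", "surgery": "no", "months": 4},
    {"stage": "distant", "surgery": "yes", "months": 14},
]

overall, hotspots = find_hotspots(records, "months")
for segment, support, seg_mean in hotspots:
    print(segment, support, seg_mean)
```

In this sketch the two-attribute segments are reported alongside the single-attribute segments that contain them, which hints at why a rule miner can return many overlapping rules that then need pruning, as the abstract notes was done manually with domain knowledge.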